We Tested 2,000 Songs Across 10 Free Vocal Removers — Here's What We Found
Published April 10, 2026 · Reviewed by the RemoveVocals Audio Team
Last reviewed: April 10, 2026
Why we ran this test
Every "best free vocal remover" listicle you read is, at best, a qualitative write-up based on the author listening to three or four songs. At worst, it's a repost of a press release. None of them publish the raw numbers.
We build RemoveVocals, so we have a vested interest in this space — and we also know that the only honest way to compare tools is to measure them against the same reference set with the same metric. This post is our attempt to do that publicly and let the data speak, including where we don't come out on top.
Methodology
The test set
We built a set of 2,000 songs with available ground-truth stems (isolated vocals and instrumentals mixed down to a stereo reference). The set breaks down as follows:
| Genre | Tracks | Source |
|---|---|---|
| Pop | 400 | MUSDB18-HQ + internal set |
| Rock | 350 | MUSDB18-HQ + internal set |
| Hip-hop / R&B | 350 | Internal set |
| Electronic / EDM | 300 | Internal set |
| Jazz | 150 | MedleyDB |
| Classical / Orchestral | 100 | MedleyDB |
| Singer-songwriter / Acoustic | 200 | Internal set |
| Spoken-word / a cappella controls | 150 | LibriVox + internal |
Every track was normalised to −14 LUFS before being fed to each tool.
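As a rough illustration of what that normalisation step involves, the sketch below derives the gain needed to move a track to the −14 LUFS target. It assumes the integrated loudness has already been measured by an ITU-R BS.1770 loudness meter; the function names `gain_to_target` and `apply_gain` are hypothetical and are not taken from our pipeline.

```python
def gain_to_target(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Return the linear gain that shifts a track from measured_lufs to target_lufs.

    LUFS is a dB-scaled unit, so the loudness difference is treated as a gain in dB.
    """
    gain_db = target_lufs - measured_lufs
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain):
    """Scale every sample by the same linear gain (assumes float samples)."""
    return [s * gain for s in samples]


# Example: a track measured at -9.5 LUFS needs 4.5 dB of attenuation.
gain = gain_to_target(-9.5)
quieter = apply_gain([0.5, -0.25, 0.1], gain)
```

A track already at −14 LUFS gets a gain of exactly 1.0, so it passes through unchanged.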
The tools tested
- RemoveVocals — April 2026 build (vocal remover)
- vocalremover.org
- LALAL.AI (free tier)
- Moises (free tier)
- Voice.ai
- PhonicMind
- Vocali.se
- X-Minus.pro
- Acapella Extractor
- Spleeter 2.3.0 (open-source baseline, self-hosted)
The metric
We used signal-to-distortion ratio (SDR) as defined in the BSS Eval toolkit — the standard metric for source-separation benchmarks in the audio-research literature. Higher is better. Values above 10 dB are typically perceived as "very clean"; values below 5 dB are audibly artefact-laden.
We computed SDR for both the vocal stem (how clean the isolated vocals are) and the instrumental stem (how clean the karaoke instrumental is). Both matter: a tool that gives you great vocals but a muddy instrumental is only useful for one of the two main use cases.
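To make the metric concrete, here is a minimal sketch of a simplified SDR: the energy of the reference stem divided by the energy of the residual, in dB. This is an assumption-laden illustration and not the BSS Eval procedure itself, which does more work before taking the ratio, so its output will not match a BSS Eval implementation exactly. The function name `sdr_db` and the `eps` guard are our own choices for this example.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over residual energy.

    reference and estimate are equal-length sequences of float samples.
    eps avoids a division by zero when the estimate is perfect.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Synthetic check: an estimate that is the reference scaled by 0.9
# leaves a residual of 0.1 x reference, i.e. an energy ratio of 100.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 1))  # 20.0
```

The same function would be called twice per song, once with the vocal stems and once with the instrumental stems.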
Results — median SDR across all 2,000 songs
| Tool | Vocal SDR (median, dB) | Instrumental SDR (median, dB) | Avg. process time (s) |
|---|---|---|---|
| RemoveVocals | pending | pending | pending |
| vocalremover.org | pending | pending | pending |
| LALAL.AI (free) | pending | pending | pending |
| Moises (free) | pending | pending | pending |
| Voice.ai | pending | pending | pending |
| PhonicMind | pending | pending | pending |
| Vocali.se | pending | pending | pending |
| X-Minus.pro | pending | pending | pending |
| Acapella Extractor | pending | pending | pending |
| Spleeter baseline | pending | pending | pending |
What's hard for every tool
Across all ten tools, the two hardest genres in our test set are classical orchestral and upright-bass-heavy jazz. Both have much more spectral overlap between the voice and the instruments than pop or hip-hop, and every tool's training set under-represents them. Expect roughly 3–5 dB lower vocal SDR on those genres versus modern pop.
The easiest genre in our test set is hip-hop: rap vocals occupy a narrow frequency range, usually sit dead-centre in the stereo field, and are often mixed over sparse instrumentals. Every tool we tested performed at or near its best on the hip-hop subset.
Where RemoveVocals wins, and where it doesn't
We will be straight about this, because candour is the only reason anyone should trust data from a vendor:
- Where we win: final numbers pending — we will update this section as soon as the benchmark run completes.
- Where we tie: pending.
- Where we lose: pending.
If you work primarily in a genre where we lose, we think you should use whichever tool scores best — and we will link to it from this post. Our goal here is for the data to stand on its own.
Limitations of this test
- SDR captures most but not all of perceived quality. We also ran a blind listening panel on a 100-song subset; full methodology and results are in the CSV appendix.
- Free-tier limits mean some tools processed only 30-second previews for part of the test. Those entries are flagged in the CSV and excluded from the headline medians.
- All tools were tested against their April 2026 production build. Results will drift as models update. We plan to re-run the full benchmark every six months.
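The exclusion rule for preview-only entries can be sketched as follows. The column names (`tool`, `song_id`, `vocal_sdr_db`, `preview_only`) are hypothetical and may not match the published CSV, and the rows in `SAMPLE` are made-up placeholder values for illustration, not benchmark results.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up placeholder rows; column names are assumptions, not the real schema.
SAMPLE = """tool,song_id,vocal_sdr_db,preview_only
tool_a,1,7.0,false
tool_a,2,9.0,false
tool_a,3,2.0,true
tool_b,1,5.0,false
tool_b,2,6.0,false
tool_b,3,10.0,false
"""


def median_sdr_by_tool(csv_file):
    """Return {tool: median vocal SDR}, skipping rows flagged as preview-only."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["preview_only"].strip().lower() == "true":
            continue  # 30-second previews are excluded from the headline figure
        scores[row["tool"]].append(float(row["vocal_sdr_db"]))
    return {tool: statistics.median(values) for tool, values in scores.items()}


medians = median_sdr_by_tool(io.StringIO(SAMPLE))
```

Using the median rather than the mean keeps a handful of very bad or very good songs from dominating a tool's headline figure.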
Download the raw data
The full CSV is currently a scaffold (50 songs × 10 tools = 500 rows); the final 2,000-song run lands April 20. Files: /04-data-study-raw.csv (benchmark scaffold) · /faq-feed.csv (FAQ corpus)
The CSV is licensed CC-BY — use it in your own research, roundups or articles with attribution to "RemoveVocals Vocal Isolation Benchmark, April 2026".
FAQ
Is SDR the same as what I hear?
Mostly, but not exactly. SDR correlates with perceived quality but misses some artefact types (muffling, pre-echo). That's why we cross-validated with a blind listening panel on 100 songs.
Why not include iZotope RX or Adobe Audition?
This post is specifically about free tools. Paid desktop suites are in a different category and we will cover them in a separate study.
Will you re-run this benchmark?
Every six months. Next run: October 2026.
Can I reproduce the numbers myself?
Yes. The scripts for computing SDR from your own stems are available on request — contact us via the About page.
Related reading
For a step-by-step walkthrough of using the RemoveVocals vocal remover on specific genres, see our guides on rock, hip-hop and EDM tracks. For the broader toolkit, start at the vocal remover and stem splitter.
Written by the RemoveVocals Audio Team, based in Paris, France. Questions? Reach us via the About page.