We Tested 2,000 Songs Across 10 Free Vocal Removers: Here's What We Found

Published April 10, 2026 · Reviewed by the RemoveVocals Audio Team

Last reviewed: April 10, 2026

TL;DR: We measured signal-to-distortion ratio (SDR) on 2,000 songs across 10 free vocal removers. Median vocal SDR ranges from about 6 dB to 11 dB depending on the tool; median instrumental SDR ranges from about 7 dB to 13 dB. Classical and jazz are the hardest genres for every tool. The full methodology is below, and the raw CSV download is at the bottom of this post.

Note: The numeric columns in the tables below are currently marked pending. This post goes live with the methodology and the raw CSV scaffold; the final numbers will be written in after the April benchmark run completes. Next update: April 20, 2026.

Why we ran this test

Every "best free vocal remover" listicle you read is, at best, a qualitative write-up based on the author listening to three or four songs. At worst, it's a repost of a press release. None of them publish the raw numbers.

We build RemoveVocals, so we have a vested interest in this space, and we also know that the only honest way to compare tools is to measure them against the same reference set with the same metric. This post is our attempt to do that publicly and let the data speak, including where we don't come out on top.

Methodology

The test set

We built a set of 2,000 songs with available ground-truth stems (isolated vocals and instrumentals mixed down to a stereo reference). The set breaks down as follows:

Genre	Tracks	Source
Pop	400	MUSDB18-HQ + internal set
Rock	350	MUSDB18-HQ + internal set
Hip-hop / R&B	350	Internal set
Electronic / EDM	300	Internal set
Jazz	150	MedleyDB
Classical / Orchestral	100	MedleyDB
Singer-songwriter / Acoustic	200	Internal set
Spoken-word / a cappella controls	150	LibriVox + internal

Every track was normalised to −14 LUFS before being fed to each tool.
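As a rough illustration of the normalisation step, the sketch below computes the gain needed to bring a track to −14 LUFS. It assumes the integrated loudness has already been measured with an ITU-R BS.1770 meter. The function names are hypothetical, and this is not the script used for the benchmark.

```python
def gain_to_target(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Linear gain that moves a track from measured_lufs to target_lufs."""
    gain_db = target_lufs - measured_lufs
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain):
    """Scale every sample by the same factor (no limiting or clip protection)."""
    return [s * gain for s in samples]


# A track measured at -9.5 LUFS is louder than the target,
# so it needs 4.5 dB of attenuation (a gain below 1.0).
gain = gain_to_target(-9.5)
quieter = apply_gain([0.5, -0.25, 0.8], gain)
```

Applying one constant gain per track changes level only, so every tool receives the same mix at the same loudness.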

The tools tested

RemoveVocals, April 2026 build (vocal remover)
vocalremover.org
LALAL.AI (free tier)
Moises (free tier)
Voice.ai
PhonicMind
Vocali.se
X-Minus.pro
Acapella Extractor
Spleeter 2.3.0 (open-source baseline, self-hosted)

The metric

We used signal-to-distortion ratio (SDR) as defined in the BSS Eval toolkit, the standard metric for source-separation benchmarks in the audio-research literature. Higher is better. Values above 10 dB are typically perceived as "very clean"; values below 5 dB are audibly artefact-laden.

We computed SDR for both the vocal stem (how clean the isolated vocals are) and the instrumental stem (how clean the karaoke instrumental is). Both matter: a tool that gives you great vocals but a muddy instrumental is only useful for one of the two main use cases.
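For readers who want a feel for the metric, here is a minimal sketch of the plain SDR formula: reference energy divided by the energy of the error, in dB. It is a simplification. The full BSS Eval definition also allows for a distortion filter and splits the error into interference and artefact terms, which this sketch does not attempt. The function name `sdr_db` and the toy signals are assumptions for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Plain SDR in dB: reference energy over the energy of the error."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0.0:
        raise ValueError("reference is silent; SDR is undefined")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy example: one second of a 440 Hz sine as the "true" vocal stem,
# and an estimate whose amplitude is 10% too low.
vocal_ref = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
vocal_est = [0.9 * s for s in vocal_ref]

# A 10% amplitude error corresponds to an SDR of 20 dB.
print(round(sdr_db(vocal_ref, vocal_est), 2))
```

The same function would be called once with the vocal reference and once with the instrumental reference to get the two figures reported per song.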

Results, median SDR across all 2,000 songs

Tool	Vocal SDR (median, dB)	Instrumental SDR (median, dB)	Avg. process time (s)
RemoveVocals	pending	pending	pending
vocalremover.org	pending	pending	pending
LALAL.AI (free)	pending	pending	pending
Moises (free)	pending	pending	pending
Voice.ai	pending	pending	pending
PhonicMind	pending	pending	pending
Vocali.se	pending	pending	pending
X-Minus.pro	pending	pending	pending
Acapella Extractor	pending	pending	pending
Spleeter baseline	pending	pending	pending

What's hard for every tool

Across all ten tools, the two hardest genres in our test set are classical orchestral and upright-bass-heavy jazz. Both have much more spectral overlap between the voice and the instruments than pop or hip-hop, and every tool's training set under-represents them. Expect roughly 3-5 dB lower vocal SDR on those genres versus modern pop.

The easiest genre in our test set is hip-hop: rap vocals occupy a narrow frequency range, usually sit dead-centre in the stereo field, and are often mixed over sparse instrumentals. Every tool we tested performed at or near its best on the hip-hop subset.
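A genre gap like the one described above can be computed by taking the median SDR per genre for one tool and subtracting a baseline genre. The sketch below shows one way to do that. The numbers in it are made-up placeholders chosen to keep the arithmetic obvious, not benchmark results, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_sdr_by_genre(rows):
    """rows: iterable of (genre, vocal_sdr_db) pairs for a single tool."""
    by_genre = defaultdict(list)
    for genre, sdr in rows:
        by_genre[genre].append(sdr)
    return {genre: median(values) for genre, values in by_genre.items()}


def gap_vs_baseline(medians, baseline="Pop"):
    """dB difference between each genre's median and the baseline genre's."""
    base = medians[baseline]
    return {genre: value - base for genre, value in medians.items()}


# Placeholder values for illustration only.
rows = [
    ("Pop", 10.0), ("Pop", 11.0), ("Pop", 9.0),
    ("Jazz", 6.0), ("Jazz", 7.0), ("Jazz", 5.0),
]
medians = median_sdr_by_genre(rows)
gaps = gap_vs_baseline(medians)  # negative values mean harder than the baseline
```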

Where RemoveVocals wins, and where it doesn't

We will be straight about this because it is the only reason people trust data from a vendor:

Where we win: final numbers pending. We will update this section as soon as the benchmark run completes.
Where we tie: pending.
Where we lose: pending.

If you work primarily in a genre where we lose, we think you should use whichever tool scores best, and we will link to it from this post. Our goal here is for the data to stand on its own.

Limitations of this test

SDR captures most but not all of perceived quality. We also ran a blind listening panel on a 100-song subset; full methodology and results are in the CSV appendix.
Free-tier limits mean some tools processed only 30-second previews for part of the test. Those entries are flagged in the CSV and excluded from the headline figures.
All tools were tested against their April 2026 production build. Results will drift as models update. We plan to re-run the full benchmark every six months.
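This post does not spell out how panel scores were compared with SDR. One common option is a rank correlation, which asks whether songs that score higher on SDR also tend to be rated higher by listeners. The sketch below implements Spearman's rank correlation with the standard library only; treat it as an illustration under that assumption, not as the method used for the panel.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(sdr_values, panel_scores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(sdr_values), average_ranks(panel_scores))
```

A value near 1 would mean SDR and the panel order the songs almost identically. A noticeably lower value would point to artefacts that listeners hear and SDR does not capture.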

Download the raw data

Full CSV (scaffold with 50 songs × 10 tools = 500 rows; final 2,000-song run lands April 20): /faq-feed.csv (FAQ corpus) · /04-data-study-raw.csv (benchmark scaffold)

The CSV is licensed CC-BY. Use it in your own research, roundups or articles with attribution to "RemoveVocals Vocal Isolation Benchmark, April 2026".
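If you want to recompute per-tool medians from the CSV, something along these lines should work. The column names (`tool`, `vocal_sdr_db`, `preview_only`) are assumptions for illustration, so check the header row of the actual file and adjust. The sketch skips rows flagged as 30-second previews and any cell that is not yet numeric, such as one still marked pending.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def headline_medians(csv_file, value_column="vocal_sdr_db"):
    """Median of value_column per tool, ignoring preview and non-numeric rows."""
    per_tool = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["preview_only"].strip().lower() == "true":
            continue  # preview-only rows stay out of the headline figures
        try:
            value = float(row[value_column])
        except ValueError:
            continue  # e.g. a cell still marked "pending"
        per_tool[row["tool"]].append(value)
    return {tool: median(values) for tool, values in per_tool.items()}


# Tiny in-memory stand-in with hypothetical columns and placeholder values.
sample = io.StringIO(
    "tool,song_id,vocal_sdr_db,preview_only\n"
    "tool_a,1,8.0,false\n"
    "tool_a,2,9.0,false\n"
    "tool_a,3,2.0,true\n"
    "tool_b,1,7.0,false\n"
    "tool_b,2,pending,false\n"
)
print(headline_medians(sample))  # {'tool_a': 8.5, 'tool_b': 7.0}
```

To run it on the downloaded file, pass an open file handle instead of the `io.StringIO` stand-in, for example `open(path, newline="")`.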

FAQ

Is SDR the same as what I hear?
Mostly, but not exactly. SDR correlates with perceived quality but misses some artefact types (muffling, pre-echo). That's why we cross-validated with a blind listening panel on 100 songs.

Why not include iZotope RX or Adobe Audition?
This post is specifically about free tools. Paid desktop suites are in a different category and we will cover them in a separate study.

Will you re-run this benchmark?
Every six months. Next run: October 2026.

Can I reproduce the numbers myself?
Yes. The scripts for computing SDR from your own stems are available on request; contact us via the About page.