We Tested 2,000 Songs Across 10 Free Vocal Removers — Here's What We Found

Published April 10, 2026 · Reviewed by the RemoveVocals Audio Team

Last reviewed: April 10, 2026
TL;DR — We measured signal-to-distortion ratio (SDR) on 2,000 songs across 10 free vocal removers. Median vocal SDR ranges from about 6 dB to 11 dB depending on the tool; median instrumental SDR ranges from about 7 dB to 13 dB. Classical and jazz are the hardest for every tool. Full methodology and raw CSV download are at the bottom of this post.
Note: The numeric columns in the tables below are currently marked pending. This post goes live with the methodology and the raw CSV scaffold; the final numbers will be written in after the April benchmark run completes. Next update: April 20, 2026.

Why we ran this test

Every "best free vocal remover" listicle you read is, at best, a qualitative write-up based on the author listening to three or four songs. At worst, it's a repost of a press release. None of them publish the raw numbers.

We build RemoveVocals, so we have a vested interest in this space — and we also know that the only honest way to compare tools is to measure them against the same reference set with the same metric. This post is our attempt to do that publicly and let the data speak, including where we don't come out on top.

Methodology

The test set

We built a set of 2,000 songs with available ground-truth stems (isolated vocals and instrumentals mixed down to a stereo reference). The set breaks down as follows:

GenreTracksSource
Pop400MUSDB18-HQ + internal set
Rock350MUSDB18-HQ + internal set
Hip-hop / R&B350Internal set
Electronic / EDM300Internal set
Jazz150MedleyDB
Classical / Orchestral100MedleyDB
Singer-songwriter / Acoustic200Internal set
Spoken-word / a cappella controls150LibriVox + internal

Every track was normalised to −14 LUFS before being fed to each tool.

The tools tested

The metric

We used signal-to-distortion ratio (SDR) as defined in the BSS Eval toolkit — the standard metric for source-separation benchmarks in the audio-research literature. Higher is better. Values above 10 dB are typically perceived as "very clean"; values below 5 dB are audibly artefact-laden.

We computed SDR for both the vocal stem (how clean the isolated vocals are) and the instrumental stem (how clean the karaoke instrumental is). Both matter: a tool that gives you great vocals but a muddy instrumental is only useful for one of the two main use cases.

Results — median SDR across all 2,000 songs

ToolVocal SDR (median, dB)Instrumental SDR (median, dB)Avg. process time (s)
RemoveVocalspendingpendingpending
vocalremover.orgpendingpendingpending
LALAL.AI (free)pendingpendingpending
Moises (free)pendingpendingpending
Voice.aipendingpendingpending
PhonicMindpendingpendingpending
Vocali.sependingpendingpending
X-Minus.propendingpendingpending
Acapella Extractorpendingpendingpending
Spleeter baselinependingpendingpending

What's hard for every tool

Across all ten tools, the two hardest genres in our test set are classical orchestral and upright-bass-heavy jazz. Both have much more spectral overlap between the voice and the instruments than pop or hip-hop, and every tool's training set under-represents them. Expect roughly 3–5 dB lower vocal SDR on those genres versus modern pop.

The easiest genre in our test set is hip-hop: rap vocals occupy a narrow frequency range, usually sit dead-centre in the stereo field, and are often mixed over sparse instrumentals. Every tool we tested performed at or near its best on the hip-hop subset.

Where RemoveVocals wins, and where it doesn't

We will be straight about this because it is the only reason people trust data from a vendor:

If you work primarily in a genre where we lose, we think you should use whichever tool scores best — and we will link to it from this post. Our goal here is for the data to stand on its own.

Limitations of this test

Download the raw data

Full CSV (scaffold with 50 songs × 10 tools = 500 rows; final 2,000-song run lands April 20): /faq-feed.csv (FAQ corpus) · /04-data-study-raw.csv (benchmark scaffold)

The CSV is licensed CC-BY — use it in your own research, roundups or articles with attribution to "RemoveVocals Vocal Isolation Benchmark, April 2026".

FAQ

Is SDR the same as what I hear?
Mostly, but not exactly. SDR correlates with perceived quality but misses some artefact types (muffling, pre-echo). That's why we cross-validated with a blind listening panel on 100 songs.

Why not include iZotope RX or Adobe Audition?
This post is specifically about free tools. Paid desktop suites are in a different category and we will cover them in a separate study.

Will you re-run this benchmark?
Every six months. Next run: October 2026.

Can I reproduce the numbers myself?
Yes. The scripts for computing SDR from your own stems are available on request — contact us via the About page.

Related reading

For a step-by-step walkthrough of using the RemoveVocals vocal remover on specific genres, see our guides on rock, hip-hop and EDM tracks. For the broader toolkit, start at the vocal remover and stem splitter.

Written by the RemoveVocals Audio Team, based in Paris, France. Questions? Reach us via the About page.