Responsiveness of Audio files to Vocal isolation...
Lately it has dawned on me that not every extraction model is suited for every song.
(In the beginning I used Spleeter, but that is barely relevant nowadays.)
The “go-to” suite imo is “UVR5”, a.k.a. Ultimate Vocal Remover 5. However, even that has its demerits, as it can cause voice shadows in certain language dubs (German, for example) or crackles in overly frequency-heavy song sections.
There are other methods, which are applicable depending on the track. To name each model with an example of where it worked well:
DemucsV3: “Kaiba Mokuba Conv” ( youtu.be/watch?v=BlYwpj1k7iI ) => Was able to lift even German vocals away from the mixdown without harming the result
DemucsV3: “Reversal” ( youtu.be/Gyyd1lSVLU0 ) => Left the instruments unharmed where all other methods struggled (!)
DANNA Sept: “KKJ Transformation 2” => Could split the source track into many stems, which allowed me to reapply the chanting at the beginning (which UVR5, for example, would eradicate)
DNR: “Sad Historie” ( youtu.be/watch?v=UERPGzUwK-Q ) and “Champion of Smiles” ( youtu.be/watch?v=0GK4qyI58JM ) => Excellent at removing sound effects after splicing, and even after UVR5 usage. ==> Was able to remove the rain sfx in “Sad Historie” and the keyboard typing in CoS almost perfectly
MDX-B: “Consolation” ( youtu.be/watch?v=DbF2pl9PkqQ ) => Very rare, almost “original quality” display of vox removal (sfx remain for now)
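For my own bookkeeping, the notes above could be written down as data, so that next time I can look up which model to try first for a given problem. This is only a rough sketch: the structure, the problem labels and the names NOTES and models_for are made up for illustration and do not come from any of the tools mentioned.

```python
# The notes from this post as a small lookup table.
# NOTES, models_for and the "good_for" labels are hypothetical,
# invented for this sketch; they are not part of UVR5, Demucs or any other tool.
NOTES = [
    {"model": "DemucsV3", "track": "Kaiba Mokuba Conv", "good_for": "dubbed vocals"},
    {"model": "DemucsV3", "track": "Reversal", "good_for": "clean instruments"},
    {"model": "DANNA Sept", "track": "KKJ Transformation 2", "good_for": "many stems"},
    {"model": "DNR", "track": "Sad Historie", "good_for": "sfx removal"},
    {"model": "DNR", "track": "Champion of Smiles", "good_for": "sfx removal"},
    {"model": "MDX-B", "track": "Consolation", "good_for": "vocal removal quality"},
]


def models_for(problem):
    """Return the models noted for a problem, in order and without duplicates."""
    seen = []
    for note in NOTES:
        if note["good_for"] == problem and note["model"] not in seen:
            seen.append(note["model"])
    return seen


if __name__ == "__main__":
    print(models_for("sfx removal"))
```

An empty result just means I have no note for that problem yet, not that no model can handle it.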
So, as it seems, there is no “magic bullet” for getting rid of vocals before resplicing. (Some songs actually depend on vox removal prior to repiecing, as there is almost nothing to work with in the original state.)
I personally prefer to work with unaltered audio material, directly from the show in question, but in the end, sometimes one gotta do what one gotta do.
- diarykeeper
Edit (4 Oct 2022): For a ranking of models, see:
paperswithcode.com/sota/music-source-separation-on-musdb18
mvsep.com/quality_checker/leaderboard.php