Chapter 5
Singing and Background Accompaniment Separation
1. DAMP-VSEP samples to illustrate the challenges
Sample 1
- The provided mixture contains a proprietary studio effect (a non-linear operation).
- Vocal and background sources are shifted by 2.5 seconds.
| Track | |
|---|---|
| Mixture | |
| Background | |
| Vocal | |
Sample 2
- The provided mixture contains a proprietary studio effect (a non-linear operation).
- Vocal and background sources are shifted by 1 second.
| Track | |
|---|---|
| Mixture | |
| Background | |
| Vocal | |
Sample 3
- The provided mixture contains a proprietary studio effect (a non-linear operation).
- Vocal and background sources are shifted by 1 second.
- The background source includes vocals from the original artist.
| Track | |
|---|---|
| Mixture | |
| Background | |
| Vocal | |
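A time shift like the ones noted for the samples above could, for example, be estimated by cross-correlating two signals. The following is a minimal sketch on synthetic data, not the procedure used in this work: the function name `estimate_shift_seconds`, the 100 Hz sample rate, and the noise signals are illustrative assumptions. It recovers only a constant delay and does not account for the non-linear studio effect.

```python
import numpy as np


def estimate_shift_seconds(reference, delayed, sample_rate):
    """Return the lag (in seconds) at which `delayed` best matches `reference`.

    Hypothetical helper for illustration; assumes a single constant delay.
    """
    # Full cross-correlation: output index 0 corresponds to
    # lag = -(len(reference) - 1), so subtract that offset from the peak index.
    corr = np.correlate(delayed, reference, mode="full")
    lag = int(np.argmax(corr)) - (len(reference) - 1)
    return lag / sample_rate


# Synthetic example (assumed values): 10 s of noise at 100 Hz,
# delayed by 250 samples, i.e. 2.5 s.
sample_rate = 100
rng = np.random.default_rng(0)
reference = rng.standard_normal(10 * sample_rate)
delayed = np.concatenate([np.zeros(250), reference])[: len(reference)]

shift = estimate_shift_seconds(reference, delayed, sample_rate)
print(shift)
```

A positive result means the second signal lags the first. For full-length recordings at audio sample rates, an FFT-based correlation would be a more practical choice than the direct computation used here.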
2. Recordings used to make the figures.
Figure 5.2 on page 131
| Track | |
|---|---|
| Mixture | |
| Vocal Target | |
| English | |
| English+Duets | |
| English+nonEnglish | |
Figure 5.3 on page 132
| Track | |
|---|---|
| Vocal Target | |
| Original | |
| Remix | |
Figure 5.4 on page 135
| Track | |
|---|---|
| Mixture | |
| Vocal Target | |
| Baseline | |
| Composite Loss | |
Figure 5.5 on page 136
| Track | |
|---|---|
| Mixture | |
| Vocal Target | |
| Composite Loss | |
3. Samples of separation results using instrumental embeddings.
| Track | |
|---|---|
| Mixture | |
| Target | |
| Baseline | |
| VGGish | |
| VGGish 2pass | |
| X-vectors | |
| X-vectors 2pass | |