Chapter 5
Singing and Background Accompaniment Separation
1. DAMP-VSEP samples to illustrate the challenges
Sample 1
- The provided mixture contains proprietary studio effects (a non-linear operation).
- Vocal and background sources are shifted by 2.5 seconds.
Track | |
---|---|
Mixture | |
Background | |
Vocal | |
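An offset like the 2.5 seconds noted for this sample can be estimated by cross-correlating two signals that share content, for example a source track against the mixture. The following is a minimal sketch of that idea on a toy signal, not the procedure used to prepare these samples. The function names `estimate_lag` and `align`, the 100 Hz sample rate, and the white-noise test signal are assumptions made for illustration. Because the real mixtures contain non-linear effects, a correlation peak on actual data would be less sharp than in this toy case.

```python
import random


def estimate_lag(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` starts later than `reference`.
    Brute-force cross-correlation: fine for short signals, slow for full tracks.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(delayed, lag):
    """Undo a shift of `lag` samples: trim if lag >= 0, zero-pad otherwise."""
    if lag >= 0:
        return delayed[lag:]
    return [0.0] * (-lag) + delayed


# Toy example (hypothetical values): white noise at 100 Hz, delayed by 2.5 s.
SAMPLE_RATE = 100
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_lag = int(2.5 * SAMPLE_RATE)  # 250 samples
delayed = [0.0] * true_lag + reference

lag = estimate_lag(reference, delayed, max_lag=300)
offset_seconds = lag / SAMPLE_RATE
realigned = align(delayed, lag)
```

For full-length recordings, an FFT-based correlation is the usual faster choice; the brute-force loop is kept here only because it makes the lag search explicit.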
Sample 2
- The provided mixture contains proprietary studio effects (a non-linear operation).
- Vocal and background sources are shifted by 1 second.
Track | |
---|---|
Mixture | |
Background | |
Vocal | |
Sample 3
- The provided mixture contains proprietary studio effects (a non-linear operation).
- Vocal and background sources are shifted by 1 second.
- The background source includes vocals from the original artist.
Track | |
---|---|
Mixture | |
Background | |
Vocal | |
2. Recordings used to make the figures.
Figure 5.2 on page 131
Track | |
---|---|
Mixture | |
Vocal Target | |
English | |
English+Duets | |
English+nonEnglish | |
Figure 5.3 on page 132
Track | |
---|---|
Vocal Target | |
Original | |
Remix | |
Figure 5.4 on page 135
Track | |
---|---|
Mixture | |
Vocal Target | |
Baseline | |
Composite Loss | |
Figure 5.5 on page 136
Track | |
---|---|
Mixture | |
Vocal Target | |
Composite Loss | |
3. Samples of separation results using instrumental embeddings.
Track | |
---|---|
Mixture | |
Target | |
Baseline | |
VGGish | |
VGGish 2pass | |
X-vectors | |
X-vectors 2pass | |