Montreux Jazz “ANUBIS” project - LTS5 - EPFL

Spatial filtering tests

Spatial filtering for stereo audio files: (py)DEMIX + Beamformer versus (py)FASST

We use a Python/NumPy implementation of the DEMIX algorithm [Arberet2010] to estimate the inter-channel delays and gains, and FASST [Ozerov2012] for the multichannel NMF algorithm that separates the source contributions. We first estimate the spatial parameters of each audio file with DEMIX, assuming an anechoic mixture. Using only these parameters, a beamformer (for instance [Maazaoui2011]) then computes the corresponding separated sources (“Spatially filtered”). Finally, the same parameters initialize the spatial parameters of the multichannel NMF model in FASST, which re-estimates both the spatial and spectral parameters before separating the signals with an adaptive Wiener filter.
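The spatial-filtering step can be illustrated with a minimal NumPy sketch. It is not the pyDEMIX code, nor the fixed HRTF beamformer of [Maazaoui2011]: it assumes a stereo anechoic model in which source j has gains (cos θ_j, sin θ_j) and an inter-channel delay δ_j in samples, and it swaps in a plain pseudo-inverse (null-steering) beamformer applied per frequency bin. The function names, the angle parametrization, the circular delays, and the use of one FFT over the whole signal instead of an STFT are all assumptions made to keep the example short. The delays 10 and 9 are borrowed from the table purely as illustrative values.

```python
import numpy as np


def steering_matrix(thetas, delays, n_samples):
    """Anechoic stereo mixing matrix A[f, channel, source] (hypothetical helper).

    Assumed model: channel 0 receives cos(theta_j) * s_j(t),
    channel 1 receives sin(theta_j) * s_j(t - delay_j).
    """
    thetas = np.asarray(thetas, dtype=float)
    delays = np.asarray(delays, dtype=float)
    k = np.arange(n_samples // 2 + 1)  # rfft bin indices
    # A delay of d samples multiplies bin k by exp(-2i*pi*k*d/N).
    phase = np.exp(-2j * np.pi * np.outer(k, delays) / n_samples)
    A = np.empty((k.size, 2, thetas.size), dtype=complex)
    A[:, 0, :] = np.cos(thetas)
    A[:, 1, :] = np.sin(thetas) * phase
    return A


def spatial_filter(x, thetas, delays):
    """Separate a stereo mixture x (shape (2, N)) using only gains and delays."""
    n = x.shape[1]
    X = np.fft.rfft(x, axis=1)            # (2, F)
    A = steering_matrix(thetas, delays, n)
    W = np.linalg.pinv(A)                 # (F, sources, 2): one beamformer per bin
    S = np.einsum("fjc,cf->jf", W, X)     # apply the demixing filters
    return np.fft.irfft(S, n=n, axis=1)


# Synthetic check: two noise sources mixed with known (assumed) parameters.
rng = np.random.default_rng(0)
n = 1024
s = rng.standard_normal((2, n))
thetas = [0.3, 1.2]   # illustrative gain angles, in radians
delays = [10, 9]      # illustrative delays, in samples

x = np.zeros((2, n))
for j in range(2):
    x[0] += np.cos(thetas[j]) * s[j]
    x[1] += np.sin(thetas[j]) * np.roll(s[j], delays[j])  # circular delay

s_hat = spatial_filter(x, thetas, delays)
```

With two sources and two channels the pseudo-inverse reduces to an exact inverse, so this toy mixture is recovered up to numerical precision. On real recordings the anechoic model holds only approximately, which is why the spatially filtered outputs are then refined by re-estimating the model in FASST.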

Name (delays between sources)	Algorithm	Source 1	Source 2
Tamy (10, 9)	original
Tamy (10, 9)	Beamforming	voice	guitar

[Arberet2010] Arberet, S.; Gribonval, R. & Bimbot, F., “A Robust Method to Count and Locate Audio Sources in a Multichannel Underdetermined Mixture”, IEEE Transactions on Signal Processing, 2010, 58, 121-133.

[Maazaoui2011] Maazaoui, M.; Grenier, Y. & Abed-Meraim, K., “Blind Source Separation for Robot Audition using Fixed Beamforming with HRTFs”, in Proc. of INTERSPEECH, 2011.

[Ozerov2012] Ozerov, A.; Vincent, E. & Bimbot, F., “A general flexible framework for the handling of prior information in audio source separation”, IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20, 1118-1133.