Keynote presentation summaries and speakers' bio-sketches
Multi-Microphone Speaker Localization on Manifolds: Achievements and Challenges
Speech enhancement is a core problem in audio signal processing, with commercial applications in devices as diverse as mobile phones, conference call systems, hands-free systems, and hearing aids. An essential component in the design of speech enhancement algorithms is acoustic source localization. Speaker localization is also directly applicable to many other audio-related tasks, e.g. automated camera steering, teleconferencing systems and robot audition.
From a signal processing perspective, speaker localization is the task of mapping multichannel speech signals to 3-D source coordinates. To obtain a viable solution to this mapping problem, an accurate description of the source wave propagation, captured by the respective acoustic channel, is required. The acoustic channels in reverberant environments represent a complex reflection pattern stemming from the surfaces and objects characterizing the enclosure. Hence, they are usually modelled by a very large number of coefficients, resulting in an intricate high-dimensional representation.
We start our talk by analyzing these acoustic responses with nonlinear dimensionality reduction techniques (diffusion maps). We claim that in static acoustic environments, despite the high-dimensional representation, the difference between acoustic channels is mainly attributed to changes in the source position. Thus, the true intrinsic dimension of the variations of the acoustic channels is significantly smaller than the number of variables commonly used for their representation; that is, the channels lie on a low-dimensional manifold that can be inferred from data collected in a training stage. This claim is validated by a comprehensive experimental study in actual acoustic environments.
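The idea of recovering a low-dimensional parametrization from high-dimensional samples can be illustrated with a minimal diffusion-maps sketch. This is a toy example, not the speaker's implementation: the function name `diffusion_map`, the Gaussian kernel with a median-based scale, and the synthetic 50-dimensional data (standing in for acoustic responses controlled by a single position parameter `s`) are all assumptions made for illustration.

```python
import numpy as np

def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Toy diffusion-maps embedding of the rows of X (hypothetical helper)."""
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if epsilon is None:
        epsilon = np.median(D2)  # assumed heuristic for the kernel scale
    K = np.exp(-D2 / epsilon)    # Gaussian affinity matrix
    d = K.sum(axis=1)            # node degrees
    # Symmetric normalization; shares eigenvalues with the Markov matrix D^-1 K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of the Markov matrix.
    psi = vecs / np.sqrt(d)[:, None]
    # Skip the trivial constant eigenvector (eigenvalue 1).
    coords = psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t
    return coords, vals

# Synthetic data: a 1-D curve (parameter s) embedded isometrically in 50 dimensions.
s = np.linspace(0.0, np.pi, 150)
curve = np.column_stack([np.cos(s), np.sin(s), 0.5 * s])
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((50, 3)))
X_high = curve @ Q.T  # shape (150, 50), same pairwise distances as `curve`

coords, vals = diffusion_map(X_high, n_components=1)
```

In this toy setting the single diffusion coordinate in `coords` varies smoothly with the hidden parameter `s`, even though each sample has 50 entries.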
Motivated by this result, we present a data-driven and semi-supervised source localization algorithm based on two-microphone measurements, which accurately recovers the inverse mapping between the acoustic samples and their corresponding locations. The gist of the algorithm is based on the concept of manifold regularization in a reproducing kernel Hilbert space (RKHS), which extends the standard supervised estimation framework by adding an extra regularization term, imposing a smoothness constraint on possible solutions with respect to a manifold learned in a data-driven manner.
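For readers unfamiliar with manifold regularization, a commonly used form of the objective is sketched below. The notation is an assumption for illustration and need not match the talk: $\mathbf{h}_i$ denotes a feature vector computed from the two-microphone measurements, $p_i$ a known source coordinate for each of the $n_L$ labelled samples, $\mathbf{f}$ the vector of values of $f$ at all $n$ labelled and unlabelled training samples, and $\mathbf{L}$ a graph Laplacian built from those samples.

```latex
\hat{f} = \arg\min_{f \in \mathcal{H}_k}
  \frac{1}{n_L} \sum_{i=1}^{n_L} \bigl( p_i - f(\mathbf{h}_i) \bigr)^2
  + \gamma_k \, \| f \|_{\mathcal{H}_k}^2
  + \gamma_M \, \mathbf{f}^{\top} \mathbf{L} \, \mathbf{f},
\qquad
\hat{f}(\mathbf{h}) = \sum_{i=1}^{n} a_i \, k(\mathbf{h}_i, \mathbf{h})
```

The first two terms form the standard supervised (regularized least-squares) estimator; the third term is the extra penalty that favors solutions varying smoothly along the learned manifold. By the representer theorem, the minimizer is a kernel expansion over all $n$ training samples, as written on the right.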
We then show that the mapping operator between the acoustic channel and the source location can be estimated using a Bayesian inference framework, and discuss the analogy between this Bayesian approach and the manifold regularization approach. This Bayesian framework serves as a cornerstone for extending the single-node (microphone pair) setup to an ad hoc network of microphone pairs. Each node represents a different viewpoint that may be associated with a specific manifold. Merging the different manifolds is shown to increase the spatial separation and to improve the ability to accurately localize the source.
We conclude the talk by discussing future challenges, e.g. source tracking, multiple-source localization and speech enhancement.
Sharon Gannot received his B.Sc. degree (summa cum laude) from the Technion-Israel Institute of Technology, Haifa, Israel, in 1986 and the M.Sc. (cum laude) and Ph.D. degrees from Tel-Aviv University, Israel, in 1995 and 2000, respectively, all in Electrical Engineering. In 2001 he held a post-doctoral position at the Department of Electrical Engineering (ESAT-SISTA) at K.U.Leuven, Belgium. From 2002 to 2003 he held a research and teaching position at the Faculty of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel. Currently, he is a Full Professor at the Faculty of Engineering, Bar-Ilan University, Israel, where he heads the Speech and Signal Processing laboratory and the Signal Processing Track.
Prof. Gannot served as an Associate Editor of the EURASIP Journal on Advances in Signal Processing from 2003 to 2012, and as an Editor of several special issues on Multi-microphone Speech Processing of the same journal. He has also served as a guest editor of the ELSEVIER Speech Communication and Signal Processing journals.
Prof. Gannot served as an Associate Editor of IEEE Transactions on Audio, Speech, and Language Processing from 2009 to 2013. Currently, he is a Senior Area Chair of the same journal. He also serves as a reviewer for many IEEE journals and conferences.
Prof. Gannot's research interests include multi-microphone speech processing, specifically distributed algorithms for ad hoc microphone arrays for noise reduction and speaker separation; dereverberation; single-microphone speech enhancement; and speaker localization and tracking.
Latent variable models for the inference of ancestry and natural selection in Humans
Estimating genetic ancestry and understanding the action of natural selection in genomes are major objectives of evolutionary biology and have important implications for human health. Latent variable models have a very long tradition in this scientific area, and they have become one of the most prominent techniques for analyzing massive genomic data sets. This talk will survey the history of the application of latent variable models in population genetics, emphasizing the connection between statistical factors and the biological notion of genetic ancestry. It will provide key insights into recent developments in the field, and their relationships to parallel developments in other fields such as text or image analysis. My presentation will provide several examples of latent variable modeling in population genetics, with applications to detecting fine-scale population structure and targets of natural selection in humans.
José Bioucas-Dias
Blind Hyperspectral Unmixing
Hyperspectral cameras (HSCs) acquire spectral vectors with hundreds or thousands of components from each pixel in a surface or scene. The wide spectral coverage and high spectral resolution of these instruments enable fine material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis.
However, due to the typically low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, the spectra measured by HSCs are mixtures of the spectra of the materials present in the scene under analysis (termed endmembers). Hyperspectral unmixing (HU) is a blind source separation problem, which aims at estimating the number of endmembers, their spectral signatures, and the material abundances at each pixel. HU is a challenging ill-posed inverse problem owing to model inaccuracies, highly correlated spectral signatures, mixing nonlinearities, observation noise, atmospheric perturbations, and endmember variability.
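As a point of reference, the widely used linear mixing model can be written as follows. This is the simplest special case and ignores the nonlinear mixing effects mentioned above; the symbols are assumptions for illustration: $\mathbf{y} \in \mathbb{R}^{B}$ is the spectrum measured at one pixel over $B$ bands, the columns of $\mathbf{M} \in \mathbb{R}^{B \times p}$ are the $p$ endmember signatures, $\boldsymbol{\alpha}$ holds the abundances, and $\mathbf{n}$ is noise.

```latex
\mathbf{y} = \mathbf{M} \boldsymbol{\alpha} + \mathbf{n},
\qquad \boldsymbol{\alpha} \geq \mathbf{0},
\qquad \mathbf{1}^{\top} \boldsymbol{\alpha} = 1
```

Stacking all $N$ pixels as columns gives $\mathbf{Y} = \mathbf{M}\mathbf{A} + \mathbf{N}$. In the blind setting, $p$, $\mathbf{M}$, and $\mathbf{A}$ are all unknown and must be estimated from $\mathbf{Y}$ alone.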
In this talk I will review the blind HU inverse problem, which is formally similar to blind source separation problems that appear in, for example, chemometrics, topic modeling, and audio. I will address key developments, which include pure pixel search, convex geometry, dictionary-based sparse regression, and nonnegative matrix factorization. Mathematical problems and potential solutions are described from the viewpoint of signal processing theory and methods. Algorithm characteristics are illustrated.
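Of the approaches listed, nonnegative matrix factorization is the easiest to sketch in a few lines. The example below is a minimal illustration using the classical multiplicative updates for the Frobenius-norm cost, not the algorithms presented in the talk. The function name `nmf_unmix`, the synthetic data, and all parameter values are assumptions, and the abundance sum-to-one constraint is deliberately not enforced.

```python
import numpy as np

def nmf_unmix(Y, p, n_iter=500, eps=1e-9, seed=0):
    """Factor Y (bands x pixels) as M @ A with M, A >= 0 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    bands, pixels = Y.shape
    M = rng.random((bands, p)) + eps   # endmember signature estimates
    A = rng.random((p, pixels)) + eps  # abundance estimates
    errors = [np.linalg.norm(Y - M @ A)]  # reconstruction error before any update
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        A *= (M.T @ Y) / (M.T @ M @ A + eps)
        M *= (Y @ A.T) / (M @ A @ A.T + eps)
        errors.append(np.linalg.norm(Y - M @ A))
    return M, A, errors

# Synthetic scene: 3 endmembers, 20 spectral bands, 50 pixels, no noise.
rng = np.random.default_rng(1)
M_true = rng.random((20, 3))
A_true = rng.dirichlet(np.ones(3), size=50).T  # columns are nonnegative and sum to one
Y = M_true @ A_true

M_est, A_est, errors = nmf_unmix(Y, p=3)
```

The factors are recovered only up to permutation and scaling of the endmembers, and the result depends on the random initialization, which is one reason additional constraints or geometric priors are commonly added in practice.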
José Bioucas-Dias received the EE, MSc, PhD, and Habilitation degrees from Instituto Superior Técnico (IST), Portugal, in 1985, 1991, 1995, and 2007, respectively, all in electrical and computer engineering. Since 1995, he has been with the Department of Electrical and Computer Engineering, IST, where he is an Associate Professor. He is also a Senior Researcher with the Pattern and Image Analysis group of the Instituto de Telecomunicações, which is a private non-profit research institution.
José Bioucas-Dias has made scientific contributions to inverse problems, signal and image processing, pattern recognition, optimization, and remote sensing. He has been involved in several IEEE editorial activities and has been a member of program/technical committees of several international conferences. He is an IEEE Fellow and was included in Thomson Reuters' Highly Cited Researchers 2015 list.