■ Speaker #1: Prof. Shoichi Koyama (The University of Tokyo, Japan)
- Sound Field Analysis and Synthesis: Theoretical Advances and Applications to Spatial Audio Reproduction
Sound field analysis and synthesis are fundamental techniques in spatial acoustic signal processing, aimed at estimating or synthesizing an acoustic field with a discrete set of microphones or loudspeakers. These techniques are essential for the visualization and auralization of sound fields, VR/AR audio, creating personal sound zones, canceling noise over a regional space, and so forth. Conventional techniques are largely based on boundary integral representations derived from the Helmholtz equation, such as the Kirchhoff-Helmholtz and Rayleigh integrals. In recent years, machine learning techniques that incorporate the characteristics of acoustic fields, referred to as wavefield-based machine learning (WBML), have evolved in this research field. WBML has the potential to further enhance the performance and applicability of sound field analysis and synthesis. This overview talk will introduce these recent advances; in particular, kernel methods for sound field estimation and their application to spatial audio reproduction will be highlighted.
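To make the kernel-method idea concrete, the following is a minimal sketch of sound field estimation by kernel ridge regression, using the zeroth-order spherical Bessel kernel j0(k‖r − r′‖), which satisfies the 3D Helmholtz equation. The plane-wave test field, microphone layout, and regularization value are illustrative assumptions, not details from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
c, f = 343.0, 500.0
k = 2 * np.pi * f / c                        # wavenumber (rad/m)

# 30 microphones at random positions inside a 0.5 m cube
mics = rng.uniform(-0.25, 0.25, size=(30, 3))

# synthetic field to be estimated: a single plane wave from direction n
n = np.array([0.6, 0.8, 0.0])
def field(r):
    return np.exp(-1j * k * r @ n)
p_obs = field(mics)                          # microphone observations

# kernel satisfying the Helmholtz equation: j0(k d) = sinc(k d / pi)
def kernel(a, b):
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.sinc(k * d / np.pi)

lam = 1e-3                                   # Tikhonov regularization
K = kernel(mics, mics)
alpha = np.linalg.solve(K + lam * np.eye(len(mics)), p_obs)

# interpolate the pressure at an unobserved point
r_new = np.array([[0.10, -0.05, 0.12]])
p_est = (kernel(r_new, mics) @ alpha)[0]
print(abs(p_est - field(r_new)[0]))          # interpolation error
```

With microphone spacing well below half a wavelength, the regression recovers the pressure at unobserved points with small error; the Helmholtz-constrained kernel is what distinguishes this from generic interpolation.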
- Shoichi Koyama received the B.E., M.S., and Ph.D. degrees from the University of Tokyo, Tokyo, Japan, in 2007, 2009, and 2014, respectively. In 2009, he joined Nippon Telegraph and Telephone (NTT) Corporation, Tokyo, Japan, as a Researcher in acoustic signal processing. In 2014, he moved to the University of Tokyo, where he has been a Lecturer since 2018. From 2016 to 2018, he was also a Visiting Researcher with Paris Diderot University (Paris 7), Institut Langevin, Paris, France. His research interests include audio signal processing, acoustic inverse problems, and spatial audio. He was the recipient of the Itakura Prize Innovative Young Researcher Award from the Acoustical Society of Japan in 2015 and the Research Award from the Funai Foundation for Information Technology in 2018.
■ Speaker #2: Prof. Yusuke Hioka (University of Auckland, New Zealand)
- Audio Signal Processing for Unmanned Aerial Vehicle Audition
With the rapid advancement of UAV technologies and their expanding capabilities, new applications of unmanned aerial vehicles (UAVs, a.k.a. drones) have been actively explored over the last decade. One such emerging application is the use of UAVs for recording sound, i.e., equipping UAVs with an “auditory” function, which has huge potential to deliver both commercial and societal benefits across industries and sectors such as filming/broadcasting, monitoring/surveillance, and search/rescue. However, a major challenge in UAV audition is the extensive noise generated by the UAV’s propellers, known as ego noise, which significantly deteriorates the quality of sound recorded on the UAV. Research in signal processing for UAV audition has been actively conducted to address this challenge. This talk will overview recent studies on audio signal processing for UAV audition, with a particular focus on techniques that emphasise sound from a target source while minimising the propeller noise. The talk will also introduce a case study in which such technology is applied to commercial products.
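As a toy illustration of emphasising a target source while suppressing directional ego noise, here is a minimal single-frequency-bin MVDR beamformer on synthetic data. The array geometry, source directions, and the assumption that noise-only frames are available for covariance estimation are all hypothetical; practical UAV audition systems are considerably more involved:

```python
import numpy as np

rng = np.random.default_rng(1)
M, f, c = 8, 1000.0, 343.0                       # mics, analysis freq (Hz), sound speed
pos = np.arange(M)[:, None] * np.array([0.05, 0.0, 0.0])  # 5 cm linear array

def steer(deg):
    """Far-field steering vector for a source at azimuth `deg`."""
    n = np.array([np.cos(np.deg2rad(deg)), np.sin(np.deg2rad(deg)), 0.0])
    return np.exp(-2j * np.pi * f / c * pos @ n)

d_t, d_n = steer(60), steer(-30)                 # target / propeller-noise directions
T = 4000
s = rng.standard_normal(T) * np.exp(1j * rng.uniform(0, 2 * np.pi, T))        # target
v = 4.0 * rng.standard_normal(T) * np.exp(1j * rng.uniform(0, 2 * np.pi, T))  # louder noise
eps = 0.05 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))
noise = np.outer(d_n, v) + eps                   # ego noise + sensor noise
x = np.outer(d_t, s) + noise                     # observed mixture (one STFT bin)

# noise covariance from noise-only frames (assumed identifiable, e.g. speech pauses)
R = noise @ noise.conj().T / T
w = np.linalg.solve(R, d_t)
w /= d_t.conj() @ w                              # MVDR: distortionless toward target
y = w.conj() @ x                                 # beamformer output

snr_in = np.mean(np.abs(s) ** 2) / np.mean(np.abs(noise[0]) ** 2)
snr_out = np.mean(np.abs(s) ** 2) / np.mean(np.abs(w.conj() @ noise) ** 2)
print(10 * np.log10(snr_in), 10 * np.log10(snr_out))
```

The beamformer places a spatial null toward the propeller direction while keeping unit gain toward the target, so the output SNR is far higher than at any single microphone.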
- Yusuke Hioka is a Senior Lecturer at the Acoustics Research Centre of the Department of Mechanical Engineering, the University of Auckland, Auckland, New Zealand. He received his B.E., M.E., and Ph.D. degrees in engineering from Keio University, Yokohama, Japan, in 2000, 2002, and 2005, respectively. From 2005 to 2012, he was with the NTT Cyber Space Laboratories, Nippon Telegraph and Telephone Corporation (NTT), Tokyo, Japan. From 2010 to 2011, he was also a visiting researcher at Victoria University of Wellington, New Zealand. In 2013, he moved permanently to New Zealand and was appointed a Lecturer at the University of Canterbury, Christchurch. In 2014, he moved to his current position at the University of Auckland, where he is also the Co-director of the Acoustics Research Centre and leads its Communication Acoustics Laboratory. His research interests include audio and acoustic signal processing, room acoustics, human auditory perception, and psychoacoustics. He is a Senior Member of the IEEE and a Member of the Acoustical Society of Japan and the Acoustical Society of New Zealand. Since 2016, he has been serving as the Chair of the IEEE New Zealand Signal Processing & Information Theory Chapter.
■ Speaker #3: Prof. Daichi Kitamura (National Institute of Technology, Kagawa College, Japan)
- Blind Audio Source Separation Based on Time-Frequency Structure Models
Blind source separation (BSS) for audio signals is a technique to extract specific audio sources from an observed mixture signal. In particular, multichannel determined BSS has been studied for many years because of its capability: the separation can be achieved by a linear operation (multiplication by a demixing matrix), and the quality of the estimated sources is much better than that of other, non-linear BSS algorithms. Determined BSS algorithms have their roots in independent component analysis (ICA), which assumes statistical independence among the sources and estimates the demixing matrix. ICA was later extended to independent low-rank matrix analysis (ILRMA) by introducing a low-rank time-frequency structure model for each source. With the advent of ILRMA, the combination of “demixing matrix estimation for linear BSS” and “time-frequency structure models for each source” has become a reliable approach to audio BSS problems. In this talk, we focus on a new flexible BSS algorithm called time-frequency-masking-based BSS (TFMBSS). In this method, thanks to a model-independent optimization algorithm, arbitrary time-frequency structure models can easily be utilized to estimate the demixing matrix in a plug-and-play manner. In addition to the theoretical basis of this algorithm, some TFMBSS applications combining group sparsity, harmonicity, or smoothness in the time-frequency domain will be reviewed.
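To illustrate the “separation by multiplication of a demixing matrix” idea at the root of determined BSS, below is a minimal natural-gradient ICA sketch on a synthetic instantaneous 2×2 mixture. This is not ILRMA or TFMBSS themselves, which operate on multichannel time-frequency representations; the mixing matrix, source distribution, and step size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 20000
s = rng.laplace(size=(2, T))                     # two super-Gaussian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                       # unknown mixing matrix
x = A @ s                                        # observed 2-channel mixture

W = np.eye(2)                                    # demixing matrix estimate
mu = 0.1                                         # step size
for _ in range(200):
    y = W @ x
    phi = np.tanh(y)                             # score function for super-Gaussian sources
    W += mu * (np.eye(2) - phi @ y.T / T) @ W    # natural-gradient ICA update

y = W @ x                                        # separation is one linear operation
# up to permutation and scaling, W @ A approaches a scaled permutation matrix
print(np.round(W @ A, 2))
```

The global matrix W @ A ending up near diagonal (up to scale and permutation) is exactly the well-known scale/permutation ambiguity of ICA that ILRMA's source models help resolve across frequency bins.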
- Daichi Kitamura received the Ph.D. degree from SOKENDAI, Hayama, Japan. He joined The University of Tokyo in 2017 as a Research Associate and moved to the National Institute of Technology, Kagawa College, as an Assistant Professor in 2018. His research interests include audio source separation, statistical signal processing, and machine learning. He was the recipient of the Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) in 2015, the Ikushi Prize from the Japan Society for the Promotion of Science in 2017, the Best Paper Award from the IEEE Signal Processing Society Japan in 2017, the Itakura Prize Innovative Young Researcher Award from the ASJ in 2018, and the Young Author Best Paper Award from the IEEE Signal Processing Society.