Postdoctoral Researcher  |  Li Su
 
Research Descriptions
 

Content analysis of polyphonic music is arguably one of the most challenging tasks in computer audition, as it tackles the complexity of sound mixtures whose overlapping harmonic components spread over a wide frequency range. This project proposes two research directions to overcome these hurdles.

The first is to find novel signal representations for music by using advanced time-frequency analysis and complex-valued signal processing techniques. The second is to design robust feature learning methods based on dictionary learning and sparse coding techniques. From these viewpoints we focus on challenging MIR problems, including soft onset detection, playing technique classification, multipitch estimation, and multiple instrument recognition. The findings from these investigations are expected to improve machine understanding of musical signals in a semantic sense, and will benefit various applications including automatic transcription, online music recommendation, source separation, performance analysis, computational musicology, and music education.
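As a minimal illustration of the sparse coding direction, the sketch below encodes toy spectrogram frames over a fixed random dictionary using greedy orthogonal matching pursuit. The data, the dictionary, and the atom count are all placeholders for illustration; a real system would learn the dictionary from audio, e.g. with K-SVD or online dictionary learning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy magnitude-spectrogram-like data: 200 frames x 64 frequency bins.
X = np.abs(rng.normal(size=(200, 64)))

# A fixed random dictionary of 32 unit-norm atoms (placeholder for a
# learned dictionary).
D = np.abs(rng.normal(size=(32, 64)))
D /= np.linalg.norm(D, axis=1, keepdims=True)

def sparse_code_omp(x, D, n_nonzero=4):
    """Greedy orthogonal matching pursuit: approximate x using at most
    n_nonzero dictionary atoms."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[0])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit the coefficients of all selected atoms jointly.
        sol, *_ = np.linalg.lstsq(D[support].T, x, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = x - D.T @ coef
    return coef

codes = np.array([sparse_code_omp(x, D) for x in X])
# Each frame is now a 32-dim sparse feature vector usable downstream,
# e.g. for classification or tagging.
```

The design point is that sparsity turns each raw spectral frame into a small set of active atoms, which is what makes the representation interpretable and robust for the MIR tasks listed above.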

In the following, my research projects are described in three areas, namely expression-level music content analysis, sparse representations for music signals, and automatic music transcription.

Expression-level music content analysis
Expressiveness is a crucial factor that distinguishes music as interpreted by musicians from a series of robotic notes. It colors the emotional, musicological, and cultural content of music. Understanding expressiveness can also help in music education and interactive entertainment. I am interested in expression-level concepts, including playing techniques (of bowed string instruments and guitar) and playing faults, which can be explicitly related to signal-level representations.
Sparse representations for music signals 
This project investigates sparse representation techniques to obtain a detailed representation of music using (possibly large-scale) dictionaries. Envisioned applications include music classification/tagging, multi-pitch extraction, singer identification, and audio source separation, among others. For example, in 2013 we found that the group-delay function (GDF) and instantaneous frequency deviation (IFD) are very informative quantities for characterizing audio signal content that conventional features might miss. With our feature learning system, these phase-derived features were shown to be helpful in playing technique classification and singer identification, both of which are considered difficult with conventional approaches. Our sparse coding scheme for MIR problems is now open source.
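As a rough sketch of how an IFD-style feature can be computed (the exact formulation used in our work may differ), the following estimates, for each frequency bin, the deviation of the instantaneous frequency from the bin's center frequency via the phase difference between consecutive STFT frames. The FFT size, hop, and test tone are illustrative choices.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Naive STFT: Hann-windowed frames, one complex spectrum per frame."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

def ifd(spec, n_fft=1024, hop=256, sr=44100):
    """Instantaneous frequency deviation: the frame-to-frame phase
    advance, minus each bin's expected advance, rescaled to Hz."""
    bin_freq = np.fft.rfftfreq(n_fft, d=1.0 / sr)       # bin centers in Hz
    dphi = np.angle(spec[1:]) - np.angle(spec[:-1])
    expected = 2 * np.pi * bin_freq * hop / sr          # expected advance
    dev = np.angle(np.exp(1j * (dphi - expected)))      # wrap to (-pi, pi]
    return dev * sr / (2 * np.pi * hop)                 # deviation in Hz

sr = 44100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)   # a pure 440 Hz test tone
dev = ifd(stft(x))
# Near the bin closest to 440 Hz, dev recovers the tone's offset from
# that bin's center frequency.
```

Intuitively, for a stationary partial the deviation is constant across frames, while noisy or transient regions show erratic values, which is what makes IFD discriminative for timbre-related tasks.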

Automatic music transcription 
Automatic music transcription is the process of converting an audio music signal into some form composed of musical notes, instruments, and even expressive information. Ultimately, we aim for a system that can convert an audio music signal into a score sheet. This task is challenging since most real-world music is polyphonic; in this case, transcribing all concurrent pitches with accurate onsets and offsets remains a fundamental challenge. I am therefore interested in multipitch estimation (MPE), onset detection, offset detection, and more specific topics such as octave dual-tone detection.
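A toy version of one of these subtasks, onset detection, can be sketched with half-wave-rectified spectral flux and a naive peak-picking rule. The frame sizes, threshold rule, and synthetic test signal below are illustrative assumptions, not the method used in this project.

```python
import numpy as np

def spectral_flux_onsets(x, sr=22050, n_fft=1024, hop=512, delta=1.0):
    """Toy onset detector: half-wave-rectified spectral flux followed by
    local-maximum picking above a mean + delta*std threshold."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    mag = np.abs(np.array([np.fft.rfft(f) for f in frames]))
    # Sum only the positive frame-to-frame magnitude changes per bin.
    flux = np.maximum(mag[1:] - mag[:-1], 0.0).sum(axis=1)
    thresh = flux.mean() + delta * flux.std()
    peaks = [i for i in range(1, len(flux) - 1)
             if flux[i] > thresh
             and flux[i] >= flux[i - 1] and flux[i] >= flux[i + 1]]
    return [(i + 1) * hop / sr for i in peaks]          # onset times in s

# Synthetic test: two tone bursts starting at 0.5 s and 1.5 s.
sr = 22050
x = np.zeros(2 * sr)
t = np.arange(sr // 2) / sr
x[sr // 2: sr // 2 + len(t)] += np.sin(2 * np.pi * 440 * t)
x[3 * sr // 2:] += np.sin(2 * np.pi * 660 * t)
times = spectral_flux_onsets(x, sr)   # should lie near 0.5 s and 1.5 s
```

Note that the half-wave rectification ignores energy decreases, which is why a detector of this family needs a separate mechanism for offset detection, one of the subtasks listed above.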

 
 