Hi! I am a researcher by profession. On a regular day, I use machine learning techniques to analyze and understand the contents of audio, speech, and text.
I obtained my PhD in 2020 from Tampere University, Finland, where I was supervised by Prof. Tuomas Virtanen. As part of my doctoral studies, I worked on numerous topics in the domain of computational auditory scene analysis (CASA). My PhD topic, specifically, was sound event localization, detection, and tracking using deep learning methods.
During my doctoral studies, I spent the summer of 2017 as an intern at Adobe Research, working on the visualization of spatial audio for virtual reality (VR) applications with Gautham Mysore and Stephen DiVerdi. In the winter of 2017, the Nokia Foundation awarded me the Nokia Scholarship to encourage my efficient and fast-progressing doctoral research. In the summer of 2018, I interned at Facebook Reality Labs (FRL), where I mainly collaborated with Vladimir Tourbabin and Haytham Fayek.
Before diving into doctoral studies, I worked in industry for over five years, exploring various research topics related to audio signal processing, music information retrieval (MIR), and speech recognition. At my first company, SensiBol, I worked on a multitude of projects related to speech recognition and MIR; details and demos of these are listed on my demos page. Thereafter, I worked at ZAPR, a media-tech company, where we profiled mobile users based on their media consumption using audio-fingerprinting methods and delivered the right advertisements to them.
After my doctoral studies, I worked at ZAPR (again), this time in the conversational AI domain. I led a team of researchers that developed multiple modules related to text-to-speech and spoken language understanding of human-to-human conversations.
I am currently organizing the sound event localization and detection task at DCASE 2022 and serving as the Technology chair at ISMIR 2022.