Selective Listening: Shaping The Soundscape With Technology
New Technology Lets Users Tune In To Sounds That Matter
Users of noise-cancelling headphones know the value of hearing the right sound at the right time. Imagine being absorbed in work and wanting to mute the blare of traffic horns indoors, yet needing to stay alert to those very sounds while navigating a busy street. Despite technological advances, headphone wearers have had limited ability to filter specific sounds to suit their preferences.
Revolutionizing Personal Audio: Semantic Hearing
The University of Washington’s team has made a breakthrough in audio technology, developing a real-time selective sound filtering system they dub “semantic hearing.” This cutting-edge system enables headphone users to control which sounds pass through their headphones, either through voice commands or a smartphone application.
Boasting a selection of 20 distinct sound classes, such as emergency sirens and baby cries, the system lets listeners choose which sounds to let through and which to block out.
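To make the selection idea concrete, here is a minimal Python sketch of the behavior described above: the user picks a subset of sound classes, and only audio in which a wanted class is detected is passed through. The class names, the placeholder classifier, and the simple gating logic are illustrative assumptions, not the researchers’ actual model, which separates and reconstructs target sounds rather than merely gating frames.

```python
# Illustrative sketch only: shows selecting target sound classes and gating audio on them.
import numpy as np

# Hypothetical subset of the 20 target classes mentioned in the article.
SOUND_CLASSES = ["siren", "baby_cry", "speech", "bird_chirp", "vacuum_cleaner"]

def classify_frame(frame: np.ndarray) -> dict:
    """Placeholder classifier: returns a per-class presence score for one audio frame.
    A real system would run a trained neural network here."""
    rng = np.random.default_rng(0)
    return {c: float(rng.random()) for c in SOUND_CLASSES}

def semantic_filter(frame: np.ndarray, wanted: set, threshold: float = 0.5) -> np.ndarray:
    """Pass the frame through only if a wanted sound class is detected; otherwise mute it."""
    scores = classify_frame(frame)
    keep = any(scores[c] >= threshold for c in wanted)
    return frame if keep else np.zeros_like(frame)

# Example: the user asks to hear sirens and baby cries, and nothing else.
frame = np.random.randn(480)              # 10 ms of audio at 48 kHz
filtered = semantic_filter(frame, {"siren", "baby_cry"})
```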
This technological marvel debuted at the UIST ’23 conference in San Francisco on November 1st, 2023. The inventors of this system are planning to introduce the advanced technology to the consumer market.
Synchronizing Audio with Real-Time Precision:
One significant leap over traditional noise-cancellation devices is the capability for real-time audio processing, which is essential for aligning auditory and visual cues in the user’s environment. Delays must be avoided: hearing a voice only after the speaker’s lips have moved breaks the sync between sound and sight. To achieve this, the deep-learning algorithms work at incredible speeds, processing sounds in fractions of a second.
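The paragraph above comes down to a simple budget: each chunk of audio must be processed faster than it plays back, or the output drifts behind what the user sees. The sketch below illustrates that constraint with an assumed chunk size and a trivial placeholder filtering step; the figures and the model are illustrative, not those of the actual system.

```python
# Rough sketch of the real-time constraint: per-chunk processing must fit within
# the chunk's own duration to keep sound in sync with sight.
import time
import numpy as np

SAMPLE_RATE = 48_000
CHUNK_MS = 8                               # assumed chunk length; small chunks keep delay low
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_MS // 1000

def process_chunk(chunk: np.ndarray) -> np.ndarray:
    """Stand-in for the neural filtering step; must finish within the chunk duration."""
    return chunk * 0.5                     # trivial placeholder DSP

def stream(chunks):
    for chunk in chunks:
        start = time.perf_counter()
        out = process_chunk(chunk)
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > CHUNK_MS:
            print(f"warning: {elapsed_ms:.2f} ms exceeds the {CHUNK_MS} ms real-time budget")
        yield out

# Feed a few chunks of silence through the pipeline.
list(stream(np.zeros(CHUNK_SAMPLES) for _ in range(5)))
```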
Leading the study:
Senior author Shyam Gollakota, a distinguished professor at the Paul G. Allen School of Computer Science and Engineering, emphasizes that latency challenges make onboard processing essential. The semantic hearing system therefore runs on a connected smartphone, processing sounds quickly and accurately while maintaining the spatial cues crucial for a natural listening experience.
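One way to picture the spatial-cue point: if the same filtering decision is applied to the left and right channels, the interaural level and timing differences the brain uses to locate a sound survive the filtering. The sketch below assumes a shared time-domain mask purely for illustration; it is not the binaural network described in the paper.

```python
# Minimal illustration of preserving spatial (binaural) cues while filtering.
import numpy as np

def filter_binaural(left: np.ndarray, right: np.ndarray, mask: np.ndarray):
    """Apply one shared mask to both ears so interaural level and timing cues are kept."""
    return left * mask, right * mask

# A siren arriving from the right: louder and slightly earlier in the right ear.
t = np.linspace(0, 0.01, 480)
right_ear = np.sin(2 * np.pi * 700 * t)
left_ear = 0.6 * np.roll(right_ear, 20)    # quieter and delayed at the far ear
mask = np.ones_like(t)                      # "keep this sound" decision from the classifier
left_out, right_out = filter_binaural(left_ear, right_ear, mask)
```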
The system was tested extensively in a variety of environments, such as urban streets, parks, and office spaces, where it demonstrated its ability to precisely filter desired sounds and mute extraneous noise. Impressively, a group of 22 individuals evaluated the system’s performance, with many finding the filtered audio superior to the original.
Despite its effectiveness:
The system sometimes struggles to differentiate between similar sounds, such as vocal music and spoken words. The researchers propose improving the models’ accuracy and performance by training them on more robust and diverse data.
The project involves notable co-researchers, including Takuya Yoshioka of AssemblyAI and Justin Chan, who is now affiliated with Carnegie Mellon University after completing his doctoral work at the Allen School. Allen School PhD candidates Bandhav Veluri and Malek Itani are also part of the project.
Journal Reference
Veluri, B., et al. (2023) Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. doi:10.1145/3586183.3606779.
Source: https://www.washington.edu/