Google AI can differentiate individual voices in a crowd
HIGHLIGHTS

Google developers trained a neural network to identify the voices of individual people speaking by themselves, then constructed virtual ‘parties’ with background noise to teach the network to separate the mixed voices into individual audio tracks.

Google’s AI is inching closer to human-like hearing. Developers at Google have built a deep-learning system that picks out individual human voices from a crowd by looking at people’s faces while they’re talking.

To build it, the developers first trained a neural network to identify the voices of individual people speaking on their own, then constructed virtual ‘parties’ with background noise to teach the network to separate the mixed voices into individual audio tracks.
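The training recipe above hinges on building those synthetic ‘party’ mixtures: clean single-speaker recordings are summed and background noise is added, so the model can be trained to recover the clean targets from the mix. A minimal sketch of that data-construction step using NumPy; the function name and SNR handling are illustrative assumptions, not Google’s actual pipeline:

```python
import numpy as np

def make_party_mixture(clean_tracks, noise, snr_db=0.0):
    """Mix several clean single-speaker waveforms with background noise
    to build a synthetic 'party' training example. The clean tracks are
    the separation targets; the mixture is the model input.
    (Illustrative sketch, not Google's implementation.)"""
    mixture = np.sum(clean_tracks, axis=0)
    # Scale the noise so the mixture sits at the requested SNR (in dB)
    sig_power = np.mean(mixture ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(sig_power / (noise_power * 10 ** (snr_db / 10)))
    return mixture + scale * noise

# Toy example: two 1-second synthetic "voices" at 16 kHz plus random noise
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
voices = np.stack([np.sin(2 * np.pi * 220 * t),
                   np.sin(2 * np.pi * 330 * t)])
noise = rng.standard_normal(sr)
mix = make_party_mixture(voices, noise, snr_db=5.0)
print(mix.shape)
```

Because the clean tracks are known for every synthetic mixture, the network always has ground-truth targets to learn from, which is what makes this construction attractive for training.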

Using this method, the AI can isolate one person’s voice just by focusing on their face, and it works even if the face is partially obscured by a hand or a microphone. Google demonstrated the results in a clip on YouTube, and they are uncannily accurate.
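Separating a mixture into “individual audio tracks” is commonly done by masking a time-frequency spectrogram: the mixture is converted to a spectrogram, and a mask selects the bins belonging to the target speaker. The sketch below applies a hand-made mask to a toy two-tone mixture; in a system like Google’s, a neural network would predict the mask from the audio together with the speaker’s face. The hard frequency cutoff here is purely illustrative.

```python
import numpy as np

def stft(x, frame=256, hop=128):
    """Simple short-time Fourier transform with a Hann window."""
    win = np.hanning(frame)
    frames = [x[i:i + frame] * win
              for i in range(0, len(x) - frame + 1, hop)]
    return np.fft.rfft(np.stack(frames), axis=1)

# Toy mixture of a low tone (the "target voice") and a high tone
sr = 8000
t = np.arange(sr) / sr
mix = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 1000 * t)
spec = stft(mix)

# Hypothetical hard mask keeping only bins below 500 Hz; a real model
# would predict a soft mask per time-frequency bin instead
freqs = np.fft.rfftfreq(256, 1 / sr)
mask = (freqs < 500).astype(float)
separated = spec * mask  # masked spectrogram of the target speaker
print(spec.shape, separated.shape)
```

Inverting the masked spectrogram back to a waveform would then yield the isolated audio track for that speaker.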

You might soon be able to see the AI at work. Google is “exploring opportunities” to use the feature in video chat apps like Hangouts and Duo, which could also help the AI learn even more. It could help users understand people better in a crowded room, and it could feed into speech enhancement and camera-linked hearing aids that boost the sound of whoever is speaking in front of you.

There is also a potential privacy concern, as the feature could be used to eavesdrop on people in public. Google could mitigate this by allowing the feature to work only for people who give their consent.

Digit NewsDesk

Digit News Desk writes news stories across a range of topics, getting you updates on the latest in the world of tech.