Project
_

Microsoft Speech 

Tools
_

Cinema 4D
Octane Render
After Effects

Synopsis
_

Microsoft has figured out real-time conversation transcription, revealing a new Azure-integrated conical reference design speaker along with a way to turn every phone and laptop in a meeting into an ad-hoc voice recognition array. 

Microsoft’s Conversation Transcription demo wows as new hardware revealed

Voice-to-text isn’t difficult, but keeping track of a conversation complete with overlapping speech is much harder. That’s the nut Microsoft says it has cracked, showing off a new Conversation Transcription system at Build 2019 this week. It extends the existing Azure Speech Service to support a combination of real-time, multi-person, far-field speech transcription and speaker attribution.

Everybody’s speaking, Azure is listening

Microsoft has figured out real-time conversation transcription, revealing a new Azure-integrated conical reference-design speaker along with a way to turn every phone and laptop in a meeting into an ad-hoc voice recognition array. The Build 2019 demo highlighted how a combination of edge devices and cloud processing could work together more effectively, and how future smart speakers could understand multiple commands and do away with the wake word.

In the Build 2019 demo, a meeting device was able to track multiple people talking and not only correctly transcribe them, but do so even during periods of “cross-talk.” It uses both audio and video signals, with audio-visual fusion to help identify who is saying what. The edge device isn’t responsible for the processing, unsurprisingly: instead, the data crunching is all done in the Azure cloud.
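
For anyone who wants to poke at the underlying service, the sketch below shows how multi-speaker transcription with speaker attribution can be driven from the Azure Speech SDK for Python. It is a minimal, illustrative example rather than the pipeline used in the Build 2019 demo: the ConversationTranscriber class and its events come from the azure-cognitiveservices-speech package, but the subscription key, region, and audio file name are placeholders, and the exact API surface can differ between SDK versions.

import time
import azure.cognitiveservices.speech as speechsdk

# Placeholder credentials and input; swap in a real key, region, and audio source.
speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")

# The transcriber streams audio to the Azure cloud, which does the heavy lifting.
transcriber = speechsdk.transcription.ConversationTranscriber(
    speech_config=speech_config, audio_config=audio_config
)

def on_transcribed(evt):
    # Each final result carries the recognized text plus a speaker label,
    # which is how overlapping talkers get attributed.
    print(f"[{evt.result.speaker_id}] {evt.result.text}")

transcriber.transcribed.connect(on_transcribed)

transcriber.start_transcribing_async().get()
time.sleep(30)  # listen for a while; a real app would wait on a session-stopped event
transcriber.stop_transcribing_async().get()

The same pattern can be pointed at a live microphone (for example, speechsdk.audio.AudioConfig(use_default_microphone=True)), which is closer to the meeting-device scenario shown in the demo.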

Copyright
_

Shitty Renders LLC

Contact
_


work@shittyrenders.com