Here’s another interesting project: a computer was taught how to read lips with a 93% accuracy rate. People trained in lip reading don’t get any higher than 52% (apparently it’s more difficult than the movies suggest).
Calling this artificial intelligence system a “computer” is a bit of an understatement. The system wasn’t taught to read lips through standard programming, where you’d take individual sounds and encode their particularities; instead, it was exposed to tens of thousands of video recordings. From those, it analyzed vocal and facial patterns on its own and learned how to read lips.
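Just to make the idea concrete, here’s a rough sketch in Python of what “learning from examples instead of rules” can look like for this kind of task. Everything here is an assumption on my part (frame counts, image sizes, per-frame labels, the layer choices); it’s not the actual system, just the general shape of a model that maps clips of a speaker’s mouth to text.

```python
# A minimal sketch (NOT the real system): a sequence model that maps video
# frames of a mouth region to characters, learned from example clips rather
# than hand-coded phoneme rules. All sizes below are made up.
import numpy as np
import tensorflow as tf

NUM_FRAMES = 75              # frames per clip (assumed)
FRAME_H, FRAME_W = 50, 100   # cropped mouth region size (assumed)
VOCAB_SIZE = 28              # 26 letters + space + blank (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, FRAME_H, FRAME_W, 1)),
    # Spatio-temporal convolutions pick up lip-motion patterns.
    tf.keras.layers.Conv3D(32, kernel_size=3, padding="same", activation="relu"),
    tf.keras.layers.MaxPool3D(pool_size=(1, 2, 2)),
    # Collapse each frame to a feature vector, keeping the time axis.
    tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten()),
    # A recurrent layer reads the sequence of mouth shapes in order.
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(128, return_sequences=True)),
    # One character prediction per frame (a simplification of what real
    # lip-reading systems do).
    tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Real training data would be tens of thousands of (clip, transcript) pairs;
# random placeholders here only show the shapes involved.
clips = np.random.rand(8, NUM_FRAMES, FRAME_H, FRAME_W, 1).astype("float32")
labels = np.random.randint(0, VOCAB_SIZE, size=(8, NUM_FRAMES))
model.fit(clips, labels, epochs=1)
```

The point isn’t the specific layers; it’s that nobody sits down and writes rules for what a “b” looks like on someone’s lips. The model figures that out from the examples.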
This system has many uses, and it’s especially important for digitizing information, much like the book scanner I presented a few days back. Using this technology, many videos could get a text version that’s fully indexable by search engines; a quick sketch of that idea follows below. It would work particularly well for TED Talks, vlogs and speeches. Going further, with advanced enough algorithms, the system could automatically write subtitles for any movie, in any language.
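Here’s a toy illustration of the indexing part, with invented file names and transcripts: once a video has a machine-generated text version, plain text search is all it takes to find it.

```python
# Toy example: machine-generated transcripts (invented placeholders) make
# videos searchable just like any other text document.
transcripts = {
    "ted_talk_001.mp4": "the future of machine learning is data",
    "vlog_episode_12.mp4": "today we are testing the new book scanner",
}

def search(query: str) -> list[str]:
    """Return the videos whose transcript contains the query."""
    return [video for video, text in transcripts.items()
            if query.lower() in text.lower()]

print(search("book scanner"))  # -> ['vlog_episode_12.mp4']
```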
Universal translation tools are no longer something out of sci-fi movies; we already have all the required technologies to make them real, they just need to be further refined and integrated.
A similar machine learning process was used for the Xbox Kinect, where the system was exposed to tens of thousands of photos of people in various poses.