Google’s “new approach” to hand and finger tracking could radically change the way sign language is interpreted via smartphones, according to a recent announcement.
In a post on Google’s AI Blog, researchers Valentin Bazarevsky and Fan Zhang revealed the tech giant is using machine learning to accurately track difficult hand and finger movements.
“This approach provides high-fidelity hand and finger tracking by employing machine learning (ML) to infer 21 3D keypoints of a hand from just a single frame,” the blog post read. “Whereas current state-of-the-art approaches rely primarily on powerful desktop environments for inference, our method achieves real-time performance on a mobile phone, and even scales to multiple hands.”
The software has been implemented in MediaPipe, an open-source platform for building pipelines that process perceptual data. The lightweight design means it can run easily on mobile devices. Previously, software such as this ran on desktop PCs, which, although effective, is restrictive.
Traditionally, tracking hand and finger movements has been difficult because movement speed and hand size vary from person to person. Similarly, fingers often occlude one another or the palm, obscuring the very movements being tracked.
Google’s tech recognises the size and angle of a person’s palm and imposes a graph of 21 points across the fingers, palm and back of the hand. This makes it easier for the software to understand a hand gesture and predict the positioning of fingers and knuckles.
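To illustrate, that 21-point skeleton can be sketched as a simple list. The names below follow MediaPipe's published hand-landmark scheme (index 0 is the wrist, then four joints per finger from base to tip); the helper function is illustrative and not part of MediaPipe's API.

```python
# The 21 keypoints the model predicts, one entry per landmark, following
# MediaPipe's hand-landmark naming: the wrist, then four joints per finger.
HAND_LANDMARKS = [
    "WRIST",
    "THUMB_CMC", "THUMB_MCP", "THUMB_IP", "THUMB_TIP",
    "INDEX_FINGER_MCP", "INDEX_FINGER_PIP", "INDEX_FINGER_DIP", "INDEX_FINGER_TIP",
    "MIDDLE_FINGER_MCP", "MIDDLE_FINGER_PIP", "MIDDLE_FINGER_DIP", "MIDDLE_FINGER_TIP",
    "RING_FINGER_MCP", "RING_FINGER_PIP", "RING_FINGER_DIP", "RING_FINGER_TIP",
    "PINKY_MCP", "PINKY_PIP", "PINKY_DIP", "PINKY_TIP",
]

def finger_indices(finger):
    """Return the four landmark indices for a named finger, base to tip.
    (This helper is an illustrative assumption, not MediaPipe's API.)"""
    start = {"THUMB": 1, "INDEX_FINGER": 5, "MIDDLE_FINGER": 9,
             "RING_FINGER": 13, "PINKY": 17}[finger]
    return list(range(start, start + 4))
```

With the points laid out this way, each finger is a short chain of four landmarks, which is what makes the per-finger angle calculations described below straightforward.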
“On top of the predicted hand skeleton, we apply a simple algorithm to derive the hand gestures,” Google said. “First, the state of each finger, eg bent or straight, is determined by the accumulated angles of joints.”
From here, the technology maps the set of finger states to a set of pre-defined gestures, the blog post added. This “straightforward yet effective” technique allows them to estimate basic static gestures with “reasonable quality”.
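The two steps Google describes — classifying each finger from its accumulated joint angles, then matching the resulting set of states against predefined gestures — might look something like the following minimal sketch. The threshold, gesture table and function names are illustrative assumptions, not Google's published code.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c,
    for 3D points given as (x, y, z) tuples."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def finger_state(points):
    """Classify one finger (four keypoints, base to tip) as 'straight' or
    'bent' from the accumulated bend at its two middle joints."""
    bend = (180 - joint_angle(points[0], points[1], points[2])) + \
           (180 - joint_angle(points[1], points[2], points[3]))
    return "straight" if bend < 40 else "bent"  # threshold is an assumption

def classify_gesture(states):
    """Map the five finger states to a predefined static gesture."""
    gestures = {
        ("straight",) * 5: "open palm",
        ("bent",) * 5: "fist",
    }
    return gestures.get(tuple(states), "unknown")
```

For example, a fully extended finger whose four keypoints lie on a line, such as `[(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0)]`, accumulates no bend and is classified as straight; five straight fingers match the "open palm" entry.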
The process of teaching this tech to identify where fingers and knuckles could be for different gestures was extensive, the researchers said, with Google manually annotating 30,000 images.
While a dedicated app has not yet been built, researchers said, the company has published algorithms in an attempt to encourage developers to create their own.
“We plan to extend this technology with more robust and stable tracking, enlarge the amount of gestures we can reliably detect, and support dynamic gestures unfolding in time,” the researchers said. “We believe that publishing this technology can give an impulse to new creative ideas and applications by the members of the research and developer community at large. We are excited to see what you can build with it.”