What is the Gesture Authoring Tool?
In our last post, we explored the question, “What is a gesture?” In this follow-up post, we are excited to announce the availability of Omek’s Gesture Authoring Tool (GAT, for short). GAT is just one of a set of tools that fall under the Beckon Usability Framework, all intended to make development with Beckon that much easier and faster. In this post we will go into a bit of detail on how it works and why it’s so important for the field of gesture recognition. Look out for future posts with details on additional components of the Beckon Usability Framework, such as our Beckon Motion Toolkit for Unity3D, the C# Extension, and more!
GAT is an easy-to-use yet highly sophisticated tool that dramatically speeds up your development cycle and makes creating custom gestures accessible to anyone, even without any coding knowledge.
How does it work?
To create a custom gesture, you record examples of different people performing the gesture; GAT applies its advanced machine-learning algorithms to automatically “learn” to recognize that gesture. This frees you from the burden of analyzing and coding gestures manually, while offering the additional benefit of producing more accurate gestures than you would get with manual coding.
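To make the idea of learning from recorded examples a bit more concrete, here is a minimal, purely illustrative sketch in Python. It is not GAT’s actual training algorithm, and the feature choices, function names, and data layout are all hypothetical; it simply shows how several recordings of a gesture can be boiled down to motion features and matched against a learned model.

```python
# Illustrative only -- a toy "learn by example" gesture model, not GAT's algorithm.
import numpy as np

def motion_features(recording):
    """recording: array of shape (frames, joints, 3) with joint positions.
    Summarize it as net per-joint displacement plus average per-frame motion."""
    frames = np.asarray(recording, dtype=float)
    displacement = frames[-1] - frames[0]                      # where each joint ended up
    avg_motion = np.abs(np.diff(frames, axis=0)).mean(axis=0)  # how much it moved per frame
    return np.concatenate([displacement.ravel(), avg_motion.ravel()])

def train(examples_by_gesture):
    """examples_by_gesture: {"wave": [recording, ...], "swipe_right": [...], ...}
    Returns one averaged feature vector (a 'template') per gesture name."""
    return {name: np.mean([motion_features(r) for r in recordings], axis=0)
            for name, recordings in examples_by_gesture.items()}

def classify(model, recording):
    """Pick the gesture whose template is closest to this recording's features."""
    feats = motion_features(recording)
    return min(model, key=lambda name: np.linalg.norm(model[name] - feats))
```

The real benefit of the example-driven approach, in GAT as in this toy version, is that variation across people is baked into the training data rather than hand-coded by a developer.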
GAT is available now as a free download from our support portal.
Why do we think GAT is so great?
Coding gestures can be a complicated undertaking, but GAT simplifies the process:
- Gestures are subjective. Ask three different people to wave hello and you are likely to get three different types of waves. Each person may hold their arm at a slightly different height or angle, or they may wave at different speeds. If you define your gesture too narrowly around one specific model (as often happens with manually coded gestures), your application will fail to recognize variations of that gesture when performed by different people.
- GAT learns by example. The main idea of training gestures in GAT is to record several examples of a gesture, performed by different people. GAT then applies a machine-learning algorithm that analyzes motion features and learns to detect a specific gesture from among other gestures and movements.
- People are sized differently. For example, the lady writing this blog post is on the shorter side of the spectrum. The range of motion can vary widely when people of different heights perform the same gesture (say, a child or a taller-than-average adult), so you risk the gesture not being recognized correctly.
- Normalized skeleton. Before GAT analyzes a gesture, it first normalizes the tracked skeleton to a standard set of dimensions. This eliminates most of the problems related to people of different sizes (a conceptual sketch of this idea appears after this list).
- Gestures are subtle. Small movements can have very different meanings, so we understand the importance of accurately recognizing a gesture for what it is.
- Gesture Packs. GAT lets you group the gestures you want for your application into a “pack.” GAT can then train all of these gestures together, enabling it to differentiate among them even when they are similar.
- Reports and Iterative Improvement. When you train gestures in GAT, it shows you a statistical report of the results, detailing the accuracy of the gesture recognition and the instances where errors occurred. You can then modify or add to your examples, rerun the training process, and see how you’ve improved.
- Mirror Gestures. Once you’ve defined a gesture, you can create its “mirror gesture” by clicking a single check-box. For instance, if you’ve defined a “right swipe” gesture, you can automatically create a “left swipe” gesture without having to train the mirror gesture separately (see the mirroring sketch after this list).
- Composite Gestures. You can create a “composite gesture” by chaining together, in sequence, two or more basic gestures that you’ve already trained (a small sequence-detection sketch follows this list).
- Live Test. Once you’ve trained your gestures, you can test them in Live Camera mode. While watching a person’s movements in the GAT viewing pane, you will see gestures detected in real time.
- Display Options. GAT supports a rich set of options for displaying person-tracking information, including 2D and 3D skeleton views, color and depth images, and joint and bone overlays.
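For the curious, here is a rough sketch of the skeleton-normalization idea mentioned above. This is not Omek’s implementation; the reference torso length and joint names are made up for illustration. The point is simply that scaling every skeleton to a common reference size makes a short player’s wave and a tall player’s wave look alike to the recognizer.

```python
# Illustrative only -- scale a tracked skeleton to a standard size about the pelvis.
import numpy as np

STANDARD_TORSO = 0.55  # assumed reference torso length, in meters (hypothetical)

def normalize_skeleton(joints, top="neck", bottom="pelvis"):
    """joints: {joint_name: (x, y, z)}. Returns a copy scaled about the pelvis
    so that the torso (pelvis-to-neck distance) matches the standard length."""
    pelvis = np.asarray(joints[bottom], dtype=float)
    neck = np.asarray(joints[top], dtype=float)
    torso_len = np.linalg.norm(neck - pelvis)
    scale = STANDARD_TORSO / torso_len if torso_len > 0 else 1.0
    return {name: tuple(pelvis + (np.asarray(pos, dtype=float) - pelvis) * scale)
            for name, pos in joints.items()}
```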
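The mirror-gesture feature can also be pictured with a tiny sketch. Again, this is not how GAT does it internally; it just shows why no extra training is needed: flipping the x axis and swapping left/right joint labels turns every “right swipe” example into a “left swipe” example.

```python
# Illustrative only -- mirror one tracked frame by flipping x and swapping sides.
def mirror_frame(joints):
    """joints: {joint_name: (x, y, z)} for a single tracked frame."""
    def swap_side(name):
        if name.startswith("left_"):
            return "right_" + name[len("left_"):]
        if name.startswith("right_"):
            return "left_" + name[len("right_"):]
        return name
    return {swap_side(name): (-x, y, z) for name, (x, y, z) in joints.items()}
```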
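Finally, a composite gesture can be thought of as a small state machine over basic-gesture detections. The sketch below is hypothetical (in GAT you define the sequence through the UI rather than in code), but it captures the idea: the composite fires only when its basic gestures are detected in order, without too long a pause between them.

```python
# Illustrative only -- a composite gesture as a sequence of basic-gesture detections.
class CompositeGesture:
    def __init__(self, name, sequence, max_gap_seconds=1.5):
        self.name = name                      # e.g. "wave_then_swipe" (hypothetical)
        self.sequence = sequence              # e.g. ["wave", "swipe_right"]
        self.max_gap_seconds = max_gap_seconds
        self._step = 0
        self._last_time = None

    def on_gesture(self, gesture_name, timestamp):
        """Feed each basic-gesture detection; returns True when the sequence completes."""
        if self._last_time is not None and timestamp - self._last_time > self.max_gap_seconds:
            self._step = 0                    # waited too long between steps: start over
        if gesture_name == self.sequence[self._step]:
            self._step += 1
            self._last_time = timestamp
            if self._step == len(self.sequence):
                self._step = 0
                self._last_time = None
                return True                   # whole sequence seen in order
        return False                          # not finished (stray gestures are ignored)
```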