8 Principles for Designing Gesture-Based Applications

Co-authored by: Alona Lerman, Shachar Oz, and Yaron Yanai

There’s plenty to talk about when it comes to best practices for designing gesture-based applications. On the blog, we’ve covered a range of topics related to building your application: how augmented reality can be used as a feedback system, the role of ergonomics when designing a practical UI, and how to leverage our Gesture Authoring Tool to create custom gestures in your apps, just to name a few.

Future posts will continue to offer ideas and feedback on how to design intuitive applications.  But in this post we offer a few practical tips to consider when designing and testing the usability of your gesture or motion sensing application.

Where to begin? Start with the experts

When we talk about usability, we are referring to the “ease of use and learnability” for your application, game or interface.

There are many excellent resources available to help you with your usability testing.  We recommend a few here but there are obviously many, many more out there in Google just waiting for you!

8 Principles for Designing Gesture-Based Applications & Interfaces
(and a few common mistakes to avoid)

We’ve gathered up a few key points to keep in mind as you get started building your applications. These are principles that we’ve learned along the way, often from mistakes we’ve either made ourselves or have seen others make when building gesture-based apps. While this list isn’t intended to be comprehensive, it does touch on foundational points to keep in mind.

  1. Precision is a good thing... up to a point.
    When designing the UI for a gesture-based app, it’s better to err on the side of larger buttons, which make it easier for users to select correctly. Requiring a great deal of precision can quickly lead to user fatigue. Our last blog post touches on exactly this point – imagine using your finger instead of a mouse to select a specific “cell” in Excel. Frustrating, no?

    Design for fingers & hands, not a “mouse”

  2. Don’t model your application on existing user interfaces.  Instead, build on the strengths of gesture and motion tracking technology.
    Traditional user interfaces are based on a WIMP model – windows, icons, menus, pointer. As mentioned above and in our last post, your finger is not a mouse! So, rather than trying to turn it into one, leverage the three-dimensionality of the hand. For example, imagine an online store where you can pick up an object (how about a shoe!) and turn it around by rotating your hand.
  3. Avoid actions that require your users to lift their hand above the height of their shoulder.
  4. Try to avoid awkward poses. It should be fun to interact with your app!

    You don’t need to have a degree in physical therapy to design gesture-based applications, but it does help to give some thought to ergonomics and how human bodies move naturally. Lifting your arms high gets pretty tiring pretty quickly, and can even be challenging for some users.

    Co. Design featured the work of “Curious Rituals” to call out the unnatural postures we make to conform to technology. Gesture recognition is a way to free us from these awkward poses.

  5. Do break up activities into small, short actions.
    Try to keep actions relatively brief with rest periods in between rather than having users move all over the screen.  John Pavlus recently wrote an article for the MIT Tech Review on whether gestural computing breaks Fitts’s Law.  You may want to review a brief primer on Fitts’s Law when considering menu placement – users want to be able to navigate from selection to selection as quickly as possible.
  6. Not all users are right-handed.
    You’ll have to make a judgment call as to whether to design your application to be equally usable for both right- and left-handed users. The good thing is that if you work with Omek’s Grasp, you are assured of accurate detection of left vs. right hands. This Grasp feature can prove useful if you are creating an application intended for multiple users sitting next to each other.
  7. Gestures have cultural connotations.
    Gestures can have different meanings in different cultures so be conscious of context when designing your app. For example, the “thumbs up” and “peace” signs both have positive connotations in North America but quite the opposite in Greece and areas of the Middle East, respectively. It may be a good idea to do a quick check on gestures before going live in other countries.
  8. Design your interface to keep users within the “Effective Interaction Zone”.
    Try to avoid placing elements (such as menu selections) too close to the edge of the camera’s field of view (FOV). If you do, users may not realize that their hand has moved out of the camera’s view, and they will be frustrated when their selection doesn’t work.
  9. Test your design with hands and people of all sizes.
    Test, test and test some more before going live. That’s a motto that we try to stand by. And the broader the range of people you test with (i.e., tall vs. short, large vs. small hands, different sized arms) the better! That way you’ll be sure your app and interface will work just as well for a basketball player as it will for a small-handed lady (like the author of this blog post).
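A few of the principles above (generous target sizes, short movements between selections, and keeping elements inside the Effective Interaction Zone) lend themselves to simple layout checks. The sketch below is only an illustration in Python and has nothing to do with the Grasp SDK: the `Target` class, the normalized 0-to-1 coordinate system, and the 0.15 edge margin are all assumptions made for the example. The difficulty score uses the Shannon formulation of Fitts's index of difficulty, log2(D / W + 1), where D is the distance to the target and W is its width.

```python
import math
from dataclasses import dataclass


@dataclass
class Target:
    """A rectangular UI element in normalized camera coordinates (0.0 to 1.0)."""
    name: str
    x: float       # center x (hypothetical coordinate system)
    y: float       # center y
    width: float
    height: float


def fitts_index_of_difficulty(distance: float, width: float) -> float:
    """Shannon formulation of Fitts's index of difficulty, in bits.

    Farther or smaller targets score higher, meaning they are harder to hit.
    """
    return math.log2(distance / width + 1)


def inside_interaction_zone(target: Target, margin: float = 0.15) -> bool:
    """True if the whole target stays at least `margin` away from every edge.

    The default margin is an arbitrary placeholder; tune it by testing with
    real users and the camera you actually support.
    """
    left = target.x - target.width / 2
    right = target.x + target.width / 2
    top = target.y - target.height / 2
    bottom = target.y + target.height / 2
    return (left >= margin and right <= 1 - margin
            and top >= margin and bottom <= 1 - margin)


play = Target("play", x=0.5, y=0.5, width=0.2, height=0.2)
close = Target("close", x=0.95, y=0.05, width=0.1, height=0.1)

print(inside_interaction_zone(play))    # True
print(inside_interaction_zone(close))   # False
print(fitts_index_of_difficulty(distance=7, width=1))  # 3.0
```

A check like this will not replace testing with real people, but it can flag a corner-hugging button or a tiny, far-away target before the first test session.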

Interested in creating your own close-range gesture-enabled experience? Sign up to be notified of our upcoming Grasp beta release.

Don’t Miss Out! Gesture Recognition + Embedded Vision

Have the following questions been keeping you up at night?

  • What is embedded vision all about?
  • What are practical things I should know if I want to incorporate gesture recognition into embedded systems?
  • What are some of the different technologies used to create depth maps?
  • Why is 3D technology better than other technologies at solving certain computer vision problems?

If you live in the Bay Area (or plan to be there April 25th) then you’re in luck! Omek Interactive CTO, Gershom Kutliroff, will be one of a few esteemed speakers represented at the upcoming Embedded Vision Alliance Summit.

Embedded Vision Summit 2013

What exactly is the Embedded Vision Summit? According to the source, it is “a technical educational forum for engineers interested in incorporating visual intelligence into electronic systems and software.” The Summit is part of the larger DESIGN West event being held next week.

Quick Details:
April 25, 2013
San Jose Convention Center
San Jose, California, U.S.A.
In Conjunction with DESIGN West
Register Now!

Why should you attend?

I caught up with Gershom before he left for his trip and asked him to provide us with a few highlights from his presentation. Check in on our blog after April 25th for more details from his talk, including videos showing off what you can accomplish using depth data from 3D cameras vs. a standard RGB camera.

Sneak Preview: Gershom’s Talking Points

In addition to touching on the questions listed at the beginning of this post, Gershom will offer analytical thinking about 3D data, explaining how it is inherently different from 2D data. He will explain how these differences drive the need for different algorithms to be constructed in order to understand the data that comes from depth sensors.  Gershom argues that these algorithms are the basis for a more fundamental paradigm shift that has broad implications which go beyond the depth sensor.

Using a case study to illustrate his points, Gershom will show how these 3D-specific algorithms cascade down the value chain. Ultimately, he argues, different software libraries and different hardware architectures will be required — ones which are optimized to support these new algorithms.

Gershom will provide you with key insights into how algorithms for depth cameras are constructed. These ideas will help form a basis for thinking about how best to design software systems and hardware architectures that are optimized to run these new software libraries.

What does that mean for you? Well, whether you are a software developer building 3D-based applications or a hardware manufacturer interested in learning how to better support emerging 3D cameras, this session is for you.

Sign up today (for free! space permitting): http://www.embedded-vision.com/embedded-vision-summit-registration

We hope to see you there.

Can your finger replace the mouse?

This is a common suggestion we hear from clients when first discussing possible use cases for close-range gesture recognition. It makes sense and on first glance seems like an easy extension of how things work today. But how well does this work in reality?

Actually, gesture as a plain mouse replacement does not work very well. Gesture recognition can be incredibly intuitive, easy, and fun – but only when it is implemented with care and thought for the end user experience. This means building in, from the start, a wide range of design considerations for how users interact with touchless systems.

Most current user interface paradigms are not designed for gesture. Commonly available Graphical User Interfaces (GUIs) were not intended for hands and fingers to manipulate the objects inside of them. They were designed around a hand-controlled mouse that moves a pointer on the user’s screen. Let’s take a closer look.

Your finger is not a mouse.

Take a look at the cursor that appears on your screen. Put your finger next to it and you may be surprised to see how small that cursor actually is. Imagine trying to “maximize” the Word document you are working on, accidentally hitting “close” instead, and being met with a dialog box asking whether you want to save your document. Windows 7 is just not conducive to touch.

A mouse isn’t very expensive.

It may be challenging to make a convincing business case to an OEM to incorporate gesture recognition into their manufactured devices for the sole purpose of replacing the mouse. Does the additional cost of a 3D depth camera justify the added value? Especially when the substitute is a low-cost mouse that a user can pick up for just a few dollars.

The mouse works pretty well for what we want it to do.

The on-screen cursor that represents your mouse offers an accurate way to navigate through most of the tasks that you want to perform on a computer today. It is responsive, fast and familiar. By now, the mouse is ubiquitous and has become the most intuitive way for most people to interact with a GUI system. A lot of people would ask: why replace something that isn’t broken?

Gesture adds value, without replacing the keyboard and mouse.

At Omek, we see gesture as a means to enhance your experience when interacting with your computer. We don’t think of gesture vs. keyboard + mouse as a binary choice, where you must choose between having gesture OR your keyboard + mouse. Rather, we see gesture adding another important dimension to how you interact with your device. For application developers, gesture can provide real context and understanding of a user above and beyond what’s known when he is simply using a mouse.

I can imagine a scenario in which I lift up my hand from my keyboard in order to access a “parallel desktop”. In this parallel desktop I am able to pause or change the music I’m listening to with the simple flick of my wrist. Once my hands rest back on the keyboard I immediately return to writing this blog post.
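One way to picture the switch between the regular desktop and the “parallel desktop” is as a small state machine driven by how high the hand is above the keyboard. The Python sketch below is purely illustrative: the `ModeSwitcher` class, the millimeter thresholds, and the idea of reading a single hand-height value per frame are assumptions for the example, not part of any Omek SDK. It uses two thresholds instead of one (hysteresis) so that a hand hovering near the boundary does not make the interface flicker between modes.

```python
class ModeSwitcher:
    """Toggle between 'keyboard' and 'gesture' modes based on hand height."""

    def __init__(self, raise_mm: float = 120.0, lower_mm: float = 60.0):
        # Hypothetical thresholds: enter gesture mode above raise_mm,
        # return to keyboard mode only once the hand drops below lower_mm.
        self.raise_mm = raise_mm
        self.lower_mm = lower_mm
        self.mode = "keyboard"

    def update(self, hand_height_mm: float) -> str:
        """Feed one height sample and return the resulting mode."""
        if self.mode == "keyboard" and hand_height_mm > self.raise_mm:
            self.mode = "gesture"
        elif self.mode == "gesture" and hand_height_mm < self.lower_mm:
            self.mode = "keyboard"
        return self.mode


switcher = ModeSwitcher()
modes = [switcher.update(h) for h in (10, 130, 90, 50)]
print(modes)  # ['keyboard', 'gesture', 'gesture', 'keyboard']
```

Note that the sample at 90 mm keeps the user in gesture mode: the hand has dipped, but not far enough to count as resting back on the keyboard.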

Design the interface to match the input method.

In order for the scenario above to work, the “parallel desktop” should be designed into the system from the start. To create a successful product, the UI should align with how a user interacts with the product, so that the user gets the right feedback, continuously, as he needs it.

Tablet computers provide a perfect case study to illustrate this point. Many may have forgotten, but Microsoft first began releasing tablet computers well before the iPad, all the way back in 2000. But it wasn’t until Apple released the iPad that we really saw tablets take off.

At first glance it is pretty evident that early Microsoft tablets were not designed for touch. Instead, Microsoft extended the existing Windows operating system and put it into a tablet. Since the operating system wasn’t designed for fingers, users had to use a stylus to make selections. The icons were small, not easy on the eyes, and not optimized for touch. In addition, there was the issue of “screen coverage”. Dan Saffer, Interaction Designer and author of Designing Gestural Interfaces, reminds us that since our fingers are attached to a palm, the hand can often get in the way and cover the screen while you are trying to make a selection, creating a frustrating experience for any user.

The right UI can mean the difference between success and failure of a product.

With the recent release of Windows 8, we’ve seen sales of 1.5M Microsoft Surface tablets, with demand beginning to take off in the marketplace. The Windows 8 operating system represents an entirely new UI, designed with “touch” in mind and thus better suited for tablet and mobile devices. It features “swinging slabs”, a boxy design, and a horizontal layout.

Essentially, in order for Microsoft to realize success of their touch-based hardware devices, they needed to redesign their software-based interfaces to support that mode of interaction.

Will gesture ever replace the mouse?

YES! We do believe that gesture will take on a more prominent role in the computing experience and may ultimately fully replace the mouse. Gesture, when implemented well, can be a much more natural way of interacting with the many devices we have in our lives.

It won’t happen on its own, though. Software and hardware companies have already begun to rethink and redesign the UI experience with gesture in mind from the start, and they must continue to do so. We will see the best results if hardware and software companies plan for gesture from the concept stage all the way through to release. Only once we have integrated systems that are designed for gesture will we see more widespread adoption of gesture-based interfaces.

Gesture control still plays a critical role in interface design and is being incorporated into a wide range of platforms. From infotainment control in your car to rehabilitation and physical therapy feedback systems, we are seeing more and more companies finding innovative uses for 3D gesture control. Authors Shel Israel and Robert Scoble are working on a book (titled The Age of Context) which explores the ever-growing role that contextual sensors are playing and will play in our lives, and what it means for us. More and more we are seeing multi-modal capabilities being rolled out, working to seamlessly integrate voice recognition, gaze detection, and touchless control.

In the meantime, our UX Studio and design team continue to test, learn and share thoughts through meaningful articles on our blog describing how to create truly interactive, intuitive interfaces that incorporate gesture. Stay tuned to the Omek blog to learn how to drive wide adoption of gesture in today’s interfaces. And sign up below to be notified of our upcoming Grasp beta release!

 

Grasp: Powering Gesture-Based Experiences

In our last two posts on the Omek blog we shared our thoughts and advice on how to incorporate gesture-based control into applications and interfaces in a way that is intuitive, easy-to-use, dynamic, and engaging.  We begin with a user-centered approach to interaction design, taking time to observe people’s actual movements to understand how they translate into truly natural gestures.  We incorporate feedback learned from frequent testing, making sure that our ideas work across a broad set of users.  And, perhaps most importantly, we leverage a tool that is unique to Omek – our Grasp SDK.

Grasp is in many ways the “magic” that enables us to transform our ideas into reality.  You may already be familiar with our Beckon SDK, which provides full-body skeleton tracking from distances of 1-5 meters.  Well, Grasp is our “close-range” counterpart.  It is a middleware solution and full set of tools for hand & finger motion tracking and gesture recognition from distances as close as 10cm.

But it is so much more! With Grasp, we took a unique approach to solving the problem of close-range motion sensing. We built a sophisticated and detailed solution, because it is our belief that you need technological complexity in order to achieve user simplicity.

Sign up now!

Interested in how you can get your hands on Grasp?  Currently it is in closed alpha testing but we are rapidly gearing up for a beta launch.  You can sign up to be notified of the release and be one of the first to try out these tools.

Below, we highlight several key features that are unique to our Grasp offering, all of which we leveraged when designing the Practical UI to ensure we ended up with a genuinely intuitive interface.

Cross-Camera, Software-only Motion Sensing and Gesture Solution.

While there have been quite a few exciting recent product developments, the gesture recognition market is still quite young with many new innovations underway.  At Omek, we want to enable you to take advantage of the latest advancements in camera technology.  To that end, our solution is cross-platform – we are working closely with different hardware providers to support their 3D sensors. What does this mean for you?

  • Support for depth cameras to offer you the most robust experience. With our eye towards usability, we want to provide developers with the most cutting edge technology to create meaningful user experiences.  At Omek we believe that 3D cameras have the power to change the paradigm of how we interact with our devices. Read our prior blog post on why we work with depth cameras rather than 2D cameras.
  • Seamless integration into your personal computing devices. One of our goals at Omek is to help you simplify your life; not complicate it by adding more peripherals!  That’s why we focus on supporting cameras that will be incorporated into your device, whether it’s an All-in-One PC or the dashboard of your car.

Full Hand Skeleton Model vs. 6-Point Tracking.

Unlike other close-range solutions available, Grasp creates a full 3D skeleton model of the hand, complete with 22 joints. Rather than simply assigning tracking points to each of your fingertips, we offer developers a complete model of the hand, opening up a broad set of possibilities to create a range of experiences.

Our approach offers significant advantages to ensure more robust tracking. Why? Well, the hand has many degrees of freedom, making it difficult to track. Think about it – you can open or close your hand, cross your fingers, curl your fingers, or rotate your palm. Your hand can take on many different configurations. By using a hand model, though, we are able to tackle many different scenarios, including self-occlusions, rotations, or closed / curled fingers. The hand skeletal model effectively constrains the movements of the fingers to only those configurations a hand could feasibly make.

  • Applying it to the Practical UI. Take for example, the pinch and rotate gestures used when taking a book off the shelf and opening it up to see what’s inside. First, we have to recognize the pinch – not too difficult.  It becomes more complicated, though, once the hand rotates.  A lot of information all of a sudden becomes occluded, making it difficult to maintain continuous tracking.  The fingertip points are no longer in the field of view of the camera.  Using Grasp, however, we can define that we are seeing the back of the hand and can instruct the application to continue tracking.
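To give a feel for what “constraining the fingers to feasible configurations” can mean in practice, here is a deliberately simplified Python sketch. It is not how Grasp works internally, which this post does not describe: the joint names, the `constrain_finger` function, and the angle limits are all hypothetical, and the limits are rough placeholder values rather than anatomical reference data. The idea is simply that a noisy per-frame estimate gets pulled back into a range a real finger could reach.

```python
# Illustrative flexion limits in degrees for the three joints of one finger.
# Placeholder values for the example only, not anatomical reference data.
JOINT_LIMITS = {
    "mcp": (-20.0, 90.0),   # knuckle
    "pip": (0.0, 110.0),    # middle joint
    "dip": (0.0, 80.0),     # joint nearest the fingertip
}


def constrain_finger(raw_angles: dict) -> dict:
    """Clamp each estimated joint angle to its feasible range."""
    constrained = {}
    for joint, angle in raw_angles.items():
        low, high = JOINT_LIMITS[joint]
        constrained[joint] = min(max(angle, low), high)
    return constrained


# A noisy estimate, e.g. while part of the finger is hidden from the camera.
noisy = {"mcp": 120.0, "pip": 45.0, "dip": -30.0}
print(constrain_finger(noisy))  # {'mcp': 90.0, 'pip': 45.0, 'dip': 0.0}
```

A full hand model would go much further than per-joint clamping (for example, by relating joints to one another), but even this small step rules out poses no hand can make.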

No Calibration Required.

When you develop your application using Grasp, your users will be able to immediately interact with your interface without having to calibrate to get started, providing a smoother and more dynamic introduction. Essentially, behind the scenes and invisible to anyone using Grasp, we calibrate the skeleton model to different hand dimensions and continue to auto-calibrate for as long as the application is in use. This ensures we are adjusting our tracking to all kinds of users, whether they are kids or adults, those with small hands or someone with long, thin fingers.

When you walk up and begin using our Practical UI, regardless of your hand size or shape you are guaranteed to have a smooth, robust, and instantly-tracked experience.
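As a rough illustration of what continuous auto-calibration could look like, the Python sketch below keeps a running estimate of hand size and nudges it toward each new measurement using an exponential moving average. This is an assumed technique chosen for the example, not a description of Grasp’s actual calibration; the `HandScaleEstimator` class, the smoothing factor, and the millimeter measurements are all hypothetical.

```python
class HandScaleEstimator:
    """Running estimate of hand length, refined a little on every frame."""

    def __init__(self, smoothing: float = 0.1):
        # smoothing in (0, 1]: higher values adapt faster but are noisier.
        self.smoothing = smoothing
        self.scale = None  # no estimate until the first measurement arrives

    def update(self, measured_length_mm: float) -> float:
        """Blend a new per-frame measurement into the running estimate."""
        if self.scale is None:
            # First frame: start tracking right away, with no setup step.
            self.scale = measured_length_mm
        else:
            self.scale += self.smoothing * (measured_length_mm - self.scale)
        return self.scale


estimator = HandScaleEstimator(smoothing=0.5)
print(estimator.update(180.0))  # 180.0
print(estimator.update(200.0))  # 190.0
```

Because the first measurement is used as-is, tracking can begin on the very first frame, and later frames quietly refine the estimate without asking the user to do anything.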

Hand Detection + Classification.

Unlike other systems, we detect hands by searching for real hand features rather than simply looking for motion in the scene or using a skin color model. This helps us ensure that we are tracking what we want to be tracking – hands and fingers.  It allows us to robustly remove false positives and instead, quickly detect hands as soon as they appear in the scene, so your users won’t get frustrated by the application not working as they expected it to.

  • Differentiate between Right and Left hands. Combined with the full skeleton model of the hand, we offer the ability to detect whether the camera is tracking a right vs. a left hand, even in the case of poses that are non-trivial to detect.   

Designed for Usability

A major benefit to having an in-house UX Studio is that we have actual developers working with our SDK from the first stages of its creation. We leverage this feedback loop both for early testing and to ensure that our tools are designed with developers and their specific needs in mind. Our tools are easy to use and handle many of the technical components “behind the scenes”, allowing you to focus on creating meaningful applications, rather than trying to solve computer vision problems.

In Summary...

Are you a developer seeking to create a gesture-based game?  Perhaps a medical device company looking to design a touchless interface? Or, an automotive company reinventing the in-vehicle-infotainment system? Whatever system you would like to create, Omek offers the most advanced, cutting-edge and user-centric tools to help you power your gesture-based experiences.

Sign up now to be notified of our upcoming Grasp beta: