Co-authored by: Alona Lerman, Shachar Oz, and Yaron Yanai
Part 2 of the Series: Designing a Practical UI
In this post we explore the challenges of providing effective and useful feedback within gesture-based systems and offer our thoughts on how augmented reality can serve as a supporting tool for creating an intuitive and engaging interface. Read on to learn how we arrived at the approach implemented in the video below:
Background
A feedback system is the application’s method of communicating with users to: prompt them to take an action, inform them that a given task has been understood, and assure them that the system is aware of and responding to their interaction. Feedback can come in the form of sounds, animations, color changes, highlights, or textual messages such as instructions, pop-ups, and balloons.
Omek’s UX Studio has spent much of the past six years researching best practices for gesture-based HMI. One of the most important lessons we have learned time and again is that an effective feedback system can make the difference between a frustrating and confusing experience and one that naturally guides the user through an application in a fun and successful way.
Gesture-based systems require their own form of feedback
What works well in one modality often does not work as well when applied to another. This is a lesson that we cannot emphasize enough. Some attributes are unique to gesture-based systems, and the design of your application should take them into consideration:
- No tactile feedback: This one may seem obvious but has important implications. Unlike with most interfaces, gesture-based systems are “touch-less”. They lack physical, tactile feedback. For example, with a mouse or keyboard there is the haptic response when you push down and release a key. In a gesture-based system, you need to find the appropriate means of letting a user know that they have performed a given task.
- Invisible interaction space: The interaction space in a gesture-based system refers to the effective field of view of the camera (check out the photo below). Without feedback, a user has no way of knowing if he is being “seen” by the application. Check out our last blog post for more detail on this topic.
- Even the best-designed systems sometimes fail: Yes, on occasion even the most accurate of tracking systems may “lose” a user, for example, because of occlusions. When this does happen, the application should fail gracefully to keep frustrated reactions to a minimum.
- No standardized gestural language: We’ve previously referenced the article that Don Norman wrote on gesture wars for the magazine Core77. There is no agreed-upon vocabulary of gestures: even something as simple as the action of “selecting” can be interpreted in many different ways. One user may hover their hand over a button to select, while someone else may try to push their hand forward, mimicking the action used with a physical button.
Experiments in Feedback
For the Practical UI app we tested out a number of different ways to provide users feedback throughout their experience, from the first interaction all the way through to the end. Below we share examples of a few different feedback methods, their pros and cons, and considerations to keep in mind when you apply these in your own applications.
1. The Traditional: Your hand as a “cursor” on the screen, similar to a mouse pointer. In this scenario, your hand (or, more likely, your finger) becomes the mouse cursor on screen giving you visual feedback of where the “cursor” is at all times.
Pros:
- Extension of the current paradigm, thus requiring little explanation. From both an application design and a user perspective, this seems like the most natural way to provide feedback in a gesture-based system. Users easily relate to this method of feedback since it takes such a familiar form (who hasn’t used a mouse before?). During testing, we found that users almost expected this to be the means of interacting with the application. You don’t need to offer a manual or additional guidance when users are getting started.
- Constant feedback given to the user. With a “cursor” method, the user always knows where his hand is in relation to the screen, providing invaluable information.
Cons:
- Easily leads to user fatigue. The work here is placed on the user, who has to be very accurate in his selection. Users have to hold an extremely steady hand in order to ensure they make the intended selection. All that steady hand-holding will quickly lead to fatigue and frustration. Bernd Plontsch just wrote an excellent post on exactly these challenges when trying to implement this using a Leap Motion.
- Applying Fitts’s Law in a NUI world. When the hand becomes a cursor in a gesture-based interface, a significant amount of thought must be put into the design and layout of the interface to ensure that users can quickly navigate from one selection to the next. John Pavlus interviewed UX expert Francisco Inchauste on exactly this topic.
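To make the Fitts's Law point concrete, here is a minimal sketch using the common Shannon formulation, MT = a + b * log2(D / W + 1), where D is the distance to the target and W is its width. The constants `a` and `b` below are illustrative assumptions, not values measured with our application; in practice they would have to be fitted from user testing.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts's index of difficulty in bits (Shannon formulation)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1.0)


def movement_time(distance: float, width: float,
                  a: float = 0.2, b: float = 0.1) -> float:
    """Predicted time (seconds) to reach a target.

    a and b are hypothetical constants chosen only for illustration.
    """
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # Same travel distance, two button widths: the wider target is easier.
    for width in (50, 100, 200):
        print(width, round(movement_time(300, width), 3))
```

The takeaway for layout is the ratio D / W: halving the distance between selections or doubling the size of a target lowers the difficulty by a similar amount.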
2. Active Regions: Highlighting selection buttons on screen when a user’s hand hovers over them (no cursor).
Check out a real-life example of this in our gesture-controlled media selection demo we created in partnership with Jinni for CES 2012. Instead of a cursor appearing on screen, when a user’s hand moves over the selection options, the selection buttons activate indicating that the user is in the “area” of a specific selection. The specific feedback may be manifested by buttons lighting up, enlarging, or creating a “shadow”.
Pros:
- No issue of “accuracy”. This significantly reduces the accuracy problem that we saw above with the cursor method. As long as a user is in the general area of the selection button, he is able to make a selection. This means that the tracking feels much more stable for end users: they don’t see the small jumps and shakiness in the tracking data.
- Fewer “false” selections. Bigger buttons mean less chance for error (see Fitts’s Law). With bigger, more clickable areas it is much easier for the user to focus on their target, point at it and reach it.
Cons:
- Changes in the interface design. This feedback method requires that you construct your interface to fit this approach, creating from the get-go large “selection buttons”.
- No constant feedback for users. In contrast to the mouse cursor approach, users here only know whether their hand is selecting a certain icon or not. The feedback received can be somewhat vague, without offering guidance on how far your hand must move to reach the next icon.
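As a rough illustration of the active-region idea, the sketch below maps a tracked hand position to at most one highlighted button, with no cursor drawn. The `Region` type, the `pick_region` function and the `sticky_margin` parameter are hypothetical names introduced for this example. The "sticky" behavior (keeping the current button active until the hand clearly leaves it) is one assumed way of hiding tracking jitter, not a description of how our demo was built.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Region:
    """A rectangular selection button in screen coordinates."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float, margin: float = 0.0) -> bool:
        return (self.x - margin <= px <= self.x + self.width + margin
                and self.y - margin <= py <= self.y + self.height + margin)


def pick_region(px: float, py: float, regions: Sequence[Region],
                current: Optional[Region] = None,
                sticky_margin: float = 20.0) -> Optional[Region]:
    """Return the region to highlight for a hand at (px, py)."""
    # Keep the current region while the hand stays within its enlarged
    # bounds, so small jumps in the tracking data do not cause flicker.
    if current is not None and current.contains(px, py, sticky_margin):
        return current
    for region in regions:
        if region.contains(px, py):
            return region
    return None


if __name__ == "__main__":
    play = Region("play", 0, 0, 200, 200)
    stop = Region("stop", 220, 0, 200, 200)
    active = pick_region(100, 100, [play, stop])
    active = pick_region(210, 100, [play, stop], current=active)
    print(active.name if active else None)
```

Because the user never sees a pointer, the only visible state is which button is lit, which is also why this method cannot show how far the hand is from the next button.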
3. 3D Hand Model: Creation of a Hand Avatar. Animating a 3D model of a user’s hand and turning it into a “hand avatar”. Every joint in the user’s hand is directly mapped to the model’s joints. Imagine the extension of a user’s hand into the application.
Pros:
- Very detailed feedback. This example offers a clear physical representation of your hand on screen, representing each and every one of your actual movements. It’s almost as though your hand has extended into the application on screen. There are two options for physically demonstrating this:
- Mirror – where the screen reflects a mirror version of your hand
- First person – the user sees their hand on screen the way they see their hand in front of them
- Creation of an immersive world. You can simulate collisions and physics in order to interact with virtual objects as if they were real – pick them up, push them, squash them etc. You become the puppeteer – moving a virtual you in a virtual world.
Cons:
- Issue of the “uncanny valley”. A hypothesis widely used in robotics and animation, the uncanny valley holds that a figure that looks almost, but not quite, real tends to unsettle people. A delicate balance must therefore be struck to create a virtual hand that users will find engaging rather than off-putting.
- Very sensitive to tracking issues. Since most of the data from the tracking system is used all of the time, any error will be instantly seen. Essentially, it runs the possibility of introducing noise to the experience even though the actual points of interest (such as the index finger) are stable on their own. This issue can arise in any tracking system, even the most accurate ones.
- Requires a certain amount of control. It sounds strange, but we are so used to controlling interfaces in a 2D interaction space that having full control of a 3D hand inside the screen can actually make for a clumsy and awkward experience.
- High sensitivity to the perspective of the interface. Since this is a virtual world the hand might be rendered in a different perspective than what the user is used to, resulting in a disorientating experience for most of the users we tested.
Our current model for the Practical UI: Using Augmented Reality as a Feedback Method
An augmented hand means rendering the hand’s image in each frame and overlaying the user interface on top of that. You use your own hand virtually represented on-screen as the pointing device and the feedback system becomes very responsive and actually fun and intuitive for most users (See part 1). Why?
- Incredibly intuitive system. A user raises their hand to get started and immediately sees it reflected on screen, requiring almost no explanation on how to use the system.
- Constant feedback. The user always understands if the camera is tracking them or not based on whether their hand is being shown on the screen, providing quiet reassurance to the user. Moreover, the user has accurate and constant feedback for his position in space.
- Use of all three dimensions. As the user’s hand moves closer to or farther from the camera, their virtual augmented hand inside the application becomes larger or smaller, respectively. The user can easily understand their area of influence inside the screen.
- No pointer required, which allows for lower latency. Tracking systems typically smooth their output to provide stable data, and that smoothing adds latency. In this instance, however, we aren’t rendering an actual pointer since the user’s hand becomes the pointer. Therefore, using “behind-the-scenes” calculations we were able to remove a lot of the smoothing, and with it much of the latency in the application.
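The trade-off between smoothing and latency in the last point can be sketched with a simple exponential moving average. This filter is only an assumed stand-in: the post does not say which smoothing algorithm the tracking system uses, and the function names and `alpha` values below are illustrative.

```python
from typing import List, Sequence


def exponential_smooth(samples: Sequence[float], alpha: float) -> List[float]:
    """Exponential moving average; a lower alpha means heavier smoothing."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    state = None
    for sample in samples:
        state = sample if state is None else alpha * sample + (1 - alpha) * state
        smoothed.append(state)
    return smoothed


def frames_to_settle(alpha: float, tolerance: float = 0.05) -> int:
    """Frames until the filter is within `tolerance` of a sudden unit move."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    remaining = 1.0  # fraction of the move the filter has not caught up with
    frames = 0
    while remaining > tolerance:
        remaining *= 1 - alpha
        frames += 1
    return frames


if __name__ == "__main__":
    for alpha in (0.2, 0.8):
        print(alpha, frames_to_settle(alpha))
```

With heavy smoothing (`alpha = 0.2`) the filtered position needs 14 frames to catch up with a sudden hand movement, compared with 2 frames for light smoothing (`alpha = 0.8`). At 60 fps, 14 frames is roughly a quarter of a second, which is the kind of lag a drawn pointer makes visible.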
There are, however, a few things to keep in mind:
- First, it requires rendering a constant full-screen video stream at 60 fps, which can have an impact on performance.
- This one is sneaky: what part of a user’s hand is considered the selector? Is it your index finger? The palm of your hand? More about this topic in an upcoming post.
- Pay attention to the details: we didn’t use a classic augmented reality hand. Instead we created almost a silhouette of the hand by cutting the hand out of the background. Rather than seeing a user’s face and arm, all you see is a subtle hand. This doesn’t just provide a more aesthetically pleasing interface; it also reduces the distraction level for the end user, allowing them to focus more on the experience.
- Finally, you will have to design the application so that the hand is visible in all circumstances. For example, when the hand is behind an element (i.e., a button or menu) it becomes obscured and the feedback is lost. Alternatively, if you render the hand above everything else you run the risk of blocking elements on screen (not to mention the fact that it is strange to interact with elements that are behind your hand). We addressed this issue by rendering the hand twice – one rendered opaque in the background and one as an outline in the foreground, thus solving both problems.
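The double-rendering idea in the last point comes down to draw order: opaque hand first, interface elements next, hand outline last. The sketch below shows that order on a tiny character grid, which is only a stand-in for a real renderer. The `outline` and `composite` functions and the glyphs are hypothetical, introduced purely for illustration.

```python
from typing import Iterable, List, Set, Tuple

Cell = Tuple[int, int]


def outline(mask: Set[Cell]) -> Set[Cell]:
    """Cells of the mask that border at least one cell outside the mask."""
    edge: Set[Cell] = set()
    for x, y in mask:
        neighbours = ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
        if any(n not in mask for n in neighbours):
            edge.add((x, y))
    return edge


def composite(width: int, height: int,
              hand: Set[Cell], ui: Set[Cell]) -> List[str]:
    """Paint three layers bottom to top: hand, UI, hand outline."""
    canvas = [["." for _ in range(width)] for _ in range(height)]

    def paint(cells: Iterable[Cell], glyph: str) -> None:
        for x, y in cells:
            if 0 <= x < width and 0 <= y < height:
                canvas[y][x] = glyph

    paint(hand, "H")           # opaque hand in the background
    paint(ui, "B")             # buttons and menus cover the hand
    paint(outline(hand), "o")  # outline stays visible on top of the UI
    return ["".join(row) for row in canvas]


if __name__ == "__main__":
    hand = {(x, y) for x in range(1, 4) for y in range(3)}
    button = {(x, 1) for x in range(2, 6)}
    print("\n".join(composite(6, 3, hand, button)))
```

In the printed grid the button remains readable inside the hand's outline, while the outline itself is never hidden by the button.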
In Brief…
Gesture recognition is an amazing technology that allows the user to interact with devices on his own terms. But it is an entirely new paradigm that requires a different approach to design and feedback systems. If you simply extend traditional approaches that work well for a different modality (say, touch), you’ll almost always find that they don’t fit gesture.
“Feedback” in a gesture-based system should be subtle yet constant, informative yet fun, and always intuitive.
Gestural interfaces and 3D sensors offer us new ways of interacting with machines, computers and applications. As designers we need to keep these do’s and don’ts in mind in order to create clear and responsive feedback systems.











