Touch Gestures for Application Design

by Luke Wroblewski on October 9, 2012

In my third video for Intel Software Partners on re-imagining desktop application design, I walk through an overview of touch gestures, how we can use them in our applications, and ways to make gesture-based interactions more discoverable for the people using our apps.

Gestures are how people get things done on touch interfaces. When coupled with keyboard and mouse interactions, touch gestures expand our palette of input capabilities and the ways people accomplish tasks within our applications. So let's look at how we can apply gestures to an existing desktop application design:

Complete Transcript

Welcome to the Re-imagining apps for Ultrabook series with Luke Wroblewski. Today we’ll continue our look at the impact of touch on desktop application design with an overview of touch gestures, how we can use them in our applications, and ways to make gesture-based interactions more discoverable for the people using our apps.

Earlier in the series we talked about the high level principles behind designing for touch and outlined how and where we should present touch targets in our user interfaces. Today we’ll build on that foundation and drill into the use of touch gestures for interaction design.

Gestures are how people get things done on touch interfaces. When coupled with keyboard and mouse interactions, like on touch-enabled Ultrabooks, touch gestures expand our palette of input capabilities and the ways people accomplish tasks within our applications.

So what kinds of gesture interactions are possible on touch screens? A while back, Craig Villamor, Dan Willis, and I looked at the touch capabilities of many different operating systems. Amongst all the technical documentation, we found a remarkable consistency of supported gesture types. Though few of these will be a surprise to anyone who’s ever used a touch interface, it’s useful to articulate the differences between each gesture type, starting from the simplest gesture first: a tap. A tap gesture consists of nothing more than briefly touching the surface of a screen with your fingertip. Doing so quickly twice is a double tap. Both tap and double tap gestures can be multi-finger operations as well. However, performing and detecting multi-finger double tap operations requires quite a bit of dexterity. In general, the simpler the gesture the better.
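To make that distinction concrete, here’s a minimal sketch of how a tap might be told apart from a double tap using DOM pointer events. The 300 ms window, the 10 px movement tolerance, and the onTap helper name are illustrative assumptions rather than values from any platform guideline.

```typescript
// A minimal sketch (not platform code) of telling a tap from a double tap with
// DOM pointer events. The 300 ms window and 10 px tolerance are assumed values.
const DOUBLE_TAP_WINDOW_MS = 300;
const TAP_SLOP_PX = 10;

function onTap(element: HTMLElement, handlers: { tap: () => void; doubleTap: () => void }) {
  let lastTapTime = 0;
  let downX = 0;
  let downY = 0;

  element.addEventListener("pointerdown", (e: PointerEvent) => {
    downX = e.clientX;
    downY = e.clientY;
  });

  element.addEventListener("pointerup", (e: PointerEvent) => {
    // A touch that drifted too far is a drag or swipe, not a tap.
    if (Math.hypot(e.clientX - downX, e.clientY - downY) > TAP_SLOP_PX) return;

    const now = e.timeStamp;
    if (now - lastTapTime < DOUBLE_TAP_WINDOW_MS) {
      handlers.doubleTap();
      lastTapTime = 0; // reset so a triple tap doesn't fire doubleTap twice
    } else {
      handlers.tap();
      lastTapTime = now;
    }
  });
}
```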

After the tap, swipe and flick gestures are probably the most versatile in touch-enabled operating systems. That is, there are lots of interactions controlled by moving a fingertip across the surface of the screen without losing contact, a swipe, or quickly brushing the surface with a fingertip, a flick. Multi-finger versions of swipes are fairly common, while multi-finger versions of flicks are, again, challenging to perform and detect.
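The practical difference between the two usually comes down to how fast the finger is moving when it lifts. Here’s a rough sketch of that idea, assuming a 0.5 px/ms velocity threshold (an illustrative value; real platforms tune this).

```typescript
// A minimal sketch distinguishing a swipe from a flick by release velocity.
const FLICK_VELOCITY_PX_PER_MS = 0.5; // assumed threshold for illustration

function onSwipeOrFlick(
  element: HTMLElement,
  handlers: { swipe: (dx: number) => void; flick: (dx: number) => void }
) {
  let startX = 0;
  let startTime = 0;

  element.style.touchAction = "none"; // let our handlers, not native panning, see the moves

  element.addEventListener("pointerdown", (e: PointerEvent) => {
    startX = e.clientX;
    startTime = e.timeStamp;
  });

  element.addEventListener("pointerup", (e: PointerEvent) => {
    const dx = e.clientX - startX;
    const elapsed = e.timeStamp - startTime;
    if (elapsed <= 0 || dx === 0) return;

    const velocity = Math.abs(dx) / elapsed; // pixels per millisecond
    if (velocity > FLICK_VELOCITY_PX_PER_MS) {
      handlers.flick(dx); // quick brush across the surface
    } else {
      handlers.swipe(dx); // slower move with the finger staying in contact
    }
  });
}
```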

Simpler multi-touch gestures can be, and often are, used effectively. The most common of these are pinch, bringing two fingertips on the screen closer together, and spread, moving two fingertips on the screen further apart. These gestures can also be detected with any number of fingers moved closer together or further apart.
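Under the hood, pinch and spread both reduce to watching the distance between touch points change. A minimal sketch, assuming DOM touch events and hypothetical pinch/spread handler names:

```typescript
// A minimal sketch of detecting pinch vs. spread from the change in distance
// between the first two touch points.
function onPinchSpread(
  element: HTMLElement,
  handlers: { pinch: (scale: number) => void; spread: (scale: number) => void }
) {
  let startDistance = 0;

  const distanceBetween = (touches: TouchList) =>
    Math.hypot(
      touches[0].clientX - touches[1].clientX,
      touches[0].clientY - touches[1].clientY
    );

  element.addEventListener("touchstart", (e: TouchEvent) => {
    if (e.touches.length >= 2) startDistance = distanceBetween(e.touches);
  });

  element.addEventListener("touchmove", (e: TouchEvent) => {
    if (e.touches.length < 2 || startDistance === 0) return;
    const scale = distanceBetween(e.touches) / startDistance;
    // Fingers moving closer together is a pinch; moving apart is a spread.
    if (scale < 1) handlers.pinch(scale);
    else if (scale > 1) handlers.spread(scale);
  });

  element.addEventListener("touchend", () => {
    startDistance = 0; // gesture ends when the fingers lift
  });
}
```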

Press gestures require touching the surface of a screen for an extended period of time in order to be recognized. While one finger is pressed against the screen, another can simultaneously tap or drag elsewhere, forming more complex two-finger gestures. Though press gestures are often used for operating system actions, in their latest touch guidelines Microsoft advises that most developers avoid using press gestures because they can be difficult to perform and time correctly.
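The timing difficulty Microsoft calls out shows up even in a bare-bones recognizer: a press is essentially a timer that has to survive until the hold threshold without the finger lifting or drifting. A minimal sketch, where the 600 ms hold time and 10 px tolerance are illustrative assumptions, not Windows guideline values:

```typescript
// A minimal sketch of recognizing a press (touch and hold): start a timer on
// pointerdown and cancel it if the finger lifts or drifts too far.
const PRESS_HOLD_MS = 600;
const PRESS_SLOP_PX = 10;

function onPress(element: HTMLElement, handler: () => void) {
  let timer: number | undefined;
  let downX = 0;
  let downY = 0;

  const cancel = () => {
    if (timer !== undefined) window.clearTimeout(timer);
    timer = undefined;
  };

  element.addEventListener("pointerdown", (e: PointerEvent) => {
    downX = e.clientX;
    downY = e.clientY;
    timer = window.setTimeout(handler, PRESS_HOLD_MS);
  });

  element.addEventListener("pointermove", (e: PointerEvent) => {
    // A press that drifts becomes a drag, not a press.
    if (Math.hypot(e.clientX - downX, e.clientY - downY) > PRESS_SLOP_PX) cancel();
  });

  element.addEventListener("pointerup", cancel);
  element.addEventListener("pointercancel", cancel);
}
```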

Last but not least in our quick survey of the most common touch gestures is rotate. Looking across touch-enabled operating systems shows there’s actually quite a bit of divergence in how rotate gestures work: two fingers rotating along the opposing edges of a circle, pressing with one finger and rotating with another, or using two fingers to rotate. Microsoft’s touch language uses the first method, rotating with two or more fingers. With the exception of rotate, most operating systems implement common touch gestures like tap, swipe, flick, pinch, spread, and press quite consistently. Which is great, because as more devices support touch, it’s quite likely people will be using their fingers to control more than one operating system in their lives.

Having a basic vocabulary of which touch gestures are commonplace is just step one to meeting people’s expectations in touch-based interactions. Perhaps more important is knowing how these gestures align with the things people want to accomplish. To that end, our survey of touch-enabled operating systems also looked at which gestures were most frequently associated with common tasks for navigating, manipulating objects, and managing interfaces. I mentioned earlier that press is quite often used for system tasks like changing modes or bringing up contextual information or actions. Double tap, on the other hand, is frequently used for opening items or zooming into and out of them. Despite a lot of commonality, each operating system ultimately sets its own rules. In the Windows touch language, for example, press brings up detailed information or in-context menus, and a single tap invokes primary actions like launching an application or executing a command. So if you’re creating an app for the Ultrabook platform and Windows, refer to the Windows touch language guide linked to in the blog post accompanying this video.

In addition to system actions, touch gestures can also be used to manipulate objects. For instance, dragging or swiping across an item on the screen commonly brings up delete actions. Dragging is also frequently used for moving objects around on the screen. In the Windows touch language, swiping the finger a short distance, perpendicular to a list view, selects an object in that list and allows you to take action on it.

While gestures can be used for system and object control, they may align most naturally with navigation actions that allow us to move quickly within and between applications: scrolling areas of an application with one or more fingers, or better yet, quickly flicking the screen to move through lots of content. Not only does Microsoft recommend swiping across lists of content for panning, but the Windows touch language also suggests including the ability to pinch and stretch lists of content to quickly jump to the beginning or end. The Windows operating system also makes use of swiping from the bottom or top edge of the screen to bring up application controls and from the right side to reveal system controls. Once again, check the documents referenced in this blog post if you are building specifically for the Windows platform. If you’d like a broader overview of the touch gestures supported across multiple operating systems, including the ways they are most commonly used, take a look at the Touch Gesture Guide also included in the blog post.

With that overview under our belts, let’s take a look at applying gesture-based interactions to an existing desktop application to rethink our application for touch. In our last video (take a look if you missed it) we began the redesign of Tweester, the ultimate social networking tool for the storm chasing community. So let’s keep working on adapting that desktop application to take advantage of new Ultrabook capabilities, in this case touch gestures, and once again let’s aim not just for touch-enabled but for a touch-optimized interaction design.

Our original desktop implementation of Tweester has a feed, which can be moved up and down with a basic scrollbar widget that requires us to position our mouse cursor somewhere on the scrollbar in order to click, or click and drag, to move the list where we want it. In our touch-optimized redesign, we can simply use our fingers to push the Tweester feed up or down. Note that the content follows our finger to better align with how our hands actually move objects in the real world. Two fingers also work to move the feed at the same pace (more on that in a minute). Perhaps most convenient is the ability to flick the list for more rapid scrolling when we want to get back to the top quickly. Since we’re designing this application for the Ultrabook platform, I can still use a mouse to navigate up and down the list, or the up and down arrow keys on my keyboard. A well-designed app will allow people to use whatever is the most convenient form of input for their task, whether that’s a quick flick on the touch screen or moving up and down the list with the arrow keys on the keyboard. Let your users decide what works best for them by supporting each option well.

Recall that we allowed any number of fingers to scroll through the Tweester feed. This was intentionally done to align with Microsoft’s guideline that recommends allowing any number of fingers to perform the same action. As they point out, people often have more than one finger on a touch screen, so the basics of an interaction shouldn’t change completely with the addition of extra fingers. That is, a one-finger swipe to scroll a list shouldn’t become a show-photo-gallery action when someone swipes with two fingers instead.
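To show how this kind of feed scrolling might hang together, here’s a rough sketch using DOM touch events: the content follows the finger directly, a flick leaves the list coasting with simple momentum, and any number of fingers produces the same scroll. The feedElement name, friction value, and thresholds are assumptions for illustration, not part of any platform API.

```typescript
// A rough sketch (not production code) of a feed that scrolls the same way no
// matter how many fingers are down, with simple flick momentum on release.
function touchScrollFeed(feedElement: HTMLElement) {
  let lastY: number | null = null;
  let lastTime = 0;
  let velocity = 0; // pixels per millisecond of finger movement

  feedElement.addEventListener("touchstart", (e: TouchEvent) => {
    lastY = e.touches[0].clientY; // one, two, or five fingers: same behavior
    lastTime = e.timeStamp;
    velocity = 0;
  });

  // passive: false so we can prevent native panning and drive the scroll ourselves
  feedElement.addEventListener(
    "touchmove",
    (e: TouchEvent) => {
      if (lastY === null) return;
      e.preventDefault();
      const y = e.touches[0].clientY;
      const dy = y - lastY;
      feedElement.scrollTop -= dy; // content follows the finger 1:1
      velocity = dy / (e.timeStamp - lastTime || 1);
      lastY = y;
      lastTime = e.timeStamp;
    },
    { passive: false }
  );

  feedElement.addEventListener("touchend", (e: TouchEvent) => {
    if (e.touches.length > 0) {
      lastY = e.touches[0].clientY; // keep tracking any fingers still down
      return;
    }
    lastY = null;
    const coast = () => {
      if (Math.abs(velocity) < 0.02) return;
      feedElement.scrollTop -= velocity * 16; // ~16 ms per animation frame
      velocity *= 0.95; // friction so a flick eases to a stop
      requestAnimationFrame(coast);
    };
    coast(); // a flick (high release velocity) keeps the list moving briefly
  });
}
```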

There’s more to our Tweester feed than just scrolling. Each item in the list has a set of possible actions: we can reply, favorite, or get more information about each update. With good keyboard and mouse support we could do that with key commands and mouse clicks, but with good touch support we can easily access these supplementary actions using touch gestures as well. In this case, swiping or flicking across a list item brings up a relevant set of options. Swiping again brings back the content in the list. You’ll note that it’s still possible for someone to tap on any item in the list, or click it with a mouse, to reveal those same actions and more in a detailed panel view, but the touch gesture acts as a shortcut to quickly take action without the commitment of a full view of the data. In this way, we’re using touch gestures similarly to how keyboard shortcuts are used in many desktop apps.
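A minimal sketch of that swipe-to-reveal shortcut follows, assuming a hypothetical show-actions class that slides the action strip into view and a 40 px swipe threshold; a plain tap (or mouse click) still opens the full detailed panel.

```typescript
// A minimal sketch: a horizontal swipe or flick across a feed item toggles its
// action strip (reply, favorite, details), while a tap still opens the details.
const SWIPE_THRESHOLD_PX = 40; // assumed minimum travel to count as a swipe

function swipeToRevealActions(item: HTMLElement, openDetails: () => void) {
  let startX = 0;
  let revealed = false;

  item.style.touchAction = "pan-y"; // keep vertical list scrolling native, watch horizontal moves

  item.addEventListener("pointerdown", (e: PointerEvent) => {
    startX = e.clientX;
  });

  item.addEventListener("pointerup", (e: PointerEvent) => {
    const dx = e.clientX - startX;
    if (Math.abs(dx) < SWIPE_THRESHOLD_PX) {
      openDetails(); // a plain tap or click opens the detailed panel view
      return;
    }
    revealed = !revealed;
    item.classList.toggle("show-actions", revealed); // slide the action strip in or out
  });
}
```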

Josh Clark, author of Tapworthy, which is filled with lots of useful touch design tips, has long suggested that thinking about touch gestures like keyboard shortcuts can reveal many places where touch can enhance the efficiency of an application by giving people quick ways to get key tasks done. Of course, just like keyboard shortcuts, gestures shouldn’t be the only way to get things done. We still need to provide alternatives like the detailed view of items in the Tweester feed that can be accessed with a tap or mouse click.

Now that we’ve exposed this nice storm image in our detailed view, what happens if we want an even closer look? In a touch-optimized interface, we may want to allow someone to use a simple spread gesture to increase the size of the image until it snaps into full screen mode for up close viewing.

You’ll note that immediately when someone started to spread the image, we made it larger, thereby providing feedback that the touch gesture they were trying was going to work. Once again, we’ve done this deliberately. As Microsoft’s Windows guidelines point out, instant visual feedback, like a change in color, size, or motion, lets people know the system is responding to their gesture and gives them the confidence to move forward. When using motion for feedback, consider the start and end states of a transition as well; easing animations in and out helps provide a more natural feel.
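One way to sketch this behavior: scale the image with the fingers from the very first movement, then snap into the full-screen gallery once the spread passes a threshold. The 1.4x snap point and the openGallery callback are illustrative assumptions.

```typescript
// A minimal sketch of immediate feedback during a spread gesture on an image.
function spreadToZoom(image: HTMLElement, openGallery: () => void) {
  let startDistance = 0;

  const distanceBetween = (t: TouchList) =>
    Math.hypot(t[0].clientX - t[1].clientX, t[0].clientY - t[1].clientY);

  image.addEventListener("touchstart", (e: TouchEvent) => {
    if (e.touches.length >= 2) startDistance = distanceBetween(e.touches);
  });

  image.addEventListener("touchmove", (e: TouchEvent) => {
    if (e.touches.length < 2 || startDistance === 0) return;
    const scale = distanceBetween(e.touches) / startDistance;
    image.style.transform = `scale(${scale})`; // feedback starts with the first bit of movement
    if (scale > 1.4) {
      startDistance = 0;
      openGallery(); // snap into full-screen viewing
    }
  });

  image.addEventListener("touchend", () => {
    startDistance = 0;
    image.style.transition = "transform 200ms ease-out";
    image.style.transform = ""; // ease back if the gesture didn't complete
  });
}
```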

Once we’re in full gallery mode, all we need to do to move between the pictures we saw in the Tweester feed is swipe or flick across them. Even in this seemingly simple interaction, we’re seeing a couple of important touch principles applied.

To keep our Ultrabook application touch-optimized and aligned with people’s expectations of touch on Microsoft Windows, we’ll again make sure we’re considering the Windows touch guidelines. In this case: content follows fingers. The photo we pushed off screen stayed in touch with our finger until it moved off screen and allowed the second photo to move into place. Having content follow fingers is also a great way to ensure people have adequate feedback that their touch gesture is having an effect on the elements on the screen. The sooner we provide that feedback the better, as even a few hundred milliseconds of response time is noticeable. The second principle we’re designing toward is browsing content with touch. We’ve brought our zoomed photo into a large, full-screen gallery that can be easily browsed and panned using the kind of broad, sweeping gestures touch interfaces are great at.
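Here’s a rough sketch of the "content follows fingers" idea in the gallery: the photo strip tracks the finger horizontally while it’s down, then on release either advances to the next photo or eases back into place. The photoWidth parameter, the one-third-width threshold, and the onIndexChange callback are assumptions for illustration.

```typescript
// A minimal sketch of a gallery where the photo strip tracks the finger and an
// eased transition handles the end state on release.
function galleryPan(
  strip: HTMLElement,
  photoWidth: number,
  onIndexChange: (delta: number) => void
) {
  let startX = 0;
  let dragging = false;

  strip.style.touchAction = "none"; // we drive the horizontal panning ourselves

  strip.addEventListener("pointerdown", (e: PointerEvent) => {
    dragging = true;
    startX = e.clientX;
    strip.style.transition = "none"; // no lag: content stays under the finger
  });

  strip.addEventListener("pointermove", (e: PointerEvent) => {
    if (!dragging) return;
    strip.style.transform = `translateX(${e.clientX - startX}px)`;
  });

  strip.addEventListener("pointerup", (e: PointerEvent) => {
    dragging = false;
    const dx = e.clientX - startX;
    strip.style.transition = "transform 200ms ease-out"; // eased end state
    strip.style.transform = "translateX(0)";
    // Advance or go back a photo only if the drag traveled far enough; the
    // caller swaps the visible photo when the index changes.
    if (Math.abs(dx) > photoWidth / 3) onIndexChange(dx < 0 ? 1 : -1);
  });
}
```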

Your reaction to all this might be: touch gestures seem great, but how will anybody know these interactions are possible? After all, gesture interactions are invisible. How will people know to spread, pinch, swipe, and flick across these images? There are no visible interface elements that let people know when they’re possible, right? Well, first, because we’re designing for a multi-input platform like the Ultrabook, we can ensure touch isn’t the only way to get core tasks done. To that end, we can allow people to make their way through the gallery using a mouse to click on each image or the arrow keys on the keyboard to move back and forth. We could even support a few keyboard shortcuts to help people jump to the start and end. But we’d still have to account for situations where people’s preferred, or only, mode of interaction is through touch.

In these cases, there are several techniques we can employ to make gesture-based interactions more discoverable. If you’re thinking I’m going to suggest starting our application off with an educational tour that explains what gestures are possible and where, I’m going to disappoint you. The only thing most intro tours are good at is testing how quickly people can escape out of or skip through them. So instead we’ll look at a few different, and usually more effective, techniques. We just saw the first option in action: removal of other options. When in doubt, people will try gestures that seem like they “could work”. If we’ve done our job of providing immediate feedback and allowing content to follow fingers, we can quickly let them know their experiment worked and that they figured out how to use the interface themselves, no instructions necessary. The more we align our gestures with natural, physical motions, the more likely what people guess up front will work. Try it out on a toddler. No really, you’ll quickly see how someone without the baggage of traditional GUI interfaces expects things to move. But we can actually do some things to increase discoverability as well.

For example, in our full-screen image gallery, rather than just showing one image we can tease the fact that there are more images to explore. There are a number of visual design techniques that can achieve this effect, but in this case I’m just going to place a faded edge of each adjacent image on both sides of the screen, thereby signaling there’s more to explore in this gallery.

Well-timed and meaningful animations can also clue people into functionality. In this design, I’ve quickly moved a little set of image thumbnails in from the bottom of the screen and then hidden them. This quick motion captures people’s attention when the gallery comes up and gives them an understanding that there’s additional content and controls available. To bring the menu back up, just tap or click anywhere on the screen. While this is a common way of introducing people to additional features, I wouldn’t recommend using this specific implementation for a Windows touch-optimized application. Recall from our earlier discussion that Windows brings up app controls when someone swipes up from the bottom edge of the screen. If someone saw our teaser animation for the gallery menu, then tried to swipe up instead of tapping to bring it back, they might frustratingly bring up the application controls instead. Unless full gallery controls reside in this menu too, I’d stick with the operating system standard and not try to introduce another set of hidden contextual controls at the bottom edge of the screen. This is another reason why it makes a lot of sense to familiarize yourself with the specific guidelines of the platform you’re designing for.

Just-in-time education is another technique for making otherwise invisible touch gesture interactions known. When someone accesses our gallery, we can pop up a quick message that tells them a simple pinch will allow them to close the gallery and go back to the screen they came from.

Now they know putting two fingers on the screen and bringing them closer together will snap the gallery shut.

We probably want to support multiple fingers here as well, so people can use any number of fingers to collapse the gallery. So in addition to two fingers, someone could use three fingers.

Or even all the fingers on one hand in one big sweeping gesture that almost takes up the whole screen.

As Josh Clark, who I referenced earlier, suggests: big screens invite big gestures. That is, why not allow people to use touch gestures in comfortable ways? Mouse interactions are great at hitting small targets precisely; fingers are not. So while we can optimize for keyboard and mouse use in our desktop applications, we can also optimize for touch by supporting the kinds of gestures that make touch interactions comfortable and fun.

To summarize what we’ve talked about with touch gestures: there is a common gesture vocabulary that is supported by all touch-enabled operating systems. However, each OS has its own specific touch vocabulary, so be sure to familiarize yourself with the expectations created by each platform’s conventions. I’ve referenced the Windows touch guidelines and their unique considerations frequently in this video to help developers building apps for the Ultrabook platform get a handle on how Windows does things. That said, don’t be afraid to experiment with touch gestures. This is new terrain and there’s a lot of ground to be explored and discovered. Whether it’s opportunities that arise from looking at gestures as keyboard shortcuts, thinking about how to enable browsing of content with touch, or considering how big screens invite big gestures, there’s a world of possibility open to designers and developers as they re-imagine their existing applications or develop new applications for touch. As you explore, however, be mindful of potential discoverability issues. Touch gestures are often invisible, and techniques like content teases, animation cues, or just-in-time education can help ease people through the transition to touch.

As always, further information on developing applications to take advantage of touch gestures is available on Intel's Ultrabook developer community site. You can find the link in the blog post accompanying this video.

Thanks for your time, and I’m looking forward to having you join me in the next video in this ongoing series, where we’ll move on from the opportunities that touch provides to rethink desktop applications for the Ultrabook platform and focus on location detection instead. Until then, thanks for tuning in.

More Soon...

Stay tuned for more videos in the series coming soon...

Disclosure: I am a contracted vendor with Intel. Opinions expressed on this site are my own and do not necessarily represent Intel's position on any issue.