Now this is an interesting article, written by Lukas Mathis. He makes the compelling case that the gesture-based interface, as seen on the iPad and many modern smartphones, is actually more akin to the command line interface than to the graphical one. He also offers a number of solutions for that pesky problem of gestures being anything but discoverable.
According to Mathis, the crucial difference between the command line interface and the graphical user interface is that where the former is built around memorisation, the latter is built around recognition. When using a CLI, you need to memorise the various commands before you can use the computer; the advent of the GUI replaced memorisation with recognition.
Memorisation didn’t disappear, though. “Memorization was relegated to shortcuts,” Mathis argues. “Instead of selecting ‘Duplicate’ from a menu in order to create a copy of a file, people could use a keyboard command that they had memorized. But it was an optional, secondary way of achieving the same thing.”
Gestures are often problematic in that they are hard to discover. My personal favourite example is the “shake to undo” feature on the iPhone. I didn’t know it existed until I read about it on the internet, and I don’t know anyone ‘in real life’ who knows this feature exists. There is no way to find out about it, other than to shake the phone by accident and put two and two together.
Mathis managed to dig up even better examples of incredibly non-obvious and complex gestures that you will never find out about without actually diving into the documentation. For instance, you can move an object in Pages one pixel at a time by touching it with one finger and swiping another finger in the desired direction. The gesture for matching the sizes of two objects is even more complicated and non-obvious.
He argues that this non-obviousness inherent to current gesture-based interfaces means they once again centre around memorisation instead of recognition. “When natural user interfaces resort to non-obvious gestures, they essentially regress into a really pretty, modern version of the quaint old command line interface,” he states.
An interesting observation, and certainly one that I can agree with. I also happen to believe that despite the solutions Mathis provides us with, this lack of discoverability will ensure that gesture-based interfaces will always play second fiddle when it comes to getting real work done. As great as the iPad and Android tablets may be, they’re made for consuming, not for creating.
Ha, not hardly. Though certainly not perfect, something like an iPad is still a hell of a lot more intuitive than a CLI, assuming you’re approaching both for the very first time. I mean, don’t get me wrong… the command line does have its strengths, but ‘discoverability’ is not one of them.
I will agree that gesture-based interfaces aren’t going to replace a keyboard and mouse for getting real work done anytime soon, but they’re sure handy to have around when you’re kicked back on the couch and want to browse the web, or whatever.
“help” typed at any command line is reasonably good at pointing you in the right direction. Perhaps we need a question-mark gesture that displays a list of gestures with directions.
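To sketch the idea in modern UIKit/Swift terms – the two-finger long-press trigger and the hard-coded gesture list here are purely hypothetical placeholders standing in for a real question-mark gesture, not anything Apple ships:

```swift
import UIKit

class GestureHelpViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Hypothetical "help" trigger: a two-finger long press anywhere
        // on screen stands in for the question-mark gesture proposed above.
        let help = UILongPressGestureRecognizer(target: self,
                                                action: #selector(showGestureHelp(_:)))
        help.numberOfTouchesRequired = 2
        view.addGestureRecognizer(help)
    }

    @objc func showGestureHelp(_ recognizer: UILongPressGestureRecognizer) {
        guard recognizer.state == .began else { return }
        // In a real app this list would be generated from whatever
        // gestures the current screen actually registers.
        let alert = UIAlertController(
            title: "Available gestures",
            message: "Tap: select\nDrag: scroll\nPinch: zoom",
            preferredStyle: .alert)
        alert.addAction(UIAlertAction(title: "Got it", style: .default))
        present(alert, animated: true)
    }
}
```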
Great idea! And there’s no irony involved.
You should have patented the gesture before telling us about it. I’m sure it would have been granted.
Probably the concept of gestures is already patented.
Everything you say here is true, but none of it invalidates the point of the article, namely that gesture-based UIs require memorization instead of recognition.
That said, nothing in the linked article is particularly insightful either – this is a very well known issue and predates the multi-touch interface as Apple implements it by many years…
Apple did not fix this problem – it is rather inherent to the interface paradigm. What they did manage to do, however, was promote application interfaces which by design are limited to 3 obvious and simple control behaviors, i.e. touch to click, drag to scroll, and pinch to zoom. And they mostly stick with these.
Most applications do not attempt to go beyond these 3 mechanisms for their primary control mechanics. Some implement extended gestures, but usually only, as the article suggests, for shortcuts that are not required for basic interaction with the application.
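For what it’s worth, those three behaviours map directly onto the stock recognizers UIKit provides, which is presumably part of why applications stick to them. A minimal sketch in modern Swift, with the handler bodies left as placeholders:

```swift
import UIKit

class ContentViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // 1. Touch to click: a single tap activates whatever was hit.
        view.addGestureRecognizer(
            UITapGestureRecognizer(target: self, action: #selector(didTap(_:))))
        // 2. Drag to scroll: a pan offsets the content.
        view.addGestureRecognizer(
            UIPanGestureRecognizer(target: self, action: #selector(didPan(_:))))
        // 3. Pinch to zoom: a pinch scales the content.
        view.addGestureRecognizer(
            UIPinchGestureRecognizer(target: self, action: #selector(didPinch(_:))))
    }

    @objc func didTap(_ gesture: UITapGestureRecognizer) {
        // Hit-test and activate the touched element here.
    }

    @objc func didPan(_ gesture: UIPanGestureRecognizer) {
        // Scroll by gesture.translation(in: view), then reset so deltas stay relative.
        gesture.setTranslation(.zero, in: view)
    }

    @objc func didPinch(_ gesture: UIPinchGestureRecognizer) {
        // Apply gesture.scale to the content, then reset it for incremental scaling.
        gesture.scale = 1.0
    }
}
```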
As the applications get more complex the inherent problems with gesture based interfaces become more obvious. Another post here mentioned the possibility of a “help” gesture. This is probably a very good idea for those applications with UI paradigms which simply cannot cope with being limited to the 3 basic gestures.
I would also suggest that such applications are trying to fit a square peg into a round hole, so to speak. What I mean is that some of them will certainly become popular for what they can do as a tool, but it will be in spite of their bad UI, not because of it. The more popular applications in the App Store tend to stick to the basics and simply come up with new ways to utilize them.
Well, and you also hold down on stuff, like when moving things around the home screen and copy/pasting text. So technically, that’s 4. And to me, that seems OK, given the limited scope of the touch interfaces. One thing I really miss, though, is tooltips. Sometimes I wish I could just hover my finger over a button or some other icon to see what it does. No such luck, though.
That’s because it uses a mix of GUI and gestures. The GUI part of the UI is very easy to understand.
Still, gestures are much easier than command lines. There are no exact spellings of commands to remember; a basic idea of the gesture is usually all you need to know. No command syntax to memorize either. man pages are very helpful for that, though.
Nice article. Yes, gestures are simpler than the command line, but also much less powerful. Comparing their power-to-simplicity ratios, they would look similar.
Exact spellings make the command line difficult – true – but we also have tab auto-completion and the ability to scroll through recent commands.
I’m beginning to form the opinion that adding features to IT devices to make them easier to use in fact often makes them more complex and difficult to use.
Oh damn, I’m sounding like a power-of-the-command-line bore.
At first, when I read the summary, I was thinking there was no way in hell you could compare gestures with the CLI, but then I actually read the article, and the author has a valid point. Gestures are non-obvious and force the user to memorize by rote how to get things done, unlike the graphical menu system.
The same can be said of most GUI programs. Almost all people have to be taught how to use them, and work in a GUI environment is also largely based on memorizing different things. The same goes for gestures, except that they can be much simpler (and they are), and they can resemble something we are used to doing elsewhere, which makes them easier to learn.
The whole point of NUI research is to transfer things we are already used to into the computer environment as faithfully as possible while making them more efficient. The reason NUI is better than GUI is that it has all the same powers as the GUI, plus the power of gestures we have already used elsewhere and therefore know. One great example is the use of colour in GUIs to convey different things: even a baby reacts differently to different colours, whereas in the CLI world you first need to understand either the commands or the language to accomplish the same.
What is “NUI”?
Natural User Interface.
From the article:
I also happen to believe that despite the solutions Mathis provides us with, this lack of discoverability will ensure that gesture-based interfaces will always play second fiddle when it comes to getting real work done.
Sorry Thom, but this is the wrong conclusion. The relationship between the CLI and gestures is an interesting observation, but neither of them will, just because of discoverability issues, hold you back from being productive.
The CLI has been used for decades to get real work done. The same goes for Photoshop, 3DS Max, and any other software targeted at professionals, including the “creative” sort.
I would even claim that memorization is the key to productivity. In fact, in behavioral psychology you see a lot of work related to how the human brain adapts to frequent tasks. It all boils down to:
a) improving the memory structure (e.g. to memorize a chess board, learn to cluster it into a smaller set of entities)
b) gaining a HUGE memory (e.g. of common and then less common chess positions)
c) in this memory, learning not only states for quicker reaction, but also the needed reaction itself
So if you work long enough in Photoshop, you don’t search for that item or menu entry anymore, you don’t even ask yourself which tool you will need next. In the CLI, you don’t think about the commands you type, and sometimes, you won’t even think about typing a command at all.
The same goes for gestures and any other interface that you may master with memorization. Shortcuts/Key Accelerators are so popular on the Mac platform for the very same reason.
True. If you look at them from this point of view – targeted use cases aimed at consuming content – then the whole issue is not such a big deal. It’s not a big problem that, productivity-wise (let alone for development), gestures are worse than traditional interfaces, because they are not – at least I sincerely hope – intended for productivity and development tasks. And you don’t need a gazillion gestures for a usual consumer product, just the bare minimum that enables the content consumption, and those can be memorized fairly quickly.
Discovering them is one issue, and consistency between devices – regarding gestures – is another, but they can all be mastered given enough practice, if the device – or the content accessible through it – is worth the effort, that is.
I think we’ve all debunked the topic sufficiently, but gesturing still has a very long way to go to be useful.
From trying to write on my Palm Pilot III years ago to the UIs that offer gesturing now, it’s an incredibly inefficient process to perform a simple task. It’s a giant Rube Goldberg machine.
It’s not until multi-touch interfaces, where more than one input can register at a time, that the process seems to become easier to use. Once you have that, make it intuitive for humans (let humans, not programmers, decide how to use it).
I miss Palm’s original Graffiti alphabet: high-speed input, low rate of error in recognition… fantastic. I’ve yet to see an alphabetic gesture input as accurate outside of an on-screen keyboard (lower speed, low rate of error in recognition).
I think there is a fundamental difference between memorizing a command and typing it, and using gestures. I currently use a MacBook with jitouch (http://www.jitouch.com/), which offers a vast number of gestures. Even for things like closing a browser tab and filling out my saved passwords on a website, I have a gesture.
At least for me, those are very different from the commands I use on the CLI. For remembering the CLI commands, I have to use my brain/intellect, whereas the gestures are plain instinct (after some training, of course). I think that I want to close a tab, and my hand does it. No usage of my brain needed, just the “brain” of my hand. Now when I use the CLI, I still have to actively “think” and type. This goes really fast most of the time, but it still requires much more “brain usage” than a gesture.
I hope I could make my point clear; I am not a native English speaker.
Actually, at the core, they’re the same. I do a lot with the CLI, and I don’t even think about what commands I need most of the time. I need a list of files, I type ls before I even realize what I’ve done. It’s memory, in both cases. You think you want to do something and immediately your gesture comes to mind; I think of something and the command is typed before I even think about it. As far as the brain goes, there’s no difference except *what* is being remembered. Gestures and the CLI are certainly not the same interfaces, but the memorization behind them is identical.
That the memorization is not the same is exactly what I wanted to say. The CLI stuff is “stored” in the symbol store of the brain, whereas the gesture is stored in another part of the brain (where movement is handled). And I think that the “movement store” is easier to access than the “symbol store”.
EDIT1:
Or to put it more logically: typing a command on the CLI involves a lot of gestures, since you need to type the keys (each keypress being a gesture), while a gesture is just one.
So in the end you have:
CLI:
memorization, gesture, gesture, gesture …..
Gesture:
gesture
Edit2:
If one assumes each gesture involves the same memorization as a CLI command, it is:
CLI:
memorization(command), memorization(key), gesture(key), memorization(key), gesture(key), …
Gesture:
memorization(gesture), gesture
Of course, that only holds if you have trained the gesture as much as typing, which most people probably will do once there is a set of standard gestures.
So I would assume a gesture is more similar to a keypress than to a CLI command.
“After a bit of learning, of course” is no different for the command line. Gestures used regularly become second-nature “instinct”, as do commands typed regularly. I spend much of my CLI time in *nix and often get an error or two from “ls *.*” or similar when I first switch over to a Windows CLI. Before my *nix days, the DOS and Windows CLI commands were pretty much automatic: think of what I want to do and the commands are on the line before I realize I’ve hit Enter.
Not only gesture-controlled UIs, but also voice control seems to require much more focus and thinking.
If you ever tried Vista/7 voice recognition, you might acknowledge that.
I hate gesture interfaces and touchscreens – always have. Admittedly, much of my dislike for touchscreens comes from the fact that I can’t stand fingerprints on my screens, but more so I find them really counterintuitive compared to a normal pointing device, especially when half the time I can’t see what my ham-sized finger is actually OVER. (Same reason I can’t use Saitek joysticks; they feel like they’re sized for five-year-olds or girl-hands – certainly not my oven-mitts.) Using a stylus alleviates the problem a bit, right up until you lose the bloody thing.
Though the REAL thing I hate is the complete lack of tactile feedback. I love the good solid “SNAP” of a switch registering (why I hate dome/membrane keyboards) – there’s a reason I’ll probably be using a Model M keyboard on my workstation until the day I die. (I’ve got enough of them stockpiled that it’s no joke.)
Admittedly, for small devices they make a lot of sense, since you can’t slap a giant keyboard or mouse on one – but that’s a sacrifice that would leave me trying to come up with some other solution BESIDES the touchscreen… like, say, a D-pad and a couple of small buttons.