
A guided tour of my recent papers on keyboard research

Keywords: text input, text entry, keyboard, mobile HCI, smartphones, gesture typing, shape writing, swipe keyboard, word-gesture keyboard, laws of action, Fitts' law, motor skills, typing.


This is an annotated list of papers my co-authors and I have published on keyboards and closely related topics since 2012. It can be read as a guided tour of the research my colleagues and I have done in this area.


  • Shumin Zhai and Per Ola Kristensson. 2012. The word-gesture keyboard: reimagining keyboard interaction. Commun. ACM 55, 9 (September 2012), 91-101. DOI: https://doi.org/10.1145/2330667.2330689. PDF, HTML (with video)

  • This is a summary of a series of papers (2003-2010): a retrospective overview of a long research program establishing the word-gesture keyboard paradigm. It was invited by the Communications of the ACM Research Highlights editorial board (Stu Card being the action editor), with a Foreword by Bill Buxton.

  • Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts law: modeling finger touch with Fitts' law. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). ACM, New York, NY, USA, 1363-1372. DOI: https://doi.org/10.1145/2470654.2466180. Slides.

One of the most fundamental human performance models brought to HCI by Card, English, and Burr (1978) is Fitts' law. It was foundational to analyzing mouse-based desktop GUI design and stylus-based keyboard optimization. This project shows how Fitts' law breaks down for finger touch on mobile devices, and how we "remodelled" it into the Finger Fitts law (FFitts law) to make it a useful tool for analyzing touchscreen operations, including tapping on touchscreen keyboards.
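The core idea can be sketched as follows, under the common effective-width convention W_e = sqrt(2*pi*e)*sigma: the observed touch-point spread is assumed to mix the speed-accuracy noise that Fitts' law describes with an absolute finger-precision component, so subtracting the variances recovers an index of difficulty usable for finger touch. The constants and function names here are illustrative, not the paper's exact formulation.

```python
import math

def fitts_id(amplitude, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return math.log2(amplitude / width + 1)

def ffitts_id(amplitude, sigma, sigma_a):
    """FFitts-style index of difficulty (a sketch of the dual-distribution
    idea): sigma is the observed endpoint spread, sigma_a the absolute
    finger-precision component; subtracting the variances yields an
    effective width for the Shannon formulation."""
    effective_width = math.sqrt(2 * math.pi * math.e * (sigma ** 2 - sigma_a ** 2))
    return math.log2(amplitude / effective_width + 1)
```

Note that a nonzero sigma_a raises the index of difficulty relative to the naive Fitts estimate, reflecting the fact that fingers are less precise pointers than cursors or styli.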



  • Shiri Azenkot and Shumin Zhai. 2012. Touch behavior with different postures on soft smartphone keyboards. In Proceedings of the 14th international conference on Human-computer interaction with mobile devices and services (MobileHCI '12). ACM, New York, NY, USA, 251-260. DOI: https://doi.org/10.1145/2371574.2371612. Slides.

This experiment is foundational to the construction of spatial models for decoding tap typing. The distribution of tapping touch centroids changes with "hand posture" (right index finger, right thumb, two thumbs) and with the region of the keyboard. Right-thumb tapping points in the right region of the keyboard tended to shift to the right of the key centers, for example, showing that the early idea of a universal (typically vertical) "offset" in decoding was naive. We did not find a meaningful or significant bias due to tap-to-tap movement (direction or distance). Taken together, a Gaussian distribution around each key center is a good baseline spatial model.
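A baseline spatial model of this kind can be sketched as a Gaussian likelihood around each key center, optionally shifted by a posture-dependent offset. The key coordinates, offsets, and standard deviations below are hypothetical placeholders, not values from the paper.

```python
import math

# Hypothetical key centers (x, y) in pixels for a few Qwerty keys.
KEY_CENTERS = {"q": (15.0, 20.0), "w": (45.0, 20.0), "e": (75.0, 20.0)}

def gauss2d(dx, dy, sx, sy):
    """Unnormalized-free 2D Gaussian density with independent axes."""
    return math.exp(-0.5 * ((dx / sx) ** 2 + (dy / sy) ** 2)) / (2 * math.pi * sx * sy)

def key_likelihoods(tap, offset=(0.0, 0.0), sx=12.0, sy=14.0):
    """P(key | tap) under a Gaussian centered on each key, shifted by a
    posture-dependent offset (e.g. right-thumb taps drifting rightward)."""
    tx, ty = tap
    ox, oy = offset
    scores = {k: gauss2d(tx - (cx + ox), ty - (cy + oy), sx, sy)
              for k, (cx, cy) in KEY_CENTERS.items()}
    z = sum(scores.values())
    return {k: v / z for k, v in scores.items()}
```

A tap near a key's (possibly offset) center then yields the highest posterior for that key, which a decoder can combine with a language model.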


  • Xiaojun Bi, Ciprian Chelba, Tom Ouyang, Kurt Partridge, and Shumin Zhai. 2012. Bimanual gesture keyboard. In Proceedings of the 25th annual ACM symposium on User interface software and technology (UIST '12). ACM, New York, NY, USA, 137-146. DOI: https://doi.org/10.1145/2380116.2380136

One advantage of tapping with two thumbs is that letter sequences tend to alternate between the two sides of Qwerty. Bimanual gesture typing allows the two thumbs to gesture word fragments, thereby saving travel distance.


  • Ying Yin, Tom Yu Ouyang, Kurt Partridge, and Shumin Zhai. 2013. Making touchscreen keyboards adaptive to keys, hand postures, and individuals: a hierarchical spatial backoff model approach. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). ACM, New York, NY, USA, 2775-2784. DOI: https://doi.org/10.1145/2470654.2481384

This experiment puts the findings of Azenkot & Zhai (2012) into practice by making touchscreen keyboard decoding adaptive to keys, hand postures, and individuals. The accuracy gain of this approach was greater at the letter level than at the word level with language-model correction. Being able to determine the hand posture instantaneously, rather than classifying it from behavioral data (essentially Fitts' law modelling), may bring greater gains in the future.
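The backoff idea can be sketched as walking a hierarchy from the most specific spatial model (this user, this posture, this key) to more general ones, falling back whenever too few observations support a level. The model-table keys, sample counts, and threshold below are illustrative, not the paper's exact scheme.

```python
def backoff_spatial_model(models, context_keys, min_samples=50):
    """Hierarchical backoff (a sketch): try the most specific spatial model
    first and back off to more general ones when data is too sparse."""
    for key in context_keys:  # ordered most-specific to most-general
        model = models.get(key)
        if model is not None and model["n"] >= min_samples:
            return model
    return models["global"]

# Hypothetical model table: each entry holds its sample count n and a
# Gaussian (mean offset, std) fitted at that level of the hierarchy.
models = {
    "alice/thumb/g": {"n": 12, "mean": (3.0, -1.0), "std": 9.0},
    "thumb/g":       {"n": 480, "mean": (2.1, -0.4), "std": 11.0},
    "global":        {"n": 100000, "mean": (0.0, 0.0), "std": 13.0},
}

# Only 12 samples exist for alice's thumb on "g", so the posture-level
# model is chosen instead.
chosen = backoff_spatial_model(models, ["alice/thumb/g", "thumb/g"])
```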


  • Andrew Fowler, Kurt Partridge, Ciprian Chelba, Xiaojun Bi, Tom Ouyang, and Shumin Zhai. 2015. Effects of Language Modeling and its Personalization on Touchscreen Typing Performance. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems(CHI '15). ACM, New York, NY, USA, 649-658. DOI: https://doi.org/10.1145/2702123.2702503

This simulation, based on the Enron email corpus, evaluates the error-correction power of n-gram language models on touchscreen finger tap typing. It also measures the further gains made by blending a generic background LM with a personal LM learned from the user's writing history.
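One simple way to blend a background LM with a personal LM is linear interpolation of their probabilities; the weight and the toy unigram distributions below are illustrative, and the paper's actual blending scheme may differ.

```python
def blend(p_background, p_personal, lam=0.3):
    """Linear interpolation of a generic background LM with a personal LM.
    lam is the personal weight (an assumed, illustrative parameter)."""
    words = set(p_background) | set(p_personal)
    return {w: (1 - lam) * p_background.get(w, 0.0) + lam * p_personal.get(w, 0.0)
            for w in words}

# Toy unigram LMs: the personal model boosts words this user actually writes.
background = {"hello": 0.6, "hi": 0.4}
personal   = {"hi": 0.9, "howdy": 0.1}
mixed = blend(background, personal, lam=0.5)
```

Because both inputs are proper distributions, the interpolated model remains one, and words unique to the personal history (like "howdy" here) gain nonzero probability.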


  • Shyam Reyal, Shumin Zhai, and Per Ola Kristensson. 2015. Performance and User Experience of Touchscreen and Gesture Keyboards in a Lab Setting and in the Wild. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). ACM, New York, NY, USA, 679-688. DOI: https://doi.org/10.1145/2702123.2702597

This is an empirical study of users' tap-typing and gesture-typing performance with Google Keyboard, both in the lab and "in the wild," by embedding typing tests amid their daily activities. Gesture typing showed a greater advantage in the wild than in the lab. Those who had not used, or were not aware of, gesture typing shifted to it as their primary input method after the month-long study. This finding offers the best explanation of the "swiper" and "tapper" user groups: the latter simply have not discovered or learned gesture typing, as suggested in this PCWorld review.



  • Brian A. Smith, Xiaojun Bi, and Shumin Zhai. 2015. Optimizing Touchscreen Keyboards for Gesture Typing. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15). ACM, New York, NY, USA, 3365-3374. DOI: https://doi.org/10.1145/2702123.2702357

Word gestures on Qwerty tend to conflict with each other: tip vs. top, in vs. on, for vs. four, write vs. wrote are just a few examples. This project puts forward the concept of gesture typing clarity and lets it drive a simulated-annealing optimization of the gesture typing keyboard layout. The optimization also accommodated speed (gesture distance) and learnability (similarity to Qwerty).
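A minimal sketch of the approach, under simplifying assumptions: each word's "ideal trace" is a straight-line polyline through its key centers, clarity is the mean distance from each word's trace to its nearest neighbor's, and simulated annealing swaps key positions to raise clarity. The lexicon, layout geometry, and annealing schedule are all illustrative, and this version optimizes clarity alone, ignoring the speed and Qwerty-similarity terms.

```python
import math, random

ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
LEXICON = ["tip", "top", "in", "on", "for", "four"]  # confusable pairs from the text

def make_layout(rows):
    """Key centers on a staggered grid (illustrative geometry)."""
    return {c: (col + 0.3 * row, float(row))
            for row, r in enumerate(rows) for col, c in enumerate(r)}

def resample(points, n=16):
    """Resample a polyline through key centers to n equidistant points."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1] or 1.0
    out = []
    for i in range(n):
        t = total * i / (n - 1)
        j = next(k for k in range(1, len(dists))
                 if dists[k] >= t or k == len(dists) - 1)
        seg = (dists[j] - dists[j - 1]) or 1.0
        a = (t - dists[j - 1]) / seg
        (x0, y0), (x1, y1) = points[j - 1], points[j]
        out.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    return out

def clarity(layout, lexicon=LEXICON):
    """Mean nearest-neighbor distance between ideal traces (higher = clearer)."""
    traces = {w: resample([layout[c] for c in w]) for w in lexicon}
    def trace_dist(a, b):
        return sum(math.hypot(p[0] - q[0], p[1] - q[1])
                   for p, q in zip(a, b)) / len(a)
    return sum(min(trace_dist(traces[w], traces[v]) for v in lexicon if v != w)
               for w in lexicon) / len(lexicon)

def anneal(layout, steps=300, temp=1.0, cooling=0.99, seed=1):
    """Simulated annealing over random key swaps, keeping the best layout seen."""
    rng = random.Random(seed)
    keys = list(layout)
    cur, cur_score = dict(layout), clarity(layout)
    best, best_score = dict(cur), cur_score
    for _ in range(steps):
        a, b = rng.sample(keys, 2)
        cur[a], cur[b] = cur[b], cur[a]          # propose a swap
        s = clarity(cur)
        if s > cur_score or rng.random() < math.exp((s - cur_score) / temp):
            cur_score = s                         # accept (possibly worse) move
            if s > best_score:
                best, best_score = dict(cur), s
        else:
            cur[a], cur[b] = cur[b], cur[a]       # undo the swap
        temp *= cooling
    return best, best_score

qwerty = make_layout(ROWS)
optimized, best_clarity = anneal(qwerty)
```

Since the best layout seen is retained, the result is never worse than the starting layout on this clarity metric.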


All modern keyboards suggest word completions as the user taps out letters incrementally. This experiment measures the time cost of reading, recognizing, and selecting the right suggestion in comparison to simply tapping out the rest of the word.


This project builds generative models of gesture typing from the minimum-jerk theory of human motor control. Such models enable simulation-based evaluation of the effect of personalized language modelling on gesture-typing keyboard accuracy, based on real text-writing corpora.
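A building block of such models is the minimum-jerk movement segment between two points, whose normalized position profile s(t) = 10t^3 - 15t^4 + 6t^5 is a standard result of the theory (zero velocity and acceleration at both endpoints). The sampling count below is arbitrary, and real generative models compose and perturb many such segments.

```python
def min_jerk_path(p0, p1, n=20):
    """Sample a minimum-jerk movement from p0 to p1: position follows the
    quintic profile s(t) = 10t^3 - 15t^4 + 6t^5, which has zero velocity
    and zero acceleration at t = 0 and t = 1."""
    path = []
    for i in range(n):
        t = i / (n - 1)
        s = 10 * t ** 3 - 15 * t ** 4 + 6 * t ** 5
        path.append((p0[0] + s * (p1[0] - p0[0]),
                     p0[1] + s * (p1[1] - p0[1])))
    return path
```

By symmetry the profile passes through the halfway point at t = 0.5, with most of the speed concentrated mid-movement, which is what gives gesture traces their characteristic smooth look.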


  • Mitchell Gordon, Tom Ouyang, and Shumin Zhai. 2016. WatchWriter: Tap and Gesture Typing on a Smartwatch Miniature Keyboard with Statistical Decoding. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16). ACM, New York, NY, USA, 3817-3821. DOI: https://doi.org/10.1145/2858036.2858242

The “fat finger” problem becomes more pronounced on smartwatches. Traditional HCI techniques tried to solve it with multi-step selection of each letter. Here we show that modern statistical decoding, powered by machine intelligence, outperforms these traditional techniques by margins of over 100%.
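The statistical-decoding idea can be sketched as a noisy-channel search: score each candidate word by a language-model log prior plus a per-tap Gaussian spatial log likelihood, and return the best. The key coordinates, vocabulary, and sigma below are hypothetical, and production decoders are far more sophisticated (beam search, richer LMs, learned spatial models).

```python
import math

# Hypothetical key centers on a staggered Qwerty grid (key-width units).
KEY_CENTERS = {"t": (4.0, 0.0), "g": (4.3, 1.0), "h": (5.3, 1.0), "e": (2.0, 0.0)}

def decode(taps, vocab, sigma=0.5):
    """Noisy-channel tap decoding: argmax over words of
    log P(word) + sum of Gaussian spatial log likelihoods per tap."""
    def log_spatial(tap, char):
        cx, cy = KEY_CENTERS[char]
        return -((tap[0] - cx) ** 2 + (tap[1] - cy) ** 2) / (2 * sigma ** 2)
    best, best_score = None, -math.inf
    for word, log_prior in vocab.items():
        if len(word) != len(taps):
            continue  # toy simplification: length must match
        score = log_prior + sum(log_spatial(t, c) for t, c in zip(taps, word))
        if score > best_score:
            best, best_score = word, score
    return best

# The middle tap lands nearer "g" than "h", yet the language-model prior
# rescues the intended word "the" over the noise string "tge".
vocab = {"the": math.log(0.05), "tge": math.log(1e-7)}
taps = [(4.0, 0.1), (4.6, 1.0), (2.1, 0.1)]
```

This willingness to override literal tap positions with linguistic evidence is what lets statistical decoding beat letter-by-letter selection on tiny keys.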

© 2018 by Shumin Zhai
