One of my main HCI research threads is human performance models of the Fitts' law type. They are interesting because they capture the regularities of user behavior in interacting with computing devices, and are hence useful for predicting and optimizing interaction efficacy for a given set of task and UI parameters. They can also be embedded in advanced user interface algorithms, such as those in modern touch screen keyboards.
This page / post introduces and annotates, informally, some of the projects my colleagues and I have published, in a more or less reverse chronological order.
A statistical minimum jerk model of gesture production
Philip Quinn & Shumin Zhai (2016) Modeling Gesture-Typing Movements, Human–Computer Interaction, 33:3, 234-280, DOI: 10.1080/07370024.2016.1215922
How do people "gesture type" on swiping keyboards such as Gboard? Obviously they slide their finger from the first letter toward the subsequent letters in the intended word, one after another, until reaching the last. Such a gesture stroke is not deterministic in two respects. First, the precision of passing each letter is a matter of speed-accuracy trade-off: faster gestures tend to deviate more from the center of each key. Second, between two "via points" there is an infinite number of trajectories connecting them, from which the human neuromotor control system has to choose. It turns out humans follow a classic rule in movement: minimizing the "jerk" of the movement, the third derivative of position. Putting these two aspects together and building on a body of motor control literature, we developed and trained gesture typing models that can produce gestures indistinguishable from hand-drawn gestures to the human eye or to keyboard gesture recognizers. With such models, we can simulate and evaluate keyboard algorithms on a large and longitudinal text corpus before deploying such algorithms to keyboard products and millions of users, among other applications.
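To make the minimum-jerk idea concrete, here is a minimal sketch of the classic single-segment, rest-to-rest minimum-jerk profile from the motor control literature. It is not the statistical via-point model of the paper; the key positions, duration, and function name are hypothetical and chosen only for illustration.

```python
def minimum_jerk_path(p0, p1, duration, steps=50):
    """Sample a rest-to-rest minimum-jerk movement from p0 to p1.

    p0, p1: (x, y) tuples, e.g. hypothetical key centers in pixels.
    Returns a list of (t, x, y) samples.
    """
    samples = []
    for i in range(steps + 1):
        tau = i / steps  # normalized time in [0, 1]
        # Quintic blend with zero velocity and acceleration at both ends;
        # this is the position profile that minimizes integrated squared jerk
        # for a single point-to-point movement.
        s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
        x = p0[0] + (p1[0] - p0[0]) * s
        y = p0[1] + (p1[1] - p0[1]) * s
        samples.append((tau * duration, x, y))
    return samples


# Hypothetical key centers for two letters on a keyboard layout.
path = minimum_jerk_path((40.0, 120.0), (200.0, 60.0), duration=0.3)
print(path[0], path[len(path) // 2], path[-1])
```

The resulting speed profile is bell shaped: slow near both keys and fastest midway. A full gesture model would chain such segments through several via points and add variability around each key, which this sketch does not attempt.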
Remodeling Fitts' law for the mobile-first world
Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts law: modeling finger touch with Fitts' law. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). ACM, New York, NY, USA, 1363-1372.
Fitts' law is one of the cornerstones of "quantitative HCI". One way of stating it is as follows: movement time is determined by the relative precision of the aiming task; a smaller target further away is harder (takes longer) to reach. A simple Fitts' law equation can accurately predict the time to acquire a target given its size and distance. This was really great, until we moved off the "desktop" to a "mobile first" world of finger-touch screens. The bad news is that Fitts' law no longer holds there because it does not account for an absolute error component in "fat finger" pointing. The good news is that this paper "remodeled" Fitts' law into finger Fitts' law, or the FFitts law.
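A rough sketch of the idea follows. It assumes a formulation in which the observed touch-point variance is split into a speed-dependent part and an absolute "fat finger" part, and only the former enters the index of difficulty. The function names, the parameter values, and the exact form of the effective width are assumptions for illustration and should be checked against the paper.

```python
import math


def fitts_time(a, b, distance, width):
    """Conventional Fitts' law: T = a + b * log2(D / W + 1)."""
    return a + b * math.log2(distance / width + 1)


def ffitts_time(a, b, distance, sigma, sigma_a):
    """Finger Fitts' law sketch (assumed form).

    sigma:   standard deviation of observed touch points.
    sigma_a: standard deviation of the absolute finger-touch error,
             assumed independent of the speed-accuracy trade-off.
    """
    if sigma <= sigma_a:
        raise ValueError("sigma must exceed sigma_a")
    # Effective width built from the variance left after removing
    # the absolute component (assumption, see lead-in).
    effective_width = math.sqrt(2 * math.pi * math.e * (sigma**2 - sigma_a**2))
    return a + b * math.log2(distance / effective_width + 1)


# Hypothetical parameters: a, b in seconds, distances and sigmas in mm.
print(ffitts_time(a=0.1, b=0.12, distance=40.0, sigma=2.0, sigma_a=1.5))
```

With `sigma_a` set to zero, the sketch reduces to conventional Fitts' law with an effective width of sqrt(2πe) times `sigma`.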
Curves, Lines, and Corners - The CLC model of gesture stroke complexity
Cao, X., & Zhai, S. (2007). Modeling human performance of pen stroke gestures. Proc. ACM CHI Conference on Human Factors in Computing Systems.
Intro circa 2005
As a developing discipline, research results in the field of human-computer interaction (HCI) tend to be "soft". Many workers in the field, such as A. Newell and S.K. Card, have argued that the advancement of HCI lies in "hardening" the field with quantitative and robust models. In practice, few theoretical, quantitative tools are available in user interface research and development. A rare exception to this is Fitts' law. Applying information theory to the human perceptual-motor system, Paul Fitts (1954) found a logarithmic relationship that models speed-accuracy trade-offs in aimed movements. A great number of studies have verified and / or applied Fitts' law to HCI problems, making Fitts' law one of the most studied topics in the HCI literature.
Since 1996 Johnny Accot and I have pursued a research program on “Laws of Action” that attempts to carry the spirit of Fitts' law forward. In the HCI context, Fitts' law can be considered the “Law of Pointing”. We believe there are other robust human performance regularities in action. The two new classes of action relevant to user interface design and evaluation that we have explored are crossing and steering.
Crossing-based interfaces (“Law of Crossing”)
If traditional interfaces driven by pointing are all about “dotting the i’s”, it is also possible to develop interfaces based on “crossing the t’s” as a basic interaction event. In comparison to pointing (clicking, or worse, double clicking), crossing is better suited for pen-based computing devices such as Tablet PCs, PDAs, and smartphones. Without a quantitative model for crossing actions, however, the merits of such an idea are a matter of argument only. The following paper describes what we have done on modeling crossing-based interaction. It systematically answers the question of how crossing compares with pointing:
Accot, J., & Zhai, S., More than dotting the i's - foundations for crossing-based interfaces, in Proc. of CHI'2002: ACM Conference on Human Factors in Computing Systems, Minneapolis, Minnesota, April 2002. pp 73-80.
“Law of Steering”
In this work we first demonstrated that the “goal crossing” task follows a logarithmic model. A thought experiment of placing an infinite number of goals along a movement trajectory led us to hypothesize that there was a simple linear relationship between movement time and the “tunnel” width in steering tasks. We then confirmed such a “steering law” with three types of “tunnels”: rectangle, cone, and spiral, all of which produced a goodness of fit greater than 0.96. We then generalized the steering law in both integral and local forms. The integral form states that the steering time is linearly related to the index of difficulty, which is defined as the integral of the inverse of the width along the path; the local form states that the speed of movement is linearly related to the normal constraint. The following list of publications covers various aspects of our work on the “Law of Steering”. See the overview section of the fourth paper listed here for a review of early precursors, including the work of Rashevsky (1959) and Drury (1971).
The extension and application of the steering law to input device studies was published in a CHI'99 paper.
The scale effect of the steering action was published in a CHI'2001 paper.
For a more recent survey on “laws of action” and the law of steering in locomotion, see Zhai, S., Accot, J., & Woltjer, R. (2004). Human Action Laws in Electronic Virtual Worlds - an empirical study of path steering performance in VR, Presence, 13(2).
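The integral form of the steering law described above can be sketched numerically. This is a minimal illustration, assuming the tunnel is described by a width function W(s) along a path of known length; the function names and the tunnel dimensions are hypothetical.

```python
import math


def steering_index_of_difficulty(width_at, length, n=10000):
    """Approximate the integral of ds / W(s) over the path (midpoint rule).

    width_at: function giving the tunnel width at arc length s.
    length:   total path length, in the same units as the width.
    """
    ds = length / n
    return sum(ds / width_at((i + 0.5) * ds) for i in range(n))


def steering_time(a, b, index_of_difficulty):
    """Integral form of the steering law: T = a + b * ID."""
    return a + b * index_of_difficulty


# Straight (rectangular) tunnel: constant width, so ID reduces to length / width.
straight_id = steering_index_of_difficulty(lambda s: 20.0, length=200.0)

# Cone tunnel: width grows linearly from 10 to 30 along the path,
# so ID has the closed form length / (w2 - w1) * ln(w2 / w1).
cone_id = steering_index_of_difficulty(lambda s: 10.0 + 20.0 * s / 200.0, length=200.0)

print(straight_id, cone_id, 200.0 / 20.0 * math.log(3.0))
```

For the straight tunnel the index of difficulty is simply length divided by width, which is the linear relationship hypothesized from the goal-crossing thought experiment.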
“Law of Pointing”
Although Fitts’ law in its original form is the first law of action, many basic questions about it surprisingly remain subjects of investigation. For example, Fitts’ law only models the relationship between the pointing time and the movement amplitude constraint (i.e. target width). In practice targets are usually two dimensional, hence imposing precision constraints on the pointing movement amplitude as well as the movement direction. The following paper reviews the previous work on this subject and proposes a new model for 2D target pointing.
Accot, J., Zhai, S., Refining Fitts' law models for bivariate pointing, in Proceedings of CHI 2003, ACM Conference on Human Factors in Computing Systems, Fort Lauderdale, Florida, April 5-10, 2003. pp 193-200.
Fitts’ law is frequently used as a tool to characterize or measure the performance of input devices. If one compares device A (e.g. a mouse) with device B (e.g. a trackball) by measuring the average time needed to point at targets, the resulting average time will depend on the settings of the test (target size, distance, etc.). Fitts’ law can convert such measures to Fitts’ law parameters a and b as in T = a + b log2 (D/W + 1). Id = log2 (D/W + 1) is called the index of difficulty, characterizing how “difficult” the target of size W at distance D is to reach. If done properly, a and b will characterize the quality of the input devices beyond the conditions (W’s and D’s) used in the test. Many researchers in the field have further merged the quantities a and b into one metric (throughput). This is illogical, for the simple reason that a and b, which represent a line in the (T, Id) space, can’t be reduced to one constant without limiting to a particular point or range of Id. If we do limit to a particular range, we defeat the generalization purpose of using Fitts’ law. In my view the Fitts’ law parameters a and b should be kept separate, representing the non-information (independent of Id in bits) and information (how much time increases per bit of Id) aspects of input respectively. Although the reasoning here sounds obvious, it took me the following paper to explain this to my colleagues in the field, and some are still not convinced (see Soukoreff and MacKenzie’s article published in the same journal special issue).
Zhai, S. Characterizing computer input with Fitts’ law parameters ― The information and non-information aspects of pointing. International Journal of Human-Computer Studies, 61(6), 791-809. December 2004.
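The argument above can be illustrated with a small sketch: fit a and b by ordinary least squares over several (D, W) conditions, then look at what happens to a single-number ratio Id / T across those conditions. The data are synthetic, generated from hypothetical values a = 0.2 s and b = 0.15 s/bit, and the ratio Id / T is used only as one simple example of a merged metric, not as the definition used by any particular study.

```python
import math


def index_of_difficulty(distance, width):
    """Id = log2(D / W + 1), in bits."""
    return math.log2(distance / width + 1)


def fit_fitts(conditions, times):
    """Least-squares fit of T = a + b * Id over (D, W) conditions."""
    ids = [index_of_difficulty(d, w) for d, w in conditions]
    n = len(ids)
    mean_id = sum(ids) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_id) ** 2 for x in ids)
    sxy = sum((x - mean_id) * (t - mean_t) for x, t in zip(ids, times))
    b = sxy / sxx
    a = mean_t - b * mean_id
    return a, b


# Synthetic conditions giving Id = 1, 2, 3, 4 bits (hypothetical pixel values).
conditions = [(64, 64), (192, 64), (448, 64), (960, 64)]
# Synthetic mean times from hypothetical device parameters a = 0.2, b = 0.15.
times = [0.2 + 0.15 * index_of_difficulty(d, w) for d, w in conditions]

a, b = fit_fitts(conditions, times)
ratios = [index_of_difficulty(d, w) / t for (d, w), t in zip(conditions, times)]
print("a =", round(a, 3), "b =", round(b, 3))
print("Id / T per condition:", [round(r, 2) for r in ratios])
```

The fitted a and b are the same regardless of which conditions are tested, whereas the ratio Id / T changes from condition to condition whenever a is not zero, so a single merged number depends on the range of Id chosen for the test.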