| Chinese is an ideographic language whose minimal unit is the character, which can sometimes correspond to more than one pronunciation and more than one meaning. Because the character set is enormous, Chinese characters cannot be mapped directly to a keyboard for input into a computer system or an electronic device, so Chinese text entry requires an intermediate step. The prevailing solution for entering a Chinese character, such as the one shown in Figure 1, on a full-size keyboard is to use phonetic spelling as that intermediate step. There are two commonly used Chinese phonetic spelling systems, Pin-yin and Zhu-yin. The Pin-yin system uses the Latin alphabet and is widely used in China, while the Zhu-yin system has its own alphabet and is used more in Taiwan and other areas that still use traditional Chinese characters. For this project, we focus on the Zhu-yin phonetic spelling system, which has 37 letters, as shown in Figure 2. |
![]() Figure 1. Chinese character for "fly" -- pronounced "fai" |


| As in English, Chinese text entry on a mobile phone is more complicated than on a computer because the number of letters greatly exceeds the number of keys. With more than one letter per key, a sequence of key presses can be ambiguous. Chinese text entry on mobile phones thus requires an additional step: the user selects the intended Zhu-yin sequence from a list of candidate Zhu-yin sequences before selecting the intended character from the list of homophonic characters. For example, to enter the character into a mobile phone with the keypad shown in Figure 4, the user needs to press 1 and then 8, which leads to 12 valid Zhu-yin sequences, as illustrated in Figure 5. The desired sequence is the 11th in the list of 12 Zhu-yin sequences, so the user has to page through three sub-lists to select it. Previous work [3] found that the time a user takes to react and choose from a list of multiple items is the primary bottleneck in Chinese text entry. |
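The ambiguity described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the two-key keypad, the placeholder ASCII letters standing in for Zhu-yin symbols, and the small set of valid sequences are toy data, not the Sony Ericsson layout or the real Zhu-yin syllable list.

```python
from itertools import product

# Hypothetical toy keypad: each key holds several placeholder letters.
# A real Zhu-yin keypad spreads 37 symbols over 9 to 12 keys.
KEYPAD = {
    "1": "abcd",
    "8": "wxyz",
}

# Hypothetical set of valid (legal) letter sequences.
VALID_SEQUENCES = {"aw", "ax", "bz", "cy", "dw", "dz"}


def candidates(key_presses, keypad, valid):
    """Return every valid letter sequence that the key presses could mean."""
    letter_groups = [keypad[key] for key in key_presses]
    spelled = ("".join(letters) for letters in product(*letter_groups))
    return [seq for seq in spelled if seq in valid]


matches = candidates("18", KEYPAD, VALID_SEQUENCES)
print(len(matches), matches)
```

In this toy example, pressing 1 and then 8 could spell 16 letter pairs, of which 6 are valid, so the user would still have to pick one of 6 candidates before choosing a character.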
![]() Figure 4. Keypad layout by Sony Ericsson |

| Evaluating a Keypad Layout
We developed a "fitness function" to model the relative cost in time of selecting a target Zhu-yin sequence (assuming a reasonable, generic interface like the one in Figure 5). Each sequence selection costs 10 time units; each page selection costs 15 time units. For example, selecting the target sequence from the example above with the keypad layout in Figure 4 costs 40 time units (30 for paging down twice and 10 for selecting the target sequence). The set of valid Zhu-yin sequences is finite; therefore, we can compute the overall cost of a particular keypad layout by summing the individual costs weighted by sequence frequency. The higher the overall cost, the worse the keypad layout. |
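The cost model above can be sketched in Python as follows. The 10 and 15 time-unit costs come from the text; the page size of 4 entries per sub-list and the ordering of candidates by descending frequency are assumptions made only for illustration, and the names `selection_cost` and `layout_cost` are hypothetical.

```python
from collections import defaultdict

SELECT_COST = 10  # from the text: cost of selecting a sequence
PAGE_COST = 15    # from the text: cost of each page selection
PAGE_SIZE = 4     # assumption: number of entries shown per sub-list


def selection_cost(position, page_size=PAGE_SIZE):
    """Cost of picking the entry at a 1-based position in the candidate list."""
    pages_down = (position - 1) // page_size
    return PAGE_COST * pages_down + SELECT_COST


def layout_cost(layout, frequencies):
    """Frequency-weighted total cost of a layout.

    layout: dict mapping each letter to the key it sits on.
    frequencies: dict mapping each valid sequence to its frequency.
    """
    # Group sequences that are typed with the same key presses.
    groups = defaultdict(list)
    for seq in frequencies:
        keys = tuple(layout[letter] for letter in seq)
        groups[keys].append(seq)

    total = 0
    for seqs in groups.values():
        # Assumption: candidates are listed most frequent first.
        seqs.sort(key=lambda s: (-frequencies[s], s))
        for position, seq in enumerate(seqs, start=1):
            total += frequencies[seq] * selection_cost(position)
    return total
```

Under these assumptions, `selection_cost(11)` gives 40, matching the worked example: two page selections (30) plus one sequence selection (10).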
![]() Figure 6. The fitness function: the evaluation metric for modeling Zhu-yin selection time. |
| Manufacturer | Keys Used | Score |
| Panasonic | 11 | 2136900 |
| Okwap | 11 | 2266000 |
| Motorola | 9 | 2769955 |
| Sony Ericsson | 10 | 10158755 |
| Commercial layouts all follow Zhu-yin alphabetical order in some way. These alphabetical orderings exacerbate the disambiguation problem by enforcing poor layout choices, whereas non-alphabetical ordering offers substantially better disambiguation. Furthermore, studies have shown that, even in terms of novice users' learning time, “alphabetically organized keyboards are slightly superior to a randomly organized one, but that this difference is too slight to be of any practical significance.” [6] What happens if we modify the layout in Figure 4, as shown in Figure 7? To enter the same character into a mobile phone with the keypad in Figure 7, the user presses 4 and 8, which leads to only 2 valid Zhu-yin sequences and reduces the cost. Based on this idea, our goal is to generate a keypad layout that minimizes the number of Zhu-yin sequences matching each key press sequence. However, there are 12 to the power of 37 (about 8.5 × 10^39) ways to place 37 letters on 12 keys, so exhaustive search is not feasible. Therefore, we use a best-improvement search algorithm to find keypads whose scores are locally optimal, running many randomized trials to find high-quality keypad layouts. |
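Best-improvement search with randomized restarts can be sketched in Python as below. This is an illustration under stated assumptions, not the project's implementation: the neighborhood (moving one letter to a different key), the simplified stand-in `cost` function, and all function names are hypothetical, and the source does not specify how its own search defines moves.

```python
import random
from collections import defaultdict


def cost(layout, frequencies):
    """Stand-in cost (assumption): frequency-weighted rank within each
    group of sequences that share the same key presses."""
    groups = defaultdict(list)
    for seq, freq in frequencies.items():
        groups[tuple(layout[ch] for ch in seq)].append(freq)
    total = 0
    for freqs in groups.values():
        freqs.sort(reverse=True)
        total += sum(rank * f for rank, f in enumerate(freqs))
    return total


def best_improvement(layout, keys, frequencies):
    """Repeatedly apply the single-letter move that lowers the cost most."""
    layout = dict(layout)
    current = cost(layout, frequencies)
    while True:
        best_move, best_cost = None, current
        for letter, old_key in list(layout.items()):
            for key in keys:
                if key == old_key:
                    continue
                layout[letter] = key  # try the move
                trial = cost(layout, frequencies)
                if trial < best_cost:
                    best_move, best_cost = (letter, key), trial
            layout[letter] = old_key  # undo before trying the next letter
        if best_move is None:
            return layout, current  # no move helps: local optimum
        layout[best_move[0]] = best_move[1]
        current = best_cost


def random_restarts(letters, keys, frequencies, trials=20, seed=0):
    """Run the local search from many random layouts and keep the best."""
    rng = random.Random(seed)
    best_layout, best_cost = None, float("inf")
    for _ in range(trials):
        start = {letter: rng.choice(keys) for letter in letters}
        layout, value = best_improvement(start, keys, frequencies)
        if value < best_cost:
            best_layout, best_cost = layout, value
    return best_layout, best_cost
```

Each local search stops at a layout that no single move can improve, which is only a local optimum; the restarts are what give the search a chance of reaching a high-quality layout.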
![]() Figure 7. A modified version of the keypad layout in Figure 4. |
| Number of Keys | 9 | 10 | 11 | 12 |
| Score | 1521460 | 1085885 | 748870 | 469320 |
| The results are much better than the commercial layouts. For a fair comparison, the generated layout that uses 11 keys scores 748,870, whereas the best commercial layout (Panasonic), which also uses 11 keys, scores 2,136,900, so the generated layout's cost is about 65% lower.
User Study
Based on the keypad layouts and user study recommendations from this study, Chan et al. [9] conducted a user study with subjects who are familiar with Zhu-yin but are novice Zhu-yin cell phone users. The results indicate that subjects quickly become comfortable with non-alphabetic layouts but, at least initially, do not enjoy a significant speedup in text entry. However, analysis of keypress timing indicates the potential for up to a 20% speedup with careful layout, mostly by ensuring that the keys pressed for frequently used Zhu-yin sequences cannot indicate any other legal sequence. Future work should exploit these results to improve the algorithm's cost function and study the resulting new layouts. |
![]() Figure 8. The top-scoring 11-key layout generated by the algorithm. |