Introduction
Text in its written form is less expressive than the spoken word or film, as it loses emotional content and paralinguistic cues such as pitch, tone of voice and even body language. This is especially true in electronic communication like email and instant messaging, as these typically use plain, written text. It is in this situation that Kinetic Typography can help.
Kinetic Typography is text that changes or moves over time, and is a form of animated text. It serves as a bridge to bring written text closer to the realm of film, with its associated expressiveness. In addition, Kinetic Typography can also help to bring emotive content back into written text. Figure 1 below shows a demonstration of Kinetic Typography.
Problem and Motivation
Getting the message across
Computer mediated human-to-human communication has helped to make the world seem smaller, by allowing instantaneous communication with anyone, anywhere in the world. However, these communication media, such as email or instant messaging, tend to use plain text, which is lacking in emotive content. While systems like video chat or Voice-over-Internet Protocol (VoIP) allow more expressive ways of communicating over distances, email and instant messaging are still the most popular methods.
Several methods have attempted to bring emotive content back to written text. The most prevalent of these are smilies, which were at first devised as a way to separate humorous emails from serious discussion, but are now used to express various emotions, or just humor in general. Smilies lend themselves well to being used in email and instant messaging as they are inherently ASCII based and can be expressed easily using the keyboard. However, smilies do not accurately set the tone of a sentence or block of text, as it is hard to use a smiley to express something like sarcasm, or even varying degrees of emotion. Another approach to bringing emotive content to written text involved displaying descriptive pictures that changed according to a user-defined emotion [8].
Exploration and Motivation
Since instant messaging is a popular method of computer mediated human-to-human communication, it seems logical to use it as a platform to explore emotive expression. Using existing technology and infrastructure would also allow us to discover the strengths and limitations of currently available technology in supporting emotive expression. From these findings, a new and improved infrastructure for generating expressive animations using Kinetic Typography could be developed, and using this new framework, the capabilities of Kinetic Typography could be made available to a wider audience, thus paving the way for more emotive personal communication.
Background and Related Work
Kinetic Typography has been available for some time, having made its first appearance in Alfred Hitchcock’s film ‘North by Northwest’, and can now commonly be found in the opening credits of many films. In addition, the general animation utilities that are used to generate these pieces are now commonplace, but are geared towards visual designers and have a steep learning curve. By having specialized tools for text animation in particular, it is possible to make the capabilities of emotive text more accessible to more people.
Specialized Kinetic Typography animation tools do exist, and an example of this is the Kinetic Typography Engine [5]. The engine allowed the generation of affective animations within Java based applications, and could be embedded into different types of applications. In particular, the engine allowed the development of several Kinetic Typography enabled instant messaging clients [1, 6], including KIM [9], a Kinetic Typography enabled instant messaging client with a simple emotion parser that allowed the automatic generation of affective animations based on the content of an instant message. KIM was a primary subject of this research and will be described in more detail in subsequent sections.
Approach and Findings
Outline and Motivation
Instant messaging makes a good platform for exploring emotive expression in computer mediated human-to-human communication, as many people worldwide use it to communicate over long distances. Instant messaging is a form of semi-synchronous communication, which is structured almost like a face-to-face conversation, with the various parties in the conversation speaking and responding to each other in some sort of order.
Since instant messaging is analogous to real world conversations, it should carry about the same amount of emotional information as in real life. However, this tends not to be the case as instant messages are usually expressed in plain text, which is less expressive than spoken words. Thus, allowing these widely used communication channels to be more expressive seems to be an interesting problem to tackle.
We approached the problem in 3 phases. In the first phase, we constructed a Kinetic Typography enabled instant messaging client called KIM. KIM was used to explore emotive expression in computer mediated communication, and also to understand the strengths and limitations of existing technology and infrastructure.
In the second phase, using results and experience gained from phase one, we built a new and improved framework with which to generate Kinetic Typography animations. Finally, in phase 3, we aimed to make Kinetic Typography and affective animations more accessible by providing an interface which would allow novice and expert users alike to generate Kinetic Typography animations.
KIM: The ‘Emotional’ Instant Messenger
KIM (for Kinetic Instant Messenger) was developed as a platform to explore emotive expression in computer mediated communication. It was built upon the original Kinetic Typography Engine [5] and helped us to understand the strengths and limitations of that engine.
While KIM was not the first Kinetic Typography enabled Instant Message (KTIM) client to be developed, much of the prior work in the area resulted in clients that required users to manually select the effects and animations that they wanted. This added to the user’s cognitive load by forcing a task switch, and also slowed the user down, potentially leading to a negative experience.
KIM was thus designed to automatically handle the selection of animation effects, in a bid to both provide emotive expression in an instant messaging client, as well as improve the user experience. In order to do this, there needed to be a lightweight mechanism with which to apply animation effects. This led to the development of an ‘emotion parser’, which was a system that would analyze the content of an instant message and determine what emotion was meant to be conveyed. However, judging emotions from text is a hard problem, and thus KIM used a simplified, pattern matching approach.
The parser first looked at each word in the message and determined its emotion type by referring to a lookup table. Six emotions were chosen to be expressed: happiness, sadness, anger, fear, surprise and disappointment. Sadness and disappointment are frequently construed as being the same; however, KIM had slightly different animations for them. These emotions comprised the core emotions that are recognized and generated by people worldwide [3], and thus should be universally understood.
When a word in the message matched the sample words for a particular emotion, that word was marked as having that emotion. Words that did not match any category were labeled as neutral. In addition to the six core emotions, three other ‘emotions’ were implemented: sarcasm, ‘whininess’ and smilies. These additional emotions were triggered by prepending a special tag to a word or, in the case of smilies, by typing a common smiley.
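The word-by-word lookup described above can be illustrated with a minimal sketch. It is not KIM's actual parser: the sample words, the tag prefixes (`/s:` and `/w:`) and the smiley list are all assumptions made for illustration, since the text does not give the real lookup table or tag syntax.

```python
# Hypothetical sample words; the lookup table actually used by KIM is not given.
EMOTION_WORDS = {
    "happiness": {"happy", "glad", "great"},
    "sadness": {"sad", "unhappy"},
    "anger": {"angry", "furious"},
    "fear": {"afraid", "scared"},
    "surprise": {"wow", "surprised"},
    "disappointment": {"disappointed", "letdown"},
}

# Hypothetical tag prefixes for the two tag-triggered 'emotions'.
TAG_PREFIXES = {"/s:": "sarcasm", "/w:": "whininess"}

# A few common smilies (assumed set).
SMILIES = {":)", ":-)", ":(", ":-(", ";)"}


def label_message(message):
    """Return a list of (token, label) pairs for a plain-text message."""
    labels = []
    for token in message.split():
        # Smilies are matched on the raw token, before punctuation is stripped.
        if token in SMILIES:
            labels.append((token, "smiley"))
            continue
        # A tag prefix overrides the lookup table and is removed from the word.
        prefix = next((p for p in TAG_PREFIXES if token.startswith(p)), None)
        if prefix is not None:
            labels.append((token[len(prefix):], TAG_PREFIXES[prefix]))
            continue
        # Otherwise, look the normalized word up in the table.
        word = token.strip(".,!?;:").lower()
        emotion = next(
            (name for name, words in EMOTION_WORDS.items() if word in words),
            "neutral",
        )
        labels.append((token, emotion))
    return labels
```

For instance, `label_message("I am happy :)")` labels "happy" as happiness, ":)" as a smiley, and the remaining words as neutral. Because the match is purely lexical, a word is labeled the same way regardless of the sense in which it is used.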
In order to animate the text, KIM utilized two animation canvases, one for incoming messages, and one for outgoing messages. This allowed the user to distinguish between the parties in a conversation. Figure 2 shows KIM’s message interface. Additionally, the messages were animated using a queue, to maintain consistency with the semi-synchronous nature of real life communication.
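As a rough sketch of the queueing behavior described above, the following code plays messages strictly in arrival order and routes each one to the incoming or outgoing canvas. The class and method names are hypothetical, and the `played` list stands in for the actual rendering, which is not modeled here.

```python
from collections import deque


class AnimationQueue:
    """Plays queued messages one at a time, in the order they arrived."""

    CANVASES = ("incoming", "outgoing")

    def __init__(self):
        self._pending = deque()
        self.played = []  # (canvas, text) pairs, in play order

    def enqueue(self, canvas, text):
        """Add a message destined for one of the two canvases."""
        if canvas not in self.CANVASES:
            raise ValueError(f"unknown canvas: {canvas!r}")
        self._pending.append((canvas, text))

    def play_next(self):
        """Play the oldest pending message; return None if the queue is empty."""
        if not self._pending:
            return None
        item = self._pending.popleft()
        self.played.append(item)  # a real client would animate the text here
        return item
```

A first-in, first-out queue keeps replies in conversational order even when a message arrives while an earlier animation is still playing.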
KIM was an initial exploratory project to find potential uses for Kinetic Typography, as well as to understand emotive expression in computer mediated human-to-human communication. A further purpose of the study was to determine the strengths and limitations of the existing Kinetic Typography Engine. We ran an informal, qualitative study that aimed to investigate what people thought about embedding Kinetic Typography in an instant messaging client. Many users found that having the Kinetic Typography animations made the instant message conversations more engaging, and many felt that there was more emotive content as compared with standard instant messages.
However, many of these same users also felt that the animations demanded too much attention: while they were interesting and engaging, they became tiring to watch after some time. Kinetic Typography animations also typically take longer to display than regular messages. Users also felt that the pattern matching approach that KIM employed was too simplistic, as many words that KIM deemed “emotional” could actually have different meanings.
On the implementation side, we found that the existing Kinetic Typography Engine was full featured, and allowed the creation of complex animations. It was also fairly easy to write applications that used the engine to generate affective animations. However, it did take a significant amount of time to learn the nuances of the engine, and to get it working effectively with the instant message client. Creating custom effects or generating animations also required a substantial amount of programming knowledge, and would be out of reach for most novices. Thus, with the findings from this phase of the research, we built a new and improved infrastructure for generating Kinetic Typography.
KTE2: 2nd Generation Kinetic Typography Engine
A new version of the Kinetic Typography Engine was written, with the aim of being powerful enough to generate complex animations while also being accessible to more users. Additionally, the engine should be able to generate animations that would be deployable on a wide variety of platforms. KTE2 was redesigned with a new, constraint-based architecture, and written in ActionScript 3.0. The move to ActionScript provided two main benefits. First, the engine could harness the robust graphics routines of the Adobe Flash runtime engine, without having to manage threads and graphics system calls in Java. Second, the engine gained increased deployability on the web, desktops and theoretically on mobile devices, since the Flash runtime ran on these platforms. The use of a constraint system also made animation effects, especially the more complex ones, easier to write.
The new engine also supports an extensible animation library, which provides a set of core effects, as well as an avenue for “effect authors” to write custom effects and include them for use with the engine. The core effects are based on traditional animation techniques [4] developed at Disney, and adapted for use in computer animation systems. These techniques produce animations that seem more engaging and realistic. The library also includes effects specific to Kinetic Typography, and allows the creation of characters, direction of attention, and conveyance of emotion.
The use of constraints for animation has a long history [2, 7]. Animations are naturally expressed as constraints relating attributes such as size, position, and orientation to the passage of time. For example, to move an object across the screen, a constraint on the x attribute of the object can be made with respect to some factor of the current time. Thus, as time increases, the value of the x attribute increases, and the object moves. More complicated animations can be created by applying constraints to multiple attributes of the same object. Figure xx shows how a circular path can be obtained by applying sine and cosine constraints to the x and y attributes. Complex animations are also easy to generate, by first animating a ‘lead’ object, and then constraining ‘follower’ objects to the ‘lead’ object.
After the new engine was written, it was validated by using it to recreate a corpus of existing Kinetic Typography animations that were done by visual designers. Many of these animations were complex, and by replicating them using KTE2, we sought to show that the engine could generate animations similar to those produced by experienced visual designers. Figure 3 shows several examples from the corpus that were reproduced using KTE2.
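The idea of expressing animation as constraints on attributes can be sketched in a few lines. This is an illustration of the general technique only, written in Python rather than ActionScript, and none of the names below (`AnimatedObject`, `constrain`, `circular_path`, `follow`) come from KTE2; they are assumptions made for the example.

```python
import math


class AnimatedObject:
    """An object whose attributes are defined by constraints on time."""

    def __init__(self):
        self._constraints = {}  # attribute name -> function of time t

    def constrain(self, attribute, fn):
        """Bind an attribute to a function of the current time."""
        self._constraints[attribute] = fn

    def value(self, attribute, t):
        """Evaluate the attribute's constraint at time t."""
        return self._constraints[attribute](t)


def circular_path(obj, cx, cy, radius, period):
    """Constrain x to a cosine and y to a sine of time, tracing a circle."""
    omega = 2 * math.pi / period
    obj.constrain("x", lambda t: cx + radius * math.cos(omega * t))
    obj.constrain("y", lambda t: cy + radius * math.sin(omega * t))


def follow(follower, lead, dx, dy):
    """Constrain a follower to stay at a fixed offset from a lead object."""
    follower.constrain("x", lambda t: lead.value("x", t) + dx)
    follower.constrain("y", lambda t: lead.value("y", t) + dy)
```

A straight-line move is the simplest case, for example `obj.constrain("x", lambda t: 50 * t)`. Once a lead object has been given a circular path, `follow` makes any number of followers trace the same path at an offset without restating the motion.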
However, using the engine by itself still required substantial programming knowledge, which limited the usefulness of the engine to developers. In order to make Kinetic Typography accessible to a wider audience, a simpler method of generating animations was needed.
Kinetic Typography Markup Language (KTML)
To make the generation of Kinetic Typography animations more accessible, we are currently working on developing a markup language, called the Kinetic Typography Markup Language (KTML). KTML will be a lot like HTML, which marks up text by indicating certain display styles for particular sections of text. Like HTML, KTML will allow novice users to mark up passages of text with certain effects, or emotions. Figure 4 below shows an example of what KTML may look like.
<ktml>
  <character id="clerk">
    <angry><emphasis>What do you want?</emphasis></angry>
  </character>
  <character id="customer">
    I'm looking for the...
  </character>
  :
  :
</ktml>
The process of taking a piece of text and marking it up such that specific sections have specific animations attached to them would be easier for a novice to grasp, and would enable even people without programming experience to generate animation pieces. KTML will provide several core animation effect tags for users to mark up text with, and will also support the use of custom effects by specifying parameters and custom animation libraries. These custom animation libraries will be created by “effect authors” and integrated into KTE2’s animation library.
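To give a sense of how such markup might be consumed, here is a minimal sketch that reads a KTML-like fragment and lists each run of text with the character speaking it and the effect tags that enclose it. Since KTML is still under development, this is not its actual processing model: the fragment is a shortened, well-formed variant of the example in Figure 4, the tag names are taken from that figure, and the functions `collect_spans` and `parse_ktml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed variant of the Figure 4 example (assumed syntax).
SAMPLE = """<ktml>
  <character id="clerk">
    <angry><emphasis>What do you want?</emphasis></angry>
  </character>
  <character id="customer">I'm looking for the...</character>
</ktml>"""


def collect_spans(element, effects=()):
    """Return (effects, text) pairs for all text nested under element."""
    spans = []
    if element.text and element.text.strip():
        spans.append((effects, element.text.strip()))
    for child in element:
        # Text inside a child tag carries that tag as an additional effect.
        spans.extend(collect_spans(child, effects + (child.tag,)))
        # Text after the child's closing tag belongs to the enclosing effects.
        if child.tail and child.tail.strip():
            spans.append((effects, child.tail.strip()))
    return spans


def parse_ktml(source):
    """Return (character id, effects, text) triples in document order."""
    root = ET.fromstring(source)
    result = []
    for character in root.iter("character"):
        for effects, text in collect_spans(character):
            result.append((character.get("id"), effects, text))
    return result
```

Calling `parse_ktml(SAMPLE)` yields one triple per run of text, with nested tags accumulated in the order they enclose it. An animation engine could then map each effect name to an entry in its effect library.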
KTML is still a work in progress, and we will be interviewing visual designers to determine what types of animation effects they use to convey certain emotions or feelings. This vocabulary will then be used to generate KTML’s core set of animation tags, and enable a wide range of users to create their own animation pieces.
Conclusion
By first building an application to explore emotive expression in computer mediated human-to-human communication, we found that Kinetic Typography does make written text more engaging, and brings back some emotional content. However, the generation of these animations is usually out of reach of most people, and confined to visual designers, or programmers using specialized text animation engines. Thus, by providing a simple interface for generating affective animations, KTML aims to make Kinetic Typography more accessible to users.
Acknowledgements
The author would like to thank Prof. Scott Hudson (CMU HCII) for his guidance and support.
References
[1] Bodine, K. and Pignol, M. 2003. Kinetic typography-based instant messaging. In CHI '03 Extended Abstracts on Human Factors in Computing Systems (Ft. Lauderdale, Florida, USA, April 05 - 10, 2003). CHI '03. ACM, New York, NY, 914-915.
[2] Duisberg, R. A. 1987. Animation using temporal constraints: an overview of the animus system. Hum.-Comput. Interact. 3, 3 (Sep. 1987), 275-307.
[3] Ekman, P., Friesen, W. V., and Ellsworth, P. 1982. What emotion categories or dimensions can observers judge from facial behavior? In Emotion in the Human Face, P. Ekman, Ed. Cambridge University Press, New York, NY, 39-55.
[4] Lasseter, J. 1987. Principles of traditional animation applied to 3D computer animation. In Proceedings of the 14th Annual Conference on Computer Graphics and interactive Techniques M. C. Stone, Ed. SIGGRAPH '87. ACM, New York, NY, 35-44.
[5] Lee, J. C., Forlizzi, J., and Hudson, S. E. 2002. The kinetic typography engine: an extensible system for animating expressive text. In Proceedings of the 15th Annual ACM Symposium on User interface Software and Technology (Paris, France, October 27 - 30, 2002). UIST '02. ACM, New York, NY, 81-90.
[6] Lee, J., Jun, S., Forlizzi, J., and Hudson, S. E. 2006. Using kinetic typography to convey emotion in text-based interpersonal communication. In Proceedings of the 6th Conference on Designing interactive Systems (University Park, PA, USA, June 26 - 28, 2006). DIS '06. ACM, New York, NY, 41-49.
[7] Myers, B. A., Miller, R. C., McDaniel, R., and Ferrency, A. 1996. Easily adding animations to interfaces using constraints. In Proceedings of the 9th Annual ACM Symposium on User interface Software and Technology (Seattle, Washington, United States, November 06 - 08, 1996). UIST '96. ACM, New York, NY, 119-128.
[8] Sánchez, J. A., Hernández, N. P., Penagos, J. C., and Ostróvskaya, Y. 2006. Conveying mood and emotion in instant messaging by using a two-dimensional model for affective states. In Proceedings of VII Brazilian Symposium on Human Factors in Computing Systems (Natal, RN, Brazil, November 19 - 22, 2006). IHC '06, vol. 323. ACM, New York, NY, 66-72.
[9] Yeo, Z. 2008. Emotional instant messaging with KIM. In CHI '08 Extended Abstracts on Human Factors in Computing Systems (Florence, Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 3729-3734.