RESEARCH

Embodied Conversational Agents
Children and Technology: Story-Listening Systems
Technology for Empowerment and Voice


Embodied Conversational Agents
What is an Embodied Conversational Agent? It is a life-size virtual human capable of carrying on conversations with humans by both understanding and producing speech, hand gestures, and facial expressions. Embodied Conversational Agents are a type of multimodal interface where the modalities are the natural modalities of human conversation: speech, facial displays, hand gestures, and body stance. They are a type of software agent insofar as they exist to do the bidding of their human users, or to represent their human users in a computational environment. They are a type of dialogue system where both verbal and non-verbal devices advance and regulate the dialogue between the user and the computer. In the Embodied Conversational Agent, the visual dimension of interacting with a cartoon character on a screen (rather than a keyboard) is intrinsic to its function. The graphics are not just pretty pictures, but visual displays of conversation, in the same way that the face and hands serve that function in face-to-face conversation among humans.

After having spent ten years studying verbal and non-verbal aspects of human communication through microanalysis of videotaped data (starting as a graduate student), I began to bring my knowledge of human conversation to the design of computational systems. I built the very first embodied conversational agent as NSF visiting faculty at the University of Pennsylvania, in the Center for Human Modeling and Simulation, working with their faculty and graduate students. Previously, professional animators manually synthesized conversational behaviors for animated figures based on their intuitions, and they "hard-wired" facial expressions and gestures. Although the intuitions of such animation artists are excellent, and hard-wiring is a satisfactory approach to regular animation, their approach cannot be extended to the generation of these behaviors in systems running independently of a human designer. My work introduced the first rule-governed, autonomous generation of verbal and non-verbal conversational behaviors in animated characters. In addition, previous conversational interfaces or dialogue systems concentrated on the content of the conversation -- the statements and questions that advance the discourse. My work introduced for the first time a conversational agent capable of generating and understanding both those propositional components and synchronized interactional components such as back-channel speech, gestures, and facial expressions. These interactional components are crucial to the construction of what I have called the 'conversational envelope'.

In the newest Embodied Conversational Agent project, NUMACK, we are working with Matthew Stone and Barbara Tversky to study how people give directions using language, maps, and hand gestures, in order to uncover their underlying cognitive representations, and then to use the results to implement virtual humans whose speech, gesture and map drawing are generated autonomously from such underlying representations. This work addresses the challenges of specifying an underlying representation of discourse that is capable of driving generation of several modalities. For publications about the Embodied Conversational Agents, see papers.


Children and Technology: Story-Listening Systems
The discussion of the role of computational technology in children’s development has become increasingly polarized over the last year or so. On the one hand we find a frantic push to place computers and internet access into all U.S. schools, and on the other hand, a frantic push-back to place a “moratorium” on children’s access to computers. Clearly the answer lies at neither end of this long spectrum, and a careful review of existing studies shows a number of benefits, a handful of harmful effects, and a plethora of unknowns. Based on my earlier (read: Developmental Psychology days) investigations into children's developing competence in narrative structures, and the important role that competence plays in their cognitive and social development, my response is that the answer lies in responsible and developmentally informed design and evaluation of technology that specifically targets the needs and unique abilities of young children. “Computer technology” need not be incompatible with play-based learning, physical activity, active engagement, and social interaction, those features of childhood whose loss computer-phobes decry.

With this goal, my students and I build Story Listening Systems that listen and respond appropriately to children's stories. What sets this work apart from previous Eliza-like systems that respond to users, or current CD-ROMs that tell stories to children, is the fact that our systems encourage children’s active exploration of narrative, linguistic creativity, and verbal play. In this sense, the work fits into the long tradition of constructionist research at the Media Lab. Our contribution is to extend the notion of the child as technology designer to systems that explore story, self-concept, and linguistic creativity. In addition, the majority of our research is embedded into electronic toys, and not desktop computers, supporting children's full-bodied, collaborative, social play-based learning.

Our most recent work in Story Listening Systems focuses on a virtual playmate for children who is able to attend to children’s stories and tell relevant stories in return. In this project, called Sam the Castlemate, children can even pass figurines back and forth between the real and the virtual world.

In recent evaluations of Sam the Castlemate, we have demonstrated that children are able to improve emergent literacy skills -- their first steps into reading and writing -- by interacting with Sam, and even to improve their scores on the Test of Early Language Development. In a current NSF-funded project, along with Susan Goldman, we are continuing to place Sam in actual kindergarten and first-grade classrooms to observe children's interactions with the virtual peer and how these interactions may affect children's early literacy.

Our storytelling systems have been used by children around the world. Renga is a permanent exhibit in the science museum of Singapore, and many of the other systems have been used by schools around the world and in several industry research labs. WISE has taken on a new function, teaching industry executives about the power and uses of storytelling. Publications about our storytelling systems work may be found here.


Technology for Empowerment and Voice
Almost paradoxically, technologies that allow people to communicate across great distances have allowed social scientists to make advances in understanding the construction and maintenance of community. In particular, information and communication technologies (ICTs) have provided a miraculous window into the processes of community formation when the community members are vastly different from one another along the axes of age, culture, economic benefits, language, and other dimensions that would hinder if not prohibit communication in the physical world. But to what extent do online groups really demonstrate the hallmarks of community: increasing identification with group goals, patterns of assimilation to the other community members, growing enjoyment of joint tasks – not just work but also play? And when a group of people from many different countries come together online at the same time to create their own new community, does one culture dominate or are the collective voices of different world regions distinguishable? Does the voice of the nation that designed the forum influence the nature of the communication among the participants? How do these variables change over time as the members of the community come to know one another?

 
