Demo: Situation-aware conversational agent with kinetic earables

Shin Katayama, Akhil Mathur, Tadashi Okoshi, Jin Nakazawa, Fahim Kawsar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Conversational agents are increasingly becoming digital partners in our everyday computing experiences, offering a variety of purposeful information and utility services. Although rich in competence, these agents are today entirely oblivious to their users' situational and emotional context and incapable of adjusting their interaction style and tone accordingly. To this end, we present a first-of-its-kind situation-aware conversational agent on a kinetic earable that dynamically adjusts its conversation style, tone, and volume in response to the user's emotional, environmental, social, and activity context, gathered through speech prosody, ambient sound, and motion signatures. In particular, the system is composed of the following components:

• Perception Builder: This component builds an approximate view of the user's momentary experience by sensing their 1) physical activity, 2) emotional state, 3) social context, and 4) environmental context using different purpose-built acoustic and motion sensory models [4, 5].

• Conversation Builder: This component enables a user to interact with the agent using a predefined dialogue base; for this demo, we have used Dialogflow [1] populated with a set of situation-specific dialogues.

• Affect Adapter: This component guides the adaptation strategy for the agent's response corresponding to the user's context, taking into account the output of the Perception Builder and a data-driven rule engine. We have devised a set of adaptation rules, derived from multiple quantitative and qualitative studies, that describe the prosody, volume, and speed used to shape the agent's response.

• Text-to-Speech Builder: This component synthesises the agent's response in a voice that accurately reflects the user's situation using the IBM Bluemix Voice service [2]. This synthesis process interplays various voice attributes, e.g., pitch, rate, breathiness, and glottal tension, to transform the agent's voice according to the rules of the Affect Adapter.
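The abstract does not include an implementation, but the interplay between the Affect Adapter and the Text-to-Speech Builder can be illustrated with a minimal sketch. The names and rule values below (Perception, ADAPTATION_RULES, the specific pitch/rate/volume settings) are hypothetical stand-ins, not the authors' data-driven rule set; SSML prosody markup is shown as one standard way to pass such attributes to a cloud TTS service like IBM's.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Perception:
    """Approximate user context, as produced by the Perception Builder."""
    activity: str      # e.g. "walking", "sitting", "running"
    emotion: str       # e.g. "calm", "stressed", "happy"
    social: str        # e.g. "alone", "in_conversation"
    environment: str   # e.g. "quiet", "noisy_street", "office"


# Illustrative adaptation rules: each rule maps a perceived context to the
# prosody attributes (pitch, rate, volume) that shape the agent's response.
# These values are placeholders, not the rules derived in the paper's studies.
ADAPTATION_RULES = [
    (lambda p: p.emotion == "stressed",         {"pitch": "-10%", "rate": "slow",   "volume": "soft"}),
    (lambda p: p.environment == "noisy_street", {"pitch": "+0%",  "rate": "medium", "volume": "loud"}),
    (lambda p: p.activity == "running",         {"pitch": "+5%",  "rate": "fast",   "volume": "loud"}),
]
DEFAULT_STYLE = {"pitch": "+0%", "rate": "medium", "volume": "medium"}


def adapt(perception: Perception) -> dict:
    """Affect Adapter: return the first matching style, else the default."""
    for predicate, style in ADAPTATION_RULES:
        if predicate(perception):
            return style
    return DEFAULT_STYLE


def to_ssml(response_text: str, style: dict) -> str:
    """Text-to-Speech Builder: wrap the agent's reply in SSML prosody markup
    that a cloud TTS service can synthesise into adapted speech."""
    return (
        f'<speak><prosody pitch="{style["pitch"]}" rate="{style["rate"]}" '
        f'volume="{style["volume"]}">{escape(response_text)}</prosody></speak>'
    )


if __name__ == "__main__":
    context = Perception(activity="walking", emotion="stressed",
                         social="alone", environment="quiet")
    reply = "Your next meeting starts in ten minutes."  # from the Conversation Builder
    print(to_ssml(reply, adapt(context)))
```

A flat rule table of this kind keeps the adaptation strategy transparent and easy to revise as new quantitative or qualitative findings come in; the actual system may of course order or combine rules differently.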

Original language: English
Title of host publication: MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
Publisher: Association for Computing Machinery, Inc
Pages: 657-658
Number of pages: 2
ISBN (Electronic): 9781450366618
DOIs
Publication status: Published - 2019 Jun 12
Event: 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019 - Seoul, Korea, Republic of
Duration: 2019 Jun 17 - 2019 Jun 21

Publication series

Name: MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

Conference

Conference: 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019
Country/Territory: Korea, Republic of
City: Seoul
Period: 19/6/17 - 19/6/21

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Networks and Communications
