CHI'22 Paper Award

Creating design principles for conversational agents



Can you believe this is the longest research project I've ever done? In this award-winning paper, I explored how to offer the best guidance for chatbot users, including when and what type of guidance to provide.
I crafted a set of design principles that will guide future designers in building chatbots that improve users' task performance, progress, and overall experience.




Chatbot interactions remain
inefficient because the right timing
and type of guidance
remain unknown.


I led pioneering research on conversational user interfaces, co-authored a paper published in a top human-computer interaction conference, and received a best paper award (top 5%).




Lead UX Researcher


Between-subjects experiment, Literature review, Interviews,
Reflection study, Survey


11 months (Sep 2021 - Aug 2022)


2 Researchers (including me), 3 Coding assistants, 1 Project Lead


Within just a year, this paper has been downloaded 2000+ times and cited 10+ times, benefiting many researchers, designers, and HCI & UX practitioners.

Read Paper



Discovering Research Gap


Guidance timing matters.
Content also matters.

Initially, I was intrigued by the challenge of helping users recover from conversational failure. After reading over 100 research papers, I narrowed our focus to designing better guidance and discovered that prior research lacks consensus on the ideal timing and type of guidance.

Research Questions



Which guidance type and timing enhance users' task efficiency, conversational progress, and performance in subsequent chatbot interactions?


What are users' subjective experiences of each combination?


What are users' desired characteristics for the chatbot's guidance type and timing combination?

Finding The Right Methods


Why mixed-methods?

We formulated three research questions and identified the necessary data to answer them. To ensure comprehensive results, I chose a mixed-methods approach, which included a lab experiment and reflection study (interviews).

I chose a lab experiment to gather firsthand data in a face-to-face setting, which helped us answer RQ 1, and interviews to gain deeper insight into people's experiences, enabling us to address RQs 2 & 3.

Designing Context & Task


Design for relevancy.

To ensure research applicability, I designed task scenarios for two popular contexts: travel arrangement and movie booking.

We developed IBM Watson chatbots and crafted nine guidance conditions, including a control group. I led the conversation-design process, which entailed collecting sample dialogues, mapping 54 conversation flows, and iterating the CUI through 12 pilot tests.

Four timings.
Two types.

I designed nine conditions: eight guidance combinations crossing two guidance types (Example-based and Rule-based) with four timings (Service-onboarding, Task-intro, Upon-request, and After-failure), plus a control group. Our goal was to bridge the research gap by determining which of these combinations provides the best user experience.
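As a sketch, the condition matrix can be enumerated programmatically. The labels below mirror the paper's type and timing names, but the code itself is purely illustrative:

```python
from itertools import product

# Labels mirroring the study's two guidance types and four timings
guidance_types = ["Example-based", "Rule-based"]
timings = ["Service-onboarding", "Task-intro", "Upon-request", "After-failure"]

# Eight type x timing combinations plus a no-guidance control = nine conditions
conditions = [f"{g} @ {t}" for g, t in product(guidance_types, timings)]
conditions.append("Control (no guidance)")

for c in conditions:
    print(c)
```

Crossing the factors this way makes the between-subjects design explicit: every participant group maps to exactly one of the nine conditions.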

Experiment Design


126 interviews,
1512 task interactions.

The study consisted of two phases. In the first phase, I observed participants as they interacted with the chatbot while performing six tasks of varying complexities. Next, participants were asked to complete a survey that measured their satisfaction with the guidance provided.

In the second phase, I conducted interviews to gain insight into their perceptions, attitudes, and concerns regarding each guidance combination.

Data Analysis


Turning 1000+
affinity notes into
three important themes:
task efficiency, performance improvement, & opinions on guidance type and timing.

Because of the abstract nature of this research, I decided to bring clarity by doing physical affinity diagramming so the team could rearrange notes in a tangible space.

I led the team to synthesize over 1000 notes into three main topics: task efficiency, performance improvement, and diverse opinions on guidance and timing. Coming up with spicy insights became challenging as we converged.

Quantitative data
as a story
of user performance.

In addition to quotes, we believed users' actual interactions were indicative of their overall experience. We analyzed task-completion time, non-progress events, and improvements using statistical methods in R and Python. I took the lead in discussing which statistical methods to use and defining the quantitative metrics.
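A minimal sketch of the kind of per-condition summary that precedes formal testing; the completion times below are hypothetical placeholders, not the study's actual data:

```python
from statistics import mean, stdev

# Hypothetical task-completion times in seconds per condition
# (illustrative numbers only; the real analysis used the study's logged data)
times = {
    "Example-based @ Task-intro": [42.1, 38.5, 45.0, 40.2],
    "Rule-based @ Task-intro": [55.3, 60.1, 52.8, 58.4],
    "Control (no guidance)": [71.9, 68.4, 75.2, 70.6],
}

# Descriptive summary per condition, computed before formal tests
# (e.g., ANOVA and post-hoc comparisons) are run
for condition, samples in times.items():
    print(f"{condition}: mean={mean(samples):.1f}s, sd={stdev(samples):.1f}s")
```

Summaries like this made it easy to sanity-check the data and pick appropriate statistical tests before committing to them.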



Design conversation
based on users' needs:
efficiency-oriented or exploration-oriented.

This paper delves into the nuances of various guidance combinations and user perceptions of timing and type. In sum, conversational designers should prioritize understanding whether users' needs are efficiency-oriented (e.g., booking, finding urgent information) or exploration-oriented (e.g., learning, browsing).

Examples = efficiency,
rules = exploration

After completing this project, I took extra time to map out how these findings can be applied in real-world products.
For example, to accelerate task execution, chatbots could proactively shift to example-based guidance when they detect that the user is on the go.
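This principle can be sketched as a simple selection rule. `choose_guidance` is a hypothetical helper written for illustration, not part of the paper or any shipped product:

```python
def choose_guidance(user_need: str, on_the_go: bool = False) -> str:
    """Pick a guidance type from the paper's high-level principle:
    example-based guidance for efficiency-oriented needs (e.g., booking),
    rule-based guidance for exploration-oriented needs (e.g., browsing).
    Hypothetical helper for illustration only.
    """
    if on_the_go or user_need == "efficiency":
        # On-the-go users want fast task completion, so show examples
        return "example-based"
    if user_need == "exploration":
        # Exploring users benefit from rules they can generalize
        return "rule-based"
    # Default to examples when the user's need is unknown
    return "example-based"

print(choose_guidance("exploration"))
print(choose_guidance("exploration", on_the_go=True))
```

In a real product, `user_need` and `on_the_go` would come from context signals (task type, device, location) rather than explicit labels.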



Educate thousands of designers, researchers,
and developers

Working on this project for almost a year has taught me how to conduct research with both rigor and attention to detail.


Sharing insights asap

Research alone is insufficient without sharing insights with your team. Foster learning by sharing findings and engaging your team as soon as possible.


Visualize everything

I've learned the importance of considering the audience when presenting insights. Simple visualizations help in quickly conveying key takeaways.

What I Learned


Affinity notes are only useful if well-written

Taking thorough and relevant notes is crucial to helping the team synthesize data more effectively and arrive at better insights that inform design decisions.

Mixed-methods = sticky story

Just like UX case studies, academic papers are also about telling a good story.
Using different methods can improve the quality of narrative.

Moving forward: do more quant

I performed high-level tasks like selecting statistical methods and determining key terms, and I want to improve my quantitative execution skills.

What's It Like Working With Me


“In the six years that I have worked with many students, Sonia stands out as one of the most well-rounded and skilled student researchers I have had the pleasure of working with. Her sharpness in finding valuable insights is truly remarkable. From our very first meeting, Sonia asked thought-provoking and important questions that drove the research ... She was able to connect users’ quotes with her keen observations during the experiment. When everyone in the room thought the quote was out of context, Sonia was the one who could connect the dots and provide meaningful interpretation to it.

- Stanley Chang (Project Lead; Associate Professor @ NYCU Computer Science Department)