Chatbot Guidance Design

In this award-winning paper,
I offer a set of design recommendations for conversational user interfaces to enhance users' task performance, improvement,
and overall experience.

3 takeaways for designing conversational user interfaces:

01. Example guidance results in better efficiency, while rule guidance leads to better understanding.
02. Timing matters: examples before a task are preferred, while examples after failures
are considered the worst.
03. Design conversations according to the purpose: efficiency-oriented or exploration-oriented.




How can we provide better guidance
for task-oriented chatbot users?


I designed and led pioneering research on conversational user interfaces, co-authored a paper published at a top human-computer interaction conference, and received a best paper award (5%).




Between-subjects experiment, Literature review, Interviews,
Reflection study, Survey

My Role

UX Research Lead


Sep 2021 - Aug 2022
11 months


2 Researchers (including me)
3 Coding assistants
1 Project Lead




Discovering Research Gap


Guidance timing matters.
Content also matters.

Initially, I was intrigued by the challenge of helping users recover from conversational failures. After reading more than 100 research papers, I narrowed our focus to designing better guidance. I discovered that previous research lacks consensus on the ideal timing and type of guidance.

Finding The Right Methods


Why Mixed-methods?

We formulated three research questions and identified the data needed to answer them. To ensure comprehensive results, I chose a mixed-methods approach that combined a lab experiment with a reflection study (interviews).

I chose a lab experiment to gather firsthand data in a face-to-face setting, which helped us answer RQ 1, and interviews to gain deeper insight into people's experiences, which enabled us to address RQ 2 and RQ 3.

Designing Context & Task


Design for relevancy.

To ensure the research would be applicable, I designed task scenarios for two popular contexts: travel arrangement and movie booking.

We developed IBM Watson chatbots and crafted nine guidance conditions, including a control group. I led the conversation-design process, which entailed collecting sample dialogues, mapping 54 conversation flows, and iterating on the CUI through 12 pilot tests.

Four timings.
Two types.

I designed nine guidance conditions: eight combinations of two guidance types (example-based and rule-based) and four timings (service-onboarding, task-intro, upon-request, and after-failure), plus a control group. Our goal was to bridge the current research gap by determining which of these combinations provides the best user experience.
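As a minimal sketch of how the conditions above add up to nine, the snippet below crosses the two guidance types with the four timings and appends a control group. The names `GUIDANCE_TYPES`, `GUIDANCE_TIMINGS`, and `build_conditions` are hypothetical and only illustrate the design; they are not taken from the study's actual tooling.

```python
from itertools import product

# Guidance types and timings as named in the study design.
GUIDANCE_TYPES = ["example-based", "rule-based"]
GUIDANCE_TIMINGS = ["service-onboarding", "task-intro", "upon-request", "after-failure"]


def build_conditions():
    """Return the eight type-by-timing combinations plus one control condition."""
    conditions = [
        {"type": guidance_type, "timing": timing}
        for guidance_type, timing in product(GUIDANCE_TYPES, GUIDANCE_TIMINGS)
    ]
    # Control group: represented here (as an assumption) by empty type and timing.
    conditions.append({"type": None, "timing": None})
    return conditions


if __name__ == "__main__":
    for index, condition in enumerate(build_conditions(), start=1):
        print(index, condition)
```

Because this is a between-subjects experiment, each participant would be assigned to exactly one of these nine conditions.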

Talking, And More Talking!


126 interviews,
1,512 task interactions.

The study consisted of two phases. In the first phase, I observed participants as they interacted with the chatbot while performing six tasks of varying complexity. Participants then completed a survey that measured their satisfaction with the guidance provided.

In the second phase, I conducted interviews to gain insight into their perceptions, attitudes, and concerns regarding each guidance combination.

Getting Our Hands Dirty With Data


Turning 1000+
affinity notes into
three important themes:
task efficiency, performance improvement, & opinions on guidance type and timing.

Because of the abstract nature of this research, I decided to bring clarity through physical affinity diagramming so the team could rearrange notes in a tangible space.

I led the team to synthesize over 1000 notes into three main topics: task efficiency, performance improvement, and diverse opinions on guidance and timing. Coming up with spicy insights became challenging as we converged.

Quantitative data
as the story
of user performance.

Beyond quotes, we believed that users' actual interactions were indicative of their overall experience. We analyzed task-completion time, non-progress events, and improvements using statistical methods in R and Python. I took the lead in deciding which statistical methods to use and in defining the quantitative metrics.

Research Findings


Design conversations
based on users' needs:
efficiency-oriented or exploration-oriented.

This paper delves into the nuances of various guidance combinations and user perceptions of timing and type. In sum, conversational designers should prioritize understanding whether users' needs are efficiency-oriented (e.g., booking, finding urgent information) or exploration-oriented (e.g., learning, browsing).

Examples = efficiency,
rules = exploration

After completing this project, I took extra time to map out how these findings can be applied in real-world products.
For example, to accelerate task execution, chatbots could proactively shift to example-based guidance when they detect that the user is on the go.
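A rough sketch of how a product might apply this mapping is shown below. It assumes the chatbot already knows the user's purpose and has some signal that the user is on the go; the function name `choose_guidance_type` and the `on_the_go` flag are hypothetical and not part of the paper.

```python
def choose_guidance_type(purpose: str, on_the_go: bool = False) -> str:
    """Pick a guidance type from the user's purpose.

    Follows the takeaway that examples support efficiency and rules support
    exploration. `on_the_go` is an assumed, externally detected signal.
    """
    if on_the_go:
        # Users on the go need speed, so fall back to examples.
        return "example-based"
    if purpose == "efficiency":
        return "example-based"
    if purpose == "exploration":
        return "rule-based"
    raise ValueError(f"Unknown purpose: {purpose!r}")


if __name__ == "__main__":
    print(choose_guidance_type("efficiency"))
    print(choose_guidance_type("exploration"))
    print(choose_guidance_type("exploration", on_the_go=True))
```

How the on-the-go signal is detected is left open here; in practice it would depend on what context the product can reasonably observe.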

Impacting The WORLD


Educating thousands of designers, researchers,
and developers

Working on this project for almost a year has taught me how to conduct research with both rigor and attention to detail.


Research is pushing the team's intellectual boundaries

As with product research, simply conducting research is not enough unless the insights and knowledge gained are shared with your team and stakeholders. Share your findings and practice synthesis with your team to help them learn. Research is not just about storing more documentation in a drive; it's about pushing the team's intellectual boundaries.

Leveraging academic research to inform
product decisions.

Although this project is grounded in academic research, my findings can be valuable for informing product design.  It's fun to translate insights into actionable design recommendations.


Visualize everything. Who's going to read your paper anyway?  

I've learned the hard way that I'm often the only one who cares about my paper. When presenting insights, it's important to consider your audience and use simple visualizations to help teams quickly grasp the key takeaways and understand how these findings can be applied in practical contexts.

Here's What I've Learned


Affinity notes are only useful if well-written

Taking thorough and relevant notes is crucial to helping the team synthesize data more effectively and arrive at better insights that inform design decisions.

Mixed-methods = sticky story

Just like UX case studies, academic papers are also about telling a good story.
Using different methods can improve the quality of the narrative.

Moving forward: do more quant

On this project, I handled high-level tasks such as selecting statistical methods and determining key terms. Next, I want to strengthen my hands-on quantitative execution skills.

What's It Like Working With Me


In the six years that I have worked with many students, Sonia stands out as one of the most well-rounded and skilled student researchers I have had the pleasure of working with. Her sharpness in finding valuable insights is truly remarkable. From our very first meeting, Sonia asked thought-provoking and important questions that drove the research ... She was able to connect users’ quotes with her keen observations during the experiment. When everyone in the room thought the quote was out of context, Sonia was the one who could connect the dots and provide meaningful interpretation to it.

- Stanley Chang (Project Lead; Associate Professor @ NYCU Computer Science Department)