Project
001
3 takeaways for designing conversational user interfaces:
01. Example guidance results in better efficiency, while rule guidance leads to better understanding.
02. Timing matters: examples before the task are preferred, while examples after failures
are considered the worst.
03. Design conversations according to the purpose: efficiency-oriented or exploration-oriented.
Overview
002
How can we provide better guidance
for task-oriented chatbot users?
I designed and led pioneering research on conversational user interfaces, co-authored a paper published at a top human-computer interaction conference, and received a best paper award (5%).
Info
003
Between-subjects experiment, Literature review, Interviews,
Reflection study, Survey
UX Research Lead
Sep 2021 - Aug 2022
11 months
2 Researchers (including me)
3 Coding assistants
1 Project Lead
Timeline
004
Discovering Research Gap
005
Initially, I was intrigued by the challenge of helping users recover from conversational failures. After reading more than 100 research papers, I narrowed our focus to designing better guidance. I discovered that previous research lacks consensus on the ideal timing and type of guidance.
Finding The Right Methods
006
We formulated three research questions and identified the data needed to answer them. To ensure comprehensive results, I chose a mixed-methods approach that combined a lab experiment with a reflection study (interviews).
I chose a lab experiment to gather firsthand data in a face-to-face setting, which helped us answer RQ 1, and interviews to gain deeper insights into people's experiences, which enabled us to address RQ 2 & 3.
Designing Context & Task
007
To ensure research applicability, I designed task scenarios for two popular contexts: travel arrangement and movie booking.
We developed IBM Watson chatbots and crafted nine guidance conditions, including a control group. I led the conversation-design process, which entailed collecting sample dialogues, mapping 54 conversation flows, and iterating on the CUI through 12 pilot tests.
I designed nine guidance conditions: eight combinations of two types of guidance (Example-based and Rule-based) and four timings (Service-onboarding, Task-intro, Upon-request, and After-failure), plus a control group. Our goal was to bridge the current research gap by determining which of these combinations provides the best user experience.
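To make the condition count concrete, here is a minimal Python sketch that enumerates the design described above. The type and timing names come from the study; the function name, the label format, and the plain "Control" label are assumptions made for illustration, not the labels used in the actual experiment.

```python
from itertools import product

# Guidance types and timings as named in the study.
GUIDANCE_TYPES = ["Example-based", "Rule-based"]
TIMINGS = ["Service-onboarding", "Task-intro", "Upon-request", "After-failure"]


def build_conditions():
    """Return every type x timing combination plus one control condition."""
    # Hypothetical label format: "<type> / <timing>".
    conditions = [f"{g} / {t}" for g, t in product(GUIDANCE_TYPES, TIMINGS)]
    # The control group is added as a single extra condition.
    conditions.append("Control")
    return conditions


if __name__ == "__main__":
    for condition in build_conditions():
        print(condition)
```

Two types crossed with four timings give eight combinations, and the control group brings the total to nine.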
Talking, And More Talking!
008
The study consisted of two phases. In the first phase, I observed participants as they interacted with the chatbot while performing six tasks of varying complexity. Participants then completed a survey that measured their satisfaction with the guidance provided.
In the second phase, I conducted interviews to gain insight into their perceptions, attitudes, and concerns regarding each guidance combination.
Getting Our Hands Dirty With Data
009
Because of the abstract nature of this research, I decided to bring clarity through physical affinity diagramming, so the team could rearrange notes in a tangible space.
I led the team in synthesizing over 1,000 notes into three main topics: task efficiency, performance improvement, and diverse opinions on guidance and timing. Coming up with spicy insights became challenging as we converged.
In addition to quotes, we believed that users' actual interactions were indicative of their overall experience. We analyzed task-completion time, non-progress events, and improvements using statistical methods in R and Python. I took the lead in deciding which statistical methods to use and in defining the quantitative metrics.
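As a rough illustration of the descriptive step behind metrics like these, the sketch below averages task-completion time and non-progress events per guidance condition. It is not the study's analysis code: the record fields, the function name, and every value in the sample data are hypothetical placeholders, and the actual statistical tests are not shown.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_condition(sessions):
    """Average completion time and non-progress events for each condition."""
    grouped = defaultdict(list)
    for session in sessions:
        grouped[session["condition"]].append(session)
    return {
        condition: {
            "mean_completion_s": mean(r["completion_s"] for r in rows),
            "mean_non_progress": mean(r["non_progress"] for r in rows),
        }
        for condition, rows in grouped.items()
    }


# Placeholder records only; these are NOT data from the study.
sample_sessions = [
    {"condition": "A", "completion_s": 100, "non_progress": 2},
    {"condition": "A", "completion_s": 140, "non_progress": 0},
    {"condition": "B", "completion_s": 90, "non_progress": 1},
]

if __name__ == "__main__":
    for condition, stats in summarize_by_condition(sample_sessions).items():
        print(condition, stats)
```

Per-condition summaries like this are only a starting point; comparing conditions still requires an appropriate statistical test.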
Research Findings
010
This paper delves into the nuances of various guidance combinations and user perceptions of timing and type. In sum, conversational designers should prioritize understanding whether users' needs are efficiency-oriented (e.g., booking, finding urgent information) or exploration-oriented (e.g., learning, browsing).
After completing this project, I took the extra time to map out how these findings can be applied in real-world products.
Also, to accelerate task execution, chatbots could proactively shift to example-based guidance when they detect that the user is on the go.
Impacting The WORLD
011
Working on this project for almost a year has taught me how to conduct research with both rigor and attention to detail.
Research is pushing the team's intellectual boundaries
Just like product research, simply conducting research is not enough unless the insights and knowledge gained are shared with your team and stakeholders. Share your findings and engage in the practice of synthesis with your team to help them learn. Research is not just about storing more documentation in a drive; it's about pushing the team's intellectual boundaries.
Although this project is grounded in academic research, my findings can be valuable for informing product design. It's fun to translate insights into actionable design recommendations.
Visualize everything. Who's going to read your paper anyway?
I've learned the hard way that I'm often the only one who cares about my paper. When presenting insights, it's important to consider your audience and use simple visualizations to help teams quickly grasp the key takeaways and understand how these findings can be applied in practical contexts.
Here's What I've Learned
012
Taking thorough and relevant notes is crucial to helping the team synthesize data more effectively and arrive at better insights that inform design decisions.
Just like UX case studies, academic papers are also about telling a good story.
Using different methods can improve the quality of the narrative.
I performed high-level tasks like selecting statistical methods and determining key terms, and I want to improve my quantitative execution skills.
What's It Like Working With Me
013