This case study is based on our project with a leading producer of tobacco heating systems. At the start of our collaboration, the chatbot was already relieving operators of a significant workload. The company wanted to identify the areas where the bot struggled, assess whether it made typical errors in conversation, and, most importantly, draw inspiration from the market, including international experience.
Markswebb audited the chatbot (both on the website and in Telegram) and advised the client’s team on which scenarios to improve and which useful business practices to borrow.
Note: at Markswebb, we update the Chatbot Rank methodology annually and know how to develop such services to reach leading market positions. We can find international practices that inspire ahead-of-the-curve solutions for the local market, as we did for the client in this project.
The task was to create a map for the further development of the service.
Markswebb’s experience began in the banking sector, but our evaluation methodologies are designed from the outset to scale to any market. That is why we were able to enter the project quickly with an adapted Chatbot Rank 2022 system and gather extensive information about the client’s chatbot.
If we had built the evaluation criteria from scratch or used a less flexible methodology (such as a standard checklist), it would not have been possible to obtain so much information in such a short time.
Atypical scenarios do end in failure, but they come up rarely. The real challenges lie in everyday tasks, where thousands of users run into the same typical errors day after day. This is where businesses lose the most money.
Instead of inventing interesting and rare scenarios, we examined user feedback and questions and modeled 18 typical requests for the client’s industry, such as placing an order for a device, finding the nearest store, troubleshooting device charging issues, and more.
One might assume that the bot handles these frequent, typical requests well. Technically, yes, they are all covered. In practice, however, real users struggled to interact with the bot and, in particular, to find the answer they needed among its replies.
For example, store addresses were hidden behind the button "Learn more about [product name]." Users were confident it would lead to instructions, advertisements, or flavor descriptions — anything but stores. As a result, many manually entered their queries, while others tried various dialogue options in search of stores through trial and error.
One respondent ultimately failed to obtain the information, resulting in a lost order.
Difficulties were also encountered with other scenarios. A person intending to make an online purchase had to figure out that they needed the "Special Offers" section.
The user thinks in simple categories: "buy online," "buy in a store," "find a store," "complain," "find out why it isn’t working," and expects familiar icons and placement for essential elements such as the shopping cart. Buttons in the chat should likewise be built around the user’s logic and the way they formulate their needs.
As a result, out of 18 "simple" chatbot scenarios, only 6 performed well. For the rest, we have compiled recommendations for improvement.
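To make the button-logic principle concrete, here is a minimal sketch in Python of a chat menu keyed to the user’s own categories rather than marketing labels. The button labels, flow names, and route function are hypothetical illustrations, not the client’s actual implementation.

```python
# A chat menu keyed to plain-language user intents. Every label states
# exactly which need it serves, unlike "Learn more about [product name]",
# which told users nothing about store addresses.
USER_MENU = {
    "Buy online": "checkout_flow",
    "Buy in a store": "store_locator",
    "Find the nearest store": "store_locator",
    "My device is not working": "troubleshooting",
    "File a complaint": "operator_handover",
}

def route(button_label: str) -> str:
    """Resolve a tapped button to the flow that answers that need;
    anything unrecognized degrades gracefully to a human operator."""
    return USER_MENU.get(button_label, "operator_handover")
```

With a mapping like this, store addresses can never hide behind an ambiguous label, and two different phrasings of the same need ("Buy in a store" and "Find the nearest store") land in the same flow.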
Requests like "I’d like to discuss the color of the device I’m ordering with the bot" or "Why doesn’t the bot pull in my phone number automatically?" are essentially ready-made script modifications. They are insightful and can be handed straight to development. But this is not enough: in-depth interviews are often needed, where users talk about their wishes and experience outside of specific scenarios.
Our task as UX researchers is to "capture" as many valid grievances as possible and formulate steps to address them. In-depth interviews help the business hear the real user’s opinion and learn about their impressions, which is challenging to achieve from within the business.
By the way, in the in-depth interviews we also ask how users learned about the application, so our research is of interest to marketers as well!
In total, we obtained 23 valuable ideas, which, like the scenario errors, were packaged into specific steps for correction and improvement.
If users remain silent about certain shortcomings, it doesn’t mean everything is fine. Sometimes people simply don’t have an "ideal" digital experience, so they don’t realize that things could be different and are willing to tolerate inconveniences.
Specialists conducting desk research (expert reviews) help supplement user feedback and identify what is wrong with the service based not on gut feeling but on UX science.
A good bot is like a good employee: it has both hard skills (knowledge of its field) and soft skills (the ability to understand, greet, communicate in a friendly way, ask clarifying questions, check whether the answer was found, say goodbye, and invite further inquiries). The "human" side of the bot’s abilities is assessed using CUI principles: functionality, usability, empathy, politeness, and others.
Conversational User Interface, or CUI for short, is a vital part of every bot. Markswebb has a baseline CUI methodology for auditing chatbots, and we adapt it to each specific domain and its unique user requests.
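As an illustration of how such an audit can be quantified, here is a minimal sketch: each dialogue receives auditor marks from 0 to 1 per criterion, and the marks are rolled up into weighted scores per CUI principle. The specific criteria, weights, and names are our assumptions for illustration, not the actual Markswebb rubric.

```python
# A toy CUI rubric: criteria are grouped by principle and weighted,
# and auditor marks are aggregated into a per-principle score.
from dataclasses import dataclass

@dataclass
class Criterion:
    principle: str  # e.g. "empathy", "politeness", "usability"
    question: str   # what the auditor checks in the transcript
    weight: float   # relative importance within the audit

RUBRIC = [
    Criterion("politeness", "Does the bot greet the user and say goodbye?", 1.0),
    Criterion("empathy", "Does the bot acknowledge frustration before solving?", 1.5),
    Criterion("usability", "Does the bot check whether the answer helped?", 2.0),
]

def score_dialogue(marks: dict[str, float]) -> dict[str, float]:
    """Aggregate 0..1 auditor marks (keyed by question) into weighted
    averages per principle."""
    totals: dict[str, float] = {}
    weights: dict[str, float] = {}
    for c in RUBRIC:
        totals[c.principle] = totals.get(c.principle, 0.0) + marks.get(c.question, 0.0) * c.weight
        weights[c.principle] = weights.get(c.principle, 0.0) + c.weight
    return {p: totals[p] / weights[p] for p in totals}
```

Rolling marks up per principle makes it easy to see, for example, that a bot is functional but impolite, and to prioritize its training accordingly.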
In our reports, we not only provide evaluations but also prioritize the identified issues. For each item, the client received recommendations from Markswebb on what the bot should be trained to do and why.
We provided over 20 such recommendations in the CUI section. Even if only a portion of them is implemented, loyalty to the bot and to the company will grow, while the load on human operators will fall.
In UX, the trend of "liquid expectations" has held for several years now: people form their expectations from all the services they use, regardless of industry. First people learned to order pizza and sushi online, then household appliances, and eventually even apartments. Each such experience feeds "liquid expectations" and makes us wonder: if I get convenient perks in one application, why can’t I have the same in another?
That’s why we search for interesting practices without limiting ourselves to a specific industry.
We found interesting practices in other applications on the local market. Their bots understand typos, let users go back a step, and recognize the type of inquiry: they distinguish, for example, "I want to cancel an order" from "I’m having trouble canceling an order." In the first case, the user is shown the relevant section; in the second, they are handed over to an operator. It is very convenient, and the client took note of it.
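For readers curious about the mechanics, here is a minimal sketch of both practices: typo tolerance via fuzzy matching, and routing that separates "want to cancel" from "trouble canceling." The intents, phrasings, and routes are hypothetical, and a production bot would normally use a trained NLU model rather than Python’s difflib.

```python
import difflib

# Hypothetical example phrasings per intent; a real bot would learn these.
INTENT_EXAMPLES = {
    "cancel_order": ["i want to cancel my order", "cancel order"],
    "cancel_order_problem": ["i can't cancel my order",
                             "having trouble canceling my order"],
    "find_store": ["where is the nearest store", "find a store"],
}

# The business rule lives in the routes: self-service where it works,
# immediate escalation where the flow has already failed.
ROUTES = {
    "cancel_order": "show_cancellation_section",
    "cancel_order_problem": "operator_handover",
    "find_store": "show_store_locator",
}

def classify(message: str) -> str | None:
    """Fuzzy-match the message against example phrasings so typos still land."""
    phrases = {p: intent for intent, ps in INTENT_EXAMPLES.items() for p in ps}
    match = difflib.get_close_matches(message.lower(), list(phrases), n=1, cutoff=0.6)
    return phrases[match[0]] if match else None

# Typos and all, this still routes straight to an operator.
print(ROUTES.get(classify("having truoble canceling my ordr"), "ask_to_rephrase"))
```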
In total, Markswebb specialists gathered 24 examples of implementations of the functions that should be improved. As a bonus, the report included practices from the international market: features and algorithms that neither the client nor their competitors currently have.
For example, the Bank of America widget offers a convenient scenario for resolving disputes. This practice, like the others, can be adopted; the scenario is described in more detail in the report for the client.
In our reports, we always provide a comprehensive implementation roadmap: essentially a ready-made backlog. In addition, the UX specialists who conducted the research consult the product team throughout the process. If a release slips or technical and product constraints arise, they help make adjustments on the go without compromising the quality of the changes.
This case study demonstrates a strategic and comprehensive approach to enhancing chatbot performance. By partnering with Markswebb, the client was able to identify key areas for improvement, implement innovative solutions, and leverage best practices from various markets, ultimately transforming their chatbot from good to great.