Why Talking to a Chatbot Should Be Like Ordering Pizza.

Some best practices for building functional—maybe even lovable—chatbots.

Jenny Clark
July 13, 2016

Imagine walking into a featureless reception area, devoid of any indications of what services or products the business offers. From behind the desk, a receptionist asks, “How can I help you?”

“Well, I don’t know,” you might say. “I don’t even know what this place is. What can you help me with?”

That’s effectively the experience of someone engaging with a chatbot for the first time and encountering a wide-open question like “How can I help you?” This type of question tells the user that they can ask just about anything, even if that’s not the case. The user will then likely overwhelm the bot and, in turn, become frustrated by the inability to execute a task or finish a conversation.

Today, chatbots typically work by focusing on a single subject, acting as a concierge to execute a task. Simply put: an insurance chatbot can’t order food, a food delivery chatbot can’t talk insurance, and even though your roommate could probably benefit, no chatbot on Earth will teach someone empathy.


That said, chatbots are swiftly becoming more sophisticated as leading companies, including Facebook, Google, and Microsoft, develop them. Eventually, you’ll be able to buy things and surface content without leaving the interface-light messaging apps (Messenger, Slack, and Skype) that you already use. Messaging apps have already surpassed social networks in active users, and most users spend the vast majority of their time in just three apps (Facebook, YouTube, and Facebook Messenger), not the app you built. Because chat will (eventually) replace heavier interfaces, it’s important that we create experiences that engage and satisfy users, giving them what they want, quickly.

Set the Stage.

While working on Watson, IBM’s machine-learning platform of Jeopardy! fame, I designed a demo to illustrate how natural language processing could be integrated into a conversational interface. (Think of a conversational interface as a really, really smart chatbot, one whose AI allows it to learn a user’s preferences to save her time and lighten her cognitive load.) This is the first line a user encounters in the Watson template:

Hi! I can help you order pizza, what size would you like?

The exchange begins with what the service can do (take a pizza order) and prompts the user for information that both humans and computers can understand (pie size). The same exchange happens when you place a pizza order over the phone: the person taking the order contextualizes the conversation by asking whether you’d like pick-up or delivery. A well-designed chatbot doesn’t conflate subject matters, and always completes one task before moving on to the next.

It’s the interface’s goal to keep the user focused on supplying all the details needed to close the deal, just as it would be for a human taking an order. To do that, the system zeroes in on keywords that determine how it will respond. If it doesn’t get a keyword its API recognizes, it says so (“I didn’t quite get that”), then attempts to disambiguate by asking follow-up questions (“We don’t have medium. Did you want a small or large?”). Think about a conversation you and I might have. When you’re nodding your head, you’re communicating, “Yes, got it,” so I move on to the next thing.

But if you don’t understand what I’m saying, you’re usually stuck on a word or two of mine that you didn’t immediately grok. In that case, I’d clarify what I meant to dispel confusion (“disambiguate,” in developer speak), then advance the conversation. Once everyone’s on the same page about pizza size, for instance, the dialogue can move on to toppings.
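
The keyword-and-follow-up loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the Watson template's actual code: the menu, the function names (`find_keyword`, `reply`), and the bot's wording after the greeting are all assumptions made for the example.

```python
import re

# Hypothetical menu: the article only mentions small and large sizes,
# and the topping list is invented for illustration.
SIZES = ("small", "large")
TOPPINGS = ("pepperoni", "mushrooms", "olives")

GREETING = "Hi! I can help you order pizza, what size would you like?"


def find_keyword(message, keywords):
    """Return the first known keyword that appears as a whole word, else None."""
    words = re.findall(r"[a-z]+", message.lower())
    for keyword in keywords:
        if keyword in words:
            return keyword
    return None


def reply(order, message):
    """Fill the next empty slot in `order` and return the bot's next line."""
    if "size" not in order:
        size = find_keyword(message, SIZES)
        if size is None:
            # No recognized keyword: say so, then narrow the choices.
            return "I didn't quite get that. Did you want a small or large?"
        order["size"] = size
        return f"Got it, one {size} pizza. What topping would you like?"
    if "topping" not in order:
        topping = find_keyword(message, TOPPINGS)
        if topping is None:
            return "I didn't quite get that. We have " + ", ".join(TOPPINGS) + "."
        order["topping"] = topping
        return f"One {order['size']} pizza with {topping}, coming up!"
    return "Your order is already complete."
```

Starting from an empty `order` dictionary, a message such as "medium please" leaves the order untouched and triggers the follow-up question, while "A large one" fills the size slot and moves the dialogue on to toppings. The bot never advances until the current slot is filled, which is one simple way to finish a task before starting the next.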

The Need for Control.

Remember, conversational interfaces typically do one thing well (in this case, order pizza), so the trick when designing one is to build conversational guardrails that keep the chat going in the right direction. While designing the What’s in Theaters movie app for Watson, for instance, I learned that the easiest way to gain conversational control was to ask the user for her name. Once you can address her personally, you have her attention and can better steer the exchange.

The other benefit to maintaining tight control is that you can make the AI smarter. Let users have their way with it, and you’re asking for trouble. Just ask Microsoft’s innocent bot, Tay, who devolved into a hate-monger à la Donald Trump after fewer than 24 hours interacting with Twitter users. That isn’t to say that the unpredictable conversations users have with your bot should disappear into a void; they should be logged and mined for ways the interface can be made more valuable for users. If a significant number of people are asking for anchovies, and they’re not on your list of available toppings, maybe you should add them, or write a clever response that also moves the conversation toward executing the task (“Sorry, we don’t have anchovies. Go fish!”).
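
One way to log and mine off-menu requests is to tally them as they arrive. The sketch below is an assumption-laden illustration, not a description of any real bot's logging: the menu, the in-memory `unmatched` counter, and the function names are all hypothetical, and a production system would persist the log rather than keep it in memory.

```python
from collections import Counter

# Hypothetical menu, invented for illustration.
TOPPINGS = {"pepperoni", "mushrooms", "olives"}

# Tally of requests the bot could not match to the menu.
unmatched = Counter()


def record_topping_request(request):
    """Return True if the topping is on the menu; otherwise log it and return False."""
    topping = request.strip().lower()
    if topping in TOPPINGS:
        return True
    unmatched[topping] += 1
    return False


def popular_misses(min_count):
    """List off-menu requests seen at least `min_count` times, most frequent first."""
    return [topping for topping, count in unmatched.most_common() if count >= min_count]
```

Calling `popular_misses` with a threshold that counts as "a significant number" for your traffic gives a short list of candidates to add to the menu, or to answer with a clever canned response.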

The best conversational interfaces will be the ones that offer a new experience every time a user comes back, whether that’s through sharper chit-chat, a progressively simplified UI, or a streamlined checkout process built on previous ordering preferences. That’s when function gives way to delight—the ingredient that transforms a digital product from one that’s tolerated into one that’s loved.
