Structure and Fluidity — Thoughts on Conversational Interface Design

Loren Davie
Published in Anti Patter
3 min read · Jun 29, 2016

“Please listen carefully as our menu options have recently changed.”

The worst.

Phone trees incite what I call Faceless Rage: the fury that rises with the realization that one is being forced to interact with an automated bureaucracy that is badly impersonating a human being. Phone trees are guilty of many sins, but chief among them are:

  • They are both stupid and aggressive: they completely control the terms and structure of the conversation, prioritizing their agenda and worldview utterly over yours.
  • They are ponderous. You have to work your way down their ontology in order to reach your goal. It’s like being stuck in traffic.
  • They are tone-deaf. They can’t remember the exchange you had one minute ago, nor hear the emotional timbre of your voice.

Phone trees are arguably primitive conversational interfaces, and fairly poor ones, which is why they have such a terrible reputation for customer experience. Of their many sins, I’d like to focus on the problems of Structure and Fluidity in conversational interfaces, and use phone trees as the poster child for problematic over-structuring.

The Case for Structure

When dealing with computers, structure is generally your friend. Business logic and data need to be organized in a definitive way, to make the system behave predictably. So the default approach of a system with a conversational interface is to impose structure on every aspect of that conversation, thus ensuring that incoming requests from users can be easily slotted into their appropriate functions and responses. This is how phone trees generally work.
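To make the fully structured approach concrete, here is a minimal sketch of a phone tree as a nested menu. The menu layout and the action names (such as `report_cable_outage`) are invented for illustration and do not describe any real system. The point is that every caller input is a key press that maps to exactly one branch, which is what makes the system easy to build and predictable to run.

```python
# Hypothetical phone tree: each menu maps a key press to a submenu (dict)
# or to a final action (str). All names here are illustrative assumptions.
PHONE_TREE = {
    "1": {"1": "billing_balance", "2": "billing_dispute"},
    "3": {
        "5": {"7": "report_cable_outage", "8": "schedule_technician"},
        "6": "upgrade_plan",
    },
}


def navigate(tree, key_presses):
    """Walk the tree one key press at a time; return the action reached, or None."""
    node = tree
    for key in key_presses:
        if not isinstance(node, dict) or key not in node:
            return None  # invalid option: the caller has to start over
        node = node[key]
    # Still inside a menu means the caller has not reached an action yet.
    return node if isinstance(node, str) else None


print(navigate(PHONE_TREE, ["3", "5", "7"]))  # report_cable_outage
print(navigate(PHONE_TREE, ["3", "5"]))       # None (still in a menu)
```

The system side is trivial, but all of the work of locating the right branch falls on the caller.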

There is a second argument for structure, which is about discoverability. Unlike a GUI, a purely conversational interface can’t lay out all of the options for the user at once, so it’s easy to create a situation in which the user is unable to speak or type the right phrase to get the results they want. These are the kinds of problems we had with human-computer interaction prior to the rise of the GUI: users who were baffled by command-line interfaces, and effectively paralyzed.

The Case for Fluidity

Complete pre-defined structure is not how human conversations work. For example, we praise the qualities of a good listener: someone who hears what we have to say, confirms with us that they have heard, and then responds with relevance and empathy to address our topic. The good listener, plus operational effectiveness, is essentially also our model for good customer service. The customer service rep listens to your problem, understands it, responds with empathy, and then addresses it effectively. The good listener is a decent model for what users want out of conversational interfaces.

Fluidity in a conversation is also more efficient. One can simply set the conversational agenda directly, and everything will shift to that context. It’s much more desirable to simply state “my cable connection isn’t working” than to press 3, then 5, then 7, and then read out your account number, and so on. In essence, the onus for structuring the information isn’t all on one party.
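As a rough illustration of the fluid approach, the sketch below lets the user state their goal in one sentence and maps it straight to a destination. The keyword-overlap matching is a deliberately naive stand-in for real language understanding, and the intent names and keyword sets are assumptions made up for this example.

```python
import re

# Hypothetical intents, each with a few keywords that hint at it.
INTENT_KEYWORDS = {
    "report_cable_outage": {"cable", "connection", "outage", "working"},
    "billing_dispute": {"bill", "charge", "overcharged", "refund"},
    "upgrade_plan": {"upgrade", "faster", "plan"},
}


def detect_intent(utterance):
    """Return the intent whose keywords overlap most with the utterance, or None."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


print(detect_intent("My cable connection isn't working"))  # report_cable_outage
print(detect_intent("Hello there"))                         # None
```

One utterance replaces the whole sequence of key presses. The `None` case shows the flip side: when nothing matches, the user gets no hint about what would have worked.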

Finding a Balance

So with potential discoverability and usability problems on the Fluid side, and potential efficiency and humanity problems on the Structure side, it becomes clear that the art of conversational interface design involves finding the right balance. This has led to some interesting hybrid approaches.

In chat interfaces, it is becoming common to present a set of option buttons within the context of an otherwise unstructured conversation. The user isn’t forced to press one of those buttons; they can still type into the chat window as usual. But this approach allows the system to telegraph a menu of options to the user, both addressing discoverability issues and taking a step towards pre-structuring user input, which makes the system a little easier to implement.

What’s nice about this is that even though a little bit of structure has been imposed on the conversation, it’s still completely contextual. The buttons are pushed to the user in response to the current context of the conversation, instead of sitting contextless in some GUI or phone tree. So it feels reasonably humanistic and authentic, but still addresses the problems with the pure-fluid approach.
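The hybrid pattern of option buttons inside a free-form chat can be sketched as follows. This is only one possible shape, under stated assumptions: `BotMessage`, `interpret_reply`, and `free_text_fallback` are hypothetical names, and the fallback is a placeholder for whatever language understanding a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class BotMessage:
    """A chat message that may carry contextual option buttons."""
    text: str
    options: list = field(default_factory=list)  # (button label, intent) pairs


def free_text_fallback(reply):
    # Placeholder for a real language-understanding step (an assumption).
    return "report_cable_outage" if "cable" in reply.lower() else "unknown"


def interpret_reply(message, reply, fallback):
    """Resolve a reply to an intent: button labels first, free text otherwise."""
    normalized = reply.strip().lower()
    for label, intent in message.options:
        if normalized == label.lower():
            return intent  # structured path: the user pressed a button
    return fallback(reply)  # fluid path: the user typed something else


message = BotMessage(
    text="Sorry to hear that. What would you like to do?",
    options=[("Check my bill", "billing_balance"), ("Talk to an agent", "handoff")],
)

print(interpret_reply(message, "Check my bill", free_text_fallback))
print(interpret_reply(message, "my cable is out again", free_text_fallback))
```

The buttons travel with a specific message, so they are always tied to the current point in the conversation, and typed input is never rejected just because it does not match a button.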
