Conversational Architecture

Published in Anti Patter · 11 min read · Oct 31, 2015

My business partner Roy and I wandered around this lovely party on the Gawker rooftop here in NYC last summer. A “Hacks and Hackers” meetup, it presented a common ground for programmers and journalists to get together and compare notes. We started to notice an interesting gap in the thinking of the two represented groups.

On one hand, the hackers/programmers were acutely aware they were sitting on top of a bunch of potentially useful technology for the publishing world: semantic web, collective intelligence, mobile apps, location awareness, social network integration and so forth. There are now many, many tools in the Swiss Army knife of technology with which one could build a new-generation media empire.

On the other side of the fence, we found the business people we encountered quite open and interested in the new technologies. Everywhere we’ve been, we’ve encountered attitudes ranging from “oh yeah, I’ve heard of that — I should really learn more about that” to “yes it’s awesome. I’m really interested in getting on that train!”.

What we have not found, however, is a bridge between these two worlds. When we’ve started to speak with user experience, visual design and product development people, they seem at a loss as to how to incorporate these technologies into their products and leverage them to their advantage. There is no common design language or methodology for making sense of the whole collective intelligence world in a comprehensive manner.

Before we go any further though, let’s go camping.

The Idea

Let’s pretend for a moment we have a clothing retailer whom we’ll call Rugged Clothes. They want a complete coordinated digital marketing strategy. Here’s what it might look like:

The Smith family goes camping. During the camping trip, the kids wear some new clothing their parents have bought from Rugged. The parents take pictures of their kids with the Rugged iPhone app, uploading the pictures.

When they get home, the mother opens up the Rugged app on her iPad. Because they’ve recently uploaded the photos, the app automatically opens up in photo editing mode. Mom goes through the photos, picking out the best ones, adding comments to them. Then she elects to “publish” them.

The photos are published to the Smith family page on Facebook via the Rugged Facebook app. Friends and family can access them. The photos are automatically tagged with the articles of clothing that appear in them. Clicking on a tag will take users to a special category page on the Rugged website based on the Smith family. From there, the individual pieces of clothing can be purchased.

This story tracks the experience of users across four different media (iPhone, iPad, Facebook and the website) but describes a single, coherent experience that is aware of the current state of its users. This is the kind of story that catches the attention of the more visionary business people these days. It’s the promise of the collective intelligence technologies, and of the integration of mobile, social and web services.

The Page: A Bootstrapping Metaphor

We’re in a period of time with collective intelligence technologies analogous to the early days of film, a century ago. When motion picture technology first came on the scene, people simply leveraged the methodology of a previous medium, theater, and filmed it. It was several years before they started to realize they had a completely separate medium on their hands, and started to experiment with film editing (montage) and moving the camera during a shot (tracking and panning).

Similarly, in the web world, we’ve had the page. The concept of the web page swiftly became an incredibly convenient metaphor for designers in the early days of the commercial web. It allowed people who had a background in print design to make the jump into web, because they already knew how to lay out a page. (Put aside the endless problems created by designers who assume web pages work like print; it was actually a net advantage, because it bootstrapped web design.) By framing design decisions in the metaphor of a page, and a website in terms of a “collection of pages”, we had the foundation to structure the question of how to build in this new medium.

Unfortunately, all that is breaking down now. The page metaphor is becoming increasingly strained and less relevant in our modern world. Consider the following:

Mobile applications on multiple platforms
Highly dynamic AJAX/DHTML/HTML5-style websites
Social Networking Platforms
Location aware services
Collective Intelligence / Semantic Web technology

None of this stuff has much to do with pages. But without a design language or metaphor to fall back on, a chasm emerges between business people who can see the potential of these technologies and are willing to fund the right projects, and the technology folks who stand ready to build this stuff if only someone could let them know what, exactly, they should be building. There is no way to capture this stuff simply by discussing “pages”. It’s time to put that wireframe down, and step back.

Brave New World

Let’s refrain, for a moment, from discussing the specifics of any individual new medium (web, mobile, social, etc.) and try to consider the big picture of organizational communications. There are three basic characteristics to which we should aspire in our communications strategy:

Multi-Channel: As is largely conventional wisdom now, just having a web page, or just having a Twitter account etc., is usually not sufficient. Different media have different mechanics and areas in which they are effective, and the best approach would be a comprehensive communications strategy that takes advantage of the strengths of each platform and leverages them in such a way that makes the most sense for the organization.
Multi-Modal: This is an important concept: people’s interactions with organizations are modal. Often they are driven by some purpose, specific or not, held by the individual. One of the biggest design challenges on the Internet has been to try and present what are essentially modeless designs (e.g. “good for everybody, all the time”) that are actually used in very specific, modal ways. A great communications approach would be multi-modal, rather than a “one size fits all” mode.
Multi-Directional: Communications is really a two-way street. While it’s one thing to have a touchy-feely marketing message in which you claim to listen to your customers, actually implementing it in a quality way at scale is extremely challenging. Nonetheless, an organization that can actually respond to feedback and requests from individuals is at an incredible advantage.

So what metaphor can we use to pull together these qualities? What exists in the natural world that’s a good fit for this?

Enter The Conversation

Conversations have obviously been around forever, which conveniently means that most people have something of an intuitive grasp of what they’re about. Looking at our criteria above, we can see that this metaphor maps nicely onto what we’re trying to accomplish. Conversations can traverse various media (multi-channel), can shift modes depending on the actions of the participants (multi-modal), and involve two or more parties both listening and speaking (multi-directional).

Conversations should be used as the foundational design metaphor at the point after the initial concept for online communications has been proposed, but before specific user interfaces are designed.

This lets you know what to build. By modeling the conversation (or conversations) your organization is having with the outside world, you will be able to shape your online communications strategy in a way that is targeted towards specific audiences, over specific channels. More targeted communication means less noise and a more fruitful conversation.

Conversation Design: How to Do It

So how exactly does one design a conversation? Let’s break it down into steps:

Conversation Mapping

The first step is conversation mapping: determining what conversations exist between participants. This is a high-level, strategic activity that gives some shape to the universe into which specific design thought can be injected. It begins with identifying the participants. Here’s an example from my company, Saaspire:

Saaspire itself sits in the middle as the reactor. This isn’t meant to indicate that it’s passive in any sense, but because it’s the participant that we actually control, any automation we build will live there, so the term “reactor” is accurate as far as describing the process. All around are the main constituencies with which Saaspire communicates: customers, investors, developers and press. Those are the “actors”.

Now we need to establish the conversations we’re having with each participant.

Basically, what we’re saying is that Saaspire is having four different conversations with different participants. For customers, our communications tend to be about product support and education. The developer conversation is similar: our communications are more technical and platform-oriented, but again about documentation and support.

To investors we speak about the value and potential of the company itself, and finally to the press we try and make an attractive “next big thing” story.
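The conversation map described above can be written down as plain data before any interface work begins. The following is only a minimal sketch: the `Conversation` class, the `CONVERSATION_MAP` name and the exact topic strings are assumptions made for illustration, paraphrased from the description of the four Saaspire conversations.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Conversation:
    """One conversation between the reactor and a single actor (hypothetical model)."""
    actor: str                 # the constituency on the other side
    topics: Tuple[str, ...]    # what the conversation tends to be about


# The reactor is the participant we control; any automation lives here.
REACTOR = "Saaspire"

# One conversation per actor, with topics paraphrased from the text.
CONVERSATION_MAP: Dict[str, Conversation] = {
    "customers": Conversation("customers", ("product support", "education")),
    "developers": Conversation("developers", ("technical documentation", "platform support")),
    "investors": Conversation("investors", ("company value", "company potential")),
    "press": Conversation("press", ("next big thing story",)),
}


def conversation_for(actor: str) -> Conversation:
    """Look up the conversation the reactor is having with a given actor."""
    return CONVERSATION_MAP[actor]


if __name__ == "__main__":
    for actor in sorted(CONVERSATION_MAP):
        topics = ", ".join(conversation_for(actor).topics)
        print(f"{REACTOR} <-> {actor}: {topics}")
```

Keeping the map as data rather than prose makes it easy to check that every actor has exactly one named conversation before moving on to contexts and modes.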

Identify User Contexts

So once the existence of a conversation is established, how do we gain some insight on how it functions? First we need to look at the driving forces of conversation modality, which I call context. Context is an aggregation of the various factors about a participant that, in combination, drives the conversation from one mode to another.

Context consists of four factors:

Personas are behavioral aggregates of participants. They represent any long-lived group of user behavior that’s worth addressing en masse. Traditionally in web design, personas have been expressed in demographic and (high level) motivational terms (“Cindy is 36 and wants to get things done fast.”). While we’re less concerned with the demographic aspect of personas in this case, it can still be useful to think of them in motivational terms: How do they think? What do they want?
Affinity represents the stuff that people like. It might refer to content or advertising on a media site, or it might instead refer to products on an e-commerce site.
A state or goal represents a temporary condition in which a participant exists. For example, a user that has started a checkout process on an e-commerce site could be said to be in a specific state that will conclude with finalizing checkout. The main difference between states and personas is the temporary nature of states.
Environment is a catch-all meaning the circumstances under which the conversation takes place. In digital terms, it tends to refer to the browser, the operating system and the form factor of the device used for access, but can also be broadened to include concepts like location.
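The four factors above can be bundled into a single record per participant. This is a rough sketch rather than a prescribed schema: the `Context` class, its field names and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical snapshot of the four context factors for one participant."""
    persona: Optional[str] = None              # long-lived behavioral group
    affinities: FrozenSet[str] = frozenset()   # content or products the participant likes
    state: Optional[str] = None                # temporary condition or goal
    environment: Optional[str] = None          # device, browser, location and so on


# Illustrative values only: a shopper partway through checkout on a tablet.
checkout_shopper = Context(
    persona="returning customer",
    affinities=frozenset({"outdoor jackets"}),
    state="checkout started",
    environment="tablet",
)

if __name__ == "__main__":
    print(checkout_shopper)
```

Every field is optional because, in practice, some factors will be unknown for a given participant at a given moment.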

This leads up to a very important concept in conversation design: contexts trigger modes.

Mode Mapping

Once you’ve identified the possible personas, affinities, states/goals and environments that you’re going to support, the next step in conversation design is to determine your response. This is done by having specific combinations, or contexts, trigger modes. Modes are your response to that context. For example, within our customer support/education conversation, we might identify the following mode:

In this particular case, we are looking at participants that we’ve tagged in two specific ways. First, they are considered part of the “qualified customer” persona. This might have been established in any number of ways, such as requiring them to log in or otherwise establishing that they hold at least one license for one of our products, or it may just be some much softer form of self-identification on their part. Second, they have exhibited behavior (perhaps a search on our site, or an inbound link from a specific Google search) that lets us know that they have the immediate goal of seeking information.

Given these two criteria (and we don’t care about their affinity or environment in this case) we trigger “customer support mode”. Within customer support mode, we might provide facilities on our website that are slightly (or substantially) different from what other users might see.
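The idea that contexts trigger modes can be sketched as a small rule table. This is one possible shape, not a definitive implementation: `Context`, `ModeRule`, `select_mode` and the label strings are hypothetical names, and a field left as `None` in a rule stands for a factor we don't care about, as with affinity and environment in the example above.

```python
from dataclasses import dataclass
from typing import List, Optional

FACTORS = ("persona", "affinity", "state", "environment")


@dataclass(frozen=True)
class Context:
    """Hypothetical context: what we currently know about a participant."""
    persona: Optional[str] = None
    affinity: Optional[str] = None
    state: Optional[str] = None
    environment: Optional[str] = None


@dataclass(frozen=True)
class ModeRule:
    """Hypothetical rule: a combination of factors that triggers a mode."""
    mode: str
    persona: Optional[str] = None      # None means "don't care"
    affinity: Optional[str] = None
    state: Optional[str] = None
    environment: Optional[str] = None

    def matches(self, ctx: Context) -> bool:
        # Every factor the rule specifies must equal the context's value.
        for name in FACTORS:
            wanted = getattr(self, name)
            if wanted is not None and wanted != getattr(ctx, name):
                return False
        return True


RULES: List[ModeRule] = [
    ModeRule(mode="customer support",
             persona="qualified customer",
             state="seeking information"),
]


def select_mode(ctx: Context, rules: List[ModeRule] = RULES,
                default: str = "default") -> str:
    """Return the mode of the first matching rule, or the default mode."""
    for rule in rules:
        if rule.matches(ctx):
            return rule.mode
    return default


if __name__ == "__main__":
    visitor = Context(persona="qualified customer",
                      state="seeking information",
                      environment="mobile")
    print(select_mode(visitor))
```

A first-match rule list keeps the behavior easy to reason about; more specific rules would simply be placed earlier in the list.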

So What’s A Mode, Exactly?

A mode is a building block for your web service. Modes are the states that your web service passes through for individual users as they are triggered by those users’ contexts. In different modes, your web service might contain different functional components, variations in user interface, different content and so forth. Deciding what to vary between modes will be one of the foundational skills in this approach to design.

For example, you could use different modes to optimize for the way the user likes to interact with the site. Perhaps certain users tend to search for information within your website whereas others are more browsers (they use the navigation system). The site could adjust modes to emphasize the elements of the UI (search vs navigation) that most suit those users.

Modal design could emphasize the state in which a user exists at a given point. For example, consider the “offline buying decision” in e-commerce. A visitor goes to an e-commerce website and browses around, looking at various wares. Then he leaves the site, and while he’s away from the site, decides to purchase something he’s seen. At that point he goes back to the website and immediately purchases the item.

Most websites don’t know what to do with this behavior. They see the first session as a conversion failure, and the second session as a success without any explanation. But a modal site would recognize this as a state shift for the same profile, and the site could optimize for the appropriate state (browsing vs buying).

If the site suspected it had an offline purchase decision maker (a Persona), it could switch between “browse mode” and “buy mode”, based on the inferred State/Goal of the individual user. In browse mode it would always be showing more options to the user, up-selling, suggesting more items and generally just extending the engagement between the user and the site.

If the person came back to the site, and immediately added the item to their shopping cart (something they had looked at before) the site would switch modes, into “buy” mode. In this mode, the objective of the site is to let the customer check out as fast as possible. No more distractions, no upsell, no additional options. The user is now in buy mode — just let them buy.
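The browse/buy switch described above might look something like the following. Treat it as a sketch under stated assumptions: the `Profile` class, the function names and the "first action of a returning session" heuristic are all invented here for illustration, and a real site would likely infer the state from richer signals.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Profile:
    """Hypothetical per-visitor profile that persists across sessions."""
    viewed_items: Set[str] = field(default_factory=set)
    mode: str = "browse"


def start_session(profile: Profile) -> None:
    # Every new session begins in browse mode until behavior says otherwise.
    profile.mode = "browse"


def record_view(profile: Profile, item: str) -> None:
    profile.viewed_items.add(item)


def add_to_cart(profile: Profile, item: str, is_first_action: bool) -> str:
    # Assumed heuristic: adding a previously viewed item as the very first
    # action of a session suggests the decision was made offline.
    if is_first_action and item in profile.viewed_items:
        profile.mode = "buy"
    return profile.mode


def show_upsell(profile: Profile) -> bool:
    # Browse mode extends engagement; buy mode removes distractions.
    return profile.mode == "browse"


if __name__ == "__main__":
    visitor = Profile()
    start_session(visitor)
    record_view(visitor, "rain jacket")   # first session: browsing only
    start_session(visitor)                # visitor returns later
    add_to_cart(visitor, "rain jacket", is_first_action=True)
    print(visitor.mode, show_upsell(visitor))
```

The point of the design is that the first session is no longer recorded as a failure: the profile carries the viewed items forward, and the second session is recognized as a state shift for the same visitor.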

More to Come

So that’s the quick introduction to Conversational Architecture. I’ll be drilling down into more of this in future posts. Please let me know what you think — I’d love to discuss this.