Conversation Flows in the age of Gen AI

 

In Customer Experience (CX), the natural language conversations underlying every channel, from AI assistants and voice calls to human agent chats and email ticketing systems, are a black box. Opening up that black box means visualizing the organic natural language flows to pinpoint the breaking points and failures, so that teams can better serve the end customer.

The issue is amplified in Gen AI assistants. Unlike classic, manually designed dialog trees, Gen AI assistants are free-form and offer little observability or monitoring.

Regular, iterative changes to CX need monitoring, including A/B tests to track improvements. This requires diving deeper into customer conversation data flows and acting on those insights to drive continuous improvement in the quality of support service, with improved retention and revenue as a side effect.

How to monitor and analyze conversational CX flows?

At Cuein, it all starts with raw conversation transcripts or recordings between agents (bot or human) and the end customer. The Cuein AI platform extracts hundreds of signals, or “cues”, from each conversation, such as the user query, the underlying root cause, the data collected to clarify the query, and the agent actions. It also surfaces cues like frustration and confusion to give additional insight into the interaction.

The platform then aggregates these cues across millions of conversations and groups them into actionable variations or conversation flows. Let’s dive deeper into these flows, what they look like and some design considerations for the product.

Conversational CX flows

The anatomy of conversation flows

Conversation flows map large volumes of customer interactions into more abstract, sequential views. By visualizing the crucial cues within these exchanges, they make it possible to analyze many customer interactions in a short time. Conversation flow analysis helps identify bottlenecks, pinpoint areas for improvement or automation, and monitor the progression of interactions across siloed Customer Experience (CX) systems.

Conversation flows are graphs with nodes and edges. Nodes represent message categories, like User Query, Agent Response, Confirmation of Resolution and so on. The edges connecting these nodes represent the sequential progression of interactions within conversations, creating a structured representation of the conversation.
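To make the graph structure concrete, here is a minimal sketch in Python. It assumes each conversation has already been reduced to an ordered list of message categories; the function name `build_flow_graph` and the sample data are hypothetical and are not taken from the Cuein platform.

```python
from collections import Counter

def build_flow_graph(conversations):
    """Count nodes and directed edges across a list of category sequences."""
    node_counts = Counter()
    edge_counts = Counter()
    for categories in conversations:
        node_counts.update(categories)
        # Each consecutive pair of categories forms one directed edge.
        edge_counts.update(zip(categories, categories[1:]))
    return node_counts, edge_counts

# Hypothetical sample: two conversations expressed as node categories.
conversations = [
    ["User Query", "Agent Response", "Confirmation of Resolution"],
    ["User Query", "Agent Response", "User Query", "Agent Response"],
]

nodes, edges = build_flow_graph(conversations)
for (source, target), count in edges.most_common():
    print(f"{source} -> {target}: {count}")
```

The edge counts give a natural weight for each transition, which is one way to decide which edges deserve the most visual emphasis.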

To illustrate, let's take the example of a fictitious food delivery provider. Its customer support includes an AI assistant called BiteAssistant, designed to address customer questions. Over time, numerous conversations unfold, each categorized under a specific contact reason such as food delivery estimate, late deliveries, order problems or tipping concerns.


Let’s double-click on one contact reason: Food Delivery Estimate.
The conversation below illustrates a user asking the AI assistant about a food delivery estimate, with a follow-up concern that the driver is going the wrong way. The AI assistant confuses that with a desire to cancel. The user then asks to be transferred to an agent, who provides an answer. The corresponding flow view shows the Confusion and Agent Transfer nodes, making it easy to highlight areas of improvement for the bot.

Note also that the flow represents 308k distinct conversations. A handful of such paths effectively captures the majority of the high-frequency interactions, which streamlines the extraction of valuable insights from the massive volume of conversations.
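The idea of collapsing many conversations into a few frequent paths can be sketched as follows. This is an illustrative Python example under simple assumptions: conversations that share the exact same category sequence count as one path, and the helper `top_paths` and the toy counts are hypothetical.

```python
from collections import Counter

def top_paths(conversations, k):
    """Return the k most frequent paths as (path, count, share of all conversations)."""
    path_counts = Counter(tuple(path) for path in conversations)
    total = sum(path_counts.values())
    return [(path, count, count / total) for path, count in path_counts.most_common(k)]

# Hypothetical toy data: three distinct paths with made-up frequencies.
resolved = ["User Query", "Agent Response", "Confirmation of Resolution"]
confused = ["User Query", "Confusion", "Agent Transfer", "Agent Response"]
escalated = ["User Query", "Agent Response", "Agent Transfer"]
conversations = [resolved] * 3 + [confused] * 2 + [escalated]

ranked = top_paths(conversations, k=2)
for path, count, share in ranked:
    print(f"{share:.0%} ({count}): {' -> '.join(path)}")
```

The cumulative share of the top paths indicates how much of the traffic a small set of flows explains.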

[Figure: Anatomy of Conversation Flows]

Design considerations for Conversation Flows 

Here are some of the design considerations that emerged over numerous iterations of conversation flows.


Node Taxonomy

There is a large body of literature on message (aka utterance) categorization and taxonomies. Classic academic work, including MultiDoGo, the Schema-Guided Dialogue dataset and MultiWOZ, has attempted to construct realistic conversations and define its own message categories.

The node taxonomy is inspired by the above references but tailored for the future of conversational interfaces. Each node essentially has two parts to its anatomy.

The first is a domain-agnostic taxonomy derived from a combination of these taxonomies. Categories like “Greeting”, “Goodbye” and “Request More” exist, but because they are less meaningful parts of the conversation, they are not shown in our conversation flow.
Categories like User Intent, Agent Response, Agent Transfer and Confusion are the meat of the conversation and help surface the most relevant and actionable aspects that need attention.

The second part consists of intent categories, response categories and pieces of data collected as part of a transaction (like email, order number, etc.), which are specific to the dataset being analyzed. These vary by vertical domain, and even within the same domain depending on the use case.
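One way to picture the two-part node anatomy is a small data structure with a domain-agnostic category and an optional dataset-specific label. The sketch below is a hypothetical Python illustration; the class `FlowNode` and its fields are assumptions made for this example, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# Domain-agnostic categories that exist in the taxonomy but are hidden from the flow view.
HIDDEN_CATEGORIES = frozenset({"Greeting", "Goodbye", "Request More"})

@dataclass(frozen=True)
class FlowNode:
    category: str                # domain-agnostic part, e.g. "User Intent"
    label: Optional[str] = None  # dataset-specific part, e.g. "Food Delivery Estimate"

    @property
    def shown_in_flow(self) -> bool:
        return self.category not in HIDDEN_CATEGORIES

    @property
    def display_name(self) -> str:
        return f"{self.category}: {self.label}" if self.label else self.category

# Hypothetical nodes from a single conversation.
nodes = [
    FlowNode("Greeting"),
    FlowNode("User Intent", "Food Delivery Estimate"),
    FlowNode("Confusion"),
    FlowNode("Agent Transfer"),
]
visible = [node.display_name for node in nodes if node.shown_in_flow]
print(visible)
```

Keeping the two parts separate means the domain-agnostic categories can stay stable while the labels change from one dataset or vertical to the next.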

Our taxonomy continues to evolve with new use cases and needs ongoing review and tweaking to stay efficient and effective for the end product.


Filter and Search Flows 

Another challenge in visualizing natural language conversation flows is getting to the actionable pieces quickly, without having to explore every permutation and combination. This constraint gave birth to an opinionated filter abstraction that enables slicing, dicing and segmentation to derive actionable improvements.

Intents or contact reasons are the primary way to drill down into a specific subset of flows. As seen below, Food Delivery Estimate is the selected intent for the example. When conversations involve multiple intents, the additional intents show up as well.

[Figure: Flow Search and Filter]

There is also a zoom option (1x to 3x) which shows more of the top flows for that intent by frequency of occurrence. The gradual addition of more flows makes it easier to consume the information at different levels.

In addition to filtering by user intent and contact reason, dynamic filtering of the flows is supported along dimensions like Confusion, Deflection, Inferred CSAT, etc.

Lastly, power users can navigate flows through powerful keyword and semantic search on the entire conversation transcript.
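The intent and cue filters described above can be approximated with a simple predicate over conversation records. This Python sketch assumes a hypothetical record shape (a dict with `intents` and `cues` lists) and a hypothetical helper `filter_conversations`; it does not model numeric dimensions such as Inferred CSAT, nor keyword or semantic search.

```python
def filter_conversations(conversations, intent=None, required_cues=()):
    """Keep conversations that contain the intent (if given) and every required cue."""
    required = set(required_cues)
    return [
        conv for conv in conversations
        if (intent is None or intent in conv["intents"])
        and required <= set(conv["cues"])
    ]

# Hypothetical records; field names and values are illustrative only.
conversations = [
    {"id": 1, "intents": ["Food Delivery Estimate"], "cues": ["Confusion", "Agent Transfer"]},
    {"id": 2, "intents": ["Food Delivery Estimate"], "cues": []},
    {"id": 3, "intents": ["Late Delivery", "Food Delivery Estimate"], "cues": ["Confusion"]},
    {"id": 4, "intents": ["Tipping"], "cues": ["Confusion"]},
]

confused = filter_conversations(
    conversations, intent="Food Delivery Estimate", required_cues=["Confusion"]
)
confused_ids = [conv["id"] for conv in confused]
print(confused_ids)
```

Because conversation 3 carries two intents, it matches the Food Delivery Estimate filter even though its first intent is different, which mirrors the multi-intent behavior described above.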

Generalizations and compression to separate signal from noise


Messages are the components of a conversation. Initially, we modeled every single message into flows, which resulted in many different paths with insignificant variations between them.

For instance, similar agent answers might be split into multiple messages, interleaved with occasional filler messages, user acknowledgements, and clarifications with repetitions. The noise inherent in natural language conversations thus causes the number of flows to explode.

We evaluated and experimented with many generalization and compression strategies before settling on a high-level, noise-free flow view and node semantics that do not lose critical information about frustration, confusion and abandonment in the conversations.
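As an illustration of what such compression can look like, the Python sketch below drops filler categories and merges consecutive repeats of the same category while leaving cues like Confusion in place. It is only one possible strategy, shown under assumed category names; it is not necessarily the strategy Cuein selected, and `compress_path` and `FILLER_CATEGORIES` are hypothetical.

```python
from itertools import groupby

# Hypothetical set of low-signal categories to drop before building flows.
FILLER_CATEGORIES = frozenset({"Greeting", "Goodbye", "Filler", "Acknowledgement"})

def compress_path(categories, filler=FILLER_CATEGORIES):
    """Drop filler messages, then merge consecutive repeats of the same category."""
    kept = [category for category in categories if category not in filler]
    # groupby collapses each run of identical adjacent categories into one node.
    return [category for category, _ in groupby(kept)]

raw_path = [
    "Greeting", "User Query", "Agent Response", "Filler", "Agent Response",
    "Acknowledgement", "Agent Response", "Confusion", "Agent Transfer", "Goodbye",
]
compressed = compress_path(raw_path)
print(compressed)
```

Filler is removed before merging so that an agent answer split across several messages, with acknowledgements in between, still collapses into a single node.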

 

Conclusion

As language emerges as the new UI, conversational interfaces need a flow visualizer to provide transparency into breaking points. Real-world conversation flows are complex and nuanced, and need multiple design considerations to be insightful and actionable. At Cuein, we have invested years in the research and development of conversation flows in production environments and continue to explore the balance between noise and signal.