AI user experiences beyond Chat
When our Usable AI™ team tries to explain the impact of generative AI as we perceive it, we often use a metaphor popularized by Jensen Huang, the CEO of NVIDIA:
This is the iPhone moment of AI.
When the iPhone came out, it altered the way we interact with technology for good. Apple placed a capable, touch-enabled mini computer in our pockets, with ubiquitous internet connectivity, a fully fledged operating system, and an app store. And so the comparison might feel quite appropriate.
However, while it might feel like the iPhone dramatically changed the way we interact with digital systems, our primary interaction mode hasn’t actually changed all that much, especially if we look a bit further back in time.
How we interact with computers
In the not-so-distant past, ‘interacting with computers’ meant defining work to be carried out by a machine in the form of batches of instructions. These instructions contained complete workflows, which the computer would execute and then, often hours or days later, return the output.
The revolution, the one that might help put the iPhone into perspective, occurred some time later with the advent of time-sharing and interactive terminals. For the first time, this allowed a much more interactive command-response model: users could issue commands, receive responses nearly instantaneously, and modify future commands based on feedback from previous ones. All of that happened first in text and later using a mouse and GUIs.
This interaction model has persisted and still shapes how we work with computers, including the iPhone, today. While our devices have become more elegant, smaller, and faster, the fundamental command-response interaction has remained the same.
Until now.
Expressing intent
With the arrival of chat as an interface to powerful large language models (LLMs) — most famously ChatGPT — we are poised to take another revolutionary step in how we interact with computers.
AI introduces a novel mechanism: users can now simply express intent in natural language, and a statistical model determines how to fulfill it. This paradigm shift moves us from providing concrete, detailed instructions to simply defining a desired outcome in fuzzy, human terms.
ChatGPT has excited millions in record time, among other reasons because it seemingly fulfills the promise of Siri and Alexa, allowing users to converse with a machine in natural human language about, well, pretty much anything.
But as we start to come to terms with this new interaction paradigm, we are also slowly starting to understand some of its embedded challenges.
The bad
Humans are stateful
Our perception of the world is shaped by past experiences, learned knowledge, and inherited understanding. When interacting with any external system, digital or otherwise, we bring this entire background with us. This is true whether we are dealing with another human or a machine.
Product designers have adapted to this particular aspect of serving human users over time. The skeuomorphic design trend, also popularized by the release of the iPhone, in which digital interfaces mimicked their real-world counterparts, may well have aided the adoption of the first generation of apps by creating familiar experiences.
Good design bends to path dependence, especially when it extends across generations. The QWERTY keyboard layout, designed in the 19th century for mechanical typewriters, persists on digital keyboards to this day, despite the absence of the original constraints. Yesterday’s learnings and expectations shape our interaction with tomorrow’s tools — whether it’s the iPhone or ChatGPT.
Growing pains
Generative AI is evolving rapidly — it’s hard to remember that it’s been only a year since ChatGPT entered the world stage. That is a short time, especially for radically new technologies, which tend to experience growing pains. Far too often, new tools turn out to be hard to use, unintuitive, or ill-suited to their promised capabilities when they are first released. Sometimes these challenges are overcome, and sometimes they end up stopping a new product in its tracks.
It took years for the automotive industry to realize that large touch displays might not be the best (sole) way to interact with a digital system while driving a heavy vehicle at high speed. Physical dials and buttons, providing haptic feedback without requiring visual attention, are hard to replace with screens. What’s more, they are often more enjoyable to use in an environment that’s all about sensory sensations. But after a while, companies veered off the new path and started balancing dynamic digital capabilities with the need for tangible, intuitive interaction.
The blank canvas problem
ChatGPT has made large language models accessible, but did so by introducing a blank canvas with absolute freedom of choice. While this is great for exploration, it is less effective for users who want to carry out specific tasks quickly and with minimal overhead. OpenAI recognized early on that their interface had inadvertently introduced an overabundance of choice, and added hints and usage examples to help users get started.
Usability heuristics
And that’s not the only usability issue a conversational interface for LLMs introduces. ChatGPT and friends fall short with regard to a number of well-established, universal usability heuristics:
What’s the system status of ChatGPT?
A truly usable design keeps users informed about what’s happening through appropriate, timely feedback. With ChatGPT, determining the system status at a given moment can be challenging due to its 'wait and see' approach and non-deterministic nature.
What are ChatGPT’s primary functions?
Designers have learned to design for recognition rather than recall.
The Bosch IXO power tool is designed for maximum recognition, with clear affordances and apparent functionality — unlike ChatGPT. Good design minimizes cognitive load by making actions and options visible, guiding users through the application or service.
How does ChatGPT help you when things go wrong?
Any good design aims to prevent errors before they occur and to enable users to recognize, diagnose, and recover from errors if they occur anyway. With ChatGPT, recovering from errors often means starting over. The handling of errors — whether it’s bad user input or a system broken by surging demand — is reduced to do-overs.
The good
ChatGPT and large language models are remarkable tools, but they come with issues when we move beyond basic tasks like writing poems about cats. Successful use requires asking the ‘right (metaphorical) questions’, and even then, ‘answers’ may be untruthful or irrelevant. The challenge is dealing with the uncertainty of probabilistic results and the limitations of natural language interfaces. That being said, for users with ‘the right questions’, LLMs provide near-magical experiences.
The system matches the real world
Natural language chat lowers the barrier to entry more than any other interface. ChatGPT-like systems can explain, rephrase, or summarize to meet users halfway, accommodating users with varying levels of experience.
Minimal aesthetics, minimal overhead
Interfaces should avoid irrelevant or rarely needed information. Every additional unit of information competes with the most relevant ones, reducing their visibility. ChatGPT’s interface, when used with the ‘right questions’, excels in efficiency and focus by adapting to new use cases with familiar formatting choices.
Accelerate, personalize, customize
Chat-based systems, while constrained in some aspects, offer incredible flexibility in others. Power users can customize the output extensively, from language choice to response brevity and typographic preferences.
How to keep the good and leave the bad
As we start building the first generation of AI-powered products and services enabled by large language models, we need to focus on creating experiences that are AI-enhanced, intent-driven, and more human-centric. Over the past 12 months we have identified a few tools that help us do just that:
Wrapping AI
Excel is a powerful tool capable of running a large company's sales division if used right. However, you are unlikely to find a large company running its sales off Excel, as the generic nature of the software is bound to lead to inefficiency. In contrast, custom experiences like HubSpot, tailored for specific challenges, tend to offer greater efficiency and utility.
Adding natural language interfaces to existing products, or enhancing them with generative AI capabilities, enables us to minimize our reliance on text inputs and to reduce guesswork on the part of the user.
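To make this a little more concrete, here is a minimal sketch of what ‘wrapping AI’ could look like in code. It assumes the OpenAI Python SDK and a hypothetical create_invoice action that already exists in the product; the model name, prompt, and action are placeholders, not a prescription. The point is that the natural-language request is translated into a structured call to an existing product function rather than into free-form text shown to the user:

```python
import json
from openai import OpenAI

client = OpenAI()

# Describe an existing product action so the model can map intent onto it.
tools = [{
    "type": "function",
    "function": {
        "name": "create_invoice",  # hypothetical action in our product
        "description": "Create a draft invoice for a customer",
        "parameters": {
            "type": "object",
            "properties": {
                "customer": {"type": "string"},
                "amount_eur": {"type": "number"},
                "due_date": {"type": "string", "description": "ISO 8601 date"},
            },
            "required": ["customer", "amount_eur"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any tool-calling-capable model will do
    messages=[{"role": "user", "content": "Bill Acme Corp 1200 euros, due end of March"}],
    tools=tools,
)

# The model returns structured arguments; the existing product logic and UI
# take over from here instead of presenting the user with generated prose.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```

The user expresses intent in plain language, but the product still executes a well-defined, pre-validated action.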
Masking AI
Developers have used OpenAI's API to generate structured, properly formatted JSON outputs since the tool was first announced. The desire to incorporate GPT capabilities into applications, enhancing functionality while minimizing direct user interaction with the natural-language portion of LLMs, is palpable. AI can simplify and enhance processes that involve human input, like customer support automation and product recommendations. By integrating AI more deeply into applications, we can leverage its capabilities without overwhelming users.
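As a rough sketch, again assuming the OpenAI Python SDK (model name and prompts are illustrative placeholders), the natural-language layer can be hidden entirely behind the application, for instance to feed a conventional recommendation widget:

```python
import json
from openai import OpenAI

client = OpenAI()

# JSON mode keeps the model's output machine-readable; the user never writes
# or reads any natural language, only the resulting suggestions in the UI.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any JSON-mode-capable model will do
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return JSON with a 'recommendations' array of product names, most relevant first."},
        {"role": "user", "content": "Customer just bought hiking boots and a rain jacket."},
    ],
)

recommendations = json.loads(response.choices[0].message.content)["recommendations"]
```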
Managing uncertainty
Dealing with the uncertainty inherent in probabilistic models will increasingly prove a major challenge as we integrate LLMs and friends into our products and services. Designing interfaces that enable users to navigate this uncertainty effectively is crucial for creating usable AI-powered systems.
User interfaces for these systems will succeed if they manage to be truly straightforward, help accommodate unexpected errors, and enable decision-making on the part of the user regardless. The focus will be — at least in the midterm — on helping users achieve their goals efficiently, especially when dealing with errors.
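One simple pattern, sketched below with made-up data and a hypothetical catalog, is to validate probabilistic output before it ever reaches the user and to degrade gracefully when validation fails:

```python
import json

def safe_recommendations(raw_model_output: str, catalog: set[str]) -> list[str]:
    """Validate probabilistic model output before showing it in the UI."""
    try:
        candidates = json.loads(raw_model_output).get("recommendations", [])
    except (json.JSONDecodeError, AttributeError):
        return []  # malformed output: fall back rather than guess

    # Keep only suggestions that actually exist in the product catalog,
    # silently dropping anything the model may have hallucinated.
    return [item for item in candidates if item in catalog]

catalog = {"hiking boots", "rain jacket", "trekking poles"}
print(safe_recommendations('{"recommendations": ["trekking poles", "jetpack"]}', catalog))  # ['trekking poles']
print(safe_recommendations("not valid json", catalog))  # [] -> the UI falls back to manual search
```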
The better
As we explore beyond chat interfaces in AI-enabled products, attempting to deliver real value through real products, we might want to try to check a few new boxes:
Anticipating user needs
AI enables us to offload tasks and guess likely next steps, prompting users to choose rather than ask ‘the right questions’ or find their way through a complex UI. A focus on anticipating user needs will significantly enhance the usability of AI-powered applications.
Deep personalization and context awareness
AI offers the opportunity for deep personalization and truly context-aware systems, perhaps possible to this extent for the first time ever. Imagine a unique user interface for every individual, shaped by personalization that influences both content access and navigation.
As we develop the next generation of applications, it's crucial to consider AI as a tool for enhancing user experience, enabling intent-driven interactions, and providing more human-centric solutions. And a lot of this will happen far from chat.
How to design AI-enabled products that meet real user needs
Much of the excitement around generative AI is driven by technological developments, often to the detriment of actual customer and user experiences. To address this issue, we have developed the Usable AI™ framework. By emphasizing real, validated challenges and the opportunities unlocked by generative AI from the outset, we can ensure that we select the appropriate method and channels — whether it be a conversational UI, wrapped or masked AI, or an entirely different approach.
If you want to learn more about the framework, check out our webinar on the topic or book a meeting with Aki, our Head of AI.