Machine learning (ML) is everywhere now. It may not be the flexible, general artificial intelligence promised by science fiction stories, but it’s a powerful alternative to rules engines and brute force image and voice recognition systems. But one big problem remains: Modern AI is composed of black-box systems that are only as good as their training data.

The underlying nature of the ML-powered modules we drop into our applications and services raises new questions about how we design our applications. If we don’t know exactly how an application makes decisions, how can we inform its operators and users? We need some way to design that mix of certainty and uncertainty into our code so that users understand that, while an application makes decisions, they still need to watch for algorithmic bias and other fundamental sources of error.

Two concepts come to the fore: one is the issue of AI explainability and the other is the more complex issue of responsible use of AI. The first is hard to deal with because the complexities of a trained deep learning neural network are very difficult to understand, let alone explain. The second is a multilayered issue that starts with your choice of data sets and training data and ends up with how you show that machine learning has been used.

Introducing HAX

If we’re to drive the use of responsible AI, then we need tools to help the entire development team understand what they’re doing and show them how their choices affect end users. We need to be able to at least minimize, if not avoid, the harm our applications can cause. That’s where Microsoft Research’s new Human-AI eXperience (HAX) toolkit comes in, providing a framework to help us choose how we work with these new technologies and how we use design to keep users informed. HAX builds on the guidelines for human-AI interaction that Microsoft researchers published in a 2019 paper, turning its research concepts into practical tools.

As developers, we often forget about design. After all, it’s just chrome to keep users happy. But a modern software development life cycle needs to embed design from the very early stages of a project, rapidly iterating design alongside code as part of a prototype-driven agile development model. User-centered design is as much a part of requirements gathering as business analysis and systems architecture.

HAX aims to provide tools that can fit into any software development life cycle, informing and guiding without changing the way you work. It’s a big challenge but a necessary one. We need to be able to bake responsible AI usage into our processes in a way that still lets us benefit from the technology. Responsible AI needs to be considered in our UI models as well. Our machine learning applications may not have a user interface but may still be part of an interaction with users.

How do we show responsible use of image recognition techniques in a computer vision-based safety tool that shuts dangerous machines down when someone wanders into a space that’s considered risky? There’s no application UI, so can we design appropriate signage as part of our application design? The approach taken by HAX should help here, as it helps you ask the right questions as part of the development process.

Working with HAX

At the heart of HAX is a set of guidelines built from 18 best practices that you can use to guide interactions between users and your systems. They’re presented as design patterns in a series of cards, color-coded by when you might want to use them in an application. The cards cover four phases of interaction: before an interaction begins, during an interaction, when something goes wrong, and over time as the system works with users for a longer period.

The various cards are presented in an interactive design library that adds examples of the various interactions. The first guideline card suggests describing what the system does, providing an effective introduction to the tool you’re presenting. Samples include showing how productivity tools use machine learning, such as how Word uses natural language processing as part of its text prediction service.

However, it’s the set of guidelines that focuses on handling uncertainty that is probably the most useful. We often assume that our applications will work perfectly and our users will understand our intent, so it’s important to have best practices that explicitly call out when things don’t work. To a certain extent, these guidelines are an argument for clarity in our UI copy and for graceful fallbacks in the event of failure.

When an AI-based system fails, it’s important for a user to understand what went wrong and why. For instance, if a chatbot doesn’t understand what a term means, it should ask for clarification and apply the user’s response as part of its extended labeling and training.
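
A minimal Python sketch of that fallback might look like the following. It is not part of HAX; the `ClarifyingBot` class, the `toy_classifier` stand-in, and the 0.6 confidence cutoff are all assumptions made for illustration. The idea is simply that a low-confidence prediction produces a plain-language question rather than a guess, and the user's answer is kept for later labeling.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

CONFIDENCE_THRESHOLD = 0.6  # assumed cutoff, not a HAX recommendation


@dataclass
class Prediction:
    intent: str
    confidence: float  # 0.0 to 1.0, as reported by the (hypothetical) model


@dataclass
class ClarifyingBot:
    """Hypothetical wrapper that asks for clarification when the model is unsure."""

    classify: Callable[[str], Prediction]
    # (utterance, user-confirmed intent) pairs saved for later labeling
    labeling_queue: List[Tuple[str, str]] = field(default_factory=list)

    def respond(self, utterance: str) -> str:
        prediction = self.classify(utterance)
        if prediction.confidence >= CONFIDENCE_THRESHOLD:
            return f"Okay, working on: {prediction.intent}"
        # Low confidence: say so plainly and ask instead of guessing.
        return (
            f'I\'m not sure what you mean by "{utterance}". '
            "Could you say it another way?"
        )

    def record_clarification(self, utterance: str, confirmed_intent: str) -> None:
        # Keep the user's answer so it can feed labeling and retraining.
        self.labeling_queue.append((utterance, confirmed_intent))


def toy_classifier(utterance: str) -> Prediction:
    # Stand-in for a real model: confident about one known phrase only.
    if "reset password" in utterance.lower():
        return Prediction("reset_password", 0.95)
    return Prediction("unknown", 0.2)


bot = ClarifyingBot(classify=toy_classifier)
print(bot.respond("Please reset password"))
print(bot.respond("pw borked"))
bot.record_clarification("pw borked", "reset_password")
```

In a real system the threshold and the wording of the question are design decisions in their own right, which is why they belong in the same conversation as the UI copy.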

HAX from the start

A tool like HAX is best used early in your development process. With machine learning a key component in an AI application, you need to understand how interactions with it affect the rest of your code. HAX includes an Excel workbook to help your team understand the impact of building responsible AI interfaces into your systems. This process involves more than the design team; it should include developers, project managers, and even business analysts.

Excel-based planning tools are a familiar part of the software engineering process, helping with requirements gathering and basic project scoping. It’s not a stretch to add a new spreadsheet to the very start of the software development life cycle with the aim of keeping it up to date through the whole process. The HAX workbook uses familiar metaphors based on T-shirt sizing to help assess the impact of adding human-AI interactions to an application.
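
To make the T-shirt sizing idea concrete, here is a rough Python sketch of the kind of roll-up a team might do with such estimates. It does not reproduce the HAX workbook; the row labels, the disciplines, and the point weights attached to each size are all hypothetical.

```python
from collections import defaultdict

# Assumed relative weights; the sizes are the metaphor, the numbers are illustrative.
SIZE_POINTS = {"S": 1, "M": 3, "L": 8}

# Hypothetical rows: (interaction to build, discipline, T-shirt size).
# The labels are illustrative, not the titles of HAX cards.
ESTIMATES = [
    ("Describe what the system does", "design", "S"),
    ("Describe what the system does", "engineering", "S"),
    ("Ask for clarification on failure", "design", "M"),
    ("Ask for clarification on failure", "engineering", "L"),
    ("Ask for clarification on failure", "data science", "M"),
]


def total_by_interaction(rows):
    """Sum the weighted sizes across disciplines for each interaction."""
    totals = defaultdict(int)
    for interaction, _discipline, size in rows:
        totals[interaction] += SIZE_POINTS[size]
    return dict(totals)


totals = total_by_interaction(ESTIMATES)
# List the cheapest interactions first so quick wins are easy to spot.
for interaction, points in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{points:>3}  {interaction}")
```

The numbers matter less than the conversation they prompt: each discipline sizes the same interaction, and the spread between those sizes shows where the real cost sits.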

Using a familiar metaphor here is important. One of the biggest problems for any cross-disciplinary development initiative is language. Engineering, design, and business teams have their own dialects. The same term may mean different things to different disciplines, and different words may say the same thing. By bringing in a model that everyone already knows, we can shortcut the often-painful process of finding a common way to work together and settle on something we all understand.

Exploring AI with the HAX Playbook

The final part of HAX, the HAX Playbook, is still under development. This is a guided tool that helps you test basic interaction scenarios for an application, providing a list of scenarios to test based on your answers to a set of questions. The initial version only supports natural language systems, but these can be the hardest to define, as conversational tools are surprisingly free-form. Suggestions include looking at how a system handles spelling errors, generates responses, and classifies data.
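
As an illustration of what testing one of those scenarios could involve, the Python sketch below checks how a system handles spelling errors. It is not generated by the Playbook; the toy `classify` function, the intent names, and the 0.75 similarity cutoff are assumptions, with the standard library's `difflib` standing in for a real language model.

```python
import difflib

# Hypothetical keyword-to-intent table for a toy assistant.
KNOWN_INTENTS = {
    "weather": "get_weather",
    "alarm": "set_alarm",
    "music": "play_music",
}


def classify(utterance: str) -> str:
    """Toy classifier: fuzzy-match each word against the known keywords."""
    for word in utterance.lower().split():
        match = difflib.get_close_matches(
            word, list(KNOWN_INTENTS), n=1, cutoff=0.75
        )
        if match:
            return KNOWN_INTENTS[match[0]]
    # Nothing recognizable: hand over to a graceful fallback path.
    return "fallback"


# Each scenario pairs an input, including misspellings, with the expected intent.
SPELLING_SCENARIOS = [
    ("what is the weather today", "get_weather"),
    ("what is the wether today", "get_weather"),
    ("set an alram for six", "set_alarm"),
    ("play some muisc", "play_music"),
    ("xyzzy plugh", "fallback"),
]


def run_scenarios():
    """Return (text, expected, actual) for every scenario that fails."""
    failures = []
    for text, expected in SPELLING_SCENARIOS:
        actual = classify(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


failures = run_scenarios()
print(f"{len(failures)} of {len(SPELLING_SCENARIOS)} scenarios failed")
```

Keeping the scenarios in a plain list makes it easy for designers and analysts, not just developers, to add the misspellings and odd phrasings they expect real users to type.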

Microsoft’s research team is working on the HAX Playbook on GitHub, so you can see how they’ve built it and what they are adding. You can even fork the repository to add questions and suggestions, focusing on your own scenarios and applications. You can think of the Playbook as an add-on to the Guidelines, a tool that shows what you might need to consider above and beyond the sample cards.

Microsoft Research is doing important work with HAX. We tend to treat responsible AI as a research topic rather than a practical discipline. But if we’re to use machine learning as part of our applications, we need to understand that it’s in fact a lot closer to home and needs to be part of our software development processes. HAX forces us to focus on the human interaction aspect of AI, to start asking questions about how our software uses AI. In doing so, it pushes developers to think more about how technology is used.