Can we rely on AI?

by Joseph K. Clark

As artificial intelligence (AI) systems grow increasingly complex, they are being used to make forecasts, or rather to generate predictive model results, in more and more areas of our lives. At the same time, concerns about reliability are rising as margins of error widen in elaborate AI predictions. How can we address these concerns?

According to Thomas G. Dietterich, professor emeritus and director of intelligent systems research at Oregon State University, management science offers a set of tools that can make AI systems more trustworthy. During a webinar on the AI for Good platform hosted by the International Telecommunication Union (ITU), Dietterich told the audience that the discipline that brings human decision-makers to the top of their game could also be applied to machines.

Why is this important? Because human intuition still beats AI hands-down at making judgment calls in a crisis. People – especially those working in their areas of experience and expertise – are more trustworthy. Studies by University of California, Berkeley scholars Todd LaPorte, Gene Rochlin, and Karlene Roberts found that certain professionals, such as air traffic controllers or nuclear power plant operators, are highly reliable even in high-risk situations. These professionals develop a capability to detect, contain, and recover from errors, and they practice improvisational problem-solving, said Dietterich.

This is because of their “preoccupation with failure”. They are constantly watching for anomalies and near-misses – and treating those as symptoms of a potential failure mode in the system. Anomalies and near-misses, rather than being brushed aside, are then studied for possible explanations, generally by a diverse team with wide-ranging specializations. Human professionals bring far higher levels of “situational awareness” and know when to defer to each other’s expertise.


These principles are helpful when thinking about how to build an entirely autonomous and reliable AI system, or how to design ways for human organizations and AI systems to work together. AI systems can acquire high situational awareness by integrating data from multiple sources and continually reassessing risks. However, while adept at situational awareness, current AI systems are less effective at anomaly detection and cannot explain anomalies or improvise solutions.

More research is needed before an AI system can reliably identify and explain near-misses. We have strategies to diagnose known failures, but how do we diagnose unknown failures? What would it mean for an AI system to engage in improvisational problem-solving that can extend the space of possibilities beyond the initial problem the system was programmed to solve?

Shared mental model

Where AI systems and humans collaborate, a shared mental model is needed. AI should not bombard its human counterparts with irrelevant information, and it must also understand and predict the behavior of human teams. One way to train machines to explain anomalies or deal with spontaneity could be to look to the performing arts. Researchers and musicians at Monash University in Melbourne and Goldsmiths, University of London explored whether AI could perform as an improvising musician in a phantom jam session.

Free-flowing, spontaneous improvisations are often considered the most authentic expression of creative artistic collaboration among musicians. “Jamming” requires musical ability, trust, intuition, and empathy toward one’s bandmates. The study used four systems. The first, called “Parrot”, repeats whatever is played. The second system autonomously plays notes regardless of a human musician’s contribution. The third also features complete autonomy but counts the number of messages being played by the human musician to define the energy of the music. The fourth and most complicated system builds a mathematical model of the human artist’s music.
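To make the four designs concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: notes are written as MIDI note numbers, all function and class names are hypothetical, the third system's "energy" is reduced to simply matching the number of notes heard, and the fourth system's mathematical model is stood in for by a first-order Markov chain, which is only one possible reading.

```python
import random
from collections import defaultdict


def parrot(heard):
    """System 1 ("Parrot"): repeat whatever the human just played."""
    return list(heard)


def independent(length, scale, rng):
    """System 2: play notes from a scale, ignoring the human entirely."""
    return [rng.choice(scale) for _ in range(length)]


def energy_follower(heard, scale, rng):
    """System 3: choose notes autonomously, but let the number of
    incoming notes set the "energy" (here: how many notes to play)."""
    return [rng.choice(scale) for _ in range(len(heard))]


class MarkovPlayer:
    """System 4 (assumed form): a first-order Markov chain over notes."""

    def __init__(self):
        # Maps each note to the notes that have followed it so far.
        self.transitions = defaultdict(list)

    def listen(self, heard):
        for current, following in zip(heard, heard[1:]):
            self.transitions[current].append(following)

    def play(self, start, length, rng):
        note, reply = start, []
        for _ in range(length):
            options = self.transitions.get(note)
            if not options:  # nothing has ever followed this note
                break
            note = rng.choice(options)
            reply.append(note)
        return reply


if __name__ == "__main__":
    rng = random.Random(0)         # seeded so runs are repeatable
    scale = [60, 62, 64, 65, 67]   # hypothetical five-note scale
    phrase = [60, 62, 64, 62, 60]  # what the human played

    player = MarkovPlayer()
    player.listen(phrase)
    print(parrot(phrase))
    print(independent(4, scale, rng))
    print(energy_follower(phrase, scale, rng))
    print(player.play(60, 8, rng))
```

Only the last player adapts its note choices to what the musician actually plays; the others either copy, ignore, or merely count the input.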

The fourth system listens carefully to the musician’s playing, creates a statistical model of the notes and their patterns, and even stores chord sequences.

Beyond this human-AI jamming approach, Dietterich sees two promising strategies to improve and mathematically “guarantee” trustworthiness. The first is a competence model that computes quantile regressions to predict AI behavior, using the “conformal prediction” method to make additional corrections. Yet this approach requires large amounts of data and remains prone to misinterpretation.
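The conformal correction can be illustrated in a few lines of Python. The sketch below uses split conformal prediction with absolute-error scores, a simpler relative of the quantile-regression approach Dietterich describes, so it should be read as an illustration rather than his method. The `predict` function and the calibration data are hypothetical stand-ins for a competence model and its held-out examples.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the finite-sample-corrected (1 - alpha) quantile of the
    calibration scores, or infinity if there are too few of them."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def calibrate(predict, calibration_set, alpha=0.1):
    """Score each held-out (x, y) pair by its absolute prediction error
    and return the interval half-width for miscoverage level alpha."""
    scores = [abs(y - predict(x)) for x, y in calibration_set]
    return conformal_quantile(scores, alpha)


def predict_interval(predict, x, half_width):
    """Wrap a point prediction in a symmetric interval."""
    center = predict(x)
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical model and calibration data, for illustration only.
    def predict(x):
        return 2 * x

    calibration_set = [(i, 2 * i + (i % 5) - 2) for i in range(20)]
    half_width = calibrate(predict, calibration_set, alpha=0.1)
    print(predict_interval(predict, 10, half_width))
```

If the calibration examples and future inputs are exchangeable, an interval built this way contains the true value with probability of at least 1 − alpha. With too few calibration points the half-width becomes infinite, which is one way the method's appetite for data shows up in practice.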

The other strategy is to have autonomous systems deal with their “unknown unknowns” via open category detection. For instance, a self-driving car trained on European roads might have problems with kangaroos in Australia. An anomaly detector using unlabelled data could help the AI system respond to surprises more effectively. As AI is deployed in more and more areas of our lives, it is becoming clear that, far from a nightmare scenario of machines taking over, the only way to make AI more reliable and more effective is a tighter-than-ever symbiosis between human systems and AI systems. Only then can we truly rely on AI.
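The anomaly-detector idea mentioned above can be sketched with a simple distance-based novelty score. This is a generic k-nearest-neighbour detector chosen for illustration, not the specific open category detection method Dietterich presented. The two-dimensional feature vectors, the function names, and the 95% threshold are all assumptions.

```python
import math


def knn_score(point, reference, k=3):
    """Anomaly score: mean distance to the k nearest reference points."""
    distances = sorted(math.dist(point, r) for r in reference)
    return sum(distances[:k]) / k


def fit_threshold(reference, k=3, quantile=0.95):
    """Pick a threshold from the unlabelled reference data itself by
    scoring every point against all the others (leave-one-out)."""
    scores = []
    for i, point in enumerate(reference):
        others = reference[:i] + reference[i + 1:]
        scores.append(knn_score(point, others, k))
    scores.sort()
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_novel(point, reference, threshold, k=3):
    """Flag inputs that sit unusually far from everything seen before."""
    return knn_score(point, reference, k) > threshold


if __name__ == "__main__":
    # Hypothetical features of objects seen during training.
    familiar = [(i / 4, j / 4) for i in range(5) for j in range(5)]
    threshold = fit_threshold(familiar)
    print(is_novel((0.6, 0.4), familiar, threshold))  # close to training data
    print(is_novel((5.0, 5.0), familiar, threshold))  # the "kangaroo"
```

A flagged input does not tell the system what the object is, only that it lies outside what was seen in training, which is the cue to slow down or hand control to a human.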
