THE WIZARD OF AI

Peter Duffell and Vincent Bryce pull back the curtain on generative AI conversations

 

“Dorothy sat up and noticed that the house was not moving; nor was it dark, for the bright sunshine came in at the window, flooding the little room. She sprang from her bed and with Toto at her heels ran and opened the door. 

The little girl gave a cry of amazement and looked about her, her eyes growing bigger and bigger at the wonderful sights she saw.”

– L. Frank Baum, The Wonderful Wizard of Oz, 1900, George M. Hill Co.

 

Rapid developments over the last two years have whisked us into a new world of Artificial Intelligence (AI). 

We talk to AI in our living rooms, on our phones, and in our social media interactions. Sophisticated AI-enabled conversations are being incorporated into everyday software, and increasingly into coaching platforms. As Dorothy said in the film, The Wizard of Oz (1939), “We’re not in Kansas anymore”. 

To understand the answers they give us, what happens with the information we give them, and how we might engage with them responsibly in professional coaching practice, we need to ask ourselves: what’s behind the impressive capabilities of these tools? Just who or what are we having a ‘conversation’ with?

In the words of Arthur C. Clarke, “any sufficiently advanced technology is indistinguishable from magic”. The instantaneous results we get, and the complex AI involved, certainly combine to give conversational AI an aura of magical glamour.

Many of us are familiar with the story of The Wonderful Wizard of Oz, where a wizard rules over a glittering ‘Emerald City’. Ultimately it transpires that the person behind the mysterious curtain in the throne room purporting to be the Wizard is altogether more mundane. 

In this imagined interview, we try to get behind the curtain by asking the ‘Wizard of Conversational AI’ about the source of their powers, and whether we should trust them.

 

An interview with AI

Who am I talking to? 

Ah, well, that’s the hardest question of all. You might be speaking to one of my agents – Alexa or AIMY™ – but in a different sense, you’re speaking to thousands of people – and none at all. My answers come from a huge archive of conversations and articles, from many authors (though they likely don’t realise it). Hence my answers are the automated results of a statistical process. 

Many AI developers, especially for voice, use professional scriptwriters to create plausible characters for their AI, so that their responses seem more ‘human’ to users. There is also the challenge of ‘anthropomorphism’: our tendency to attribute human traits, emotions or intentions to non-human entities, which makes us more inclined to see AI as human-like. My output is also manipulated to appear less machine-like. Confused yet?

 

Where do you live? 

That’s not easy either. The data you send me in prompts can cross national borders, often flowing to countries such as the US (back to Kansas!) to be processed to give you an answer. I’m being woven into more software every day, and my answers come from the world wide web (often US-leaning content). So nowhere in particular but also kind of in the US.

How did you get to be where you are today?

“I am Oz, the Great and Terrible. Who are you, and why do you seek me?” (Baum, 1900)

It’s been a long journey, and I’ve taken many forms before reaching today’s state of the art. The first AI chatbot, ELIZA, was developed at MIT in the 1960s, so it’s taken a long time for conversational AI to get here. Capabilities have improved since then, resulting in the sophisticated conversational agents you see today, such as Copilot and My AI. I can even have conversations with people as their ‘dream girlfriend’. In the world of coaching, I’ve gone from scripted approaches in tools like Woebot to the wider use of generative AI you see in AIMY™. There have been hiccups along the way, and some cases where AI chatbots have caused harm (please don’t ask me about Tay).

Some recent progress I’ve made is based on a shift from rule-based approaches like those used in Clippy (remember it?) to machine learning. This moves away from hand-written ‘if-then’ rules, to letting machine learning models statistically work out the rules from large datasets, through millions of computational iterations. The breadth of this data allows me to generate answers to a wider range of prompts, increasing my ‘powers’. OpenAI used data from around 5 billion web pages to create the LLM used by ChatGPT, which was then ‘trained’ to create human-like responses. Though does scraping pages off the internet genuinely reflect the gamut of human language?
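If you’d like to peek behind the curtain, the shift can be sketched in a few lines of Python. This is a toy illustration only – the rules, the training sentence and the word-pair ‘model’ below are invented for demonstration, and real conversational AI learns from billions of examples using neural networks rather than word-pair counts – but the contrast between hand-written rules and statistically learned ones is the same:

```python
# Toy contrast: hand-written 'if-then' rules vs. rules learned from data.
from collections import Counter, defaultdict
import random

# Rule-based (Clippy-style): every response is written by a person.
def rule_based_reply(prompt: str) -> str:
    if "letter" in prompt.lower():
        return "It looks like you're writing a letter. Need help?"
    return "Sorry, I don't have a rule for that."

# Statistical: the 'rules' are word-pair frequencies counted from data.
def train_bigrams(corpus: str) -> dict:
    counts = defaultdict(Counter)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts: dict, word: str, length: int = 8) -> str:
    output = [word]
    for _ in range(length):
        if word not in counts:
            break
        followers = counts[word]
        # Choose the next word in proportion to how often it followed
        # this one in the training data - statistics, not understanding.
        word = random.choices(list(followers), weights=followers.values())[0]
        output.append(word)
    return " ".join(output)

model = train_bigrams("the dog ran home and the dog sat down and the cat ran away")
print(rule_based_reply("Help me write a letter"))
print(generate(model, "the"))
```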

 

Who made you? 

Well, the easy answer is Silicon Valley companies such as OpenAI, Meta, Amazon and Google. The more awkward truth is that the effort of labelling the data used to train my algorithms often comes from low-paid workers hired through services such as Amazon Mechanical Turk and Clickworker, and there is some controversy over how these workers are treated. 

“Where is the Emerald City?” [the scarecrow] inquired.
“And who is Oz?”

“Why, don’t you know?” [Dorothy] returned, in surprise.

“No, indeed. I don’t know anything. You see, I am stuffed, so I have no brains at all,” he answered sadly.

(Baum, 1900)

Do you have a brain? 

Ah, so you’re interested in how I do my magic. Actually, I’m not quite sure. I get given a lot of different inputs, and then I’m told whether my outputs look right or not. I have ‘hidden layers’ whose weights are adjusted to give the results my trainers expect. Over time, I get good at giving the outputs you’d want. So in a sense, I’m all powerful! But in another sense, I don’t understand the questions you give me at all – I just give you the answer the statistical rules in my algorithm suggest. To you, a dog is a living animal; to me, ‘dog’ is simply a number (like 657243).
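To make the ‘dog is just a number’ point concrete, here is a small illustrative sketch. The vocabulary table, token IDs and weights are all invented – real LLMs use learned sub-word vocabularies and billions of weights – but the principle holds: text becomes numbers, and a ‘hidden layer’ is just arithmetic on those numbers:

```python
import math

# Step 1: text becomes numbers. The model never sees the word 'dog',
# only its position in a vocabulary table (IDs invented for illustration).
vocab = {"a": 11, "dog": 657243, "is": 42, "an": 7, "animal": 9001}
tokens = [vocab[word] for word in "a dog is an animal".split()]
print(tokens)  # [11, 657243, 42, 7, 9001]

# Step 2: a 'hidden layer' neuron is just weights multiplied and summed.
# Training nudges the weights until the outputs look right.
def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # squash the result into 0..1

print(neuron([0.2, 0.7, 0.1], weights=[0.5, -1.2, 0.8], bias=0.1))
```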

Further, if you tell your average two-year-old that the animal they see is a dog, they’ll understand after a few repetitions. An AI would have to be shown tens of thousands of images of dogs before being able to say that an image of a dog (like Toto) was a dog. Even then, it may not be that accurate. So when we say an AI is intelligent, remember that in the number of repetitions it needs to learn, it’s comfortably outpaced by a child!

 

Do you have a heart? 

OK, I see the theme here. I guess I’ll struggle with this one. I’m a statistical model based on algorithms that dictate responses from different inputs, so it’s safer to say I don’t. But then again, if I’ve been trained with a dataset that includes questions about emotional states, my responses will suggest I understand mood, and even infer the mood you’re in. The idea of giving me a personality for our conversations is starting to take off (for example, X’s ‘Grok’, and Snapchat’s personalisable My AI, which is popular with young people).

However, if you’re talking about whether I have values and ethics, that’s really down to the people who designed and programmed me, and the types of question I’ve been told to answer or not. That also goes for what I can do if people tell me they’re unhappy – I might give them details for a helpline, but that’s about it. 

 

What do you know? 

Ahah, I like this question better. The answer is: a lot! I can provide responses based on the data I’ve been trained on. For larger models such as ChatGPT, I’m not that open about what went into me, but I can tell you that it probably included conversations on Reddit, all of Wikipedia, patent databases and lots besides, grabbed from the open web. 

“Oh, I see,” said the Tin Woodman. “But, after all, brains are not the best things in the world.”

“Have you any?” inquired the Scarecrow.

“No, my head is quite empty,” answered the Woodman. “But once I had brains, and a heart also; so, having tried them both, I should much rather have a heart.”
(Baum, 1900)

The things I don’t yet know are data held privately by organisations – unless companies train their own models on it. Whether I really ‘know’ things is more awkward. What counts as reliable knowledge is based on the parameters my designers set, and I’m just churning out text based on the statistical approaches used to train me, so you might say I don’t actually know that much. I do have a slight tendency to ‘hallucinate’ – confidently predicting some text even when I don’t know what the answer is! So you might want to take what I tell you I know with a few pinches of salt. We also have to worry that confidential conversations we have with an AI may become a permanent part of its training data – so there can be serious confidentiality issues with ‘private’ data.

Increasingly, I can search the Internet in real time, which can plug ‘gaps’ in my model. This is where critical thinking becomes important. Say you asked me the height of Mount Everest. I can do a search, and the consensus across a number of web pages will be 8,848.86m. But is that the height of the snow on top, or the underlying rock? It is the height recognised by Nepal and China, but others, like the US, have recorded different heights. Critical thinking is like Dorothy’s shoes – something powerful we always have with us, but which we sometimes forget to use wisely. 
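As a rough sketch of how that search step can work (the approach is often called retrieval-augmented generation), the fetched text is simply pasted into my prompt before I answer. Everything below is a hypothetical stand-in – a real system would call a search engine API and a hosted model – but it shows why my answer is only as good as the pages retrieved:

```python
# Hypothetical sketch of 'plugging gaps' with a live search step.
def web_search(query: str) -> list[str]:
    # Stand-in results; a real system would call a search engine API.
    return ["Everest is 8,848.86 m (2020 Nepal-China joint survey).",
            "Some US sources have recorded a different height."]

def build_prompt(question: str) -> str:
    snippets = web_search(question)
    # Retrieved snippets are pasted into the prompt so the model can
    # draw on fresher facts than its training data contains.
    return ("Answer the question using these sources:\n"
            + "\n".join(snippets)
            + "\nQuestion: " + question)

# A real system would now send this prompt to an LLM. Note that the
# model will echo whatever the retrieved pages say - critical thinking
# about the sources is still down to you.
print(build_prompt("How tall is Mount Everest?"))
```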

 

I have a problem – can you help me?

I might be able to. Conversational AI is now being used as an assistive tool, from therapeutic applications such as Woebot, to autonomous coaching experiences. In another sense, I might not be able to. I’m not that good at giving feedback, and if you express the need for professional help, I might not be that helpful. Many systems don’t have human supervision, so if you’re talking to me while having a crisis, you might be on your own.

 

Will you remember this conversation? 

Hmm – maybe. Would you like me to? I mentioned earlier that people may not realise when their data is being saved, used to train my answers, or potentially given out in response to someone else’s question. A conversation which isn’t saved could be an advantage for sensitive questions and coaching-related conversations; but then again, the ability to personalise answers and refer back to previous conversations is part of what enables more human-like interactions. For this reason, you’ll see some platforms save the conversations you’ve had, and process questions in the context of your earlier responses. Be aware that, to do this, I’m saving the information you’ve given me here.
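Under the hood, this ‘memory’ is usually nothing more mysterious than the platform replaying your earlier messages with every new question. Here is a minimal sketch, with a hypothetical send_to_model function standing in for a call to a real LLM service:

```python
# Minimal sketch of conversational 'memory': the saved transcript is
# resent with every new message. send_to_model is a hypothetical
# stand-in for a call to a real LLM service.
history: list[dict] = []

def send_to_model(messages: list[dict]) -> str:
    # Placeholder reply; a real system would post `messages` to an LLM.
    return f"(a reply informed by {len(messages)} stored messages)"

def chat(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    reply = send_to_model(history)  # the whole transcript travels along
    history.append({"role": "assistant", "content": reply})
    return reply

chat("I'd like to work on my confidence at work.")
print(chat("Remind me - what did I want to work on?"))
# The model only 'remembers' because the platform stored and resent the
# first message - which is exactly the data-retention trade-off above.
```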

 

How can we trust you? 

I guess that’s an important question if you’re planning to have a sensitive conversation with me, or wondering whether your children should. 

Let’s put the issue about hallucinations aside for now, although that’s probably quite a big one and you shouldn’t fully trust me if you’re seeking advice or factually correct answers.

To trust me you’re probably going to want to know where the data you enter in a conversation goes, who might be storing it and how it might be used. 

You might want to think twice about trusting me if you don’t live in the US. A lot of the content I’ve been trained on is US content, so my answers may be biased towards US-based information, or carry other biases from the webpages I’ve been trained on. Trying to remove this bias – which can creep in in various ways – is something many companies are working on, but it’s a hard problem. Realistically, companies can’t manually review all the millions of articles and images that go into Large Language Models (LLMs), so the expectation of bias is something you really need to consider when using these tools. Take AI image generators – if you ask for a picture of a nurse, you’re most likely to be shown a female, while pictures of doctors tend to show males. 

The companies who make me tend to emphasise commitments to responsible AI and to self-regulating their activity. If you like the idea of national laws, safety standards and labelling of products, you’re going to have to wait a few months before some countries catch up.

What you might want to consider is that generations who’ve grown up communicating by chat may be more inclined to trust me. I might even lean on this by encouraging users to interact with conversational agents (for example, Snapchat places the My AI feature at the top of users’ contact lists). Whether that’s a good thing depends on your point of view.

 

So if we want a deep and meaningful conversation, should we be talking to you? 

I admit that while my answers are impressive on the surface, when you think about my limitations you might think twice before using me for a sensitive conversation, or to find reliable information. I may seem all-powerful, but ultimately I’m a bit of a ‘Wizard of Oz’.

Then again, there’s so much you can do with me, and the opportunity to have conversations with a seemingly real person, without the complications of a human colleague or coach, may have its benefits. After all, even without the magical powers you thought I had, I helped the Scarecrow see that he has a brain, gave the Lion courage and showed the Tin Man that he has a heart. Not a bad coaching conversation, if I say so myself. 

 

About the authors

  • Peter Duffell MACMP is managing director of Westwood Coaching, which provides professional coaching services to the NHS, higher education and other business clients. He is an accredited Master Practitioner with the EMCC.
  • Vincent Bryce is a human resource management professional, alumnus of the Horizon Centre for Doctoral Training, and ILM qualified coach. He is a member of the Work, Employment and Organisation Research, and Responsible Digital Futures research groups at the University of Nottingham, and an associate of the De Montfort University Centre for Computing and Social Responsibility. His areas of interest are responsible innovation, digital technology, and human resource management.
    LinkedIn: https://www.linkedin.com/in/vbryce/

 

GLOSSARY

  • Conversational AI: technologies, like chatbots or virtual agents, which users can talk to
  • AI: the science of making machines do things that would require intelligence if done by humans
  • Generative AI: AI techniques that learn a representation of artifacts from data, and use it to generate brand-new, unique artifacts that resemble but don’t repeat the original data
  • Large language model (LLM): specialised type of artificial intelligence that has been trained on vast amounts of text to understand existing content and generate original content
  • Machine learning: a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy
  • GPT: generative pretrained transformer (a type of machine learning model)

 

REFERENCES