
Taming AI – Peter Leonard – S1.10

Peter Leonard

In this episode my guest is Peter Leonard. He is a guru in the legal data and tech space and one of my primary go-to people when I am thinking about data governance, data sharing, regulation, and AI ethics.

Today we are talking about some of the issues that folks are going to need to think about as we all wing our way into the AI era.

Episode link
RSS feed

Episode information links:
Five Safes Framework
UNSW Data Governance


 Kate Carruthers [00:00:00]:

Hello and welcome to another episode of the Data Revolution podcast. I’m Kate Carruthers, and today my guest is Peter Leonard. Now, he is a remarkable chap. I’ve known him for aeons. He’s a data and technology business consultant and lawyer. But that’s not all he is. He was a founding partner of Gilbert + Tobin lawyers, and he led its technology and data practice over many years. He also works at UNSW as a professor of practice across the schools of management and governance and information systems and technology management. He has his own company now called Data Synergies, and I’ve worked with him on looking at how we can manage data better and how we can manage AI better. So he’s an all-round good chap and we’re going to have a chat. Welcome, Peter Leonard. It’s great to have you on the podcast.

Peter Leonard [00:00:59]:

It’s great to be here, Kate.

Kate Carruthers [00:01:01]:

So what are we going to talk about today?

Peter Leonard [00:01:04]:

Well, I thought we might talk about how AI is changing the world as we know it, and in particular, the kinds of challenges I’m seeing around how you ensure safe and responsible uses of AI. Layered on top of all of those challenges we’ve been looking at over the last couple of years around respectful uses of information about individuals. And by respectful, I mean use that respects their rights in privacy and is not excessive surveillance. And we weren’t doing very well at…

Kate Carruthers [00:01:48]:

…that even before AI, were we?

Peter Leonard [00:01:52]:

Well, exactly. And that’s kind of the point, right? We’re in the middle of a work in progress that we’ve not been doing very well: building data privacy and data security by design and default. That hard work, as you and I know, is a work in progress; some people are taking it more seriously than others, and some organizations don’t seem to be doing very well at all. And in the middle of all of that, everyone is deploying AI in various ways, including generative AI. That’s coming into many organizations by stealth, and in circumstances where CIOs and other responsible people like you are having difficulty keeping control of who’s doing what with that AI in their organizations.

Kate Carruthers [00:02:51]:

So what are some of the issues that you’re seeing emerging in this space, particularly from a privacy and data protection perspective?

Peter Leonard [00:02:59]:

I suppose the first thing is that many people are gaily prompting public generative AI applications like ChatGPT, Google Bard, and Microsoft Copilot with personal information relating to individuals, without consideration of how that information is leaving their organization and how it might be used.

Kate Carruthers [00:03:36]:

I had a startup come and pitch me the other day and they were using that, and I just said, please stop. Please don’t do that with your proprietary commercial information.

Peter Leonard [00:03:44]:

Yeah, and look, it’s an interesting question, because on the one hand, you might say, look, this is just a transitional educational issue of people needing to understand that they shouldn’t be doing that stuff. And the other reason that you might regard it as a transitional issue is that you and I know that fairly quickly we will see large language models made available within institutions like UNSW, where it’ll be a local instance of the large language model that ensures that the data doesn’t leave the organization.

Kate Carruthers [00:04:33]:

But isn’t the real issue that people who don’t understand the implications of what they’re doing now have the power to do stuff? That was already starting to happen with software as a service, where people could put stuff in the cloud without understanding it, but now they can do it on the public Internet, which seems to me to be the real challenge.

Peter Leonard [00:04:53]:

So I think there are two challenges. One is that we don’t know enough about the training data that’s been used to train the large language models for organizations to reliably assess whether the model is fit for purpose for the particular task for which the generative AI is being used. And then secondly, there’s the issue that generative AI is so easy for anyone to use that it’s being used for myriad tasks within organizations. CIOs like you can’t even begin to imagine what might be happening in some building elsewhere on campus, and certainly can’t control it. And that in turn leads to questions of whether the person that’s using the generative AI in that way is placing undue reliance upon what might be very unreliable results. So there’s a question, firstly, as to whether the data that was used to train the model that is generating the result was fit for purpose, and then whether the human who is looking at the results is unduly relying on what might be a completely unreliable output. And that’s very different to the kinds of AI assessment that I’ve been involved with in the last few years. Because typically what we’ve been looking at, for example for the New South Wales government in the AI review committee that I sit on, is big IT projects involving AI that are brought to us: specific-purpose AI, designed and evaluated by us as being fit for purpose or not fit for purpose for the particular task for which it’s designed. But of course that’s completely different to this stealth AI coming into an organization, being used for myriad tasks and being relied upon by myriad human beings, limited only by the imagination of humans as to what tasks they might get the generative AI to do.

Kate Carruthers [00:07:42]:

I know, and this space is moving so fast. I keep telling people that it used to move in years and months, and now it’s moving in hours and minutes. Like, I log on in the morning, look at stuff, go to work, log on in the evening, and new stuff has emerged during the day. So it’s moving really fast. But what are some of the ways that people can approach AI safely and responsibly, do you think?

Peter Leonard [00:08:08]:

Yeah, look, it’s a good question. I suggest to people that they need to put the AI in the context of a decision and work backwards from the question of what is the decision for which technology is being used, and then evaluate whether introduction of the AI into the decision-making chain makes the decision less reliable or more reliable. In many contexts the introduction of the AI may even make the decision more reliable. But you actually have to look at the decision chain in the context in which the decision is being made and the purpose for which the decision will be used. And let me give you an example. Let’s assume that it’s a doctor in a hospital who’s thinking about writing up a discharge summary for a patient, and currently would look at the electronic medical records relating to that patient and write up a discharge summary out of that. Well, that same doctor might use ChatGPT to look at those inputs and do a first draft of the discharge summary for the doctor to review. In that circumstance, you’ve got somebody, a trained medical practitioner, who’s got professional obligations, amongst others, to patients, who should be bringing the requisite level of care, and who has the relevant source data there to compare against the summary that ChatGPT, or whatever they’re using, is generating. So then it’s just a question of, well, is the relevant individual properly apprised of the risk that ChatGPT might make something up or get something wrong, and do they have the time to properly evaluate what the generative AI is presenting to them? And there’s always a risk in this, of course, that when you talk about automation, what may often happen is that employers promptly steal back the time that they’ve liberated for the individual by allowing the individual to use the automation. So the hospital might say, well, go out and do some more rounds, because writing up discharge summaries now takes half the time. Whereas the reality is, if you’re going to use generative AI in this context responsibly, carefully and safely, you have to leave the doctor the time to check the output.

Kate Carruthers [00:11:21]:

You just used ChatGPT, sort of in the vernacular I’m assuming, in place of generative AI, because one would hope that a hospital has its own custom generative AI application that is getting the right inputs. Because one of the things that I keep talking to people about is that an LLM, a large language model, does not know stuff; it only knows what you tell it. And so you actually need to be able to insert new knowledge, new information, into its decision making, because it only knows what it knows until you’ve inserted stuff. So what I’m seeing is that generative AI is going to need to be part of more of a machine learning workflow where you’re inserting all of the different inputs at the right time so that you’re getting the right kind of outputs. And we don’t understand this space at all; it’s emerging, it’s so new. We don’t really know how to do this at scale for enterprises or big hospitals and stuff. It’s all very new. And we’ve already seen how ChatGPT in particular can hallucinate, so it just makes up stuff. It made up two new jobs for me when I got it to write a CV for me.
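The workflow Kate describes here — inserting fresh information into the model’s context at the moment you ask it something, rather than expecting the model to already know it — is often called retrieval-augmented prompting. A toy sketch in Python, where the records, the word-overlap scoring, and the prompt wording are all invented for illustration:

```python
# Toy retrieval-augmented prompting: the model only "knows" what the
# workflow puts in front of it, so we rank our own records for relevance
# and insert the best ones into the prompt before asking the question.
# All data and function names here are made up for the example.

def score(query: str, doc: str) -> int:
    """Crude relevance: count query words that also appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, records: list[str], top_k: int = 2) -> str:
    """Pick the top_k most relevant records and prepend them as context."""
    ranked = sorted(records, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )

records = [
    "Patient discharged 2023-08-01 after treatment for pneumonia.",
    "Cafeteria menu: soup and sandwiches on Tuesdays.",
    "Patient allergic to penicillin; prescribed azithromycin.",
]
prompt = build_prompt("What was the patient treated for?", records)
print(prompt)
```

In a real deployment the crude word-overlap scoring would be replaced by a proper retrieval system, but the shape is the same: the model only sees whatever the workflow inserts into the prompt, which is why getting the right inputs in at the right time matters so much.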

Peter Leonard [00:12:40]:

Well, that’s great. It augmented your skills. But as an example of how quickly things are moving on this, it was interesting to look at the release notes that came with Meta’s Llama 2 three or four weeks ago. They included quite a detailed model card, a responsible use guide, and some quite useful information around their so-called open-source models. And I think that’s going to be an interesting trend. So if we look at, say, that hospital example six or nine months out, I can imagine that what we will see is area health services in New South Wales using a third-party large language model that they’ve brought into the organization, pre-trained and assessed for reliability of the pre-training data, and then further trained using the confidential patient data sets within the organization, evaluated by people who are properly skilled to evaluate that, and made available in that controlled environment.

Kate Carruthers [00:14:11]:

Most of the big vendors have been working on this in the background for years with their large data sets ready, because the real problem for all of this is training the models, having enough data. I was at a dinner last night with a whole bunch of cyber security folks, and they were all talking about how we must delete all the data. And I was like, hold on a sec, we might need it to train some models before we throw it out. So that’s actually kind of the weird imperative now: people want to throw out data, but then we need it to train the models, otherwise they’re not going to be reliable. So it’s an interesting paradox that we’re in nowadays. But one thing I did want to touch on is: what do you think is going to happen in the regulatory space?

Peter Leonard [00:14:55]:

Well, that’s a very interesting question, because there’s a number of regulators who I think would like to own this space, including the privacy regulator and the ACCC. And the ACCC is not a silly choice, actually, because many of the issues around AI can be addressed through existing provisions of Australian consumer law, amended a bit and tightened up a bit. So, for example, Australian consumer law says that if you make available a software product, it has to be fit for purpose and of merchantable quality, and you should be making disclosures about known limitations of your products that are not misleading or deceptive. So one can imagine a world where you might expand some of those provisions to ensure that vendors make full and proper disclosures about the reliability of their large language models or any generative AI applications that they’re making available. And that could do quite a bit of work. I’m not a supporter of the concept of a super AI regulator or anything like that, because I think that AI is going to be part of everything that every business in every sector does, and we’re just going to have to skill up the regulators in the various sectors to properly address the issues in their sectors.

Kate Carruthers [00:16:46]:

Sort of like computers, like we did with computers. It’s the same thing. So it leads me, though, to the question about explainability, because that seems to be something that we’re going to have to solve if we are going to regulate this in a proper way.

Peter Leonard [00:17:03]:

Yeah, and explainability is an interesting concept, isn’t it? Because when you look at large language models, there’s a question as to what level of explainability you’re after. It is often not possible to fully explain how the large language model is operating, but what you can explain is any known limitations that you’ve identified through the operation of the model, and you can disclose the sources of data and known limitations as to data quality. So data provenance issues around the data that was used to feed and train the large language model.

Kate Carruthers [00:17:52]:

So all roads are leading back to data governance, aren’t they?

Peter Leonard [00:17:55]:

Absolutely. And when you look at data governance, you can’t look at that alone without looking at the people and the process, the decisions that individuals are making using that data and the technology. So it goes back even further to the question you and I have been looking at for years, which is you can’t evaluate technology without thinking about the people and the processes.

Kate Carruthers [00:18:27]:

The Five Safes just popped into my head for some reason. I will share a link in the show notes to that. Peter, I’m really conscious you’ve got to go off and teach a class. So thank you so much for your time. I really do appreciate it. We could have gone on for hours, but I know your time is limited, so thanks very much for joining me.

Peter Leonard [00:18:45]:

Pleasure, Kate.

Kate Carruthers [00:18:47]:

Good night.

Peter Leonard [00:18:48]:

Good night.

Kate Carruthers [00:18:50]:

That was the end of another episode of the Data Revolution podcast. I’m Kate Carruthers. Thank you for joining me. Hope you’ll leave a nice review for the podcast and please join us again next time.