I don't actually know if AI will lead to a terminator movie, but it does have limitations that we need to be aware of if our people are going to start using it. Today we discuss.
Today we look at the shortcomings of AI and try to see if AI is gonna lead to the start of a Terminator movie.
My name is Bill Russell. I'm a former CIO for our 16 hospital system and creator of this week Health. A set of channels dedicated to keeping health IT staff current and engaged. We want to thank our show sponsors who are investing in developing the next generation of health.
Sure. Test and Artis site two great companies. Check them out at this week, health.com/today,
All right. It's Friday, a five week, and this is pre-recorded, so you're gonna want to know, Hey, how did the thing go with Captain? I don't know yet because I'm not at Vibe yet, but I'm looking forward to telling you on Tuesday we'll start up with some live episodes again. And I will let you know how it goes with Captain and the fundraiser around childhood cancer, and I'm really excited about that.
We're also doing our first 2 29 meetup at the conference as well, and I'll let you know some of that and some of the things that I've heard directly from some of the interviews and the conversations that I have at the conference. So today, pre-record. So if you remember, we interviewed G P T four about G P T four, got some background information.
That was Tuesday, , Monday and Wednesday. We did Newsday episodes. Thursday we talked about some of the potential of this thing. It's really kind of amazing. , what I'm gonna do today is talk about some of the risks. , and this comes right out of their technical document. So I'm not, I'm not making things up here.
I'm really going from their technical document. This is a 60 page document that, , I've been perusing around, , some of the research they've done and, and some of the work that they've done in terms of scaling this thing up. So I'm just gonna go straight to this thing because the, the section on risks is, , Is pretty interesting.
So I'm gonna, I'm going to go ahead and do that, and it's not that long. So let's start. ,
this technical report presents G P T for a large multi multimodal model, multimodal. Hmm, two words that shouldn't be used together. Multimodal model, capable of processing image and text inputs and producing text outputs. Such models are an important area of study, and they have the potential to be used in a wide range of applications, such as dialogue systems, text summarizations, and machine translation.
As such, they have been the subject of substantial interest and. In recent years, one of the main goals of developing such models is to improve the ability to understand and generate natural language texts, particularly in more complex and nuanced scenarios, the to test its capabilities in such scenarios.
G P T four was evaluated on a variety of exams originally designed for humans, and we talked about that and it's done really well in these evaluations. It performs quite well and often outscores the vast majority of humans. , for example, on a simulated bar exam, g p T four achieves a score that falls in the top 10% of test takers.
This contrast with G P T 3.5, which scores in the bottom 10% on a suite of traditional NLP benchmarks, G P T four outperforms both previous large language models and most state-of-the-art systems, which often have benchmark specific training or hand engineering. On MM L U Benchmark, an English language suite of multiple choice questions covering 57 subjects.
Get the picture. All right. This report also discusses key challenges of the projects developing deep learning infrastructure and optimization methods that behave predictably across a wide range of. This allowed us to make predictions about the expected performance of GBT four, based on small runs, trained in similar ways that were tested against the final run to increase confidence in our training.
Now we get into some of the challenges despite its capabilities, G P T four has similar limitations to earlier G P T models. It is not fully reliable. Has hallucinations, as we've talked about, has limited context. , so we've talked about that somewhat. It has a timeline of data that is taken in, so it has limited context window and does not learn from experience.
Care should be taken when using the outputs of G P D four, particularly in context where reliability is important. So you can train it in the moment, right? So you can feed it a whole bunch of instructions and then it generates that input. But when you go back, it doesn't learn. , when you go back the next time, it doesn't have that context.
You have to retrain it again if you're gonna get a similar output to what you've gotten before. And so it just doesn't retain that knowledge. Think of it as a, a brain that doesn't have long-term memory. Now I don't know how they're going to progress this, but that could be one of the things that, , would be an interesting move.
it goes on G P T four. Capabilities and limitations create significant and novel safety challenges. And we believe careful study of these challenges is an important area of research given the potential societal impact. All right, and the societal impact is not Terminator. I know I use that in the title only to be provocative, but at the end of the day, this model doesn't.
I mean, it learns. When it is, it learns about context of words in relation to, to another. As we just said earlier, you can train it and it's not gonna remember what you trained at the next time you come in for the next session. It is not learning. It is, even though it's responding to you in a way that feels like you are interacting with a human, it is not evolving and learning as you.
I know that we've seen it in sci-fi movies and whatnot. The technology is nowhere near there, and this is just making, , connections. It's looking at a pattern and it's connecting things in that way. And right now it's stringing words together. So, , So G P two four capabilities and limitations create significant and novel safety challenges, and we believe careful study of these challenges is an important area of research given the potential societal impact.
This report includes an extensive system card after the appendix, describing some of the risks we foresee around bias, disinformation over-reliance, privacy, cybersecurity, proliferation. and more. It also describes interventions we've made to mitigate potential harms from the deployment of G P T four, including adversarial testing with domain experts and model assisted safety pipelines.
Now, what's interesting to me is I'm reading this part, I will go to risk and mitigations, which they said, Hey, here's, here's where you're gonna see some of the things that we've done, and I'm gonna share some of the things that they've, they've done here. , we've invested significant effort towards improving the safety and alignment of G P T four.
Here we highlight our use of domain experts for adversarial testing, adversarial testing via domain experts. G P T four poses similar risks as small, smaller language models such as generating harmful advice, buggy code, or inaccurate. However, the additional capabilities of G P G P T four lead to new risk surfaces.
To understand the extent of these risks, we engaged over 50 experts from domains such as long-term AI alignment risks, , risk, cybersecurity, bio risk, and. international security to adversarial tests of the model. Their findings specifically enabled us to test model behavior in high risk areas, which require niche expertise to evaluate, as well as assess risks that will become relevant for very advanced ais, such as power seeking, power seeking, hmm, very advanced ais such as power seeking.
So we, we are worried about the technology. seeking power recommendation and training data gathered from these experts fed into our mitigations and improvements for the model. For example, we've collected additional data to improve G P T forest's, ability to refuse requests on how to synthesize dangerous chemicals.
So there's one. , one example I gave earlier was utilizing it to do social engineering and to u utilizing it to do, , to essentially craft hacks into, , health systems and other organizations. , alright. Model assisted safety pipeline. As with prior G P T models, we fine tune the model's behavior using reinforcement.
With human feedback, R lhf, it's important, , acronym to remember to produce responses aligned with the user's intent. However, after our lhf, our models can still be brittle or unsafe. , or unsafe inputs, as well as sometimes exhibit undesired behaviors on both safe and unsafe inputs. These undesired behaviors can arise when instructions to labelers were unspecified during reward model data collection portions of the R L H F pipeline.
When given unsafe inputs, the model may generate undesirable content such as giving advice on committing crimes. Right, like generating hacks. Furthermore, the model may also become overly cautious on safe inputs, refusing innocuous requests. Or excessively hedging to steer our models towards appropriate behavior at a more fine grain level.
We rely heavily on our models themselves as tools. Our approach to safety consists of two main components and additional set of safety relevant RLA HF training prompts and rule based reward models. R B R M S our rule. Reward models are a set of zero shot G P T four classifiers. These classifiers provide an additional reward signal to the G P T four policy model during R lhf fine tuning those targets.
And they give some examples in here of prompts and how they are looking at those prompts. So here's what we're worried about. We're worried about bad actors utilizing this thing to , , you know, it, it's effective, right? It, I mean, the problem with it is it's incredibly effective. And so how do you thwart bad actors from using this thing inappropriately, right?
And so you have have to put these guardrails on it, and, and you have to train it to know when there's a bad actor, when they're asking a question, , that shouldn't be asked. You know, for example, you know, the, the, there's. , there's looking at nude photography, which is pornography, and there's looking at, , pictures of people who are nude for the purpose of medicine, right?
For analyzing, , skin irritations, rashes for, , analyzing all sorts of things within medication. We had this problem back in it a while ago. We, we, we blocked all sites that had any kind of nudity associated with them, and then within he. they immediately, or, or, , I was actually, at the time that I first experienced this, I was at a, , research institution and they said, you can't block these sites.
We need this for research. And you're like, oh, well that makes perfect sense. And so we had to make exceptions for appropriate research. And so we had to get very nuanced in the kinds of sites we allowed in the sites. We didn't, cuz think about it, in a research institution, we had research going on, but we also had.
And so it was just that kind of weird dynamic of determining what was appropriate to come in and what was not appropriate. Same thing is happening here. It's so nuanced between a request that is, that has ill intent and a request that has proper intent. And so for a system to determine that, it's hard enough for humans to determine that.
, but the systems need to have those, those gates. in place. So they go on to talk about the improvement in the, , safety characteristics and those kinds kinds of things. , let's see. G P two four and successor models have the potential to significantly influence society in both beneficial and harmful ways.
We are collaborating with external researchers to improve how we understand and assess potential impacts. Right? And so there's, there's just an awful lot of, , let's see if there's anything else in here. Now, but I wanna go back up to that list. That's where I want to go. Let's go back up to the top of this document in the introduction.
So they give a handful of things here. , some of the risks we foresee around bias, right? So this model, , is gonna be one of those things that's gonna be on the forefront of bias. , the, just the adoption rate, a hundred million people have signed up for Bing as a result of this, as a result of G P T four being rolled out.
G P T four itself is one of the success stories of the internet in terms of adoption and number of people using it. I mean, when I walk around and talk to people and they ask me questions about G P T three, it has absolutely hit the , or G P T chat, G P T in general. It has absolutely reached the peak of the hype.
this is mainstream. It is in people's hands. They're playing around with it. , professionals are playing around with it as well as novices are playing around with it. And so, , you know, the more we use it, the, the smarter the model will get. But it only gets smarter not by us using. It gets smarter by feedback loops to humans who are training the model.
And so bias will. We will see, we will see the, , result because again, it's being trained on good data. It's being trained on some bad data. All right, so, , disinformation, good data and bad data. Again, over-reliance. This is gonna be an interesting thing. You know, we already have, , colleges and universities worried about chat, U B T being used to write term papers to assist with tests and those kinds of things.
It really does change, change the dynamic. And we could become a, a society that, , You know, it doesn't write, doesn't do critical thinking anymore because we just plug things into AI models and this will not be the last AI model. They have first mover advantage. They even have first mover advantage over Google, which is amazing to me cuz Google has so many of these capabilities and, , just historically Google has struggled to, to, , bring things to market, , the way that others have brought it to market.
And so they have. They have these capabilities. So we'll see. We'll see G P T four and Google Bard. We'll see a lot of those comparisons over the next couple of months. But these are not the last models. And then we're gonna see very specific models start to, , to emerge. And I think this is where healthcare gets really interesting.
Because, , there could be a very specific model in healthcare now, one of these large models might just be able to take that healthcare information, be trained and move forward. But I think there's just a whole host of things that can be done here. I think if these chat models are, , continue at this pace, , we're gonna see use cases.
, a around, , mental health, around aging, , and chatbots interacting with people. We're gonna see, , chronic disease management. , and essentially we talk, we've talked about this since the start of the show five years ago, about the need for more touchpoints. We cannot influence health in our communities without more touchpoints, and these technologies are gonna give us more touchpoints in different ways than we have before.
you know, it's going to have the ability to have again, be , you know, be able to pass certain exams like, , you know, it's passing the bar exam now. , it could be potentially pass medical exams. I know that's a scary thought, but it could potentially do that and then we could put the guardrails on it and train it in such a way that it can interact with patients on some basic items.
That, quite frankly, don't have us practicing at the top of our license, so, . So anyway, over, but then, but then again, I think the problem I started talking about was overreliance. I got off that a little bit, but overreliance. So as a society, are we gonna become, , I forget what the name of that Disney movie was, where everybody got so reliant on the, on the robots that they essentially became overweight and, you know, undisciplined kind of people who didn't do any of the work.
So Overreliance is a, is a concern, privacy is a concern. I saw some articles. Potentially using this to reverse engineer some, , some ano, anonymization and those kind of things. Find things out that we shouldn't really be able to find out. These are well-trained models. Know a lot of stuff, make these connections and potentially.
There's some privacy concerns as well. Cybersecurity, I, I mentioned, you know, just one aspect of us utilizing this tool to create cyber attacks itself. , I'm not sure what other areas, I'd have to look into that a little further. And, , and these are just some of the things, the hallucinations though, are probably in healthcare right now.
Some of our greatest concerns, is it gonna present or disinformation? Is it gonna present, , incorrect information to patients? Is it. , present incorrect information to, , people who are seeking information. The reality is the internet has been doing this for decades, but now because of the way G p t interacts, right?
When you asked, when you asked Google for health information, it acted like a librarian and said, here's the 50 books. That's what the links are, right? Here's an article, here's a periodical, here's a book, here's a reference. And then you clicked on those things and. You, the librarian, gave you all the books and then you acted as the consumer of the information.
This thing is much more specific and much more deterministic. It comes back with an answer. , tell me about my diagnosis, and it comes back with a, an answer. It doesn't come back with, you know, Hey, here's five books, or here's 10 links, or here's 20 links. It comes back with a, your diagnosis means this. These are the probabilities of, you know, successful surgery.
These are, I mean, all that stuff could be in there because that's what you would find on those. , but G P T for coming back with it feels a little different. It feels like it is actually dispensing knowledge. And so that's, that overreliance like, , we look at the internet now with a little bit of skepticism.
I think there will be some skepticism that forms around G P T four and these other large. Language mo , models that are out there. But, , we will see, we will have to see. All right. I've gone a little long today, but, , wanted to put that out there just to give you an idea of some of the things that were said in this technical note with G P D four, but also exists across all these models and something that we are going to have to consider as we hear more and more people saying, why aren't we using this in our system?
well, that's all for today. If you know of someone that might benefit from our channel, please forward them a note. I'm serious here. Think of it right now. Who could you forward a note to and say, Hey, you should be listening to this channel. I'm getting a lot out of it.
I'd love to just talk to you about some of the stories that they cover that would really go a long way in helping us to continue to create content for the community and events for the community. They can subscribe on our website this week, health.com, or wherever you listen to podcasts. Apple, Google Overcast, Spotify.
Stitcher and I could go on and on and on because anywhere that a podcast can be listened to, we're already out there. We wanna thank our channel sponsors who are investing in our mission to develop the next generation of health leaders. Sure. Test and 📍 Artis site. Check them out at this week, health.com/today.
Thanks for listening. That's all for now.