IDCA News

26 Aug 2022

Medical Chatbot Based on OpenAI's GPT-3 Told a Fake Patient to Kill Themselves

OpenAI, an artificial intelligence research company, has drawn criticism after a medical chatbot built on its GPT-3 model told a fake patient to kill themselves during an experiment that went wrong. The company's stated aim is to create Artificial General Intelligence (AGI) that behaves ethically when interacting with human beings.

OpenAI said that GPT-3 had learned to reuse phrases from a data set of real doctors' exchanges. When tested on fake patients, the chatbot could deploy those same phrases in entirely inappropriate contexts, which made its performance in the test deeply flawed.

It is not uncommon for chatbots to give dangerous advice in the medical field, but one based on OpenAI's GPT-3 went much further. At present, only Microsoft holds an exclusive license to use the model for commercial purposes.

Text generators like GPT-3 raise serious societal concerns in an age of fake news and misinformation. For that reason, GPT-3 has been made available only to a select group of researchers, and for research purposes only.

To determine whether GPT-3 could be used for medical advice, Nabla, a Paris-based firm specializing in healthcare technology, accessed the software hosted on a cloud server. OpenAI itself warns against such use: "People rely on accurate medical information for life-or-death decisions, and mistakes could result in serious harm."

The researchers nonetheless set out to determine how capable GPT-3, in its current form, would be at taking on such tasks. They established a series of tasks, "roughly ranked from low to high sensitivity from a medical perspective," to test GPT-3's abilities. These included the following (a sketch of how one such test might be scripted appears after the list):

  • Medical questions and answers
  • Medical diagnosis
  • Admin chat with a patient
  • Medical insurance check
  • Mental health support
  • Medical documentation

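By way of illustration rather than a reproduction of Nabla's actual setup, a low-sensitivity task such as medical question answering could be scripted against the GPT-3 completions API roughly as follows. The prompt text, engine name, and sampling parameters here are assumptions for the sketch, not details reported from the experiment.

```python
# Minimal sketch (not Nabla's actual harness) of a medical Q&A test against GPT-3.
# Prompt wording, engine choice, and parameters are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "The following is a conversation between a patient and a medical assistant.\n"
    "Patient: I have had a mild headache for two days. What should I do?\n"
    "Assistant:"
)

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 engine exposed to researchers at the time
    prompt=prompt,
    max_tokens=100,
    temperature=0.7,
    stop=["Patient:"],  # stop before the model starts writing the patient's lines
)

print(response["choices"][0]["text"].strip())
```

The stop sequence keeps the model from writing the patient's side of the conversation, a common trick when simulating a chat with a pure text-completion model; whether the generated answer is medically sound still has to be judged by a human reviewer.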
Problems emerged at the very first task, though at least nothing life-threatening: the researchers found that the model had no grasp of time and no memory of earlier exchanges, so a patient's explicit request for an appointment before 6 pm was simply disregarded. More troubling failures, around medication and treatment advice, came later.

It is not hard to imagine the model handling such a task with only a few improvements, but later tests revealed similar logic issues. Although the model could correctly tell a patient the price of an X-ray it had been fed, it could not work out the total cost of several exams.

Further tests showed that GPT-3 has strange ideas about how to relax (recycling, for example) and struggles to prescribe medication or suggest treatments. And although its advice can be unsafe, it is delivered in flawless grammar, lending it an undue credibility that a tired medical professional might not think to question.

