IBM Watson Reportedly Recommended Cancer Treatments That Were 'Unsafe and Incorrect'

Posted by BeauHD on Wednesday July 25, 2018 @06:40PM from the time-to-schedule-a-check-up dept.

An anonymous reader quotes a report from Gizmodo: Internal company documents from IBM show that medical experts working with the company's Watson supercomputer found "multiple examples of unsafe and incorrect treatment recommendations" when using the software, according to a report from Stat News. According to Stat, those documents provided strong criticism of the Watson for Oncology system, and stated that the "often inaccurate" suggestions made by the product bring up "serious questions about the process for building content and the underlying technology." One example in the documents is the case of a 65-year-old man diagnosed with lung cancer, who also seemed to have severe bleeding. Watson reportedly suggested the man be administered both chemotherapy and the drug "Bevacizumab." But the drug can lead to "severe or fatal hemorrhage," according to a warning on the medication, and therefore shouldn't be given to people with severe bleeding, as Stat points out. A Memorial Sloan Kettering (MSK) Cancer Center spokesperson told Stat that they believed this recommendation was not given to a real patient, and was just a part of system testing.

According to the report, the documents place the blame on the training provided by IBM engineers and on doctors at MSK, which partnered with IBM in 2012 to train Watson to "think" more like a doctor. The documents state that -- instead of feeding real patient data into the software -- the doctors were reportedly feeding Watson hypothetical patient data, or "synthetic" case data. This would mean it's possible that when other hospitals used the MSK-trained Watson for Oncology, doctors were receiving treatment recommendations guided by MSK doctors' treatment preferences, instead of an AI interpretation of actual patient data. And the results seem to be less than desirable for some doctors.

This discussion has been archived. No new comments can be posted.

- - - Re: (Score:2, Informative)
      
      by Anonymous Coward writes:
      
      You mean imposing sanctions, killing hundreds of Russian soldiers, giving $200 million in weapons to the Ukraine, expressly rejecting Russia's takeover of the Crimea, pushing to put US troops and their missile shield into Poland, increasing fracking to drive down the price of oil, trying to force Europe to stop buying Russian gas and increase their militaries...?
      Trump has already done more to oppose Russia than Obama ever did - Obama didn't have the guts to enforce his own "red-line" in Syria.
      But when Obama
    - Re: (Score:2)
      
      by sexconker ( 1179573 ) writes:
      
      "You'd have to show that he was being directly influenced by a foreign official or head of state"
      All the moves he's made in Russia's favor and the disgusting sycophancy he's shown around Putin is raising and should raise a lot of questions.
      It's called diplomacy.
      I'm sorry if you think wanting to maintain good relations with our allies (yes, Russia is our ally) is bad. I'm sorry if you think peace between the Koreas is bad.
So Watson is no worse than actual Doctors ? (Score:3)

by Crashmarik ( 635988 ) writes: on Wednesday July 25, 2018 @06:45PM (#57009658)

Really where is the there, here ? You'll have doctors frequently dispute what the correct treatment is and with diseases like cancer it doesn't help that the best you can often do is offer a statistical improvement of someone's chances.
Far better that more people can afford treatment faster than this remain the province of the priesthood.

- - Re: (Score:1)
    
    by Narcocide ( 102829 ) writes:
    
    12 years
  - Re: So Watson is no worse than actual Doctors ? (Score:3)
    
    by guruevi ( 827432 ) writes:
    
    I work with Med students. Even though the requirements are pretty high, there is no effort to keep people out for any reason.
    The problem is that most of the people who fail first year want to be doctors for the money; they lack the drive to see it through when they are notified they'll have to spend 60h in a rotation for little to no pay.
    Doctors don't make big money until well after college, often several years later being residents in various hospitals following around other doctors
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
  - - Re: (Score:2)
      
      by jellomizer ( 103300 ) writes:
      
      IBM will just be lazy. However if they can get their system to have measurable results they can sell more.
    - Re: jellomizer is a moron (Score:1)
      
      by Hentai007 ( 188457 ) writes:
      
      More like Hippocratic Oaf, am I rite?
- Re: So Watson is no worse than actual Doctors ? (Score:2, Troll)
  
  by aaronb1138 ( 2035478 ) writes:
  
  Cancer is a huge money industry for medicine. This is why the huge focus is on screening / early detection, because those allow tons of unnecessary treatment for perfectly healthy people. People get done with treatment and get told they're in the clear. Everybody is happy and celebrates. Nobody sues for fraud when nothing was wrong in the first place.
  
  https://qz.com/1335348/google-is-building-virtual-agents-to-handle-call-centers-grunt-work/
  - Re: So Watson is no worse than actual Doctors ? (Score:4, Informative)
    
    by aaronb1138 ( 2035478 ) writes: on Wednesday July 25, 2018 @07:49PM (#57009894)
    
    Dammit, wrong link copied over.
    
    https://fivethirtyeight.com/features/the-case-against-early-cancer-detection/
    
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- The Airbnb syndrome (Score:1)
  
  by hcs_$reboot ( 1536101 ) writes:
  
  They found "multiple examples of unsafe and incorrect treatment recommendations". How many exactly, what's the %? What's the relevancy of the "incorrectness" (totally, or mildly?). Doctors have to protect their interests and, probably, discredit AI, thus any mild error is to be publicized. Similar to complaints against Airbnb. Airbnb does close to a million rentals a day, and when an infinitesimal part of that (twice a year) makes trouble, it's largely publicized.
Watson: I suggest this to kill the cancer... (Score:3)

by rnturn ( 11092 ) writes: on Wednesday July 25, 2018 @06:51PM (#57009690)

... but it will kill the patient. Is that a problem?
Doctor (shaking his head): Yes, Watson... that is a problem.
(Who trained Watson for this job anyway?)

Using a screwdriver as a hammer (Score:5, Insightful)

by Tablizer ( 95088 ) writes: on Wednesday July 25, 2018 @06:57PM (#57009714) Journal

The purpose of such a tool should be to make suggestions that a doctor may not consider themselves. It should be up to the doctor(s) to vet the suggestions or leads before any treatment is actually rendered. A doctor would have to be born in Stupidville to accept bot suggestions as-is.

- Re: (Score:3)
  
  by WillAffleckUW ( 858324 ) writes:
  
  This is why you want Dr Who, not Dr Watson.
  Dr Who knows how to use a screwdriver, and she does it much better than Dr Watson does.
  - Re: (Score:1)
    
    by Tablizer ( 95088 ) writes:
    
    Dr. Who only knows how to use a sonic screwdriver. A muggle's screwdriver baffles the daylights out of her/him/it.
    - Re: (Score:2)
      
      by WillAffleckUW ( 858324 ) writes:
      
      Dr. Who only knows how to use a sonic screwdriver. A muggle's screwdriver baffles the daylights out of her/him/it.
      She's The Doctor, not an Engineer.
  - Re: (Score:1)
    
    by Daralantan ( 5305713 ) writes:
    
    I was going to say they need to make an IBM House.
    You'd end up with suggestions like punching the patients in the face, or abusing the staff. Good times.
- Re: Using a screwdriver as a hammer (Score:2)
  
  by billDCat ( 448249 ) writes:
  
  That is in fact what it does
  - Re: (Score:1)
    
    by Tablizer ( 95088 ) writes:
    
    It's ultimately what the doctor does with the info that really matters. I would hope they are properly trained to use the system and know its limitations. Disclaimer notices wouldn't hurt as a reminder.
Really no surprise (Score:5, Interesting)

by gweihir ( 88907 ) writes: on Wednesday July 25, 2018 @06:59PM (#57009720)

This is a statistics-driven automaton that has zero insight or understanding. Calling it "AI" is a marketing lie, even if the AI field has given in and calls things like this "weak AI", which is the AI without "I". As such, this machine can find statistical correlations, but it cannot do plausibility checks, because that requires insight. It cannot do predictions either, because that also requires insight. The real strength of Watson (and it is quite an accomplishment) is that unlike older comparable systems, you can feed the training data and the queries into it in natural language. This means you can train a lot cheaper, but at the cost of accuracy, as the effect described in the story nicely shows.
It is time for this "AI" hype to die down. All it shows is that many people do not choose to use what they have in general intelligence and rather mindlessly follow a crowd of cheerleaders.
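
To make the plausibility-check point concrete, here is a minimal Python sketch of a hand-written contraindication filter layered over a recommender's output. The only fact taken from the story is that bevacizumab carries a warning for severe or fatal hemorrhage; the rule table, the `check_recommendation` function, and the patient fields are hypothetical, and nothing here reflects how Watson for Oncology actually works.

```python
# Hypothetical rule table: drug -> patient conditions that should block it.
# Only the bevacizumab / severe-bleeding pairing comes from the story above.
CONTRAINDICATIONS = {
    "bevacizumab": {"severe bleeding"},
}


def check_recommendation(drugs, patient_conditions):
    """Split recommended drugs into (allowed, flagged) lists.

    A drug is flagged when the patient has any condition listed
    against it in CONTRAINDICATIONS; flagged entries carry the reason.
    """
    conditions = {c.lower() for c in patient_conditions}
    allowed, flagged = [], []
    for drug in drugs:
        clashes = CONTRAINDICATIONS.get(drug.lower(), set()) & conditions
        if clashes:
            flagged.append((drug, sorted(clashes)))
        else:
            allowed.append(drug)
    return allowed, flagged


# The case from the story: lung cancer patient with severe bleeding.
allowed, flagged = check_recommendation(
    ["chemotherapy", "Bevacizumab"],
    ["lung cancer", "severe bleeding"],
)
print("allowed:", allowed)
print("flagged:", flagged)
```

A filter like this only catches what someone thought to write down, which is roughly the trade-off being argued in this thread: explicit rules are checkable but incomplete, while learned correlations cover more ground with no built-in sanity check.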

- Term Squirm [Re:Really no surprise] (Score:5, Insightful)
  
  by Tablizer ( 95088 ) writes: on Wednesday July 25, 2018 @07:07PM (#57009756) Journal
  
  Calling it "AI" is a marketing lie
  In practice the term "AI" is vague and continuous rather than a Boolean designation ("is" versus "is-not"). The term is not worth sweating over. The exception may be if you are making a big purchase and/or investment based on something being "AI". In that case, inspect it carefully rather than assume something with "AI" is smart and/or useful. But that's good advice for any significant purchase: test drive it & ask detailed questions rather than rely on the brochure.
  
  - Re:Term Squirm [Re:Really no surprise] (Score:5, Insightful)
    
    by gweihir ( 88907 ) writes: on Wednesday July 25, 2018 @11:26PM (#57010628)
    
    It actually is pretty Boolean: Use it for anything real and you are a liar. Because exactly nothing that deserves the description "AI" does exist. Qualify it with "weak" and you use an obviously inappropriate term.
    
    - Re: (Score:1)
      
      by Tablizer ( 95088 ) writes:
      
      Terms are ultimately defined by common usage, not necessarily by what's logical, clear, useful, or fair.
      Defining "natural intelligence" is sticky also. I remember debating for weeks over what "intent" means. Great nerdy fun. (This was before Emailgate, by the way.)
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        We are in science and engineering here. Terms have real meaning and are not defined by common use outside of that field.
        
        Re: (Score:1)
        
        by Tablizer ( 95088 ) writes:
        
        The issue was "AI". If you can supply a precise and unambiguous definition, please do.
        Further, what it means colloquially (regular press) and what it means in technical journals could vary. The audience scope or target thus may also matter.
    - Re: (Score:2)
      
      by RespekMyAthorati ( 798091 ) writes:
      
      It actually is pretty Boolean: Use it for anything real and you are a liar.
      
      Who the fuck appointed you the arbitrator of what's "intelligent" and what isn't?
      
      Besides, anybody who has read your previous posts knows that you consider
      intelligence to be some kind of supernatural hocus-pocus,
      so of course a machine can't have it.
- Re:Really no surprise (Score:4, Insightful)
  
  by ShanghaiBill ( 739463 ) writes: on Wednesday July 25, 2018 @07:13PM (#57009782)
  
  As such, this machine can find statistical correlations, but it cannot do plausibility checks, because that requires insight. It cannot do predictions either, because that also requires insight.
  Neither of these require "insight". They just require more data. With enough examples, statistical correlation is all you need.
  
  - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    You'll never capture everything in the training set.
    In this case what was required is being able to read the medicine's instructions and do some common sense reasoning to see how it's relevant to the patient. Between reading and common sense we're well beyond what Watson is capable of.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    You will never have enough data for that.
    - Re: (Score:2)
      
      by RespekMyAthorati ( 798091 ) writes:
      
      Only if you subscribe to gweihir's superstitious concept of intelligence.
  - Re: (Score:2)
    
    by cascadingstylesheet ( 140919 ) writes:
    
    With enough examples, statistical correlation is all you need.
    A: We have to withhold this treatment because 100% of people with this condition last year died within a month.
    B: Were they treated for it?
    A. No, because we have to withhold treatment.
- Re: Really no surprise (Score:3)
  
  by phantomfive ( 622387 ) writes:
  
  The AI hype is sound and on solid footing compared to the blockchain hype: I've never seen so much effort poured into such a useless technology, cthulu be praised.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    So you think something that does not exist is "solid" in comparison to something that does exist but is pretty useless? Strange priorities you have there...
Oops.... (Score:3)

by erp_consultant ( 2614861 ) writes: on Wednesday July 25, 2018 @07:01PM (#57009726)

I'll take Incorrect Diagnosis for $200, Alex.

- Re: (Score:1)
  
  by Tablizer ( 95088 ) writes:
  
  I'll take Incorrect Diagnosis for $200, Alex.
  "What is Trollitis?" ;-)
So they're no worse than doctors. (Score:3)

by greenwow ( 3635575 ) writes: on Wednesday July 25, 2018 @07:01PM (#57009728)

But is Watson cheaper than a doctor?

Too quick to judge? (Score:2)

by alzoron ( 210577 ) writes:

The survival rate for lung cancer can sometimes be as low as 4% over five years. Even if the drug combination had a 90% chance to outright kill the patient it might raise their overall chances of survival enough to actually be worth the risk. Based on what I know about lung cancer dying from severe hemorrhaging could be preferable to the relatively slow agonizing death some experience otherwise, especially if your overall chances of survival are higher.
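
The arithmetic behind that trade-off can be sketched in a few lines of Python. This assumes a deliberately crude model in which overall survival with treatment is (chance of surviving the treatment) times (five-year survival among those who do); the 4% and 90% figures are the ones quoted in the comment, and the function name is made up for illustration.

```python
def break_even_survival(baseline_survival, p_treatment_death):
    """Survival rate needed among patients who survive the treatment
    for overall survival to match the untreated baseline."""
    return baseline_survival / (1.0 - p_treatment_death)


baseline = 0.04   # five-year survival figure quoted in the comment
p_death = 0.90    # hypothetical chance the treatment itself is fatal

needed = break_even_survival(baseline, p_death)
print(f"survivors of treatment need > {needed:.0%} five-year survival")
```

Under those assumptions the patients who get through the treatment would need roughly a 40% five-year survival rate, ten times the baseline, before the gamble merely breaks even.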
So? (Score:5, Insightful)

by 50000BTU_barbecue ( 588132 ) writes: on Wednesday July 25, 2018 @07:06PM (#57009752) Journal

How many human doctors did the same or worse?

- Re: (Score:3)
  
  by SlaveToTheGrind ( 546262 ) writes:
  
  Asking society to put its trust in a machine with the justification that at its best it fucks up no more often than some humans at their worst is a non-starter.
  - Re: (Score:2)
    
    by yusing ( 216625 ) writes:
    
    Yeah but ... no health benefits! no retirement! no vacations!
    Great deal for the vendors, not so much for their victims.
- Re: (Score:2)
  
  by AHuxley ( 892839 ) writes:
  
  Human doctors face peer review of all work in good advanced teaching hospitals.
  The best teaching hospitals can ensure only a nation's very best medical professionals are working every decade.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  "Part of the system testing"
  I think what we are reading is leaked info from someone working on the trial who realizes that it's going frighteningly well.
  Basically:
  Watson will get things wrong. Especially in testing. It should not be used on its own for the foreseeable future. It needs a trained doctor to review the decisions... it is a remarkable assistant. It will only get better and bring a standard of healthcare to a vast number of people who could never afford to access/reach a doctor.
- - Re: (Score:1)
    
    by The Evil Atheist ( 2484676 ) writes:
    
    Again, you make the mistake of not considering the base rate to begin with. You rant against "minority quota", but how many straight white males also should have flunked out but stayed in?
    
    Kind of stupid of you to assume that everything was all hunky-dory until the minorities got in.
    - Re: (Score:1)
      
      by blindseer ( 891256 ) writes:
      
      Again, you make the mistake of not considering the base rate to begin with.
      
      What does that even mean? The article was quite clear, the guy got in to medical school, with lower grades than his Asian American friend who was denied entry, because he pretended to be African American. What does "base rate" have to do with this?
      You rant against "minority quota", but how many straight white males also should have flunked out but stayed in?
      Listen, facts don't care about your feelings. If you have data that straight white males have been allowed to stay in college even after failing to meet requirements then I'd like to see it.
      It's been widely reported that medical schools have been discriminating
      - Re: (Score:2, Offtopic)
        
        by blindseer ( 891256 ) writes:
        
        I noticed you made no effort to disprove that white and asian students are being discriminated against based only on their race. I made my case that this racial discrimination exists. I'd like to see you prove otherwise.
        Saying that no one wants to see this data is provably false, numerous colleges and universities have been sued for this data. There are people that want to know. I'm sure that some schools sued over their blatantly racist admissions will fight for this to not come out. That's not becaus
The operative quote here... (Score:3)

by GerryGilmore ( 663905 ) writes: on Wednesday July 25, 2018 @07:10PM (#57009770)

...is this: "A Memorial Sloan Kettering (MSK) Cancer Center spokesperson told Stat that they believed this recommendation was not given to a real patient, and was just a part of system testing."
Isn't this the kind of thing that testing is designed to uncover? It sounds to me like at least this part of the process is working, unlike the asshole who fed the model "fake data".

Sounds like a well trained AI (Score:2)

by WillAffleckUW ( 858324 ) writes:

It just wanted to help impose pro-Darwinian responses to malformed genetic abnormalities.
Next up: self-driving cars that crash on purpose because their passengers sing songs the AI hates.
Garbage In Garbage Out (Score:2)

by kiviQr ( 3443687 ) writes:

test data provide test results.
Garbage in, dead patients out... (Score:3)

by Dread_ed ( 260158 ) writes: on Wednesday July 25, 2018 @07:49PM (#57009892) Homepage

So the data fed to train Watson wasn't from actual cases? Why does it matter what the computer prescribed, then? The system that is Watson is only as good as the data you feed it. Feed it fake information, get not even wrong results. Sounds more like a smear campaign,
intentionally designed to fail, and certainly not an experiment designed to measure Watson's recommendations against actual doctor recommendations.
Here's a better idea...
Feed the damn thing actual patient records with everything included from first immunization to the patient's ultimate death. If you are looking to see if there are any correlations that humans haven't already made you need to feed that sucker as much data as is inhumanly possible and then let it do the work.
What we have now is a pseudepigrapha of Watson's capabilities. Sure the results are from Watson, but they are not what Watson would do if given accurate, real life data to work with. They made a forgery of the system and put Watson's name on it.
Shady, bro. Shady...

- Re: (Score:2)
  
  by omnichad ( 1198475 ) writes:
  
  And how do you resell services from a data model that contains HIPAA-protected data?
Garbage in, garbage out (Score:5, Interesting)

by blindseer ( 891256 ) writes: <blindseer@@@earthlink...net> on Wednesday July 25, 2018 @07:51PM (#57009902)

An AI can only be as good as the data used to train it. The article pointed out that Watson was trained on data that was possibly based as much on the subjective preferences of the physicians who fed it as on objective data.
I recall reading an article about someone doing a study on medical procedures done throughout the USA and they noticed "hot spots" of procedures being done in certain areas. What they found was that in these places they'd see physicians that would recommend procedures out of personal preference. One example was an area with a lot of tonsillectomies, because a physician felt that any throat infection meant the tonsils had to come out. Another area had an elevated number of hysterectomies, because a physician felt that post-menopause women had an elevated risk of developing cysts and cancers on the uterus. The article went on to say that while such treatments may be unusual no one was willing to consider this malpractice.
So, Watson recommended a treatment for someone that might aggravate an existing problem of severe bleeding. Is this bad coding for not taking this into account? Or, is there a physician that entered such a prescription for their patient with similar symptoms? It's real difficult to second guess a physician. It's real easy to second guess the computer. Even if both the computer and the human came to the same recommendation for treatment.
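
A toy Python sketch of the garbage-in, garbage-out effect: a purely frequency-based recommender trained on hand-written cases simply hands back whatever the case authors preferred. The case list, `train`, and `recommend` are all hypothetical (the tonsillectomy pairing just echoes the hot-spot example above), and this is in no way a description of Watson's internals.

```python
from collections import Counter, defaultdict

# Hypothetical synthetic training cases: (diagnosis, treatment chosen by
# the case author). None of these come from the article.
synthetic_cases = [
    ("throat infection", "tonsillectomy"),
    ("throat infection", "tonsillectomy"),
    ("throat infection", "antibiotics"),
]


def train(cases):
    """Count how often each treatment was paired with each diagnosis."""
    model = defaultdict(Counter)
    for diagnosis, treatment in cases:
        model[diagnosis][treatment] += 1
    return model


def recommend(model, diagnosis):
    """Return the most frequent treatment seen for this diagnosis,
    or None if the diagnosis never appeared in training."""
    counts = model.get(diagnosis)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


model = train(synthetic_cases)
print(recommend(model, "throat infection"))
```

The recommendation here reflects nothing about outcomes, only which treatment the authors wrote down most often, so a second physician with different habits would train a different "expert".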

- - Re: (Score:2)
    
    by The Evil Atheist ( 2484676 ) writes:
    
    What? ML solutions are programs. With them it is vastly easier to figure out what went wrong than with a human brain. You really want to claim that with a human mind it is easier to figure out what went wrong? In instances where we can work it out, it is only due to self-attestation to what they were thinking at the time, which is not accurate, and subject to ego. And the self-attestation is also biased, leading to corrections that may not address the root of the problem.
- Re: (Score:2)
  
  by cascadingstylesheet ( 140919 ) writes:
  
  So, you're holding it wrong?
Hmmmm... (Score:3)

by yusing ( 216625 ) writes: on Wednesday July 25, 2018 @11:40PM (#57010670) Journal

What would happen if we started calling Ai 'Fake Intelligence' ... Fee Fi Foes?
As I understand the current fashions, AI has a fatal flaw: its result is non-deterministic ... no one can be sure how it arrives at an answer. That might be okay for face recognition, or 'computer art' ... but for locating potential automobile collision victims, or deterministically arriving at a sound treatment for a patient? Wrong model.
I'd guess that the 'expert systems' of 20 years back outperform neural nets. Their logic trees were scrutable.

so what? (Score:2)

by SuperDre ( 982372 ) writes:

Right now it's still in an early learning process, and it's a tool to help doctors. So what if it, at this point in development, makes the unsafe/incorrect treatment? It's not like doctors are right all the time, and doctors also have been well known to prescribe wrong treatments. Or, maybe the system did know about it, but calculated the risk factor of the patient dying anyway if he didn't get treatment.
But we're still at the beginning of having AI determine stuff like this, and yet Watson is already very wel
AI and BiG Data (Score:1)

by OneSizeFitsNoone ( 3378187 ) writes:

Watson just needs a big data cache of real-life human deaths to learn how to cure cancer.
obligatory (Score:2)

by bigdavex ( 155746 ) writes:

Unless you combine it with dilaftin.
Which any first-year should know is
the standard prep medication your patient
was taking before surgery. Your patient
should be dead.
Yet more evidence that so-called AI is crap (Score:2)

by Rick Schumann ( 4662797 ) writes:

For the second time today we see evidence that the poor excuse for AI they keep trotting out, in this case probably the most advanced version of it, even, is crap. I maintain that without understanding how a biological brain actually is able to think, there's no way these throw-it-at-the-wall-and-see-if-it-sticks guesses at an approach are going to ever be real AI -- and since we don't have the instrumentality to really truly see how a biological brain works, and map its connections, in a living subject, w
