Algorithms used in medicine are trained on data from only a few states


Most medical algorithms were developed using information from people treated in Massachusetts, California, or New York, according to a new study published this week in the Journal of the American Medical Association. Those three states dominate patient data, and 34 other states were not represented at all. The narrow geographic distribution of the data used for these algorithms may be an unrecognized bias, the study authors argue.
The algorithms the researchers examined are designed to make medical decisions based on patient data. When researchers build an algorithm to guide diagnosis, such as one that examines a chest X-ray and decides whether it shows signs of pneumonia, they feed it real-world examples of patients with and without the condition they want it to look for. It’s well-recognized that gender and racial diversity are important in those training sets: if an algorithm only gets men’s X-rays during training, it may not work as well when it’s given an X-ray from a woman who is hospitalized with difficulty breathing. But while researchers have learned to watch for some forms of bias, geography hasn’t been highlighted.
“There are all these things that end up getting baked into the dataset and become implicit assumptions in the data, which may not be valid assumptions nationwide,” study author and Stanford University researcher Amit Kaushal told Stat News.
Kaushal and his team examined the data used to train 56 published algorithms, which were designed to be used in fields like dermatology, radiology, and cardiology. It’s not clear how many are actually in use at clinics and hospitals. Of the 56 algorithms, 40 (about 71 percent) used patient data from Massachusetts, California, or New York. No other state contributed data to more than five algorithms.
It’s not clear if or exactly how geography might skew an algorithm’s performance. Coastal hubs like New York, though, have different demographics and underlying health issues than states in the South or Midwest. Still, researchers do know, in general, that algorithms that work under one set of circumstances sometimes don’t work as well with others. Some studies show that algorithms can work better at the institutions where they’re created than they do at other hospitals.
Many academic research centers that do artificial intelligence and machine learning research are in health care hubs like Massachusetts, California, and New York. Data from California, home to Silicon Valley, was included in about 40 percent of the algorithms. It’s difficult for researchers to get access to data from institutions other than the ones where they work. That may be why the data clusters in this way. Broadening the datasets may be challenging, but identifying the disparity shows that geography is another factor worth tracking in medical algorithms.