
Researchers reduce bias in AI models while preserving or improving accuracy | MIT News


Machine-learning models can fail when they try to make predictions for individuals who were underrepresented in the datasets they were trained on.

For instance, a model that predicts the best treatment option for someone with a chronic disease may be trained using a dataset that contains mostly male patients. That model might make incorrect predictions for female patients when deployed in a hospital.

To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, hurting the model's overall performance.
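
For contrast, conventional balancing might look like the following minimal sketch, which downsamples every subgroup to the size of the smallest one. The `groups` label array and the sampling details are illustrative assumptions, not any specific library's API:

```python
import numpy as np

def balance_by_subgroup(groups, seed=0):
    """Downsample every subgroup to the size of the smallest one.

    groups : (n,) integer subgroup labels for the training points
    Returns the indices of a balanced subset of the training set.
    """
    rng = np.random.default_rng(seed)
    labels, counts = np.unique(groups, return_counts=True)
    target = counts.min()  # the smallest subgroup sets the budget
    keep = []
    for g in labels:
        idx = np.flatnonzero(groups == g)
        keep.append(rng.choice(idx, size=target, replace=False))
    return np.concatenate(keep)
```

Note how much data this can discard: if one subgroup has 1,000 examples and another has 100,000, balancing throws away 99,000 points from the larger group.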

MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model's failures on minority subgroups. By removing far fewer datapoints than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.

In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more prevalent than labeled data in many applications.

This method could be combined with other approaches to improve the fairness of machine-learning models deployed in high-stakes situations. For example, it might someday help ensure that underrepresented patients aren't misdiagnosed due to a biased AI model.

"Many other algorithms that try to address this issue assume each datapoint matters as much as every other datapoint. In this paper, we are showing that assumption is not true. There are specific points in our dataset that are contributing to this bias, and we can find those data points, remove them, and get better performance," says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of a paper on this technique.

She wrote the paper with co-lead authors Saachi Jain PhD '24 and fellow EECS graduate student Kristian Georgiev; Andrew Ilyas MEng '18, PhD '23, a Stein Fellow at Stanford University; and senior authors Marzyeh Ghassemi, an associate professor in EECS and a member of the Institute of Medical Engineering Sciences and the Laboratory for Information and Decision Systems, and Aleksander Madry, the Cadence Design Systems Professor at MIT. The research will be presented at the Conference on Neural Information Processing Systems.

Removing bad examples

Often, machine-learning models are trained using huge datasets gathered from many sources across the internet. These datasets are far too large to be carefully curated by hand, so they may contain bad examples that hurt model performance.

Scientists also know that some data points impact a model's performance on certain downstream tasks more than others.

The MIT researchers combined these two ideas into an approach that identifies and removes these problematic datapoints. They seek to solve a problem known as worst-group error, which occurs when a model underperforms on minority subgroups in a training dataset.
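
Concretely, worst-group error is just the highest error rate among the subgroups. A short sketch of how it might be measured, where the array-based interface is an illustrative assumption:

```python
import numpy as np

def worst_group_error(y_true, y_pred, groups):
    """Return the highest per-subgroup error rate and that subgroup's label."""
    worst_err, worst_g = -1.0, None
    for g in np.unique(groups):
        mask = groups == g
        err = np.mean(y_pred[mask] != y_true[mask])  # error rate within subgroup g
        if err > worst_err:
            worst_err, worst_g = err, g
    return worst_err, worst_g
```

A model can score well on average while this number stays high, which is exactly the failure mode the researchers target.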

The researchers' new technique is driven by prior work in which they introduced a method, called TRAK, that identifies the most important training examples for a specific model output.

For this new technique, they take incorrect predictions the model made about minority subgroups and use TRAK to identify which training examples contributed the most to each incorrect prediction.

"By aggregating this information across bad test predictions in the right way, we are able to find the specific parts of the training that are driving worst-group accuracy down overall," Ilyas explains.

Then they remove those specific samples and retrain the model on the remaining data.
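
A minimal sketch of that loop follows. Here `train_model` and `compute_attributions` are stand-ins for the training routine and a TRAK-style data-attribution call; both interfaces, the `minority_groups` argument, and the choice of `k` are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def prune_and_retrain(train_X, train_y, test_X, test_y, test_groups,
                      minority_groups, train_model, compute_attributions,
                      k=1000):
    """Drop the k training points most responsible for minority-group
    errors, then retrain on the rest. A sketch, not the authors' code."""
    model = train_model(train_X, train_y)
    preds = model.predict(test_X)

    # 1. Collect the model's incorrect predictions on minority subgroups.
    bad = np.flatnonzero((preds != test_y) & np.isin(test_groups, minority_groups))

    # 2. Attribute each bad prediction back to the training set and
    #    aggregate the influence scores (TRAK-style; placeholder interface).
    scores = np.zeros(len(train_X))
    for i in bad:
        scores += compute_attributions(model, test_X[i])  # shape: (n_train,)

    # 3. Remove the k training points with the highest aggregate (most
    #    harmful) scores and retrain on what remains.
    keep = np.argsort(scores)[:-k]
    return train_model(train_X[keep], train_y[keep])
```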

Since having more data usually yields better overall performance, removing just the samples that drive worst-group failures maintains the model's overall accuracy while boosting its performance on minority subgroups.

A more accessible approach

Across three machine-learning datasets, their method outperformed multiple techniques. In one instance, it boosted worst-group accuracy while removing about 20,000 fewer training samples than a conventional data balancing method. Their technique also achieved higher accuracy than methods that require making changes to the inner workings of a model.

Because the MIT method involves changing the dataset instead, it would be easier for a practitioner to use and can be applied to many types of models.

It can also be used when bias is unknown because subgroups in a training dataset are not labeled. By identifying the datapoints that contribute most to a feature the model is learning, practitioners can understand the variables it is using to make a prediction.
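
One way such an audit might look in practice is to rank training points by their attribution score for a given prediction and inspect the top of the list for an unwanted shared trait. A hypothetical sketch, reusing the same placeholder attribution interface as above:

```python
import numpy as np

def top_influencers(model, example, train_data, compute_attributions, k=10):
    """Surface the k training points that most influence one prediction,
    so a practitioner can inspect them for a shared, possibly spurious trait."""
    scores = compute_attributions(model, example)  # (n_train,) placeholder API
    top = np.argsort(scores)[::-1][:k]             # indices of the highest scores
    return [(int(i), train_data[i]) for i in top]
```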

"This is a tool anyone can use when they are training a machine-learning model. They can look at those datapoints and see whether they are aligned with the capability they are trying to teach the model," says Hamidieh.

Using the technique to detect unknown subgroup bias would require intuition about which groups to look for, so the researchers hope to validate it and explore it more fully through future human studies.

They also want to improve the performance and reliability of their technique and ensure the method is accessible and easy to use for practitioners who could someday deploy it in real-world environments.

"When you have tools that let you critically look at the data and figure out which datapoints are going to lead to bias or other undesirable behavior, it gives you a first step toward building models that are going to be more fair and more reliable," Ilyas says.

This work is funded, in part, by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency.


