    Where Do Loss Functions Come From? | by Yoshimasa | Mar, 2025

By FinanceStarGate | March 6, 2025


    Photograph by Antoine Dautry on Unsplash

When you train a machine learning model, you minimize a loss function. But have you ever wondered why we use the ones we do? Why is Mean Squared Error (MSE) so common in regression? Why does Cross-Entropy Loss dominate classification? Are loss functions just arbitrary choices, or do they have deeper mathematical roots?

It turns out that many loss functions aren’t simply invented: they emerge naturally from probability theory. But not all of them. Some loss functions defy probabilistic intuition and are designed purely for optimization.

Let’s start with a simple example. Suppose we’re predicting house prices with a regression model. The most common way to measure error is the Mean Squared Error (MSE) loss:
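Writing yᵢ for the observed value and ŷᵢ for the model’s prediction on the i-th of n examples, it takes the standard form:

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$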

At first glance, this just looks like a mathematical way of measuring how far our predictions are from reality. But why squared error? Why not absolute error? Why not cubed error?

Probability Density Function (PDF)

If we assume that the errors in our model follow a normal distribution:
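That is, writing xᵢ for an observation, μ for its mean under the model, and σ^2 for the error variance:

$$x_i \sim \mathcal{N}\!\left(\mu,\ \sigma^2\right)$$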

Then the probability density function (PDF) is:
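$$p(x_i \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$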

Likelihood Function

If we observe a number of independent data points x1, x2, …, xn, then their joint probability (the likelihood function) is:
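$$L(\mu, \sigma^2) = \prod_{i=1}^{n} p(x_i \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$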

Since we typically work with log-likelihoods for easier optimization:
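$$\log L(\mu, \sigma^2) = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i - \mu\right)^2$$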

Deriving the Loss Function

Now, to turn this into a loss function, we negate the log-likelihood (since optimizers minimize loss rather than maximize likelihood):
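$$\mathcal{L}(\mu, \sigma^2) = -\log L(\mu, \sigma^2) = \frac{n}{2}\log\!\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i - \mu\right)^2$$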

If we assume σ^2 is constant, the terms that do not depend on μ can be dropped, and (up to a constant factor) the loss function simplifies to:
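$$\mathcal{L}(\mu) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2$$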

which is just Mean Squared Error (MSE), with the model’s prediction playing the role of μ.
MSE isn’t merely a convention: it is the result of assuming normally distributed errors. This means we implicitly assume a Gaussian noise model every time we minimize MSE.

If we don’t assume a fixed variance, we get a slightly different loss function:
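$$\mathcal{L}(\mu, \sigma^2) = \frac{n}{2}\log \sigma^2 + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i - \mu\right)^2 + \text{const.}$$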

This extra term, log σ^2, means that the optimal values of μ and σ^2 are learned jointly, rather than under an assumed fixed variance.

If we treat σ^2 as unknown, we move toward heteroscedastic models, which allow different levels of uncertainty for different predictions.
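To make this concrete, here is a minimal NumPy sketch (an illustration, not code from the original post) of the Gaussian negative log-likelihood when the model is allowed to predict a per-example variance; the function name and array shapes are assumptions made for this example.

```python
import numpy as np

def gaussian_nll(y, mu, sigma2):
    """Average negative log-likelihood of y under N(mu, sigma2).

    y, mu, sigma2 are 1-D arrays of the same length; sigma2 must be positive.
    With a single shared sigma2 this reduces (up to constants) to MSE;
    letting the model output sigma2 per example gives a heteroscedastic loss.
    """
    return np.mean(0.5 * np.log(2 * np.pi * sigma2) + (y - mu) ** 2 / (2 * sigma2))

# Toy usage: identical residuals, different uncertainty estimates.
y = np.array([3.0, -1.0, 2.5])
mu = np.array([2.5, -0.5, 2.0])
print(gaussian_nll(y, mu, np.full(3, 1.0)))            # homoscedastic (fixed variance)
print(gaussian_nll(y, mu, np.array([0.5, 2.0, 1.0])))  # heteroscedastic (per-example variance)
```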

    Cross-Entropy Loss

For classification problems, we often minimize Cross-Entropy Loss, which comes from the Bernoulli or Categorical likelihood function.

    For binary classification:
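With labels yᵢ ∈ {0, 1} and predicted probabilities ŷᵢ, the standard form over n examples is:

$$\mathcal{L}_{\mathrm{CE}} = -\frac{1}{n}\sum_{i=1}^{n}\left[\,y_i \log \hat{y}_i + \left(1 - y_i\right)\log\!\left(1 - \hat{y}_i\right)\right]$$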

This arises naturally from the likelihood of data drawn from a Bernoulli distribution.

So far, we’ve seen that many loss functions arise naturally from likelihood functions. But not all of them. Some are designed for optimization efficiency, robustness, or task-specific needs.

    Hinge Loss (SVMs)

Most classification loss functions, like cross-entropy loss, come from a probabilistic framework. But Hinge Loss, the core loss function in Support Vector Machines (SVMs), is different.

Instead of modeling likelihood, it focuses on maximizing the margin between classes.

If we have labels y ∈ {−1, +1} and a model making predictions f(x), the hinge loss is:
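$$\mathcal{L}_{\mathrm{hinge}} = \max\!\left(0,\ 1 - y\,f(x)\right)$$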

If yf(x) ≥ 1 → no loss (correct classification with a margin).

If yf(x) < 1 → the loss increases linearly (misclassified, or correct but too close to the decision boundary).
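As a quick check of this piecewise behavior, here is a minimal NumPy sketch (again illustrative, not from the original post):

```python
import numpy as np

def hinge_loss(y, fx):
    """Hinge loss max(0, 1 - y * f(x)) for labels y in {-1, +1} and scores f(x)."""
    return np.maximum(0.0, 1.0 - y * fx)

y = np.array([+1, +1, -1, -1])
fx = np.array([2.0, 0.3, -0.5, 1.0])   # model scores f(x)
print(hinge_loss(y, fx))               # [0.  0.7 0.5 2. ] -- zero only when y*f(x) >= 1
```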


