
    Where Do Loss Functions Come From? | by Yoshimasa | Mar, 2025

    March 6, 2025


    Photograph by Antoine Dautry on Unsplash

    When you train a machine learning model, you minimize a loss function, but have you ever wondered why we use the ones we do? Why is Mean Squared Error (MSE) so common in regression? Why does Cross-Entropy Loss dominate classification? Are loss functions just arbitrary choices, or do they have deeper mathematical roots?

    It turns out that many loss functions aren’t simply invented; they emerge naturally from probability theory. But not all of them. Some loss functions defy probabilistic intuition and are designed purely for optimization.

    Let’s start with a simple example. Suppose we’re predicting house prices using a regression model. The most common way to measure error is the Mean Squared Error (MSE) loss:
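
    MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)^2

    where y_i is the true value and ŷ_i is the model’s prediction for the i-th example.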

    At first glance, this just looks like a mathematical way to measure how far our predictions are from reality. But why squared error? Why not absolute error? Why not cubed error?

    Probability Density Function (PDF)

    If we assume that the errors in our model follow a normal distribution:
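
    x_i ∼ N(μ, σ^2)

    i.e., each observation x_i is the underlying value μ plus Gaussian noise with variance σ^2.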

    Then the probability density function (PDF) is:
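
    p(x_i | μ, σ^2) = (1 / √(2πσ^2)) · exp(−(x_i − μ)^2 / (2σ^2))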

    Likelihood Function

    If we observe a set of independent data points x_1, x_2, …, x_n, then their joint probability (the likelihood function) is:
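
    L(μ, σ^2) = ∏_{i=1}^{n} (1 / √(2πσ^2)) · exp(−(x_i − μ)^2 / (2σ^2))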

    Since we typically work with log-likelihoods for easier optimization:
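
    log L(μ, σ^2) = −(n/2) log(2πσ^2) − (1/(2σ^2)) Σ_{i=1}^{n} (x_i − μ)^2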

    Deriving the Loss Function

    Now, to turn this into a loss function, we negate the log-likelihood (since optimizers minimize loss rather than maximize likelihood):
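
    Loss(μ, σ^2) = −log L(μ, σ^2) = (n/2) log(2πσ^2) + (1/(2σ^2)) Σ_{i=1}^{n} (x_i − μ)^2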

    If we assume σ^2 is constant, the loss function simplifies to:
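
    Loss(μ) ∝ Σ_{i=1}^{n} (x_i − μ)^2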

    which is just Mean Squared Error (MSE), up to constant factors that don’t change the minimizer.
    MSE isn’t an arbitrary choice; it’s the result of assuming normally distributed errors. This means we implicitly assume a Gaussian distribution every time we minimize MSE.

    If we don’t assume a fixed variance, we get a slightly different loss function:
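
    Loss(μ, σ^2) = (n/2) log σ^2 + (1/(2σ^2)) Σ_{i=1}^{n} (x_i − μ)^2 + const.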

    This extra term, log σ^2, means that the optimal values of μ and σ^2 are found jointly, rather than the variance being assumed fixed.

    If we treat σ^2 as unknown, we move toward heteroscedastic models, which allow for different levels of uncertainty across different predictions.
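
    As a minimal sketch of the difference (hypothetical NumPy code; the function names, example data, and the choice to predict a per-example log-variance are just for illustration):

        import numpy as np

        def mse(y, mu):
            # Fixed-variance view: plain mean squared error.
            return np.mean((y - mu) ** 2)

        def gaussian_nll(y, mu, log_var):
            # Heteroscedastic view: negative Gaussian log-likelihood per example,
            # 0.5 * log(sigma^2) + (y - mu)^2 / (2 * sigma^2), constants dropped.
            var = np.exp(log_var)
            return np.mean(0.5 * log_var + 0.5 * (y - mu) ** 2 / var)

        y = np.array([3.1, 2.9, 4.2])          # observed targets
        mu = np.array([3.0, 3.0, 4.0])         # predicted means
        log_var = np.array([-1.0, 0.0, 1.0])   # predicted per-example log-variances
        print(mse(y, mu))
        print(gaussian_nll(y, mu, log_var))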

    Cross-Entropy Loss

    For classification problems, we often minimize Cross-Entropy Loss, which comes from the Bernoulli or Categorical likelihood function.

    For binary classification:
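
    L = −(1/n) Σ_{i=1}^{n} [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ]

    where y_i ∈ {0, 1} is the true label and ŷ_i is the predicted probability of the positive class.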

    This arises naturally from the likelihood of the data under a Bernoulli distribution.
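
    For a single label, the Bernoulli likelihood is p(y_i | ŷ_i) = ŷ_i^{y_i} (1 − ŷ_i)^{1 − y_i}; taking the negative log and averaging over the data gives exactly the cross-entropy expression above. Just as MSE is the negative Gaussian log-likelihood, cross-entropy is the negative Bernoulli log-likelihood.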

    So far, we’ve seen that many loss functions arise naturally from likelihood functions. But not all of them. Some are designed for optimization efficiency, robustness, or task-specific needs.

    Hinge Loss (SVMs)

    Most classification loss functions, like cross-entropy loss, come from a probabilistic framework. But Hinge Loss, the core loss function in Support Vector Machines (SVMs), is different.

    Instead of modeling likelihood, it focuses on maximizing the margin between classes.

    If we have labels y ∈ {−1, +1} and a model making predictions f(x), the hinge loss is:
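
    L(y, f(x)) = max(0, 1 − y·f(x))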

    If yf(x) ≥ 1 → no loss (correct classification with a sufficient margin).

    If yf(x) < 1 → the loss increases linearly (the prediction is wrong or too close to the boundary).
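
    A minimal sketch of these two cases (hypothetical NumPy code; the names and example scores are just for illustration):

        import numpy as np

        def hinge_loss(y, fx):
            # Zero when y * f(x) >= 1; grows linearly as the margin shrinks or flips sign.
            return np.mean(np.maximum(0.0, 1.0 - y * fx))

        y = np.array([+1.0, -1.0, +1.0])
        fx = np.array([2.0, 0.5, 0.3])   # margins y*f(x): 2.0, -0.5, 0.3
        print(hinge_loss(y, fx))         # only the last two examples contribute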


