
    Papers Explained 377: Fathom-R1 | by Ritvik Rastogi | May, 2025

    By FinanceStarGate | May 30, 2025 | 3 Mins Read


    Fathom-R1-14B is a 14-billion-parameter reasoning language model derived from DeepSeek-R1-Distill-Qwen-14B, fine-tuned for mathematical reasoning by Fractal.

    The models and datasets are available at HuggingFace.

    We begin by curating a high-quality mathematical corpus from the following open-source datasets:

    • Open-R1: default subset
    • Numina: Olympiads & AOPS_forum (word problems, float-type answers)

    After rigorous deduplication and decontamination, roughly 100K unique problems are consolidated, forming the initial corpus for all subsequent training runs.
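The article does not say how deduplication and decontamination were done. As a rough illustration only, the sketch below assumes exact matching after a crude text normalization; the `normalize` helper, the sample problems, and the `held_out` benchmark list are all hypothetical.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical problems match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def dedup_and_decontaminate(problems: list[str], benchmark: list[str]) -> list[str]:
    """Drop problems that repeat an earlier one or match a benchmark item after normalization."""
    blocked = {normalize(b) for b in benchmark}  # evaluation items that must not leak into training
    seen: set[str] = set()
    kept = []
    for p in problems:
        key = normalize(p)
        if key in blocked or key in seen:
            continue
        seen.add(key)
        kept.append(p)
    return kept

# Hypothetical data, for illustration only.
corpus = ["Find x if 2x = 6.", "find x if 2x=6", "Compute 3 + 4.", "What is 10 / 2?"]
held_out = ["What is 10/2?"]
print(dedup_and_decontaminate(corpus, held_out))
```

This normalization is lossy: it discards operators, so distinct problems can collide. A real pipeline would more likely use fuzzy or n-gram matching.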

    Training Recipe for Fathom-R1-14B-v0.6

    SFT on difficult questions and their reasoning chains has proven effective for improving reasoning ability. Building on this, this training stage aims to improve the model's performance on challenging mathematical problems using an iterative curriculum learning strategy, with a maximum sequence length of 16K. Curriculum learning (CL) is a well-established method for training LLMs, in which the model is progressively exposed to increasingly difficult tasks. This approach helps scaffold more complex reasoning, improving generalization and reducing overfitting. In this case, CL is carried out iteratively, meaning multiple iterations of CL are performed.
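One way to picture an iterative curriculum is as an easy-to-hard pass over the data that is replayed several times. The sketch below rests on assumptions the article does not confirm: the stage count, the iteration count, and the `difficulty` field are hypothetical.

```python
def curriculum_stages(examples, num_stages):
    """Sort by difficulty and split into at most num_stages contiguous stages, easiest first."""
    ordered = sorted(examples, key=lambda ex: ex["difficulty"])
    if not ordered:
        return []
    size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def iterative_curriculum(examples, num_stages, num_iterations):
    """Yield (iteration, stage, batch); every iteration replays the easy-to-hard pass."""
    for iteration in range(num_iterations):
        for stage, batch in enumerate(curriculum_stages(examples, num_stages)):
            yield iteration, stage, batch

# Hypothetical examples with a numeric difficulty score.
data = [{"id": i, "difficulty": d} for i, d in enumerate([5, 1, 3, 2, 6, 4])]
for iteration, stage, batch in iterative_curriculum(data, num_stages=3, num_iterations=2):
    print(iteration, stage, [ex["difficulty"] for ex in batch])
```

In an actual training loop, each yielded batch would be passed to an SFT step. The sketch shows only the ordering.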

    For dataset preparation, each question's difficulty is annotated using OpenAI's o3-mini model. Only questions rated above average are retained and further filtered to include those with solve rates between 0.2 and 0.7. This process results in the Iterative Curriculum Learning dataset, comprising 5K examples.
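The two filters described above can be sketched as follows. The article does not say how "above average" is computed or whether the solve-rate bounds are inclusive, so the mean-based threshold, the inclusive bounds, and the field names `difficulty` and `solve_rate` are assumptions.

```python
from statistics import mean

def build_curriculum_dataset(questions, low=0.2, high=0.7):
    """Keep questions rated harder than the mean whose solve rate lies in [low, high]."""
    avg_difficulty = mean(q["difficulty"] for q in questions)
    return [
        q for q in questions
        if q["difficulty"] > avg_difficulty and low <= q["solve_rate"] <= high
    ]

# Hypothetical annotated questions.
questions = [
    {"id": "a", "difficulty": 2, "solve_rate": 0.9},
    {"id": "b", "difficulty": 4, "solve_rate": 0.5},
    {"id": "c", "difficulty": 6, "solve_rate": 0.5},
    {"id": "d", "difficulty": 8, "solve_rate": 0.1},
    {"id": "e", "difficulty": 9, "solve_rate": 0.3},
]
print([q["id"] for q in build_curriculum_dataset(questions)])
```

Here the mean difficulty is 5.8, so "c", "d", and "e" pass the first filter, and "d" is then dropped because its solve rate falls below 0.2.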

    Training Recipe for Fathom-R1-14B-v0.4-RS

    The strategy for creating this checkpoint involves a two-stage pipeline:

    First Stage (Leveraging RL for efficient test-time thinking):

    1. Curate a seed dataset ensuring minimal reward but room for growth, comprising questions with solve rates within a specific range, forming a 7.7K question RL Compression dataset.
    2. Train the base model, DeepSeek-R1-Distill-Qwen-14B, using the GRPO algorithm with a 6K token sequence length limit.
    3. The model learns to generate concise responses, showing improved performance at lower token limits.
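GRPO scores each sampled response relative to the other responses drawn for the same prompt. The sketch below shows only that group-relative advantage step. The reward function is hypothetical (the article does not specify one), and reading "6k" as 6000 tokens is an assumption.

```python
from statistics import mean, pstdev

def length_capped_reward(is_correct: bool, num_tokens: int, max_tokens: int = 6000) -> float:
    """Hypothetical reward: 1.0 only for a correct answer that fits within the token limit."""
    return 1.0 if is_correct and num_tokens <= max_tokens else 0.0

def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize each reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four hypothetical samples for one prompt: (is_correct, num_tokens).
group = [(True, 3200), (True, 7100), (False, 2500), (True, 4800)]
rewards = [length_capped_reward(ok, n) for ok, n in group]
print(rewards)
print(group_relative_advantages(rewards))
```

When every response in a group earns the same reward, all advantages are zero and the prompt contributes no learning signal. That is one plausible reading of why the seed dataset targets questions with some reward but room for growth.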

    Second Stage (Leveraging SFT to improve reasoning efficiently at higher sequence length):

    1. Build upon the RL checkpoint and perform SFT with a 16K context window to enhance detailed reasoning for complex problems.
    2. Curate a dataset of hard problems with lower solve rates, forming a 9.5K example SFT Shortest Chains dataset.
    3. Supervised fine-tuning on this dataset stabilizes the model's reasoning at up to 16K sequence length.
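The name "SFT Shortest Chains" suggests that each hard problem is paired with its shortest correct reasoning chain, although the article does not describe the selection procedure. Under that assumption, and with hypothetical field names, the selection might look like this:

```python
def shortest_correct_chains(samples):
    """Keep, per problem, the shortest sampled solution that reached the right answer."""
    best = {}
    for s in samples:
        if not s["correct"]:
            continue  # incorrect chains are never used as SFT targets
        current = best.get(s["problem_id"])
        if current is None or s["num_tokens"] < current["num_tokens"]:
            best[s["problem_id"]] = s
    return best

# Hypothetical sampled solutions.
samples = [
    {"problem_id": "p1", "correct": True, "num_tokens": 9000},
    {"problem_id": "p1", "correct": True, "num_tokens": 5200},
    {"problem_id": "p1", "correct": False, "num_tokens": 1800},
    {"problem_id": "p2", "correct": False, "num_tokens": 4000},
]
chosen = shortest_correct_chains(samples)
print({pid: s["num_tokens"] for pid, s in chosen.items()})
```

Problems with no correct sample, such as "p2" here, drop out of the resulting dataset.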

    The resulting model, Fathom-R1-14B-v0.4-RS, is optimized for concise yet accurate mathematical reasoning.

    Training Recipe for Fathom-R1-14B-v0.4

    Given the performance improvement observed during the second fine-tuning stage of developing Fathom-R1-14B-v0.4-RS, and in an attempt to further reduce the cost, an experiment was conducted by eliminating RL and directly performing the second-stage SFT on the DeepSeek-R1-Distill-Qwen-14B base model.

    Model Merging

    Given that the v0.6 and v0.4 models were developed by following different training methodologies, linear merging is performed to combine their strengths and obtain the final two checkpoints.
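Linear merging takes a weighted average of two checkpoints, parameter by parameter. The sketch below uses plain Python lists in place of real tensors. The equal 0.5 weighting and the parameter names are assumptions, since the article does not state the merge weights.

```python
def linear_merge(state_a, state_b, weight_a=0.5):
    """Return weight_a * A + (1 - weight_a) * B for every parameter in two checkpoints."""
    if state_a.keys() != state_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name in state_a:
        a, b = state_a[name], state_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]
    return merged

# Hypothetical toy checkpoints with a single flattened parameter each.
v06 = {"layer.weight": [1.0, 3.0]}
v04 = {"layer.weight": [3.0, 5.0]}
print(linear_merge(v06, v04))
```

Both checkpoints must share the same architecture for this to be meaningful. That holds here because both derive from the same base model.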

    • Fathom-R1-14B: Obtained by merging Fathom-R1-14B-v0.6 (Iterative Curriculum SFT) and Fathom-R1-14B-v0.4 (SFT-Shortest-Chains)
    • Fathom-R1-14B-RS: Obtained by merging Fathom-R1-14B-v0.6 (Iterative Curriculum SFT) and Fathom-R1-14B-v0.4-RS (RL-compression + SFT-Shortest-Chains)

    • Fathom-R1-14B demonstrates highly competitive performance across all datasets, improving over the original R1-distilled models while closely matching or surpassing other strong baselines in several settings.
    • On both AIME 25 and HMMT 25, our model shows the highest pass@1 as well as cons@64 scores among all the open-source models (including the bigger R1-Distilled-32B model), with R1-670B being the only exception.
    • Fathom-R1-14B is superior to the first two generations of OpenAI's mini-reasoning models, including o1-mini and o3-mini-low, and its performance closely matches that of the newly released o4-mini-low (self-consistency decoding).
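The article does not define pass@1 or cons@64. The sketch below follows their common usage: pass@1 is estimated as the fraction of sampled answers that are correct, and cons@k scores the majority-voted answer over k samples. The sample answers are hypothetical, and ties are broken by first occurrence, which is also an assumption.

```python
from collections import Counter

def pass_at_1(samples, reference):
    """Fraction of sampled answers that are correct: the expected accuracy of one sample."""
    return sum(answer == reference for answer in samples) / len(samples)

def cons_at_k(samples, reference):
    """1.0 if the majority-voted answer over the k samples is correct, else 0.0."""
    majority, _ = Counter(samples).most_common(1)[0]
    return float(majority == reference)

# Six hypothetical final answers sampled for one problem (k = 6 instead of 64 for brevity).
answers = ["42", "42", "17", "42", "8", "17"]
print(pass_at_1(answers, "42"), cons_at_k(answers, "42"))
```

Benchmark scores are then averages of these per-problem values over all problems in the set.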

    Fathom-R1: $499 Training Recipe for Unlocking Math Reasoning at o4-mini level with just 14B parameters under 16K context


