
Papers Explained 377: Fathom-R1. Fathom-R1-14B is a 14-billion-parameter… | by Ritvik Rastogi | May, 2025

May 30, 2025


Fathom-R1-14B is a 14-billion-parameter reasoning language model derived from Deepseek-R1-Distilled-Qwen-14B, fine-tuned for mathematical reasoning by Fractal.

The models and datasets are available at HuggingFace.

We begin by curating a high-quality mathematical corpus from the following open-source datasets:

    • Open-R1 — default subset
    • Numina — Olympiads & AOPS_forum (word problems, float-type answers)

After rigorous deduplication and decontamination, roughly 100K unique problems are consolidated, forming the initial corpus for all subsequent training.
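The paper's exact cleaning pipeline isn't reproduced in this summary; below is a minimal sketch of what hash-based deduplication plus n-gram decontamination against a benchmark set could look like. All function names and the n-gram size are illustrative assumptions, not the authors' implementation.

```python
import hashlib
import re

def normalize(problem: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences don't hide duplicates."""
    return re.sub(r"\s+", " ", problem.lower()).strip()

def deduplicate(problems: list[str]) -> list[str]:
    """Keep only the first occurrence of each normalized problem."""
    seen, unique = set(), []
    for p in problems:
        h = hashlib.sha256(normalize(p).encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(p)
    return unique

def decontaminate(problems: list[str], eval_set: list[str], n: int = 8) -> list[str]:
    """Drop any problem that shares an n-gram with a benchmark question."""
    def ngrams(text: str) -> set:
        toks = normalize(text).split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    eval_grams = set().union(*(ngrams(q) for q in eval_set)) if eval_set else set()
    return [p for p in problems if not (ngrams(p) & eval_grams)]
```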

Training Recipe for Fathom-R1-14B-v0.6

SFT on difficult questions and their reasoning chains has proven effective for enhancing reasoning ability. Building on this, this training stage aims to improve the model's performance on challenging mathematical problems using an iterative curriculum learning strategy, with a maximum sequence length of 16K. Curriculum learning (CL) is a well-established method for training LLMs, in which the model is progressively exposed to increasingly difficult tasks. This approach helps scaffold more complex reasoning, improving generalization and reducing overfitting. Here, CL is applied iteratively, meaning multiple rounds of CL are performed.
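The iterative easy-to-hard ordering can be sketched as a data schedule. This is a toy illustration, assuming a per-example difficulty score and a fixed number of stages and iterations (none of which are specified at this granularity in the summary):

```python
def curriculum_schedule(examples, difficulty, n_stages=3, n_iterations=2):
    """Order training data for iterative curriculum learning: within
    each iteration the model sees stages from easiest to hardest, and
    the whole easy-to-hard pass is repeated n_iterations times."""
    ordered = sorted(examples, key=difficulty)
    stage_size = -(-len(ordered) // n_stages)  # ceiling division
    stages = [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]
    schedule = []
    for iteration in range(n_iterations):
        for stage in stages:
            schedule.append((iteration, stage))  # train on this stage next
    return schedule
```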

For dataset preparation, each question's difficulty is annotated using OpenAI's o3-mini model. Only questions rated above average are retained, and these are further filtered to those with solve rates between 0.2 and 0.7. This process yields the Iterative Curriculum Learning dataset, comprising 5K examples.
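The two filters described above compose straightforwardly. A minimal sketch, assuming each question carries a numeric difficulty annotation and an empirical solve rate (field names are illustrative):

```python
def build_curriculum_dataset(questions, min_rate=0.2, max_rate=0.7):
    """Keep questions rated above-average difficulty whose empirical
    solve rate falls within [min_rate, max_rate]."""
    ratings = [q["difficulty"] for q in questions]
    avg = sum(ratings) / len(ratings)
    return [
        q for q in questions
        if q["difficulty"] > avg and min_rate <= q["solve_rate"] <= max_rate
    ]
```

Keeping solve rates in a middle band discards both trivial questions (no training signal) and near-impossible ones (noisy or unlearnable signal).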

Training Recipe for Fathom-R1-14B-v0.4-RS

The strategy for creating this checkpoint involves a two-stage pipeline:

First Stage (leveraging RL for efficient test-time thinking):

    1. Curate a seed dataset that guarantees minimal reward while leaving room for growth, comprising questions with solve rates within a specific range; this forms the 7.7K-question RL Compression dataset.
    2. Train the base model, DeepSeek-R1-Distill-Qwen-14B, using the GRPO algorithm with a 6K-token sequence-length limit.
    3. The model learns to generate concise responses, showing improved performance at lower token limits.
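The core mechanism of this stage can be sketched as follows. The exact reward design isn't given in this summary, so the snippet assumes a binary correctness reward with responses over the 6K-token limit scoring zero, combined with GRPO's standard group-relative advantage normalization (function names are illustrative):

```python
import statistics

def compression_reward(is_correct: bool, n_tokens: int, limit: int = 6000) -> float:
    """Assumed reward: responses exceeding the token limit score 0;
    otherwise 1 for a correct final answer, else 0. Under this scheme
    the only way to earn reward is to finish within the budget."""
    if n_tokens > limit:
        return 0.0
    return 1.0 if is_correct else 0.0

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO advantage: each sampled response's reward, normalized by
    the mean and standard deviation of its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0] * len(rewards)  # no signal if all rewards are equal
    return [(r - mean) / std for r in rewards]
```

Because GRPO normalizes within a group of samples for the same question, a correct-and-concise response earns a positive advantage exactly when its peers ramble past the limit or answer incorrectly.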

Second Stage (leveraging SFT to improve reasoning efficiently at higher sequence length):

    1. Build upon the RL checkpoint and perform SFT with a 16K context window to enhance detailed reasoning for complex problems.
    2. Curate a dataset of hard problems with lower solve rates, forming the 9.5K-example SFT Shortest Chains dataset.
    3. Supervised fine-tuning on this dataset stabilizes the model's reasoning at up to 16K sequence length.
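The "shortest chains" idea — keeping the most concise correct solution per problem as the SFT target — can be sketched like this, assuming multiple sampled solutions per problem with a correctness label (the record fields are illustrative):

```python
def shortest_correct_chains(samples):
    """From multiple sampled solutions per problem, keep the shortest
    reasoning chain that still reaches the correct answer; these become
    the supervised fine-tuning targets."""
    best = {}
    for s in samples:  # s: {"problem": str, "chain": str, "correct": bool}
        if not s["correct"]:
            continue  # incorrect chains are never used as SFT targets
        key = s["problem"]
        if key not in best or len(s["chain"]) < len(best[key]["chain"]):
            best[key] = s
    return list(best.values())
```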

The resulting model, Fathom-R1-14B-v0.4-RS, is optimized for concise yet accurate mathematical reasoning.

Training Recipe for Fathom-R1-14B-v0.4

Given the performance improvement observed during the second fine-tuning stage of developing Fathom-R1-14B-v0.4-RS, and in an attempt to further reduce cost, an experiment was conducted that eliminated RL and directly performed the second-stage SFT on the Deepseek-R1-Distilled-Qwen-14B base model.

Model Merging

Since the v0.6 and v0.4 models were developed with different training methodologies, linear merging is performed to combine their strengths and obtain the final two checkpoints:

    • Fathom-R1-14B: obtained by merging Fathom-R1-14B-V0.6 (Iterative Curriculum SFT) and Fathom-R1-14B-V0.4 (SFT-Shortest-Chains)
    • Fathom-R1-14B-RS: obtained by merging Fathom-R1-14B-V0.6 (Iterative Curriculum SFT) and Fathom-R1-14B-V0.4 (RL-compression + SFT-Shortest-Chains)
Results

    • Fathom-R1-14B demonstrates highly competitive performance across all datasets, improving over the original R1-distilled models while closely matching or surpassing other strong baselines in several settings.
    • On both AIME 25 and HMMT 25, the model shows the highest pass@1 as well as cons@64 scores among all open-source models (including the larger R1-Distilled-32B model), with R1-670B being the only exception.
    • Fathom-R1-14B is superior to the first two generations of OpenAI's mini reasoning models, including o1-mini and o3-mini-low, and its performance closely matches that of the newly released o4-mini-low (self-consistency decoding).
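Linear merging of two checkpoints is a parameter-wise interpolation. A dependency-free sketch on plain Python lists rather than real tensors, with the 0.5 merge weight as an illustrative assumption (the authors' actual weights aren't given in this summary):

```python
def linear_merge(state_a, state_b, weight_a=0.5):
    """Linearly interpolate two model state dicts parameter by
    parameter: merged = weight_a * A + (1 - weight_a) * B."""
    assert state_a.keys() == state_b.keys(), "checkpoints must share an architecture"
    weight_b = 1.0 - weight_a
    return {
        name: [weight_a * a + weight_b * b for a, b in zip(state_a[name], state_b[name])]
        for name in state_a
    }
```

In practice this would run over PyTorch state dicts (or via a merging tool), but the arithmetic is exactly this simple: both parents must share the same architecture so parameters align name by name.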

Paper: Fathom-R1: $499 Training Recipe for Unlocking Math Reasoning at o4-mini level with just 14B parameters under 16K context



