
    Sentence Transformers, Bi-Encoders And Cross-Encoders | by Shaza Elmorshidy | Mar, 2025



    A sentence transformer (bi-encoder) is a neural network model designed to generate high-quality vector representations (embeddings) for sentences or text fragments. It is based on transformer architectures such as BERT or RoBERTa, but optimized for tasks like semantic similarity, clustering, or retrieval. Unlike traditional transformers, which focus on token-level outputs, sentence transformers produce a fixed-size dense vector for an entire sentence, capturing its semantic meaning.
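    A minimal sketch of that bi-encoder workflow using the sentence-transformers library; the all-MiniLM-L6-v2 checkpoint and the example sentences are illustrative choices, not taken from the article:

```python
from sentence_transformers import SentenceTransformer, util

# Load a pretrained bi-encoder; all-MiniLM-L6-v2 is a small general-purpose model.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The stock market rallied after the rate cut.",
    "Equities rose following the central bank's decision.",
    "The recipe calls for two cups of flour.",
]

# Each sentence is mapped independently to a fixed-size dense vector.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the embeddings measures semantic closeness.
similarities = util.cos_sim(embeddings[0], embeddings[1:])
print(similarities)  # the paraphrase should score higher than the unrelated sentence
```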

    Cross-encoders, on the other hand, take two text inputs (e.g., a query and a candidate response) and process them jointly through a single model to compute a score, usually indicating their relevance or similarity. They achieve higher accuracy because the model can attend to the contextual interactions between the two inputs, but they are computationally expensive because scoring requires processing each pair anew.

    Cross-encoders are often used to re-rank the top-k results returned by a sentence transformer model.
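    A hedged sketch of that retrieve-then-re-rank pattern; the ms-marco cross-encoder checkpoint and the toy corpus are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer, CrossEncoder, util

query = "How do I hedge an equity portfolio?"
corpus = [
    "Short-selling index futures can offset losses in a stock portfolio.",
    "A balanced diet includes proteins, fats, and carbohydrates.",
    "Options such as protective puts are a common hedging instrument.",
    "The VIX measures expected volatility of the S&P 500.",
]

# Stage 1: fast bi-encoder retrieval of the top-k candidates.
bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = bi_encoder.encode(corpus, convert_to_tensor=True)
query_emb = bi_encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=3)[0]

# Stage 2: the slower cross-encoder scores each (query, candidate) pair jointly.
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, corpus[hit["corpus_id"]]) for hit in hits]
scores = cross_encoder.predict(pairs)

# Re-rank the retrieved candidates by cross-encoder score.
for (q, passage), score in sorted(zip(pairs, scores), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {passage}")
```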

    An efficient solution came in 2019 with Nils Reimers and Iryna Gurevych's SBERT (Sentence-BERT); since then, numerous sentence transformer models have been developed and optimized.

    SBERT Architecture

    SBERT (Sentence-BERT) enhances the BERT model by using a siamese architecture, in which two identical BERT networks process two separate sentences independently. This produces an embedding for each sentence, pooled using strategies such as mean pooling. The two sentence embeddings, u and v, are then combined into a single vector that captures their relationship. The simplest combination scheme is (u, v, |u − v|), where |u − v| denotes the element-wise absolute difference.
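    A minimal PyTorch sketch of those two pieces, mean pooling and the (u, v, |u − v|) combination; the tensor shapes and random inputs are illustrative assumptions, not code from the SBERT paper:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the token embeddings of one sentence, ignoring padding tokens."""
    mask = attention_mask.unsqueeze(-1).float()        # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)      # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)           # (batch, 1)
    return summed / counts

# Illustrative BERT-like outputs for two sentence batches (batch=2, seq_len=8, hidden=768).
tokens_a, tokens_b = torch.randn(2, 8, 768), torch.randn(2, 8, 768)
mask_a, mask_b = torch.ones(2, 8), torch.ones(2, 8)

u = mean_pool(tokens_a, mask_a)   # embedding of sentence A
v = mean_pool(tokens_b, mask_b)   # embedding of sentence B

# Combine into a single relationship vector (u, v, |u - v|).
features = torch.cat([u, v, torch.abs(u - v)], dim=-1)   # (batch, 3 * hidden)
print(features.shape)
```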

    Training Process

    SBERT is fine-tuned on tasks like Natural Language Inference (NLI), which involves determining whether one sentence entails, contradicts, or is neutral with respect to another. The training process consists of the following steps (a minimal sketch of steps 3–6 follows the list):

    1. Sentence Embedding: Each sentence in a pair is processed to generate its own embedding.
    2. Concatenation: The embeddings u and v are combined into a single vector (u, v, |u − v|).
    3. Feedforward Neural Network (FFNN): The concatenated vector is passed through an FFNN to produce raw output logits.
    4. Softmax Layer: The logits are normalized into probabilities corresponding to the NLI labels (entailment, contradiction, or neutral).
    5. Cross-Entropy Loss: The predicted probabilities are compared with the gold labels using the cross-entropy loss function, which penalizes incorrect predictions.
    6. Optimization: The loss is minimized through backpropagation, adjusting the model's parameters to improve accuracy on the training task.
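    A minimal sketch of steps 3–6 under the same illustrative assumptions; a single linear layer stands in for the classification head, and the labels are random placeholders rather than real NLI data:

```python
import torch
import torch.nn as nn

hidden, batch = 768, 2

# Concatenated (u, v, |u - v|) features, as built in the previous sketch.
features = torch.randn(batch, 3 * hidden, requires_grad=True)

# Step 3: feedforward head producing raw logits for the three NLI classes.
classifier = nn.Linear(3 * hidden, 3)
logits = classifier(features)

# Steps 4-5: softmax + cross-entropy (nn.CrossEntropyLoss applies the softmax internally).
labels = torch.tensor([0, 2])   # placeholder gold labels: entailment, neutral
loss = nn.CrossEntropyLoss()(logits, labels)

# Step 6: backpropagate and update the parameters (in real SBERT training,
# the BERT encoder's parameters are updated as well).
optimizer = torch.optim.AdamW(classifier.parameters(), lr=2e-5)
loss.backward()
optimizer.step()
print(float(loss))
```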

    Pretrained models and their evaluations are listed here: Pretrained Models — Sentence Transformers documentation

    • General Purpose Models: These include variants of BERT, RoBERTa, DistilBERT, and XLM-R that are fine-tuned for sentence-level tasks. Examples:
      – The all-* models were trained on all available training data (more than 1 billion training pairs) and are designed as general purpose models. The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is five times faster and still offers good quality.
    • Multilingual Models: These models support multiple languages, making them ideal for multilingual and cross-lingual tasks. Examples:
      distiluse-base-multilingual-cased-v2
      xlm-r-100langs-bert-base-nli-stsb
    • Domain-Specific Models: Models fine-tuned on specific domains or datasets, such as biomedical text, financial documents, or legal text. Examples:
      biobert-sentence-transformer: specialized for biomedical literature.
      – Custom fine-tuned models available through Hugging Face or Sentence Transformers for niche domains.
    • Multimodal Models: These models can handle inputs beyond text, such as images and text combined, making them useful for applications like image captioning, visual question answering, and cross-modal retrieval (see the sketch after this list). Examples:
      clip-ViT-B-32: integrates visual and textual inputs for tasks that involve both modalities, such as finding images based on textual queries.
      image-text-matching: a specialized model for matching text descriptions with relevant images.
    • Task-Specific Models: Pre-trained for tasks like semantic search, clustering, and classification. Examples:
      msmarco-MiniLM-L12-v2: optimized for information retrieval and search tasks.
      nli-roberta-base-v2: designed for natural language inference.
    • Custom Fine-Tuned Models: Users can train their own models on specific datasets using Sentence Transformers' training utilities. This allows adaptation to highly specialized use cases.
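    For the multimodal case mentioned above, a hedged sketch using the clip-ViT-B-32 checkpoint; the image path and captions are hypothetical placeholders:

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# clip-ViT-B-32 embeds both images and text into a shared vector space.
model = SentenceTransformer("clip-ViT-B-32")

# Hypothetical local image file; replace with any real path.
image_emb = model.encode(Image.open("cat.jpg"), convert_to_tensor=True)

captions = ["a photo of a cat", "a photo of a dog", "a stock price chart"]
text_emb = model.encode(captions, convert_to_tensor=True)

# Cosine similarity ranks which caption best matches the image.
print(util.cos_sim(image_emb, text_emb))
```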

    References:

    What is a Sentence Transformer?

    Index of /docs/sentence_transformer



