Retrieval Augmented Generation in SQLite

This is the second in a two-part series on using SQLite for machine learning. In my last article, I dove into how SQLite is rapidly becoming a production-ready database for web applications. In this article, I’ll discuss how to perform retrieval-augmented generation using SQLite.

If you’d like a custom web application with generative AI integration, visit losangelesaiapps.com

The code referenced in this article can be found here.

When I first learned how to perform retrieval-augmented generation (RAG) as a budding data scientist, I followed the traditional path. This usually looks something like:

Google retrieval-augmented generation and look for tutorials
Find the most popular framework, usually LangChain or LlamaIndex
Find the most popular cloud vector database, usually Pinecone or Weaviate
Read a bunch of docs, put all the pieces together, and success!

In fact, I actually wrote an article about my experience building a RAG system in LangChain with Pinecone.

There’s nothing terribly wrong with using a RAG framework with a cloud vector database. However, I’d argue that for first-time learners it overcomplicates the situation. Do we really need an entire framework to learn how to do RAG? Is it necessary to perform API calls to cloud vector databases? These databases act as black boxes, which is never good for learners (or frankly for anyone).

In this article, I’ll walk you through how to perform RAG on the simplest stack possible. In fact, this ‘stack’ is just SQLite with the sqlite-vec extension and the OpenAI API for use of their embedding and chat models. I recommend you read part 1 of this series to get a deep dive on SQLite and how it is rapidly becoming production ready for web applications. For our purposes here, it is enough to understand that SQLite is the simplest kind of database possible: a single file in your repository.

So ditch your cloud vector databases and your bloated frameworks, and let’s do some RAG.

SQLite-Vec

One of the powers of the SQLite database is the use of extensions. For those of us familiar with Python, extensions are a lot like libraries. They are modular pieces of code written in C to extend the functionality of SQLite, making things that were once impossible possible. One popular example of a SQLite extension is the Full-Text Search (FTS) extension. This extension allows SQLite to perform efficient searches across large volumes of textual data. Because the extension is written purely in C, we can run it anywhere a SQLite database can be run, including Raspberry Pis and browsers.

In this article I will be going over the extension known as sqlite-vec. This gives SQLite the power of performing vector search. Vector search is similar to full-text search in that it allows for efficient search across textual data. However, rather than search for an exact word or phrase in the text, vector search has a semantic understanding. In other words, searching for “horses” will find matches of “equestrian”, “pony”, “Clydesdale”, etc. Full-text search is incapable of this.

sqlite-vec makes use of virtual tables, as do most extensions in SQLite. A virtual table is similar to a regular table, but with additional powers:

Custom Data Sources: The data for a standard table in SQLite is housed in a single db file. For a virtual table, the data can be housed in external sources, for example a CSV file or an API call.
Flexible Functionality: Virtual tables can add specialized indexing or querying capabilities and support complex data types like JSON or XML.
Integration with SQLite Query Engine: Virtual tables integrate seamlessly with SQLite’s standard query syntax, e.g. SELECT, INSERT, UPDATE, and DELETE operations. Ultimately it is up to the writers of the extensions to support these operations.
Use of Modules: The backend logic for how the virtual table will work is implemented by a module (written in C or another language).

The typical syntax for creating a virtual table looks like the following:

CREATE VIRTUAL TABLE my_table USING my_extension_module();

The important part of this statement is my_extension_module(). This specifies the module that will be powering the backend of the my_table virtual table. In sqlite-vec we will use the vec0 module.
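To make the syntax concrete before we get to vec0, here is a minimal sketch that uses the FTS extension mentioned earlier through its fts5 module, with only Python’s built-in sqlite3 library. The table name, the sample sentences, and the fts5_demo helper are made up for illustration, and the sketch assumes your SQLite build ships with FTS5 (it returns None if not).

```python
import sqlite3

def fts5_demo():
    """Create an FTS5 virtual table in memory and run a keyword search."""
    db = sqlite3.connect(":memory:")
    try:
        # Same shape as the statement above: table name, then module name
        db.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
    except sqlite3.OperationalError:
        # This SQLite build was compiled without FTS5
        db.close()
        return None
    db.executemany(
        "INSERT INTO notes (body) VALUES (?)",
        [
            ("Horses graze in the north field",),
            ("The equestrian center opens at nine",),
            ("A Clydesdale pulled the wagon",),
        ],
    )
    # Virtual tables are queried with ordinary SQL; MATCH is handled by the module
    rows = db.execute(
        "SELECT body FROM notes WHERE notes MATCH ?", ("horses",)
    ).fetchall()
    db.close()
    return [body for (body,) in rows]

matches = fts5_demo()
print(matches)
```

On a build with FTS5, only the sentence containing the literal word “horses” should come back. The “equestrian” and “Clydesdale” rows are missed, which is exactly the gap that vector search is meant to close.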

Code Walkthrough

The code for this article can be found here. It is a simple directory with the majority of files being .txt files that we will be using as our dummy data. Because I am a physics nerd, the majority of the files pertain to physics, with just a few files relating to other random fields. I will not present the full code in this walkthrough, but instead will highlight the important pieces. Clone my repo and play around with it to investigate the full code. Below is a tree view of the repo. Note that my_docs.db is the single-file database used by SQLite to manage all of our data.

.
├── data
│   ├── cooking.txt
│   ├── gardening.txt
│   ├── general_relativity.txt
│   ├── newton.txt
│   ├── personal_finance.txt
│   ├── quantum.txt
│   ├── thermodynamics.txt
│   └── travel.txt
├── my_docs.db
├── requirements.txt
└── sqlite_rag_tutorial.py

Step 1 is to install the necessary libraries. Below is our requirements.txt file. As you can see it has only three libraries. I recommend creating a virtual environment with the latest Python version (3.13.1 was used for this article) and then running pip install -r requirements.txt to install the libraries.

# requirements.txt
sqlite-vec==0.1.6
openai==1.63.0
python-dotenv==1.0.1

Step 2 is to create an OpenAI API key if you don’t already have one. We will be using OpenAI to generate embeddings for the text files so that we can perform our vector search.

# sqlite_rag_tutorial.py
import sqlite3
from sqlite_vec import serialize_float32
import sqlite_vec
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load OPENAI_API_KEY from a local .env file, if one exists
load_dotenv()

# Set up OpenAI client
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

Step 3 is to load the sqlite-vec extension into SQLite. We will be using Python and SQL for our examples in this article. Disabling the ability to load extensions immediately after loading your extension is a good security practice.

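# Path to the database file
db_path = "my_docs.db"

# Delete the database file if it exists, so each run starts from a clean slate
if os.path.exists(db_path):
    os.remove(db_path)

# Connect, load sqlite-vec, then disable extension loading again
db = sqlite3.connect(db_path)
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)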

Next we will go ahead and create our virtual table:

db.execute('''
    CREATE VIRTUAL TABLE documents USING vec0(
        embedding float[1536],
        +file_name TEXT,
        +content TEXT
    )
''')

documents is a virtual table with three columns:

embedding : A 1536-dimension float vector that will store the embeddings of our sample documents.
file_name : Text that will house the name of each file we store in the database. Note that this column and the following have a + symbol in front of them. This indicates that they are auxiliary fields. Previously in sqlite-vec only embedding data could be stored in the virtual table. However, recently an update was pushed that allows us to add fields to our table that we don’t really want embedded. In this case we are adding the content and name of the file in the same table as our embeddings. This will allow us to easily see which embeddings correspond to which content while sparing us the need for extra tables and JOIN statements.
content : Text that will store the content of each file.

Now that we have our virtual table set up in our SQLite database, we can begin converting our text files into embeddings and storing them in our table:

# Function to get embeddings using the OpenAI API
def get_openai_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Iterate over .txt files in the /data directory
for file_name in os.listdir("data"):
    file_path = os.path.join("data", file_name)
    with open(file_path, 'r', encoding='utf-8') as file:
        content = file.read()
        # Generate embedding for the content
        embedding = get_openai_embedding(content)
        if embedding:
            # Insert file content and embedding into the vec0 table
            db.execute(
                'INSERT INTO documents (embedding, file_name, content) VALUES (?, ?, ?)',
                (serialize_float32(embedding), file_name, content)
            )

# Commit changes
db.commit()

We essentially loop through each of our .txt files, embedding the content from each file, and then using an INSERT INTO statement to insert the embedding, file_name, and content into the documents virtual table. A commit statement at the end ensures the changes are persisted. Note that we are using serialize_float32 here from the sqlite-vec library. SQLite itself does not have a built-in vector type, so sqlite-vec stores vectors as binary large objects (BLOBs) to save space and allow fast operations. Internally, serialize_float32 uses Python’s struct.pack() function, which converts Python data into C-style binary representations.
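To see what that packing step amounts to, here is a small standard-library sketch. pack_float32 and unpack_float32 are hypothetical stand-ins written for this illustration, not the sqlite-vec functions themselves, and I am assuming the same flat sequence-of-32-bit-floats layout described above.

```python
import struct

def pack_float32(vector):
    """Pack a list of Python floats into a flat BLOB of 32-bit floats."""
    # Format string like "3f" means three consecutive 4-byte floats
    return struct.pack(f"{len(vector)}f", *vector)

def unpack_float32(blob):
    """Inverse of pack_float32: recover the list of floats from the BLOB."""
    count = len(blob) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"{count}f", blob))

vector = [0.5, -1.25, 3.0]  # values chosen to be exactly representable in float32
blob = pack_float32(vector)
print(len(blob))             # 3 floats x 4 bytes each
print(unpack_float32(blob))

# A 1536-dimension embedding therefore takes 1536 * 4 bytes
embedding_blob = pack_float32([0.0] * 1536)
print(len(embedding_blob))
```

The takeaway is that each stored embedding is just a fixed-size run of bytes (6144 of them for a 1536-dimension vector), which is why a plain BLOB is enough to hold it.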

Finally, to perform RAG, you then use the following code to do a K-Nearest-Neighbors (KNN-style) operation. This is the heart of vector search.

# Perform a sample KNN query
query_text = "What is general relativity?"
query_embedding = get_openai_embedding(query_text)

if query_embedding:
    rows = db.execute(
        """
        SELECT
            file_name,
            content,
            distance
        FROM documents
        WHERE embedding MATCH ?
        ORDER BY distance
        LIMIT 3
        """,
        [serialize_float32(query_embedding)]
    ).fetchall()

    print("Top 3 most similar documents:")
    top_contexts = []
    for row in rows:
        print(row)
        top_contexts.append(row[1])  # Append the 'content' column

We begin by taking in a query from the user, in this case “What is general relativity?” and embedding that query using the same embedding model as before. We then perform a SQL operation. Let’s break this down:

The SELECT statement means the retrieved data will have three columns: file_name, content, and distance. The first two we have already discussed. Distance will be calculated during the SQL operation, more on this in a moment.
The FROM statement ensures you are pulling data from the documents table.
The WHERE embedding MATCH ? statement performs a similarity search between all of the vectors in your database and the query vector. The returned data will include a distance column. This distance is just a floating point number measuring how far apart the query and database vectors are. The lower the number, the closer the vectors are. sqlite-vec provides a few options for how to calculate this distance.
The ORDER BY distance clause orders the retrieved vectors in ascending order of distance, i.e. from most similar to least similar.
LIMIT 3 ensures we only get the top three documents that are nearest to our query embedding vector. You can tweak this number to see how retrieving more or fewer vectors affects your results.
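If the MATCH, ORDER BY, and LIMIT combination feels opaque, it can help to see the same idea as a brute-force search in plain Python. This is only a conceptual sketch: the three-dimensional vectors are toy values I made up, the knn and l2_distance helpers are hypothetical, and I am assuming Euclidean (L2) distance, which as far as I know is the default metric for vec0.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, rows, k):
    """Return the k (name, distance) pairs nearest to the query, closest first."""
    scored = [(name, l2_distance(query, vector)) for name, vector in rows]
    scored.sort(key=lambda pair: pair[1])  # ascending: smallest distance first
    return scored[:k]                      # plays the role of LIMIT k

# Toy 3-dimensional "embeddings" standing in for the real 1536-dimension ones
toy_rows = [
    ("general_relativity.txt", [0.9, 0.1, 0.0]),
    ("newton.txt", [0.6, 0.4, 0.1]),
    ("cooking.txt", [0.0, 0.2, 0.9]),
]
query_vector = [1.0, 0.0, 0.0]

nearest = knn(query_vector, toy_rows, k=2)
for name, distance in nearest:
    print(f"{name}: {distance:.3f}")
```

The smallest distance comes first, so the most relevant file sits at the top of the list and the unrelated one never makes the cut.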

Given our query of “What is general relativity?”, the following documents were pulled. It did a pretty good job!

Top 3 most similar documents:

('general_relativity.txt', "Einstein's theory of general relativity redefined our understanding of gravity. Instead of viewing gravity as a force acting at a distance, it interprets it as the curvature of spacetime around massive objects. Light passing near a massive star bends slightly, galaxies deflect beams traveling millions of light-years, and clocks tick at different rates depending on their gravitational potential. This groundbreaking theory led to predictions like gravitational lensing and black holes, phenomena later confirmed by observational evidence, and it continues to guide our understanding of the cosmos.", 0.8316285610198975)

('newton.txt', "In classical mechanics, Newton's laws of motion form the foundation of how we understand the movement of objects. Newton's first law, often called the law of inertia, states that an object at rest remains at rest and an object in motion continues in motion unless acted upon by an external force. This concept extends into more complex physics problems, where analyzing net forces on objects allows us to predict their future trajectories and behaviors. Over time, applying Newton's laws has enabled engineers and scientists to design safer vehicles, more efficient machines, and even guide spacecraft through intricate gravitational fields.", 1.2036118507385254)

('quantum.txt', 'Quantum mechanics revolutionized our understanding of the microscopic world. Unlike classical particles, quantum entities such as electrons can exhibit both wave-like and particle-like behaviors. Phenomena like quantum superposition suggest that particles can exist in multiple states at once, and the act of measurement often "collapses" these states into one observed outcome. This strange and counterintuitive theory underpins modern technologies like semiconductors and lasers, and it provides a conceptual framework for emerging fields like quantum computing and cryptography.', 1.251380205154419)

We can then stuff the context of the model with these three documents and have it attempt to answer our question.

# Prepare the context for the query
context = "\n\n".join(top_contexts)
system_message = "You are a helpful assistant. Use the following context to answer the query."

# Send query and context to OpenAI
try:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": f"Context: {context}\n\nQuery: {query_text}"}
        ]
    )
    print("Response:")
    print(completion.choices[0].message.content)
except Exception as e:
    print(f"Error generating response: {e}")

Here is what the model said:

General relativity is a theory developed by Albert Einstein that redefines our understanding of gravity. Instead of viewing gravity as a force acting at a distance, general relativity interprets it as the curvature of spacetime caused by the presence of mass. According to this theory, massive objects like stars and planets create a distortion in the fabric of spacetime, and this curvature affects the motion of other objects, including light, which can bend when it passes near a massive body.

This is faithfully sticking to the documents we gave the model. Great job 4o-mini!
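If you want to check exactly what the model was shown, one option is to pull the prompt assembly into a small pure function that you can print or test without spending an API call. This is just a sketch of that idea: build_messages and the two sample passages are hypothetical and not part of the repo.

```python
def build_messages(contexts, query, system_message):
    """Assemble chat messages from retrieved passages and the user's query."""
    # Join passages with blank lines so the model can tell them apart
    context = "\n\n".join(contexts)
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": f"Context: {context}\n\nQuery: {query}"},
    ]

# Hypothetical stand-ins for the 'content' values returned by the KNN query
sample_contexts = ["Passage about gravity.", "Passage about motion."]
messages = build_messages(
    sample_contexts,
    "What is general relativity?",
    "You are a helpful assistant. Use the following context to answer the query.",
)
print(messages[1]["content"])
```

The returned list has the same shape as the messages argument used in the chat completion call, so it could be passed straight through.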

Conclusion

sqlite-vec is a project sponsored by the Mozilla Builders Accelerator program, so it has some significant backing behind it. I have to give a big thanks to Alex Garcia, the creator of sqlite-vec, for helping to push the SQLite ecosystem and making ML possible with this simple database. It is a well-maintained library, with updates coming down the pipeline on a regular basis. As of November 20th, they even added filtering by metadata! Perhaps I should re-do my aforementioned RAG article using SQLite 🤔.

The extension also offers bindings for several popular programming languages, including Ruby, Go, Rust, and more.

The fact that we are able to radically simplify our RAG pipeline to the bare essentials is remarkable. To recap, there is no need for a database service to be spun up and spun down, like Postgres, MySQL, etc. There is no need for API calls to cloud vendors. If you deploy to a server directly via Digital Ocean or Hetzner, you can even avoid the expensive and unnecessary complexity associated with managed cloud services like AWS, Azure, or Vercel.

I believe this simple architecture can work for a variety of applications. It is cheaper to use, easier to maintain, and faster to iterate on. Once you reach a certain scale it will likely make sense to migrate to a more robust database such as Postgres with the pgvector extension for RAG capabilities. For more advanced capabilities such as chunking and document cleaning, a framework may be the right choice. But for startups and smaller players, it’s SQLite to the moon.

Have fun trying out sqlite-vec for yourself!
