Vector Search, the R in RAG


What

A vector is a quantity that has both magnitude (or length) and direction. In machine learning, a vector can represent the features of an image or a piece of text, with each number in the vector encoding the value of one of those features.
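As a toy illustration (the feature names and values below are made up purely for intuition, not taken from any real model), a tiny vector describing a car could look like this:

# Hypothetical 3-dimensional feature vector for "a small red sports car".
# Real embeddings have hundreds of dimensions, and their individual numbers
# are not human-readable feature labels like these.
car_vector = [
    0.92,  # "redness"
    0.15,  # "size"
    0.88,  # "sportiness"
]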

Why

Imagine you’re searching for a red car. In a traditional keyword search, you’d need the car’s description to explicitly mention “red” to get a match. But car companies rarely keep it that simple; instead, they use creative names like “Cherry Bomb Tintcoat” or “Rosso Corsa”. A keyword search would struggle to connect your query for “red” with these fancy names. With vector search, however, the system understands the underlying meaning and context, so even abstract color names show up, ranked by a relevance score that reflects how closely each result aligns with your query.

How

Vector search, powered by embeddings, allows systems to understand the meaning of words and phrases rather than just matching keywords. An embedding model converts text into a numerical vector, positioning texts with similar meanings close together in vector space; similarity between two vectors (commonly measured with cosine similarity, as in the example below) then serves as a proxy for semantic relevance.
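To make the “numerical vector” part concrete, here is a minimal sketch using the same pre-trained model as the example below (all-MiniLM-L6-v2 maps any piece of text to a 384-dimensional vector):

from sentence_transformers import SentenceTransformer

# Load the pre-trained sentence transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode a single word into its embedding vector
vector = model.encode("red")

print(vector.shape)  # (384,)
print(vector[:5])    # first few of the 384 numbers that represent "red"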

Example

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Initialize a pre-trained sentence transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Sample car color names (like a car database)
car_colors = [
    "Cherry Bomb Tintcoat",
    "Rosso Corsa",
    "Soul Red Crystal Metallic",
    "Firecracker Red",
    "Carmine Red",
    "Race Red",
    "Ruby Flare Pearl"
]

# Convert car color names to vectors (embeddings)
color_embeddings = model.encode(car_colors)

# User query
user_query = "I want a car with red color"

# Convert user query to vector
query_embedding = model.encode([user_query])

# Calculate cosine similarity between query and car color names
similarities = cosine_similarity(query_embedding, color_embeddings)[0]

# Rank results by similarity score
ranked_results = sorted(zip(car_colors, similarities), key=lambda x: x[1], reverse=True)

# Display top results
print("Top matches for your query:\n")
for car_color, score in ranked_results[:3]:
    print(f"{car_color} (Relevance Score: {score:.2f})")

This approach offers a more effective way than plain keyword matching to retrieve contextual information or facts relevant to a user’s query. In the context of Retrieval-Augmented Generation (RAG), that retrieved information is fed to the model so its responses are grounded in accurate, relevant data; a rough preview of that step is sketched below. I’ll dive deeper into how this works in my next post.
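A minimal sketch of the augmentation step, reusing ranked_results and user_query from the example above (the prompt wording is purely illustrative, not a fixed RAG template):

# Stitch the top retrieved matches into the prompt sent to a language model.
# Assumes `ranked_results` and `user_query` from the example above.
top_matches = [color for color, _ in ranked_results[:3]]

prompt = (
    "You are a helpful car sales assistant.\n"
    f"Colors in stock that match the request: {', '.join(top_matches)}\n"
    f"Customer request: {user_query}\n"
    "Recommend a color using only the list above."
)

print(prompt)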

Now, go fix some bugs! 🚀