Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation is a method that enhances the responses of large language models (LLMs) by incorporating relevant, up-to-date information from external sources. This improves accuracy and relevance, especially when the information is not in the model’s training data.

Grounding

Grounding refers to providing context to an LLM to improve response accuracy and reduce hallucinations. For example, a chatbot for a news agency can pull real-time headlines from a news API so that its responses are timely and accurate.
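
A minimal sketch of grounding, assuming the headlines have already been fetched. The function name `build_grounded_prompt` and the sample headlines are hypothetical; a real system would call a news API to fill the list and then send the prompt to an LLM.

```python
def build_grounded_prompt(question: str, headlines: list[str]) -> str:
    """Place retrieved headlines ahead of the question so the model can answer from them."""
    context = "\n".join(f"- {h}" for h in headlines)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical headlines standing in for a real-time news API response.
headlines = [
    "City council approves new transit budget",
    "Local team wins regional championship",
]
prompt = build_grounded_prompt("What did the city council decide?", headlines)
print(prompt)
```

The instruction to answer only from the context is one common way to discourage the model from falling back on stale training data; the exact wording is a design choice, not a fixed rule.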

Retrievers

A retriever is responsible for searching for and retrieving relevant information. It can handle unstructured inputs (like questions) and find structured data to use as context. Methods for building retrievers include full-text search, vector search, and text-to-Cypher.

Semantic search focuses on understanding the intent and contextual meaning of search phrases rather than relying solely on keywords. Traditional keyword search often depends on exact-match keywords or proximity-based algorithms that find similar words. For example, if you input “apple” in a traditional search, you might predominantly get results about the fruit. In a semantic search, however, the engine tries to gauge the context: are you searching for the fruit, the tech company, or something else? The results are tailored to both the term and the perceived intent.

Vectors and Embeddings

  • Vectors: Represent data as lists of numbers (e.g., [1, 2, 3]) and can represent various data types, including text and images
  • Dimensionality: The number of dimensions in a vector (e.g., a vector with three numbers has a dimensionality of 3)
  • Embeddings: Vectors that represent data meaningfully for specific tasks. Each dimension can capture a semantic aspect of a word or phrase (e.g., “apple” might have dimensions for fruit, technology, color, etc.)
  • Embedding Models: Tools like OpenAI’s text-embedding-ada-002 create embeddings, often for entire sentences or paragraphs to capture context
  • Using Vectors in Semantic Search: The distance or angle between vectors indicates semantic similarity; texts with similar meanings have vectors that lie closer together
    • RAG Process
      1. Create an embedding of the user’s question
      2. Compare the question vector to indexed data vectors
      3. Score results based on similarity
      4. Use the most relevant results as context for the LLM
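
The four steps of the RAG process can be sketched in plain Python. This is only an illustration of the mechanics: the `embed` function below is a hypothetical stand-in that counts words from a tiny vocabulary, so it does not capture real semantic meaning the way an embedding model would. The vocabulary, documents, and function names are all assumptions made for the example.

```python
import math

# Toy vocabulary: each position in the vector counts one word.
# A real embedding model would produce dense vectors with many more dimensions.
VOCAB = ["apple", "fruit", "iphone", "pie", "phone", "recipe"]


def embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector shares no vocabulary terms with anything
    return dot / (norm_a * norm_b)


def retrieve(question: str, documents: list[str], k: int = 2) -> list[tuple[float, str]]:
    question_vector = embed(question)  # step 1: embed the question
    scored = [
        (cosine_similarity(question_vector, embed(doc)), doc)  # steps 2 and 3: compare and score
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]  # step 4: keep the most relevant results


documents = [
    "apple pie recipe with fresh fruit",
    "the new iphone is a phone made by apple",
    "train timetable for the weekend",
]
top = retrieve("which apple phone should I buy", documents, k=2)
context = "\n".join(doc for _, doc in top)
print(context)
```

The `context` string is what would be passed to the LLM alongside the user's question. In practice the document vectors are computed once and stored in an index rather than re-embedded on every query.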

Vector Indexes

Vector Indexes help find similar data by comparing embeddings, which are numerical representations of data.

  1. Create new embeddings with the following code:
WITH genai.vector.encode(
	"Text to create embeddings for",
	"OpenAI",
	{ token: "sk-..." }
) AS embedding
RETURN embedding
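  2. Create a vector index. The dimension must match the size of the embeddings being indexed; text-embedding-ada-002 produces 1536-dimensional vectors: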
CREATE VECTOR INDEX questions IF NOT EXISTS
FOR (q:Question)
ON q.embedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}}

Graph RAG

Graph RAG (Graph Retrieval-Augmented Generation) is an approach that uses the strengths of graph databases to provide relevant and useful context to LLMs.
While vector RAG uses embeddings to find contextually relevant information, Graph RAG enhances this process by leveraging the relationships and structure within a graph.
The Graph RAG process works as follows:

  • A user submits a query
  • The system performs a vector search to find similar nodes
  • The graph is traversed to find related entities
  • Relevant entities and relationships are added to the context for the LLM
WITH genai.vector.encode(
	"A mysterious spaceship lands Earth", 
	"OpenAI", 
	{ token: "sk-..." }
) AS myMoviePlot
CALL db.index.vector.queryNodes('moviePlots', 6, myMoviePlot)
YIELD node, score
MATCH (node)<-[r:RATED]-()
RETURN node.title AS title, node.plot AS plot, score AS similarityScore,
       collect { MATCH (node)-[:IN_GENRE]->(g) RETURN g.name } as genres,
       collect { MATCH (node)<-[:ACTED_IN]-(a) RETURN a.name } as actors,
       avg(r.rating) as userRating
ORDER BY userRating DESC
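
The same Graph RAG flow can be sketched without a database. In the Python example below, a dictionary and a list of triples are a hypothetical in-memory stand-in for the graph, and the three-dimensional embeddings are made up for illustration; only the sequence of steps (vector search, one-hop traversal, context assembly) mirrors the process described above.

```python
import math

# Hypothetical in-memory stand-in for a graph database.
# Each movie node carries a tiny made-up embedding of its plot.
movies = {
    "Arrival": [0.9, 0.1, 0.0],
    "Notting Hill": [0.0, 0.2, 0.9],
    "Alien": [0.8, 0.5, 0.0],
}
# Relationships stored as (source, type, target) triples.
relationships = [
    ("Arrival", "IN_GENRE", "Sci-Fi"),
    ("Amy Adams", "ACTED_IN", "Arrival"),
    ("Alien", "IN_GENRE", "Horror"),
    ("Sigourney Weaver", "ACTED_IN", "Alien"),
    ("Notting Hill", "IN_GENRE", "Romance"),
    ("Julia Roberts", "ACTED_IN", "Notting Hill"),
]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def vector_search(query_vector, k):
    """Return (score, title) pairs for the k movies closest to the query."""
    scored = sorted(
        ((cosine(query_vector, vec), title) for title, vec in movies.items()),
        reverse=True,
    )
    return scored[:k]


def neighbours(title):
    """Traverse one hop from a movie node to its genres and actors."""
    genres = [t for s, rel, t in relationships if s == title and rel == "IN_GENRE"]
    actors = [s for s, rel, t in relationships if t == title and rel == "ACTED_IN"]
    return genres, actors


def graph_rag_context(query_vector, k=2):
    """Combine vector hits with their related entities into text for the LLM."""
    lines = []
    for score, title in vector_search(query_vector, k):
        genres, actors = neighbours(title)
        lines.append(
            f"{title} (similarity {score:.2f}); "
            f"genres: {', '.join(genres)}; actors: {', '.join(actors)}"
        )
    return "\n".join(lines)


# Made-up query embedding standing in for the encoded plot description.
context = graph_rag_context([1.0, 0.2, 0.0], k=2)
print(context)
```

The traversal step is what distinguishes this from plain vector RAG: the genres and actors are not found by similarity but by following relationships from the nodes that the vector search returned.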

Full-Text Search

Full-text search allows users to find specific keywords or phrases in documents or nodes. It can be used alone or alongside vector search to refine results.
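
As a rough illustration of how full-text search finds keywords, the sketch below builds a small inverted index in Python. The documents, ids, and function names are hypothetical, and a real full-text engine adds tokenization rules, stemming, and relevance scoring that are left out here.

```python
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the ids of the documents that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word.strip(".,!?")].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical documents keyed by id.
documents = {
    "q1": "How do I create a vector index?",
    "q2": "What is a full-text index?",
    "q3": "How do embeddings work?",
}
index = build_index(documents)
print(search(index, "vector index"))
```

To use this alongside vector search, the returned ids could be intersected with the ids of the top vector hits, so that only results matching both the keywords and the meaning are kept.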

Knowledge Graph

A knowledge graph is an organized representation of real-world entities, their attributes, and their relationships, giving a comprehensive and interconnected view of the information. Knowledge graphs are useful for Generative AI applications because this structured, interconnected data enhances context, reasoning, and accuracy in generated responses. Search engines typically use knowledge graphs to provide information about people, places, and things.

Organizing Principles: These are frameworks that categorize and structure data within the graph, allowing for complex queries and analytics.
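
A small Python sketch can make the idea concrete. Here facts are stored as (subject, relationship, object) triples, and a dictionary of entity types plays the role of a very simple organizing principle. The relationship names and helper functions are assumptions chosen for the example, not a standard schema.

```python
# Each fact is a (subject, relationship, object) triple.
triples = [
    ("Marie Curie", "BORN_IN", "Warsaw"),
    ("Warsaw", "LOCATED_IN", "Poland"),
    ("Marie Curie", "WON", "Nobel Prize in Physics"),
]

# A simple organizing principle: every entity is assigned a type.
entity_types = {
    "Marie Curie": "Person",
    "Warsaw": "City",
    "Poland": "Country",
    "Nobel Prize in Physics": "Award",
}


def objects(subject: str, relationship: str) -> list[str]:
    """Follow one relationship type outward from an entity."""
    return [o for s, r, o in triples if s == subject and r == relationship]


def entities_of_type(label: str) -> list[str]:
    """Use the organizing principle to list all entities of one type."""
    return [name for name, kind in entity_types.items() if kind == label]


def birth_country(person: str):
    """Two-hop query: person -> city of birth -> country."""
    for city in objects(person, "BORN_IN"):
        for country in objects(city, "LOCATED_IN"):
            return country
    return None


print(birth_country("Marie Curie"))
print(entities_of_type("Person"))
```

The two-hop lookup in `birth_country` is the kind of question that is awkward to answer from unstructured text alone but follows directly from the relationships in the graph.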