From RAGs to riches, a framework for leveraging Generative AI in your own organisation

GenAI and RAG
When you have a hammer, everything can appear to be a nail! Generative AI (GenAI) is certainly a massive hammer, but using it for competitive advantage on your own data can sometimes be elusive, especially in business. How can GenAI be put to work on your company's information without leaking all your proprietary secrets?

In a previous blog article, I discussed small language models (SLMs). For many use cases, these can be a cost-effective and viable alternative to large language models (LLMs). A key question is whether an SLM can be trained on your own company’s information to create your own unique version of “ChatGPT”.

The difference between an SLM and an LLM lies in the size of the training data. SLMs are trained on less data, are much smaller, can run on modest hardware, and perform nearly as well as an LLM in most cases. If an SLM is trained in a specific area of expertise, its accuracy can match an LLM at a far lower cost. SLMs are therefore of particular interest to people looking to run GenAI on-premise with their own hardware.

SLMs can also be “fine-tuned” using your own data, meaning that you can, in theory, create a proprietary SLM for your company with no further reliance on a third party, thereby preserving the privacy and confidentiality of your information.

But SLMs are not the only technique for deploying proprietary AI within a company. Another approach to getting customised results from Generative AI (GenAI) is called “RAG”. When I first heard of RAG, I did not have a clue what it meant, but a little research quickly uncovered its relevance to companies that want their own version of AI in the modern landscape. I realised that this concept needs to be demystified for technical leaders who do not have a background in data science and who are looking for opportunities to leverage GenAI in their organisations.

The purpose of this short article is to demystify RAG and explain how it is architected so that GenAI can be unlocked for a typical company. I will not go into too much detail at this stage because it is easy to miss the wood for the trees; it is important to step back and first grasp the high-level concepts.

At the outset, however, RAG is not a silver bullet, and there are many practical challenges in using this technique in an enterprise. As is the case for other evolving areas of AI, RAG has some way to go, and as AI models evolve it is likely that RAG too will evolve to take advantage of new capabilities.

So what is RAG?

In short: RAG (Retrieve, Augment, and Generate, more formally Retrieval-Augmented Generation) is a process used in artificial intelligence to discover, enhance, and create information.

  • Retrieve: This means searching for information from different sources, like looking up facts on the internet or finding a specific document in a company’s document repository.
  • Augment: This step involves improving or adding to the retrieved information, providing additional context and guidance for the GenAI model.
  • Generate: Finally, this step uses the augmented information to create something new, like writing a summary, report, or even answering a question in a detailed manner.
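The three steps above can be sketched as a minimal pipeline. This is an illustrative sketch only: the function names are invented, the retrieval is a naive word match, and `generate` is a placeholder for a call to whichever LLM you actually use.

```python
import re

def _tokens(text):
    """Keep only significant words (longer than three letters)."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}

def retrieve(question, documents):
    """Retrieve: keep documents sharing a significant word with the question."""
    q = _tokens(question)
    return [d for d in documents if q & _tokens(d)]

def augment(question, passages):
    """Augment: combine the question with the retrieved context into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt):
    """Placeholder for a call to an LLM (hosted or on-premise)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"

docs = ["AI predicts machine failures on the factory floor.",
        "Quarterly sales figures for the retail division."]
question = "How is AI used on the factory floor?"
prompt = augment(question, retrieve(question, docs))
answer = generate(prompt)
```

Note that only the factory document reaches the prompt; the irrelevant sales document is filtered out at the retrieve step.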
 

In short, RAG is about finding information, making it better, and then using it to produce new content. RAG is fundamentally different from manually sending “prompts” to ChatGPT. RAG can automatically query and include related content from within your company’s information systems, whereas an LLM such as ChatGPT can only produce output based on a “prompt”, which, depending on the care taken when creating it, can be weak with little usable context.

RAG is particularly useful when a company has proprietary information that for obvious reasons cannot be discovered in any public LLM training process. Effectively, with RAG, you “point” the generic third party LLM to your own data and then ask specific questions of the AI agent in a structured way. The LLM combines its extensive “public” training dataset with your specialised “proprietary” information to generate new information and insights, highly relevant for your organisation.

For example, if you are a lawyer, you could simultaneously interrogate a legal firm’s historical records together with general case law information in order to reveal insights about a new problem that the firm is facing. Because the firm’s historical records are likely to be highly confidential, this information would never have been discovered when training the generic LLM.

RAG is a technique that allows you to combine both worlds and leverage the enormous power of a generic LLM developed by a third party to also interpret your relatively small but highly proprietary document repositories.


Breaking Down RAG

The RAG framework consists of a logical sequence of steps. Most of these can be automated, but this is not necessarily a requirement. Expert reviews and validation of the information at each step may be required, and in fact may be vital in certain applications, for example where the information is particularly complex and / or has the potential to cause harm to people or institutions.
 

Retrieval

In the context of RAG, the “retrieve” stage involves the following key steps:

Identifying the Information Need

Determine what specific information is being sought. For example, if a user asks a question, the AI system needs to understand the broad topic and context needed to generate a useful answer.

Searching for Relevant Sources

Specify where the system should look for and retrieve the supplementary information. This could include:

  • Databases: Structured repositories of information like academic databases or your own company records.
  • Web Searches: Using search engines to find relevant articles, documents, or websites.
  • APIs: Accessing external services that provide specific data (e.g., weather data, stock prices).
 

Extracting Relevant Information

Run a query against the chosen data sources to retrieve potential matches. Then sift through the retrieved information to find the most relevant and useful pieces. This might involve discarding duplicates or irrelevant results.

Initial Processing

Basic cleaning of the data, such as removing unnecessary formatting, ensuring the text is readable, and converting it to a consistent format, is important before submitting it to an LLM.

If large amounts of data were retrieved, it will be necessary to generate a summary to capture the main points.
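These retrieval steps (querying, ranking, discarding duplicates, and trimming to the best matches) can be illustrated with a small word-overlap ranker. This is only a sketch: production systems typically rank with embedding-based vector search, and all names and documents here are invented.

```python
import re
from collections import Counter

def score(query, doc):
    """Rank by word overlap; real systems would use embeddings instead."""
    q = set(re.findall(r"[a-z]+", query.lower()))
    d = Counter(re.findall(r"[a-z]+", doc.lower()))
    return sum(d[w] for w in q)

def retrieve_top(query, docs, k=2):
    """Return the k best-scoring documents, skipping verbatim duplicates."""
    unique = list(dict.fromkeys(docs))  # drop exact duplicates, keep order
    ranked = sorted(unique, key=lambda d: score(query, d), reverse=True)
    return [d for d in ranked if score(query, d) > 0][:k]

docs = [
    "AI systems can predict equipment failures in manufacturing plants.",
    "AI systems can predict equipment failures in manufacturing plants.",  # duplicate
    "Robotics and AI improve quality control on production lines.",
    "Annual leave policy for head-office staff.",
]
hits = retrieve_top("How is AI used in manufacturing?", docs)
```

The duplicate is dropped, the irrelevant HR document scores zero and is discarded, and the two genuinely relevant documents survive for the augment step.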

Example:

Imagine you are looking for high quality information on how AI can be used in manufacturing:

  • Identifying the Information Need: Understand that the user wants to know about AI applications in manufacturing.
  • Searching for Relevant Sources: Search academic journals, industry reports, and credible websites for articles on this topic. If you are a large consultancy, perhaps also, search your corporate intranet for recent projects that have incorporated AI for your manufacturing clients.
  • Extracting Relevant Information: From the search results, pick the most relevant articles that discuss AI in manufacturing.
  • Initial Processing: Clean up the articles by removing ads, footnotes, and irrelevant sections, then summarise the key points.
 
 

The retrieval step ensures that you have gathered the most pertinent information, setting the stage for the next steps, i.e. augmenting and generating new content.


Augment

The “augment” step in RAG involves improving and enriching the information retrieved so far to make it more useful and relevant. Here’s a simple breakdown of what this entails:

Adding Context

  • Clarifying Information: Add context to make the information clearer. For example, if a retrieved article mentions a technical term, provide a brief explanation or definition. If your company has its own in-house abbreviations (for example, departmental names or engineering standards), then these need to be provided in the form of a glossary that can be understood by a LLM.
  • Connecting Dots: Link and cross reference related pieces of information to provide a more comprehensive understanding to the LLM. For instance, if you have data on AI applications in manufacturing, you might insert examples or case studies to better illustrate these applications.
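Supplying a glossary of in-house abbreviations, as described above, can be sketched as a simple pre-processing step before the text reaches the LLM. The glossary entries below are invented examples, not real company terms.

```python
# Expand in-house abbreviations with a glossary before prompting the LLM.
# GLOSSARY entries here are invented for illustration.
GLOSSARY = {
    "MPL-07": "Mining Prospecting Licence 7, a platinum exploration site",
    "QA-ENG": "the Quality Assurance Engineering department",
}

def add_context(text, glossary=GLOSSARY):
    """Append a parenthetical definition the first time each term appears."""
    for term, meaning in glossary.items():
        if term in text:
            text = text.replace(term, f"{term} ({meaning})", 1)
    return text

expanded = add_context("Drilling at MPL-07 resumed after QA-ENG sign-off.")
```

A generic LLM has no way of knowing what “MPL-07” means; after this step, the prompt carries the definition alongside the term.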

Enhancing Data Quality

  • Correcting Errors: Fix any mistakes or inaccuracies in the retrieved information. This could involve fact-checking or correcting typos and grammatical errors.
  • Standardising Format: Ensure the data is in a consistent format, which makes it easier to use in the next steps. This might include converting all dates to the same format or perhaps ensuring all measurements use the same units.
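The date standardisation mentioned above can be sketched as follows. The set of accepted input formats is an assumption for illustration; a real pipeline would enumerate the formats actually found in its sources.

```python
from datetime import datetime

def standardise_date(text):
    """Convert a few assumed date formats to ISO 8601; pass others through."""
    for fmt in ("%d/%m/%Y", "%B %d, %Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return text  # leave unrecognised values untouched
```

Passing values through unchanged when no format matches is a deliberate choice: silently guessing a date is worse than leaving it for a human reviewer.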

Adding Missing Information

  • Filling Gaps: Add any missing pieces of information that are necessary for a complete understanding. For instance, if a retrieved report lacks a conclusion, summarise the findings.
  • Supplementary Data: Incorporate additional relevant data that was not included in the initial retrieval. This could be statistical data, quotes from experts, or related research findings.

Organising and Structuring

  • Logical Flow: Arrange the information in a logical order to make it easy to follow. This might involve creating sections or headings to organise the content.
  • Summarisation: Condense lengthy information into concise summaries that capture the main points without losing important details.
 

Example:

  • Adding Context: If an article mentions “machine learning algorithms,” you might insert a definition explaining what machine learning is. If an internal report or database refers to “MPL-07”, you might expand on this to explain this is a platinum resource number 7 where there is a mining exploration site licensed to your company.
  • Enhancing Data Quality: Correct any errors found in the retrieved articles and ensure all data points, like percentages and dates, are consistently formatted.
  • Adding Missing Information: If a case study lacks details on outcomes, add relevant data or results from other sources.
  • Organising and Structuring: Create sections such as “Introduction,” “Applications of AI,” “Case Studies,” and “Conclusion” to organise the augmented information.
 

The augment step will ensure the information is accurate, comprehensive, and well-organised, making it more valuable for the generation step.


Generate

The “generate” step in RAG involves using the augmented information to produce new, meaningful content. Here’s a simple breakdown of what this entails. Essentially, you are building a meaningful “prompt” for the GenAI LLM to work from.

Understanding the Objective

  • Purpose: Articulate the goal of the new content. This could be to inform, explain, summarise, or persuade.
  • Audience: Consider who will be reading the content to ensure it is tailored to their needs and understanding.
 

Using AI Models

  • Content Generation: Leverage AI models, such as GPT-4o, to create text based on the augmented information. These models can generate coherent and contextually relevant sentences, paragraphs, or entire documents.
  • Templates and Structure: Apply pre-defined templates or structures to organise the generated content logically. For example, generating a report might follow a structure like Introduction, Body, and Conclusion.
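Applying a pre-defined template, as described above, amounts to assembling a structured prompt from the augmented material. This is a minimal sketch with an invented template and field names; the finished prompt would then be sent to an LLM such as GPT-4o via its API.

```python
# Assemble a structured prompt from augmented material, following a
# pre-defined template. Template and field names are illustrative only.
TEMPLATE = """You are writing for {audience}.
Objective: {objective}

Context:
{context}

Write a report with sections: Introduction, Body, Conclusion."""

def build_prompt(objective, audience, passages):
    """Fill the template with the objective, audience, and retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return TEMPLATE.format(objective=objective, audience=audience, context=context)

prompt = build_prompt(
    objective="Summarise AI applications in manufacturing",
    audience="plant managers",
    passages=["Predictive maintenance cuts downtime.",
              "Vision systems spot defects."],
)
```

Keeping the template separate from the data makes it easy to swap in a different report structure without touching the retrieval or augmentation code.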
 

Refining the Output

  • Editing and Polishing: Review the AI-generated content for coherence, accuracy, and readability. Make necessary edits to improve the flow and clarity.
  • Ensuring Relevance: Check that the generated content stays true to the original purpose and is relevant to the audience’s needs.
 

Customising for Specific Needs

  • Personalisation: Tailor the content to fit specific contexts or user requirements. This could involve adding personal touches, addressing specific questions, or focusing on particular aspects of the topic.
  • Incorporating Feedback: Use feedback loops to refine and improve the content. This might include user feedback or iterative edits based on review.
 

Example:

  • Understanding the Objective: The goal might be to create an informative, focussed training programme for employees interested in AI applications in manufacturing.
  • Using AI Models: Use an AI model to generate a draft based on the augmented information, following a structured outline like Introduction, Definitions, Concepts, Key Applications, Case Studies, and Summary.
  • Refining the Output: Edit the draft to ensure clarity, correct any inaccuracies, and enhance readability.
  • Customising for Specific Needs: Personalise the content by adding specific examples relevant to the audience, addressing common questions, and focusing on the most impactful applications of AI in manufacturing.
 
 

The generate step uses an LLM to transform the information gathered so far into a polished piece of content ready for presentation or distribution. It combines your own internal proprietary information with the power of large language models to add supporting information, improve the delivery, and target the result at a specific audience.


Challenges and Considerations

Implementing the RAG (Retrieve, Augment, and Generate) framework involves several challenges and considerations that need addressing to ensure effective and efficient outcomes.

Data Quality and Relevance

Challenge: Ensuring the information retrieved is accurate, up-to-date, and relevant to the query.

Considerations:

  • Use reputable data sources and databases.
  • Implement rigorous filtering and validation mechanisms to verify the accuracy and relevance of the data.

Integration and Compatibility

Challenge: Integrating different data sources and ensuring compatibility between them.

Considerations:

  • Use standardised data formats and protocols.
  • Employ middleware or APIs to facilitate seamless data integration.

Computational Resources

Challenge: Managing the computational resources required for processing large volumes of data and running complex AI models.

Considerations:

  • Optimise algorithms and processes to reduce computational load.
  • Leverage cloud computing and scalable infrastructure to handle resource-intensive tasks.

Data Privacy and Security

Challenge: Protecting sensitive information and ensuring compliance with data privacy regulations.

Considerations:

  • Implement strong encryption and access control measures.
  • Adhere to legal and regulatory requirements for data privacy and security.

Complexity of Augmentation

Challenge: Effectively augmenting data to add value and context without introducing errors or biases.

Considerations:

  • Use sophisticated algorithms and models for data enrichment.
  • Continuously monitor and refine augmentation processes to improve accuracy and reduce biases.

Quality of Generated Content

Challenge: Ensuring the generated content is coherent, contextually relevant, and of high quality.

Considerations:

  • Employ advanced AI models and fine-tune them for specific use cases.
  • Implement robust review and editing processes to refine the generated content.

User-Specific Customisation

Challenge: Tailoring the output to meet the specific needs and preferences of different users.

Considerations:

  • Collect and incorporate user feedback to customise and improve the output.
  • Develop flexible templates and frameworks that can be easily adapted to different contexts.

Handling Ambiguity and Uncertainty

Challenge: Dealing with ambiguous queries or incomplete data.

Considerations:

  • Use contextual clues and user interaction to clarify ambiguous queries.
  • Implement fallback mechanisms to handle incomplete data and provide approximate answers when necessary.

Addressing these challenges and considerations effectively can significantly enhance the performance and reliability of the RAG framework, leading to better user satisfaction and more valuable outcomes.


To summarise:

RAG (Retrieve, Augment, and Generate) is a process used in artificial intelligence to discover, enhance, and create information. The process can involve several manual checkpoints, and there may be automatic data processing steps (such as information retrieval and search) that do not involve any AI technology. But RAG generally involves the use of AI in some significant way, transforming relevant information from multiple sources in combination with a pre-trained model to generate new information and insights.

Benefits of RAG

  • Enhanced Accuracy: By retrieving and augmenting information, RAG ensures generated content is accurate and contextually relevant.
  • Increased Efficiency: Automates data gathering and content creation, saving time and reducing manual effort.
  • Improved Quality: Provides high-quality, well-structured content that is informative and useful.
  • Flexibility: Adapts to various domains and use cases, offering a versatile solution for content generation and data processing.

Where RAG can be Used

  • Customer Support: Enhances responses by retrieving relevant data, augmenting with additional context, and generating accurate answers.
  • Content Creation: Assists in writing articles, reports, and summaries by combining information retrieval, augmentation, and content generation.
  • Technical Documentation: Produces detailed and accurate technical manuals and documentation by integrating and expanding on retrieved data.
  • Research and Analysis: Facilitates comprehensive research by retrieving, enhancing, and synthesising information from multiple sources.

Future Applications of RAG

  • Personalised Learning: Customises educational content based on individual learning needs and retrieved knowledge.
  • Advanced Healthcare: Integrates patient data and medical research to generate personalized treatment plans and insights.
  • Legal Research: Retrieves and augments legal documents to generate precise legal opinions and case summaries.
  • Enhanced Business Intelligence: Provides actionable insights by analysing and generating reports from diverse business data sources.
 

RAG provides important clues about how company information will be processed together with AI to reveal new insights across various industries.

As a technical leader, you will want to develop a basic conceptual understanding of how RAG is used together with GenAI models. It is recommended that if this is an area of interest to you that you invest time to explore RAG tools and platforms, case studies, and best practices. You will then be able to use this knowledge to develop a realistic and practical vision for AI in your own organisation and get started with a proven framework with which to approach your first GenAI project.
