What is retrieval augmented generation? | Glossary

What is retrieval augmented generation (RAG)?

Retrieval augmented generation (RAG) is a natural language processing (NLP) technique that combines the strengths of retrieval-based and generative artificial intelligence (AI) models. RAG draws on pre-existing knowledge to deliver accurate results, and it also processes and consolidates that knowledge to create unique, context-aware answers, instructions, or explanations in human-like language rather than just summarizing the retrieved data.

RAG differs from generative AI on its own in that it builds on it: RAG pairs a generative model with a retrieval step, combining the strengths of both generative AI and retrieval AI. RAG is also different from cognitive AI, which mimics the way the human brain works to get its results.

How does retrieval augmented generation (RAG) work?

RAG works by integrating retrieval-based techniques with generative AI models.

Retrieval-based models excel at extracting information from pre-existing online sources such as newspaper articles, blogs, and knowledge repositories like Wikipedia, as well as from internal databases. However, such models cannot produce original or unique responses.

Generative models, in contrast, can produce original responses that fit the context of what is being asked, but they can find it difficult to maintain strict accuracy.

RAG was developed to overcome these respective weaknesses, combining the strengths of both model types while minimizing their drawbacks.

In a RAG-based AI system, a retrieval model first finds relevant information from existing sources. The generative model then takes the retrieved information, synthesizes it, and shapes it into a coherent and contextually appropriate response.
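The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `DOCUMENTS` list, the bag-of-words cosine scoring, and the `generate` stub are all hypothetical stand-ins. A real system would typically use a search index or embedding model for retrieval and a hosted LLM for generation.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge source (assumption for illustration only).
DOCUMENTS = [
    "Backups run nightly at 02:00 and are retained for 30 days.",
    "The cafeteria serves lunch from 11:30 to 14:00 on weekdays.",
    "Restore requests are handled by the infrastructure team within one business day.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Retrieval step: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

def generate(prompt):
    """Generation step: a stub standing in for a call to a generative model."""
    return f"[model response to a {len(prompt)}-character prompt]"

def answer(query):
    """Run retrieval first, then hand the retrieved context to the generator."""
    return generate(build_prompt(query, retrieve(query, DOCUMENTS)))
```

Calling `answer("How long are backups retained?")` retrieves the backup passage first and passes it, with the question, to the generator stub. Swapping the stub for a real model call is the only change the generation step would need.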

What are the benefits of retrieval augmented generation?

By integrating retrieval and generative artificial intelligence (AI) models, RAG delivers responses that are more accurate, relevant, and original, and that read as if a person wrote them. That’s because RAG models can understand the context of queries and generate fresh and unique replies by combining the best of both models.

More accurate: Because a retrieval model first identifies relevant information from existing knowledge sources, the human-like responses that are subsequently generated rest on more relevant and up-to-date information than those of a pure generative model.
Better at synthesizing information: By combining retrieval and generative models, RAG can synthesize information from numerous sources and generate fresh responses in a human-like way. This is particularly helpful for complex queries that require integrating information from multiple sources.
Adept at putting information into context: Unlike simple retrieval models, RAG can generate responses that are aware of the context of a conversation and are thus more relevant.
Easier to train: Training an NLP-based large language model (LLM) to build a generative AI model requires a tremendous volume of data. RAG models, in contrast, draw on preexisting and pre-retrieved knowledge sources, reducing the need to find and ingest massive amounts of training data.
More efficient: RAG models can be more efficient than large-scale generative models, as the initial retrieval phase narrows down the context and thus the volume of data that needs to be processed in the generation phase.
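The efficiency point can be made concrete with a small sketch. The example below assumes a hypothetical four-document `CORPUS` and a simple word-overlap ranking (both invented for illustration), and compares how many words the generation phase would have to process with and without the retrieval phase.

```python
import re

# Hypothetical corpus keyed by document name (assumption for illustration only).
CORPUS = {
    "hr-policy": "Employees accrue vacation days monthly and may carry over five days per year.",
    "backup-guide": "Full backups run weekly and incremental backups run every night.",
    "travel-policy": "Book flights at least two weeks ahead and submit receipts within thirty days.",
    "restore-guide": "To restore a file, select the backup snapshot and choose a target location.",
}

def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def top_k(query, corpus, k=1):
    """Rank document names by word overlap with the query and keep the best k."""
    q = words(query)
    ranked = sorted(corpus, key=lambda name: len(q & words(corpus[name])), reverse=True)
    return ranked[:k]

def context_size(names, corpus):
    """Count the words the generation phase would have to read."""
    return sum(len(corpus[name].split()) for name in names)

query = "How often do incremental backups run?"
selected = top_k(query, CORPUS, k=1)
full = context_size(CORPUS, CORPUS)        # every document passed to the generator
narrowed = context_size(selected, CORPUS)  # only the retrieved document
print(f"without retrieval: {full} words, with retrieval: {narrowed} words")
```

With only four short documents the saving is small, but the same narrowing applies when the knowledge source holds millions of documents and the generator can only read a limited context.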

How is retrieval augmented generation being used today?

In real-life deployments today, RAG models are being used to:

Improve customer support: RAG can be used to build advanced chatbots or virtual assistants that deliver more personalized and accurate responses to customer queries. This can lead to faster responses, increased operational efficiencies, and, ultimately, greater customer satisfaction with support experiences.
Generate content: RAG can help businesses produce blog posts, articles, product catalogs, or other content by combining its generative capabilities with information retrieved from reliable sources, both external and internal.
Perform market research: By gathering insights from the vast volumes of data available on the internet, such as breaking news, industry research reports, and even social media posts, RAG can keep businesses updated on market trends and even analyze competitors’ activities, helping companies make better decisions.
Support sales: RAG can serve as a virtual sales assistant, answering customers’ questions about items in inventory, retrieving product specifications, explaining operating instructions, and, in general, assisting in the purchasing lifecycle. By marrying its generative abilities with product catalogs, pricing information, and other data, including customer reviews on social media, RAG can offer personalized recommendations, address customers’ concerns, and improve shopping experiences.
Improve employee experience: RAG can help employees create and share a centralized repository of expert knowledge. By integrating with internal databases and documents, RAG can give employees accurate answers to questions about company operations, benefits, processes, culture, organizational structure, and more.

Cohesity and AI

Cohesity is at the forefront of the dawning age of AI because the Cohesity platform is ‘AI-ready’ for RAG-based large language models (LLMs). The ground-breaking Cohesity approach provides robust, domain-specific context to RAG-driven AI systems by leveraging the file system of the patented Cohesity SnapTree and SpanFS architectures. To achieve this, an on-demand index of embeddings will be provided just-in-time to the AI application requesting the data. Additionally, the data will be secured through Cohesity’s role-based access control (RBAC) models.
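The general idea of combining an on-demand index with role-based access control can be illustrated with a generic sketch. Nothing below reflects Cohesity's actual implementation or APIs: the `Document` class, the `STORE` contents, the role names, and the toy `embed` function are all hypothetical, and a real system would use a trained embedding model rather than word counts.

```python
import re
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    name: str
    text: str
    allowed_roles: frozenset  # roles permitted to read this document

# Hypothetical document store (assumption for illustration only).
STORE = [
    Document("payroll.txt", "Salary bands and payroll schedule for 2024.", frozenset({"hr"})),
    Document("runbook.txt", "Restart the backup service, then verify the nightly job.",
             frozenset({"ops", "hr"})),
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def on_demand_index(store, role):
    """Build embeddings at request time, only for documents the role may read."""
    return {d.name: embed(d.text) for d in store if role in d.allowed_roles}

def search(query, index):
    """Return names of indexed documents that match the query, best match first."""
    q = embed(query)
    scores = {name: sum(q[t] * vec[t] for t in q) for name, vec in index.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > 0]
```

Because the access check happens before anything is embedded, a user with the hypothetical `ops` role who searches for "payroll schedule" gets no results, while an `hr` user finds `payroll.txt`. Restricted content never reaches the retrieval step, so it cannot leak into a generated answer.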

Cohesity Gaia utilizes RAG AI to search and summarize content using everyday language to create conversational queries.

The Cohesity Gaia RAG platform accepts human and machine-driven input, such as questions and queries. That input is then tokenized into keywords that quickly filter petabytes of enterprise backup data down to a smaller subset of contextualized data. The platform then selects the representations within those documents or objects that are most relevant to the question or query. That result is packaged with the original query and passed to an LLM such as GPT-4, which provides a context-aware and human-sounding answer. This innovative approach ensures that the generated responses are knowledgeable, up-to-date, diverse, and relevant to the specific business content.
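The flow described above (keyword filter, passage selection, packaging with the query) can be sketched generically as follows. This is not Cohesity Gaia code. The function names, the stopword list, the sentence-level overlap scoring, and the `llm` callable are assumptions chosen to keep the example small and self-contained.

```python
import re

# Small hypothetical stopword list (assumption for illustration only).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "what", "how", "for", "and", "do", "on"}

def keywords(text):
    """Tokenize the text and drop common stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def keyword_filter(query, documents):
    """Stage 1: keep only documents that share at least one keyword with the query."""
    q = keywords(query)
    return [d for d in documents if q & keywords(d)]

def select_passages(query, documents, k=2):
    """Stage 2: split documents into sentences and keep the k most relevant ones."""
    q = keywords(query)
    sentences = [s.strip() for d in documents
                 for s in re.split(r"(?<=[.!?])\s+", d) if s.strip()]
    return sorted(sentences, key=lambda s: len(q & keywords(s)), reverse=True)[:k]

def package(query, passages):
    """Stage 3: bundle the selected passages with the original query."""
    return {"question": query, "context": passages}

def ask(query, documents, llm):
    """Run the stages in order and hand the package to the supplied LLM callable."""
    subset = keyword_filter(query, documents)
    return llm(package(query, select_passages(query, subset)))
```

Here `llm` is any callable that accepts the packaged dictionary, so the sketch can be exercised with a stub such as `lambda payload: payload` before a real model client is wired in. Filtering cheaply by keyword first means the more expensive passage selection only runs over the smaller subset.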

By layering RAG on top of an enterprise’s datasets, Cohesity customers will not need to perform costly fine-tuning or extended training on vast volumes of data to teach LLMs “what to say.” This saves time and money and reduces environmental impact. Because RAG models are flexible enough to adapt to rapidly growing and constantly changing datasets, leveraging RAG on the Cohesity platform can provide the most recent and relevant context to any query.

Cohesity’s RAG-aware platform will generate more knowledgeable, diverse, and relevant responses compared to off-the-shelf LLMs without massively increasing data storage requirements. This breakthrough has tremendous potential for innovations with enterprise Q&A (questions and answers) applications and industry search and discovery models.

Technology and business executives alike will have a unique opportunity to leverage the power of data-driven insights to enhance the quality of AI-driven conversations with Cohesity’s RAG-driven AI system. Organizations can unleash new levels of efficiency, innovation, and growth by harnessing the power of Cohesity data management and security solutions enhanced by AI.

To learn more, read the AI eBook.

