
Retrieval-Augmented Generation (RAG) has become the gold standard for helping companies improve large language model (LLM) results using enterprise data.
LLMs are typically trained on public information, but RAG allows companies to enhance them with contextual or domain-specific knowledge drawn from company documentation about products, processes, or policies.
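At its core, RAG retrieves the most relevant internal documents for a query and injects them into the model's prompt as context. The toy sketch below illustrates the pattern with a keyword-overlap retriever and an in-memory document list (both stand-ins; production systems use embedding-based vector search and a real document store):

```python
from collections import Counter

# Toy document store standing in for enterprise content
# (product docs, process guides, policies).
DOCUMENTS = [
    "Refund policy: customers may return products within 30 days.",
    "Onboarding process: new hires complete security training in week one.",
    "Product X supports batch export in CSV and JSON formats.",
]

def score(query: str, doc: str) -> int:
    """Keyword-overlap relevance score (a real system would use embeddings)."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k most relevant documents for the query."""
    return sorted(DOCUMENTS, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Augment the user question with retrieved enterprise context
    before it is sent to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy?"))
```

The augmented prompt grounds the LLM's answer in company data it was never trained on, which is the mechanism behind the enterprise benefits described above.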
According to McKinsey, RAG has proven its ability to improve the results of companies’ generative AI services, boosting employee and customer satisfaction as well as overall performance.
What is less clear is how to scale RAG across the enterprise so organizations can expand their GenAI use cases. Early efforts to codify repeatable processes for launching new RAG-based GenAI products and services ran into limitations that hurt performance and relevance.
Fortunately, short- and medium-term solutions offer a path for RAG to scale beyond 2025.
Rise of RAGOps
LLMs using RAG require access to high-quality source data. However, because that data is spread across departments, systems, and formats, ensuring its quality and availability tends to be difficult.
For maximum effectiveness, LLMs using RAG also need to connect to the sources from which departments want to draw data (think customer service platforms, content management systems, human resources systems, etc.). Such integration requires a high degree of technical expertise, including experience with data mapping and API management.
Deploying RAG models at scale can also consume large amounts of computational resources and generate large volumes of data. This requires the appropriate infrastructure and experience, as well as the ability to manage data across large organizations.
One approach to mainstreaming RAG that is gaining popularity among AI experts is RAGOps: a methodology that automates RAG workflows, models, and interfaces in a way that reduces complexity while ensuring consistency.
RAGOps allows data scientists and engineers to automate data ingestion, model training, and inference. It also addresses scalability obstacles by providing mechanisms for load balancing and distributed computing across the infrastructure stack. Monitoring and analysis are performed at every stage of the RAG pipeline to help continuously refine and improve models and operations.
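One way to picture per-stage monitoring is to wrap each pipeline stage so it emits metrics. The sketch below is a minimal, hypothetical illustration of that idea (the stage implementations are placeholders; a real RAGOps stack would use a vector index and an observability backend rather than a Python list):

```python
import time

metrics = []  # stand-in for a metrics/observability backend

def ingest(raw_docs):
    """Ingestion stage: normalize raw documents before indexing."""
    return [d.strip().lower() for d in raw_docs]

def retrieve(query, index):
    """Retrieval stage: naive substring match (placeholder for vector search)."""
    return [d for d in index if query.lower() in d]

def timed(stage_name, fn, *args):
    """Wrap a pipeline stage with latency monitoring."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    metrics.append({"stage": stage_name, "latency_ms": elapsed_ms})
    return result

index = timed("ingest", ingest, ["  Refund Policy: 30-day returns  "])
hits = timed("retrieve", retrieve, "refund", index)
for m in metrics:
    print(m["stage"], f"{m['latency_ms']:.2f} ms")
```

Collecting such metrics at every stage is what lets teams spot regressions in ingestion or retrieval and continuously refine the pipeline.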
For example, McKinsey uses RAGOps to help its Lilli GenAI platform sift through 100,000 curated documents. Lilli has answered more than 8 million prompts from nearly three-quarters of McKinsey employees seeking customized insights for their work.
The coming era of agentic RAG
As an operating model for organizations looking to extract more value from their GenAI implementations, RAGOps is likely to take hold in organizations already using operating frameworks such as DevOps and MLOps.
But some organizations may take a more novel approach, one the GenAI industry is headed toward: blending RAG with agentic AI. This allows the LLM to adapt to changing circumstances and business requirements.
Agents designed to perform digital tasks with minimal human intervention are gaining interest from companies looking to delegate more digital operations to software. According to Deloitte research, approximately 25% of organizations will have deployed enterprise agents by 2025, and this is expected to rise to 50% by 2027.
Agentic AI using RAG spans many approaches and solutions, but most scenarios are likely to share some common characteristics.
For example, individual agents might evaluate and summarize answers to prompts from a single document, or even compare answers across multiple documents. Meta agents coordinate the process, manage individual agents, and integrate their output into a consistent response.
Ultimately, agents analyze, plan, and reason in multiple steps within the RAG framework, learning as they perform tasks and adjusting their strategies based on new inputs. This allows LLMs to respond better to more nuanced prompts over time.
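The individual-agent/meta-agent pattern described above can be sketched as follows. This is a toy illustration, not any vendor's implementation: each "agent" here is a simple function standing in for an LLM call scoped to one document, and the meta agent fans the question out and integrates the answers:

```python
def document_agent(doc: str, question: str) -> str:
    """Individual agent: answer a question from a single document.
    (Stands in for an LLM call scoped to that document.)"""
    for line in doc.splitlines():
        if question.lower() in line.lower():
            return line
    return "no answer found"

def meta_agent(docs: list[str], question: str) -> str:
    """Meta agent: coordinate per-document agents and integrate
    their outputs into one consistent response."""
    answers = [document_agent(d, question) for d in docs]
    useful = [a for a in answers if a != "no answer found"]
    return " | ".join(useful) if useful else "no agent could answer"

docs = [
    "Shipping: orders ship within 2 business days.",
    "Returns: refund within 30 days of purchase.",
]
print(meta_agent(docs, "refund"))
```

In a real agentic RAG system, the meta agent would also plan multi-step retrieval and reconcile conflicting answers, but the division of labor is the same: scoped agents answer, a coordinator integrates.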
At least in theory.
Conclusion
The future looks bright for GenAI technology, which will flow from the lab to corporate AI factories as part of the burgeoning enterprise AI sector.
For example, model footprints will shrink as models are optimized to run efficiently on-premises or at the edge on AI PCs and other devices. RAG standardization will expand to include software libraries and off-the-shelf tools.
Whether your organization is adopting RAGOps or agentic AI, solutions are emerging to help organizations scale their RAG implementations.
For example, agentic RAG on the Dell AI Factory with NVIDIA can be applied in healthcare, allowing structured data such as patient schedules and profiles to be used alongside unstructured data such as clinical notes and image files, helping organizations tailor such use cases while maintaining HIPAA compliance and other requirements.
That’s just one bright option. More are coming to help light the way for organizations in the midst of their GenAI journey.
Learn more about Dell AI Factory by NVIDIA.
Provided by Dell Technologies.