Welcome to today's KMWorld webinar, brought to you by Shelf, Coveo, and Progress Semaphore. I'm Marydee Ojala, editor-in-chief at KMWorld magazine, and I will be the moderator for today's broadcast. Our presentation today is titled Unlocking the Power of RAG. Before we get started, I want to explain how you can be a part of this broadcast. There will be a question and answer session, so if you have a question during the presentation, just type it into the question box provided and click on the submit button. We will try to get to as many questions as possible, but if your question is not selected during the show, you will receive an email response within a few days. And now to introduce our speakers for today: Jan Strejetsa, director of data and analytics, Shelf; Emma Zernask Siebeck, product marketing manager, Coveo; and Steve Ingram, senior sales engineering manager, Progress Semaphore. And now let me pass the event over to Jan Strejetsa, director of data and analytics, Shelf. Go ahead, Jan.

Thank you, Marydee. Thanks for the introduction. What I want to focus on for the next fifteen minutes or so is how to successfully implement and deploy retrieval augmented generation applications in the enterprise, and I want to share five practical strategies that you can use to do so effectively. First, a little bit more about me. My name is Jan, director of data and analytics at Shelf. I've been in data and AI for over a decade now, some of that in research, where I worked on and wrote about embedding models, that is text vectorization models, and information retrieval, both now key concepts in retrieval augmented generation. A little bit more about Shelf: we help companies deliver more accurate GenAI answers. And when we talk about generative AI in the enterprise, we are very frequently talking about RAG. Retrieval augmented generation is the dominant technique for implementing generative AI solutions in the enterprise; it has almost become synonymous with enterprise generative AI. How are we helping companies ensure higher quality answers in generative AI? Through improving the quality of the data that is used in generative AI, the data that is used in retrieval augmented generation.

Before I jump into the strategies, I want to share a little bit about the state of retrieval augmented generation in the enterprise. There is immense, transformative potential to generative AI. But, and this quote is very telling, less than ten percent of GenAI products have actually reached the production phase. Teams are getting stuck in POCs, getting stuck in development, and failing to deploy generative AI projects at scale in production. What are the issues? They are all related to answer quality: inaccurate answers, hallucinated answers, inconsistent answers, and unreliable responses. And as a consequence of those, poor user satisfaction, poor customer satisfaction, and then poor adoption. The risks are just too high, because all of this can negatively impact the business. And when we start looking for the cause of those issues, where the source is, it all points in the same direction: poor data quality. This is consistently reported as the number one issue companies have with generative AI and retrieval augmented generation. It is data quality, the data quality that is fueling generative AI algorithms.
And this is especially true in retrieval augmented generation. What I want to do now is share three graphics, almost a very short cartoon, about the responsibility for answer accuracy, or, phrased a little differently, the burden of answer accuracy. Who is responsible for users seeing good or bad answers in our generative AI application? Let's first look at a no-RAG setup, an out-of-the-box large language model, an out-of-the-box generative AI interface where users can interact with the model. Say we ask the question: what is the population of Italy? The large language model will rely on its own internal knowledge, on what it has seen during the pre-training and fine-tuning phases, to produce an answer. That answer will be either accurate or inaccurate depending on what was in the training data. So that burden is fully carried by the large language model.

Now in retrieval augmented generation, something happens: we integrate large language models with our data, with enterprise data. The first purpose of that is to ground the answers in data we control, data we trust. But when it comes to this responsibility, this burden, what we have done in RAG is shift it from the LLM to our data. Let's go through that scenario again: what is the population of Italy? In a RAG system, we will try to find relevant documents, relevant document sections, that can answer this question, and then we specifically instruct the large language model: provide an answer, but only using the data that you see in the context window, in the prompt. Now if that data entry is inaccurate, the large language model will surface it as an inaccurate answer. If our data on the population of Italy is outdated, our user will see an outdated answer. The responsibility has been moved to our data, and now data quality is critical to seeing high quality answers.

If we look at an out-of-the-box retrieval augmented generation pipeline with no quality controls, the concept of garbage in, garbage out very much applies: poor data flowing in will result in poor answers flowing out. Data without proper context, outdated data, and inaccurate data will be served as hallucinations, inaccurate answers, and wrong actions taken, followed by the issues I mentioned earlier: low user satisfaction, low customer satisfaction, and low adoption. And what is the extent of data issues? These are the numbers we see in the real world through processing vast amounts of data. Over ninety percent of documents have at least one issue, at least one inaccuracy; perhaps a missing metadata field. Over a quarter of documents have out-of-date information that is no longer relevant. Exactly a third of documents have some sort of duplication, partial duplication, or redundancies. And over ten percent of documents have compliance risks, say, private information in them. Those data issues are directly resulting in answer issues in retrieval augmented generation pipelines and applications. This is the main point we need to focus on.
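To make the mechanics concrete, here is a minimal sketch of the retrieve-then-generate flow Jan describes, in Python. The `search_index` and `llm` objects are hypothetical stand-ins for whatever retriever and model client you use, not any particular vendor's API.

```python
# Hypothetical retriever and model client; illustrative stand-ins only.
def answer_with_rag(question: str, search_index, llm, top_k: int = 5) -> str:
    # 1. Retrieval: find document chunks relevant to the question.
    chunks = search_index.search(question, top_k=top_k)

    # 2. Grounding: the burden of accuracy now sits on this context;
    #    if these passages are outdated, the answer will be outdated.
    context = "\n\n".join(chunk.text for chunk in chunks)

    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generation: the model composes an answer from the supplied context.
    return llm.complete(prompt)
```

The instruction to answer only from the supplied context is exactly what shifts the burden of accuracy onto the data, which is why the data quality numbers above matter so much.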
This is the critical aspect of retrieval augmented generation: the data that is feeding our algorithms. So what can we do about it? How do we get it right, especially in enterprise settings? Here is where I want to share the strategies. I've outlined five strategies and then four techniques or methods under each. This will not be a deep dive into each strategy; rather, I want this to serve as a toolbox of techniques and methods you can start applying today to improve the quality of your retrieval augmented generation answers.

The first strategy is data enrichment. One key issue generative AI has when we integrate it with our data is the lack of business context. We can provide this by enriching our documents, injecting that context so that when generative AI and large language models interact with our data, they know what the purpose of the data is and what issues it addresses. What are some of the techniques under this strategy? Controlled vocabularies and glossaries: what are the entities, or say acronyms, that we frequently use in our data? A large language model out of the box will not understand what those mean, and this is very frequently what causes hallucinations. Metadata enrichment: what is this document about, who authored it, what purpose does it serve, when was it last updated? All of this is very important context for managing data and, downstream, for large language models to use the data effectively. Topic modeling is another technique: what topics does our knowledge base cover, and to which topic does a particular document belong? Again, this helps contextualize our data. And then point number four, knowledge graphs. Knowledge graphs are really the information structure of the future, and graph RAG is picking up momentum; we can model this information as a knowledge graph and make it available to downstream retrieval augmented generation applications.

The second strategy is identifying data risks that can cause answer issues. I already mentioned a few, but I want to outline four specific ones. Duplicate and redundant information will decrease the efficiency of our system and also inject noise into the context window of the large language model. Inaccurate and outdated information will be surfaced as inaccurate and outdated answers. Conflicting information will confuse large language models. And lastly, privacy and compliance risks: are we protecting our data assets now that we are using them in different applications?

The third strategy is controlling RAG outputs by setting up quality filters. Now that we have the ability to identify some of those issues, we want to prevent them from getting into the context window, from ever reaching generative AI: preventing contradictions and conflicts from being injected into the context window of a large language model, a private content filter, a toxic and biased content filter, which can be very relevant in certain domains, and then a duplicate content filter.
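As a rough illustration of the quality-filter idea, here is a minimal sketch in Python. The exact-match deduplication and the email regex are deliberately crude placeholders; real contradiction, toxicity, and privacy filters typically rely on trained models.

```python
import re

# Crude private-content pattern: flags chunks containing email addresses.
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def filter_chunks(chunks: list[str]) -> list[str]:
    """Drop duplicate and privacy-risk chunks before they reach the
    LLM's context window. Both checks are toy placeholders."""
    passed: list[str] = []
    seen: set[str] = set()
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        if normalized in seen:            # duplicate content filter
            continue
        if EMAIL_PATTERN.search(chunk):   # crude private content filter
            continue
        seen.add(normalized)
        passed.append(chunk)
    return passed
```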
Strategy number four is monitoring RAG answers. We need a way to monitor and understand the quality of our RAG answers, and there are several techniques we can use. User feedback mechanisms: let users provide feedback on answers so that we can understand which answers are not good and are causing user dissatisfaction. Setting up regular conversation quality reviews: again, understanding the quality of our RAG conversations. Implementing a document audit trail: this is something RAG offers; we know exactly which documents were used to generate a specific response, and if that response is of low quality, we now have a direct trail to the document that might have caused the issue. And lastly, automated answer evaluation: we can use models specifically trained to detect hallucinations, and there is also a groundedness score, which in RAG measures how well an answer is grounded in our knowledge. We want that score to be high.
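The groundedness score Jan mentions can be approximated very crudely as lexical overlap between the answer and the retrieved context. The sketch below is a toy proxy only; production evaluators typically use NLI models or LLM judges.

```python
import re

def groundedness_score(answer: str, context: str) -> float:
    """Fraction of answer words that also appear in the retrieved
    context. A toy proxy; real evaluators use NLI or LLM judges."""
    answer_words = set(re.findall(r"\w+", answer.lower()))
    if not answer_words:
        return 0.0
    context_words = set(re.findall(r"\w+", context.lower()))
    return len(answer_words & context_words) / len(answer_words)

# A low score flags an answer that drifted away from its sources.
score = groundedness_score(
    "Italy has a population of about 59 million.",
    "Population of Italy (2023 estimate): about 59 million people.",
)
print(f"groundedness: {score:.2f}")  # the higher, the better grounded
```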
The last strategy is fixing issues at the source, an efficient and scalable way to fix and continuously improve your pipeline. Human-in-the-loop transparency: you need humans to understand where the data issues and the answer issues are. Monitoring data health through time as updates happen: we update our information, we introduce new information, we have more data, and this can introduce conflicts and issues, so you need a way to counteract that process of data quality degradation over time. Clear data ownership: who owns documents, and who is responsible for ensuring data accuracy. And lastly, a ticketing system for issue resolution. I mention this one because the team implementing RAG is very frequently not the team that owns data quality, so there needs to be a regular communication channel allowing those teams to work together, because both are required for a successful RAG implementation.

I just want to very quickly show this slide again: garbage in, garbage out. Poor data quality flowing in will result in poor answers flowing out. What we see work in practice is having a layer that allows you to control data quality and to understand the quality of conversations. Having those two inputs allows you to continuously improve the quality of your retrieval augmented generation application. At the start I mentioned that at Shelf we help companies deliver higher quality GenAI answers. The techniques and strategies I shared today are built into our platform, a data quality control and continuous improvement platform that lets you see the data issues that are causing RAG issues and see the quality of your conversations. And when the quality of answers is not where it should be, you know exactly where the issue lies and exactly which document caused it. That gives full transparency to the team implementing and controlling RAG, so issues can be addressed in a continuous manner and you can improve your application through time.

I've shared a lot, so let me very quickly summarize the key points. We see that the majority of GenAI projects are failing to get to production; the data says over ninety percent of those projects get stuck in the POC stage. Poor data quality is the number one obstacle, and enterprise data issues are directly causing answer issues. This is where teams should focus their attention: improving the data that is used in retrieval augmented generation. And how can we do that? Five strategies, or five key concepts, with underlying techniques. The first is data enrichment; then identifying data risks; filtering high-risk data out before it ever reaches your GenAI application; monitoring retrieval augmented generation answers, understanding the quality of the answers we expose to users internally or externally; and lastly, fixing issues at the source. There are different ways you could control a RAG answer, but the way to really address this issue scalably is at the source, in the data.

Retrieval augmented generation can sometimes seem overwhelming, because you get something working very quickly in the POC stage, but then you see everything that still needs to be done to have something in production that is actually compliant and good enough to expose to users, and that can be hard to grasp. What are the critical areas we need to focus on? For this reason, we have prepared an assessment for all the webinar attendees: twenty questions you must be able to answer to ensure successful retrieval augmented generation. The assessment lets you see all the things you need to think about, all the critical areas you need to consider and improve, to ensure you will see consistently high quality answers in a retrieval augmented generation application. If you are just exploring retrieval augmented generation, it can give you an understanding of everything you need to consider. And if you already have something working in production and are perhaps struggling to see consistent answers, it can show you some of the things you really need to focus on. To get the assessment, just indicate your interest in the chat or in the question menu; write 'yes, Jan' or 'yes, Shelf' and we will get it to you. It takes no more than fifteen minutes to complete, and it can give you great input for your RAG initiative. That's it. Thanks, and back to you, Marydee.

Well, thank you, Jan. And now let me pass the event over to Emma Zernask Siebeck, product marketing manager, Coveo. Go ahead, Emma.

Thank you, Marydee. And what a great presentation from Jan, truly highlighting the challenges with implementing GenAI and data quality, and further underlining the importance of retrieval augmented generation. Now I'll be diving into how to unlock the true power of retrieval augmented generation, or RAG, by focusing on what makes the quality of the retrieval component so important for delivering relevant and accurate responses, and therefore business outcomes. Let's unpack the growing knowledge crisis in the digital workplace and explore why RAG is the key to overcoming it. Before we get into the details, let me quickly introduce Coveo. We are leaders in AI-powered search and generative experiences, with nearly twenty years of development and over ten years focused on AI. Our platform is purpose-built to handle enterprise complexity at scale.
Our mission is to help organizations deliver the most relevant, personalized, and secure information to employees and customers, no matter where it lives. We achieve this by leveraging our deep expertise in AI relevance and secure content retrieval. As knowledge, taxonomy, and employee systems experts, I'm sure many of you have first-hand experience with the growing enterprise knowledge crisis. With the advent of generative AI, many of us are now observing a knowledge and data crisis emerging across countless industries. This technology reveals both the potential and the shortcomings of how organizations manage and use content and data, internally and externally. The alarming rate of enterprise data growth is impacting employee productivity and proficiency, underscoring the need for better knowledge management and knowledge discovery solutions. The impacts are already being felt: Forrester predicts that employee engagement will sink to thirty-four percent in twenty twenty-four, part of a continued annual decline in worker engagement since the pandemic. Our latest employee experience industry report also found that fifty-six percent of employees still struggle to find the information they need on their own, ultimately affecting their performance. Further, IDC projects that seventy-five percent of organizations have increased their technology investments around data lifecycle management due to generative AI. This is significant, as thirty to fifty percent efficiency gains have already been achieved by employers adopting innovations like GenAI. However, RAG is crucial for these initiatives to succeed. Tools like GenAI can optimize workflows, but it is not just about generating insights; it's about ensuring those insights are grounded in accurate, relevant data through precise retrieval. With the right technology and approach, enterprises can unify data and leverage precise retrieval to improve employee experience and drive more efficient decision making, without burdening the experts who design these experiences.

GenAI is at the top of every employee experience leader's digital transformation checklist, but it comes with risks. GenAI alone is not an easy fix to your enterprise's knowledge crisis. Generative models, when used without context, as seen in the simple diagram here, are prone to hallucinations, use outdated information, and can lack control over the sources they pull from. Here is where RAG makes a meaningful difference. By grounding GenAI models in a curated source of truth, RAG ensures that answers are accurate and reliable. This addresses the main concerns associated with GenAI, transforming it from a novelty into a valuable, effective tool in enterprise settings. With RAG, generative answering gains depth and accuracy anchored in verified data sources. This grounding process is critical because it enables models to provide answers that reflect real-world, up-to-date knowledge rather than isolated, outdated content. RAG's real advantage is its ability to enhance accuracy, reduce hallucinations, and provide control over knowledge sources. With source citations, secured content access, and cost-effective implementation, RAG transforms the way organizations can leverage AI, delivering not just answers but answers you can trust. However, even with RAG, not all implementations are equally effective. RAG alone is not enough: retrieval plays a huge role in ensuring that generated content is relevant and impactful.
A subpar retrieval setup can still lead to inefficiencies, which means that simply adopting RAG is not a silver bullet. There are numerous approaches to RAG, each with varying levels of effectiveness and sophistication. Some organizations opt for basic retrieval methods, while others, like Coveo, invest in an advanced, scalable retrieval infrastructure. Choosing the right approach is critical to achieving optimal business outcomes, so consider: which approach best aligns with your needs? At its core, retrieval is what brings RAG to life. It grounds the generative capabilities of AI in truth, amplifying relevance and impact for each answer. This process ensures that employees receive information that is not only accurate but also tailored to their needs, making retrieval the real differentiator in RAG.

So what exactly goes into making retrieval work? It starts with secured connectivity to your data sources, to guarantee employees are only able to see what they are allowed to see based on their role and department. This is followed by indexing, vectorization, and applying AI relevance to ensure the right content is surfaced for each user. At Coveo, we've spent the last twenty years building a mature, enterprise-grade retrieval system, with the last decade focused on AI, to handle these complexities at scale. Building an enterprise-grade retrieval system isn't easy; it requires significant financial, time, and resource commitments. At Coveo, we've made those investments so you don't have to. This allows you to leverage a proven, scalable solution right from the start, without the need for constant upkeep. Where Coveo stands out is in combining RAG with AI relevance, factoring in user intent, context, and behavior. This results in personalized, secure, and relevant answers at scale. The G in RAG may seem like the headline, but the R is what truly drives meaningful business outcomes. Investing in digital workplace transformation is not just about adding more tools, but about integrating systems that enable efficient access to knowledge. With Coveo's advanced RAG capabilities, organizations can improve employee productivity, reduce internal support costs, and enhance overall decision making, all essential to keeping engagement high and business outcomes strong.

And I do want to highlight, touching back to Jan's point earlier, that for this retrieval-focused portion of RAG it is still critical to have clean data quality going into it. Garbage in, garbage out is still relevant no matter the retrieval system, but a strong retrieval system with layers of tuning can help ensure that, even without the most efficient data management practices in place, you can still mitigate some of the hallucinations you might have experienced otherwise. And with that, I'd like to thank you all for joining today's session. I hope this has provided valuable insight into how precise retrieval can supercharge your RAG applications. If you have any questions or would like to learn more about how Coveo can help you amplify knowledge across your organization, please feel free to reach out to us. And with that, back to you, Marydee.
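As a rough picture of the secured-connectivity step Emma describes, here is a minimal sketch of permission-trimmed retrieval in Python. The `index.search` call and the group model are illustrative assumptions, not Coveo's API.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    allowed_groups: set[str]  # groups entitled to read this document

def permission_trimmed_search(query: str, user_groups: set[str],
                              index, top_k: int = 5) -> list[Doc]:
    # Over-fetch from the (hypothetical) relevance index, then drop
    # anything the user's role or department does not entitle them to,
    # so secured content never reaches the LLM's context window.
    candidates = index.search(query, top_k=top_k * 4)
    visible = [doc for doc in candidates if doc.allowed_groups & user_groups]
    return visible[:top_k]
```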
Well, thank you, Emma. And now let me pass the event over to Steve Ingram, senior sales engineering manager, Progress Semaphore. Go ahead, Steve.

Thank you, Marydee. So, yes, I'm Steve Ingram, and I lead presales for Progress Semaphore. Semaphore is Progress's semantic AI product, probably more widely known as Smartlogic. My background is in search and digital transformation, so perhaps in a parallel universe I've just handed over to Emma for their Semaphore presentation. Anyway, back to this universe and on with my presentation. I'm going to be covering how enterprise semantics can enhance a RAG solution, leading to better outcomes and, hopefully, lower costs.

GenAI is happening; it's a thing. Economists are forecasting an industry worth over a trillion dollars by twenty thirty-two in the US economy alone, whereas Gartner is predicting that an awful lot of money will be wasted before that happens, with thirty percent of projects failing in the next year or so. But if you follow the guidance in these presentations, and I've been really enjoying the kind of convergent evolution as everybody comes to a point around RAG, you will maximize the chances that your project will not be one of the ones that fails. For all the promise in GenAI, there are still many challenges that need to be addressed before it can deliver real value. We've talked about hallucinations, and if anybody hasn't checked out the Air Canada chatbot disaster that Jan had on his slide, it's worth looking into. Studies have shown that hallucinations can occur between fifteen and twenty percent of the time. Then there are data biases: an LLM asked about career choices might suggest that nursing is more suitable for a female candidate than a male one, reinforcing an outdated and inaccurate stereotype, because that is what was represented in the data used to train the LLM. That data is inevitably going to be out of date, so we have training cutoff issues, and that's before we even start to consider security issues, intellectual property issues, and all those kinds of factors.

The biggest challenge, though, is context and understanding. LLMs are limited to the world as defined by the data they are trained with, and the context of a piece of information is as important as the information itself. We need things, not strings. It's a semantics problem. Now, I would say that, wouldn't I? But bear with me, hear me out. This is my favorite example: the word 'gas'. It's three letters, but those same three letters have a completely different meaning depending on where in the world you're based. In North America, it's what you put in your car. Everywhere else, it's not a solid and not a liquid, and it's more often than not hazardous. Even going the other way, look at what people in the rest of the world put in their car. Most people call it petrol; we may call it diesel. If you're a trader, or involved in getting the stuff out of the ground, you'll call it petroleum or petroleum products. If your job is in logistics, you might use the HAZCHEM code, 3YE 1270. If you're a chemist, you'll be more comfortable using the chemical formula. But it is the same thing. So we have a single word referring to multiple concepts, and multiple words for the same concept. It's a semantics problem. But all an LLM has is the proximity of words; it has no way of expressing concepts such as usage in any other way.
And that's going to be okay for consumer-focused applications, where just taking the most common use of a word is likely to be acceptable, but decision making in organizations typically requires a greater level of detail. For example, here's the answer to the question: can you give me a few examples of companies that are incorporated in California? That looks like a decent list if you were asked for Californian companies; these are the ones you'd probably think of, maybe not Tesla these days, but you know what I mean. And the first one is in fact correct. The rest? Not so much. This is because the LLM does not understand what is meant by 'incorporated' in a legal context. Most people don't really care about the distinction, but there will be people for whom it is important. So we have a problem: we want to get accurate and relevant answers from GenAI, and be able to rely on those answers, without incurring the expense of developing our own LLMs.

RAG provides the LLM with the business context that allows it to give relevant, accurate, and trustworthy answers. In fact, RAG is really starting to come into its own. Not so long ago, the industry was disparaging RAG as a solution, not really surprisingly: the vector database vendors wanted you to take all your content out of the systems you currently use and put it into a vector database, and NVIDIA wanted you to invest in large data centers with lots of shiny chips. But times have changed. Organizations are realizing they can leverage GenAI using what they already have. I'm going to talk about the steps towards putting that solution in place, and this is all part of what we refer to as the Progress Data Platform, a solution combining many of Progress's application and data platform products, including OpenEdge, DataDirect, Corticon, MarkLogic, and indeed Semaphore. This solution is designed to allow organizations to take advantage of modern technologies such as GenAI, knowledge graphs, data mesh, and data fabrics, but in an organic manner, leveraging what is already in place.

And this is how we use the platform to enable a RAG solution. Here is a typical interaction with an LLM: a user formulates a query, they pass that query to the LLM, and the LLM responds with an answer. It may not be the right answer, but it responds with an answer. What is missing from this picture is subject matter expertise and context from your internal assets, your internal data, and this is exactly what RAG brings to the equation. The first step to leveraging your internal knowledge is to make it accessible. Knowledge exists in two forms. Implicit knowledge is what's in your experts' heads, and that's at least as important as the explicit knowledge that has been committed to paper or some form of electronic record. We use Progress Semaphore to capture subject matter expertise into reusable knowledge models, sometimes called controlled vocabularies, taxonomies, or business glossaries; call them what you will, it's all about encapsulating the expertise inside your organization. And for explicit knowledge, we have Progress MarkLogic, an enterprise-strength multi-model database that works particularly well as a metadata hub and also provides a very capable full-text index and now, indeed, vector search capabilities.
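One way to picture these reusable knowledge models in code: each concept carries a preferred label, alternative labels, and relationships to broader concepts. The structure below is an illustrative sketch, not Semaphore's actual format; such models are commonly expressed in standards like SKOS.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    preferred_label: str
    alt_labels: list[str] = field(default_factory=list)
    broader: list[str] = field(default_factory=list)  # parent concepts

# Two toy concepts echoing the examples from the talk.
MODEL = {
    "digital twin": Concept("digital twin",
                            alt_labels=["device shadow", "digital replica"]),
    "petroleum": Concept("petroleum",
                         alt_labels=["petrol", "gasoline", "diesel"],
                         broader=["fossil fuel"]),
}
```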
So now we have both our explicit and implicit knowledge under control, and it's all plain sailing, right? Well, not quite. The next step is to align the implicit and explicit knowledge, as Jan said, to build a knowledge graph. In our case, we use the Semaphore Classification Server to enrich the data held in MarkLogic, using rules generated automatically from the knowledge models. This applies a rich, consistent, and relevant layer of metadata over the top of the internal content. And now that the content is enriched, we can do the same for the question. We use the Semantic Enhancement Server, another component of Semaphore, to enrich the question: we identify the concepts in the question, and then we can use vector search, BM25, and other search algorithms to identify the relevant content that can then accompany the question to the LLM. We can further extend this mechanism to provide on-demand vectorization of content, offering significant savings in exploiting GenAI. Because our content is semantically enriched, we can identify the relevant sections of our documents and use these, and only these, to accompany the original question. So we maximize the benefit, but also keep safely within token limits, further keeping costs under control. This additional information is what gives the LLM the context it needs to give specific answers.

To demonstrate this, we actually created a simple application: we took our internal product documentation and semantically tagged it, in a RAG solution, so we can ask the platform a question. In the first panel, on the left-hand side, we get the response from a public LLM without any private data. The next panel is a combination of the LLM and the private data, and the third panel shows how a knowledge model can also contribute to this process. In this example, we asked about MarkLogic and digital twins; a digital twin is a digital representation of a real entity, such as an aircraft or a train. The LLM has a response, as it found some public data on our website, but bear in mind that data is probably two years old. Our private data brings back more accuracy and indeed now links back to the relevant internal documentation. But if we add the knowledge model to the solution, the improvement is marked. Now, a 'device shadow' is another way to describe a digital twin. The LLM cannot answer the question because it wasn't trained on any content from MarkLogic that used the term 'device shadow'. Our internal documentation is not much better, because it's not a term that was used when the content was created. But by combining this with a knowledge model, where the relationship between 'device shadow' and 'digital twin' is known, we can serve up the same content to answer a related question.

Early results from this solution have been exciting. Customers are reporting a massive increase in correct answers, in one case up from fifty percent to ninety percent and beyond. And these benefits are being realized in weeks, not months. It's the last statistic that I find truly remarkable: ninety-nine point five percent accuracy. In a test of two hundred questions, only one of them was incorrect.
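A minimal sketch of the query-enrichment step behind the device shadow example: expand the question with the preferred label of any concept it mentions, so the retriever can match content that only ever says 'digital twin'. The synonym table is illustrative only, not the Semantic Enhancement Server's API.

```python
# Toy synonym table: alternative label -> preferred label.
SYNONYMS = {
    "device shadow": "digital twin",
    "digital replica": "digital twin",
}

def enrich_query(query: str) -> str:
    """Append preferred labels for any alternative labels found,
    widening the net for BM25 and vector search downstream."""
    lowered = query.lower()
    extra = [preferred for alt, preferred in SYNONYMS.items()
             if alt in lowered]
    return f"{query} {' '.join(extra)}".strip() if extra else query

print(enrich_query("How does MarkLogic support device shadows?"))
# -> "How does MarkLogic support device shadows? digital twin"
```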
Compare that to an unmodified LLM, which could return between thirty and forty incorrect answers in the same sample. That's bad enough, but with that many incorrect answers you're increasing the chance of a plausible but misguided answer coming back, with potentially huge implications. So RAG is delivering on the promise of faster, better, more accurate, more reliable answers by leveraging what you already have under your control. A RAG-based solution can easily be adapted without costly retraining, and it is agile, adapting to specific enterprise needs and operational context, giving organizations that adopt RAG a competitive edge without the need to reinvent the wheel or retool processes that have been working well for years. By way of conclusion, we have a virtuous circle: using RAG, semantics can be used to harness the power of GenAI, and GenAI can also be used to enrich semantic modeling, although that is a subject for another presentation. Retrieval augmented generation powered by semantics will deliver significant benefits, but those benefits will only be maximized by organizations that take information science, and the management of both implicit and explicit knowledge, seriously. And with that, I think it's back to Marydee for the Q&A.

Lovely. Thank you to the three of you. And yes, we would like to turn to some questions from our viewers. Let me remind you all: if you have a question, just put it into the question box and click submit, and we will hopefully get to all your questions. We do have one question here that is quite practical. I think it was Jan who was talking about data quality, but I think we're all interested in data quality. The question that came in is: who should own data quality? Jan, let me start with you on that one.

Yep, sure. This is a phenomenal question. The straightforward answer would be: the one who has the knowledge to fix data issues and maintain quality. But the challenge that very frequently arises in organizations that are deploying or building RAG solutions is that this person is frequently not on the RAG implementation team. Say you have developers and data scientists working on the RAG solution, controlling the framework, building the system. They are given data that the system should use to generate answers, but they don't necessarily control the data quality, and they don't necessarily know how to fix some of the data issues. What we're seeing change now, from an organizational standpoint, is bringing together teams that perhaps did not work together before: teams that do control data quality and manage the knowledge, and teams implementing RAG. Those teams now have to have a clear communication channel so they can work together on an end-to-end successful RAG pipeline.

Emma? I think Jan answered it very thoroughly; I completely agree with him. Steve, anything to add? It's a brilliant question. I'm reminded of Zen and the Art of Motorcycle Maintenance, where somebody asks about quality and the guy spends a day thinking about it. Just to reiterate what Jan was saying: the people who know the subject should be empowered to make those decisions.
And yes, that is the way, and the best thing you can do is make the quality function as automatic as possible, but leverage the information that is in the experts' heads.

Okay, and I hope that happens in all organizations. We have several questions about hallucinations, which is probably not surprising considering the topic is RAG. I'm going to try and put them together here. Obviously, RAG exists to mitigate or eliminate hallucinations, but they still seem to happen. Although I was quite impressed by your one company, Steve, that had ninety-nine point something percent accuracy; that was quite amazing. What are the obstacles you find to eliminating or mitigating hallucinations when you're implementing retrieval? Emma, let me start with you on that one.

Yeah, that's actually a great question, something that is very top of mind and that Coveo takes very seriously. I can give you an example with one of our clients, United Airlines. Obviously, as Steve spoke to the hallucination that came up with Air Canada, United Airlines did not want to experience those same sorts of risks when implementing GenAI. What we do at Coveo is ensure that we have guardrails in place for certain questions and certain ways that people can try to trick the AI. We add those rules, and for specific questions we tune the AI to specifically not generate an answer. So for something like 'can I put my child in my carry-on luggage?', we will simply refuse to answer questions of that nature, as opposed to attempting to generate a response to address it. That's one of the ways we do it at Coveo.

Yeah, and actually, you were the one who brought up Air Canada in the first place. For people who don't know what happened, do you want to give a brief synopsis of the Air Canada fiasco? Yeah, sure. It was essentially a chatbot providing inaccurate information. I believe it was a customer service chatbot, and it was giving wrong suggestions on how to address the customer's issue. I'm not exactly sure what the customer's issue was, but the answer was far from helpful. I think it was a bereavement fare, wasn't it? I think so, yeah, that's right. And Air Canada tried to get out of it by just saying, oh, it's a chatbot, and then quickly found out that they couldn't actually do that. Yes, the courts, I think, disagreed with Air Canada on that topic. They did. Yep.

So the guardrails, Emma, are a pretty good solution. Do we have other solutions, Steve? How did you end up with the ninety-nine point something percent? That was incredible. I don't actually know the details; I literally got that slide on Friday. But my thinking is that an LLM will hallucinate when it doesn't have the context to give an answer, and if you're providing it with relevant material, relevant context, then the need for it to make stuff up is reduced.
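Emma's guardrail approach can be pictured as a small pre-generation filter: questions matching blocked patterns get a fixed refusal and never reach the model. A minimal sketch, with an illustrative pattern and wording only:

```python
import re

# Illustrative blocked-topic patterns; real guardrails are far richer.
BLOCKED = [
    re.compile(r"child.*(carry[- ]?on|luggage)", re.IGNORECASE),
]

REFUSAL = "I'm sorry, I can't help with that. Please contact an agent."

def guarded_answer(question: str, generate) -> str:
    if any(pattern.search(question) for pattern in BLOCKED):
        return REFUSAL          # refuse instead of attempting an answer
    return generate(question)   # otherwise fall through to the RAG flow

print(guarded_answer("Can I put my child in my carry-on luggage?",
                     lambda q: "..."))
# -> the refusal message
```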
Mhmm. Steve, we do have a question for you here: can you elaborate on the scalability of GenAI, particularly when you're working with large volumes of unstructured data?

Yeah, that is a great question. Again, what do we mean by scalability? Is it the volume of data? Is it the volume of applications? It really is a multifaceted problem. On the technology side, you need a scalable platform, and you need to be able to reach all of your resources; you can't survive if everything is siloed. You have to have reach going across your organization, and you need to be able to enrich that content, tied into your unified knowledge graph, if you want to call it that. And that can only be done automatically; you can't just throw humans at it. You need to distill what's in the humans' heads, make it into a form that you can actually start to apply to your data, and then spread the enriched data right across your organization. Obviously, you've got to make sure at the same time that you're respecting regulations and standards and all the rest of it. So there are multiple facets to just the word 'scalable'.

Yeah, that's true. Emma, Jan, do you want to add anything about scalability? Yeah, sure. I think Steve covered a lot of it. One part is infrastructure: supporting the volume of data that we now need to process. The other aspect I would like to mention is that as we increase the volume of data we use, there are a few things we need to consider with RAG. One is data quality: some data quality issues have an increased probability of occurring, such as duplicate information and conflicting information; there is much more we now need to manage. So we need to look for tools that allow us to handle a larger volume of data and give us some understanding of what is good and what is not. The second thing, and I think this is what Emma's presentation focused on, is the retrieval part: the more data the RAG system needs to search through, the harder it will be to find the accurate pieces of information for a given incoming query. So again, deduplication, plus the context around the data, helps the retrieval system get to the accurate piece of information, even when the volume of data is very large.

Can I add on that piece? You don't want to throw everything at it; it's important to throw just the bits that are relevant. Accurate search, retrieval, and enrichment, making sure you find the key documents in your space that are relevant to the question, will also help you reduce costs. You're not trying to boil the ocean and throw everything at the LLM.

Right. Emma? Yeah, just to add to that a little more, similar to what both Jan and Steve have been saying: another way we tackle scale is considering user permissions and ensuring that goes into the retrieval piece as well. Any time we are retrieving things for a given query, we use any contextual information we have about the user to ensure we're only pulling from information that would be relevant to that user specifically, which can help.

We do have a question here also on semantic enhancement servers: how can they enrich the questions posed to a GenAI RAG solution? Steve? Yeah, I can do that one, probably.
I mean, it's all about using the knowledge model to basically normalize the vocabulary. Say you have multiple audiences with different frames of reference; medical is the classic situation, where doctors and medical providers have one set of vocabulary and patients and carers have another. The Semantic Enhancement Server determines the standard term; you can basically think of it as synonyms. We're maintaining lists of terms that are related and managing the relationships between those terms, and you can use that to say: okay, they've asked about 'device shadow', but actually we know that's a digital twin. That's literally the process. So we're making sure a wider range of ways of expressing the same concepts can be handled by the LLM.

Emma, Jan, either one of you? Yeah, I think Steve covered it wonderfully. I'll maybe just add that we've seen it is very impactful indeed: this step of enriching the query to help the retrieval system get to the accurate document, taking into account some of those organization-specific vocabularies, terms, and entities. This is something that will help retrieval, as it does rely on semantics and semantic matching to actually get to the relevant document. It is indeed very impactful. Sure, and I would just add something similar that we've experienced at Coveo: we have a semantic encoder model that we apply in the generation layer, and it definitely helps.

Okay, all right. We have another question here about employee productivity, and I'm going to throw this one to Emma. You were talking a lot about retrieval, the R in RAG being for retrieval. How does that enhance employee productivity compared to a more traditional search system?

Yeah, so the retrieval in RAG really can go beyond traditional keyword-based search by delivering contextually relevant answers rather than just a list of results. It can understand a user's question, pull in the most relevant pieces of information, and enable the generative model to construct a meaningful response. This reduces the time employees spend searching across multiple sources and boosts productivity by giving them quick, precise answers grounded in information they can then verify through the provided sources. By eliminating the search fatigue associated with sifting through potentially irrelevant results, the retrieval in RAG really can help employees make decisions faster, get the information they need, and move forward with more confidence.

I love that phrase, 'search fatigue'. I'm stealing that. You're welcome to it. Jan, do you have anything to add? No, nothing really; I think Emma covered it. I think now we can also see data coming out that actually measures how productivity has in fact increased. I think there was one study published by Microsoft not long ago, which was done at scale.
There are visible gains in productivity between employees that have access to RAG solutions and employees that do not and search in a more traditional way.

Okay, good. Steve, anything? Nope, I don't think so; I think you guys have covered it. Okay, all right, so we'll all be more productive. Or perhaps have more time to drink coffee.

Well, this is actually all the time we have for questions today, and we apologize that we were unable to get to all of your questions. But as I stated earlier, all questions will be answered via email. I would like to thank our speakers today: Jan Strejetsa, director of data and analytics, Shelf; Emma Zernask Siebeck, product marketing manager, Coveo; and Steve Ingram, senior sales engineering manager, Progress Semaphore. If you would like to review this event or send it to a colleague, please use the same URL that you used for today's event. It will be archived for ninety days, and you will receive an email with the URL to view the webinar once the archive is posted. If you would like a PDF of the deck, go to the handout section once the archive is live. Thank you again for joining us.