Welcome to today's KM World webinar, brought to you by Coveo. I'm Marydee Ojala, editor-in-chief at KM World magazine, and I will be the moderator for today's broadcast. Our presentation today is titled "GenAI Success Begins with Content: Five Strategies for Accuracy and Precision." Before we get started, I want to explain how you can be a part of this broadcast. There will be a question-and-answer session, so if you have a question during the presentation, just type it into the question box provided and click the submit button. We will try to get to as many questions as possible, but if your question is not selected during the show, you will receive an email response within a few days. And now to introduce our speakers for today: David Atallah, product manager, Coveo; Mathieu Lavoie-Sabourin, product manager, Coveo; and Patricia Petit Liang, product marketing manager, Coveo. Let me pass the event over to Patricia.

Thank you so much, Marydee. Hi, everyone, and thank you for that wonderful intro. My name is Patricia Petit Liang, product marketing manager here at Coveo. Today we're excited to talk about proven strategies for achieving accuracy and precision across your generative answering solutions: strategies that can enhance and boost knowledge discovery across your organization, and help you transform scattered structured and unstructured content into actionable, accurate, and reliable answers. We're here to share some of our learnings from implementing GenAI at some of the world's largest tech companies.

For those of you who don't know Coveo: Coveo is the world's leading AI relevance platform, powering secure, accurate, and individualized AI search and generative experiences for global innovators across both customer and employee experiences. The platform is built for the future of knowledge discovery, with best-in-class content retrieval, a unified hybrid index, and a fully managed AI suite that can surface content from anywhere. Our goal today is to help you cut through enterprise complexity with these strategies so you can deliver award-winning digital experiences.

A quick run through the agenda: I'll share some emerging trends in generative AI — and it feels like a lot has emerged just in the last week — then Mathieu will talk about best practices for optimizing content, David will join us later to talk about strategies for continuous improvement, and of course we'll close with the Q&A.

So, let's get started on emerging trends in generative AI. Looking at the market and the trends to watch in 2025: according to McKinsey, up to 70% of an employee's time can be freed by GenAI and other AI technologies, allowing them to focus on higher-value, more strategic projects. But we also know that AI and generative AI don't fix everything.
We're seeing that about 55% of organizations are avoiding specific GenAI use cases, with data-related issues and privacy among the key barriers to adoption. On top of this, the Boston Consulting Group found that more than 66% of enterprise leaders are ambivalent about or dissatisfied with the progress of their AI and GenAI initiatives, citing a shortage of talent and skills as part of the reason their companies are struggling. We also know that search and RAG improve GenAI performance, but the market is saturated right now with countless black-box solutions.

And even the largest leaders in AI and search technology are struggling. In 2024, when Google released its AI answering features, it was telling us to put glue on pizza. Now it's 2025, and we're still seeing smaller stumbles that can have rippling effects across the industry — for example, Apple's generative AI producing problematic news summaries containing hallucinations. If 2024 was a year of GenAI experimentation, it was also a brave new time for GenAI failure. There's something admirable about putting yourself out there and failing, but when you have stakeholders and shareholders to appease, you want a strong foundation in any tool you adopt, so you can succeed quickly — and fail quickly.

Technology should empower employees to work smarter, not harder. AI isn't a one-size-fits-all solution, but it can come close when it's practiced properly. We want technology to unify workflows and remove barriers with both accuracy and precision — and accuracy and precision really are the biggest pieces to prioritize here. According to Deloitte's research — and everybody knows the saying for 2025 by now: garbage in, garbage out — when it comes to content to fuel generative AI with accuracy and precision, about 75% of organizations are ramping up their investments in data lifecycle management to prioritize security and content quality.

Moving on: GenAI performance relies heavily on your content strategy, and we've been seeing this over the past few years. When it comes to safeguarding against misinformation, content cleanliness has never been more important. We're seeing unforeseen results from generative answering expose hidden gaps in content management: it surfaces answers drawn from content you didn't even know was being indexed by your system. It can also struggle to distinguish valid from outdated information; when you overload it with too many files, too many sources, too much content, it has a hard time distinguishing the truth. And vendors are still offering very limited controls over these black-box solutions, which makes training decisions vital. A strong content strategy is your best defense against unreliable outputs, and your best way to secure the performance and integrity of your generative AI solution.
And of course, we know boiling the ocean is an unsustainable content approach, which is why we're so happy to share some of the best practices for optimizing your content for GenAI. Mathieu, do you want to take it away?

Yes, thank you, Patricia. Very happy to be here and to have the opportunity to share some of the great findings and learnings we've seen from our enterprise clients. Over the years they've been working with us, we've been helping them retrieve the great content they have, and we've seen the practices they implemented to get relevant, accurate, and secure search. But as good as our retrieval was, there was always a strong dependency on quality content. We're now seeing the same thing with generative answering: high-quality, accurate, trustworthy content has become more and more important. A human can easily spot content that may be old, or that was written by, say, people posting questions on a forum or community. But for GenAI and LLMs, it's impossible to distinguish the truth from outdated or opinionated content. As such, a proper content strategy is becoming more and more important for generative answering, and as Patricia elaborated, we need accurate content for generative answering solutions to be efficient, accurate, and secure.

So I'll go over some of the best practices we've seen over the years as we've evolved from a search-only platform to a search and advanced generative answering platform. First, a quick overview of the five strategies, and then we'll slowly dive in. The first is permissions: ensuring we deliver only answers that users are allowed to see — answers that are secured and accurate. The second is filtering: ensuring we only index, and only give LLMs, the content they should be seeing — content that is accurate and relevant. The third is optimization: how to structure the content — titles, document structure — to make it easy for an LLM to digest. The fourth is categorization: how to organize content into meaningful groups to improve retrieval accuracy. And finally, relevance: which content you actually want to include in your models. Content that's perfectly fine for search, for humans to retrieve, might not be relevant or high-quality enough for an LLM to use. We'll review each of these along with strategies we've seen our clients implement successfully.

First things first: indexing security permissions. Personalized and secured answer generation relies on indexing permissions. This was one of the main struggles we tried to solve when we first deployed our generative answering solution — and we were in luck, because we already had this built into our platform. It's key that LLMs have access only to the content the user is allowed to see, whether it's proprietary or permission-controlled, in order to generate an answer that is accurate and secured for that same user.
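To make the permission model concrete, here is a minimal sketch of the practice Mathieu describes next: item-level permissions stored alongside each indexed document and enforced at retrieval time, so nothing a user cannot see ever reaches the LLM. The data shapes and function names are illustrative only, not Coveo's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class IndexedItem:
    doc_id: str
    title: str
    body: str
    # Item-level permissions resolved at indexing time ("early binding"):
    # the users/groups allowed to see this item, copied from the source system.
    allowed_principals: set[str] = field(default_factory=set)

def retrieve_for_user(index: list[IndexedItem], query: str,
                      user_groups: set[str]) -> list[IndexedItem]:
    """Return only documents this user is authorized to see.

    The security filter runs *before* relevance ranking, so a document the
    user cannot see is never a retrieval candidate -- and therefore can never
    be passed to the LLM as grounding context.
    """
    visible = [item for item in index if item.allowed_principals & user_groups]
    # (A real engine ranks `visible` by relevance here; omitted for brevity.)
    return [item for item in visible if query.lower() in item.body.lower()]

# The same question yields different grounding for different users:
index = [
    IndexedItem("kb-1", "Public FAQ", "How to reset your password...", {"everyone"}),
    IndexedItem("kb-2", "Internal runbook", "How to reset your password (admin)...", {"support-agents"}),
]
print([d.doc_id for d in retrieve_for_user(index, "reset your password", {"everyone"})])
# -> ['kb-1'] ; an agent in {"everyone", "support-agents"} would get both.
```

The key design point is that the filter constrains what is retrieved rather than trying to censor the LLM's output after the fact.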
As such, the best practice is to index item-level permissions, directly from your content repositories, within the retrieval engine that will be used to generate answers. We built the technology to do that, and we've seen with our enterprise clients that this is pivotal to delivering answers that are tailored to the user but secure — answers that only tell users what they're allowed to know. Once that's done, only documents the user is authorized to see are retrieved and passed to the LLM to generate an answer tailored to that user. That ensures responses respect the user's permissions, ensuring accuracy as well as confidentiality. That's the first strategy — very important, and the foundation for everything else.

Secondly: everything that is useful should be available, and nothing else. What we often see when indexing content from articles and websites is that different parts of a page could be useful in some use cases but shouldn't be made available to an LLM. We want to ensure everything useful is included in our indexing strategy, but also that everything that is not useful is excluded. The relevant content of an item — for a knowledge article, the issue, the description, the solution to the specific problem — of course needs to be indexed and shared with the LLM to produce an answer. But if the page also contains boilerplate, we need to make sure it's ignored: navigation needs to be excluded, along with headers and footers — essentially ensuring all irrelevant information goes unindexed, so the LLM can use only what's relevant. If your content is cut in the middle by a promotion, or there are menus talking about completely different things, we want to ensure that doesn't confuse the LLM and produce wrong answers from boilerplate that should never have been made available to it. Proper filtering is key to accuracy and proper responses.

The third strategy is optimization. If a human doesn't know what an item is about, the LLM certainly won't either. So we want titles to always be present and specific. We've seen clients index large PDFs with very specific sections around troubleshooting, for example; but if that section becomes its own item — when the PDF is split into multiple items — we need to know: troubleshooting for what? For which product, exactly? So instead of a title like "Troubleshooting," we'd specify "Mercuser GPS troubleshooting," so the LLM itself can understand what the item is about. Again: if a human doesn't understand what it's about, the LLM won't either.
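Two of these practices translate naturally into indexing-time code: stripping boilerplate so it never reaches the index, and making split-out sections self-describing through their titles. A minimal sketch, assuming HTML sources and using BeautifulSoup; in a platform like Coveo this is typically handled through configurable indexing rules rather than hand-written code like this.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Boilerplate that should never reach the index: it isn't "about" the article
# and can steer an LLM toward unrelated topics mid-answer.
BOILERPLATE_SELECTORS = ["nav", "header", "footer", "aside", ".promo", ".cookie-banner"]

def extract_indexable_text(html: str) -> str:
    """Keep the substantive article content; drop navigation, promos, etc."""
    soup = BeautifulSoup(html, "html.parser")
    for selector in BOILERPLATE_SELECTORS:
        for node in soup.select(selector):
            node.decompose()  # remove the element and all its children
    # Prefer the semantic article body when the page provides one.
    main = soup.find("article") or soup.find("main") or soup
    return " ".join(main.get_text(" ", strip=True).split())

def contextualize_title(section_title: str, product: str) -> str:
    """'Troubleshooting' alone tells an LLM nothing; prefix the product so a
    split-out PDF section stays self-describing, e.g. 'Mercuser GPS troubleshooting'."""
    if product.lower() in section_title.lower():
        return section_title
    return f"{product} {section_title}"
```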
We also want to ensure the content is contextualized. Looking at this example from a video provider's website: the content is contextualized to the specific solution or software it applies to. Here we have an example of how to sign in, and it states that this is for Windows and macOS, as opposed to web or Android. The structure is clear and built in a way that shows the step-by-step answer, ensuring that both a human and an LLM can understand what we're talking about and which specific context the solution is tailored for — in this case, Windows and macOS.

Then, categorization. I think this is a well-known issue. Clients often come to us with multiple repositories handled by different teams, and one repository will have a slightly different name for a product than another: marketing calls the product one thing, R&D or product management calls it another, and support calls it a third. This needs to be normalized. The key is to be able to search within a specific category — why search across all products when we can narrow the retrieval the LLM will use to a specific category and increase accuracy? So normalizing key fields across documents is very important. We help our clients do that through what we call extensions, which normalize, for example, product names across multiple repositories, so that when documents land in our index there's one single name for that product across all the indexed sources — which of course makes retrieval easier for an LLM.

We also want to group and link content based on meaningful relationships. Think of knowledge articles attached to other pieces of knowledge, or cases with attachments or conversations attached to them. The links between those things let us search through content along with the related content attached to it. Another example is commerce — not so much a knowledge example, but if you're searching for a shoe, you need to know which variants and colors of that shoe are available; they're all separate items with separate inventory levels. The same goes for knowledge management: links between items help retrieve and understand an item within its context in the larger index, across all the content available to the LLM. Doing this properly enhances retrieval consistency and accuracy, again helping the LLM generate more accurate answers for your users.

And finally, the last element: relevance. Not all searchable content should be used for answers. I like to use an example from our own experience: we have forums, a community where people can post questions and we answer them directly. This is very useful for direct Q&A with our clients. But those forum discussions can become outdated, and we sometimes see questions that are a little confused or that misinterpret some of the documentation.
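The categorization point — one canonical product name across every repository — and the relevance point that follows — not everything searchable should feed the LLM — can both be handled by a small indexing-time hook, in the spirit of the extensions Mathieu mentions. The alias table, field names, and product names below are hypothetical, for illustration only.

```python
# Canonical product names: marketing, R&D and support may each use a
# different label for the same product in their own repositories.
PRODUCT_ALIASES = {                     # hypothetical names, for illustration
    "acme search": "Acme Search",
    "acmesearch cloud": "Acme Search",
    "a-search": "Acme Search",
}

# Sources trusted for answer generation vs. sources that stay search-only.
ANSWER_ELIGIBLE_SOURCES = {"documentation", "knowledge-base"}

def normalize_document(doc: dict) -> dict:
    """Indexing-time hook: normalize key fields and tag LLM eligibility."""
    raw = doc.get("product", "").strip().lower()
    doc["product"] = PRODUCT_ALIASES.get(raw, doc.get("product"))
    # Forum threads stay searchable for humans, but are excluded from the
    # content an LLM may ground answers in (see the relevance discussion).
    doc["llm_eligible"] = doc.get("source") in ANSWER_ELIGIBLE_SOURCES
    return doc

doc = normalize_document({"product": "A-Search", "source": "community-forum"})
print(doc["product"], doc["llm_eligible"])  # -> Acme Search False
```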
Forum content like that is great — it's information we want available in search — but we don't necessarily want it used by an LLM. As a user browsing community forums, if I'm looking at a very old answer, I can tell it's old and may be outdated; it might still give me some information. An LLM can't make that distinction, and may treat those community questions, or an outdated answer on the community, as accurate. So what is searchable should not necessarily be made available for an LLM to use — there can be a distinction between the two. The idea is that even useful content can be inadequate for answer generation: useful for search, for humans, but not something we want to hand to an LLM. We only want to give the LLM what is 100% accurate and up to date. Think of your sources of truth — your documentation, articles you know have been properly vetted and are current. That's what we want the LLM to answer from. We may still keep other content available in search for humans, but not for LLMs. That improves the model's efficiency and output quality, because only the most trusted content is used by the LLM to answer questions.

So those were the five key strategies we've seen our clients use very successfully. But beyond those strategies, we also need to continually improve the content. We need to understand what LLMs have been using to answer, and identify gaps — because it's hard to know all those gaps in advance, and even documentation can become outdated. David Atallah will now share some of the best practices and strategies we've seen our clients use, and that we're helping them use, to continually improve their content. I'll pass it over to you, David.

Yeah, thank you, Mathieu. Interestingly, as Mathieu was mentioning, all of this applies to our generative AI solutions and helps get the right content in front of generative AI. Now, we've been focusing on content and relevance for years, but there are some important pieces to contextualize around what Coveo is doing with generative AI as well. In late 2023, and through our investments in 2024, we focused on what we call Coveo Relevance Generative Answering, which is our generative AI solution. That's all well and great, but when we released it, we realized that generative AI was this massive black box that no one really knew what to do with. Yes, content is great; yes, I'm getting answers — but how is this answer being generated? Why is it being generated? And what can I do to control this thing? So throughout 2024 we've been looking at exactly that, and this is where I'm pleased to announce that we've been working on what we call our Knowledge Hub.
The Knowledge Hub is a platform built around GenAI and for GenAI. It focuses on transparency — making sure you have the right content for the right users and understanding how answers are being generated — and on increasing your control and autonomy. You'll be able to assess answers and the content quality behind them, see the sources of the content and how answers are generated, and validate the output as well. All of that is extremely important. The way this translates for the content itself is that we're building a platform that gives you a 360-degree feedback loop with the LLM. Not just "let the LLM do its thing and hope it does well" — Coveo will give you the tools and an understanding of how answers are being generated.

We do that through multiple approaches. The first is quantitative evaluation. The key metric in the market today for understanding generative AI is answer rate — making sure the model answers. That's great, but how can I gauge those answers without going live in front of all my users? Within the Knowledge Hub, you'll be able to run that quantitative evaluation and quickly assess how the solution is performing. Now, is a high answer rate the most important thing? Maybe. But answer quality is extremely important too, because if the answer is bad, getting an answer at all isn't much help. So quality evaluations — understanding whether the answer provided is actually helpful — matter just as much. Being able to assess that, whether through manual human input, which is still required today, or through automation with other LLMs helping to judge, is an area of investment that will help you understand whether the underlying content is good or bad.

Sometimes you'll get an answer that's really not good, and you think: this LLM is really bad and I don't know what to do with it. Well, we've also provided the tools to break open that black box, that Pandora's box, of generative AI: we show you, word for word, chunk for chunk, how the answer was generated. We take the documents, the sources, and show them to you: here are the documents that were used, and here are the parts of those documents — the passages, or chunks, as we call them — that were used to generate the answer. That's part of our approach to being completely transparent and ethical about the results. And the key part is that you can also control the results. Imagine the answer is still bad: you've troubleshot it, you've seen how it was generated, but you're still not satisfied. We give you the ability to quickly take control, block those bad answers, and act immediately to save the end-user experience.
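As a back-of-the-envelope illustration of the quantitative evaluation David describes, answer rate and answer quality can be computed over a batch of test questions roughly like this. The data shapes are made up for the sketch; they are not the Knowledge Hub's actual reporting model.

```python
from dataclasses import dataclass

@dataclass
class EvalRecord:
    question: str
    answered: bool         # did the model produce an answer at all?
    quality: float | None  # 0..1 score from a human reviewer or judge LLM; None if unrated

def answer_rate(batch: list[EvalRecord]) -> float:
    """Share of questions that received any answer."""
    return sum(r.answered for r in batch) / len(batch)

def answer_quality(batch: list[EvalRecord], threshold: float = 0.7) -> float:
    """Share of *answered* questions judged good enough to ship."""
    rated = [r for r in batch if r.answered and r.quality is not None]
    if not rated:
        return 0.0
    return sum(r.quality >= threshold for r in rated) / len(rated)

batch = [
    EvalRecord("How do I reset my password?", True, 0.9),
    EvalRecord("What is the refund policy?", True, 0.4),  # answered, but poorly
    EvalRecord("Obscure edge case?", False, None),        # no answer beats a bad answer
]
print(f"answer rate: {answer_rate(batch):.0%}, quality: {answer_quality(batch):.0%}")
# -> answer rate: 67%, quality: 50%
```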
Blocking a bad answer buys you time as a knowledge manager to take that information and go look at the content. Because, as Patricia said earlier: garbage in, garbage out. If the content is bad, the LLM has no way around it and no way to solve the issue. By applying the strategies Mathieu just shared, you start with a solid answer rate; with the quantitative evaluations you get a good read on quality; with troubleshooting you can say, "this result is bad, let me take action and block it." And, importantly, at the end you can report on performance and usage. That's the last piece of investment we're making — and still making this year: the ability to better understand performance, quality, and activity behind the generative solution Coveo offers.

With those deep insights, simplified management, and a one-stop shop for everything generative AI, we want to make this a 360-degree loop — a process that has to repeat, because today human input is still required to ensure the right content is being served and the LLM is drawing from the right areas. As long as that's required — and it's the market standard — we're making the loop as efficient and as easy as possible. The Knowledge Hub will be the space for that, in 2025 and beyond, and it's aiming to be available to all Coveo users soon. We're hoping it will set the standard and provide the industry benchmark for users around the world. And from there, I think Patricia wants to share the outcome of having no content strategy — and what that means. Patricia, over to you.

Thank you so much, David. And just going back to your slide — this is fantastic. It's so important to have a human in the loop, and giving users and customers that agency and control over their generated outputs has never been more important. It's exciting to see those metrics, too. Great work, David.

So, what are the outcomes of having no content strategy? As we're learning with GenAI, less really is more. If you want a higher impact on your generated outputs, you have to approach content with intentionality and thoughtfulness. The outcome of having no content strategy is making the list of the biggest AI failures of 2025 — so don't let that be you if you can help it. Really do be careful when you're approaching any sort of generated output, any sort of answering solution. And what are the outcomes of having a great content strategy? Well, they might look like those of our customers who have incorporated generative answering into their own self-service solutions and seen some really great results over the past year, in 2024.
Companies like Xero, F5, Forcepoint, and SAP Concur were among the earlier adopters of our generative AI solution, and they gave us a lot of input and feedback on how we were implementing it. Mathieu, do you have any comments here, since you've been deeply involved in some of these implementations? Any thoughts or insights you'd like to share?

Well, I think you said it great. Ultimately, as good as our retrieval might be for those clients, what really makes the difference — what can have an exponential impact on the value they get from our solution — is the content they're using, at the end of the day. We facilitate the ingestion of that content, the accurate retrieval of that content, and the management of that content with the solutions David showed. And it's by bringing all these pieces together into a generative answering solution — an end-to-end solution for a client — that we see significant outcomes in self-service as well as case deflection, which in turn yields significant ROI, as we can see with the SAP Concur numbers here. It's the result of all of these things together; it's not any single piece on its own.

Exactly. And I was going to say — again, you've been deeply involved in these implementations — a lot of these results are only going to grow with time, which is really exciting; these are early results. So, a lot of exciting things ahead. And like we said, when it comes to GenAI success, it really does all start with content. Content is king here. If you're ever worried about AI coming to take your job, know that knowledge managers have never been more important, and knowledge discovery has never been a more important area to invest in. All sorts of innovative technologies can come to your company and do all sorts of really cool things, but if you don't have a strong content practice, an infrastructure in place, or even tagging systems, a lot can be lost along the way. Those novelties will be exciting for a little while, but they won't be a great long-term solution — not a foundation for success like we're seeing with the wonderful customers who partner with us.

The next step here is relevance, and I'm really excited to share a few resources you can browse after this webinar. The first is a content guide on data-cleaning best practices for enterprise AI success — a really granular document about the different ways you can tag documents and metadata to scope your content, so your generative AI excels and produces answers that are not only accurate but reliable, so your employees or customers feel comfortable coming back to your sites to self-serve. It's a highly technical document, so I think you'll really enjoy it. We also have an on-demand webinar called "The Best Retrieval Method for Your RAG and LLM Applications."
If you're curious about Coveo's offering, that's a great deep dive into how we do RAG, and what we like to call relevance-augmented retrieval — another component among the many layers of the Coveo offering that powers unified hybrid search, relevant experiences, and artificial intelligence across every point of experience. A huge thanks to Mathieu and to David for joining me today, and of course to KM World and Marydee. And I think now it's time for our Q&A — and we have questions.

Indeed we do. So let me start here, with someone who wants to know: how do we handle outdated or duplicate documents in our knowledge base so they don't disrupt the accuracy of GenAI responses? Do you want to take that one, David, or should I give it a try?

Yeah, you can start; that's fine.

Well, some of the tooling David showed typically helps identify content that might be inaccurate. Our tooling is also built so we can identify duplicates when we index content. And we've seen knowledge management practices among the clients we work with that use our search and retrieval at content-creation time. So, essentially, it revolves around three aspects. One: identifying content that already exists as you're about to create new content, to avoid creating duplicates. Two: when we index content, we're able to identify duplicates and merge them, ensuring we don't have the same document twice if it's exactly identical. And three: if there's a new version of a piece of content and the first version is outdated, our tooling, as David presented, can identify content that might be inaccurate or older — content that leads to answers that aren't up to date — and give that feedback and those insights to knowledge managers so it can be updated. Those three aspects are how we typically help with that.

That's really nice. Can I bother you a bit more? I have a second part to that question, now that I'm thinking about it: how does the platform handle two documents with the same name? Say it's a digital workplace, and you have your healthcare policies for the United States and for Europe.

That's a great question. There are two aspects to that — well, let me make my answer three. First, it comes down to content hygiene. We offer tooling, when you're indexing content, to change how your content is named in our index to make it more accurate, and to include the proper metadata, so we can identify whether it's the US version or the European version of the policy. With that in your index, we can then improve retrieval in two more ways. One is filtering the content. The other is understanding the context of the user: if we know the user is from the US, for example, our retrieval engine can automatically take that into account to make retrieval more accurate.
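A rough illustration of those two mechanisms — an automatic filter derived from user context, plus an explicit filter the user can set — assuming a normalized region field was attached at indexing time. The field and function names are illustrative, not the platform's actual API.

```python
from dataclasses import dataclass

@dataclass
class PolicyDoc:
    title: str
    region: str  # normalized at indexing time, e.g. "US" / "EU"
    body: str

def retrieve(docs: list[PolicyDoc], query: str,
             user_region: str | None = None,
             region_filter: str | None = None) -> list[PolicyDoc]:
    """Two complementary mechanisms from the discussion:
    - region_filter: an explicit, user-chosen facet (overrides context)
    - user_region:   implicit context applied when no explicit filter is set
    """
    region = region_filter or user_region
    hits = [d for d in docs if query.lower() in d.body.lower()]
    return [d for d in hits if d.region == region] if region else hits

docs = [
    PolicyDoc("Healthcare policy (US)", "US", "holiday and healthcare rules..."),
    PolicyDoc("Healthcare policy (EU)", "EU", "holiday and healthcare rules..."),
]
# A US user gets the US doc by default, but can explicitly ask for the EU version:
print([d.title for d in retrieve(docs, "holiday", user_region="US")])
print([d.title for d in retrieve(docs, "holiday", user_region="US", region_filter="EU")])
```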
We're also working on ways for the LLM itself to be instructed with that context, to provide even more accurate answers — so if I'm in the US or in Canada and I ask about the holidays, I don't get the list of US holidays; I get the list of Canadian holidays, and I won't get the same result as David, who's in the UK. So content hygiene is the start of it; then we put in place the tooling to enhance that content as it reaches our index, and the tools to make retrieval more accurate, more tailored, and filtered, so that ultimately you get personalized, accurate answers.

That's awesome, thank you. Can I jump in with another follow-on question? What happens if someone in the US is actually interested in the UK holidays — because that person wants to schedule meetings and doesn't want to land on a UK holiday? Is there a way to override the personalization?

That's a great question. As I said, there are two ways: we can automatically take the context and apply it, but there's also manual filtering, which lets a user explicitly filter — "I want to look at the US content alone," or "the European content alone." Which is why the first aspect — proper content hygiene and normalization as content gets indexed, something we really help knowledge managers do — is the foundation. It enables both automated personalization and accurate retrieval for the user, and it provides the metadata for filtering capabilities, should the user decide to look at content from another country for a given reason. Ultimately, it's proper tagging that allows that.

Yeah, I love that. The Coveo platform lets you do both, and it's completely up to you, as a user of the platform, how you want to set it up. That's the power and the beauty of it — the flexibility it offers to decide exactly how it's configured. A lot of platforms claim they can do it as efficiently, but I don't think all of them can.

OK, we've got another question here: could the GenAI prompt put the response in the form of an email that could then be copied and pasted to a customer?

I can take this one if you want, Mathieu. As things stand today, the existing prompt within our own Coveo Relevance Generative Answering is pretty standard, I would say, built to work across as many use cases as possible across the Coveo platform. Our Relevance Generative Answering model runs across all the touchpoints Coveo offers — self-service, the case form, agent experiences, even some commerce customers. So it's fairly standard and generic at the moment, and our prompt is pretty locked down. However, we do offer what we call the Passage Retrieval API, which lets you take the benefit of our first-stage retrieval — with all the content still within Coveo — and then have your own LLM do whatever you want with it. You could use your own prompt with your own LLM: turn the answer into an email, create an image from your text, whatever it is —
— any funky use case you want, you can do, all of it leveraging the secured, indexed content; everything Mathieu mentioned earlier still applies before anything reaches the LLM. So at the moment it's an intentional decision on Coveo's side to keep our prompt standard and as, quote-unquote, generic as possible. However, with the tools I mentioned earlier, we do offer some control over the results. What we call query pipelines within Coveo let you refine which content is used and how it's used, and the model then takes that and produces a more dedicated response. So today the CRGA model, as we call it, is pretty generic; but you can take the Passage Retrieval API I just mentioned and make it work with any prompt you want. It's completely up to our customers — and that Passage Retrieval API is actually going GA in the next few days. We already have a few customers using it, so it's something we're extremely excited about, and it's had a lot of traction in the market, because that API-first approach — being able to build whatever you want — is extremely exciting for some of our customers.

For sure, and that's such a great answer, David. And I feel like — Mathieu, correct me if I'm wrong — CRGA also serves the agent use case, with things like rich-text formatting and the different ways we can display answers for agents within their generative answering panel. You can also create step-by-step guides for how to solve certain issues, which agents can then just copy and paste afterward, right?

Absolutely. Our prompt is really built for question answering. But because it answers a question in detail, using content from multiple sources, as David elaborated, it's very generic — it applies to a lot of use cases, whether self-service, case resolution on a case form, or case resolution for an agent. And we do offer the ability for an agent to copy our answer and paste it into an email. If you'd like the answer to be written as an email from the start, the Passage Retrieval API is the way to go, as David suggested: you retrieve the same passages we use to generate our answer — ours is a question-answering prompt — but use them with your own prompt and say, "Don't give me the answer; write it directly as an email," so the agent doesn't need to copy the answer into an email but has the full email already drafted.

The caveat, going back to what we said earlier, is human in the loop: the importance of human input and of reviewing those answers is key. One reason we don't offer this out of the box is that we believe, for an agent for example, there's still a need for a human eye on the result before sharing it. Each customer has their own requirements, and we're not here to be the judge of that — so we offer the option, via the Passage Retrieval API. But since we are here to recommend things: we recommend keeping a human in the loop and not blindly sharing a generated response, because you never know.
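To illustrate the pattern being described — retrieve ranked, permission-filtered passages, then apply your own prompt with your own LLM, for example to draft an email an agent reviews before sending — here is a rough sketch. The endpoint path, parameters, and response fields are assumptions for illustration; the actual Passage Retrieval API contract will differ, so check its documentation.

```python
import requests

def fetch_passages(query: str, api_key: str, org_id: str) -> list[str]:
    """Hypothetical call shape for a passage-retrieval endpoint; the real
    API's URL, parameters, and response structure will differ."""
    resp = requests.post(
        f"https://platform.example.com/rest/organizations/{org_id}/passages",  # placeholder URL
        headers={"Authorization": f"Bearer {api_key}"},
        json={"query": query, "maxPassages": 5},
        timeout=10,
    )
    resp.raise_for_status()
    return [p["text"] for p in resp.json()["passages"]]

def your_llm_call(prompt: str) -> str:
    """Stand-in for whatever LLM client you use; plug in your own."""
    raise NotImplementedError("plug in your own LLM client here")

def draft_email(question: str, passages: list[str]) -> str:
    """Feed the retrieved passages to your own LLM with your own prompt."""
    context = "\n\n".join(passages)
    prompt = (
        "Using ONLY the context below, draft a short, polite email answering "
        f"the customer's question.\n\nQuestion: {question}\n\nContext:\n{context}"
    )
    return your_llm_call(prompt)  # an agent reviews before sending: human in the loop
```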
There could always be a glitch. As high as our quality and answer rate can be, there's always a limit. If a bit of the content is a bit off and you forgot about it, all of a sudden it's: oops, I have a disaster on my hands.

Yeah. We know our retrieval is accurate, and the security around it is accurate, but the prompt always leaves a bit of openness to the LLM. We like to say we're almost threatening the LLM in our prompt, with our proprietary approach, to force it to answer accurately — which is why we usually see very few hallucinations in our solution, and why large enterprises like United Airlines, Dell, and Xero are confident in it. If you use your own prompt — well, we have natural-language engineers and PhDs who've done a lot of studies to build our prompt, and they like to say they're threatening the LLM to make it as safe and accurate as possible. With your own prompt there's always a slight risk: you may open the door to the LLM hallucinating a bit, trying too hard to do what you asked, cutting corners, and giving you an answer or an email that isn't 100% accurate. So human in the loop is key. Today, as accurate and powerful as LLMs are, and as accurate as your retrieval might be, there's still some openness in how an LLM interprets the prompt, which can cause slight hallucinations. The human, I think, will be required for quite a bit of time still.

Yeah, and I think you're both touching on the beauty of our platform: we have great out-of-the-box solutions like Relevance Generative Answering, so if you want something quick and easy for answering in your external or internal use cases, you can do that with Coveo very easily. But we also give you the option of the Passage Retrieval API to build whatever experience you want, grounded in our accuracy, relevance, and security, avoiding hallucinations — all of that. I love the whole build-versus-buy aspect: with Coveo you can really build and buy. It's a very versatile platform.

And it'll never tell you to put glue on your pizza.

No, we're not going to do that — that goes back to your first slides. Alright, we do have a question here about security; I think you mentioned it a little earlier. In terms of permissions and user access levels, how can we maintain strict security protocols while still maximizing the breadth of content available to GenAI?

I can try that one. The question sounds tricky, but the solution we have for it is actually quite straightforward. Going back many, many years — I think to the inception of Coveo — Coveo built very strict, very powerful security capabilities into indexing. With some providers out there, when you index content, you have to manually rebuild the security model: you have to somehow attach all the permissions to the content yourself as you bring it into your search tool's index.
Whereas we built connectors and capabilities that automatically mimic and exactly replicate the permission sets and permission rules of the system we're indexing content from. We do so with a process called early binding, which means item-level permissions are attached to an item before it even reaches our index. And once those permissions are attached, if you don't have permission to see an item, it's as if it doesn't exist for you. Only content that exists for you, and can be retrieved for you, can be made available to an LLM to generate your answer. That's how we've tackled it over the years — I hope that answers the question. We've essentially made security the foundation of our platform: all content that goes into the platform is indexed with its security, so there's no way around it, and no way to reach content you're not allowed to see.

And to give a concrete example: if Mathieu and I were on the same page, the same community, the same portal, whatever it might be — Mhmm — and we didn't have the same permissions and asked the same question, we would get different results, because of those permissions. This happens on the spot, on the fly, the moment the query happens; it's not something that needs to be controlled beforehand on every request. Depending on what type of user I am — say I'm an end user, part of the general masses, who shouldn't have access to internal content — then I won't. Whereas if Mathieu is an agent with access to that specific content, the security just applies from the get-go. There is absolutely no risk of something slipping through, because those permissions were set at the start of the setup, and it's something our team and our partners can do easily. It's part of our standards. And to be clear, it's not just for the LLMs — it's also our classic search solution; it's part of the core foundation Coveo is built on.

I was going to say, too: because we have so many external-facing customers, when we design these things we think about what can go wrong. The example we always use is United: when they're using their generative answering solution, you can think of a million questions designed to trick it into answering something strange. Can I smuggle my pet in my suitcase?

Can I put my kid in the overhead compartment?

Yeah, can we put the child in there? We're always thinking of these use cases. We don't want to embarrass people in the public arena, and we want to make sure we're delivering the right answers.

And we don't answer such questions, just to be clear — people have tried and tested our solution on United Airlines to get answers to those. Sometimes, for those questions, the best answer is no answer at all. We've been able to avoid that for our clients.

Excellent. We hear a lot about guardrails, which I think is basically what you're describing here: certain guardrails in place so that ridiculous questions don't get answered. Indeed. So, let's see — what else have we got here?
If we notice the AI generating incorrect or misleading answers, what's the best way to troubleshoot and correct those issues? And boy, have we heard a lot about that one.

Yeah, I can take that one. This is part of the suite of capabilities in the Knowledge Hub I just shared. The idea behind it, again, is that 360-degree loop I mentioned earlier. The reality is that you will probably face an incorrect answer from an LLM at some point; it's almost unavoidable. As Mathieu mentioned, we can make the model as rigid as we want, give it as many rules as we want, make the prompt as aggressive — as threatening, as he described it — as we want; there's still some margin for inaccuracy. So we handle it through that 360-degree loop, from quantitative evaluations all the way to reporting, with troubleshooting and control in between.

If you get a bad answer, the first step is recognizing that it's a bad answer — there's this idea of reporting a bad answer, and of making sure your users are in tune with LLMs. What we realized when our generative answering solution first went live is that people didn't know how to engage with the LLM. They would ask queries that weren't relevant, or that couldn't yield a correct answer, because they were too broad. Say you ask it "Apple." Apple is not a question; it won't really know what to say. But ask it "How does an apple tree grow?" and all of a sudden it has something to work with and can answer. So our first step was making sure users are well informed and know how to prompt and query the LLM.

The second step is going through the motions: I got an answer, the answer is bad, let me assess it. How was this answer generated? Let me use the chunk inspector Coveo offers. Let me understand which documents were used, and which parts of those documents — which chunks — were used to generate the answer. If the document is bad, if the content within it is bad, it's my job as a knowledge manager to act on that and go clean up the content, to make sure what's indexed is the right thing. From that point on, there's a retesting motion: I've changed the content — how does that affect the result? That's efficient, quick, and easy to do within Coveo: you can reindex the content on the spot, or automatic reindexing happens as you add content. The updated content is then taken into consideration, and you retest.

Now, if there's no improvement, it's very possible the LLM itself is getting it wrong — that can happen, and it's part of the investment we're making to stay at the top of the line in what we offer. For example, we're currently migrating a lot of our users to GPT-4o mini, we're adding more quality behind some of the rules we provide, and we're improving the performance behind our generative models.
All of that happens in the background and shouldn't be at the forefront of our users' minds. What they should worry about is making sure the content is right. If the content is right, the probability that the LLM's answer is right is much higher. And that's what we're striving for. I talked earlier about answer rate and answer quality: today we're at roughly a 60 to 80 percent answer rate, with roughly 60 to 70 percent answer quality. And as Mathieu said, better to have no answer than a bad answer. Ideally we'd get to 100 percent and 100 percent — every question answered, and every answer right. But if you ask users, "Would you rather have a 100 percent answer rate with 50 percent answer quality, or a 50 percent answer rate with 100 percent answer quality?", the latter always wins. We'd rather have quality than quantity, because that's what helps in whatever use case the generative solution serves.

So it's all about that 360-degree loop, that human in the loop we mentioned earlier, and making sure it's a repeated action — it's not yet set-and-forget. That said, Coveo is investing in tools to help you automatically understand why an answer is being generated, so less manual input is needed to judge quality. We're also investigating whether we can automatically identify whether an answer is right or wrong. How do we do that? That's trade secrets, but we're working on it. There's a bunch we can do there to get to a point where we're satisfied with the combination: the human in the loop plus all the automation working behind it. And that 360-degree feedback loop, we believe, is and will be the industry standard for the upcoming years; it's not something that will magically go away.

Yeah, David, you're so right. Everybody does want to set it and forget it, but truly, you need to supervise and do regular content audits when you can. I'm thinking of one of our favorite customers — who I won't name — currently working on their generative answering solution. One thing they love to do is have their subject matter experts do qualitative evaluations of their answer batches, and it's been really exciting to hear about. It's just fun to hear about data. So thanks for that, David.

And subject matter experts are required, depending on the complexity of your product and solution. Today, the way Coveo ensures our customers succeed with their first setup of our CRGA is by working closely first with our point of contact at the customer — often a knowledge manager — who then works actively with subject matter experts to make sure the right answers are being generated. Again, this is the human-in-the-loop side: depending on the complexity, you might need a real expert to judge, say, a financial policy that's extremely specific to a specific country. You need that expert to be able to answer that.
Luckily, we're pretty good at those. But again, it's something that will remain this way, at least for the foreseeable future.

Let me give you another question on a somewhat different topic. Which straightforward metrics or KPIs can we track to gauge how well our generative AI is performing in retrieving and generating relevant answers?

I can give that one a try, David, if you'll allow me. Historically, we've focused a lot on traditional search metrics — visit click-through, search-event click-through, click rank, and so on. We're slowly seeing those metrics become less and less relevant, because with a generated answer you don't necessarily need to click; the answer on its own might be enough to reach your outcome. So we're moving to focus much more on specific outcomes: things like self-service success and the number of cases submitted from a portal. We tend to look at it the other way around now: instead of the percentage of users who click, we look at the percentage of users who fail and go open a support ticket or request additional assistance — the case submission rate of a portal. What we've seen, overwhelmingly, is that the number of clicks may decrease somewhat once a generative answering solution is implemented in a self-service or support portal or community; but on the other hand there's a reduction in case submissions and requests for assistance across visits, and that's how we can see the solution is having an impact on the user's actual outcomes. Of course, there are also the qualitative evaluations — people can give a thumbs up or thumbs down and provide feedback — and the answer rate, seeing how often questions get answered. And then we look at the patterns: visits with a generated answer versus visits without. That delta typically shows that visits with a generated answer perform better, with far fewer cases submitted.

Yeah, and what you're describing, Mathieu, touches really well on some of the stats I shared earlier about GenAI letting employees prioritize higher-value work. I'm thinking of customers who use our generative answering solution to help their employees answer customer questions or serve different types of customers: when you remove all the extra basic questions coming through to agents or advisors, it gives them more bandwidth and time to solve the more complex issues and go even further.

Do you think we're getting better at being able to determine relevance? Because search relevance has been an issue forever. Is GenAI helping with that? Are we seeing improvements?

It's demonstrating that some people's relevance, I think, could use improvement. What do you think, Mathieu?

I think, yeah, GenAI is demonstrating the need for better relevance. It's what we're seeing — and, I mean, relevance has been our bread and butter for many years now.
We've been striving hard to solve those relevance issues and make search experiences great for large enterprise clients, and having that background has allowed us to make generative answering relevant as well. But as Patricia said, relevance is as much of a topic now — as important or even more important — with generative answering, to ensure that what the LLM is answering from is genuinely relevant to the question. So it is a very important topic indeed, and we're constantly making improvements; we're constantly improving our LLMs.

Maybe one key aspect we haven't really touched on, related to the question around data: interactions with the answers shown. You can copy an answer, give it a thumbs up or thumbs down, click on a citation. We still see human interactions, and we learn from them, bringing that back into a learning loop to make retrieval accuracy, relevance, and of course the next generated answer even more accurate. So despite sometimes seeing fewer clicks — because answers on their own often give you what you need without clicking — we still see interactions: people go to the citations, copy the answer, give a like, and still sometimes click on regular results. That's feedback and insight we use, along with the outcomes, to constantly improve relevance and make it better for different use cases.

So good old behavioral machine learning is still in there?

Yes — a human in the loop, especially when that human is the person asking the question in the first place. Exactly. We need to learn from what they're doing; their outcomes and their interactions are ultimately what matter. So we continue to use that. That's absolutely right.

Well, that is all the time we have for questions today, and we apologize that we were unable to get to all of them. But as I stated earlier, all questions will be answered via email. I would like to thank our speakers today: David Atallah, product manager, Coveo; Mathieu Lavoie-Sabourin, product manager, Coveo; and Patricia Petit Liang, product marketing manager, Coveo. If you would like to review this event or send it to a colleague, please use the same URL you used for today's event. It will be archived for ninety days, and you will receive an email with the URL to view the webinar once the archive is posted. If you would like a PDF of the deck, go to the handout section once the archive is live. Thank you again for joining us today.

Thank you, Marydee. Thanks, everyone.
Register to watch the video
GenAI Success Begins with Content: 5 Strategies for Accuracy & Precision
Generative AI holds immense promise for knowledge discovery and engagement. Yet, according to BCG, 66% of leaders are dissatisfied with their GenAI progress. The missing link? Properly curated, indexed and refined content.
Join David Atallah and Mathieu Lavoie-Sabourin, Coveo's GenAI experts, and product marketer Patricia Petit Liang for a deep dive into proven strategies to maximize GenAI's potential. Discover how to transform scattered structured and unstructured data into accurate, actionable and reliable answering.
Key Takeaways:
- How to curate and optimize content to enhance GenAI accuracy and relevance
- Indexing and normalization strategies for intelligent search and retrieval
- Key performance metrics to continuously refine GenAI answers and outcomes
Elevate knowledge discovery to boost productivity, accelerate decision-making and deliver measurable ROI.
Sign up now to watch!

David Atallah
Product Manager, Coveo

Mathieu Lavoie-Sabourin
Product Manager, Coveo

Patricia Petit Liang
Product Marketing Manager, Coveo