Hello everyone. My name is Vincent. I am the director of Applied AI Solutions here at Coveo. I'm very happy to host this Masterclass for Knowledge 2026, and I'm here with my good friend Oscar.
Hey, hi Vince. How are you doing?
I'm good, thanks. Very delighted to be here and to show where we are. A bit of background: Oscar and I were there at the inception of Coveo RGA, so we've been at Coveo for a long time. When LLMs came on the market, there was a lot of pressure to get generative experiences out there. Now that's done, and we're excited to showcase the new phase of it today. I think it's officially released now. Can you tell me more about what's going on?
Yeah. We are excited to have a conversational feature in beta, being tested by customers at the moment. We're really happy, and we're getting a lot of good feedback. It fundamentally changes how end users get support and lets them go deeper into their complex self-service experience. We're going to show some of that today.
Yeah, I think that's the expectation nowadays anyway. Everybody is using these experiences, and the word we hear from every single customer is "conversational." People are not just expecting to get an answer; they want to troubleshoot with the interface or with the data. Is that what you see?
Yeah, and that's interesting, because in a self-service and support context, the goal for customers, even before LLMs, has always been to guide users who don't really know what to ask, or who don't know the depth of the documentation and how to move from one document to the other. Now I think we have the technology that can really help with that and take a lot of pressure off end users.
You're right. When you are a search engine provider, you look at the queries that are coming in, and there are new ones every day.
Even if you don't change your content, people don't necessarily know how to interact with it, or don't know the right words. So they try variations of the same thing, and that's where I think the LLM can really help assemble the whole thing.
Yeah.
Let's start by showcasing the UI directly. We are on the Coveo documentation website. This is usually our test bed for production, so we are drinking our own champagne and battle testing this whole thing on our own content first. It's Coveo docs; you can try it out. The first query we'll try is fairly standard in our industry: "I need to index a new metadata from my website." So you're using Coveo to index your website, and you want to add a new property. Let's fire it off. You'll notice that it's not fundamentally different from what we had before, meaning that the first answer shows up in the same kind of widget. How do you get that first answer so fast? It's kind of snappy, you know?
Yes. We know that when you're on a support site, you expect really low latency. You want something quick. So we've optimized that first answer to be fast, as the entry point to the experience. What we've introduced recently is the ability to keep going with follow-ups and get into a more conversational mode, where more reasoning power is needed. The trade-off for users, waiting a little bit to get more brainpower, is acceptable, and we'll show that here.
Yeah, that's clever for sure. The first answer comes back, and it's well formatted again. I have a ton of questions about what you need to do this, but we'll dig in a little later. A lot of good stuff, actually. You can see inline URLs showing up.
That's brand new, I think, from this week. The core of the presentation today is obviously the follow-up part, the conversational aspect. My follow-up is quite simple: I have a new metadata from my website, but it's not in the expected format. It's a JSON, and I'm expecting to get an error. This is where it's cool. You can ask questions in natural language, very complex ones, on multiple topics if you want. In this case, the data isn't clean and I would like to transform it. I know the answer internally: it's an indexing pipeline extension. Does Coveo Conversational know it? What's happening at this point?
Well, we have the first answer. We also have some document passages from your first answer, and we keep those in memory. Then we perform other searches: we look at your new query in light of what we already know, and we let the LLM, and that's the agentic part, use the search tools as it wants. There are some constraints, some guardrails obviously, so it doesn't go in endless circles. But we let it decide how many searches to run and how to reformulate them, so that it really benefits from the search tools and all their capabilities. It can go deeper. It can go multiple turns, which also brings more accuracy and more resolution steps for your problem. Let's see what it did here.
It got it right, actually. If it's a JSON and you want it to be an array, a join operation won't work, so what you need to do is create an indexing pipeline extension. Good job, robot. It even gives you a template in Python that you can port directly to your cloud organization, and some steps to deploy it. I think it's great. RGA was well known for being extremely precise and rigorous. That was a requirement for many of our customers that are under regulations or applicable laws.
Are we still as strict on guardrails here, or do we let it run a little looser?
We took the good from CRGA and made it better, really. The process we had to evaluate the context and make sure it's grounded in the documentation is still our answering tool here. So it has a search tool and it has an answering tool.
Okay.
So that groundedness remains, while we've allowed reasoning to happen over it and over the search tool. We don't want to get rid of this, because it was really appreciated by our customers, and we kept it on track. So now this CRGA element is the answering machine for the LLM agent that controls it and the search tools.
Okay, very interesting. At this point, I'll try something else: "In French, please." Since we're asking different questions here, some of them are retrieval based. Like you said, you can do multi-turn: if you ask three different questions, it's going to transform them into three different queries. But you see here how quickly it understood that it didn't need to go to the index again. It was just, play with the ingredients you already have.
Correct.
So it answered straight away.
Yeah, that's answering from memory, basically. We have our secret sauce for how we handle memory. It's still not a solved problem, where you could just grab an open source library and expect it to work while minimizing your token input and all those things. So we've got our secret sauce: we know when not to search and when to answer from memory. And it was a lot faster, just because it didn't have to do a lot of reasoning or a lot of searching.
The round trip. I find it interesting, and also the fact that it's based on your data and you don't need to provide a huge context for it to perform well.
In an era where cost is an uncertainty across this agentic world, I think this is a very good sign of control, of making sure that you're not creating something you won't be able to afford afterward. My next question is pretty simple. There are a lot of good things here, but my understanding is that the answers are only as good as the content you provide. Right?
Yeah. It still needs the raw content. We are bounding it to your knowledge, your content, and you define it. We'll talk later about how you really scope this, but it needs raw ingredients to create the recipe.
Yeah. And I think at this point you're all comfortable with generative experiences. The wow effect nowadays is not necessarily the UI. I think the wow effect is how you make this a reality within your environment, so this is what we're going to look at right now. In terms of content, there are a lot of providers out there promising generative experiences. However, you're usually locked into a specific universe, whether it's SAP, Salesforce, or Microsoft. You're going to be able to do things within that realm but not necessarily outside it. Here we have the Administration Console. You must be familiar with it. You'll see that for the experience we had on the documentation website, we're using many different content sources. What do we see here?
That's Confluence. Yeah, there's a sitemap. There are REST APIs. You can see the number of documents extracted from each one of them. Confluence is a big one, obviously. S3 buckets from the tech docs, YouTube, Sitecore.
So this implementation is a good example of a mixed bag of content, which is the case for most organizations. You take the best system for each job, but at some point you realize you have data scattered all over the place. Do we have tools to help manage this data?
Yeah.
Because when you bring in the content, it might not be in the right format, it might not be clean, and you might want to massage it a little as it comes in, so it helps the search engine perform better and understand your content better. If you go to the Extensions section, we can see how we allow our implementers to add extensions that transform the data and the metadata as it comes in. You could do it pre-indexing, or you could do it after it's indexed. So there's a lot of opportunity to improve the quality of your content, and also to do a bit of standardization. We see a lot of customers standardizing fields and taxonomy right when the content comes into Coveo, and that's a really good first step from a data health and data management perspective.
I liked it before, and I like it even more nowadays, because if you open one of them you'll realize that it's a Python script that intercepts a document, works on it, and throws it back into the indexing pipeline. That's how it works. You have many different options here and there, but it is code, and if you know me, I love code if I don't have to write it. With what we showed earlier, you can see that this was exactly what I was looking for. I want the conversational tool to help me generate that configuration, and you can see that it's definitely based on internal examples, because of the way it's annotated and the way it knows all the different properties. That's exactly what I needed. You take this, paste it here, and voilà, you can start getting better content for your agent just by using our agent. It's going to be meta today. So if we continue, what about the content? I know that most of the connectors we use pull in HTML files, whether it's a Confluence page or a website. Do we support other types of content?
Yeah.
We'll take anything that has text in it, basically. Whether it comes from a PDF, SharePoint, you name it, we'll be able to ingest it and make it available for search and for LLMs down the road. The most common are HTML documents. But with PDFs, when you get to real use cases, a lot of people have manuals and long policies, HR policies, all kinds of policies, which can be nasty PDFs with tables and nested tables. That's when reality comes in. And yeah, we optimize for that too.
And when you say optimize, how far can we stretch that thing? What's the limit?
Well, in terms of limits, you can bring up to 15 million documents into your agentic experiences, so that's quite high. We have only a very few large customers hitting those limits right now. The other aspect is the quality of it. A lot of customers ask us, hey, what are the best practices? How do I write the content, etcetera? With AI, it's becoming easier and easier. Yet we still get HTML files, we still get PDFs, and that's not the preferred diet for an LLM. So what we've done recently is automatically convert the content that comes into Coveo into Markdown. Here we see the three data streams that are produced through the ingestion process. The first one converts any document, even a PDF, into text so that it's searchable, and that has been in Coveo for a long time. That's just for search, so we extract the keywords. You'll see here that you have chunks of keywords, but there are also tags. Some metadata fields, for instance, you can just dump in there to increase the visibility of the whole document to the LLM. The second one, in the middle, is the HTML, probably the original form of the document. We use it as a preview, for the admin to navigate, etcetera.
And the third one, which is now the preferred format for LLMs and the one all our generative models tap into, is Markdown. We automatically transform those documents, and the great thing is that we keep the structure. We keep the tables. We keep the headings. We keep the links. That's how you saw those inline links, which were not hallucinated. It's not the LLM deciding, oh, what's the destination for that link? That was a challenge initially, when we started to do it. If you just ask for generative links, models will just make up links from pretraining. So this is, again, a way to ease our customers' path to AI adoption, so they don't have to do it themselves. We'll see a little later how that can improve the quality of the answers, together with other elements we have in the pipeline.
Yeah, I'm quite excited for that part. If we go back to the Content Browser just for fun, we'll take a sample so you can understand what we mean by these concepts. If you open one of these documents, this one is a search page, a documentation page from our website. You see the classic fields that we're indexing, but here you also have what we call a Quick view, and this Quick view is the HTML representation of what we indexed. This one is pretty clean. You see that we have only the relevant part; we have curated this. It is a big page, however, so there's probably chunking happening behind the scenes. But that's really the gist of it. If you take another page, for instance, let's go and check the blog. I found something very interesting: offline purchase. This one was released this week. If you open this one, you'll notice a few things that are, I'd say, irritating for me. What do you notice that could be better here?
Yeah. Well, we've got some noise. We don't have just the content.
We have a lot of things that don't help search, like knowing that there's a CTA or a banner. It's not relevant for your users, and most likely it doesn't carry any real information. It's a promotion for our current session.
Yeah. Maybe I complain too much, but I agree this is noise, and behind the scenes, these elements will end up in the chunks that are sent to the LLM and converted into tokens. When we start adding semantic embeddings for semantic search, and you start chunking and creating generative experiences, you try to be as lean as possible. That's how you get fast. That's how you get good. These things scale: at some point you have a million documents, and a million times this header starts to add weight. I think that's it for that part. Now I want to understand the impact of it a little better. If you start converting to Markdown, do you see an increase in relevance?
Yes. The few steps we walked you through are really the basics of bringing in the data, organizing it, and cleaning it up. We do all that, and we go the extra mile by making it available in Markdown. Now, that's not enough. We still need to retrieve those chunks, etcetera. It definitely helps to have the right structure, no noise, and the right Markdown for the LLM. But we still need to slice and dice those documents, because we just saw a really long document. If you search for data and send that document to an LLM, and maybe you have five documents that are all that long, or maybe there's a PDF that's a hundred pages, you're going to blow up your context window.
So the reason people are going to passage retrieval and semantic search was also to get the meaningful information without all the context of a single document. That's what we do through our Passage Retrieval API, and it does a couple of things underneath. It doesn't just retrieve the passages. It also slices those passages based on the structure of the Markdown. So if we've got two sections here, it will probably slice where the sections are, to some extent. It intelligently chunks along the structure and the semantics of the document. We try our best to produce good passages, good chunks, for retrieval, so they are meaningful and really bring information back to the LLM. The other thing is that as we retrieve those chunks, there's an algorithm that goes through them and looks at how they stack together. Let's say that in the first position we retrieve a passage, and the second one is the chunk right underneath it: we'll merge them into one. We do that auto-merging because they're relevant together, and we don't want to split them up. So there's a lot of back-end processing done to make the most out of the chunks. They're really important material for us, and we're trying to make the most of them.
I was not aware of that. So if I ask a question that only refers to Trip Planner, I'm going to get this chunk, but if my question refers to both of them, you realize they are in the same document and you merge them.
Yeah, we merge them.
That's pretty advanced stuff. Let's dig into these chunks. I like the RAG and chunking approach because even if the context windows of some models are now in the millions of tokens, and you could pass the entire Bible in there, it's going to confuse the model.
I think it's going to slow things down, but you can also say goodbye to your wallet. At that point, it's just going to cost so much. So retrieval is still very relevant in terms of speed, but also in terms of cost effectiveness. If we go back to the console, I think the best model to showcase this is Passage Retrieval. If we open it up, you'll find the overview. What I love about this overview is that you can see stats here. This is one of the places where we can really understand how many chunks you have and how many items have chunks. This is a very healthy score at this point. I almost wonder, out of curiosity, who the detractor in the index is. We managed to build 64,000 chunks, and you can see the proportions. One thing I find really interesting when I'm studying a data set is the average. You can see that those are substantial articles: three chunks on average. How big is a chunk on average? Do we know?
Well, because we now chunk based on the structure of the document, we have a maximum, but we stay within about 300 to 400 tokens. That's roughly the average, but we have an upper bound and a lower bound. And a token is, well, it's a word, roughly. You can convert that to probably 200, 250 characters.
Yep. You can see that the top we got was a thousand, but the average number of words per chunk is 138. So we're pretty much spot on. You're talking about this new chunking strategy that is intelligent and understands the structure, which we can see here. What did we have before? What's the classical way of doing it?
Yeah.
The more naive approach, which is easier to build, is to slice by the same length, the same number of characters for every chunk, irrespective of what the document looks like. It's like slicing your pizza into eight equal slices, versus going after the meat and potatoes and making sure everybody gets the right pieces.
Let me make a note not to eat pizza with you. But I understand the concept. I think initially the way we managed to get good relevance was by having kind of a
An overlap.
An overlap, yeah. A little bit of an overlap.
But that method of better slicing and merging is far more accurate and gives better results.
Of course. It's very good. And are these features available across the whole stack? Meaning, if I use RGA versus conversational versus the PR API, have we converted all of these to the same technology base?
We still have features at different maturity levels; they're not all on the same maturity scale. The most advanced features, Passage Retrieval and our search agent, yes, they leverage this. CRGA, which was built before, is still on the standard chunking.
Okay.
And that's why, in the experience you showed, which is a search agent, it uses the latest and greatest, and we see better results: not only a higher answer rate but also more accurate results. Part of it is that the data is cleaner and the chunking is better, but the retrieval has also been improved.
Yeah. Maybe with merging.
Well, I was going to go with the rank fusion.
Oh, yeah. Okay. So that's the next step. We've talked about the indexing part of it. We've talked about the UI and what you can do in terms of front end. If you want to try it, please go ahead, be my guest. But now I want to talk about control, because that's the main thing.
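To make the chunking comparison in this exchange concrete, here is a minimal Python sketch of the naive equal-size slicing with overlap, a heading-based split of Markdown, and the merging of neighboring retrieved chunks. The function names, the character-based sizes, and the choice to split only at headings are assumptions made for illustration; the speakers describe Coveo's approach only in general terms, so this should not be read as its implementation.

```python
import re


def fixed_size_chunks(text, size=200, overlap=40):
    """Naive approach: equal-width character windows that overlap slightly."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    stop = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, stop, step) if text[i:i + size]]


def heading_chunks(markdown):
    """Structure-aware approach (assumed rule): cut at every Markdown heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def merge_adjacent(hits):
    """Merge retrieved chunks that are neighbors in the same source document.

    hits: list of (chunk_index, text) pairs. Results come back in document
    order, so ranking order is not preserved in this simplified version.
    """
    merged = []
    for idx, text in sorted(hits):
        if merged and idx == merged[-1][0][-1] + 1:
            indices, joined = merged[-1]
            merged[-1] = (indices + [idx], joined + "\n" + text)
        else:
            merged.append(([idx], text))
    return merged


if __name__ == "__main__":
    doc = "# Trip Planner\nPlan trips.\n# Billing\nPay here.\n# Support\nCall us."
    sections = heading_chunks(doc)
    # Chunks 0 and 1 are neighbors, so they come back as one passage.
    print(merge_adjacent([(1, sections[1]), (0, sections[0])]))
```

The fixed-size function ignores where sections begin and end, which is why it needs the overlap; the heading-based split keeps each section whole, and merging only applies when two retrieved chunks sit next to each other.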
Building a prototype these days couldn't be easier. You just use Claude Code and you build an awesome prototype. But the devil is in the details, and we see it all the time here. You start building a cool prototype for a customer, and then come the requirements. Like, oh, but for this specific query I don't want this document. Or, we are under regulation, and we need to respond in this specific fashion for this one. Behind the scenes, this sits on top of the Coveo retrieval system, Coveo search, so you have all the control you had in traditional search, right? The same scoring system, the same rules? Tell me more about it.
Yeah. The important part is that we want you to keep your business rules and your business context, which you are already able to bring into the query pipeline. So a lot of those features tap into the rules and filtering that you already have in place. Take the thesaurus, for instance. You don't want to put a thesaurus into a pipeline and then have to retype all your thesaurus entries into a specific agentic model. It's already in Coveo. So we made those connections with the existing infrastructure and search pipeline so that it takes advantage of that intelligence. And you can decide to turn it on or off, depending on the quality of your rules, etcetera.
There's a lot to unpack here, so I want to get very specific. I've got good examples here. It looks like Scott went into a frenzy on the thesaurus a while back, and that's fine. You see here IE for Internet Explorer. That's common. Or JSUI for JavaScript Search Interface. We're guilty of that one. But we're using standard models that learned the universe by parsing the web, and they're not aware of the intricacies of Coveo or of your business. Basically, if you're in the medical world or in manufacturing, you have acronyms.
A quick shout-out to my friends in government, for instance, where everything is acronym based. Where do you put that in the system? Do you build a custom model that's going to cost you an arm and a leg to train? Our recommendation is to use this kind of approach here. But how do you transfer this search part to the chunks and the generative part at the end?
Yeah. What you've got to understand is that the chunking powers the semantic search, which is different from the lexical, more classic search. Semantic search is not really good with all those acronyms. First, the model has not seen them in pretraining. JSUI: it's unlikely to know what that is. And it's not going to be good at recognizing it. We see this especially with error codes and code samples. If you do semantic search alone, you will get a really mixed bag of results. Sometimes you're looking for an error code, and it's an exact match. You're not in the world of fuzziness, of "it might be." Error code 22 is what it is, not what it might be. So what we've done is combine both worlds. The lexical search happens first, and that's where some of those rules get injected, some of the conditions, etcetera. We do a lexical retrieval that's boosted with any behavioral data we might have collected. There is also a semantic boost, and we get a smaller set of documents that are relevant. From there, we do another step, which is a reranking of those documents' passages with our Passage Retrieval API, to say, hey, out of this pool of documents, let's see which passages are the most relevant. And the link we made between the two is to carry the scoring from the first pass into the second pass. So we weight the score of your initial retrieval when we do the semantic search.
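As a rough illustration of carrying the first-pass ranking into the passage reranking, the sketch below blends a rank-based prior with a semantic passage score. The linear blend, the `alpha` parameter, the linear decay of the prior, and the function name are all assumptions for illustration; the speakers do not describe the actual formula.

```python
def rerank_passages(first_pass, passage_scores, alpha=0.5):
    """Blend first-pass document rank with semantic passage similarity.

    first_pass: document ids ordered by first-pass relevance, best first.
    passage_scores: list of (doc_id, passage_text, semantic_score in [0, 1]).
    alpha: hypothetical weight given to the first-pass ranking.
    """
    n = len(first_pass)
    # Rank prior: 1.0 for the top document, decaying linearly to 1/n for the last.
    prior = {doc: (n - rank) / n for rank, doc in enumerate(first_pass)}
    scored = [
        (alpha * prior[doc] + (1 - alpha) * semantic, doc, passage)
        for doc, passage, semantic in passage_scores
        if doc in prior  # passages from documents outside the pool are dropped
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    pool = [f"d{i}" for i in range(1, 101)]  # top 100 from the first pass
    passages = [
        ("d99", "passage from a low-ranked document", 0.80),
        ("d3", "passage from a top-ten document", 0.75),
    ]
    for score, doc, passage in rerank_passages(pool, passages):
        print(doc, passage, score)
```

With this kind of weighting, a passage from a document near the bottom of the first-pass pool needs a much higher semantic score to outrank a passage from a top-ten document.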
So we don't go off track and start recommending an article that was maybe in ninety-ninth position in your top hundred. It will have less weight than one that's in the top ten, for instance. We combine those two techniques in our search pipeline architecture to make sure we benefit from lexical search even when we're trying to find a semantic match, and that has proven positive. The next step is to go even further with our learning-to-rank models and have an even more complex combination of signals. That's the next step for us. I know in commerce there are already some of those models, but we aim to generalize that approach.
That's very cool. At this point, we'll introduce a little of where we're going with these solutions. As you may be aware, and that's the funny part, Laurent Simoneau, our CEO, was always telling people that unified search is the key, but people were only looking at service or commerce. Now I guess he was right, a few years in advance, because people have started to ask, yeah, but I need to search my products and my manuals and my tickets, etcetera. We're starting to see that merge, and we decided as a product organization to lean into it. So let's dig directly into this part. What we have here is an example that merges what we see in commerce with what we see in service. The query is quite simple: one of my palletizing robot axes is drifting and the servo is running hot. If you are in the packaging industry, you're probably screaming, oh my god, it's going to break. I'm not in a good position to appreciate it. Still, you can see that the response we get is extremely interesting. You see the product directly, with a link to the PDP. You can see the title.
You could even see whether it's available; you could run the full commerce stack here. So, whether you are entitled to see it, whether it's available.
And that was probably also pulled from your entitlements. We could even pull this information from another system, say the CRM of record, and say that you own this piece and, based on your query, we found the right purchase you made, the right piece you ordered. I can connect this through agentic and external connections.
And what you see right after is that once the object is identified, we can sub-query the knowledge part of your business and look at the documents that can help you fix this specifically. You see here that it pulled some very complex PDFs and external links: a web page but also two PDFs, one of them a troubleshooting guide. Then it starts merging this with your context and the instructions to really put something together, I think, that will guide you through the next steps you need to take. You said something interesting, because here you stay within the world of your organization: those are my products, those are my PDFs. But you said it could be in another system. How does that work? It's the first time, I think, that I've seen you lean into an agentic position where we're leveraging other things.
Yeah. To solve complex problems, you often need more than just the documentation. For sure, that's what's going to ground you and tell you how to resolve the issue. But there is context that lives outside of Coveo. Some of it can be pulled in with the query at query time. Other things need to be found in another system. So we're experimenting with orchestrating different tools or different APIs to provide more information, more context, and ultimately a better answer.
Yeah.
I've seen in manufacturing some APIs for fitment, for instance, or APIs for VIN search in the automotive industry, where these datasets are just not resolvable or you don't want to search in them anyway.
Other use cases are CRMs and user data, to get entitlements rather than just the session and what's logged in. You want to get deeper into that account, for instance, and understand it.
So that's a good solution. For instance, we often got the past purchases concept in B2B. Although it can be useful in the search interface, it's mostly for these kinds of things. You can index it, but again it's going to cost you. So it's always a matter of using the right tool at the right moment, right? I also appreciate the fact that you let me index whatever content I want, no matter where it is, and also call whatever API I want.
Yeah. We don't want to bring a CRM into Coveo. Coveo is not meant to hold that type of data, but the system is available and we can connect to it.
Cool. I'll quickly show another example. This one is interesting because you're going to see a slightly different approach. It's asking about a joint three calibration error. This is an invitation, so I can recognize him, but that's fine. Here, you can see that we found the product again. And instead of trying to suggest some direct action, it generated what we call a step-by-step repair instruction.
Yeah. You see how the layout has evolved based on the context of the query and the context we have to answer this question. We call this the dynamic layout. The idea is to get a layout that adapts to the intent of the query and the content that's available to answer it, and makes things easier.
Of course, we can read those instructions today on the Coveo docs, as you saw with your answer on the metadata, but it could be even nicer for the user to have that kind of UI that really helps them understand and go step by step, like what you see here. So we really believe that the UI is going to be dynamically powered by the LLM as time and technology continue to evolve, and we see how we can use it to our benefit and to the end users' benefit too. Again, find answers, find products more easily, as they continue in a more conversational way. The cool thing too is that, from this initial answer, you could keep going and maybe it shows you products because it realizes that you need a specific part. Or maybe you need maintenance. You need to schedule an appointment. An appointment with a specialist to come on-site, and you could have a layout for that. Yeah. And that will just evolve as you converse. So we believe the experience is going to have to be dynamic to match customers' expectations and produce the results that the brand or customer expects from their support sites and commerce sites. So it could be a classic search interface that is returned, could be a troubleshooting guide, could be a widget to book an appointment, basically. And if the LLM is unsure, it will default to the classic search. Yep. Whether it's a catalog search or more like blue links with a small answer at the top. So that would still be powered by the LLM. I think the main paradigm shift is to say, hey, your prompt and your agent can be designed to support those types of experiences, to provide not only what's inside the answer but also the full experience. So that's the direction we're going. Okay. I'm very much an on-the-field type of guy. I want it. So right now, how can I get this? Is it available right now? Yeah.
And I see that it's bolted on top of a search interface. What's the backward compatibility? Yeah. So you need to be a Coveo customer. The feature is in beta. A lot of our existing CRGA, or Generative Answering feature, customers want this evolution, and we made it simple for them. So we stay in the search layout. The component is the same component that they already have. They just have to upgrade the version. So no markup change. Just upgrade the version and you're going to get the brand new component that has the follow-ups. You decide whether to turn it on or off. You could still leave it as just a single turn, and you'll simply have to create a model. So in terms of configuration and getting it into a test page, we're talking fifteen minutes to an hour, depending on your level of comfort with Coveo. Where most of the time is going to be spent is really testing and evaluating the answers. As much as we can provide tools, the knowledge of what's a good answer and what's a bad answer lies with the SMEs that you have in your team. By now, most teams have their golden dataset of questions and answers. This goes a little bit beyond that now, because you've got to evaluate follow-ups and see in which direction the conversation can go. But that's the bulk of the testing. We see customers testing it now for two to four weeks. Some will take more time. Some have longer processes that are just structural. But within two to four weeks, you can have a good understanding of how it's performing. And then deployment, again, depends on your rollout and deployment steps. But within four to six weeks, you could really be up and running with this. Technologically speaking, fifteen minutes, but making sure you're comfortable takes a little bit longer. If I don't like something, how can I change it?
We saw in the pipeline that I can play with the retrieval part, boosting documents, boosting specific things. So this can help when you want specific pieces of information in the retrieval and in the agent system. But what happens if you want, for instance, a little more control over the behavior? Yeah, so we provide a vanilla solution, a vanilla agent, but every customer is going to want its own flavor, whether you have simple tone instructions, whether you have specific guardrails, like legal and compliance guardrails you want to add to it, or you want to tell it how to deal with ambiguity in certain business flows or business logic. So that's what we call behaviors, or what the industry also calls policies, where you describe how the agent should behave in terms of, like you said, a condition: if you see something, then here's the obligation you've got to follow. I remember the prompt enhancement in RGA, where you had specific instructions you could throw at the model. Yeah. Is it kind of the same thing? It's the same thing but better, because there it was free text, and we saw how some customers' instructions would conflict with the base prompt. So that wasn't ideal, and it was really hard to measure how effective your instructions were. So we've evolved this into a slightly more structured contract, a behavior contract, where you define the if and the then. And you can define all those behaviors. So it also feels more like you're defining its personality. Yeah. Rather than just giving it a big blurb. It's really easy to switch from one to the other, and the good thing is that the prompt enhancement had a two-thousand-character limit, which, if you were a really heavy user, you could hit. We've bumped the limit with the behaviors so it can have that base. That's very interesting.
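To make the "if and then" structure of a behavior contract concrete, here is a small Python sketch. It is a guess at the general shape only: the `Behavior` class, its field names, the example behaviors, and the rendered text format are all hypothetical and do not reflect Coveo's actual configuration schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Behavior:
    condition: str   # the "if": when this behavior applies
    obligation: str  # the "then": what the agent must do in that case


def render_behaviors(behaviors: List[Behavior]) -> str:
    """Turn structured behaviors into one instruction line each."""
    lines = [f"- IF {b.condition} THEN {b.obligation}" for b in behaviors]
    return "\n".join(lines)


# Made-up examples covering a compliance guardrail and an ambiguity rule.
behaviors = [
    Behavior("the user asks for legal advice",
             "decline and point to the official terms page"),
    Behavior("the question is ambiguous between two products",
             "ask which product the user means"),
]

print(render_behaviors(behaviors))
```

Compared with a single free-text blurb, keeping each behavior as a separate condition and obligation pair makes it easier to review, test, or disable one rule without touching the others.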
What about the second part we discussed, this more layout-based approach? What's the time frame for the whole thing? So we have some customers that are in beta with this, and we don't have any dates confirmed yet, but expect something in the fall to be released as a beta to more customers. So, yeah, we're really excited. We have several teams that are working on it and putting all the building blocks in place. Good job. If you want to know more, don't hesitate to reach out to your CSM, to Oscar directly, or to anybody at Coveo. We'll be more than happy to answer your questions. And that being said, thanks very much for this masterclass. It's been a pleasure. Thank you, Scott.

May 2026

3 Proven Paths to Trusted Conversational Self-Service Experiences

Agentic AI Strategy Masterclasses

June 2026

Overview

From Enterprise Knowledge to Trusted AI Experiences

Vincent Bernard, Director, Applied AI at Coveo, and Oscar Péré, Senior Product Manager, will show you how Coveo customers can turn fragmented enterprise knowledge into trusted conversational self-service experiences without rebuilding their foundational stack or stitching together multiple tools.

See what makes Coveo Search Agent work in real-world environments: unified retrieval across fragmented content, document-level permissions, grounded answers, and support for results, answers, and multi-turn conversations on a single platform.

What you’ll learn

Why trusted retrieval is the foundation of good GenAI
How Search Agent builds on your existing Coveo knowledge foundation
What to connect first across help centers, portals, websites, and chat experiences
How permissions, grounding, and governance protect answer quality
How to move from current-state search to conversational self-service outcomes