Hi, everyone, and welcome to the Partner Power Hour for Coveo's Passage Retrieval API with Amazon Bedrock. My name is Emma; I'm a partner enablement manager at Coveo, and I'm joined by Fred, an instructional designer. As always, here is our standard disclaimer to keep everything correctly framed. And this is our agenda for today: I'll start with a quick overview of Coveo and our connection to AWS, then I'll pass it over to Fred to walk through the Passage Retrieval API, including what it is, how it works with Bedrock, a demo, and a quick look at implementation. We'll wrap up with best practices and Q&A. If you have any questions at any point, please feel free to drop them in the webinar Q&A panel. Let's start with a quick introduction to Coveo and our role with AWS, especially for anybody on the call who may be newer to what we do. Coveo has a rich history, with over fifteen years working with some of the world's largest enterprises across millions of interactions, from websites and service consoles to intranets, agents, commerce, and more. Our partnerships and customers span every major enterprise ecosystem. And across all of these experiences, enterprises face the same core challenge: knowledge is scattered, and users expect immediate, accurate, context-aware answers. Enterprises are scrambling, searching for quick fixes and AI solutions to this challenge. Our mission as a search and generative experience platform is simple: to deliver AI-powered relevance at every point of experience, backed by secure, unified enterprise retrieval, solving that core challenge we know all enterprises face. The truth is, AI can't deliver value without the right knowledge. Gartner now affirms it: AI success starts with a search and retrieval layer. As assistants and agents take action, search isn't a sidecar; it powers the experience. And Coveo wasn't retrofitted for gen AI. It was built for it.
We unify enterprise content, bring together lexical and semantic search, and layer in real-time analytics and continuously learning AI. Our platform is secure, scalable, and composable, powering everything from websites and commerce to service and workplace tools. Coveo brings together unified knowledge, AI-powered search, and real-time insights in a single platform, making it your foundation for building search, gen AI, agentic experiences, and whatever comes next. Now let's dive into where Coveo fits with AWS. Coveo unlocks the full potential of Amazon Bedrock. Our relevance layer, featured in the middle of this diagram, brings together the best of lexical, semantic, behavioral, and intent-aware ranking and optimizes it toward business outcomes. This is also where we identify and return the most relevant, trusted passages for grounding, whether you're using your own LLM or any Bedrock-hosted model. Our Passage Retrieval API is the retrieval backbone for generative apps, supporting AgentCore agents and other agents in addition to Coveo's out-of-the-box generative experiences. And here's what we're seeing across the market: most gen AI projects stall before they even reach production. The issues tend to fall into three buckets. First, retrieval gaps. Models only perform as well as the information they're grounded on; if retrieval is incomplete, outdated, or inconsistent, teams end up with unreliable answers and unacceptable hallucination risks. Second, operational complexity. Companies underestimate the effort required to build, maintain, and monitor a reliable RAG pipeline: connectors, indexing, vectorization, access control, latency layers, quality evaluation. Each piece adds friction, delays, and cost. And third, cost and scalability concerns. Teams prototype something that works in a sandbox but becomes unpredictable or too expensive when deployed at scale. Leadership wants fixed, dependable costs and measurable ROI.
Coveo Passage Retrieval solves these challenges head-on by giving customers a fully managed, enterprise-grade retrieval layer, purpose-built to take gen AI from experimentation to production quickly, safely, and at scale. And the good news is this isn't theoretical. Organizations across industries are already running our gen AI solutions in production, grounding answers with Coveo's retrieval, reducing hallucinations, and seeing real business outcomes like higher self-service deflection, faster agent handling times, better conversion, and reduced time to value. These customers showcase what's possible when retrieval is done right: gen AI that's accurate, traceable, and ready for enterprise scale. Now let's look at one example of our Passage Retrieval API in action with a customer. A major software company implemented Coveo's Passage Retrieval for a generative chatbot for employees. They saw value within two weeks, with a twenty-two percent increase in article and passage retrieval accuracy and a whopping seventy-three percent increase in generated answer accuracy. These numbers are significant, and they highlight why a strong retrieval solution like Coveo's is critical for enterprise success. With that, I'll pass it over to Fred. He can share more with you about the Passage Retrieval API itself and walk you through the demo. Thank you, Emma, for presenting what Passage Retrieval is. On my side, I will move toward a demo. But just before we go through the demo, let me explain why Passage Retrieval is needed in a gen AI application and implementation. Basically, large language models arrived and have had a big impact in the enterprise. The thing is, unfortunately, when they came out, they were not enterprise ready. The first thing to consider is that large language models generate answers based on probability, meaning they're trying to predict what's coming next when you send a query.
The thing is, even though they sound really confident, sometimes they're wrong; they can hallucinate, as we call it. The second thing: even though they have access to humongous amounts of data, they might not know your business. They don't have access to your internal documentation, knowledge pages, or product information unless you explicitly give it to them. And if you do give them that internal information, it could be sensitive, bringing security, compliance, or governance risk. So enterprises needed retrieval-augmented generation (RAG). A RAG system grounds answers in trusted content and content sources, and this is where the Passage Retrieval API comes in. To give a quick picture of what RAG is, and what RAG with Passage Retrieval is: in a RAG system used with an LLM, you have a question from the user, and the system retrieves the most relevant documents possible to be used in a prompt as context. That prompt is sent to the LLM, which then generates the answer. What's really good with Passage Retrieval is that it's part of the Coveo platform, so you have access to many connectors and many sources from which to retrieve the most relevant passages, which are used to build the prompt with its context. And because you have relevant passages inside your prompt, the LLM generates the most relevant answers. This is where, basically, Passage Retrieval improves a RAG system. So, to introduce Coveo Passage Retrieval, what is it exactly? It's the killer API for retrieval-augmented generation. Basically, most retrieval systems send the full document to build the prompt, bringing so much noise to the LLM.
With the Passage Retrieval API, the full document is not sent; only the most relevant passages of a document are sent, and those passages most of the time answer the question asked in the query. To make sure the passages reflect an understanding of the query, and not just a keyword search, the API relies on the Coveo Semantic Encoder, which makes sure the query is understood and that the most relevant passages are retrieved for the prompt. And, like I said, because it's inside the Coveo platform, all enterprise permissions and security rules are enforced automatically, meaning users, or any system, can only retrieve the content they are allowed to see. So the result is that the Passage Retrieval API is a retrieval layer that can be used with any large language model. In today's session we're going to show how it can be used with Amazon Bedrock, but keep in mind it can be used with any LLM. If you already have a retrieval system, you might ask yourself: why would I use the Passage Retrieval API? Basically, since LLMs came into our world, there has been an explosion and quick adoption of LLMs across chatbots, assistants, and content generation. We want to make sure the LLM uses the most accurate content; it's critical that the generated answers we provide to our customers are the most relevant possible. And because gen AI adoption is skyrocketing, we want to make sure the retrieval system is efficient and scalable. To achieve that scalability and efficiency, a retrieval system should be able to query one unified source most of the time, rather than trying to retrieve content from five, ten, fifteen databases.
And finally, we want to make sure that when the LLM uses content from the retrieval system, it uses verified, precise data to minimize the risk of hallucination. Because the Passage Retrieval API is inside the Coveo platform, it answers all of these problems I just mentioned. So where can you use the Passage Retrieval API? It is one advanced retrieval system for all of your gen AI applications. One of the first use cases we had was generating answers on the search page; that was one of our first use cases for Passage Retrieval. But you can also use the Passage Retrieval API for your custom gen AI apps. Think about article generation: you always want to make sure you get the best information to generate an article. As an example, if you have a support case, you want to pull only the best parts of it; but also, any Jira ticket worked through for the support case, and any Coveo documentation used for the support case, should be inside the retrieval system so that the best possible article is generated. And finally, you can use Passage Retrieval for your copilot. Like I said, you want to ground your answers in trusted, relevant, and secure content sources, and your copilot can use Passage Retrieval for that. Just before we jump to the demo, I want to show you a little bit about the workflow of using the Passage Retrieval API. Think about a search page with a query box. A user goes to your search page and makes a query, like "What are the advantages of Generative Answering?" What happens is that you use the Passage Retrieval API: the query is sent to the Coveo platform, and inside the Coveo platform, the Semantic Encoder understands your query.
With the help of the Semantic Encoder, the passage retrieval model retrieves the most relevant passages of documents according to the query. The response is sent through the API to your back end, which handles the prompt: the prompt uses the response from the Passage Retrieval API to build the context and send it to the LLM. Your LLM then generates the answer, grounded in relevant passage content. It's a simple workflow, but it works for most gen AI applications. So before we go forward to the demo, do we have any questions for now? I don't think we have any yet, but I will keep an eye on them. Thank you. So just before we go to the demo, I want to show you how it's been built on the Coveo platform, and also how my search page, my generative app, has been built, so you can understand the workflow I just showed and connect the dots. On the Coveo platform, I indexed three sources: the Coveo documentation, the Coveo public articles and knowledge base, and many items from the catalog of Barca Sports, which is our Coveo commerce demo site. So with three sources, I can show you three lines of business, or three use cases, in the demo. On the other side, Passage Retrieval needs the Coveo passage retrieval model and the Semantic Encoder, which learn from the sources I just mentioned. And finally, to bind everything together and make search possible in the gen AI application, we need a query pipeline. The query pipeline lets the query from the Passage Retrieval API go in and use the Semantic Encoder and the Coveo passage retrieval model. The next thing to show you is my search page. If I show you my search page here, you will see there are many different sections.
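The workflow described above, query in, passages back, prompt built, answer generated, can be sketched in Python. This is a minimal, illustrative sketch only: the endpoint URL and the request/response field names are assumptions to verify against the Coveo Passage Retrieval API reference, and the Bedrock call uses the boto3 Converse API with the Amazon Nova Lite model ID.

```python
import json
import urllib.request

# Illustrative sketch of the retrieve -> prompt -> generate workflow.
# The endpoint path and the payload/response field names below are
# assumptions; check the Coveo Passage Retrieval API reference for the
# exact contract of your organization.

def retrieve_passages(query: str, endpoint: str, api_key: str) -> list[str]:
    """POST the user query to the passage retrieval endpoint and return
    the passage texts (field names are hypothetical)."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [item["text"] for item in body.get("items", [])]

def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the prompt in the retrieved passages, not full documents."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

def generate_answer(prompt: str, model_id: str = "amazon.nova-lite-v1:0") -> str:
    """Send the grounded prompt to Amazon Bedrock via the Converse API
    (requires AWS credentials and the boto3 package)."""
    import boto3

    client = boto3.client("bedrock-runtime")
    resp = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]
```

The key design point is that only small passages, never whole documents, go into the context, which is what keeps the prompt focused and the token cost down.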
I kept the API configuration section visible because the app will be available for you to download and test yourself. I wanted to share what was done as a demo, but I also want you to understand how you can implement Passage Retrieval on your side. So we have many different sections on the demo site. Like I said, there's the API configuration; a drop-down list to select a different use case, or line of business; and two modes I want to show you, to explore what you can do with the Coveo Passage Retrieval API: a search, or query, mode, and a conversational agent mode that uses Passage Retrieval. Then there's a query box section; a generated answer section, which shows the generated answer from Amazon Bedrock; and finally, for troubleshooting, or just to show you what comes back from the Passage Retrieval API, I added a section that shows all the passages returned by the API. And what's needed to connect with Amazon Bedrock? I would say the configuration was quite easy. You need an account, for sure, and you need to create an API key to make it work. After that, it's really easy: you follow the recommendations on how to use the API, and that's how you get the generated answer as well. So let's go forward with the demo. The first thing I'm going to show you is quite an easy use case. For the documentation line of business, think about a user going to your website who wants to know more about a specific piece of documentation. That could just be a simple query: "How to install Coveo for Salesforce?" Let me explain what's going on; it's just like the workflow I explained. The query is sent to the Coveo platform, and the Semantic Encoder understands the query.
With the help of the Semantic Encoder, the passage retrieval model understands the query, gets the most relevant passages, and sends them back to my gen AI app. And with the prompt I tailored for my documentation line of business, it generates the answer with Amazon Bedrock. You can see that what I asked for in my prompt is to lay out steps in a numbered list to show how to do things; you can see in my section that this is the generated answer from Amazon Bedrock. At the end, I also asked it to enumerate which sources were used to generate the answer, and you can see them here. And to show a little more about the passages, you can see below which passages were used by the LLM, Amazon Bedrock, to generate the answer. I don't want to go through all of them, but I just want to show you how it works. So let's go to the knowledge line of business, which includes the Coveo documentation and the public articles from Coveo. My point here is this: when someone goes to a support page with an error message, sometimes just providing the error message is not enough for your search. What's really nice with Passage Retrieval is that you can make a full sentence with the error message to get a generated answer. So let's go forward with an error message; and I'm using it like I'm chatting with it, to be honest, but that's my query and it works: "I am getting this error message in the Sitecore logs, and here's the error message. What can I do to avoid it?" Here, again, the query is sent; the Semantic Encoder understands what the question is, but also understands that there's an error message; and with passage retrieval, it retrieves the most relevant passages.
Here, we can see in the generated answer that it was able to retrieve three possible causes and provide a resolution to the user's query. And again, just quickly, you can see which sources were used and what the relevant passages were to resolve that kind of issue. I can see some questions. Emma, do you want to go forward with the questions? Yeah, I think there's actually two; there's one in the chat, but I'll start with the Q&A. Which permissions does the API key require? That's a good question, actually. I would need to go back to the documentation to make sure I don't give you a bad answer, so exactly which permissions are needed for the API key is something I can confirm and provide offline. What I can say is that when you use a search token and impersonate someone, it takes the permissions from what's indexed. As for an API key used on a public site: if it's a public site and you have secure content, the API key used on the public site will never retrieve the secure content. Okay, perfect. And we've got one more: what's the right way to think about prompt design when using Passage Retrieval? Should we aim for minimal prompts, more structured instructions, something else? Okay. In my demo, to make sure I used Passage Retrieval properly, I followed the best practices of prompt engineering. To make it short: you want to tell your LLM to act as a persona, and you want to provide context so it understands what it has to do and how to do it. And part of that context is everything retrieved from the Passage Retrieval API.
Then you want to set up in the prompt what kind of output format you want. Here you can see, as an example for the knowledge line of business, I specified a specific output format; this is why you see step one, step two in a numbered list, because that's what I asked my LLM to do in my prompt. And at the end, add some guardrails: things not to do. If you follow the best practices of prompt engineering, you're going to get really good generated answers. Perfect, thanks, Fred. I think we can continue. Great. So the last line of business to test with Passage Retrieval is commerce. Just to make you aware: for commerce here, I didn't want to show you something where you do a query and get a result list of products. That's not the purpose of Passage Retrieval or of this demo. It's more about acting like an adviser, a sales adviser, for someone who goes to a commerce site and asks for advice on a specific product or on several products. The way I think about it: you go to a pharmacy, you see someone who works there, and you ask, "Hey, do you have Advil, or do you have Tylenol?" It's pretty much the same way you would ask here for commerce. In my example, I'm just going to do a simple query and ask if they have kayaks. And I get a generated answer telling me, yes, we do have kayaks. Because in my prompt I asked Amazon Bedrock to act as a sales adviser, if we go over the results you'll see it provides suggestive comments about the kayak, trying to sell it to you. That's how it's been tailored. And again, at the end, it can show the sources and the retrieved passages used by the LLM to generate the answer.
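The prompt structure described in this answer, persona, context from the retrieved passages, output format, and guardrails, might be assembled like this. The wording is illustrative, not the exact prompt used in the demo:

```python
def render_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded prompt from the four prompt-engineering parts:
    persona, context, output format, and guardrails (wording illustrative)."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        # 1. Persona: tell the model who it is
        "You are a support assistant for our product documentation.\n\n"
        # 2. Context: the passages returned by the retrieval layer
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        # 3. Output format: ask for numbered steps and cited sources
        "Format the answer as a numbered list of steps, then list the "
        "context item numbers you used.\n"
        # 4. Guardrails: what the model must not do
        "If the context does not contain the answer, say so; do not guess."
    )
```

Note that nothing in the prompt deals with permissions or relevance; that logic stays in the retrieval layer, which is the practice recommended later in the session.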
Now I'm going to go forward with the conversational agent. Again, I want to make sure you're aware this is not production ready; it's more about exploring what's possible with the Passage Retrieval API. Here's how I built my conversational agent: I have my initial question, and then one or two follow-up questions, so that the agent understands the context of the chat and is able to provide follow-up answers to my questions. Like I said, it's not production ready, but it's really good for exploring that possibility with Passage Retrieval. I'm going to start with knowledge and skip documentation. I'll go with a simple query, just to show you that we can do follow-up questions: "Do I need a Sitecore instance to install Coveo for Sitecore?" If there's no hallucination, the answer should be yes. And yes, you need Sitecore to install Coveo for Sitecore; that's a good thing to know. As a follow-up question, I would like to know which Sitecore version I need to install Coveo for Sitecore. Again, the LLM understands the question. And even for a follow-up question, it doesn't only go to the LLM; it also goes back to Passage Retrieval. Here's how it works behind the scenes: I have my initial question and the first answer, and then I have a follow-up question. In this conversational agent, I'm asking the LLM to rewrite the query so it can be understood by the Semantic Encoder, run the search again, and retrieve the most relevant passages to build the generated answer for the follow-up question. This is how it works in this version of the agent.
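The follow-up mechanism just described, rewriting the follow-up into a standalone query before retrieving again, can be sketched as below. The function names and the stub callables are hypothetical; the point is the three-step shape of each follow-up turn:

```python
def build_rewrite_prompt(history: list[tuple[str, str]], follow_up: str) -> str:
    """Ask the LLM to turn a context-dependent follow-up into a standalone
    query that the semantic encoder can understand on its own."""
    transcript = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    return (
        "Rewrite the final user question as a single standalone search query, "
        "resolving any pronouns or references using the conversation.\n\n"
        f"{transcript}\nUser: {follow_up}\n\nStandalone query:"
    )

def answer_follow_up(history, follow_up, rewrite_llm, retrieve, generate):
    """One follow-up turn: rewrite, retrieve again, then generate."""
    # 1. The LLM rewrites the follow-up into a self-contained query.
    standalone = rewrite_llm(build_rewrite_prompt(history, follow_up))
    # 2. Retrieval runs against the rewritten query, not the raw follow-up.
    passages = retrieve(standalone)
    # 3. Generation grounds the answer in the freshly retrieved passages.
    return generate(follow_up, passages)
```

Retrieving on the rewritten query is what keeps every turn grounded in fresh passages instead of letting the LLM answer follow-ups from its own memory.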
And at the end you can see which versions of Sitecore you could install to use Coveo for Sitecore. In this answer, you can see I didn't put any ranking rules or ranking weights in my query pipeline, so you're just getting the most relevant passages from the model. I don't want to say it's wrong; it's true that version 5.0.1368 supports Sitecore 10.2, but we now support Sitecore 10.5, and the latest version is 5.0.14-something, as I remember. Just so you understand: we can tailor and optimize which passages are considered most relevant. We can go next with commerce. It's the same with commerce: we can have a conversational agent. Again, think about a conversational agent powered by Passage Retrieval answering a user's questions, including follow-ups. In my conversational agent, let's go back to our previous query: "Do you have kayaks?" But let's say, this time, for two persons. And yes, we do have kayaks for two persons; the Passage Retrieval API returns a few items. Then I can ask, okay, what's the difference between this kayak and the tandem kayak? It takes my follow-up question and goes back to the LLM to rewrite the query, and the query is sent again to the Coveo platform. The Semantic Encoder understands it, works with the passage retrieval model to get the most relevant passages for each item, and at the end it's able to say: here's the difference between these two. That's how it works, and that's where the demo ends, basically. I really hope you understood what's happening. So let's go forward with the best practices for implementing Passage Retrieval; this is a bit from my own experience doing it.
The first thing to think about when you are implementing Passage Retrieval is content quality. Content quality is really critical. Like I said from the start, Passage Retrieval gets the most relevant passages; but if your content is a year old, it's still the most relevant available from your source. So make sure your content is as up to date and accurate as possible. Also, if you have permissions, make sure permissions and security are integrated. One thing to remember: Passage Retrieval is a retrieval system, and the good thing is that it doesn't retrieve the full document; it goes and gets the most relevant passages. Remember that when you use an LLM, the cost is driven by the number of tokens, so when you send a two-hundred-page document to the LLM, the cost goes up. Because Passage Retrieval sends only small chunks, the most relevant passages of a document, it reduces the cost of using the LLM. That's something you should be aware of. The other thing about using Coveo Passage Retrieval: it's inside the Coveo platform, so the platform manages relevance, permissions, and security. Leave it to the Coveo platform to handle all of those things, and keep your prompts simple and deterministic. The point here is: in your prompt, like I said, just follow the best practices of prompt engineering, but don't put any logic related to permissions or relevance in it. Leave that to the Coveo platform, and you will have success implementing Passage Retrieval. And LLMs are made to generate natural language; use the LLM for that only, with a good prompt. Finally, the architecture and workflow I showed you today were used with Amazon Bedrock, but they can be used with any LLM.
Basically, the architecture I showed you today is LLM agnostic, and hopefully you can implement it with any LLM in your gen AI application. Like I said, Coveo Passage Retrieval is really good for agents and for any gen AI application. It can be used with Agentforce, Amazon Bedrock, LangGraph, and so on. And to end, we have some resources for you. If you want to understand more about Passage Retrieval and the Semantic Encoder, we have links we can share to our documentation. I've also made available the gen AI demo app I just presented, so you can use it for your testing purposes. There's a big README file in the app that I would ask you to review before making changes, because there are still a few things you need to change inside the app to be able to use it with your Coveo organization, and, as a quick reminder, to use it with Amazon Bedrock. The Amazon Bedrock documentation site is really great; it was really easy for me to implement Amazon Bedrock with my app. Creating an API key, learning what Amazon Nova Lite is, and making the API request were all really easy using the Amazon Bedrock documentation. I hope you really enjoyed the demo, and I hope it makes things easier on your side. Thank you so much, Fred. Just to add to the resources: we will be sharing the links to all of those, including the demo app, by the end of the week via email, so keep an eye out for those coming to your inbox. And I believe we've got one more question here: how does Coveo handle situations where multiple documents conflict or content is out of date? Does the ranking model handle freshness signals or metadata? Okay. Sure, at some point you may have content conflicts and things like that. The first thing is always to go back and look at the content and check whether it's of the highest quality. That's the first thing.
If you have multiple versions of the same document, it's pretty much certain to bring some conflicts. The other thing you can always do, to make sure the most relevant version is used when you have many versions of the same document: because the query goes through the query pipeline, you can use query pipeline rules to boost items. Items with the most recent update date can be boosted, and that brings forward the most relevant document when there are many versions of the same one. So that's another way it's possible to handle this with Coveo. Great, thank you so much, Fred. I don't see any other questions here, so I think we will wrap this up. Keep an eye out for the recording coming your way, along with all of the links and the demo app, and stay tuned for more of these webinars. Thank you all so much for joining us.
Implementing Coveo Passage Retrieval with Amazon Bedrock

As organizations embrace generative AI, ensuring responses are accurate, trustworthy, and grounded in enterprise knowledge has never been more important.
Join us to see how Coveo’s Passage Retrieval API, combined with Amazon Bedrock, enhances LLM performance by anchoring responses in your own content. Through live demos and real-world examples, you’ll learn how to integrate and optimize Coveo’s Passage Retrieval to deliver reliable, context-aware AI experiences.







