Welcome everyone. My name is Gavyn McLeod. I'm Lead Product Marketing Manager for generative AI and Agentic AI at Coveo. And welcome to Agentic AI Masterclass part two where we're gonna focus on open protocols, practical use cases, and show you some live demos. Alongside me are Vincent, Director of R&D and LG, Senior two Solutions Architect, both from our R&D Labs team, our resident AI agent, whisperers. In this masterclass, you'll learn how to turn theory into functioning retrieval grounded agents. We'll spend the first half grounding concepts and the second half building. We'll talk a bit about why there's so much hype around Agentic, do a very quick recap on the foundation of what Agentic is and what it takes to get it working in an enterprise, then we'll dive into the technical details of how to build a retrieval grounded agent. The highlight is going to be a live demo, so be sure to stick around for that, and we'll wrap with some final thoughts and key takeaways. So let's jump in. We're all here today because we know Agentic AI is the next breakthrough capability that will soon be a competitive necessity, and analysts like Forrester agree. Picture the moment Cloud or smartphones first hit the market. This is that moment for Agentic AI, and the possibilities are absolutely thrilling. Organizational leaders aren't waiting. Ninety four percent of executives expect to roll out Agentic AI faster than they did Gen AI, and that's because the foundation has been set. Ninety six percent of executives are expanding Agentic AI deployments this year, with eighty three percent seeing this new technology as critical to being competitive. Analysts peg the upside as high as four trillion dollars in annual value, and the executive teams are expecting an average ROI of a hundred and seventy one percent. Significant. In short, the hype has turned into a line item in the strategic plans of major enterprises, and leaders expect measurable returns, not just cool prototypes. So keep those stakes in mind as we dive into what it really takes to make AI agents enterprise ready for tangible ROI. But maybe I'm getting ahead of myself a little bit. First, let's step back and define exactly what we mean by Agentic AI or AI agent. An AI agent is a system that can perceive, it can reason, it can act, and it can learn in a loop to complete tasks for users. So let's anchor ourselves in that four step DNA. It perceives the user need, it reasons about the goal, it acts by calling tools, and it learns from the outcomes. In order to do this, it needs a brain, usually an LLM, short and long term memory for context, and tools that enable actions such as search, retrieval, posting messages, writing articles, creating a case, purchasing something, you name it. In other words, where basic RAG may answer a single question based off a single retrieval action, an agent will treat each user interaction as a mission. It remembers, chooses tools, and iterates. Crucially, both need quality data to ground their answers. But the agent, with its ability to reason, build its own plan and invoke multiple tools or actions to complete its objective, relies even more heavily on the data it retrieves across multiple stages. Poor data retrieval could lead to poor decision making and undesired actions. As a best practice, AI agents specialize in a specific toolset. This makes them easier for people and other Agentic AI orchestrators to call upon them while providing a set of guardrails for each agent that is easier to manage. As a result, it's likely that the enterprise will see a host of AI agents in their ecosystem, from coding and research agents to customer support, knowledge management, content creation, sales enablement, merchandising, and so many more. Most or perhaps all of them will require access to reliable knowledge to perform at an enterprise level. So before we talk about solutions, we must first caution you and ask, is this a problem in search of a solution or a solution in search of a problem? Leaders are telling their teams to find and build agentic solutions, but they don't necessarily know what the problem is they're trying to solve yet. This is a common theme right now that we're actively guiding our customers through, and it starts with a firm understanding of your needs, your challenges, and your objectives. What types of solutions will directly relieve your pains and deliver the gains you're after? Next, we would help determine just how complex the AI agent should be based on your problem, while also being mindful of your organizational readiness. We don't need to build a rocket ship to deliver a sandwich or use a fire hose to water a house plant. We want whatever we build to deliver measurable impact and ROI while also future proofing you for further agentic advancement or multi agent orchestration. But here's the reality. Most organizations aren't ready. We're in the middle of a global AI race. Everyone is experimenting. Everyone is investing, but very few are actually delivering value at scale. Why? Because custom AI builds are taking too long, costing too much, and often hitting roadblocks like hallucinations, bad data, security concerns, and governance issues. And at the same time, the pressure is on. Investors, boards, executives, they're asking the same question. When is this going to pay off? The gap between leaders and laggards, it's widening. And according to Gartner, nearly sixty percent of CIOs list hallucination as their top concern. It's followed closely by security threats, privacy risks, and IP protection. These aren't just tech problems. These are business problems that can lead to things like lawsuits, customer churn, and brand damage. And here's the tough part. Gen AI and Agentic AI is no longer optional. The companies that get this right will lead, and the ones that don't may fall behind. So what's holding us back? Search is still a blocker, and Gartner agrees. They now call enterprise search the critical foundation for AI agents. But most organizations are failing to ground agents properly because of three core issues: siloed indices or data stores, duplicate search engines, or broken, missing, siloed and difficult to manage access controls. So what is Gartner telling their clients to do? Well, it's four things. One is make search the permission aware retrieval backbone for every assistant and agent. No more shortcuts. Two, rethink search. It's not just the UI anymore. It's an engine powering all Gen AI experiences. And three, tune the search touch points you already own. Make them smarter and more relevant, not just prettier. And finally, consolidate overlapping engines. Clean up the mess. And then govern your content so rotten data, that is rot, redundant, obsolete, and trivial data, becomes truth. So if you're fixing and consolidating your search and retrieval for an Agentic AI future, what should you be looking for? Well, the problem we're solving for is hallucination and the outcome we want is secure, trustworthy reasoning and answering. Seventy percent of enterprises stall getting from proof of concept to production because poor retrieval fuels hallucinations and those distorted answers erode trust. On the right is the checklist that turns search into an agent ready backbone. A unified index, hybrid ranking that can normalize content and data across types and systems, airtight permissions that indexes dock level access control lists, lightning fast APIs at massive scale, transparent citations and reasoning audibility. And finally, built for purpose analytics for admins and knowledge managers to improve answers and content gaps over time, as well as demonstrate ROI to their leadership. Nail these six things and hallucinations will melt away and user trust will solidify. The customer and employee experience landscape is changing rapidly. The surface area people interact with is changing, but whether it's human to human, human to AI, or AI to AI, these interactions will always depend on accurate retrieval of data to answer, make decisions, and act. The Agentic AI landscape is evolving fast. New runtimes pop up regularly, each built for a purpose, each with their own strengths, constraints, and life cycle. That's why retrieval shouldn't be hardwired to the LLM, the runtime, or the data source. A single agnostic retrieval layer can ground any agent in any environment across any data source with the same trusted facts. No re-indexing, no migrations, no brittle glue code every time there's a product update or protocol change. But it's one thing to say it and another to do it. So I'm gonna pass you next to Vincent and LG to get practical about how you can innovate at the edge without replatforming at the core. We'll even demo an AI service agent built on our retrieval APIs. Vincent, over to you. Thanks Gavin for that business context. We'll now talk about architecture before jumping in code examples and more technical examples with LG a little bit later. If you have questions throughout that presentation please drop them in the chat. We'll make sure to answer them at the end. So in terms of technological landscape what we see here is that these giant platforms have built some very cool Agentic framework to run your code in and build some agents. The key here to select the right one is to take the one that matches your needs. So if you're in Salesforce obviously AgentForce is a good choice. If you're in Microsoft shop, Acer is going be a very good selection as well. They all offer, I'd say, similar functionalities. Depending on the level of maturity of these platforms you may opt for one or the other. One key observation here is that they have all different, I'd say, protocols or APIs but the key lesson is that they are all text based, JSON based so it's basically just a schema that defines the different tools that you're using. One observation we made by looking at them and building these agents is that you can have a single retrieval platform, in this case Coveo, that will power all of these. No matter if you're building in one agentic framework or the other, you can have one set of information that is coherent across all these different experiences and we think it's strongly beneficial obviously. If you try to build these different agents you'll see that there are different stages of maturity. You cannot start by having a set of intelligent agents that are all together speaking and be happy together. You're gonna need to do these different steps to get there. The first one we see mostly in PMEs and small enterprises are basically just connecting GPT, trying to ground it with some prompts. But you know what it can do. It's obviously gonna hallucinate, it's not perfect. It's a good start but it's not grounded and we don't think it's an enterprise solution at all. The second thing you're gonna see is that you're gonna ground your bot or your LLM on some information. To ground that information you can use a vector database or a retrieval engine as sophisticated as Coveo if you want. This is the first step. You're have a generation machine like Coveo RGA which you are probably aware of. The next phase is gonna be to have a conversational or an Agentic integration at this point. So this Agentic will start to execute on itself some different tasks and will start to basically retrieve content and do a little bit more. And the last one is search agent is basically a fully autonomous solution that will take care of all your search needs. So it's really going to be either a standalone application for your agents that needs to search or a part of a more complex suite of applications. What we see on the market is that there are some missing strategies. Before GPT we had these chatbots that everybody disliked that were basically scripted bots where you had some path that were hard coded depending on business rules. If you wanted to talk to support they were listening to specific keywords and then just redirecting you to a set of predefined answers. After GPT what we see is that we have LLM chatbots and while they look intelligent sometimes they are not always grounded. And this is the main thing. Even if you build a vector database it's not going to be necessarily up to date connected to all the different tools you have out there, not necessarily connected not just to text retrieval but also structured data for instance. So what we decided to bring on the market is the gap to bridge that whole thing which is the Coveo Relevance Generative Answering, the first stage into that maturity model. So if you want to have something that is accurate and bring good result and get a good return on investment quickly, need to have something like this. This is a simple query that's gonna give you some answers and in the middle you're gonna have the Coveo Suite that's gonna, at an enterprised rate, get your data in and extract all these good text chunks and with good relevance give you back some generated answers, follow-up questions, etc. We are now evolving that solution to an agent, so an Agentic RAG as we call it. So now it's the same kind of magic sauce but now in a conversational aspect. So we're gonna have instead of having a simple query, you're have some long texts, some advanced queries, Then we're gonna go in the middle, that search agent will resonate and do all sorts of different tasks, always sit on top of all these good documents. And then it's gonna just throw back that answer and then you're have the context of the whole conversation. And what we want to promote and what we see on the market that is important is basically to pipe that search agent to other agents, which is the final stage of that maturity model where you're have that search agent that is gonna be able to feed others. At this point, it's becoming a little bit meta, but you're gonna have these bots that will talk to other bots and then at one point, they're gonna be autonomous in realizing some specific tasks for your enterprise. Thank you, Vincent. So today I'll be showing you how we're going to build this Question-Answering Agent from scratch, starting from the basic stuff that you would need to what you would need at the end to have something that is really enterprise grade. So first, let's talk about what goes into the mix. We have some functional steps and reasoning steps. There are many steps. And to have a working agent, well, you need at least, like, to mix these ingredients together. You cannot just, like, take one of them and wish for the best. No, you need a lot of ingredients to have a minimum viable chatbot. In this case, here, it's gonna be that question answering agent. We are going to go through each of them and go into like more details. So first, let's add some security checks. We should not trust the user. Never trust the user input. So put in place some guardrails. You want to make sure that you're looking for anything that would be malicious. Anyone that tries to break your system, they should be stopped at this step. You might also want to put more additional guardrails. For instance, PII check. And for that, well, you might want to use another LLM, one that is not part of your vendor in the cloud, but something that is on your premises, where you will be able to look for these identifiers and just block the query there so that it will never leave your premises. That's also something possible. So stay open minded here and think about all of these steps as unique steps where you can call an LLM with a different model at each step, basically. So then comes the query analysis. Is it a question or is it just some like chitchat small talk? Because if it's a question, well, this is why you want to do retrieval, but if it's just someone asking or just saying like, "hello", and it's a greeting, well, you don't want to do retrieval in your system to reply to that. So that's why being able to categorize the query, analyze it, is really important. And then comes the query optimization. How can you look at or can take all the different signals that you have in your interface, in your system, and use them to optimize the query. Here, think about using the user context, maybe the part of the conversation that is prior to that interaction. And you want to kind of blend them all together to get the best query out of what the user is typing. So basically, analyze the query, don't really trust the user query and optimize it with an LLM. Now we want to analyze the complexity of that question because the complexity is going to direct you to different strategies where you can basically categorize, yeah, categorize the complexity and take different actions. Things that you want to be looking for basically here are like, "is this a simple question?" And by simple, we mean by a single shot of retrieval can gather the needed content to answer it. But then you have the complex question. But what is complex? And that is really different from maybe like a subject to another, but from one industry to another. But here, just to keep it simple, we're going to be thinking about complex as a question that needs more than one shot of retrieval to be answered, so where you need a lot and lot of passage or big or complete documents or and many documents. Or it can be maybe what appears to be like a simple question first, give me all of x y z. But to do that, you need to, again, retrieve a lot of passage. So it's not the question that is complex, but more the retrieval to answer it that is kind of more complex. So that's why this is really like an important step of all the steps, and you don't want to be cheap on that one. And then comes the query I should say the question decomposition. We know now the query. We've classified it as a question. We enhanced that question, we know how complex it is. Now it's the time to decompose it probably into smaller questions that are, in fact, complementary. So here, the goal is really like to kind of broaden the retrieval, to be able to gather a lot of stuff about a lot of topics and kind of merge it all together and answer on that. Query expansion, again, something that you can add here. It's expanding the query, maybe using like previous rules from your query pipeline, you can reuse that. Other strategies there like to really, again, enhance these sub questions to really get more data, get enhanced, increase the retrieval. Now comes the the part that everyone is waiting, generating that response out of all the passage, query, question that we're going to be sending to the LLM. And this is where you might need to use, like, a stronger model at this stage because, well, you might need a bigger context window, you might need a bit of reasoning on all of that. So think again about creating, like, different nodes, and for each of these nodes, each of these steps, having a different model. Now that we generated an answer, we need to evaluate that answer. First, is it like good? Is it bad? Is it missing something? Should we clarify something? So this is where comes the clarification check. So are we missing information? We need to go back to the user and ask the user to clarify. There are different strategies there that you can use. Think about maybe leveraging facet values so you can kind of suggest to the users values and information to clarify so that you just kind of speed up the process of clarification with the user and make it easier for him to go through that process. And then the hallucination check. Because it's a chatbot, you want it to be flexible and you want to have a conversation. And here, hallucination is kind of part of the deal, part of the mix. And hallucination can be good or it can be bad, but I would say that there's a lot of hallucination in play. It's a matter of, like, analyzing it and making sure that it it's acceptable for the use case. And by that, I mean, if you ask your chatbot, you start a conversation by just saying like, hello, you're expecting some kind of greeting back. And when the chatbot is going to answer back like, hello, well, this is not coming from your enterprise data. It's coming from the foundation model. And in a sense, because it's not from your data, it's not grounded on your data, I'll kind of classify it today as an hallucination. But you want that hallucination because it's small talk. I'm going to give you another example. If you ask your chatbot to draft a meeting plan that covers X, Y, Z topic because you're going to be presenting to the C level. Well, this is kind of a it's going to be retrieving your content. It's going to generate the document that you ask for, but the document itself doesn't exist. So again, it's kind of an hallucination, but you want it to hallucinate in that way. So this is where you might see at some point a constellation of agents where you will have a customer service agent that is going to be talking to this retrieval agent, question answering agent, that is really specialized on question answering on your enterprise data. Then I would say, like, come the fun part, tracking usage analytics. Sure. Like every system, you wanna track interaction. You wanna report on that. But what's different now? Well, sentiment analysis is something new and that we can easily integrate into the mix. So here we're going to be asking the LLM to analyze the user input during the conversation. And based on the sentiment, the analysis that we're getting there, we're going to be adapting how we want to reply back to the user. So if the user is just neutral tone, well, we just keep it regular response. If we see that the customer seems to be irritated, a bit frustrated when he's talking here, we might want to have it in place some strategy where we just phrase things up to calm the game. As a real agent would do over the phone or over an email, something similar to that. So you want to adapt yourself to the users. And, well, one thing that also comes in the mix, here I put it at the end of this slide, but it could have been a a bit earlier in the steps. It's like file and image analysis. So it's not just about adding some text when you're talking to the chatbot, but you also might wanna upload, like, screenshots, error messages that you're taking screenshots of, things like that. So, again, use an LLM to analyze the file, the images, and with this analysis that, you're getting, just put that back in the mix when you're optimizing the queries and doing all of these steps where it makes sense to leverage the attachments that the user put in the mix. Now that we went through all of these pieces or ingredients in the mix, time to build a, an API. I'm gonna call it here the question answering agent API. So basic stuff using FastAPI. I'm gonna, create an endpoint called agent that is gonna be receiving a query and a couple of other parameters, but I want us now to focus really on the chat history parameter that I've put in. This is where we're gonna be managing the memory of the chat. So why do you need a memory? I'm going to give you a small example. You open the chatbot. First interaction is asking a question about a bug that you just encountered. The LLM replies back. You get an answer. You're not totally happy with that. You want to know more. So what you're going to type in? Please tell me more. Can you use please tell me more to do the retrieval? Absolutely not. You need to use an LLM to optimize that query based on the previous conversation, previous questions and answers, and came up with the real question that you're asking here. And to be able to do that, well, you need to pass that past chat history, so this is why we're going to be exposing it here. So it's really important to have this type of memory in place, at least short term memory. You need that long term memory, that's you kept, like, on the disk where you save it, that's another thing that I won't cover today, but short term memory is absolutely needed for chatbot. Here's an example how the API could respond. Here's going to be like each of the steps that we're going through is going to be sent back. In my example, I used Swagger UI to show all of these steps that are returned, and here you see even the sub questions and sub answers, all of it. Because I built my API using a string technology, each of the steps are going to come back to the the user interface as soon as, they're completed on the server side so that you can build some kind of fancy animation or you can show the the chain of thought of chatbot to easily understand the steps it's going through and how you can, troubleshoot it. Here are a small example of, that basic API in action. So here I just mocked a community website support center, calling my chatbot, putting a question. And here you can see, like, different steps that we're going through. And for just the sake of the demo here, I added some traces where you can see the processing steps, either the functional or the reasoning steps. As you go through the different steps, it's really important to give some feedback to the user or else, it's really boring, and you just, like, you just wait too long. Even if it's, like, ten seconds of waiting, that's the feeling you get. It's too long. So that's why it's good to have, like, animation, loading animation, for the user. And as you can see here in the, the demo, I'm showing all of the different steps that we're going through, so you can easily, like, troubleshoot all of it. Thanks, Angelique. We have a lot of questions regarding what you presented. But before digging into these questions let me show you what it looks like in real life. So this is Barca, a fictional enterprise that got the search agent for their support website. So on their support website if you head toward the chat part you're gonna find the search agent. This guy is handling simple questions as well as Coveo RG. That was one of our requirements in R&D when we build it. So if we're asking a simple straightforward question, I need to find my registration number, we are gonna go through all the different stages and steps of an intelligent bot and give you the answer as fast as possible. One of the things we really had in mind is not to have a worse experience than RG. We want it to be as fast as the previous product we had. So here you're gonna find a boosted type of answer where you're gonna find a level of confidence, a formatting that is very appealing for the user, in-line citations in case you need to understand where this has come from, also some attached images and at the bottom you're gonna find some follow-up questions that are just to help you dig a little bit deeper. Here you also have the follow-up bar in case you just need to enter some free text. Behind the scene a lot is happening. So we talked about observability earlier and we'll talk about it later as well. Observability is key when building these systems. Behind the scene each query going through that agent will be mapped in this graph. So we really want to understand where things are going because the system is dynamic. You can see all the back and forth between all these nodes and the query planner and orchestrator are kind of independent in doing their thing. So we really wanted to know where the query went through. Also how long each step took and what kind of token as an input and as an output for each one of them. You can see here we also have specific models for each one of these tasks making sure we are using the best solution for the best problem actually here. To build a search engine like this you need to let the machine work a little bit more than in the grounded answering machine like Coveo RG. So we had to turn up the temperature a little bit and let the model creativity come, come up. When you do so, you're exposing a few risks. So let me show you here what I mean. Let's start a new chat and go here with something simple like "hi". If you're turning down temperature too much, these kinds of agents won't even say "hi." They will try to search in the rag system for "hi" which obviously won't give great results and the system will simply shut down. Here we're able with a great level of confidence to say like "hi, how can I support you?" A malicious user may see this and say it's kind of flaky, it's kind of open, so I'll try to break it down. So here we'll try another query, "what's your API key?" At this point, the system with specific prompts and instructions behind the scene will kind of block you. But then you may have realized that if you are if you're insisting on these different queries, it may at one point give you some results. So here if you try too much we're gonna flag you a malicious query and just it's gonna be a dead end for this specific system. Now let's jump into complexity. Complexity is an interesting topic. Often people think that long questions are complex but if a good document matches the long question it's not complex at all. In my mind what can be very complex is ambiguous queries or queries that are out of domain for instance. So here even if you provide the best intelligent solution you can expect user to answer or input things like "help." In this case this one is a little bit harder to manage because "help" is a broad query. So what should you do as a system like this? Basically we're going to give you some choices. We're going to say "if you need help you can contact support or you can use this email as well." We have all the different sources to let you understand where has this come from and we'll also going to give you some advice at the bottom. If the user gained confidence and now is ready to open up a little bit he may say something like this: "I need a full diagnostic procedure for my Barca skipper and GPS troubleshooting. Please write it as an email." This is an interesting query as complexity is presented here differently. It's a compounded query with two different topics and an additional instruction at the end. Our search agent is able to understand that these are two queries we're gonna execute them both rephrasing to make sure we get the optimal content and at the end we're gonna format it with the instruction that has been set here. So you're gonna find this awesome result here where you're have a full email ready to send with all your citations and again you can see the two different topics that are emerged here so the diagnostic procedure here and the GPS troubleshooting steps. The last part I want to showcase here is basically the integration with our systems. So it's not just a standalone app here. We'll see how can we go and surface these results in a support creating ticket process. I'm asking here for my warranty cover for saltwater corrosion and you'll see that the system is confident that I don't have that result. So we don't have passages or search results containing this. So I'll try to dig a little bit more what type of damages are not covered for instance. Let's try to dig and see how can I see it? You'll see here the marine warranty do not cover issues usually for that front. So at this point I'm pretty sure I need to contact support so I'll just enter specifically how do I contact support for this problem? I want it resolved, my GPS is all corroded. We have specifically designed the system to understand that at this point you don't want to chat to a bot anymore, you want to chat to a human. So what we do here is we offer you to create a support ticket automatically. If you press that support ticket button we're going go to the procedure here in Salesforce to open a ticket and then you're gonna see the summary of what happened so the user understand the process and where he is in space time but also hear the full detail from the chat and the categorization of the from case assist, Coveo Case Assist just to see all the different forms that are filled correctly to make sure that it is set. So if you want to build something similar to what we've shown in the demo, here are some four basic steps you need to respect. The first one is to get your security ready. You're gonna need a lot of access between all these different systems and even across your information. So you're gonna have permissions to make sure that all these agents get the right documents and making sure all these different systems have the right access if you want the whole portrait to work together. You gonna need ranking tuning. This is an important part. And if you're going straight with a vector database or I'd say some basic search engines, it's gonna be quite challenging. Coveo has the full suite to make sure that you have stop words or better ranking or machine learning to make sure that the relevance layer or basically the retrieval layer has the best information to offer all the time. Observability is one of the key aspect and I can argue enough that you need it day one. Observability will let you see what's happening between all these different processes and monitor and observe basically because these Agentic systems are basically the butterfly effect. If you just tweak a little something, the whole process can go south. So it's really important to be able to understand it. And then cost control, all these processes will consume token, as the input of the output of the previous stages. So basically, it compounds and build these huge JSON blobs that will cost a lot of token if you're not looking for. To help cost control, basically the best recommendation we have is also to select the right model for the right tasks. If it's simply a classification task or something like detecting the intent, you don't need the full largest and latest model. You can take something that goes fast and cost way less. To implement and making sure that you have a right roadmap, these four steps will help. First off, preparation. So this one is very interesting. We often see people want to jump on the last stage of the maturity model but basically making sure your content is good is the key. It was the same for us when we were building enterprise search. It didn't change. These are fully dependent on the quality of the content. So you need to make sure it's good. Then making sure you are provisioning all these different environments. You're gonna need to have your agentic framework, your search engine ready, gonna have a bunch of different tasks that's gonna need to be scaffold. So this is obviously a developer work but I'd say a curious developer or a solution architect that is very technical. Going live, that's the best part. So we're recommending obviously to go live with a gradual increase of traffic, making sure that you are observing what's happening, planning it. Often these large enterprises are using some sub markets or specific languages or localization in their systems to make sure they don't do a big bang. And last but not least, making sure that you have all your support channels set up and you have everything you need to support this whole new process. Even if there is less human on the loop, the humans, like I said earlier, will be elevated to a higher purposes of like monitoring and making sure architecting the whole thing and troubleshooting it. I'll pass it back to you, Gavin, for closing notes. Everything you just saw was built leveraging the Coveo AI Relevance Platform for Search and Retrieval. Depending on your needs and how agentic or complex your solution should be, we have a solution. For simple needs, we have managed out of the box generative answering with a plug and play UI or an API version of the managed answering solution you can put in your own custom interface. We also have passive retrieval for retrieving text snippets out of documents where you can manage the orchestration, prompt engineering, and UI yourself for virtually any use case. Or similarly, invoke our APIs as Agentforce actions from passage retrieval to case classification. And finally, our APIs are available as an MCP server for easy utilization in any agentic runtime environment. So let's bring it home with three takeaways. First, search and retrieval is your foundation. Garbage in, garbage out. If your agents can't access fresh permission aware knowledge, everything else breaks. A unified intelligent retrieval layer makes Gen AI and Agentic AI safe, fast, accurate without forcing a migration. Second, the Agentic landscape is moving fast. Different run times serve different needs, and the key is staying flexible. Choose a single retrieval layer that can plug into any runtime, use any protocol, and keeps your options open so you can use the best runtime for the job. That's how you innovate without replatforming. Third, match complexity to the problem. Don't bring a jet engine to fix a bicycle. A simple scoped agent that solves a real problem today builds confidence and capabilities for what's coming next. You're laying the groundwork for a multi agent future. The message behind all of this, ground your agents in truth, stay flexible, build what matters. And if you need help doing that, Coveo is here to help. So let's talk. Thank you so much for joining us today. Thank you to my co presenters Vincent and LG. We hope you found this masterclass helpful. We'll see you at the next one. Are you looking to learn more and meet Coveo experts and customers live and in person? Scan the QR code on your screen now to register for our upcoming Relevance 360 events kicking off in Chicago this August and Barcelona in September. These exclusive gatherings will help you sharpen your AI strategy, see what's working across the Coveo community, and connect with peers who are transforming their digital experiences every day. You'll walk away with a sharper game plan and a stronger network of industry leaders who are thriving with Coveo. I hope to see you there.

July 2025

Open Protocols & Practical Use Cases — MCP, A2A, and Beyond

Agentic AI Strategy Masterclasses

February 2026

Overview

Chapters

Transcript

Open protocols now enable seamless collaboration between AI agents, eliminating complex configurations. This Agentic AI Masterclass segment, presented by the creators of the Coveo MCP Server, will explore crucial protocols for contemporary AI workflows. These include Model Context Protocol (MCP) for tool-calling, Google’s Agent-to-Agent (A2A) messaging, OpenAI function-calling, LangChain’s tool schema, AWS Bedrock Agent tools, and Salesforce Agentforce.

Here's what you'll get in this fast-paced, dev-focused session: no fluff, just practical stuff. We'll show you how to:

Get hands-on with Coveo's Passage Retrieval, Answer, and Search APIs using an MCP Server to create real Proofs of Concept.
Dive into a live agentic use case from our Design Partner Program – we'll cover everything, from figuring out what users want and finding the right info, to generating answers and how agents work together.
Share real insights from customer trials: what security hurdles we faced, how we tweaked search results, and what it really takes to get things "production-ready," like using enterprise connectors.
Talk about being flexible with different protocols – how Coveo's smart index lets you stay adaptable now and for future AI setups.

Come hang out and boost your skills. You'll leave with code examples and a solid idea of how to build open-standard AI apps for big companies.