Hello, everyone. Thank you so much for joining our webinar, "Go Beyond the Hype: Enterprise-Ready GenAI for Industry Leaders." My name is Bonnie, and I'll be your moderator today. I'm really excited to be part of today's session with Coveo's CTO, President, and Founder, Laurent Simoneau, and our Director of R&D, Vincent Bernard.

I have a couple of housekeeping items to cover quickly before we get started. First, everyone is in listen-only mode. However, we do want to hear from you during today's presentation. We'll be answering questions at the end of the session, so please feel free to send those questions along using the Q&A section at the bottom of your screen. Today's webinar is being recorded, and you'll receive the recording within twenty-four hours after the conclusion of the event. Because we're showing real products today, I do want to draw your attention to our forward-looking statement. Luckily for you, I won't be reading this slide out loud, but feel free to read it at your leisure when you receive the recording.

Our agenda today is simple. In the next forty-five minutes, we'll cut through the hype and talk about what's real with generative AI from the Coveo perspective. Laurent will introduce you to the new Coveo Relevance Generative Answering solution, and Vincent will share a live demo. Then we'll take your questions at the end, so get those fingers ready.

For those of you who may not be familiar with us, Coveo has been building enterprise-ready AI for over ten years. Our goal is simply to democratize AI so that any enterprise can take advantage of AI capabilities for search, recommendations, and one-to-one personalization. We do this across three lines of business (commerce, service, and platform) so that you can deliver personalized, relevant, and profitable experiences across the entire customer journey. Our perspective, and the strategy we pursue, is "last to hype, first to results."

So let's take a look at what's happening in the tech landscape today. Over the years we've seen how large technology disruptors like Netflix, Amazon, and Spotify have shifted customer expectations and redefined the way businesses interact with people. The launch of OpenAI's ChatGPT is no exception. From large enterprise companies all the way to our grandmas, everyone is getting their hands on GPT. Generative AI really is an experience game changer. It has been driven by quantum leaps in large language models, and it is driving the need for enterprises to move quickly to start leveraging this technology, so that they can deliver on those evolving customer expectations and not get left behind.

When we talk about these expectations, it's important to remember that this isn't a blip. These expectations have changed for good, and we need to align around delivering unique, individualized journeys: experiences that are prescriptive and intent-driven, and journeys that are coherent and don't differ depending on where you search for an answer. So when thinking about leveraging generative AI, it's important to remember that it should be seamlessly threaded into your digital experience, not bolted on to the side. We've gotten the question: will generative AI replace search? The truth is, it won't. The worlds of intelligent search, recommendations, and generative chat all converge into a new, more modern digital experience paradigm.
Combining large language models with mature, reliable AI search and relevance capabilities is imperative to create generative experiences that enterprises can trust. Now, I shared earlier that we've been building AI for over a decade, and this is just the latest step in the natural evolution of our AI journey. So again, not something to think about in a separate silo, not something to think about as a bolt-on. We're very excited about what the future holds with this new technology, and we want to help our customers solve the key challenges found with platforms such as ChatGPT.

So what are those challenges? You're all very familiar with the need for security and privacy: ensuring that we're securely accessing proprietary content across multiple content sources. There's the currency of content, factuality and awareness of hallucinations, and the coherence of the search and chat channels. There's also the question of sources of truth and verifiability: how do we know that we can trust those generated answers? And then, of course, there are the high costs; generative AI can be very, very expensive to build on your own. Coveo Relevance Generative Answering is designed to address these obstacles and make generative AI enterprise-ready. So now I'll pass this over to Laurent to show how we do this. Thank you so much.

Thank you so much, Bonnie. Alright, so what I'm going to do in the next few minutes is set some context and do the introduction, and then Vincent, who's going to join me, will show you a live demo of what we're doing with GenAI. First of all, what I'd like to do here is show you some of the questions that we're receiving more and more from our customers. In service, for instance: "How do I add a new bank account feed in Canada?" "My robot is not following the path defined in the map; how can I solve this issue?" "Explain the difference between Atomic and Headless." Those last two are UI frameworks used at Coveo, so that query is really Coveo serving our own customers and developers with our internal resources. We expect to see these kinds of questions more and more, because ChatGPT has changed the expectations and the behavior of users. We will see some of that in commerce with guided shopping, of course, and we will see some of it in what we call the workplace, where employees will have questions internally about how to solve specific problems or about general information.

But we still believe that the search box is critically important: the classic search with one or a few keywords. Why is that? Because, first of all, people are lazy. They like to discover as they type; they like to have suggestions about how to refine their intent when they are not sure about their intent. They like to know what's out there, and to slice and dice. And a few clicks, when you think about it, is typically faster than writing a long prompt. So this is not going away. These are the kinds of capabilities we've been delivering for a long time, going after multiple sources of data with advanced relevance and security. But what we now need to do, at a different scale, is provide great answers to the kinds of queries where the intent is clear, because that's what the end user expects. And this answer may live in a single knowledge-base article, but typically it may be generated from multiple knowledge-base articles, as in this example.
That's what we need to build and address. But to do so right now, you have classic search inside large enterprises. Classic, advanced enterprise search such as Coveo involves secure connectors into basically the entire enterprise, an index to do search in a scalable fashion, and complex, advanced, powerful relevance to really personalize for each and every user. And besides that, there's analytics, administration, and all sorts of integrations. That's classic search. What we're starting to see now is this: a specific question-answering system that involves extraction and embedding with a vector database, which prompts large language models so they can provide answers.

And we think that while these two systems may exist together, it creates a problem. First of all, you may have two different search boxes. Second of all, and more importantly, you will have duplicate content, data, and infrastructure, making the answers different from one system to the other, because they deal with different sets of facts. We think that's not suitable for the enterprise, and that's why we're consolidating those two systems into one. The left side is classic Coveo; the right side is what we are in the process of adding, at scale, inside the Coveo platform.

What we're going to make available to customers in the coming weeks is the ability to access all of the content in the back end with our connectors. These connectors allow freshness and security in the systems, with administration, and with this information we ground contextual prompts to a large language model hosted on our own infrastructure. That supports a unified search box for all sorts of queries: keyword, semantic, and question answering. The answer is generated from the most semantically relevant paragraphs across the documents coming from search. It allows personalization and relevance, so the answer you get will be for you. And because we are running the large language model on our own infrastructure, and all of the work is done at the prompt stage, we believe it's the most advanced protection against hallucination that is available. So at this point, enough slides. I would like to ask Vincent Bernard to take control and show you this in a real live demo. Vince?

Hey, folks. Thanks for having me. Let's get started. I will share with you the amazing prototype we built. This is an Atomic search interface, so the classic Coveo experience, and we built it on top of the Coveo documentation. I'll get started with simple queries; the further we go, the deeper we'll get, but I just want you to get used to how it works. The first query we'll try here is "How does Coveo determine relevance?" When you run these queries, the major change you see is obviously the generative component at the top. Like Laurent explained, we're taking the best snippets of information from everything that has been retrieved, and we're able to merge that and explain it in a very easy-to-understand sentence at the top. This experience, and you see how fast it is, respects Coveo's security and all the goodies you get with Coveo, but then on top of it you have that easy-to-understand answer. So as promised, we'll go a little bit deeper and see some more advanced things.
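To make the retrieve-then-ground flow described above more concrete, here is a minimal Python sketch. The tiny in-memory index, the word-overlap scoring, and the `call_llm` stub are all invented for illustration; they stand in for Coveo's index, relevance model, and hosted LLM, none of which are public in this form.

```python
# Minimal sketch of retrieval-grounded answering, under the assumptions above.

INDEX = [
    {"title": "Relevance basics", "text": "Coveo ranks results using many signals ..."},
    {"title": "Atomic overview",  "text": "Atomic is a web-component library ..."},
]

def retrieve(query: str, k: int = 3) -> list[dict]:
    """Toy lexical retrieval: score documents by word overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(
        INDEX,
        key=lambda d: len(words & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, snippets: list[dict]) -> str:
    """Ground the prompt in retrieved snippets so the model answers from
    enterprise content instead of its own parametric memory."""
    context = "\n\n".join(f"[{d['title']}]\n{d['text']}" for d in snippets)
    return (
        "Answer the question using ONLY the passages below.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    # Placeholder for the hosted large language model.
    return f"(answer generated from {prompt.count('[')} grounded passages)"

query = "How does Coveo determine relevance?"
print(call_llm(build_grounded_prompt(query, retrieve(query))))
```

Because generation only sees what retrieval returns, the answer inherits the freshness, relevance, and security of the search layer, which is the consolidation argument made above.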
For the second query, we'll ask about two different topics: "Explain the difference between Atomic and Headless." There's a lot happening here. First off, we're asking about two different topics; the documentation exists in Coveo, but it's scattered across different parts of our documentation. It's not just a single article: we're able to find many things and merge them together. But it's also able to process queries like "explain the difference," so we're falling into something more like a semantic search approach, which I'll dig into a little deeper in a few minutes.

We are also letting people interact with this. So, for instance: explain the difference between Atomic and Headless. Yeah, I like that block of text, I think it's pretty good, but I'm lazy, and most of us are, so let's just ask the interface, or the model, to synthesize it using bullet points. We're letting the user actually interact with these snippets to get something a little easier to understand. In this case, we just asked for bullet points, and we get something easier to digest. Soon we're going to get citations, so all the references will be there and you'll be able to understand why these answers were produced. We'll talk about that a little later, but for now I think it's already pretty sweet.

Let's go a little deeper again. Here we're going to ask for an opinion: "Explain in detail the difference between Atomic and Headless, and when should I use each?" This is quite different compared to the standard search capabilities we have, because we asked about two different topics, but we're also asking, looking forward, when should I use each one? You see, it tells you at the end that the choice between the two depends on the level of customization you need, which is obviously the right answer. I've built UIs for a long time now, and I can confirm this is exactly the right answer. We can also ask for step-by-steps, which I'll show you with the next example.

Laurent touched on something interesting here: apart from the amazing generation that happens at blazing speed, we're also able to process very complex queries, or queries that have been, I'd say, damaged by users. In this example, you see that "how to install Coveo for Salesforce" is badly misspelled. A human can read it, because we're pretty smart, but a computer usually cannot process these very damaged queries. With the semantic capabilities we've added on top of everything here, we're able to process these complex queries, give you the right snippet, and point you to the right document. And then there's more: again, we can let people interact with that content a little bit more. In this example we'll use the step-by-step option, and Coveo, in a few seconds, gives you the right step-by-step answer. This works with any query and any document you may have in your documentation or on your website, since we're just doing a semantic search: taking that damaged query, un-damaging it, and sending it so we can retrieve all the documents. From these documents we extract the most relevant snippets of information, and then we ask the model to generate something out of them. It's a very fast and accurate process that gives you tremendously good results, like what you can see here.
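The "synthesize as bullet points" and "step-by-step" interactions Vincent demonstrates amount to re-prompting the model over an answer it has already grounded, rather than re-running retrieval. A small hypothetical sketch, with `call_llm` again standing in for the hosted model:

```python
# Hypothetical sketch of the answer-reformatting interaction (not Coveo's API).

def call_llm(prompt: str) -> str:
    return "1. ...\n2. ..."  # stand-in for the hosted model's output

REWRITE_INSTRUCTIONS = {
    "bullets": "Rewrite the answer below as concise bullet points.",
    "steps": "Rewrite the answer below as numbered step-by-step instructions.",
}

def rewrite_answer(answer: str, style: str) -> str:
    """Reshape an already-grounded answer. No new facts are requested, so the
    rewrite cannot introduce content beyond the original grounded answer."""
    prompt = f"{REWRITE_INSTRUCTIONS[style]}\n\n{answer}"
    return call_llm(prompt)

long_answer = "Atomic is a web-component UI kit; Headless is a state-management layer ..."
print(rewrite_answer(long_answer, "steps"))
```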
The other example I want to share with you is a little more complex: "What's the difference between a partner org and a trial org, and how do I create one?" If you're not familiar with Coveo, we have a very strong network of partners who help us go to market and install the product once a client has a subscription. There's a lot of information we want to share with them, but that information is not necessarily accessible to the general public. So what we've done here is enable the full security of the Coveo index. Coveo can respect the security of your systems, whether it's Salesforce, Sitecore, AEM, or whatever system you're using: if you have permissions and roles on your items, we respect that. So here we're going to log in. Right now we're anonymous; if I click on this, I log in as Mr. Vandelay. Mr. Vandelay is one of our partners, and now that he's logged in, you can see that we're able to retrieve internal knowledge documents from our Salesforce. This is what would happen if you logged in to Coveo Connect. Within that documentation you'll find useful information, like this article here, where we can see that the partner org is a full-featured, fully fledged organization, while a trial is a short-term organization; the partner one lasts longer. You can also register for free on the Coveo website, or through Coveo for Sitecore, Salesforce, or ServiceNow. A very complete answer. And again, we can ask for bullet points if we really want to get the model's train of thought. Here you see that it dissected the question into two parts: first, the difference between a partner and a trial org, and second, how to create one if you want a partner organization. So that's it for me; I hope it helps you understand what we're building. Those capabilities will be released soon, and Laurent will talk a little more about the next steps.

Thank you so much, Vincent. This is quite exciting, and I think we're one of the first to do a real live demo in an enterprise context. So what's coming soon? We thought we would also share a few roadmap elements. Vincent, in his demo, showed a great answer coming from a large language model hosted inside our own infrastructure. What we are in the process of adding is sources and citations, which is critically important, especially in the enterprise, where trust and a source of truth are mandatory. This will be part of the release coming soon. I also want to share that we are adding the ability to do follow-ups. After the answer, there will be the ability to start a dialogue with the system, to ask follow-ups, and we're also suggesting queries, prompts that are good, valid follow-ups with respect to the answer that was provided. Those are two examples of what we're investing in and what we're going to release soon. We are also adding semantic search to our commerce offering. This is an example for a query with a typo, "What are the best paddle boards for beginners?", and it's an example of more of a guided shopping experience that will provide more offers and more options for the shopper to go through.
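Looping back to the partner-org demo: the key point is that permission trimming happens at retrieval time, before any prompt is built, so secured snippets can never leak into the generated answer. A toy sketch of that idea, with invented document and ACL shapes that are not Coveo's actual data model:

```python
# Toy sketch of security-trimmed retrieval feeding generation.

from dataclasses import dataclass, field

@dataclass
class Doc:
    title: str
    text: str
    allowed_roles: set[str] = field(default_factory=lambda: {"public"})

DOCS = [
    Doc("Trial orgs", "A trial org is a short-term organization ..."),
    Doc("Partner orgs", "A partner org is a full-featured organization ...",
        allowed_roles={"partner"}),
]

def secure_retrieve(query: str, user_roles: set[str]) -> list[Doc]:
    """Filter by ACL *before* ranking, so secured snippets can never reach
    the prompt, even as grounding context."""
    visible = [d for d in DOCS if d.allowed_roles & (user_roles | {"public"})]
    return [d for d in visible
            if any(w in d.text.lower() for w in query.lower().split())]

anonymous = secure_retrieve("partner org", set())
vandelay = secure_retrieve("partner org", {"partner"})
print(len(anonymous), len(vandelay))  # the signed-in partner sees one more document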
So, in conclusion, some final thoughts. We think that cost is a factor customers will look into. Running large language models is very expensive, so there will be some arbitrage regarding the use cases and where to apply this in the enterprise. We are in the process of optimizing the cost of running large language models, but because we also support search, which is hundreds of times less expensive than querying a large language model, there will be a lot of situations and queries where a simple search will do a great job. Hence why we're consolidating those two systems together.

We also think that there will be a lot of large language model choices down the road. Today, GPT-4 is the best one out there, but there's domain adaptation happening as we speak, meaning large language models that are fine-tuned for a specific domain, theme, or industry, and we're also seeing good movement in open-source models. This means that a year from now there will be multiple options for companies and organizations regarding large language models. Therefore, we believe we need to be large-language-model agnostic. Our architecture is designed so we can connect to those models running on our infrastructure, but it may well be that some large customers will decide to host a model themselves and fine-tune it a few times a year; even if that's quite expensive, because of the size of these customers it may well be worth considering. In that case, Coveo will very easily point to the large language model hosted by the customer to provide great answers, again based on the search results that ground the prompts fed to these large language models.

So we believe, in conclusion, that it's really all about relevance across the content within the enterprise and the interactions that make this possible. If you remember my slide from earlier, the left side is where all of the connectivity, the relevance, and the security happen, and that is what's used to generate a prompt and go after a large language model, which is basically generic, to create a great answer. It's a very advanced prompt, actually, and as Vincent showed during the demo, it's personalized, it takes security into account, and it provides the best answer based on the content that is the most fresh and the most relevant. With that, I thank you for your time, and back to you, Bonnie.

Amazing. Thank you so much, Laurent and Vincent. We do have some questions from the audience, so let's go ahead and get started. The first question is: does the generative AI (the steps and bullets) rely on SEO-optimized content to work properly? Vince, do you want to take this one? So, the goal is not to have SEO crawlers hit your search interface and then generate answers for them to crawl. It would cost you a fortune if you let search engines hit the generative feature of your website. The answer is that it is dedicated to human understanding, not to web indexing. The goal of the feature is to answer humans, not crawlers. Okay, great.
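Here is one way the LLM-agnostic design Laurent outlined could look in code: the grounded prompt is built once, and only a thin backend adapter changes between a vendor-hosted model and a customer-hosted, fine-tuned one. The class names and endpoint URL are assumptions for illustration, not Coveo's API.

```python
# Sketch of an LLM-agnostic backend adapter, under the assumptions above.

from abc import ABC, abstractmethod

class LLMBackend(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class HostedModel(LLMBackend):
    """Off-the-shelf model running on the vendor's own infrastructure."""
    def __init__(self, name: str):
        self.name = name
    def complete(self, prompt: str) -> str:
        return f"[{self.name}] answer for: {prompt[:40]}..."

class CustomerHostedModel(LLMBackend):
    """Customer-premise model, possibly fine-tuned on their own content."""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
    def complete(self, prompt: str) -> str:
        return f"[model at {self.endpoint}] answer for: {prompt[:40]}..."

def answer(prompt: str, backend: LLMBackend) -> str:
    # The retrieval and grounding layer is identical regardless of backend.
    return backend.complete(prompt)

print(answer("Grounded prompt ...", HostedModel("gpt-4")))
print(answer("Grounded prompt ...", CustomerHostedModel("https://llm.customer.example")))
```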
Another question: it sounds like you will host your own large language model. Will your private LLM be trained with all of your customers' indexed content, or will there be an element of segmentation, so that, for example, competitor information doesn't show up in your results if you're both using Coveo? I'll take this one; the short answer is no. The LLMs that we're running in our own infrastructure are off-the-shelf ones. Currently we're using OpenAI's GPT-4 and GPT-3.5, and we are not fine-tuning or training them. It's a generic model. If we were to do that for one single customer, it would be a separate thing, a specific engagement for that customer. So no content from any customer will be used to fine-tune or train the models we're running in our own infrastructure.

Okay, great. Next question: do I need to change my content architecture to get the kind of results you showed in the demo? I can take this one, and I'm very pleased to say that it's actually one of the simplest models to configure that we've released in the past few years. Coveo takes care of extracting the right snippets from your documents. You don't need to clean them up, as the technology is resilient to weird formatting. We'll find the relevant snippets and then use them in the generation. Obviously, your content needs to be of good quality, and if you have content that you don't want used for generation, you need to exclude it from your index or from the source you're using behind the scenes. But in terms of preparation, there are no additional steps beyond the classic Coveo search experience.

Perfect. And there's a similar question: is there a way to tag documents that would not be appropriate for summarization? I guess, Vince, building on your answer, an example would be a critical document that contains medical or safety language; if we don't want to include that, is there a way to exclude it? Yes, same mechanism as before: you can filter it out in your query pipeline for a given interface, or you can add security on top of it so a normal user never gets access to that document. The same search mechanisms we've been promoting for years are still applicable here. And if I may add, Bonnie: what we currently believe is that if a document is good enough to appear in a result list for you, it's good enough to be part of an answer, which is basically a summary of the most relevant snippets that were shown to you. Down the road we may add an additional mechanism where a document is good enough for a result but shouldn't appear in the answer. That would be fairly easy to add, but at this point we don't think it's required.

Okay. Another question: could you be more specific about how the generated answers are personalized? It seems like the answers would be the same regardless of who makes the query. That's a great question. Search results are individualized and personalized for you: you have security, you have personalization, you have relevance. When Vince is searching and when I'm searching, we don't see the same thing; depending on my click journey on the site, I may see different things for the same query than Vincent does. That output, that result list, is what feeds the large language model to provide the answers in real time. We're building a prompt to the large language model based on those results that are specific to me, personalized for me, and secured for me. So the answer is for me.
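A hypothetical illustration of that personalization point: the same query produces a different ranked result list per user (here driven by past clicks), and the prompt is built from that list, so the generated answer differs per user. The boost logic and document set are invented.

```python
# Toy sketch: per-user ranking yields per-user grounded prompts.

DOCS = ["Getting started with Atomic", "Headless for Salesforce", "Partner org setup"]

def personalized_ranking(query: str, click_history: list[str]) -> list[str]:
    """Rank documents, nudging upward anything resembling past clicks."""
    def score(doc: str) -> int:
        overlap = len(set(doc.lower().split()) & set(query.lower().split()))
        boost = sum(1 for c in click_history if c.lower() in doc.lower())
        return overlap + boost
    return sorted(DOCS, key=score, reverse=True)

def prompt_for(query: str, click_history: list[str]) -> str:
    top = personalized_ranking(query, click_history)[:2]
    return f"Context: {top}\nQuestion: {query}"

# Same query, two click journeys, two different grounded prompts:
print(prompt_for("getting started", ["partner"]))
print(prompt_for("getting started", ["salesforce"]))
```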
Obviously, if we do this on a small public documentation site, the queries may not be that personalized. But in the example Vincent showed, as soon as you log yourself in as a partner, Vandelay starts seeing results that are only visible to him. That's where this personalization and relevance is critically important, we believe.

Okay. And we have a question from a customer who is in the process of rolling out the traditional recommended results feature: will this be replaced by the new generative answering feature? Recommended results, which is Smart Snippets if I recall correctly: no, both will still exist. We think Smart Snippets comes with very valuable features, such as People Also Ask and follow-up questions, that we're going to keep using in our rollout. As for the generative part, you can see generative answering as a successor to the output of Smart Snippets, but how that will happen isn't defined yet.

Alright. For ecommerce: if my machine learning model is firing on all cylinders and my products are set up correctly, what additional value does generative answering provide? I'll take this one. Some customers have seen a portion of their users, currently not a large portion, who, instead of asking short queries and navigating, ask a longer query like the examples I gave: "I'm in law school, I live in Canada, and I'm going to study in Italy. What's the best laptop for me?" For those kinds of queries, even if they're a small number right now, we could do a better job, and they would benefit from this kind of technology. We don't know yet to what extent, but we expect the volume of these kinds of queries to grow, which is why we want to provide better support for them. Makes sense.

Alright. How effective is the AI for deeply technical topics, such as hardware part numbers, schematics, chipsets, things like that? I'd say that right now we've tested it on the Coveo docs, which are quite technical. We have code snippets, a bunch of different products, and concurrent documentation: Coveo for Salesforce and Coveo for Sitecore are two products that talk about the same topics in their own areas of expertise. And right now we've had very good success with it. Obviously we're going to fine-tune this for specific verticals in the future, such as the medical industry, high tech, or complex manufacturing, but the first approach is generic question answering, and we see it perform very well. As long as your content is good-quality content, we should be able to figure it out.

Great. Another question: is there a capability to help the snippet detection by adding known structures to the content, such as known question-and-answer pairs? I'd say hiding questions in the content might be confusing, but if you have very structured content, the model will respect it and chunk it accordingly. So the more structured the content, the better it is for us. Still, we're able to go through unstructured content like PDFs or large pages without any problem.
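To illustrate why structured content chunks well, here is a toy heading-aware chunker: each heading and its body become one retrievable snippet, instead of a fixed-size window that straddles sections. This is purely illustrative; Coveo's real snippet extraction is not public in this form.

```python
# Toy heading-aware chunker for structured content.

def chunk_by_headings(markdown: str) -> list[str]:
    """Split a document at lines starting with '#', keeping each heading
    together with its own body text as one retrievable snippet."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Trial orgs\nShort-term sandbox ...\n# Partner orgs\nFull-featured ..."
for snippet in chunk_by_headings(doc):
    print("---\n" + snippet)
```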
Okay, a few more questions here. As an enterprise operating in the service arena, should I have any apprehensions about the answers generated by Coveo? Well, I think that if the security is set properly and the search results are relevant and personalized, the odds of having a hallucination in the answer are not zero, they're never zero, but they will be very, very low. And we're also putting mechanisms in place to gather feedback on the quality of the answers from the users. So we're very confident about the approach, and we think it's the best way to reduce the hallucination problem to close to zero and increase the credibility and the relevance of the answers.

Okay, great. And there's a question related to the Coveo Insight Panel, which is accessible within the agent workspace: can we implement generated answers in the Insight Panel? Down the road, we can assume that once the capability is there, we can roll it out on any Coveo search interface. It just depends on how many queries you get from those interfaces, and whether you're willing to take on that additional feature there. But yes, we can assume this could be rolled out in the agent panel, on a website, or even in the workplace, whatever you want in the future. Right now we've scoped it down to service, but we can assume this is something that will eventually be rolled out at large.

Okay, lots of questions here; let's take just a couple more. How do you verify the correctness of the answers provided? This is a good one. There is a lot of science going on initially to make sure we're able to chunk snippets correctly and get the right thing. However, user feedback is still very important. Like a traditional search interface: if you search something on Google and you don't like it, you may click somewhere else or discard the result. Here we have a feedback button at the top, thumbs up and thumbs down, but we also process the feedback of the documents clicked after the search, to make sure the most relevant snippets are also the ones people interact with in the search results. So there's a feedback loop based on user behavior, and that's pretty much how we evaluate it at scale.

Great. And there's a question that's very similar, so maybe you've already answered this, but I'll go ahead and ask: does the generative AI improve over time? Does it somehow incorporate feedback over time to improve the answers generated? So, yes, it will get better over time as more users use it. What we also realized using it on our own documentation website is that it will highlight weaknesses in your documentation. For some questions we got from our internal folks, we didn't have the answer in the docs. We prevent hallucination, but sometimes the model won't have enough confidence, and when that threshold is not hit, the model just shuts up; it won't give any answer. That's when we realized we actually had content gaps. So it improves on both ends, I'd say: user feedback, but also better content through identification of content gaps at scale. Great, makes sense.
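A sketch of the two quality mechanisms just described: a confidence threshold below which no answer is shown (which is how content gaps surface), and a feedback log combining explicit thumbs with post-answer clicks. The threshold value, field names, and in-memory storage are all assumptions.

```python
# Sketch of confidence gating plus a behavioral feedback loop.

CONFIDENCE_THRESHOLD = 0.7
content_gap_log: list[str] = []
feedback_log: list[dict] = []

def maybe_answer(query: str, answer: str, confidence: float) -> str | None:
    """Only surface the answer when the model is confident enough;
    suppressed queries are logged as candidate documentation gaps."""
    if confidence < CONFIDENCE_THRESHOLD:
        content_gap_log.append(query)
        return None  # the UI simply shows classic search results instead
    return answer

def record_feedback(query: str, thumbs_up: bool, clicked_cited_doc: bool) -> None:
    # Both explicit (thumbs) and implicit (clicks on cited documents)
    # signals feed the evaluation loop.
    feedback_log.append({"query": query, "thumbs_up": thumbs_up,
                         "clicked_citation": clicked_cited_doc})

print(maybe_answer("obscure internal question", "draft answer", confidence=0.41))  # None
record_feedback("how does relevance work", thumbs_up=True, clicked_cited_doc=True)
print(content_gap_log, feedback_log)
```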
A couple more questions before we wrap. This one is related to product content: we have versioned product content that is indexed, and generating an answer with mixed versions could create hallucinations or incorrect answers depending on the user's product version. We currently provide facets so that users can filter those results; will there be a better way to manage this versioning scenario, especially if users don't select a facet? I like this one, and it's getting me all excited. So yes, you can have facets to select the right versions, but Coveo also has the capacity to automatically detect and select facets for you, so you can combine generative answering with what we call DNE auto-selection. We'll find the right version and select that facet for you, the snippets will come back scoped, and the answer will be scoped as well. And don't forget, this sits on top of everything we've built, which is already very strong, so I suspect we're going to have fun with these use cases.

Great. Alright. So, Laurent, you mentioned that Coveo will be LLM-agnostic; how does Coveo work with LLMs if it's LLM-agnostic? So the secret recipe is the prompt engineering. At some point, LLMs will reach a level of competency in prompt ingestion that will allow us to send the same prompt to different LLMs. And as I said, we're likely going to have one, two, maybe three LLMs on our own infrastructure, because they have specific characteristics depending on the use case: some may be more expensive, some may be faster, and so on. So we expect to run multiple LLMs on our own infrastructure. But where it becomes really important is when we have the ability to connect to large language models hosted on our customers' premises. Those large language models may be trained or fine-tuned on their content, and we have to be able to use those as well to provide a great answer.

Okay, great. We're close to wrapping up, and there are several questions related to cost and timelines. Laurent, is there anything you can share today around costs or timelines for these features? Yes. On timelines, as we announced last week, we're starting a beta phase with select customers in the coming weeks. Then we're going to announce early-adopter pricing for those who want to enter a pilot program this fall, and we'll go from there. If all goes as expected, by the end of the year we will have a GA for this. And regarding pricing, we'll make announcements about early-adopter pricing in the coming weeks.

Okay, great. More details to come, then. Thank you so much, Laurent and Vincent, for your wonderful overview of Coveo Relevance Generative Answering, and thank you everyone for attending. If you asked questions that we weren't able to get to today, we'll follow up with you via email. Thank you so much, and have a great rest of your day. Thank you. Thanks. Bye-bye.