MICES 2026: Lessons Learned from Berlin’s Ecommerce Search Community

Search is no longer just a box on a page. It’s becoming the infrastructure behind product discovery. That was one of the clearest signals to emerge from MICES 2026, Berlin’s annual gathering of ecommerce search practitioners, where Coveo was proud to participate as a speaker.

The organizers closed the day by calling out four key themes that kept resurfacing:

Search (re)platforming
Searchandising and business rules
Conversational search
Emerging agentic search and search optimization

Whatever the topic (semantic retrieval, hybrid search, sparse vectors, conversational experiences, or coding agents), speakers kept circling back to the same idea: search is evolving from a standalone search box into a broader product discovery platform. Here’s what stood out, talk by talk.

Search, Chat, Ockham: When Two Interfaces Are One Too Many (Coveo)

Chatbots have gone mainstream on both desktop and mobile. At the same time, search itself has become more conversational, handling complex queries and returning both products and direct answers. This raises an important question: do shoppers really need both search and chat?

In our own talk, we explored whether maintaining parallel search and chat experiences always creates value. There’s one shopper, one intent, and one query, yet often two separate interfaces. If those experiences are very similar, one risks becoming redundant. If they’re substantially different, they can introduce confusion rather than clarity.

We argued the future lies in three areas:

Better understanding and refinement of shopper intent
Adaptive interfaces that render the right experience for the context
Continued focus on latency and responsiveness

The core argument was simple: product discovery should be treated as a single intelligence rather than two independent systems that happen to share a catalog.

Precision vs. Recall: When Good Enough Beats Perfect (OTTO)

Sometimes the most useful finding is a negative result. OTTO, one of Germany’s largest ecommerce retailers, set out to improve precision and discovered something surprising: it didn’t matter. They measured conversion, filter usage, scroll depth, and reformulation rates, and precision gains barely moved any of them.

OTTO’s journey began after introducing semantic retrieval into a previously lexical search stack. While the added fuzziness improved recall and conversion, it also raised difficult questions about how precision should be measured in modern ecommerce search.

The explanation was simple but powerful: query does not equal intent. The same search can come from shoppers who are exploring, comparing, or ready to buy. Products that look irrelevant from a strict retrieval perspective may still help users discover what they need.

The broader lesson: search shouldn’t be viewed solely as a retrieval problem. In many cases, product discovery matters more than exact query-product matching.
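As a reference point for the trade-off discussed in this talk, here is a minimal sketch of precision and recall at a cutoff k. The product ids, the relevance set, and the two result lists are made up for illustration; they are not OTTO’s data, and the talk did not specify how OTTO computes these metrics.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision@k, recall@k) for one query.

    precision@k = relevant items in the top k / items actually shown in the top k
    recall@k    = relevant items in the top k / all relevant items
    """
    top_k = ranked_ids[:k]
    hits = sum(1 for pid in top_k if pid in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Hypothetical judgments: four products are relevant to the query.
relevant = {"p1", "p2", "p3", "p4"}

# A strict lexical result list: few results, all of them relevant.
lexical = ["p1", "p2"]
# A fuzzier semantic result list: more relevant items found, plus some noise.
semantic = ["p1", "p2", "p3", "p9", "p8"]

lexical_scores = precision_recall_at_k(lexical, relevant, k=5)
semantic_scores = precision_recall_at_k(semantic, relevant, k=5)
```

In this toy case the semantic list has lower precision but higher recall than the lexical one, which is the shape of trade-off the talk described. Whether the extra, less precise results help or hurt is exactly what OTTO measured through behavioral metrics rather than through these offline numbers alone.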

The Journey to Semantic Search in Omnichannel Retail at dm-drogerie markt

dm-drogerie markt, Germany’s leading drugstore chain, shared its journey from traditional keyword search to production-ready semantic retrieval across a catalog of around 22,000 products and roughly two million daily searches spanning 14 countries. The goal wasn’t to replace keyword search outright, but to improve relevance, reduce zero-result queries, and better capture shopper intent at scale.

Rather than treating semantic search as a silver bullet, dm approached the challenge incrementally. Their evolution included guardrails, business signals, attribute filtering, two-stage retrieval, cross-encoder reranking, and ultimately fine-tuning multilingual E5 embeddings on clickstream data, a stack that will be familiar to anyone working in modern retrieval.
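To make the two-stage pattern concrete, here is a minimal sketch: a cheap first stage narrows the catalog to a few candidates, and a more expensive scorer reorders only those candidates. Everything in it is an assumption for illustration. The catalog is invented, the first stage is plain token overlap, and toy_pair_score is a simple Jaccard stand-in for a cross-encoder, not dm’s model or pipeline.

```python
def first_stage(query, catalog, top_n):
    """Cheap candidate generation: rank products by number of shared tokens."""
    query_tokens = set(query.lower().split())
    scored = []
    for pid, text in catalog.items():
        overlap = len(query_tokens & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, pid))
    # Highest overlap first; tie-break on product id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:top_n]]


def toy_pair_score(query, text):
    """Stand-in for a cross-encoder: Jaccard similarity of the token sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def rerank(query, candidates, catalog, score_pair):
    """Second stage: score each (query, product) pair and reorder candidates."""
    return sorted(candidates, key=lambda pid: -score_pair(query, catalog[pid]))


# Hypothetical mini-catalog.
catalog = {
    "a": "shampoo for dry hair 300 ml",
    "b": "dry shampoo",
    "c": "toothpaste mint",
}

query = "dry shampoo"
candidates = first_stage(query, catalog, top_n=2)
results = rerank(query, candidates, catalog, toy_pair_score)
```

The design point is that the expensive scorer only ever sees top_n candidates, so its cost stays bounded regardless of catalog size. In a production system the score_pair argument would be replaced by a real pairwise relevance model.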

One particularly interesting finding: retaining the full clickstream dataset, including noisy interactions, outperformed aggressively curated training sets. Their experience reinforced a recurring theme of the conference, that most search gains come from the overall relevance system, not any individual model. For teams managing catalogs at this scale, that’s a useful reminder that architecture and data discipline tend to outweigh any single algorithmic upgrade.

Inside Zalando Search: Architecture Behind Product Discovery at Scale

Zalando, Europe’s leading online fashion marketplace, presented search as a product discovery platform rather than a standalone feature. The same infrastructure powers search, browse, personalization, promotions, sponsored products, experimentation, and the Zalando AI Assistant.

What stood out was the architecture’s separation of concerns. Retrieval generates candidates, orchestration coordinates business logic and experimentation, and ranking determines the final result set. Search, browse, and conversational experiences become different entry points into a shared discovery platform rather than isolated systems.

The takeaway: as catalogs and customer touchpoints multiply, modern search architecture increasingly serves as the foundation for a wide range of experiences, not just the search results page.

Fine-Tuning Sparse Neural Retrievers for Ecommerce Is Not That Scary (And Often Worth It) (Qdrant)

Qdrant made the case for an unfashionable idea: sparse retrieval still has a place in modern search architectures. BM25 itself is a sparse retrieval approach, while sparse neural models such as SPLADE extend that paradigm with semantic term expansion while preserving the advantages of inverted indexes, including efficiency and explainability.

The presentation showed that retrieval quality depends less on choosing a fashionable architecture and more on data quality, hard-negative mining, and disciplined evaluation. Fine-tuning SPLADE on ecommerce data delivered substantial improvements on the Amazon ESCI benchmark.

Modern retrieval isn’t a contest between lexical and semantic approaches, but a combination of sparse retrieval, dense retrieval, and reranking.

How Much Searchandising Is Too Much Searchandising?

Longtime search consultant and industry advisor Charlie Hull explored the benefits and risks of searchandising: the use of rules, boosts, redirects, synonyms, and query rewrites to influence search outcomes.

These tools remain important, but organizations often accumulate thousands of rules that become difficult to govern, maintain, and explain. Drawing on examples from Rubix, Hull advocated for stronger governance, clearer ownership, and greater reliance on analytics and evaluation.

His central message, and one worth repeating to any team building out rules today, is that searchandising should support relevance rather than replace it. It’s a tactical capability, not a long-term search strategy.

Autoresearch: Coding Agents to Optimize Ecommerce Retrieval

Search relevance expert Doug Turnbull presented autoresearch, a self-improving workflow in which coding agents use relevance judgments and evaluation metrics to iteratively propose, test, and validate search improvements.

Rather than replacing search systems, these agents operate on top of existing retrieval and ranking infrastructure. Using the ESCI benchmark, the approach improved NDCG from 0.289 for a BM25 baseline to 0.453 using the strongest agent-generated configuration.
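For readers less familiar with the metric, here is a minimal sketch of how NDCG is computed for a single query. It assumes linear gains and a log2 position discount, and it assumes every judged product appears in the ranked list. The grades are hypothetical; the talk summary does not state the gain mapping or cutoff behind the 0.289 and 0.453 figures, so this will not reproduce them.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    return sum(gain / math.log2(index + 2) for index, gain in enumerate(gains))


def ndcg_at_k(gains_in_ranked_order, k):
    """NDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.

    Assumes the list holds the graded relevance of every judged product,
    in the order the search system returned them.
    """
    ideal = sorted(gains_in_ranked_order, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        return 0.0
    return dcg(gains_in_ranked_order[:k]) / ideal_dcg


# Hypothetical grades for one query: 3 = exact match, 0 = irrelevant.
baseline_ranking = [3, 0, 2]   # a relevant product is stuck at position 3
improved_ranking = [3, 2, 0]   # same products, better order

baseline_ndcg = ndcg_at_k(baseline_ranking, k=3)
improved_ndcg = ndcg_at_k(improved_ranking, k=3)
```

An agent loop of the kind described would average a score like this over many judged queries and keep a proposed change only if the average improves.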

The most important lesson: evaluation remains central. The metric doesn’t just define success; it also acts as a safeguard against overfitting and unintended changes.

Hybrid Search at idealo: From Fine-Tuning to Production

idealo, one of Europe’s largest price comparison and shopping platforms, shared lessons from operating search at massive scale: millions of products and hundreds of millions of offers.

Their architecture evolved from a heavily customized Lucene-based system into a hybrid stack combining BM25, vector retrieval, Learning-to-Rank, and Reciprocal Rank Fusion. The presentation kept coming back to training data quality: hard-negative selection and false-negative removal often produced larger gains than changing embedding models.
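Reciprocal Rank Fusion is simple enough to sketch in a few lines. The version below is the textbook formulation, in which each document scores the sum of 1 / (k + rank) across the lists it appears in. The constant k=60 is the commonly used default, and the two hit lists are invented; none of this is idealo’s actual configuration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Ranks start at 1. Documents missing from a list simply contribute
    nothing for that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; tie-break on doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top results from a lexical and a vector retriever.
bm25_hits = ["p1", "p2", "p3"]
vector_hits = ["p3", "p1", "p4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Because RRF only uses rank positions, it sidesteps the problem that BM25 scores and vector similarities live on different scales. Products found by both retrievers (p1 and p3 here) rise above products found by only one.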

LLMs played a role too, but primarily as tools for improving datasets and evaluation, not as replacements for traditional search components.

Why Your B2B Search Engine Doesn’t Understand Your Users (Adelean)

Maëlly Dubois argued that many B2B search challenges get misdiagnosed as ranking problems when the root causes lie elsewhere.

Using examples from PPE and industrial search, she showed how B2B queries frequently combine technical specifications, certifications, dimensions, industry jargon, and usage constraints. Failures can occur in query understanding, retrieval, synonym management, or product data quality long before ranking ever enters the picture.

Her diagnostic framework encouraged teams to treat zero-results pages and relevance failures as valuable signals, clues that help pinpoint exactly where the search journey is breaking down.

The Bigger Picture

MICES 2026 reinforced a familiar lesson for search practitioners: success rarely comes from a single model, algorithm, or interface. Instead, the strongest teams are combining query understanding, retrieval, ranking, governance, experimentation, and increasingly AI-driven optimization into cohesive product discovery platforms.

The future of search isn’t about replacing what’s already there. It’s about orchestrating it all effectively.
