The last 12 months have had many fits and starts—perhaps not surprising after such a bleak 2020. One area that really awoke from slumber was ecommerce research. In fact, AI journal publications have witnessed double-digit growth over the past few years. 

But it’s not just the volume of published research that has been outstanding. We looked at the research published in the most prestigious venues (like KDD, WSDM, NAACL, RecSys, WWW, CIKM, and SIGIR), and saw that the quality of these papers was impressive, too. 

And research in ecommerce AI was no exception. Papers covered many critical topics, such as biases, interpretable machine learning, and comparison recommenders. 

To catch you up, we have prepared this curated list of the best papers published this year.

The Top 10 Ecommerce Research Papers of 2021

So how much does the time of year actually impact sales?

Amazon’s scientists explored the concept of seasonal relevance in a paper presented at the Conference on Information and Knowledge Management (CIKM ’21).

The authors assert that 39% of queries are seasonally relevant, and that 42% of total purchases in a year are made following those queries. 

The trouble of divining shopper intent
Blog: Is Semantic Search Enough for Ecommerce?

Need an example? Think of a query such as “jacket.” A summer jacket is typically much different from a winter one. The authors present ways to identify seasonality in queries and products—and define features that capture it.  
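To make the idea of a seasonality feature concrete, here is a minimal sketch. It is not the method from the paper: the search logs, the function names, and the scoring rule (the share of searches in the peak month minus a flat 1/12 baseline) are all assumptions chosen for illustration.

```python
from collections import Counter


def monthly_share(months):
    """Return the fraction of a query's searches that fall in each month (1-12)."""
    if not months:
        raise ValueError("need at least one search to compute shares")
    counts = Counter(months)
    total = sum(counts.values())
    return [counts.get(m, 0) / total for m in range(1, 13)]


def seasonality_score(months):
    """Peak-month share minus the uniform share (1/12); 0 means flat demand."""
    return max(monthly_share(months)) - 1 / 12


# Hypothetical search logs: the month in which each search occurred.
parka_searches = [11, 12, 12, 1, 1, 1, 2, 12]   # clustered in winter
socks_searches = list(range(1, 13))             # one search every month

print(seasonality_score(parka_searches) > seasonality_score(socks_searches))
```

A score like this could be one input among many to a ranking model. A real system would also need to account for overall traffic trends and for queries with very few searches.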

We’re excited to see more research into this topic. On multiple occasions, we’ve argued that ecommerce search is sui generis. Online shopper queries often don’t provide enough context to truly understand intent, and seasonal relevance nicely illustrates this point. 

Interested in the topic of seasonality? Don’t miss our post discussing how local climate influences consumer behavior.

2. Theoretical Understandings of Product Embedding for E-commerce Machine Learning 

How do you tell the difference between a men’s work shoe and a women’s running shoe? Many might say the difference is obvious, even inherent. But there are many structural and visual cues that aren’t easily identified by machines. 

That is the gist of the second paper in our list, which was presented at WSDM 2021 and comes from researchers at Walmart and Instacart.

Product embeddings have become a cornerstone for a considerable number of machine learning models in ecommerce. Yet, the authors argue that our understanding of product embeddings is inadequate, especially when compared with traditional techniques like collaborative filtering. 

To this end, they provide theoretical insights to answer what product embeddings are, and how they are unique to ecommerce.

This is a fine piece of research that resonates well with our recent work leveraging product embeddings. Over the past years, we’ve educated many about product embeddings, explaining why they are so successful in ecommerce. It’s certainly good news that other researchers have realized the urgent need for more theoretical work on this. 

Add relevant results, rinse, repeat
Blog: Clothes in Space: Real-time search personalization in less than 100 lines of code

3. “Are you sure?” Preliminary Insights from Scaling Product Comparisons to Multiple Shops

One successful ecommerce strategy used by the likes of Amazon and Home Depot is the comparison recommender, a special type of recommender system. Its comparison tables show shoppers the differences between similar or related products. Comparisons are proven to help tip the scales for shoppers who are looking to rationalize purchases. 

However, building product comparison tables at scale is a difficult chore for mid-level companies. The tables require pre-existing training and taxonomy data, and obtaining that data remains an open challenge. As a result, these recommendation types are still very rare in ecommerce, and only a limited number of relevant papers are available. 

That’s what makes this third paper, presented at SIGIR and from our own Coveo lab, a must-read. To tackle this problem, our researchers present preliminary results from building a comparison pipeline designed to scale in a multi-shop scenario. 

By using a mix of behavioral and catalog data, along with an extensive investigation of feature importance, the Coveo team found interesting insights while building comparison tables in a multi-shop scenario. The authors describe their design choices and run extensive benchmarks on multiple shops to stress-test the pipeline.

Interestingly, the paper also leverages Coveo’s new MLOps stack (including Metaflow for research and production ML pipelines). It also includes an MTurk-based user study as offline validation for some of the modelling choices.

Would shoppers trust your recommendations, and perhaps be more likely to buy the recommended products, when search is explainable? For example, consider presenting a product page with an explanation like “we selected this product for you because you often buy products from brands such as River Island.”

The fourth paper in our list explores this idea and was presented at CIKM 2021.

Representing a contribution from academics rather than industry practitioners, the authors propose an explainable product search model with model-intrinsic interpretability. We believe this makes a contribution that is both timely and highly relevant to increasing concerns on the transparency and accountability of AI systems. 

Simply put, explainability in this context refers to the ability of a product retrieval system to provide explanations that allow shoppers to understand, trust, and effectively control the retrieved products.

Interestingly, previous studies have shown that providing recommendations with explanations not only increases conversion rates, but also improves user satisfaction on ecommerce websites. 

However, the authors rightly point out that despite the extensive studies on explainable product recommendation, the effectiveness and potential of explainable product search are mostly unexplored. And while the paper offers a model for interpretable product search, that model would still need to be validated in real ecommerce settings. 

In the meantime, learn about effective product recommendation placement.

5. Query2Prod2Vec: Grounded Word Embeddings for eCommerce

Pop the term ‘Nike’ into a search box, and you’ll (hopefully) get back search results full of shoes. (Assuming that’s what you were looking for.) It’s not a lot of information to work with, but the top 20% of ecommerce queries that drive traffic are this generic.

(Or sometimes even more so—try narrowing down a set of products with just the term ‘shoe.’)

The fifth paper in our list tackles this problem. It explains how product vectors provide accurate lexical representations for words such as “Nike” and “shoes” through grounding.

Grounding is a fancy term highlighting the fact that “Nike” and “shoes” refer to actual objects in a catalog, and not mere strings of symbols. The focus of the work is on short queries and, in particular, on the efficiency gain that Coveo’s method provides compared with traditional NLP approaches based on co-occurrences.
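As a rough illustration of grounding, the toy sketch below represents a query as the average of the vectors of the products shoppers clicked after issuing it. This is a simplified reading of the idea, not the paper's actual pipeline: the two-dimensional product vectors, the click log, and the helper names are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors, component by component."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical product vectors (real ones would have many more dimensions).
product_vectors = {
    "running_shoe": [0.9, 0.1],
    "sneaker": [0.8, 0.2],
    "tent": [0.1, 0.9],
}

# Hypothetical click log: query -> products clicked after that query.
clicks = {
    "nike": ["running_shoe", "sneaker"],
    "shoes": ["sneaker", "running_shoe", "sneaker"],
    "camping": ["tent"],
}

# Ground each query in the catalog by averaging its clicked products.
query_vectors = {
    query: mean_vector([product_vectors[p] for p in products])
    for query, products in clicks.items()
}

print(cosine(query_vectors["nike"], query_vectors["shoes"]))
print(cosine(query_vectors["nike"], query_vectors["camping"]))
```

In this toy data, “nike” ends up much closer to “shoes” than to “camping” because the two queries lead to the same catalog items, not because the words co-occur in text.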

The paper was the result of a successful industry-academia collaboration between researchers from Bocconi University and Coveo’s AI team, and it won the best industry paper award at NAACL 2021.

Importantly, while retail giants indubitably played a major role in moving ecommerce use cases to the center of NLP research, this paper presented at NAACL contributes to finding solutions that address a larger portion of the market—an exciting agenda of its own. 

If you like the grounding idea, check out our second NAACL paper, which develops the intuition further and provides some theoretical contributions at the intersection of dense and set-theoretic semantics.

How well-versed are you on the cold-start shopper problem?
Blog: ROI on Personalization for Cold-Start Shoppers

6. Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation

As the title suggests, this next must-read paper focuses on cold start recommendation problems. Collaborative filtering and deep learning-based models have achieved considerable success in recommendation systems. Yet, it is difficult to provide recommendations for new users or items that have sparse interactions.

In this paper, presented at CIKM 2021, Alibaba’s researchers acknowledge that traditional solutions (e.g., those that introduce additional content such as descriptions or photos) have achieved some success. Yet, they point out that these solutions only alleviate the cold-start problem from a short-term viewpoint, which can harm the long-term rewards that shape a product’s performance throughout its “life period.” 

The authors argue that Reinforcement Learning provides a natural, unified framework that maximizes the instant and long-term rewards jointly and simultaneously. To this end, they offer an approach based on Reinforcement Learning that addresses these issues. 

This is an interesting piece that discusses two topics we are very much interested in at Coveo, namely Reinforcement Learning and cold start problems.

Our scientists have been applying techniques based on Reinforcement Learning to improve the experience of product discovery, so we are pleased to see further interesting research on Reinforcement Learning applications in information retrieval. Coveo has also researched cold-start problems extensively, so it is exciting to see a flurry of new work on the topic.

Amplify the long tail
Blog: Wag the Tail: Getting the Most Out of Your Product Catalog

7. Understanding Multi-Channel Customer Behavior in Retail

Online shopping skyrocketed during the pandemic and changed the way we consume products and services. But do multiple shopping channels affect shopper behavior? If so, how?

The seventh published paper in our list provides interesting insights into multi-channel customer behavior that have decisive implications for the field of information retrieval. Presented at CIKM 2021, it is the output of industry practitioners at Airlab and researchers at the University of Amsterdam.

While there are numerous studies examining user behavior in online shopping platforms, little is known about multi-channel customer behavior. The paper tries to fill this gap and argues that understanding multi-channel customer behavior is crucial for ML tasks such as recommending products and predicting purchases.  

The authors focus on investigating the Next Basket Recommendation (NBR) task. In this case study, the goal is to predict items that customers will buy in the future, given their previous shopping history.

Prior NBR research relies on data from a single shopping channel only and does not consider multi-channel settings. The authors try to remedy this shortcoming.
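For readers new to the task, a very simple NBR baseline can be sketched as follows. It is not the approach studied in the paper: it just recommends the items a customer has bought most often, and the shopping history, function names, and recall metric are illustrative assumptions.

```python
from collections import Counter


def predict_next_basket(history, k=3):
    """Recommend the k items that appear in the most past baskets.

    history: list of past baskets (each a list of item ids), oldest first.
    """
    counts = Counter(item for basket in history for item in set(basket))
    # Sort by frequency (descending), then by item id for a deterministic order.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:k]


def recall_at_k(predicted, actual):
    """Fraction of the items in the actual basket that were predicted."""
    actual_items = set(actual)
    if not actual_items:
        return 0.0
    return len(set(predicted) & actual_items) / len(actual_items)


# Hypothetical purchase history for one customer.
history = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["milk", "bread", "apples"],
]
next_basket = ["milk", "bread", "coffee"]

prediction = predict_next_basket(history, k=2)
print(prediction, recall_at_k(prediction, next_basket))
```

A baseline like this can only recommend items the customer has bought before, so how well it works depends on how often customers repeat purchases.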

They analyzed a sample of 2.8 million transactions from 300,000 customers gathered from a food retailer with multiple physical stores.

They found significant differences in customer behavior across online and offline channels. This included larger online baskets and a greater tendency for online-only customers to purchase previously-bought products.

Struggling to connect physical and digital?
Blog: 3 Reasons Why Your Omnichannel Retail Strategy Is Failing

While the research is based on a specific retail vertical (i.e., food) and might not generalize to other verticals (e.g., apparel or fashion), the general plea for more work on trying to understand multi-channel behavior and incorporate insights into information retrieval tasks and systems is a valid and interesting one, which we hope will inspire further research.

8. SIGIR 2021 E-Commerce Workshop Data Challenge

Understanding that most ecommerce organizations are not Amazon, Alibaba, or Walmart-sized, Coveo’s AI team went above and beyond in 2021. They organized a data challenge that focused on mid-sized ecommerce shops.

Since the majority of research papers typically tackle datasets with constraints that apply only to Amazon-like shops, there’s very little that the industry as a whole can take home from the innovation of the last few years. 

The paper announced the release of the richest anonymous dataset ever for ecommerce research. The dataset comprises three main files for training: browsing interactions, search interactions and product metadata. All files are available for download, and all training data comes from sampling several months of interactions in the life of a midsize shop with Alexa Ranking between 25k and 200k. 

Since the dataset is based on a midsize ecommerce shop, it is a representative example of thousands of shops. This helps remedy a shortcoming of much ecommerce research. 

The release of this dataset finally allows midsize players to benefit from research that is relevant to them.

Learn how we're applying small data research to real-world problems
Blog: Reframing the Small Data Problem With Philosophy, Linguistics, and AI

9. Transformers with multi-modal features and post-fusion context for e-commerce session-based recommendation

The ninth paper in our list was based on research submitted as part of the Coveo Data Challenge and conducted by NVIDIA researchers. 

Starting in April 2021, the challenge ran for two months of intense competition. Starting from the released dataset, solutions addressing two tasks were accepted: 

  1. A cart-abandonment task, where, given a session containing an add-to-cart event for a product X, a model is asked to predict whether the shopper will buy X or not in that very session.
  2. A session-based recommendation task, where a model is asked to predict the next interactions between shoppers and products, based on the previous product interactions and search queries within the session.

NVIDIA’s researchers won with a solution using state-of-the-art transformer models for the recommendation task. The solution achieved great performance through some careful design choices and a final ensemble. Aside from the quantitative and modelling considerations, the authors went above and beyond with interesting qualitative analysis on popularity, metadata, and intent shifting in shopping sessions. (They have their own blog post, too!) 

However, all of the papers submitted as part of the Coveo Data Challenge were very strong, and you can find a brief overview of the contributions here.

Get all the details
Blog: The Coveo Data Challenge: Aftermath of SIGIR Ecom 2021

10. Towards Unified Metrics for Accuracy and Diversity for Recommender Systems

The last paper in our list was presented at RecSys and focuses on the topic of recommender systems evaluation, which has become increasingly important in recent years. 

Accuracy is the traditional benchmark used when assessing the worth of one method over another. Our authors, a combination of Google researchers and academics, stress that recommendation diversity and novelty are also frequently recognized as critical to users’ perceived utility. 

To address this, the paper suggests evaluating recommenders with a metric commonly used for comparing search systems. Their suggestion combines topical diversity and accuracy, and is an interesting step in the right direction. 
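To show how a single number can reward both accuracy and topical diversity, here is a minimal sketch of α-DCG, a well-known diversity-aware metric from search evaluation. Treating it as representative of the paper's proposal is an assumption, since the exact metric is not named here. The sketch is also unnormalized, and the topic labels and example lists are hypothetical.

```python
import math
from collections import defaultdict


def alpha_dcg(ranked_topics, alpha=0.5):
    """Unnormalized alpha-DCG for one ranked list.

    ranked_topics: for each recommended item, in rank order, the set of
    topics it is relevant to (an empty set means the item is not relevant).
    alpha: how strongly repeated topics are penalized (0 = no penalty).
    """
    seen = defaultdict(int)  # how many earlier items already covered each topic
    score = 0.0
    for rank, topics in enumerate(ranked_topics, start=1):
        # Each topic's gain shrinks by (1 - alpha) every time it was seen before.
        gain = sum((1 - alpha) ** seen[t] for t in topics)
        score += gain / math.log2(rank + 1)
        for t in topics:
            seen[t] += 1
    return score


# Two hypothetical lists, all items relevant, differing only in topical variety.
redundant = [{"shoes"}, {"shoes"}, {"shoes"}]
diverse = [{"shoes"}, {"jackets"}, {"hats"}]

print(alpha_dcg(diverse) > alpha_dcg(redundant))
```

With `alpha=0` the redundancy penalty disappears and both lists receive the same score, which shows how the parameter trades accuracy against diversity.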

Evaluation is also an active topic for us: Coveo researchers recently released RecList, an open source package for behavioral tests that is designed to help scale up testing and build systems that can be trusted. 

While it is clear that recommendations need to be relevant, assessing the performance of recommender systems in practice is by no means easy. This paper is an important contribution that resonates well with some of the research we have been conducting at Coveo. 

In the market for a new recommendation engine?
Ebook: 6 Most Popular Recommenders to Entice Shoppers

What This Ecommerce Research Means for Retail Leaders

Much uncertainty still haunts digital commerce providers as we head into year three of COVID-19. But one thing that has become increasingly clear since the beginning of the pandemic is the importance of digital transformation, and how quickly it is accelerating, whether or not a company’s transformation is already underway. 

As companies look to the future, they should solidify digital foundations and ecosystems to meet near-term demands while building long-term resilience and flexibility. Our summary of the best AI research of 2021 strives to give companies confidence in their technology investments. As we have seen, the depth and breadth of 2021’s ecommerce AI research papers are quite inspiring. Our list is just the tip of the iceberg, and it points to a number of features, techniques, and datasets that can help companies lead their innovation efforts. 

Let us know what your favorite papers have been using the hashtag #BestEcommerceAIpaper on Twitter.

