The promise of automatic synonym detection is a key feature request of most search architects. After all, who wants to dig around in query logs trying to figure out all the ways people word their searches, or spend their time developing a synonym list?
Yet for some reason, after we advanced to the point of letting the machine do that work for us, we still bring in the humans to verify the words we have extracted.
This ability to accept or reject results might sound appealing at first. After all, you get to affirm (or not) whether one term is synonymous with another: for example, that “iWatch” and “Apple watch” are synonyms. And maybe you can reject that Airpods and iPods are synonyms.
Unfortunately, this is less of an opportunity and more of a burden that results in added costs and poor employee experience. Your merchandisers would rather focus on more creative work than be stuck with dull, repetitive tasks.
Automatic Synonym Detection … as Manual Process?
And as it turns out, other vendors don’t stop at offering up synonyms for approval. They offer the same “opportunity” for typo corrections, placing on merchandisers the heavy burden of approving or declining other machine-generated suggestions as well.
As you can imagine, this can result in hours and hours spent (or more accurately, wasted) by your team in judging the value of a given suggestion. And worse, they could be wrong.
This isn’t because your merchandisers and product managers lack knowledge of your catalog. They know it and the industry inside and out. However, the techniques typically employed offer overly simplistic solutions to really complex problems.
Word2Vec Still Needs Context
For instance, most vendors will use Word2Vec or WordNet to find synonyms. The latter is a database of English-language synonyms in which terms are grouped semantically. It is a good resource, but it falls short. As an example, it knows that “apple” is a fruit, but doesn’t know it is also a business.
Word2Vec is a computational model developed by Tomas Mikolov and his team at Google. It takes words and produces a vector (vec) of numbers for each one, also known as an embedding. By analyzing the mathematical similarities between those vectors, Word2Vec teaches a computer context by highlighting words that are “close to” other words.
But while this representation is a great start, it isn’t perfect.
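To make “close to” concrete, here is a minimal sketch of how closeness between word vectors is commonly measured, using cosine similarity. The three-dimensional vectors and the helper names (`EMBEDDINGS`, `cosine_similarity`, `nearest`) are invented for illustration; real Word2Vec embeddings are learned from large amounts of text and typically have hundreds of dimensions.

```python
import math

# Toy embeddings, made up for this example only.
# They are NOT output from a trained Word2Vec model.
EMBEDDINGS = {
    "iwatch":      [0.9, 0.1, 0.0],
    "apple watch": [0.8, 0.2, 0.1],
    "blender":     [0.0, 0.9, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is closest to `word`."""
    target = embeddings[word]
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(target, embeddings[w]))


if __name__ == "__main__":
    print(nearest("iwatch"))
```

Note that a static embedding of this kind stores a single vector per word, so every sense of a word such as “apple” collapses into one point. That is one reason the representation still needs context.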
Synonym Identification — Determined by Intent
Merchandisers can be asked if “pants” should be treated as “trousers.” What is the answer if you have both North American and British visitors? Or how about: is “mixer” a synonym of “blender”? Surely, in most cases. But a user who has already viewed a couple of DJ speakers during their session may be looking for a “USB mixer” instead.
This method of “supervised learning” gives the wrong impression that closing vocabulary gaps is just a matter of accepting or declining a suggestion for a rigid relationship. For example, it turns out that some shoppers inadvertently type “iPods” for “Airpods.” The question shouldn’t be: Is this a synonym (accept or reject)?
The question should be: what was the shopper’s intent? A human retail clerk would pick up the confusion immediately because when the shoppers pointed to their ears, she would understand the context of what they wanted.
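As a rough sketch of what “asking about intent” could look like in code, the snippet below resolves the ambiguous term “mixer” from the product categories a shopper has viewed in the current session. Everything here is hypothetical: the `CANDIDATE_INTENTS` table, the category labels, and the `resolve_intent` function are assumptions made for illustration, not a description of any vendor’s implementation.

```python
from collections import Counter

# Hypothetical table: ambiguous term -> {category: intended product}.
CANDIDATE_INTENTS = {
    "mixer": {
        "kitchen": "blender",
        "audio": "usb mixer",
    },
}

# Assumed fallback when the session offers no useful context.
DEFAULT_CATEGORY = {"mixer": "kitchen"}


def resolve_intent(term, session_categories):
    """Pick the meaning that matches what the shopper has been viewing.

    `session_categories` lists the categories of products viewed so far
    in this session, for example ["audio", "audio", "kitchen"].
    """
    candidates = CANDIDATE_INTENTS.get(term)
    if candidates is None:
        return term  # not an ambiguous term we know about

    # Count only the views that help choose between the candidates.
    views = Counter(c for c in session_categories if c in candidates)
    if views:
        category = views.most_common(1)[0][0]
    else:
        category = DEFAULT_CATEGORY[term]
    return candidates[category]


if __name__ == "__main__":
    print(resolve_intent("mixer", ["audio", "audio"]))  # DJ gear session
    print(resolve_intent("mixer", []))                  # no context yet
```

The same query produces different answers for different sessions, which is exactly what a fixed accept-or-reject synonym rule cannot express.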
These recurring context gaps are why people have always been part of the solution, including its ongoing management.
So what else can we add to the list?
Enrichment on Index Gets Stale
Removing ambiguity from so many polysemous words requires having a ton of people to review the content — and the context — and enriching that content on ingestion so that it can be indexed properly.
For example, your merchandisers may approve “Airpods” as a synonym for “headphones,” and “iPods” as a “music player,” and the corresponding metadata would be indexed accordingly. But as we saw in the example above, an effective solution cannot just be about determining a rigid definition and relationship between words in your corpus. “iPod” was consistently confused for “Airpod” in shoppers’ queries.
Therefore, the metadata became stale.
Enrichment at Time of Query
Instead, the machine needs to analyze the queries themselves and make decisions (enrichment) at time of query. This requires listening to customer signals and their context, learning from them iteratively, and then syndicating the insights to the applications that need them.
When the machine sees that statistically relevant numbers of people are making the same mistake, it dynamically adjusts. The algorithms should leverage signals from users, including current location, previous searches, and past product views, as well as clicks and implicit preferences.
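The idea of adjusting once enough shoppers make the same mistake can be sketched as follows. This is a deliberately simplified illustration, not Coveo’s algorithm: the `QueryTimeEnricher` class, its method names, and the two thresholds are assumptions, and a fixed count plus a share cutoff is only a crude stand-in for a real test of statistical relevance. It also uses clicks alone, leaving out the other signals mentioned above.

```python
from collections import Counter, defaultdict

MIN_EVENTS = 50   # assumed minimum number of clicks before we trust the data
MIN_SHARE = 0.6   # assumed share of clicks that must agree on one product type


class QueryTimeEnricher:
    """Learns from click signals and rewrites queries at query time."""

    def __init__(self):
        # query -> Counter of product types clicked after that query
        self._clicks = defaultdict(Counter)

    def record_click(self, query, clicked_product_type):
        """Store one signal: a shopper searched `query`, then clicked this type."""
        self._clicks[query.lower()][clicked_product_type] += 1

    def enrich(self, query):
        """Return the product type shoppers usually mean, or the query as typed."""
        counts = self._clicks.get(query.lower())
        if not counts:
            return query
        total = sum(counts.values())
        top_type, top_count = counts.most_common(1)[0]
        if total >= MIN_EVENTS and top_count / total >= MIN_SHARE:
            return top_type
        return query


if __name__ == "__main__":
    enricher = QueryTimeEnricher()
    for _ in range(40):
        enricher.record_click("ipods", "airpods")
    for _ in range(10):
        enricher.record_click("ipods", "ipod")
    print(enricher.enrich("iPods"))
```

Because the decision is made from live signals at query time, nothing has to be re-indexed when shopper behavior shifts. The counts simply move and the rewrite follows them.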
In a video, Noah Locke, manager of web technology at UW Health, explains how UW Health has benefited from Coveo. He describes the heavy lifting as being done for them, allowing business users, in his own words, to “sit back, get to watch and celebrate.”
We think AI ecommerce companies should do the same — give users and merchandisers superpowers, not additional work. At Coveo, we call this Automatic Relevance Tuning.
The AI Trust Problem
The reality is, the most likely reason for the comeback of manual curation is human control: business users and digital players often don’t trust AI. This makes the option of adding layers of human control and supervision more appealing.
As it turns out, there’s a long history of mistrust around AI and machine learning, with 61% of marketers sharing widespread doubts over accuracy, on top of the common fears that over-reliance on automation could spell the end of creativity.
This is an unfortunate barrier to unlocking business success. Just as power tools gave an unprecedented advantage to construction workers far beyond the hammer, AI algorithms can help today’s merchandisers and digital leaders improve customer understanding and dedicate more time to producing stronger business results.
Machine Learning Lets You Focus on Your Business
Sliding backward, though, is not the answer. As we noted, having to resort to continuous control is tedious, time-consuming (i.e., expensive), and error-prone. But the amazing breakthroughs in finding shopper intent through ML are the real reason why this manual curation comeback will be short-lived.
Advanced algorithms can now help merchandisers aspire to a new way of working, one that allows them to focus on delivering their creative expertise. They can also help companies save plenty of money, as manually accepting or rejecting outputs and suggestions identified by AI is labor-intensive and expensive.
The real challenge is to close the trust gap when it comes to AI/ML adoption. How can we help business users and digital businesses accept and adopt these new AI-based solutions?