Why data-driven methods will shape the future of relevance search

06/18/2019 - 15:20 to 16:00
Frannz Salon
long talk (40 min)

Session abstract: 

You already have your search engine in place, users have started using it, and you now see that your results could be better. You hire some engineers to get you through this: they add a synonym here and an exception there, and ask you to create a taxonomy to organize your items. After a few months, everything seems to work fine except for a few minor issues that your team is working on.

Now your company needs to expand to meet the new demands of your business, so you add a wider range of products to your catalog. Your search results start to show some issues, and your team is having a hard time controlling all of your search parameters and maintaining rules and exceptions. You know that improving your results will be key to the success of your company. You now face your biggest decision: should you hire more relevance engineers to handle it?

The first part of this talk will show why it is so complicated to scale an engineering approach to relevance based only on synonyms, taxonomies, rules, and exceptions. Given these limitations, the second part will focus on the latest advancements in natural language processing and information retrieval to show that all you need are good data scientists who provide new data-driven solutions to your needs without requiring any change to your search stack.

With the vast amounts of data you collect, you can find applications for all your search needs. The talk will guide you through topics such as language modeling for autocompletion, deep neural networks for named-entity recognition, network embeddings for relevance scoring, and query-product embeddings to improve the discoverability of your most exquisite items. All of this can be optimized by your learning-to-rank algorithm and opens the door to personalization.

With all of these data-driven approaches, your search engine will be actively learning from your users' interactions, and there will be no need to hire an entire department of relevance engineers.