Product search relevancy with Elasticsearch/Solr

All e-commerce relies on search, and good search means ranking the most relevant results first. When I look for “phone”, I would be disappointed to see pages of “phone accessories” before I get to the actual phones. Within phones, should we bubble up the best-selling ones? The ones with the best reviews?

This applies beyond e-commerce and products. If I look for “java developer”, I wouldn’t want to see JavaScript developers on the first page. Some recruiters prefer experienced people; others would rather “grow them” in the company. We need to provide relevant results for both, without cheating with a “sort by experience” option, which would ignore all other ranking criteria (e.g. endorsements).

In this session, we’ll explore how we can approach relevancy issues with two of the biggest names in big data search: Elasticsearch and Solr. We’ll discuss the following topics:

  • What Elasticsearch and Solr are, and why they are well suited to searching billions of documents
  • The main differences between Elasticsearch and Solr
  • How the relevancy score is calculated by default, and how we can influence it
  • How to continuously tune relevancy scoring criteria
  • Trade-offs between functionality and performance, especially on autocomplete, which is latency-sensitive
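
To give a flavour of what “influencing the score” can look like, here is a minimal sketch for the Elasticsearch side only. It builds a `function_score` request body that blends text relevance with a popularity signal, so best-selling phones can rise without a hard sort taking over. The field names `name` and `sales_count` and the helper `boosted_product_query` are hypothetical, chosen for illustration; the body is only built as a Python dict and printed, not sent to a cluster.

```python
import json


def boosted_product_query(user_text, popularity_field="sales_count"):
    """Build an Elasticsearch function_score request body as a plain dict.

    The field names ("name", "sales_count") are assumptions for this sketch.
    """
    return {
        "query": {
            "function_score": {
                # Text relevance: the default similarity score of a match query.
                "query": {"match": {"name": user_text}},
                # Popularity signal: log1p dampens very large sales counts.
                "field_value_factor": {
                    "field": popularity_field,
                    "modifier": "log1p",
                    "factor": 1.0,
                    "missing": 0,  # treat documents without the field as zero sales
                },
                # Add the popularity value to the text score instead of replacing it.
                "boost_mode": "sum",
            }
        }
    }


body = boosted_product_query("phone")
print(json.dumps(body, indent=2))
```

Unlike a “sort by sales” option, this keeps the text score in play: a weakly matching accessory with huge sales does not automatically outrank a phone that matches well. How strongly popularity should count is exactly the kind of criterion that needs continuous tuning.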