Improving Search Performance - The Journey of Creating a Semantic Search Engine

October 10, 2023

Part 1: Benchmarking search accuracy


At Wonka, we build Language AI use cases for teams who are excited about the potential of these emerging technologies in their business. Recently, we were challenged to build a flexible semantic search engine for a database of medical content.

We accepted the challenge and we’re proud of what we’ve built. Since some parts of our solution aren’t easy to find online, we would love to share what we learned during this project. In this series of blog posts, we explore different aspects of the challenge to inspire you to build integrations with large language models and capture value from information already present at your company!

In this first part, we will explain the problem and focus on general architecture considerations when building a Search Engine. We will compare out-of-the-box approaches with building your own solution, and discuss the trade-offs between both. In the next two parts, we’ll zoom in on the custom solution approach, and investigate what the general challenges are and how to navigate them.

The problem

Whenever a large knowledge base needs to be searched efficiently, whether by a company’s customers, users or employees, the company should look for a search solution that is both easy to implement and returns high-quality results. Until now, organizations usually implemented a simple keyword search engine to let users retrieve data through a search query.

However, the company we worked with wanted to take things a step further. Their keyword search engine was performing adequately, but users had to enter very specific search terms to retrieve the right results. Furthermore, querying in a highly technical context like the medical field is often hard, since a single topic can be described by many scientific terms and synonyms, and not all of them are explicitly mentioned in the database.

Suppose we have a keyword search engine designed to help medical professionals find relevant research papers and studies related to specific medical conditions and practices. 

Now, let's imagine a medical researcher who is interested in finding information about a surgical procedure, e.g. "Heart Surgery". They enter this term into the search engine expecting to retrieve relevant papers, but they encounter a challenge due to the technical nature of the field.  

There are many different types of heart surgery: artery bypasses, maze surgeries, transplantations, pacemaker implantations, etc. Not all papers mention the exact words “heart surgery”, so those papers won’t show up in the search results. As a result, some database entries stay under the radar, even though they might contain very relevant and valuable information.
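To make the problem concrete, here is a minimal Python sketch of a naive keyword search. It is not the client's actual engine; the paper titles and the `keyword_search` helper are made up for illustration, and we assume the engine simply requires every query word to appear in the document.

```python
# Hypothetical mini-corpus; the titles are invented for illustration.
PAPERS = [
    "Outcomes after heart surgery in elderly patients",
    "Coronary artery bypass grafting: a ten-year follow-up",
    "Pacemaker implantation and post-operative care",
    "Seasonal allergies in children",
]


def keyword_search(query: str, documents: list[str]) -> list[str]:
    """Return the documents that contain every word of the query (case-insensitive)."""
    terms = query.lower().split()
    return [
        doc for doc in documents
        if all(term in doc.lower().split() for term in terms)
    ]


hits = keyword_search("heart surgery", PAPERS)
print(hits)
```

Only the first title is returned. The bypass and pacemaker papers are about heart surgery too, but they never use those exact words, so a keyword match cannot find them.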

What’s the answer to this problem? A Semantic Search Engine. Let’s look into how we improved the search performance!

The approach

Azure Cognitive Search

Our team began the search for a better solution by considering various off-the-shelf solutions. The option that stood out was Azure Cognitive Search, a cloud-based search service that streamlines the process of implementing search functionality in applications and websites. It operates by creating an index, which is a structured representation of the searchable data. The index consists of fields containing the content to be searched, such as text, numbers, or dates. After indexing the data, Azure Cognitive Search allows for querying using a powerful query language that supports advanced search features. The service also offers advanced features like language analyzers, custom scoring profiles, and integration with Azure Machine Learning for cognitive enrichment.
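The idea of an index over fields can be sketched in a few lines of Python. This is a toy inverted index, not Azure's implementation or API; the records, field names and the `build_index` and `lookup` helpers are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records with a text field, a number field and a date field.
RECORDS = [
    {"id": 1, "title": "Heart surgery outcomes", "year": 2019, "published": "2019-03-01"},
    {"id": 2, "title": "Pacemaker implantation review", "year": 2021, "published": "2021-07-15"},
]


def build_index(records, text_field):
    """Map each lowercase token of `text_field` to the ids of the records containing it."""
    index = defaultdict(set)
    for record in records:
        for token in record[text_field].lower().split():
            index[token].add(record["id"])
    return index


def lookup(index, query):
    """Return the ids of records whose indexed field contains all query tokens."""
    token_sets = [index.get(token, set()) for token in query.lower().split()]
    return set.intersection(*token_sets) if token_sets else set()


title_index = build_index(RECORDS, "title")
print(lookup(title_index, "heart surgery"))
print(lookup(title_index, "pacemaker"))
```

Lookups are fast because the work is done once at indexing time, but a record can only be found through the tokens that were actually indexed for it.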

Custom-built search engine

Another approach would be to build a custom search engine. A possible solution involves a two-stage process: initial retrieval and result reranking, with each stage using its own language model.

In the initial retrieval stage, a language model is used to efficiently retrieve a set of candidate documents or records that match the user's query. Models like BERT or RoBERTa excel at capturing semantic meaning, which enables retrieval based on relevance rather than exact wording.

Once the initial retrieval stage has produced candidate results, the reranking stage uses a second language model to analyze them more closely and reorder them by relevance. This stage goes beyond simple keyword matching, leveraging contextual understanding to capture nuances and infer deeper semantic relationships.
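The two stages described above can be sketched as follows. This is a toy under loud assumptions, not our production code: a hand-made concept table (`CONCEPTS`) stands in for the retrieval model, and a simple scoring function (`rerank_score`) stands in for the reranking model. All names and documents are hypothetical.

```python
import math

# ASSUMPTION: a hand-made concept table replaces a learned text encoder.
CONCEPTS = {
    "heart": "cardiac", "coronary": "cardiac", "pacemaker": "cardiac",
    "surgery": "procedure", "bypass": "procedure", "implantation": "procedure",
    "allergies": "allergy",
}
AXES = ["cardiac", "procedure", "allergy"]

DOCS = [
    "Coronary bypass recovery times",
    "Pacemaker implantation guidelines",
    "Heart surgery risk factors",
    "Seasonal allergies in children",
]


def embed(text):
    """Turn a text into a small vector that counts the concepts it mentions."""
    vector = [0.0] * len(AXES)
    for token in text.lower().split():
        concept = CONCEPTS.get(token)
        if concept is not None:
            vector[AXES.index(concept)] += 1.0
    return vector


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k):
    """Stage 1: keep the k documents whose vectors are closest to the query vector."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def rerank_score(query, doc):
    """Stage 2 stand-in: score query and document jointly (similarity plus exact word overlap)."""
    overlap = set(query.lower().split()) & set(doc.lower().split())
    return cosine(embed(query), embed(doc)) + len(overlap)


def rerank(query, candidates):
    """Stage 2: reorder the candidates by the joint score, best first."""
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)


candidates = retrieve("heart surgery", DOCS, k=3)
results = rerank("heart surgery", candidates)
print(results)
```

In this toy, the first stage keeps all three cardiac papers even though only one of them contains the words "heart surgery", and the second stage moves the closest match to the top. In a real system, both stand-ins would be replaced by trained models.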

Evaluating performance

For our client, search accuracy was of course the key point of the whole project. We therefore investigated both approaches thoroughly and helped the client build an isolated test environment where power users of their platform could try out different search methods and evaluate their performance. The results were quite surprising: with Azure Cognitive Search, the out-of-the-box relevancy was around 50%. Our custom-built semantic search model, on the other hand, achieved a search relevancy exceeding 80%, a significant improvement in accuracy.
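A relevancy score of this kind can be computed in several ways. The sketch below assumes one common option, precision at k: the share of the top k results that testers marked as relevant, averaged over all test queries. The function names and the judgements are hypothetical and are not the project's data.

```python
def precision_at_k(judgements: list[bool], k: int) -> float:
    """Share of the top-k results marked relevant; `judgements` is ordered by rank."""
    top = judgements[:k]
    return sum(top) / len(top) if top else 0.0


def mean_precision_at_k(per_query: list[list[bool]], k: int) -> float:
    """Average precision at k over all test queries."""
    scores = [precision_at_k(judgements, k) for judgements in per_query]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up tester judgements for two queries (True = result judged relevant).
judged = [
    [True, False],
    [True, True],
]

score = mean_precision_at_k(judged, k=2)
print(score)
```

Running the same test queries through each search method and comparing the resulting scores gives a like-for-like comparison, as long as the same testers and the same value of k are used.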

So what’s the key difference, you may ask? As you might have realized from the explanation of both methods, they are quite alike in architecture. Both approaches have two phases: an initial retrieval stage followed by a reranking stage. However, Azure Cognitive Search still uses an index-based approach for the initial retrieval, meaning you only get the right results if your query contains the words that actually appear in the text you are looking for.

Our custom-built search engine instead links the search query to the results semantically from the very first stage, so that relevant database entries are not left out of the initial retrieval phase just because they use different wording!

Conclusion & next steps

By leveraging language models and fine-tuning them on domain-specific data, it is possible to achieve higher search accuracy. While Azure Cognitive Search provides a convenient setup, its out-of-the-box results were average in our tests. The ability to fine-tune the model and its parameters to specific requirements resulted in more accurate and relevant search results, aligning closely with the needs of our client.

Of course, achieving higher search accuracy with a custom-built model comes with additional development effort. And since we are using two language models in series, search speed is a big factor to consider as well. In the next two parts of our series, we will zoom in on each of these topics, so you can hopefully achieve similar results and succeed with your first steps into integrating Language AI into your company.