
This week, we’re rolling out an improved search experience for Zocdoc, called Patient-Powered Search. Patient-Powered Search is a more intuitive search experience, built so that patients can use their own language, including colloquial terms, slang, and misspelled symptoms, to confidently find the right provider for their needs. After a beta period on desktop, we’re now expanding the new experience to mobile web and the mobile app.

This release is directly related to one of our core principles at Zocdoc: ‘Patients First.’ With Patient-Powered Search, you no longer have to be a doctor to find a doctor, and that puts a lot of power in the patient’s hands. Our search experience comprises two main components. The first is what we call the ‘front end,’ where we solicit an unstructured query from the user and translate it into a list of structured search items. The second uses the selected search item to retrieve a list of medical specialists. In this post we will cover the first part and discuss several of the challenges and obstacles we faced while building this capability. We’ll start with some of the guiding principles behind this design:

  1. Evolve the design to give users a familiar experience. Even with a lot of complexity behind the scenes, we wanted consumers to see what they were used to elsewhere – i.e., a free-text search bar.
  2. Give patients the ability to search in their own language. “I have stomach pain” should be enough to return relevant medical specialties. In this example, “Gastroenterologist” might be a suggested result.
  3. Allow for multiple search intents: enable patients to search for doctor names, specialties, and visit reasons (which include symptoms, medical conditions, and treatments), all in the same interface.
  4. Empathize with the patient – don’t force them to spell medical terms correctly! For example, even the medical community cannot agree on the correct spelling of “arrhythmia”. Patients shouldn’t have to know this either.

Evolving Search Design

As we thought about how to improve the design, we opted for evolution over revolution. We challenged ourselves to find the best way forward while still being able to attribute performance gains or losses to the change, and we did not want to stray too far from the original design.

Previously, the search consisted of a drop-down and the user had to select from a list of specialties shown:

[Image: search drop-down]
This drop-down followed the traditional healthcare industry standard format of a structured list.

The first thing we did was change the drop-down to a free-text box that patients can type into. This text box became the entry point to the search process.

[Image: patient-powered search free-text box]

Note: the name “Scotch” matches “Couch” at an edit distance of 1: changing the “t” in “Scotch” to a “u” takes one step and yields a substring match.
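
To make that note concrete, here is a minimal edit-distance sketch (illustrative only, not our production matcher) showing why a fuzzy-substring rule with a budget of one edit treats “Couch” as a match for a query containing “Scotch”:

```typescript
// Classic dynamic-programming Levenshtein distance (a minimal sketch, not the
// production matcher), used here only to illustrate the "Scotch" ~ "Couch" note.
function levenshtein(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = 0; i <= a.length; i++) dp[i][0] = i;
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,       // deletion
        dp[i][j - 1] + 1,       // insertion
        dp[i - 1][j - 1] + cost // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// "cotch" is a substring of "scotch", and one substitution (t -> u) turns it
// into "couch", so a fuzzy-substring matcher with a one-edit budget fires.
console.log(levenshtein("cotch", "couch")); // 1
```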

One option for moving beyond the structured search was to go Google style, with a purely free-text search that resolves directly into a set of doctors. We decided on a hybrid, guided search experience, where we first offer a set of structured suggestions before the final doctor search. This fit our evolutionary design approach, and also gives us better control of the precision/recall tradeoff when inferring intent (this is healthcare after all – we want relevance, but we also don’t want to miss anything).

As mentioned above, Zocdoc’s search is a two-phase search. The auto-suggestions provide hints or “guides” that serve as input parameters to the final search query.

Phase I:
Guide the user to the right specialty, visit reason, or specific doctor.
Phase II:
Execute a search for doctors that match the user’s selection in Phase I. This takes the patient to the search results page with doctor results that match the criteria.
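
As a rough sketch of how the two phases hand off to each other (the type names and fields below are hypothetical, not our actual API), Phase I turns free text into structured suggestions and Phase II turns the chosen suggestion into a doctor search:

```typescript
// Hypothetical shapes for the two-phase search (names are illustrative only).

// Phase I: free text is resolved into structured suggestions.
type SuggestionType = 'doctor' | 'specialty' | 'visit_reason';

interface Suggestion {
  type: SuggestionType;
  id: string;           // provider id or taxonomy id
  displayText: string;  // e.g. "Annual Pap Smear / GYN Exam"
}

// Phase II: the selected suggestion becomes a structured doctor search.
interface DoctorSearchRequest {
  suggestion: Suggestion;
  location: string;          // e.g. "New York, NY"
  insurancePlanId?: string;  // optional insurance filter
}

// Phase I maps "gyno" to suggestions like
//   { type: 'visit_reason', id: '...', displayText: 'Annual Pap Smear / GYN Exam' }
// Phase II takes that selection and returns the doctors who offer it.
```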

The auto-suggest in Phase I is a mini-search itself. As you type, a full search is being executed that both infers the intent of the query and surfaces a list sorted by likely relevance. In this example, the input string ‘gyno’ returns a set of “Visit Reasons” or “Procedures.”

[Image: patient-powered search auto-suggest results for “gyno”]

Clicking on “Annual Pap Smear / GYN Exam” will execute a search for doctors that perform that procedure, hence narrowing down the search to doctors you actually want to see.

Before moving on to parsing intent, did you know that it is “gynecologist” and not “gynocologist”? We learned early on that many medical terms are hard to spell, and even something as simple as spell check goes a long way for a search product like this.

How does this stuff work?

Previous Technology

The original Zocdoc search implementation lives in what we lovingly refer to as “the monolith”: it was built in house using C# .NET and SQL Server to power the original Zocdoc search experience.

As we are evolving the search experience, its tech stack must also evolve. Zocdoc as a company made the strategic decision to go to AWS for its infrastructure. See this blog post for details.

Front End

Zocdoc’s original search was served out of the monolith and written in Backbone. When building our new front end components, we evaluated a number of frameworks, and React came out the winner. Thus we embarked on creating a component-sdk not just for search, but to help power all of Zocdoc.com. We also set out to make the pages responsive. They’re not 100% responsive yet, nor are they 100% React, but as we continually evolve the front end, more and more of the site will be.

Since the Zocdoc.com home page and search results page are rendered out of the C# monolith, we needed a hybrid solution for rendering React. There is a fair amount of glue code involved in importing our component-sdk into the monolith so it can render the new components. As we continue to iterate, more of the Zocdoc.com infrastructure will evolve down this path.
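
To give a flavor of that glue (the package and element names below are made up for illustration), the monolith renders a placeholder element and a small bootstrap script mounts the React component from the component-sdk into it:

```typescript
// Hypothetical bootstrap glue: mount a React component from the shared
// component SDK into a container element the C# monolith has already rendered.
import React from 'react';
import ReactDOM from 'react-dom';
import { SearchBar } from '@zocdoc/component-sdk'; // hypothetical package name

const container = document.getElementById('patient-powered-search');
if (container) {
  // The server-rendered page passes initial props via a data attribute.
  const initialProps = JSON.parse(container.dataset.props ?? '{}');
  ReactDOM.render(<SearchBar {...initialProps} />, container);
}
```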

Search Service Pipeline

The core problem we needed to solve in the first stage of the search service was translating unstructured text into a structured query. The second stage of search is keyed off a taxonomy of doctors, specialties, and medical procedures (we have medical experts who help curate this taxonomy), which is why we need this translation layer. We solved this using a combination of architecture, UX, and machine learning.

On the architecture side, we didn’t want to reinvent the wheel, so we chose Elasticsearch, which does this job very well and is used by a host of companies (see Elastic.co’s use-cases page here).
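
To give a concrete feel for what such an index can look like (the field names and analyzer settings below are hypothetical, not our production mapping), a visit-reason suggestion index might pair an edge n-gram analyzer for as-you-type matching with a standard search analyzer:

```typescript
// Hypothetical Elasticsearch settings/mapping for a visit-reason suggestion
// index: edge n-grams indexed at write time support as-you-type matching.
const visitReasonIndex = {
  settings: {
    analysis: {
      filter: {
        autocomplete_filter: { type: 'edge_ngram', min_gram: 2, max_gram: 15 },
      },
      analyzer: {
        autocomplete: {
          type: 'custom',
          tokenizer: 'standard',
          filter: ['lowercase', 'autocomplete_filter'],
        },
      },
    },
  },
  mappings: {
    properties: {
      name: {
        type: 'text',
        analyzer: 'autocomplete',    // index-time edge n-grams
        search_analyzer: 'standard', // query terms are not n-grammed
      },
      specialty_ids: { type: 'keyword' }, // link into the curated taxonomy
    },
  },
};
// e.g. PUT /visit_reasons with this body, then query the `name` field.
```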

[Diagram: auto-suggest portion of the search]

This architecture shows how we accomplish the auto-suggest portion of the search.

[Diagram: user search flow]

With each keystroke we’re executing the user’s search and gathering data. As we generate results, those results are also logged and fed into our ranking models (more on ranking later).
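
A rough sketch of that keystroke loop (the endpoints and function names are hypothetical): debounce the input, fetch suggestions, render them, and log the impression so later clicks can be joined against it:

```typescript
// Hypothetical keystroke loop: debounce the input, fetch suggestions, and
// log the impression so later click events can be joined against it.
let debounceTimer: ReturnType<typeof setTimeout> | undefined;

function renderSuggestions(suggestions: unknown[]): void {
  // UI rendering elided in this sketch.
}

function onSearchInput(query: string): void {
  if (debounceTimer !== undefined) clearTimeout(debounceTimer);
  debounceTimer = setTimeout(async () => {
    const response = await fetch(`/api/autosuggest?q=${encodeURIComponent(query)}`);
    const suggestions: unknown[] = await response.json();
    renderSuggestions(suggestions);

    // Fire-and-forget impression log fed into the ranking models.
    void fetch('/api/search-events', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, shown: suggestions, ts: Date.now() }),
    });
  }, 150); // ~150ms keeps the experience snappy while limiting request volume
}
```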

The auto-suggest “search” can be broken down into two styles of queries.

Information Retrieval for Structured Matches

Many patients come to the site knowing precisely what type of doctor or procedure they want. In this case they’ll be fairly precise and specify either a doctor’s name, specialty (e.g., Pediatrician) or visit reason (Flu Shot). We maintain separate search indexes for each of these intent categories. Approaching the problem like this means we don’t need to build and maintain a separate natural language processing (NLP) layer that parses queries into intent categories. The retrieval mechanisms that Elastic offers perform this step for us.
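
One way to picture this (the request shape is illustrative, not our exact queries): a single Elasticsearch multi-search fans the same text out to each intent index, and the per-index hit quality effectively stands in for an explicit intent classifier:

```typescript
// Hypothetical multi-search: the same query text goes to each intent index
// (doctors, specialties, visit reasons); the per-index scores do the work an
// explicit NLP intent classifier would otherwise have to do.
const query = 'pediatric';
const msearchLines = [
  { index: 'doctors' },
  { query: { match: { full_name: { query, fuzziness: 'AUTO' } } } },
  { index: 'specialties' },
  { query: { match: { name: { query, fuzziness: 'AUTO' } } } },
  { index: 'visit_reasons' },
  { query: { match: { name: { query, fuzziness: 'AUTO' } } } },
];
// POST /_msearch with these alternating header/body lines serialized as NDJSON.
const ndjson = msearchLines.map((line) => JSON.stringify(line)).join('\n') + '\n';
```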

While this is fairly straightforward, we still have the spelling issue to contend with. In order to achieve strong and relevant results, we have tuned and tested the string match auto-suggest algorithm multiple times. We’ve tuned analyzers and tokenizers, and built a mini query builder (JavaScript/NodeJS) that lets us reuse Elasticsearch query code in various ways. We have a robust set of APIs that power various aspects of our doctor name, specialty, and visit reason retrievals. We have instrumented nearly every aspect of our Phase I auto-suggest system and keep a keen eye on query performance and end-to-end latencies to ensure the user gets a very responsive (as in snappy) experience.
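
A simplified sketch of what such a query builder can look like (the parameter defaults are illustrative, not our tuned settings):

```typescript
// Hypothetical mini query builder: one place to encode how free text is
// matched against a suggestion field, reused by each retrieval endpoint.
interface MatchOptions {
  fuzziness?: string | number; // e.g. 'AUTO', 0, 1
  prefixLength?: number;       // leading characters that must match exactly
  boost?: number;
}

function buildSuggestQuery(field: string, text: string, opts: MatchOptions = {}) {
  return {
    match: {
      [field]: {
        query: text,
        fuzziness: opts.fuzziness ?? 'AUTO',
        prefix_length: opts.prefixLength ?? 1,
        boost: opts.boost ?? 1,
      },
    },
  };
}

// Reused across indexes with different tuning, e.g.:
const specialtyQuery = buildSuggestQuery('name', 'pediatrcian', { fuzziness: 1 });
const doctorQuery = buildSuggestQuery('full_name', 'dr smith', { prefixLength: 2 });
```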

One thing we learned early on is not to assume fuzzy string matching is a solved problem and simply use a standard library without any tuning or optimization. For example, in an early iteration, if the user typed in “cold” the top returned result was “Colorectal Surgeon.” This isn’t a good result for a patient searching for a cold. As developers, we can see the logic of why “cold” would be treated as a match for “colo,” but simple errors like this make for a jolting user experience, and patient trust is something we take very seriously.
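
The “cold” case is a good illustration of the kind of knob involved. The sketch below is not our actual fix, just an example of how constraining fuzziness for short queries removes this class of false positive while keeping typo tolerance for longer terms (it assumes edge n-grams are indexed, so “colorectal” also produces the token “colo”):

```typescript
// Hypothetical before/after for the "cold" vs. "Colorectal Surgeon" case.
// With edge n-grams indexed, "colorectal" also produces the token "colo",
// and "cold" is only one edit away from "colo".

// Too loose: one edit of fuzz on a four-letter query lets "cold" ~ "colo".
const tooLoose = {
  match: { name: { query: 'cold', fuzziness: 1 } },
};

// Tighter: reserve fuzzy matching for longer queries, where a typo is more
// likely than a collision with an unrelated term's prefix.
function tunedSuggestQuery(text: string) {
  return {
    match: {
      name: {
        query: text,
        fuzziness: text.length >= 6 ? 1 : 0, // no fuzz on very short queries
        prefix_length: 1,                    // never fuzz the first character
      },
    },
  };
}
// tunedSuggestQuery('cold')      -> exact matching only, no "colorectal"
// tunedSuggestQuery('arrythmia') -> still reaches "arrhythmia"
```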

Here are some other fun examples where the initial experimental algorithm went awry:

| Query | Errant Behavior | Desired Behavior | Underlying Issue |
|---|---|---|---|
| flu sho | returns “shoulder surgeon” as the first specialty; returns “flu shot” as the first visit reason | return “pcp” as the first specialty; do not return “shoulder surgeon” | fuzzy string matching on one word of a two-word query |
| cold | returns “colorectal surgeon” as the first specialty | do not return “colorectal surgeon” | fuzzy string matching on a common word |
| internal | returns “sleep medicine” as the first specialty | return “internist” as the first specialty | not fuzzy enough matching |
| broken toe | returns “toxicologist” as the first specialty; returns “broken tooth” as the first visit reason | do not return “toxicologist”; do not return “broken tooth”; return “Podiatrist” among results | fuzzy string matching on one word |
| teeth pain | returns “teeth whitening” as the first specialty | do not return “teeth whitening” | exact string matching on one word |
| irregular heartbeat | returns “irregular menstruation” as the first specialty | do not return “irregular menstruation” | exact string matching on one word |

We test our search product rigorously and document any usability issues we detect. By studying aggregated sets of errors, we can identify patterns that lead to good intuition about how to develop a heuristic or data driven algorithm to scalably improve the experience. As we iterated, we ultimately went from fuzzy matching heuristics to a fully data driven (a.k.a. ‘Patient Powered’) auto-suggest experience.

Semantic Query (“are you looking for?”)

While many patient queries come in fairly focused, others of course are not. We have also built a semantic layer that uses NLP to map query text into an embedded concept space and identify the specialties that align with it. This layer enables us to recommend the medical specialty most closely associated with the set of symptoms the user specifies.

In the example shown above this means that “i have stomach pain” would translate into:

  • Gastroenterologist
  • Oncologist
  • Pain Management Specialist
  • Surgeon

The power of using NLP in our search engine is that patients can finally search in their own, colloquial terms. By looking at before/after snapshots of medical specialties booked, we have found that adding this semantic layer has enabled us to surface an order of magnitude more results in the specialty long tail.

Other examples that semantic search enables include:

  • “Bloated” → “Gastroenterologist”
  • “Sad” → “Psychiatrist”
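
A rough sketch of the concept-space idea (the embeddings below are toy placeholders; in practice they would come from word vectors trained on medical text): embed the query, embed each specialty, and rank specialties by cosine similarity:

```typescript
// Minimal sketch of embedding-based specialty suggestion. The vectors below
// are toy placeholders for real word embeddings trained on medical text.
const wordVectors = new Map<string, number[]>([
  ['stomach', [0.9, 0.1, 0.0]],
  ['pain', [0.4, 0.4, 0.2]],
  ['bloated', [0.8, 0.1, 0.1]],
]);

function lookupEmbedding(term: string): number[] | undefined {
  return wordVectors.get(term);
}

// Average the word vectors of the query tokens (a simple baseline).
function embedQuery(query: string): number[] {
  const vectors = query
    .toLowerCase()
    .split(/\s+/)
    .map(lookupEmbedding)
    .filter((v): v is number[] => v !== undefined);
  if (vectors.length === 0) return [];
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const v of vectors) for (let i = 0; i < dim; i++) sum[i] += v[i];
  return sum.map((x) => x / vectors.length);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

// Rank specialties by similarity to the query in the embedding space, so a
// query like "i have stomach pain" lands near "Gastroenterologist".
function suggestSpecialties(query: string, specialties: Map<string, number[]>) {
  const q = embedQuery(query);
  return [...specialties.entries()]
    .map(([name, vec]) => ({ name, score: cosine(q, vec) }))
    .sort((a, b) => b.score - a.score);
}
```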

Ranking – the “patient-powered” in “Patient Powered Search”

Everything we’ve discussed so far has involved creating the structured query list. Since order matters in search, we next had to solve the problem of ranking the results.

We took a data-driven approach and leveraged well-researched machine learning methods to build a ranking system. To start this process, we built a new data lake and ETL process so that search data, including suggested results and patient selections, could be collected.

In the ranking loop shown in the above architecture diagram, we built a multi-armed bandit (MAB) process that learns from offline batches. Our patient behavior does not fluctuate too much, so as an initial iteration we did not need our MAB algorithm to be truly dynamic. The main goal of the MAB is to use patient feedback (via clicks) to disambiguate rank position bias from actual relevance. Our initial experiments with this system showed a dramatic improvement in our core metric for this component: Mean Reciprocal Rank (MRR).
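
For reference, here is how MRR is computed over logged auto-suggest sessions (a minimal sketch of the metric itself, not our ranking pipeline):

```typescript
// Mean Reciprocal Rank over logged auto-suggest sessions: for each query,
// take 1 / (1-based rank of the suggestion the patient clicked), or 0 if
// nothing shown was clicked, then average across sessions.
interface SuggestSession {
  shown: string[];  // suggestion ids in the order they were displayed
  clicked?: string; // id the patient selected, if any
}

function meanReciprocalRank(sessions: SuggestSession[]): number {
  if (sessions.length === 0) return 0;
  const total = sessions.reduce((sum, s) => {
    if (!s.clicked) return sum;
    const rank = s.shown.indexOf(s.clicked);
    return rank >= 0 ? sum + 1 / (rank + 1) : sum;
  }, 0);
  return total / sessions.length;
}

// Example: a click at rank 1 and a click at rank 3 give (1 + 1/3) / 2 ≈ 0.667.
const mrr = meanReciprocalRank([
  { shown: ['gastro', 'oncology', 'pcp'], clicked: 'gastro' },
  { shown: ['pcp', 'derm', 'gastro'], clicked: 'gastro' },
]);
```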

This technology allows our platform to continually learn what patients want, and adjust based on trends or new needs.

Our search work is not done; we’ve begun this journey with the Phase I auto-suggest. What we’ve built so far is a major improvement that enables patients to search in their own terms and surface more relevant care. Our mission is to “give power to the patient,” and a continuously learning search experience is one way we’re doing that. We know we’re headed in the right direction because we gather data and gain insights into how patients behave.

One thing to emphasize here is that we didn’t want to reinvent the wheel. It’s not hard to build an Elasticsearch index, academics are already developing word embeddings from medical text, and data lakes and ETL have been built before. The challenge we took on was connecting these pieces together in a way that paves the best possible path for the patient. This is just the beginning, and we have tweaked and iterated to get to this point. We’re working to build a great healthcare search experience for our patients, and our journey involves a rewarding symbiosis between us, our patients, and the medical community.

I hope you enjoyed this overview of how we went about building Patient-Powered Search. In future installments we will dive deeper into each of the infrastructural, software, and data solutions that enable us to provide a best-in-class search experience.

About the author

Pedro Rubio is the Engineering Manager for the Search Team at Zocdoc. He’s passionate about creating powerful, data-driven user experiences that sit at the crossroads of design, engineering, and machine learning. He loves taking an iterative and curious approach to solving problems. “What happens if” is a great question to ask.

Pedro also enjoys crossover metalcore music, but not all of it. Only a fine selection of painstakingly curated and highly syncopated metalcore makes the grade. Bands like Arcite (metalcore), Machinae Supremacy (alt-rock), Wovenwar (melodic metalcore), and Teramaze (progressive metal) have incredible examples of this.
