This site looks healthier in portrait mode.

Zocdoc was, as recently as 2 years ago, a monolithic application. The problems of running such an application, both obvious and subtle, have been detailed in this previous blog post. Since then our engineering team has been pulling core components out of the monolith and into their own independently deployed services in the cloud. In this post, we will discuss one key aspect (actually, a prerequisite) of this move – moving legacy data.

In our decade of existence our data model has evolved and stabilized; it is easy to explain and new products are able to plug into it easily. There is room for improvement, obviously, but we did not feel it was necessary to change it dramatically as part of this data move. The one thing we did want to do was to enable new data access patterns. With a monolithic (relational) database, we were constrained with how we could read our data. Report generators could block transactional queries and ETL jobs would scan SQL tables to get the data they needed. The new system should provide correctly formatted data for analytics, data science, event based systems, batch processes, in addition to query/response apps.

We noted down these, and other goals and parameters before we started the job.

Goals

  • Provide a read-only copy of core Zocdoc data outside the monolith to enable new services and products, until a later time when this data exists completely outside the monolith.
  • Provide data in multiple representations for different use cases.
  • New data should be available under a reasonable amount of time (under 10s for events, an hour for analytics).

Constraints

  • Data model should remain unchanged.
  • Data should still be written to the monolith, and copied to new datastores outside of it.
  • The new system should be used only for legacy data. New applications and services should create their data outside of the monolith.

Then there was this small matter of naming this project. We stack ranked a bunch of ideas and took a vote between Legacy Data Mover, Chimichanga and Wasabi. Talk about a landslide.

A Peek at the Wiring

Before moving on we should understand some key points relating to the monolith’s architecture. I’ll steal a diagram from the post I referenced earlier to aid this explanation

Monolith architecture

The high-level architecture is very linear – webservers running our C# IIS app talk to the SQL Server database. The webservers use in-memory caches to store application data. When a webserver has to mutate state it issues a cache update, a SQL update, and reaches out to the other webservers to apply the same state change to their caches. This is what we call the Notify system; it will be central in our design. Most applications (frontend and backend) get most of their data from the caches, going to SQL only when necessary. This saves us from making expensive calls to the database, but on the flip side, we have to deal with the complexity of our cache and Notify systems in dev, test and prod environments.

Outside the monolith we use a much wider toolbelt. We’re an AWS shop so we have a lot to choose from. We run Nodejs, Scala, .Net Core and Python apps in ECS (on Linux) and Lambda. We use DynamoDB, RDS (MySQL, Postgres), S3, Redshift, Elasticsearch and ElastiCache to store data. We have more Kinesis streaming apps than I can keep track of. It’s a great time to be a Zocdoc developer!

Build vs Buy

We knew that we wanted to build an event-based system – they’re simple and minimize coupling. The alternative would have been a periodic batch push, but that would essentially make our latency goals (#3) hard to achieve.

Our (collective) first instinct was to purchase a database migration solution. There are several vendors that provide such services and, although expensive, they would save us a lot of long-term maintenance costs. Most of these are one-time or batch services but there are some that hook into SQL Server log shipping to actually send DML commands to the new database as they’re happening. We thought that was pretty cool.

Unfortunately, we couldn’t go down this route because our application data is often different than what is stored in SQL Server. We transform data from SQL before storing it in the application’s in-memory caches. Using a database migration service would have required us to take this business logic, replicate it elsewhere, and keep both places in sync. This was a prohibitive cost and we decided not to pursue further.

It was obvious now that we needed to roll out an in-house solution. The team aligned around a high-level design: every state change event passing through the Notify system would be sent to a messaging system suited for high scale. From there it would be picked up by an event processor which would apply these state changes to the new datastores. Kafka and AWS Kinesis were the main contenders for our data stream. We went with Kinesis mainly because it, being a managed AWS service, would reduce operational costs for our team. At Zocdoc, dev teams support their own applications so any amount of operational cost reduction goes a long way. Also, being new to developing any kind of streaming applications, Kinesis had a much gentler learning curve. The Kinesis client library (KCL) provides a very simple and convenient API for interacting with the stream.

Storage Tech

We wanted to solve Goal #2 really well, so the choice of databases was key. We picked Kinesis again for streaming out changes from Wasabi to its consumers. S3, battle-tested and ubiquitous, was our choice for storing periodic snapshots of our data. Choosing the database technology for the query/response use case required more thought. SQL was an obvious choice because it would match the source data exactly. SQL also makes it easy (well, relatively easy) to query data arbitrarily. On digging deeper, however, we found that a majority of our application queries were satisfied by primary, and a few additional indexes. We were determined to not allow arbitrary querying in the new system; if we ever needed to, our proposed solution was to load snapshots from S3 . NoSQL started looking really good. NoSQL would also allow us to move to a Document-oriented schema which would work well with healthcare data originating from disparate sources. We settled on DynamoDB – AWS’s fully managed (yes, we really don’t like to manage servers on my team), pay-per-use NoSQL database.

Processing Events

The central piece of the design was the event processor. The event processor’s responsibility was (i) to read an incoming event, (ii) determine if it was a valid state change by reading the current state of the document it intended to alter, and (iii) if valid, propagate the new (changed) document to DynamoDB, the outbound Kinesis stream and an S3 bucket for further processing by the snapshotting system. An example might make this simpler to understand:

  • Current state: { doctor_name: “Dr. John Dorian”, specialty: “Surgeon”, hospital: “Sacred Heart” }
  • Event fires: { doctor_name: “Dr. John Dorian”, hospital: “Chicago Hope” }
  • Final state: { doctor_name: “Dr. John Dorian”, specialty: “Surgeon”, hospital: “Chicago Hope” }

Note that the event does not need to furnish all information related to the document; it just needs to provide document identifying information (doctor_name) and some changes to apply. The absence of information is taken to mean that the information hasn’t changed. To remove a piece of information we, explicitly, empty or null it. The processor applies these changes and publishes the full new document for downstream systems. It also throws away no-ops and out of order events – a requirement to deal with some idiosyncrasies of the Notify system.

The processor was implemented as a KCL app, written in Scala and deployed with ECS. We evaluated writing the processor as a Lambda first but decided it wasn’t a good fit. The processor would do a lot of IOPS, and would be constantly running, so we wanted to give it some extra muscle.

For the snapshot system, we went with a Spark SQL solution, running on EMR. The Spark job loads the last snapshot, and all the events since then, applies the diff and outputs the new snapshot to S3. This system processes massive amounts of data with only a few hundred lines of code. And it does this very efficiently, with little maintenance.

Let’s take a look at an architecture diagram that puts it all together.

Wasabi initial architecture

Running in Production

With the design finalized we went straight to work to implement it. We were working with things that were new to us – tech stacks, infrastructure as code, and owning the deployment pipeline – but there was a lot of enthusiasm as we knew we were building transformational infrastructure that would boost developer productivity and enable innovation. With Wasabi, designing new features in the cloud doesn’t require a plan to make the data available first. Product engineers could focus on producing business value, fast.

We began rolling out Wasabi and auditing the data against SQL Server. After some bug fixes and performance improvements, we were ready for primetime. The system was handling hundreds of thousands of events every day. Very soon, we had the data snapshots and the event stream powering production systems. The front-end team began evaluating Wasabi as the main data store for Zocdoc’s shiny new webapp. This was a huge deal for the company as this new webapp was going to live completely (assets, backend and data) outside the monolith.

Unfortunately, during this evaluation we realized that data required by the webapp to render a page fully would require multiple (often sequential) calls to Wasabi. This was considerably slower than the monolithic webapp which got most of its data from the in-memory caches. Beating render times in the monolith was never anyone’s intention but we still needed our webapp to be snappy.

This also exposed a design flaw. Wasabi provided only raw, “building block” style data. Applications were expected to mold their own view of the data on request. Since a lot of our applications are read-heavy, this meant every read would involve mostly redundant, often heavy, computations. This deficiency is the most apparent when the application is a webpage requested by a person.

We realized how our predicament was similar to the problem described in Martin Fowler’s Bounded Context essay. The gist of it is this: fitting a large domain (in our case, healthcare) into a single model is hard. Instead, model smaller “contexts” and specify relationships between them. Keeping with our healthcare theme, the website’s view of a doctor (reviews, certifications, hospital affiliations, etc) is different than the billing system’s (email, payment details, etc). But when a doctor purchases a new Zocdoc product, it should affect both the website and billing.

Reducing Latency

Our next steps were clear. We needed to scale back the responsibility of Wasabi to providing raw data documents and, at the same time, provide an easy, repeatable pattern to incrementally build an application’s view of the data before this data was requested. Additionally, these views (or data projections, as we call them) needed to live in a blazingly fast data store. No points for guessing this needed to be Redis. I’ll accept Memcached too, but points deducted for guessing otherwise.

We were solving the problem for our webapp first, so we had to start with providing support for javascript, which is our frontend teams’ language of choice, to build their applications’ data projections. To this end, we wrote a library to

  • Read event data from the outbound Kinesis event stream
  • Grab document data from DynamoDB to supplement information retrieved from the event
  • Write projection data to Redis (via ElastiCache)

Now, to build their projections, these teams would package this library with their projection builder logic in a nodejs package and deploy to AWS Lambda. Let’s see how this might work with an example:

  • The doctor profile page projection builder receives an event stating that Dr. Dorian now works at Sacred Heart’s Brooklyn location.
  • It gets this location’s display information from DynamoDB.
  • Then, it reads the current projection from Elasticache, inserts the new location to the list of Dr. Dorian’s locations and writes back to Elasticache.
  • On a daily basis, we send the current state of each object as an event through this system – so a new projection will be fully hydrated with data within 24 hours of coming online.

Wasabi revised architecture

Because of the low cost and ease with which these projections could be built, we encouraged teams to start moving their presentation data to Wasabi projections. The new webapp gets most of the data it needs to render the doctor profile and search pages from Wasabi projections. A lot of our small-size reference data is also in Wasabi now. This includes our medical procedure and visit reason data, specialty information and so on. As a result, most webpages only need to make a couple Elasticache calls – our data fetch times are down tenfold!

Wasabi retrieval timing

Notes on Scalability

There is still a decent amount of work ahead of the team. We need to support other development languages for the projection builders, provide more monitoring, and so on. However, we are happy with how our design can scale out as needed. Starting from the event stream to the snapshot spark job, Wasabi scales horizontally. We can shard streams, add more nodes to a cluster, quickly and easily to deal with an increase in traffic. This, in my opinion, is a hallmark of good software design for a growing company. As we grow, we get to work with new partners in our industry and get access to more data. Our application scales up to meet this long term trend of growing data. On the other hand, our patients’ engagement with the website has temporal trends – Mondays are busier than Sundays, and daily traffic peaks around the lunch hour, compared to lower traffic in the early mornings. We have both manual and automatic scaling measures to rise to the occasion, and also cut back on computational waste, as needed. Contrast this with a monolithic design where you are mostly limited to vertical scaling, paying for resources 100% of the time that you might need on the rarest of occasions.

Key Learnings

Given how many AWS services we use in Wasabi, we, naturally, have a lot of learnings to share. Without going into detail, and resisting the urge to list both obvious and specific ones, here are some that I offer without hesitation:

  • Lambdas are awesome. However, AWS has a concurrent Lambda execution limit at the account level. So if one application invokes a lot of Lambdas at the same time, it might prevent others from triggering. Limits at the individual Lambda (or project, tag) level are not supported. I recommend the use of message queues (e.g., SQS, Kinesis) to trigger Lambdas as opposed to a potentially unbounded trigger (e.g., S3 events).
    (update: AWS announced function level concurrency limits at re:Invent 2017)
  • Global secondary indexes in DynamoDB can get expensive since each of them is essentially a separate table. Consider other databases if you would need multiple indexes on your data. This makes it necessary to enumerate all access patterns while designing your architecture.
  • DynamoDB is not particularly suited for spiky reads and writes as you would have to pay for peak traffic, plus some wiggle room. Auto scaling can help to some degree, but there are some obvious flaws with DynamoDB’s native autoscaling capabilities. Caching can also take away some of the read load; check out DAX. If cost is the primary motivation for caching DynamoDB, then make sure your cache clusters are cheaper than the savings from read capacity reduction. Consider sharing cache clusters with other applications, or caching only most-often-read data.
  • Keep an eye on your AWS bill. With auto-scaled infrastructure there is a risk of going over budget with an unaccounted-for scenario. We use CloudHealth to track our AWS service usage.
  • AWS moves fast and adds new features regularly. A good way to keep up is to subscribe to their “What’s new at AWS” newsletter or read it here.

About the author

Mustafa Rizvi is a Principal Software engineer in the API Platform team at Zocdoc. When he needs a break from application infrastructure, he likes to create important internet art and utilities.

No comments yet, be the first?

Close comments

Leave a Reply

Your email address will not be published. Required fields are marked *

You might also like