Todo:
Go through the following resources:
Good list for non-functional requirements
- The CAP Theorem (Choose between consistency or availability)
- But honestly, maybe PACELC is better.
- Partition, Availability, Consistency. Else, Latency, Consistency.
- Basically, when a network partition actually occurs, you have to choose between availability and consistency. Otherwise (no partition), you end up picking between latency and consistency instead.
- Honestly, I feel like Latency and Availability go hand in hand because theoretically you could put a message queue in front of your db system, and that results in very high availability but also high latency.
- Scalability
- Anything special that needs to be extra scalable at certain times or days?
- Burst scaling, user surges, etc.
- Is the system read-heavy or write-heavy? Plan accordingly.
- Environment constraints
- Will your system be impaired by the environment in any way?
- Example: google maps when you’re driving through a dead zone. You can make the decision to show offline versions of the map, cache locally, etc.
- Does the hardware have limitations (memory, battery, bandwidth, etc.)
- Security
- Knowing how your data is handled and secured
- GDPR, PII, etc.
- Health data, for example.
- Latency
- Important to user experience
- Kafka and pub/subs can help with perceived latency by moving slow work off the request path (the work itself still happens asynchronously)
- Durability
- How important is it that your data isn’t lost?
- Data replication, backups
- Partition Tolerance
Intro
When prepping for system design, I feel like the best way to decide what to use and when is to weigh the benefits and tradeoffs of each technology. I’ve decided to attempt to create a simple list of the various options and when to use them.
I’ll use the following to describe each item:
- Short description of how it works
- When to consider it
- Any limitations that exist with it
- Any extra notes about it
Alex Xu’s Sys Design Interview
- Relational / SQL databases
- Description:
- An organized way of storing relational data. Allows for table structures that represent the records you insert / change.
- Offers ACID guarantees (atomicity, consistency, isolation, durability) via transactions, locking, etc.
- Offers query performance speed ups through index creation
- When:
- You have highly relational data
- You need consistency guarantees (a read after a write returns the most up to date data)
- You need to perform joins
- Limitations:
- Indexes incur latency at write time because each write needs to update the index’s data as well.
- Extra:
- They’re traditionally viewed as CA systems, but distributed options sacrifice the A for P, so they can also be CP systems.
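To make the ACID / index points above concrete, here’s a tiny sketch using Python’s built-in sqlite3 (the table and column names are made up for illustration): the `with conn:` block opens a transaction that commits atomically on success and rolls back entirely on error, and the index speeds up lookups on the indexed column.

```python
# Hypothetical sketch of ACID transactions + index creation with sqlite3.
# Table/column names are invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, balance INTEGER)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")  # speeds up reads by email

with conn:  # transaction: commits on success, rolls back atomically on error
    conn.execute("INSERT INTO users (email, balance) VALUES (?, ?)", ("a@example.com", 100))
    conn.execute("UPDATE users SET balance = balance - 30 WHERE email = ?", ("a@example.com",))

row = conn.execute("SELECT balance FROM users WHERE email = ?", ("a@example.com",)).fetchone()
print(row[0])  # → 70
```

Note the write-time cost mentioned under Limitations: every insert/update here also has to maintain `idx_users_email`.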
- NoSQL databases
- Description:
- Essentially a giant key-value store
- Usually distributed, and allows for really good scaling.
- When:
- You need to store data but maybe don’t have the relational structure
- You need to quickly serialize/deserialize data
- You have a high number of writes and/or a large amount of data in general
- Limitations:
- By default, most NoSQL databases trade consistency for availability (AP systems). Each node needs to be updated with the most up-to-date data, which happens via async replication, so a read right after a write may be stale. (Many do offer tunable or strong-consistency modes, at a latency cost.)
- Extra:
- I know AWS DynamoDB supports things like primary and secondary keys which correlate to how data is stored on disk. Can make reads faster, similar to indexes.
- Also supports indexes, but likely fall under the same camp as the SQL ones.
- Can also support locking via optimistic locking. Version numbers are used for this. This doesn’t make the system CP, though, since it just helps with consistency at write time.
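The optimistic locking idea above can be sketched in a few lines — here an in-memory dict stands in for the key-value store, and all the names are made up. A write only succeeds if the version it read is still the current one; otherwise the caller re-reads and retries.

```python
# Minimal optimistic-locking sketch using version numbers.
# `store` is a stand-in for a real key-value store; names are invented.
store = {"item1": {"value": "a", "version": 1}}

def update(key, new_value, expected_version):
    """Write only if nobody else bumped the version since we read it."""
    if store[key]["version"] != expected_version:
        return False  # conflicting write happened; caller should re-read and retry
    store[key] = {"value": new_value, "version": expected_version + 1}
    return True

record = store["item1"]                       # read: value "a", version 1
ok = update("item1", "b", record["version"])  # succeeds, bumps version to 2
stale = update("item1", "c", 1)               # fails: version is now 2
print(ok, stale)  # → True False
```

As noted above, this helps avoid lost updates at write time; it doesn’t turn the system into a CP one.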
- Load Balancer
- Description:
- A machine that maintains a public IP that can receive requests, and then forwards them to relevant privately routed servers.
- Generally, it evenly balances the traffic across multiple servers, without exposing those servers publicly.
- When:
- You expect a lot of traffic, and you need to balance it out horizontally
- You need high availability
- A small latency hit is okay
- Limitations:
- Your load balancer can be a single point of failure. Consider adding multiple, similar to how databases manage multiple nodes.
- Extra:
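A toy version of the “evenly balances traffic” behavior is round-robin selection over a private server pool (the IPs below are placeholders, and real load balancers add health checks, other algorithms, etc.):

```python
# Toy round-robin load balancer: cycles requests across a private server pool.
# Server addresses are placeholders.
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        # each call returns the next server, wrapping around the pool
        return next(self._servers)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [lb.pick() for _ in range(4)]
print(picks)  # → ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```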
- Database Read Replicas
- Description:
- Instead of having one DB system that handles all read and write requests, we consider creating some read-only copies of the original DB.
- Generally a main DB will get the writes, copy the data over to the replicas, and the replicas will serve the reads.
- When:
- You have a read-heavy system. You don’t want your write DB to be overloaded.
- You need availability. One of the replicas can take the place of the main, or replicas can be replaced.
- Limitations:
- Can impact consistency. If we write to the main DB and then we read immediately after, we don’t have a guarantee that the read replica has the updated data unless we explicitly build for that.
- Extra:
- There’s probably multiple versions of this, with a leader/follower approach being a main one, but there’s probably also a pooled version of this. Not sure which one is industry standard.
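One way to “explicitly build for” read-after-write consistency (mentioned under Limitations) is to pin a client’s reads to the primary for a short window after it writes. This is just a sketch; the lag window value and the dict-based tracking are assumptions, not a standard mechanism.

```python
# "Read your own writes" routing sketch: after a client writes, route its
# reads to the primary for a short window so replicas can catch up.
# REPLICA_LAG_WINDOW is an assumed value — tune to observed replication lag.
import time

REPLICA_LAG_WINDOW = 1.0  # seconds
last_write_at = {}        # client_id -> timestamp of that client's last write

def record_write(client_id):
    last_write_at[client_id] = time.time()

def choose_db(client_id):
    wrote_recently = time.time() - last_write_at.get(client_id, 0) < REPLICA_LAG_WINDOW
    return "primary" if wrote_recently else "replica"

record_write("alice")
print(choose_db("alice"))  # → primary  (just wrote; replica may be stale)
print(choose_db("bob"))    # → replica  (no recent writes)
```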
- Cache
- Description:
- Very similar to a NoSQL database in structure, as a key-value store, but typically stores frequently accessed keys somewhere in memory so responses are faster.
- Generally placed in front of your DB layer, so it can process the requests first, in case a hit exists.
- Can generally be easily scaled, making it a great option for distributed systems.
- There are different types of caches:
- Write Around → writes go straight to the DB, bypassing the cache. Reads check the cache first; on a miss, read from the DB and write that value into the cache.
- Write Back → write to the cache, and in batches / eventually, write back to the DB
- Write Through → write to the cache and immediately write to the DB.
- When:
- If you have some subset of data that is frequently accessed, you should use a cache.
- If you need to prioritize latency, caches can help.
- If you need to reduce strain on a database, you can use a cache as an alternative to DB replication.
- If you have a read-heavy system, caches are good.
- Limitations:
- If your data accesses are uniform (no hot keys), the hit rate will be low and the cache can actually hurt your latency due to the extra lookup step. An eviction policy only helps if some keys really are hotter than others.
- Caches are usually in-memory (this is why they speed things up). This means they’re not durable. If something happens to the cache, you lose your cache data.
- Caches may cause consistency concerns
- Can be a single point of failure. You can replicate caches to help with this.
- Extra:
- Try to keep expiry values on cache items so that they properly get removed after some amount of time.
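The write-around read path plus the expiry advice above can be sketched with a dict standing in for the cache and a function standing in for the real database (all names here are made up):

```python
# Cache-aside/read-through sketch with per-key expiry (TTL).
# `cache` and `db_read` are stand-ins; the TTL value is an assumption.
import time

CACHE_TTL = 60.0  # seconds before a cached entry is considered stale
cache = {}        # key -> (value, expires_at)

def db_read(key):
    return f"row-for-{key}"  # stand-in for an actual DB query

def get(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                            # cache hit: served from memory
    value = db_read(key)                           # miss or expired: go to the DB
    cache[key] = (value, time.time() + CACHE_TTL)  # populate cache with fresh expiry
    return value

print(get("user:42"))  # miss → reads DB, populates cache
print(get("user:42"))  # hit → served from the cache
```

The TTL is also what bounds the consistency concern noted under Limitations: a stale entry can live at most `CACHE_TTL` seconds.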
- CDN (Content Delivery Network)
- Description:
- Like a cache, but a set of servers that are geographically spread out and serve static content.
- Leverages server geolocation to provide speed ups.
- When user requests certain data, the CDN is hit first. If the CDN has a cache miss, the CDN will get data from your server and then place it in the cache. If another user tries to hit that same data, it will then be served from the CDN. (TTL can be sent from the server to the CDN)
- When:
- You have users across the world
- You have to serve a bunch of static content (think videos)
- You need to prioritize latency
- You need to reduce strain on your own servers
- Limitations:
- Can be expensive
- A long cache expiry time means that you may have changes to your website or content that don’t get reflected to users because your cache is technically outdated.
- Clients may need to handle direct routing if the CDN is down
- Extra:
- Dynamic CDNs exist that can serve HTML based on request params.
- I think some CDNs also allow you to provide some starting data rather than need to go full cache mode
- Message Queues + Pubsub
- Description:
- Durable component (messages are typically persisted so they survive crashes and restarts) that allows for async communication.
- Serves as a buffer between your producers and your consumers. When consumers are ready to process messages, they will be able to grab from the queue.
- When:
- You want decoupling between your producer and the consumer. The queue allows for processing/storage when either one (or both) are down.
- You want to handle bursts and you need some form of auto scaling. If the queue reaches some size, you can scale up workers. If it reduces heavily you can scale down the workers.
- Limitations:
- Bad for latency / consistency but good for availability
- Extra:
- pub/subs are very similar to message queues. Queues are one-to-one (producer to consumer), but pub/subs are one-to-many (one producer broadcasts to many consumers). In general, they provide very similar benefits.
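The producer/consumer decoupling described above can be sketched with Python’s thread-safe `queue.Queue` standing in for a real broker (the message names are invented). The producer can burst messages faster than the consumer drains them; the queue absorbs the difference.

```python
# Decoupled producer/consumer via queue.Queue, standing in for a real
# message broker. The consumer drains at its own pace.
import queue
import threading

q = queue.Queue()
processed = []

def consumer():
    while True:
        msg = q.get()
        if msg is None:            # sentinel value: shut down the worker
            break
        processed.append(msg.upper())  # stand-in for real message handling
        q.task_done()

t = threading.Thread(target=consumer)
t.start()
for msg in ["order-1", "order-2", "order-3"]:  # producer bursts messages
    q.put(msg)
q.put(None)  # tell the consumer to stop
t.join()
print(processed)  # → ['ORDER-1', 'ORDER-2', 'ORDER-3']
```

The queue size (`q.qsize()`) is also the signal you’d use for the autoscaling idea mentioned above.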
- Database Scaling (mostly Horizontal, but Vertical covered as well)
- Description:
- The process of adding more resources vertically (when you beef up the hardware you’re using) or horizontally (when you use multiple pieces of hardware to solve the same problem)
- Horizontal scaling of databases is usually done via sharding: splitting data on a particular field (the shard key). Based on that field’s value, requests are routed to the right shard.
- When:
- Vertical scaling when the data size is not too large, although this rarely happens if you’re designing a system to scale very large.
- Horizontal scaling when the size of your data doesn’t reasonably fit into a single database.
- Limitations:
- Vertical scaling is not generally suggested, as it’s still a single point of failure, and there are hardware limitations as you reach a certain scale. However, if the size of the data allows for it, using a single machine and vertically scaling can be beneficial, as you may not need to worry about partition tolerance.
- Horizontal scaling introduces a bit of latency, as instead of going to a single DB, we need to identify which DB to go to (may be an extra request to a server).
- Joins can become expensive if you’ve sharded your database.
- Extra:
- Sharding key choice is important (partition key). Usually you want to make sure it can be evenly distributed.
- Resharding can help solve issues down the line if a specific shard is getting too much traffic.
- Denormalizing the db allows for querying to happen on a single table, although I need to learn more about how this actually happens.
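The shard-key routing described above can be sketched as a stable hash of the key mod the number of shards (shard names here are placeholders; this is the simple scheme that consistent hashing later improves on when shards are added/removed):

```python
# Hash-based shard routing sketch: hash the shard (partition) key to pick
# one of N shards. Shard names are placeholders.
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key):
    # md5 gives a hash that's stable across processes/restarts,
    # unlike Python's built-in hash() (which is randomized per process)
    digest = hashlib.md5(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

target = shard_for("user:1001")
assert shard_for("user:1001") == target  # routing is deterministic
```

Note how an uneven shard key (say, country, with most users in one country) would overload one shard — hence the “evenly distributed” advice above.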
This point marks the end of Chapter 1 of Alex Xu’s first book. I’m going to just keep adding random structures and concepts that I think are relevant to the sys design interview below.
Neetcode’s Sys Design For Beginners
- Consistent Hashing
- Description:
- A practical application of hashing. Generally good for when you want to figure out which server to route requests (or store keys) to.
- Imagine a circle, and our nodes are placed throughout this circle. We have a hash function that can map to points on that circle. If we have a request and we want to figure out which node it should be routed to, we hash the request, get the value, place it on the circle, and find the first node clockwise after that point. That node is the one that gets the request.
- Adding another node to your cluster is easy. We can just drop a new node on the circle, and at that point forward, certain requests will be served to that node.
- When:
- You require uniform load across your servers
- You want a solution that allows for high availability:
- Consistent hashing helps with single points of failure
- Allows you to add more servers as needed
- You have caches and want to minimize the number of invalidations that happen
- Limitations:
- Consistent hashing arises as the solution to limitations with simple hashing. In simple hashing, if a server goes down, the modular arithmetic changes since your number of servers changes (e.g., with 3 servers we perform a mod 3 operation; lose one and everything becomes mod 2). This means we have to perform remapping, logic changes, etc. With consistent hashing, requests going to server 2 before will still go to server 2 after; with normal hashing, those values may change.
- You don’t always need consistent hashing. Sometimes, round robin is perfectly fine (imagine a case where you don’t have caches / data distribution)
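The circle picture above can be sketched as a minimal hash ring (no virtual nodes, which real implementations add for better balance; node names are made up). Keys and nodes both hash onto the circle, and a key goes to the first node clockwise from its position:

```python
# Minimal consistent-hash ring sketch: nodes and keys hash onto a circle;
# a key belongs to the first node clockwise from its hash point.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, h) % len(self.ring)  # wrap around the circle
        return self.ring[i][1]

    def add_node(self, node):
        # dropping a new node on the circle only remaps the keys that
        # now fall just before it; everything else stays put
        bisect.insort(self.ring, (self._hash(node), node))

ring = HashRing(["node-a", "node-b", "node-c"])
before = ring.node_for("some-request")
ring.add_node("node-d")
after = ring.node_for("some-request")  # unchanged unless node-d landed in between
```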
- Extra:
- Object Storage / BLOB storage
- Description:
- A giant distributed key-value store. Provides the facade of a file system, but is actually just storing paths as the keys (example: s3://bucket-name/folder1/folder2/file.txt) and the actual data as the values.
- When:
- You have giant binary data that doesn’t make sense to keep in a database
- You want durability
- You want to reduce strain on your DB
- You can access the blob storage directly through http requests
- Limitations:
- Not really made for optimized reads / writes. We have to make network requests to get our data
- Extra: