Twitter - System Design
First Created: 2021-01-23 23:10
A read-heavy system (compared to write)
2. Users are able to follow/unfollow others
3. The service should display all of a user's posts in chronological order on his/her main page
4. A user can view posts from the users he/she follows
Non-functional requirements
Highly available
Eventual consistency is acceptable, with a latency of around 200 ms, for example.
Each tweet has a limit of 140 characters.
Daily Active Users: 200 million
100 million new tweets every day = about 1,160 write QPS on average (100,000,000 / 86,400 s)
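The write-QPS estimate is the daily tweet volume divided by the number of seconds in a day. A minimal sketch of that arithmetic follows; the helper name `average_qps` is hypothetical, and it assumes traffic is spread evenly over the day (real peaks would be higher).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_qps(events_per_day: int) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


# 100 million new tweets per day, as stated in the notes.
write_qps = average_qps(100_000_000)
print(round(write_qps))  # 1157
```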
System API
POST
post_tweet(api_dev_key, tweet_text, location, imageId, videoId)
get_tweets(api_dev_key, max_number_to_return, next_page_token)
follow(userId)
202 Accepted
200 OK
Schema
User Profile DB
userId (Primary Key) | Name | Description | PhotoId | Email | DateOfBirth | CreationDate | Tweets[] | favoriteTweets[]
Tweet DB
TweetId (Primary Key) | userId | Text | ImageIds[] | VideoIds[] | Latitude | Longitude | timestamp | Hashtag | originTweetId (for re-tweets)
Follow DB (if SQL)
userId1, userId2 (Primary Key)
Tweet Favorite DB (if SQL)
FavoriteId (Primary Key) | TweetId | userId | timestamp
ImageDB (same as video)
ImageId (Primary Key) | image_url
Followee DB (especially for hot users)
userId (Primary Key) | userIds []
FANOUT Service
Async
Non-celebrity:
Save tweet to DB/Cache
Fetch all the followers that follow user A
Inject this tweet into all the followers' queues/in-memory timelines
Finally, all the followers can see this tweet in their timelines
Celebrity:
Pre-compute timelines from non-celebrity tweets only
Mix in the celebrity tweets at request time, when the user loads the timeline
Do not pre-compute for inactive users (no login for more than 15 days)
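The two fanout paths above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the names (`follow`, `post_tweet`, `read_timeline`), the follower threshold that defines a celebrity, and the timeline length are all hypothetical, and a real service would use queues and a cache rather than Python dictionaries.

```python
from collections import defaultdict, deque

CELEBRITY_FOLLOWER_THRESHOLD = 3  # hypothetical cutoff, tiny for the demo
TIMELINE_LIMIT = 100              # hypothetical cap on pre-computed entries

followers = defaultdict(set)          # author -> users following the author
following = defaultdict(set)          # user -> authors the user follows
tweets_by_author = defaultdict(list)  # author -> [(timestamp, text)]
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))  # pre-computed


def follow(follower, author):
    followers[author].add(follower)
    following[follower].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_FOLLOWER_THRESHOLD


def post_tweet(author, timestamp, text, active_users):
    """Save the tweet, then push it to followers unless the author is a celebrity."""
    tweets_by_author[author].append((timestamp, text))
    if is_celebrity(author):
        return  # celebrity tweets are pulled at read time instead
    for user in followers[author]:
        if user in active_users:  # skip inactive users
            timelines[user].appendleft((timestamp, author, text))


def read_timeline(user, limit=10):
    """Merge the pre-computed timeline with celebrity tweets at request time."""
    merged = set(timelines[user])  # a set avoids duplicates after the merge
    for author in following[user]:
        if is_celebrity(author):
            merged.update((ts, author, text) for ts, text in tweets_by_author[author])
    return sorted(merged, key=lambda entry: entry[0], reverse=True)[:limit]


# Usage: "star" has three followers and is treated as a celebrity.
follow("bob", "alice")
for fan in ("bob", "carol", "dave"):
    follow(fan, "star")
post_tweet("alice", 1, "hi", active_users={"bob"})
post_tweet("star", 2, "hello", active_users={"bob"})
print(read_timeline("bob"))
```

Only the non-celebrity tweet is written into bob's pre-computed timeline; the celebrity tweet appears when the timeline is read.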
Infra
[Clients] -> [Load Balancers] -> [Application Servers] <-> [Databases / File Storages]
Sharding
1. By userID: hot user issue / non-uniform distribution of tweets
2. By TweetID: every read has to query all the databases; the app server aggregates and merges the results and returns the top tweets.
3. By CreationTime: quick to find the latest tweets, but queries only hit the small set of servers holding recent tweets. Not suggested.
4. Combine the timestamp into the TweetID: epoch time + auto-incrementing sequence. 64 bits = 8 bytes.
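A time-prefixed ID from option 4 might be built as below. The bit split is an assumption (the notes only say epoch time plus an auto-incrementing number in 64 bits): here the low 17 bits hold a sequence and the remaining high bits hold epoch seconds. The function names are hypothetical.

```python
import itertools
import time

SEQUENCE_BITS = 17  # assumed width of the auto-incrementing part
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1
_counter = itertools.count()


def make_tweet_id(epoch_seconds=None):
    """Pack epoch seconds (high bits) and a sequence number (low bits)."""
    if epoch_seconds is None:
        epoch_seconds = int(time.time())
    # The sequence wraps after 2**17 IDs, so ordering within one second
    # only holds below that rate.
    sequence = next(_counter) & SEQUENCE_MASK
    return (epoch_seconds << SEQUENCE_BITS) | sequence


def tweet_id_timestamp(tweet_id):
    """Recover the epoch seconds embedded in a tweet ID."""
    return tweet_id >> SEQUENCE_BITS


first = make_tweet_id(1_000)
second = make_tweet_id(1_001)
```

Because the timestamp occupies the high bits, sorting by ID also sorts by creation time, so the latest tweets can be found without a separate index on the timestamp.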
Cache
- improve read performance and reduce database pressure
- least recently used (LRU)
- try to cache the 20% of tweets that account for 80% of read traffic over the past 3 days (this sets the cache size)
- because each server has a limit on the number of connections, the cache should be split across multiple servers
- celebrities' timelines should be kept in the cache
- key: userId, value: tweets (doubly linked list, kept in descending time order)
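A minimal LRU sketch for the userId -> tweets cache described above. The class name `TimelineCache` and the capacity are hypothetical; `collections.OrderedDict` stands in for the usual hash map plus doubly linked list, and a production cache would live in a dedicated cache tier rather than in process memory.

```python
from collections import OrderedDict


class TimelineCache:
    """LRU cache mapping user_id -> list of tweets (newest first)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, user_id):
        if user_id not in self._entries:
            return None  # cache miss: the caller falls back to the database
        self._entries.move_to_end(user_id)  # mark as most recently used
        return self._entries[user_id]

    def put(self, user_id, tweets):
        self._entries[user_id] = tweets
        self._entries.move_to_end(user_id)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used


cache = TimelineCache(capacity=2)
cache.put("u1", ["tweet-b", "tweet-a"])
cache.put("u2", ["tweet-c"])
cache.get("u1")              # touching u1 makes u2 the eviction candidate
cache.put("u3", ["tweet-d"])  # evicts u2
```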
Monitor/Metrics
Number of tweets
Latency
Trending Topics/Top news
By search queries / hashtags / re-tweets
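As one illustration of the hashtag signal, trending topics could be approximated by counting hashtags over a recent batch of tweets. This is a sketch under assumptions: the function name `trending_hashtags` is hypothetical, each hashtag is counted once per tweet, and search queries and re-tweets are ignored.

```python
import re
from collections import Counter

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def trending_hashtags(tweet_texts, top_n=3):
    """Return the top_n (hashtag, tweet_count) pairs, case-insensitively."""
    counts = Counter()
    for text in tweet_texts:
        # A set counts each hashtag at most once per tweet.
        counts.update({tag.lower() for tag in HASHTAG_PATTERN.findall(text)})
    return counts.most_common(top_n)


sample = ["#Python rocks", "learning #python and #design", "no tags here"]
top = trending_hashtags(sample, top_n=1)
```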
Search
API : search_tweets(api_dev_key, key_word, user_location, sort_method, max_number_to_return, next_page_token)
- Schema
ID, Word/Term, Document IDs (inverted full-text index)
[Index Server] index read / index update
- Partition by Term/Word: 1. some words map to a very large number of document (status) IDs 2. a query for a word is served by only one server, which can become a hot spot
- Partition by Status ID: each query goes to all the servers, and the results are aggregated before being returned to the user.
- Fault Tolerance
Every index server has a backup (a secondary kept in sync on a different rack in the same data center).
Use an [Index-Builder Server] to rebuild a dead index server
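The inverted index partitioned by status ID can be sketched as follows. The class names, the modulo placement of statuses on shards, and the whitespace tokenizer are all assumptions made for illustration; results are returned with the highest status ID first, which means newest first only if IDs embed the creation time.

```python
from collections import defaultdict


class IndexShard:
    """One index server: term -> set of status IDs stored on this shard."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, status_id, text):
        for term in text.lower().split():
            self._postings[term].add(status_id)

    def search(self, term):
        return self._postings.get(term.lower(), set())


class ShardedIndex:
    """Partition by status ID: writes go to one shard, reads go to all."""

    def __init__(self, shard_count):
        self._shards = [IndexShard() for _ in range(shard_count)]

    def add(self, status_id, text):
        self._shards[status_id % len(self._shards)].add(status_id, text)

    def search(self, term, limit=10):
        hits = set()
        for shard in self._shards:  # scatter the query to every shard
            hits |= shard.search(term)
        return sorted(hits, reverse=True)[:limit]  # gather and rank


index = ShardedIndex(shard_count=3)
index.add(1, "hello world")
index.add(2, "hello twitter")
index.add(3, "system design")
print(index.search("hello"))  # [2, 1]
```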
Reference
[1] https://github.com/donnemartin/system-design-primer/tree/master/solutions/system_design/twitter
[2] https://medium.com/@narengowda/system-design-for-twitter-e737284afc95
[3] https://www.educative.io/courses/grokking-the-system-design-interview/xV9mMjj74gE