System design interview: how to design a twitter search system

Xiaoyu
2 min readJul 4, 2020

--

大家美国国庆节快乐!川建国在Mount Rushmore又是一顿脱口秀,平头老百姓的最大的体验就是:在美国工作,假期真是多呀,🙃

今天聊得话题Lucene,倒排索引,一下子让我的记忆拉回到了15年前,李开复从微软到谷歌,然后到处在学校演讲布道diss百度,这个词儿就是从那听来的,呵呵. 15 年之后居然还能用来应付面试,真是万金油呀

Baozi System design interview: how to design a twitter search system

Blogger: https://blog.baozitraining.org/2020/07/system-design-interview-how-to-design.html

Youtube: https://youtu.be/c8e66QFjik8

博客园: https://www.cnblogs.com/baozitraining/p/13176152.html

B站: https://www.bilibili.com/video/BV1TC4y1Y7hn/

Methodology: READ MF!

[Originally from the Post: System design interview: how to design a chat system (e.g., Facebook Messenger, WeChat or WhatsApp)]

Remind ourselves with the “READ MF!” methodology.

Requirements

Simple text based search with “AND” or “OR” support. E.g., give me all the tweets that have words “Black” AND “Life” in it sorted by published time in descending order.

search(api_dev_key, search_terms, maximum_results_to_return, sort, page_token)

A follow up question would be sort those results in other orders, E.g., give me all the tweets that have words “Black” AND “Life” in it sorted by the most commented tweets in descending order.

Estimation

  • Storage: assuming twitter has 1B users, 50% of them post a tweet per day, each tweet is 140 char = 140 Byte. Each day, 1B * 0.5 * 280(roughly 300)Byte = 150GB data
  • Write QPS = 1B user * 0.5 post a tweet per day / 86400 = 5787 QPS = 6000 QPS
  • Query QPS = assuming 3x than write QPS = 18000 QPS

Write QPS in general we don’t worry too much since it could be processed asynchronously. 18000 QPS for read is easy to handle as long as we have enough number of boxes and most of query would be returned via memory (not hitting disk).

Key designs and terms

You can also divided the system as usual into three major layers, presentation, service and data layer.

A few key terminologies.

If the follow up question, we just need to build another index given the sort key or we could fetch the results on flight and sort (local optimal). Follow “Index -> Search -> Rank” steps.

Baozi Youtube Video

References (Credits to original authors)

--

--