Scalability at YouTube

Problems YouTube was facing

First, I will talk about the problems YouTube was facing. YouTube has faced many problems in its lifetime so far, and the main ones concerned scalability. The main cause was its large and rapid growth: YouTube went from a small website to a technology giant and a major player in the IT industry practically overnight.
The solution to dealing with this large and sudden growth was to keep things simple. Simplicity was the approach used to conquer many of the problems: look for the simplest solution to a problem. The problem faced may have been complex, but the best solution to even the most complex problem may be the simplest one. If it is not, the complexity of the solution can evolve over time. The main problem YouTube faced was that, in the beginning, they didn’t keep things simple.
It is now the lesson they give: “Tao of YouTube: choose the simplest solution possible with the loosest guarantees that are practical. The reason you want all these things is you need flexibility to solve problems. The minute you over specify something you paint yourself into a corner. You aren’t going to make those guarantees. Your problem becomes automatically more complex when you try and make all those guarantees. You leave yourself no way out.” -Mike Solomon

Scalability Techniques used
The scalability techniques used in the growth and development of YouTube are not new techniques; a few simple ideas can be applied in many different ways. Divide and conquer is one of them. It means that all work is broken down, or partitioned, into small pieces that are then processed separately. A perfect example is the web tier: you have a lot of web servers that are close to identical, and you grow them horizontally.
You are simply adding more servers to distribute the workload among a large group of web servers. At YouTube the use of Python makes this easier because of the dynamic nature of the language. In other words, “no matter how bad your API is you can stub or modify or decorate your way out of a lot of problems”. Another scalability technique used by YouTube is to cheat a little: instead of the data being exactly correct, it is approximately correct.
What is meant by cheating is that not all of their servers are consistent at the same time. A good example is a user writing a comment on a video. Instead of deploying that comment to every server in the network at once and keeping all data consistent, YouTube built the system so that it does not need globally consistent transactions.
This means that when a user writes a comment on a video, he or she sees that comment instantly, while someone on the other side of the world may not see it for a couple of milliseconds. That is the cheat. The data will be consistent in the end. Comments are not financial transactions, so the delay is acceptable, but it is essential to know when you can cheat: if a user is purchasing something, eventual consistency is not good enough and the data must be consistent straight away.
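The comment example can be sketched in a few lines of Python. This is only a toy model written for illustration, not how YouTube stores comments: the `Replica` and `CommentStore` classes, the replica names and the in-memory queue are all assumptions of mine. It shows the author's replica being updated immediately while other replicas catch up later.

```python
from collections import deque


class Replica:
    """One copy of the comment data (hypothetical stand-in for a server)."""

    def __init__(self, name):
        self.name = name
        self.comments = []


class CommentStore:
    """Toy eventually consistent store: a write lands on one replica first."""

    def __init__(self, replica_names):
        self.replicas = {name: Replica(name) for name in replica_names}
        # (target replica name, comment) pairs that have not been applied yet
        self.pending = deque()

    def post(self, local, comment):
        # The author's own replica is updated right away.
        self.replicas[local].comments.append(comment)
        # Every other replica receives the comment later, via the queue.
        for name in self.replicas:
            if name != local:
                self.pending.append((name, comment))

    def read(self, name):
        return list(self.replicas[name].comments)

    def propagate(self):
        # Apply all queued updates; afterwards every replica agrees.
        while self.pending:
            name, comment = self.pending.popleft()
            self.replicas[name].comments.append(comment)


store = CommentStore(["europe", "asia"])
store.post("europe", "Great video!")
before = store.read("asia")   # the remote reader does not see it yet
store.propagate()
after = store.read("asia")    # now both replicas hold the comment
```

In a real system the propagation step would run in the background; here it is called by hand so the gap between the two reads is easy to see.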
Another main part of the cheating technique is knowing when to fake data. As the saying goes, “The fastest function call is the one that doesn’t” happen. A good example is video views. You could count every view and do a transaction on every update, or you could simply do a transaction every once in a while and update the count by a random amount. As long as the numbers change, users will probably believe the count is real. So you have to know how to fake data and which data can be faked.
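One way to picture the view-count cheat is sampled counting, which is a slightly more principled variant than a purely random increment and is my own substitution, not something the talk specifies. In this sketch only about one view in `sample_every` triggers a (simulated) database write, and each write credits `sample_every` views. The class name, the parameter and the in-memory `stored_count` are all hypothetical.

```python
import random


class SampledViewCounter:
    """Approximate counter: about 1 in `sample_every` views causes a write."""

    def __init__(self, sample_every=100, rng=None):
        self.sample_every = sample_every
        self.rng = rng if rng is not None else random.Random()
        self.stored_count = 0   # stand-in for the value kept in the database
        self.writes = 0         # number of simulated database transactions

    def record_view(self):
        # With probability 1/sample_every, write once and credit
        # sample_every views, so the count is right on average.
        if self.rng.randrange(self.sample_every) == 0:
            self.stored_count += self.sample_every
            self.writes += 1


counter = SampledViewCounter(sample_every=100, rng=random.Random(42))
for _ in range(100_000):
    counter.record_view()

print(counter.stored_count, counter.writes)
```

The stored count is only approximately equal to the true number of views, but the number of writes drops by roughly a factor of `sample_every`, which is the trade the essay describes.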
Another technique is expert tweaking, in other words recognising that not all data needs the same level of consistency. For example, we can ask: is it good enough for comments to be eventually consistent? The answer is yes. Comments are an important part of YouTube, but eventual consistency is a good enough model for them. If the data were a financial transaction it would not be good enough, so different consistency models are needed for different kinds of data. One of the main techniques used, and one that is always a hot topic at YouTube, is jitter.
If your system does not jitter, then you get thundering herds. The thundering herd problem occurs when a large number of processes waiting for an event are all woken when that event occurs, but only one process can proceed at a time. Jitter is used to solve this problem. Caching is a good example of how it occurs: suppose the most viewed or popular videos on YouTube are cached for 24 hours. When the cache expires, it expires on every machine at the same time, and this creates a thundering herd.
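A minimal sketch of jittered cache expiry, assuming the 24-hour base lifetime from the example and the 18 to 30 hour window this section describes as the fix. The constant and function names are hypothetical, and the sketch only computes the lifetime; it is not tied to any particular cache library.

```python
import random

BASE_TTL_HOURS = 24   # the fixed lifetime from the example
JITTER_HOURS = 6      # spread of 6 hours either side: 18 to 30 hours


def jittered_ttl_seconds(rng=random):
    """Return a cache lifetime drawn uniformly from the 18-30 hour window."""
    hours = rng.uniform(BASE_TTL_HOURS - JITTER_HOURS,
                        BASE_TTL_HOURS + JITTER_HOURS)
    return hours * 3600


# Each machine (or each cache entry) picks its own lifetime, so the
# expirations are spread out instead of all landing at the same moment.
ttls = [jittered_ttl_seconds() for _ in range(1000)]
```

Because every entry gets a different lifetime, the refreshes arrive spread over a twelve-hour window rather than as a single burst.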
Jittering prevents this from happening. If cache expiration is set to a random time between 18 and 30 hours, each machine’s cache expires at a different time, which stops the work from stacking up.

Technology infrastructure used

The technology infrastructure used within YouTube has remained simple throughout the lifetime of the website. The bulk of the infrastructure is coded in Python; because Python is a dynamic language, it can be used for most parts of the site.
All of the prototypes for YouTube were written in Python and had a long life span. To this day YouTube contains 1 million lines of Python code, and if you watch a video on YouTube, a bunch of lines of Python code are executed. Another surprising part of the technology used at YouTube is MySQL. MySQL is used a lot: when a video is watched, all the data for that video comes from a MySQL database or a blob store, depending on the particular data being requested.
As I have said previously, YouTube likes to keep things simple, and this shows again in the use of Apache. The very reason YouTube uses Apache is that its creators keep things really simple, and any request sent to YouTube goes through Apache. Another technology used is Vitess, a newly released project and a really high-tech piece of kit. It is a front end to MySQL that can do a lot of optimization on the fly; it can also rewrite queries, and it acts as a proxy.
Vitess currently serves every YouTube database request, so it is a huge part of the query and search functionality at YouTube. It is a remote procedure call based technology. As always, YouTube has mostly stuck to simple and, in a lot of cases, widely used technologies and tools, the main one being Linux. The benefit YouTube has seen with Linux is that, no matter how badly your application is functioning, you can look at what is happening underneath using tools like strace and tcpdump.
Spitfire is a templating system that is used by YouTube. It has an abstract syntax tree that allows them to do transformations that make things go faster.

Your own personal Analysis

My own personal analysis of how YouTube has handled scalability is simple: YouTube has handled, and continues to handle, large growth exceptionally well. From watching the video it is clear that much of the growth and expansion of YouTube was dealt with by keeping things simple and basic.
It is remarkable that YouTube still runs on what could be called a basic system and that the majority of the code originally written is still in use; for example, when you view a video, a bunch of Python code is executed to make that happen. Another surprising use of simplicity is MySQL. It surprises me that such a large website can run so smoothly on an open source database, when it seems that large businesses would usually migrate to a better-supported way of storing data.
In terms of scalability I have to agree with Mike Solomon: a scalable system is one that is not in your way, a system that you are completely unaware of. This is not just buzzwords. If a system is built to scale correctly, then as it expands or grows with the natural increase in traffic, the growth should be seamless. The system should not get in your way or require drastic changes in order to handle a little extra traffic. This seems to be the ethos, or common practice, at YouTube.
It remains my opinion that, no matter how large YouTube grows, its simple nature and approach mean it will be more than able to handle the growth.