Website Latency Tips and the Path to Faster and More Scalable Websites
If you go through the guides and blog posts published on the iWeb blog over the past three months, you will find many articles on how to set up a website, how to choose a dedicated server, how to transfer your website to a new host, etc.
So far, there have been no articles covering advanced setups or tips on how to scale a website efficiently. The reason is simple: from experience, it’s better to just launch a website and then optimize only when bottlenecks and performance problems occur. Most development teams follow this principle, both because doing otherwise means optimizing their setup prematurely and because every website has different needs and thus different problems to solve.
Below, for instance, is the graph of delicious’s homepage being loaded:

It took about 1.85 sec to render the website, whereas other websites such as flickr.com target total loading times of 250ms. The graph above shows that the server response took 1.3 sec, which is roughly 70% of the total time. This means the bottleneck is either the DNS server or the delicious servers themselves, which may have been handling too much traffic and queuing user requests.
Here’s a graph for another website (TechEntreprise):

The total response time is similar to delicious’s, at 1.83 sec; however, the breakdown is very different: the page itself is loaded in less than 100ms, but static files such as pictures, CSS, or JavaScript take up the remaining time. Asset delivery should therefore be optimized on this website, by using compression, using fewer static files, or using special hosting solutions to make the responses faster.
During the lifetime of a website, a development team must therefore track those metrics and optimize iteratively, each time on a different bottleneck. Problems can occur in:
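The two readings above can be turned into a quick check. The following is a minimal sketch, not a measurement tool: the function name and the 50% threshold are assumptions made for illustration, and 0.1 sec is used as an upper bound for the "less than 100ms" figure.

```python
def latency_breakdown(total_s, server_response_s):
    """Split a page load into server-response time and everything else."""
    if total_s <= 0 or not 0 <= server_response_s <= total_s:
        raise ValueError("expected total_s > 0 and 0 <= server_response_s <= total_s")
    server_share = server_response_s / total_s
    return {
        "server_share": server_share,
        "assets_share": 1 - server_share,
        # Assumed rule of thumb: look first at whichever part dominates.
        "look_first_at": "server response" if server_share >= 0.5 else "static assets",
    }


# Figures quoted in the article for the two example sites.
delicious = latency_breakdown(total_s=1.85, server_response_s=1.3)
techentreprise = latency_breakdown(total_s=1.83, server_response_s=0.1)

print("delicious:", delicious["look_first_at"], round(delicious["server_share"], 2))
print("TechEntreprise:", techentreprise["look_first_at"], round(techentreprise["server_share"], 2))
```

With the article's numbers, the first site is dominated by the server response and the second by static assets, which is why the two call for different optimizations.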
- DNS Servers
- Front-End Servers Capacity
- Application Servers speed and capacity
- Back-end and database servers speed
- Static file servers
1. DNS Servers
When a new user wants to visit a website that they haven’t visited recently, a DNS query must happen first. DNS queries can add a noticeable delay if the visitor is on another continent or if you have slow DNS servers. Learnhub, a website made by a company from Toronto, saw for instance DNS response times of up to 500ms, and switched to Dynect for ultra-fast DNS hosting. The graph below shows the improvement after they made the switch:

The response time is now about a third of what it was for Learnhub, a noticeable improvement for its massive user base in India.
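To get a rough idea of how long name resolution takes from a given machine, a small timing helper can be enough. This is only a sketch: `time_dns_lookup` is a hypothetical helper name, the lookup goes through the operating system's resolver, and repeated lookups are usually answered from a cache, so only the first measurement for a name reflects a real query.

```python
import socket
import time


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (elapsed_ms, addresses) for one name resolution."""
    start = time.perf_counter()
    results = resolver(hostname, 80)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the IP address.
    addresses = sorted({entry[4][0] for entry in results})
    return elapsed_ms, addresses


if __name__ == "__main__":
    # "localhost" keeps the demo offline; replace it with your own domain.
    elapsed_ms, addresses = time_dns_lookup("localhost")
    print(f"localhost resolved to {addresses} in {elapsed_ms:.1f} ms")
```

Running it from several locations (or from servers close to your users) gives a better picture than a single measurement from the office.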
2. Front-End Servers Capacity
Front-end servers are the first servers to deal with a request: they put it into a queue and then dispatch it to the appropriate server, such as an application server or the application cache (or memcache). Front-end servers can be specialized software (such as haproxy or nginx, which have built-in load-balancing features) or dedicated load-balancing hardware such as the ones here. On more modest websites, Apache or another web server acts as the front-end and the application server at the same time.
In most hosting architectures, the bottleneck is rarely the front-end servers. If it is, that’s often because you didn’t choose the best routing algorithm, for instance choosing round-robin queuing instead of more intelligent load-based queuing. In many other cases, it’s because there are not enough application servers, and the front-end servers are just waiting for requests to be computed. Here is, for instance, how response times change when you add more connections to app servers (with the same load balancer in front):
It’s a significant boost, so improving a website’s latency can be just a configuration change.
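The difference between round-robin and load-based routing can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a model of haproxy or nginx: one request arrives per tick, the workload is made up, and "least connections" stands in for load-based queuing.

```python
import itertools


def simulate(durations, n_backends, policy):
    """Assign one request per tick; return the worst per-backend backlog seen.

    durations[i] is how many ticks request i keeps a backend connection busy.
    """
    finish_times = [[] for _ in range(n_backends)]  # per-backend finish ticks
    round_robin = itertools.cycle(range(n_backends))
    worst = 0
    for tick, duration in enumerate(durations):
        # Drop connections that have finished by this tick.
        finish_times = [[t for t in backend if t > tick] for backend in finish_times]
        if policy == "round_robin":
            target = next(round_robin)
        elif policy == "least_connections":
            target = min(range(n_backends), key=lambda i: len(finish_times[i]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        finish_times[target].append(tick + duration)
        worst = max(worst, len(finish_times[target]))
    return worst


# Hypothetical workload: every other request is slow (8 ticks), the rest take 1 tick.
workload = [8, 1] * 10

for policy in ("round_robin", "least_connections"):
    print(policy, "worst backlog:", simulate(workload, 2, policy))
```

With this workload, round-robin sends every slow request to the same backend, while the load-based policy spreads them out, so the worst backlog is smaller without adding any hardware.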
3. Application Servers speed and capacity
The application server computes the request, for instance delivering a personalized page according to a user’s preferences.
How this is done depends on the technologies used by the website (PHP, Python, Java, Ruby, plus any frameworks used).
If the bottleneck is the application server, there are two paths: either optimize the web app code, or scale it by using more and better hardware.
Optimizing the code is beyond the scope of this article; it involves testing, using patterns and best practices, benchmarking sections of your code, and then refactoring the code for better response time. Go to the resources relevant to your technology stack, benchmark your code, and get help from an experienced engineer or development team.
If you’ve hit the wall in code optimization, you can think about getting beefier servers, finding the best mix of RAM and CPUs, and then using this “base server” to scale horizontally, in clusters. An easy solution for LAMP websites can be seen here.
Many modern websites (put the “web2.0” keyword here) also have advanced features such as user emails and notifications, computation of social graphs, search, messaging, text messages, video transcoding, etc. If you have such functionality, a very quick way to decrease response time is to “outsource” those features to dedicated servers. You can use messaging servers such as ActiveMQ (an Apache project), RabbitMQ, or even kestrel (which Twitter uses) to offload long-running tasks to specialized servers. Handling these tasks asynchronously allows, in theory, near-instantaneous response times, so it’s something to look at as soon as you have more than a couple of dedicated servers.
Caching is also an efficient way to speed up responses, since it prevents requests from hitting the app servers at all. As with web application code, the details depend on the technologies used for your website.
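The idea of offloading slow work can be sketched in a single process. In this illustration an in-memory `queue.Queue` and a background thread stand in for a real message broker and a separate worker server; `send_notification` and `handle_request` are hypothetical names, and a production setup would use the client library of whichever broker you choose.

```python
import queue
import threading
import time

task_queue = queue.Queue()  # stands in for a message broker
sent_notifications = []


def send_notification(user_id):
    """Hypothetical slow task, e.g. sending an email."""
    time.sleep(0.05)  # stands in for slow network or CPU work
    sent_notifications.append(user_id)


def worker():
    """Runs in the background and drains the queue one task at a time."""
    while True:
        user_id = task_queue.get()
        try:
            send_notification(user_id)
        finally:
            task_queue.task_done()


def handle_request(user_id):
    """Request handler: enqueue the slow work and answer right away."""
    task_queue.put(user_id)
    return f"OK, notification for user {user_id} queued"


threading.Thread(target=worker, daemon=True).start()

start = time.perf_counter()
responses = [handle_request(uid) for uid in (1, 2, 3)]
handler_time = time.perf_counter() - start

task_queue.join()  # only for the demo: wait until the worker has finished
print(responses[0], f"(all three handlers took {handler_time * 1000:.1f} ms)")
```

The handlers return without waiting for the 50 ms tasks, which is the effect the user sees; the work itself still has to be done, so the workers need enough capacity to keep up with the queue.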
4. Back-end and database servers speed
Fortunately, optimizing database servers is easier than the above points.
There are known and “battle-tested” solutions, for instance for scaling MySQL, from replication to master-slave setups and load balancing. As with application servers, you can look for the best hardware for the server, using powerful machines with very low access time hard drives. You can see for instance in the following graph how MySQL behaves on different hardware under different loads, and then plan accordingly:

Many web companies also make heavy use of memcached in front of the SQL servers, in order to retrieve frequently accessed objects from memory.
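The usual way to put a cache in front of a database is to check the cache first and only query the database on a miss. The sketch below shows that pattern with a plain dictionary standing in for memcached and a fake lookup function standing in for a SQL query; the class name, the 60-second expiry, and `fake_db_lookup` are all assumptions made for illustration.

```python
import time


class CacheAside:
    """Check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value); stands in for memcached
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read goes back to the database."""
        self._store.pop(key, None)


db_calls = []


def fake_db_lookup(user_id):
    """Stands in for a SQL query; records each call so we can count them."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(fake_db_lookup)
for _ in range(3):
    profile = cache.get(42)
print(profile, "db calls:", len(db_calls), "cache hits:", cache.hits)
```

Only the first read reaches the database; the trade-off is that cached objects can be stale until they expire or are invalidated after a write.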
5. Asset Servers
Asset servers deliver static files, such as pictures, videos, JavaScript files, and other static elements such as Flash animations.
You can tweak your web application to serve fewer files (for instance by combining all JavaScript files into just one file) and to compress files (and then gzip them when serving the request).
Optimizing static file servers is probably the easiest step when scaling and lowering response times.
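Combining files and gzipping the result can be tried out in a few lines. This is a rough sketch of the idea only: the script contents are made up, and in practice the bundling is usually done by a build step and the gzip compression by the web server.

```python
import gzip


def bundle_and_compress(sources):
    """Concatenate JavaScript sources into one bundle and gzip it."""
    # A ";\n" separator guards against files that lack a trailing semicolon.
    bundle = ";\n".join(sources).encode("utf-8")
    return bundle, gzip.compress(bundle)


# Hypothetical scripts, repeated to mimic a larger and repetitive code base.
scripts = [
    "function greet(name) { return 'Hello, ' + name; }",
    "function add(a, b) { return a + b; }",
] * 20

bundle, compressed = bundle_and_compress(scripts)
print(f"{len(scripts)} files -> 1 request, {len(bundle)} bytes -> {len(compressed)} bytes gzipped")
```

One bundle means one HTTP request instead of many, and text assets such as JavaScript and CSS typically shrink considerably under gzip.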
