Using a content distribution network (CDN) or website caching software can tremendously improve the performance of your website by caching content and website assets. This can lead to improved search engine optimisation (SEO) as the website loads quicker. Another beneficial side-effect is that your users remain on your website for longer: when a website takes longer than 3 seconds to load, users generally start to look elsewhere for their content.
A CDN is a network of servers around the world that is designed to cache and serve traffic to users. By leveraging these networks, you are in a position to cache your content in a point of presence (POP) closest to the user. This lowers the latency experienced when loading your content, improving the user-experience of anyone browsing your site. It is generally recommended to use a CDN to serve your static assets at a minimum to improve your website loading speeds due to the global network of higher-capacity servers.
Cached content is a version of your data that is stored in either of the following places:
This can decrease the amount of time it takes for your website to load, and potentially even have it instantly load on the user's browser if they revisit your page at a future date. Due to the content being served elsewhere, you can also decrease the load your server experiences as each user visit doesn't result in another request to your web-servers. You can also save some cost here, as your server does not need to serve traffic for each request.
Traffic from the user will hit a CDN server, which serves cached data back to the user (or first requests the data from your web server if a cached version doesn't exist). This means that the CDN will always sit in between your user and your server. Here's a simplified diagram to show how internet traffic behaves when using a CDN.
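To make the hit / miss behaviour described above a bit more concrete, here is a minimal sketch of a single POP in Python. The `EdgeCache` class, the `origin` function and the 60 second TTL are all made up for illustration; real CDNs are far more involved than this.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy stand-in for one CDN point of presence (hypothetical, for illustration)."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float = 60.0):
        self._fetch_from_origin = fetch_from_origin
        self._ttl_seconds = ttl_seconds
        # Maps URL -> (expiry time, cached body)
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Tuple[str, str]:
        now = time.monotonic()
        cached = self._store.get(url)
        if cached is not None and cached[0] > now:
            # Cache hit: the origin server is never contacted.
            return "HIT", cached[1]
        # Cache miss (or expired entry): ask the origin, then remember the answer.
        body = self._fetch_from_origin(url)
        self._store[url] = (now + self._ttl_seconds, body)
        return "MISS", body


origin_requests = []


def origin(url: str) -> str:
    """Pretend web server that records every request it has to handle."""
    origin_requests.append(url)
    return f"<html>content for {url}</html>"


edge = EdgeCache(origin, ttl_seconds=60.0)
first = edge.get("/index.html")   # goes to the origin
second = edge.get("/index.html")  # served from the edge cache
```

The second request never reaches `origin`, which is where the reduced load on your own servers comes from.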
Wow! "Why don't we just put a CDN in front of my servers and cache everything?", you might ask. Well, that is a good question. You definitely should be using CDNs or caching where you can. Despite this, there are also a multitude of reasons why you shouldn't just enable caching on everything. Here is a small list of things you should think about when configuring caching or CDNs in front of your website.
The biggest issue with caching your content is that you can also inadvertently cache content that you don't want cached, such as personal information (name, address, email, etc.) and even credit-card information. Configuring your caching incorrectly can cause these details to be leaked to other users on your website. There are multiple ways to get around this, one being to disable caching on these pages entirely. Another is to cache on a per-session or cookie basis. This means that the cache will key its entries on the cookies provided by the browser as well as the URL. Content is then still cached for all your users, but only the valid user will be able to access their own cached content / information.
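Both approaches can be sketched in a few lines of Python. Treat this as an illustration under assumptions rather than how any particular CDN does it: the `cache_key` and `cache_control_for` helpers, the `/account` and `/checkout` prefixes and the one hour `max-age` are all hypothetical choices of mine.

```python
import hashlib
from typing import Optional

# Hypothetical path prefixes that serve personal data and must never be shared.
PRIVATE_PREFIXES = ("/account", "/checkout")


def cache_key(url: str, session_cookie: Optional[str] = None) -> str:
    """Build a cache key from the URL and, if present, the session cookie.

    The cookie is hashed so the raw session value is not stored in the key.
    """
    parts = [url]
    if session_cookie is not None:
        parts.append(hashlib.sha256(session_cookie.encode("utf-8")).hexdigest())
    return "|".join(parts)


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header value: no caching at all for private pages."""
    if path.startswith(PRIVATE_PREFIXES):
        return "private, no-store"
    return "public, max-age=3600"
```

With this, two users requesting `/profile` with different session cookies get different cache keys, so one user's page is never served to the other.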
As traffic to your website is now handled by the CDN, your website will not be able to use the usual methods to identify the IP addresses of your clients. Some CDN providers will use the X-Forwarded-For header to provide the IP address of the end user, and you will need to change your logging / website behaviour to account for this. I'll guide you to the docs for NGINX, Apache and IIS as these are the most likely to be used.
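If you handle this in application code rather than in the web server, the logic might look roughly like the sketch below. One caveat: anyone can send an X-Forwarded-For header, so it should only be believed when the request actually came from your CDN. The `client_ip` function and the idea of passing in a `trusted_proxies` list are my own assumptions here; check your provider's docs for their published IP ranges and for any provider-specific header they prefer.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip(xff_header: Optional[str], remote_addr: str, trusted_proxies: Iterable[str]) -> str:
    """Return the most likely end-user IP for a request that came through a CDN.

    remote_addr is the address of the socket peer; trusted_proxies is a list
    of CIDR ranges (hypothetical here) belonging to your CDN.
    """
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False
        return any(ip in net for net in trusted)

    # Only believe the header if the request really came from the CDN.
    if not xff_header or not is_trusted(remote_addr):
        return remote_addr

    # Each proxy appends to the header, so walk from the right and stop at
    # the first address that is not one of our trusted proxies.
    hops = [hop.strip() for hop in xff_header.split(",") if hop.strip()]
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed entry: fall back to the socket peer
        if not is_trusted(hop):
            return hop
    return remote_addr
```

For example, `client_ip("203.0.113.7, 198.51.100.10", "198.51.100.20", ["198.51.100.0/24"])` skips the trusted CDN hop and returns `"203.0.113.7"`.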
A timeout occurs when the website does not respond within a certain amount of time. Configuring this correctly can be tricky, as sometimes it may take longer than expected to process data. General recommendations usually include keeping your content loading under 1 second where possible. Despite best efforts, you might notice longer wait-times on your website. This can be due to a plethora of reasons including (but not limited to) inefficient SQL queries / code, 3rd party APIs your website depends on or even simply a spike in traffic. All of these factors can play a part in your website responding slower than usual. You should consider these factors when configuring your timeouts, and even increase them for specific pages that are known to take longer to respond.
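As a rough sketch of per-page timeouts, the snippet below picks a timeout based on the longest matching path prefix. The paths and the numbers are invented for the example, and most CDNs expose this as configuration rather than code, so treat it purely as an illustration of the idea.

```python
# Hypothetical values: a short default, with longer timeouts for slow pages.
DEFAULT_TIMEOUT_SECONDS = 5.0
TIMEOUT_OVERRIDES = {
    "/reports/": 30.0,
    "/reports/export/": 60.0,
}


def timeout_for(path: str) -> float:
    """Return the timeout for a path, preferring the longest matching prefix."""
    best = max(
        (prefix for prefix in TIMEOUT_OVERRIDES if path.startswith(prefix)),
        key=len,
        default=None,
    )
    return DEFAULT_TIMEOUT_SECONDS if best is None else TIMEOUT_OVERRIDES[best]
```

Keeping the default short and listing the known slow pages explicitly means a new slow endpoint shows up as timeouts rather than quietly tying up connections.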
When a timeout occurs, some CDNs will allow you to retry the request. You might initially think that this is a great idea, but there are some pitfalls you might encounter when doing this. As the retry will cause another HTTP request to trigger, this can cause issues when the specific request involves sending data. If this is not accounted for, duplicate entries can be inserted into the database. Even when retrieving data, the timeout is likely due to the website taking longer than normal to process data. Performing a retry will then likely result in the server queuing the same request multiple times, eventually resulting in a gridlock. As more and more users start to browse your website (or even refresh when errors occur), retries will start to trigger and render the server unable to respond to most of the requests. If this occurs, you will have accidentally executed a distributed denial of service (DDoS) attack against your own website. You will need to consider which pages are allowed to be retried when enabling this option, as these side-effects may result in your website quickly becoming inaccessible with undesirable knock-on effects.
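One way to limit the damage is to only retry requests that don't send data, cap the number of attempts, and wait longer between each attempt. Here is a minimal sketch of that idea; `send_with_retries`, the set of retryable methods and the delay values are my own assumptions, not a setting from any specific CDN.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: only read-only methods are retried, so a retry can never
# insert a duplicate row the way a repeated POST could.
RETRYABLE_METHODS = {"GET", "HEAD", "OPTIONS"}


def send_with_retries(
    method: str,
    send: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying on TimeoutError only for retryable methods."""
    attempts_allowed = max(1, max_attempts) if method.upper() in RETRYABLE_METHODS else 1
    for attempt in range(1, attempts_allowed + 1):
        try:
            return send()
        except TimeoutError:
            if attempt == attempts_allowed:
                raise  # out of attempts: surface the timeout to the caller
            # Exponential backoff: 0.5s, 1s, 2s, ... gives a slow server room to recover.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

A `POST` gets exactly one attempt here, while a `GET` gets up to three with growing pauses in between, which keeps retries from piling onto a server that is already struggling.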
Some CDN providers will allow you to configure a web application firewall (WAF) as part of their service. A WAF applies a set of rules that each request must abide by, which can greatly improve your website security by blocking malicious traffic from ever hitting your servers. The blocked traffic can range from bots and SQL injection attempts to, sometimes, even 0-day attacks.
CDN providers by nature act as a man in the middle for traffic between end users and your servers. This means that the CDN will have access to all traffic being passed through their network in its decrypted state. Due to this, you should only choose reputable providers to serve your traffic as you will need to trust that these providers will not modify content or siphon your data in any way when traversing the internet. It may be tempting to use that random free provider that you found whilst digging online, but you may damage your reputation in irreparable ways. Additionally, if your company must follow regulations such as PCI-DSS then you will definitely need to limit your providers to those that guarantee you remain compliant.
I will only list out providers I have experience with below. The features provided by these providers are generally quite comparable, and decisions will largely depend on personal preference or cost. Please do your own research when choosing a provider and take my suggestions with a grain of salt, as I am not privy to any rules or regulations you are bound by. As a disclaimer, I'm not affiliated with or sponsored by these providers in any way whatsoever and there are other great alternatives out there that I have not listed.
For free options, Cloudflare is generally the first choice for most users. Cloudflare currently spans 200 cities in more than 95 countries.
AWS Cloudfront is powered by AWS's network of 216 POPs around the globe.
Feeling boujee? Akamai is trusted by many large organisations to cache their content. Their network covers over 130 countries within more than 1,700 networks.