I have always wanted a website that is extremely reliable and fast, even under very heavy load. It's very unlikely that I will ever get a considerable amount of traffic, but I still want my site to feel fast and responsive.

In the first part of this post I will talk about how to CDN mask a simple website like this blog. Then I will describe some development considerations to keep in mind while programming a more complex system.

My first approach was to use a static site generator like Hugo or Jekyll. But editing with them doesn't feel very pleasing to my eyes. Obviously, the resulting sites are static: no comments, no forms, nothing. We all know that this kind of site was born from the need to host documentation for developers, and these programs aren't really made for people who have no programming knowledge. In other words, static site generation makes collaboration very difficult.

My second approach was to use WordPress, the most used CMS in the world. But using it turns out to be very expensive. Even though there is an endless supply of WP hosting options, like BlueHost, DreamHost, SiteGround, WPEngine, etc., all of them are expensive. They all say they are cheap, or $1/month, but you can be sure they will end up charging you much more than that.

Even self-hosting a WordPress installation can be hard. It requires you to configure a MySQL server, the PHP/Apache server, and backups. This translates into the need for at least 1GB of RAM on your server. This requirement is simply insane! Do we really need 1GB to run a full-fledged database?

We can get around this by having an SSD VPS and 1GB of swap. But that doesn't sound right to me.

Instead of using WP, I decided to go with Ghost because it's written in Node.js and it can use SQLite as its database. SQLite is just a library that lets you write your data to a file as if it were a normal RDBMS. This means I won't be wasting any extra memory on a database server.

If I only told you to use Ghost because it is cheaper to host and runs 17 times faster than WP, this post would have no value at all. There is another big reason for the choice I made: Ghost lets you connect through multiple domains without any problem.

CDN Masking

Why is being able to connect from any domain important? In most cases it would be very naive to even consider such functionality. But to have the fastest website in the world we need two things:

  • A website that is extremely easy to use
  • Our CMS should allow its masking through a CDN

If a CMS doesn't force a specific domain on the browser, and also lets you configure which domain appears in the content rendered to the user, then and only then do we have a CMS that can be masked through a CDN.

If my web server can be CDN masked, then every end user can receive a cached and geographically replicated version of my website. Geographical replication for a normal blog like this is extreme overkill. But maybe for some reason you need it.

You can also do this with WordPress, but you would need to make some code changes to get it working, and those may not be that easy to do.

The idea is that you will have two domains: example.com and edit.example.com. The first domain will be your CDN distribution, the domain your end users will use to read your content. The second domain, edit.example.com, is where you and your team will connect to edit the content of your main page. edit.example.com is called the origin, and it is the source of truth for your example.com CDN distribution.

With this domain scheme you can also avoid HTTPS on your origin completely if you don't want to set up any SSL certificate there. Why? Because you can reach it through an SSH tunnel. With the command below, the origin's port 80 becomes available on your local port 8080:

ssh -L 8080:localhost:80 user@edit.example.com

If you choose this very effective method for optimizing your website, you need to be very aware of the semantic differences between the GET, HEAD, POST, PUT, PATCH, and DELETE HTTP methods. You also need to understand how CDN caching works. Otherwise you may end up in a situation where your CDN caches critical or confidential information, or where your cookies stop working, among many other problems, on top of the fact that you will likely be unable to log in.
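To make the point about HTTP method semantics concrete, here is a minimal sketch of the kind of check a shared cache applies before storing a response. The function name `is_safe_to_cache` and its exact rules are assumptions for illustration only. It is deliberately conservative and is not how any particular CDN decides.

```python
# Hypothetical helper: a conservative "may a shared cache store this?" check.
SAFE_METHODS = {"GET", "HEAD"}  # methods that only read data


def is_safe_to_cache(method, request_headers, response_headers):
    """Return True only when storing the response in a shared cache looks safe."""
    if method.upper() not in SAFE_METHODS:
        return False  # POST, PUT, PATCH and DELETE change state on the origin

    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}

    # Credentials or a new cookie suggest the response belongs to one user.
    if "authorization" in req or "set-cookie" in resp:
        return False

    # Respect an explicit opt-out sent by the origin.
    directives = {
        d.strip().split("=")[0].lower()
        for d in resp.get("cache-control", "").split(",")
        if d.strip()
    }
    return not directives & {"private", "no-store"}
```

With a check like this, a login POST or a response carrying Set-Cookie never reaches the cache, which are exactly the cases that break logins or leak sessions when they get cached.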

At the end of the day, such a strange optimization should be considered an advanced topic and should only be implemented by a developer.

Price

CDN masking allows you to use a very cheap server and deliver a virtually unlimited amount of traffic without worrying too much.

In my particular case, for this Ghost blog, I use an Amazon EC2 t3a.nano instance. It costs $0.0047 per hour, or $3.38 per month.
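The monthly figure follows from the hourly rate. As a quick check, assuming a 30-day month (720 hours); the helper name `monthly_cost` is only for illustration:

```python
HOURLY_RATE_USD = 0.0047   # t3a.nano price quoted above
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of running one instance non-stop for a month, rounded to cents."""
    return round(hourly_rate * hours, 2)


print(monthly_cost(HOURLY_RATE_USD))  # 3.38
```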

The nicest part of using a CDN like AWS CloudFront is that you are only charged for the traffic you use. If you don't have any visitors you won't spend a penny on the CDN.

Dynamic Content

First read: Why you should not serve an entire site from a CDN.

CDNs won't make your life very easy when dealing with dynamic content. But you can always find a way to hide behind a CDN, even while processing GET requests. If you work with AWS CloudFront you can try:

  • Only cache GET requests.
  • Honor the Cache-Control header.
  • Cookie based cache.
  • Header based cache.
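To picture what the cookie-based and header-based options above mean, it can help to think of the cache as a dictionary whose key is built from parts of the request. The sketch below is only a conceptual model. `cache_key`, `key_headers`, and `key_cookies` are made-up names, and it is not CloudFront's actual implementation.

```python
from urllib.parse import urlsplit


def cache_key(method, url, headers=None, cookies=None,
              key_headers=(), key_cookies=()):
    """Build a cache key from the URL plus selected headers and cookies."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    cookies = cookies or {}
    parts = urlsplit(url)
    key = [method.upper(), parts.path, parts.query]
    # Only the headers and cookies listed explicitly take part in the key.
    key += [f"h:{name.lower()}={headers.get(name.lower(), '')}"
            for name in sorted(key_headers)]
    key += [f"c:{name}={cookies.get(name, '')}"
            for name in sorted(key_cookies)]
    return "|".join(key)


# Without the session cookie in the key, two users share one cache entry.
alice = cache_key("GET", "/cart", cookies={"sessionid": "a"})
bob = cache_key("GET", "/cart", cookies={"sessionid": "b"})
print(alice == bob)  # True
```

Passing `key_cookies=("sessionid",)` gives each session its own entry. That protects per-user pages, at the price of a lower cache hit rate.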

For example, I created a simple Python decorator to set the Cache-Control header in my views:

class ProductSearchCompleteView(APIView):
    @CacheControlHeader(hours=3)
    def get(self, request, format=None):
        ...  # build and return the response

This is a very simple solution that may work in isolation if you design your application to use a unique URI for each user-resource pair. But be very cautious with this idea.

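import functools


class CacheControlHeader:
    """Decorator that sets the Cache-Control header on a view's response."""

    def __init__(self, seconds=None, minutes=None, no_cache=False,
                 hours=None):
        self.no_cache = no_cache
        # The first unit given wins; the default is 30 minutes.
        if seconds:
            self.seconds = seconds
        elif minutes:
            self.seconds = minutes * 60
        elif hours:
            self.seconds = hours * 60 * 60
        else:
            self.seconds = 1800

    def _is_no_cache_qs(self, request):
        # ?no_cache=true lets a caller opt out of caching for one request.
        try:
            return request.query_params.get('no_cache') == 'true'
        except Exception:
            return False

    def __call__(self, func):
        @functools.wraps(func)  # keep the view method's name and docstring
        def wrapped_func(*args, **kwargs):
            # args is (view, request, ...) because this wraps a view method.
            request = args[1]
            no_cache_qs = self._is_no_cache_qs(request)
            response = func(*args, **kwargs)
            if self.no_cache or no_cache_qs:
                response['Cache-Control'] = 'no-cache'
            else:
                response['Cache-Control'] = f'max-age={self.seconds}'
            # Remove any Expires header so Cache-Control alone governs caching.
            # On a Django response, deleting a missing header is a no-op.
            del response['Expires']
            return response
        return wrapped_func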

Security and Complexity

The more complex a solution is, the more insecure it is. Masking a complete, real application behind a CDN can be very hard in practice, and it is something you should only do when you already have an incredible amount of traffic.

Also keep in mind that if you do something wrong, you can end up exposing confidential information just because your CDN cached it.