Mirror hosting is a well-known technique for speeding up sites by replicating their content across servers in different locations. Backups, by contrast, are archived copies created for data recovery. This article explains how the two techniques work, how they differ, and how website administrators should use them.
What Is Mirror Hosting and Why Does It Matter?
Mirror sites are commonly used for the following purposes:
Network Traffic Reduction — When users access a site, the web server can determine each visitor's location and direct them to a geographically closer mirror. This reduces traffic to the main site and helps every section of a page load reliably. During a sudden traffic spike, the load is balanced across the individual mirrors.
Availability Guarantee — Mirror sites can stand in for the main site if it goes down. The main site's web server can be set up with a script that detects an outage and automatically forwards users to an available mirror.
Language Sites — Large sites often maintain mirror copies that host versions created for a specific language or region. This separation lets administrators manage each localized version on its own.
Speed Testing — Replicating a site onto a mirror location can occasionally serve as a benchmark for the speed of the main pages.
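The traffic-reduction and availability points above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the mirror names, their coordinates, and the `healthy` flag are hypothetical, and the visitor's coordinates are assumed to come from some geolocation step that is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Mirror:
    name: str             # hypothetical label, not a real host
    lat: float
    lon: float
    healthy: bool = True  # assumed to be set by a separate health check


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pick_mirror(visitor_lat, visitor_lon, mirrors):
    """Return the closest healthy mirror, or None if every mirror is down."""
    candidates = [m for m in mirrors if m.healthy]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: haversine_km(visitor_lat, visitor_lon, m.lat, m.lon),
    )


# Illustrative mirror list (approximate city coordinates).
MIRRORS = [
    Mirror("eu-frankfurt", 50.11, 8.68),
    Mirror("us-newyork", 40.71, -74.01),
    Mirror("ap-singapore", 1.35, 103.82),
]
```

A visitor near Paris would be sent to the Frankfurt mirror. If that mirror is marked unhealthy, the same call falls back to the next closest one, which is the failover behaviour described under Availability Guarantee.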
Note that mirror sites are, by definition, not meant to be perfect copies of a site. The reason is purely technical. To avoid overloading the servers, a mirror is built around a predetermined synchronization setup: specific tools and protocols that synchronize the main site with its mirror(s). As a result, a mirror may briefly lag behind the main site between synchronization runs.
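One common way to synchronize without copying everything on each run is to compare content hashes and transfer only what changed. The sketch below assumes both the main site and the mirror are plain directories of files; the function names `build_manifest` and `plan_sync` are made up for this example and do not refer to any particular mirroring tool.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def plan_sync(main_manifest, mirror_manifest):
    """Return (to_copy, to_delete) so that the mirror ends up matching the main site."""
    # Copy files that are new on the main site or whose content differs.
    to_copy = sorted(
        p for p, digest in main_manifest.items() if mirror_manifest.get(p) != digest
    )
    # Delete files that exist only on the mirror.
    to_delete = sorted(p for p in mirror_manifest if p not in main_manifest)
    return to_copy, to_delete
```

Running `plan_sync(build_manifest(main_dir), build_manifest(mirror_dir))` at each interval yields the list of files to transfer, which keeps the load on the main server proportional to what actually changed.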
How Are Website Backups Different From Mirrors?
Backups are dated copies of a resource, created at a specific moment in time. They should be stored on a medium and at a location separate from the original resource (offsite). The main purpose of a backup image is data recovery. How useful it is for that purpose depends on a number of factors, one of the most important being its creation date. Website backups of this kind are often created automatically by the hosting company. A good rule is to follow these principles:
Regular Intervals — Backups should be created at regular intervals. This matters because it tells website administrators how old a given copy is. During disaster recovery, they can use it to restore the site to a known moment in time. Knowing the exact creation date lets them replay the subsequent steps and content changes to arrive at a copy almost identical to the one that was lost.
Backup Security and Integrity — One of the most important factors in backup creation is the method used to make the copies. They should be created by a script that lists all of the files, makes identical copies, and stores the whole image in a safe location. Security best practices dictate that backups are kept offsite and in a suitable form. This usually means the image is protected from tampering, typically with some form of encryption. Backup copies usually come with checksums that are generated alongside them. During a restore, the checksum computed over the restored file should match the stored one.
Access Controls — Anyone with access to the backup copies can reconstruct an earlier version of the site. For this reason, access should be limited to privileged users and the security controls should be very strict.
To reduce disk space usage and save bandwidth, most backups are compressed using a standard algorithm.
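The principles above (a dated copy, compression, and a checksum that is verified before restoring) can be sketched with the Python standard library alone. This is a minimal illustration, not a complete backup tool: the `site-<timestamp>.tar.gz` naming scheme and the function names are assumptions, and encryption and the offsite transfer are left out.

```python
import hashlib
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_backup(site_dir, backup_dir):
    """Write a dated .tar.gz of site_dir plus a .sha256 file; return the archive path.

    backup_dir must live outside site_dir, otherwise the archive would include itself.
    """
    site_dir, backup_dir = Path(site_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # The UTC timestamp in the file name records how old the copy is.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = backup_dir / f"site-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:  # gzip-compressed tar archive
        tar.add(site_dir, arcname=".")
    checksum_file = archive.parent / (archive.name + ".sha256")
    checksum_file.write_text(sha256_of(archive))
    return archive


def verify_backup(archive):
    """Return True if the archive still matches the checksum recorded at creation."""
    archive = Path(archive)
    recorded = (archive.parent / (archive.name + ".sha256")).read_text().strip()
    return sha256_of(archive) == recorded
```

Calling `verify_backup` before a restore catches corruption or tampering of the archive. In practice the checksum file should be stored separately from the archive, since an attacker who can modify one could otherwise modify both.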
How To Decide Where to Host Media?
One of the most important ways to speed up data access is to use mirror hosting services. Mirroring can be set up either automatically, on request to the hosting company, or manually. In the manual case, the administrators need to oversee the technical administration and organization of the servers themselves. This includes a step-by-step process of enabling the required agents to access the main site and create synchronized copies on the additional servers.
Depending on the chosen approach, website owners will need to decide on the most appropriate technology, the update interval, and how the overall website will handle heavier loads. A popular solution is a content delivery network (CDN), a distribution scheme that replicates resources onto web servers located in different regions. CDNs provide an optimized approach by relying on a large-scale distributed network.
Many web users maintain their own mirror images of sites for archival purposes. They are mostly enthusiasts who want to preserve certain pages for nostalgic or reference reasons. One of the best current examples of a well-maintained public network of mirrors is the Debian project, one of the largest distributions of the free Linux operating system. Over the years, its mirrors have served millions of users across the world.
Ultimately, the choice of location depends on the type of resource and the expected traffic. The most heavily used resources can and should be mirrored, as they are the most likely to cause heavy load. Static content, on the other hand, can be distributed through CDNs, which perform very well with images and text.
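The rule of thumb above can be written down as a small decision function. Everything specific in this sketch is an assumption made for illustration: the list of static file extensions, the traffic threshold, and the choice to check for static content first are one possible reading of the guideline, not fixed values.

```python
from pathlib import Path

# Hypothetical examples of static image and text formats.
STATIC_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".txt"}

# Hypothetical cut-off above which a resource counts as heavily used.
HIGH_TRAFFIC_REQUESTS_PER_DAY = 100_000


def choose_hosting(filename, requests_per_day):
    """Suggest where to host a resource: 'cdn', 'mirror', or 'origin'."""
    suffix = Path(filename).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        return "cdn"      # static images and text suit a CDN
    if requests_per_day >= HIGH_TRAFFIC_REQUESTS_PER_DAY:
        return "mirror"   # heavily used resources are worth mirroring
    return "origin"       # everything else stays on the main site
```

For example, `choose_hosting("logo.png", 500)` suggests the CDN, while a heavily requested dynamic page such as `choose_hosting("search.php", 250_000)` is a candidate for mirroring.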
Mirror site hosting is becoming more and more widespread as a technique for improving speed and keeping data available. Most major web hosting companies have already built their own automated solutions for it. For this reason, it is worth checking how mirroring is implemented in each hosting company’s offering.