Webhosting
08.12.06 - 12:12pm
Over the last couple of days I’ve been focusing very heavily on the Internet and how it all fits together. You might be wondering why I have been focusing on such a dull topic, but there is a reason and it fits into this final article. Lately I’ve gotten it into my head that it would be fun to try my hand at some form of webhosting. Once I had this idea lodged in my head I realized exactly how much there is to webhosting, and I have been filling my head with knowledge ever since. Because I find it interesting I have assumed that you guys do also, so without further ado let’s get into the gritty details of webhosting.
The Basics
In the previous articles I covered the basic requirements on how to reach a webserver and how the content is distributed but I never discussed what actually happens when you reach that server. After reading through pages and pages of material I think I have the hang of it so bear with me if a few of the details are a bit fuzzy. To start perhaps we should define exactly what is offered by webhosting and then the various plans available.
If desktops are the family sedan of the computing world then servers are enormous transport trucks. These boxes are made for the long haul and they are meant to do it efficiently and quickly. Servers tend to have two or more processors, at least 4 gigs of ECC (error-correcting code) memory, redundant power supplies, and redundant hard drives. If this sounds like overkill, trust me, it is; however, the redundancy has a purpose. Servers tend to deliver vital information, and even if a server is available 99.9% of the time, over a year that adds up to a little under 9 hours of downtime. Considering some businesses could earn thousands of dollars of income in 9 hours, you can see why avoiding downtime at all costs is important. To add another layer of protection some servers are mirrored so that if a server goes down another will be able to take up the slack until the administrator can fix the problem.
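If you want to check the downtime math yourself, here is a quick sketch in Python. The function name is my own invention for illustration, and it assumes a plain 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a server is unavailable at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Compare a few common availability targets.
for pct in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.2f} hours down per year")
```

At 99.9% this works out to about 8.76 hours, which is where the "little under 9 hours" figure comes from.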
Now these servers can’t just sit in your closet and function correctly. Usually racks and racks of servers reside in special buildings called datacenters that feature all sorts of safety measures such as UPSes (uninterruptible power supplies), surge protectors, backup generators, waterless fire protection systems, physical building security, and extensive firewalls and protection against DDoS (Distributed Denial of Service) attacks. Besides all these preventative systems, datacenters also require huge external power connections and a very wide fiber pipe coming in. Usually you will find all the major datacenters near large fiber hubs that are located within the major cities of the US.
Hosting Plans
Now that you have yourself some webservers, how do you actually go about putting clients on them? If you were to give each individual client a webserver your operating costs would shoot through the roof, and a server per customer is overkill for most websites. To save space and money a single physical server can be divided up into pieces, and each of these pieces can host a website. This type of webhosting is called shared webhosting and is the cheapest and most popular form. Once a website begins to outgrow shared webhosting you generally move on to a virtual private server. Virtual servers differ from shared servers by literally cloning the server multiple times on one piece of hardware instead of just serving lots of websites from a single server instance. By cloning the OS multiple times you create multiple “sandboxed” servers, and each website administrator can directly modify all aspects of their own virtual server. One security benefit of virtual hosting is that an administrator could destroy their particular virtual server and the physical server, along with the other virtual servers on it, would still be operational because they are virtually separated.
Now if that was confusing just imagine someone cloning you and making all your clones perform a job. Now imagine one of them gets sick. Just because one clone is sick doesn’t mean the rest are: the master copy (you) is just fine and so are the rest of the clones. That wasn’t exactly the clearest analogy but it should get the point across. Now the last form of hosting is actually two slightly different forms of hosting. First off you have dedicated servers which are, well, dedicated servers. Your webhosting company will maintain a server for you by performing all the daily chores that keep it alive, and they let you wreak havoc upon it and host your personal website. The second form is called co-location, which is when your webhosting company lets you bring your own server in, set it up, and let it run. Here the company just provides you with rackspace, power, and bandwidth; all administration and support is on your end. This solution is usually reserved for small companies, upstart webhosts, and technically inclined bloggers looking for adventure (myself included). Since you are limited only by the physical size of your server it is possible to cram a lot of hardware into a rack, which could potentially make this a more cost-effective solution than a dedicated server, but you must also factor in maintenance and hardware failures.
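To make the shared hosting idea from earlier a bit more concrete, here is a rough sketch of how one server instance might decide which customer’s site to serve, based on the hostname the browser asked for. The hostnames, paths, and function name are all made up for illustration; a real webserver does this through its own configuration, not a Python dictionary.

```python
from typing import Optional

# Hypothetical mapping from hostname to that customer's document root.
SITES = {
    "alice.example": "/home/alice/public_html",
    "bob.example": "/home/bob/public_html",
}


def document_root_for(host_header: str) -> Optional[str]:
    """Pick a customer's directory from the Host header of a request."""
    # Drop an optional ":port" suffix and normalize case before the lookup.
    hostname = host_header.split(":")[0].strip().lower()
    return SITES.get(hostname)


print(document_root_for("alice.example"))
print(document_root_for("nobody.example"))
```

Every site here lives on the same machine and is served by the same process, which is exactly why shared hosting is cheap and also why customers don’t get to change server-wide settings.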
So now you know what a webserver is, how they are divided up, where they live, and what they eat, but what makes them tick? The software used to run a webserver tends to be a very specialized blend of applications with the sole purpose of running efficiently and fast. In the business world there are servers for everything from FTP transfers, email, and webpages to credit card transfers, chat rooms, web forums, web stores, and so on. Each individual task requires a separate application running to provide the service, along with the standard operating system and administrative processes.
Software
At the base of any server you have your operating system. For the general corporate world you have Linux and Windows with a smattering of OS X and Solaris. Since there are so many flavors of Linux and I have just about zero knowledge of Linux, it is a bit hard to say what is good and what isn’t, but the trend I am seeing is Red Hat Enterprise Linux for corporate servers and then some of the open/free Linux distributions for smaller groups and single people. Microsoft right now has Windows Server 2003, with Longhorn coming out in a few months which will be based on Vista. OS X Server has a few extra packages added to the standard OS X and all this goes into a Mac.
For general webhosting you usually have a webserver application such as Apache, a database server such as MySQL, and then a scripting language such as PHP or Python. All the applications I just mentioned have been bundled into packages for Windows and Linux machines named WAMP and LAMP. While these are great for creating easy development servers they aren’t quite up to snuff for creating an active webserver, as they can be limited in the product versions they ship and in the ability to customize the installations. I won’t be going too in-depth here as it gets a bit fuzzy, but once you have all these applications set up you simply need some form of database management (phpMyAdmin) and a general administrative panel (cPanel) and you are set to go. Within the next few weeks I’ll be setting up my own personal development server and documenting everything, so that should be a rather interesting article as I get used to using Linux.
Administration
Now that the servers are all set up and humming away, what now? Since humans are controlling these servers it is guaranteed that at some point something is going to break. There are applications available that monitor servers and indicate when a system has failed, but the most effective source of information is user feedback. A web admin will quickly start pulling all sorts of strings when a server crashes, so when this happens the support team needs to jump quickly to avoid any extended downtime. Occasionally hardware will fail, which is why hard drive backups and redundancy are crucial in webhosting. There are too many ways to make a near-bulletproof server farm to cover here, but the general idea is to split the hosting load among multiple cloned machines so that if one goes down it won’t affect the others. This same idea is used with Google’s enormous data clusters. Even if they lose 1% of their computing power there are enough redundant systems that we as users won’t notice any decrease in performance.
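The "if one goes down, use another" idea can be sketched in a few lines of Python. This is only a toy under my own assumptions: the server names are made up, and the health check is passed in as a function so you could swap in a real one (a ping, an HTTP request, whatever your monitoring tool offers).

```python
from typing import Callable, Optional, Sequence


def pick_server(
    servers: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first server that passes the health check, or None."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Hypothetical example: pretend web1 has crashed.
down = {"web1"}
mirrors = ["web1", "web2", "web3"]
print(pick_server(mirrors, lambda name: name not in down))  # web2
```

A real setup would spread requests across all the healthy mirrors rather than always picking the first one, but the failover part works the same way.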
That is about it for right now. It’ll be a while until I actually finish this up with an operational web server build, but it will happen. It has been very enlightening writing about these topics and hopefully what I have learned will help me to serve you guys better.
Very interesting article on webhosting. I have been in the hosting business for a few years now. It’s actually quite fun having a server of your own and being able to control all your clients. When setting up my own server for the first time, the trickiest part was the DNS. It was a bit difficult trying to configure the named.conf and getting the nameservers to work. After an hour or so, I got it working though. Good luck with your test server!
I’d recommend Plesk over cPanel. cPanel can’t run on Apache 2.0.x and this opens you to many, many vulnerabilities as well as poor performance. You want 2.0.x so you can use the worker Multi-Processing Module (which works even on a single CPU).
This Multi-Processing Module (MPM) implements a hybrid multi-process multi-threaded server. By using threads to serve requests, it is able to serve a large number of requests with less system resources than a process-based server. Yet it retains much of the stability of a process-based server by keeping multiple processes available, each with many threads.
I’d recommend DirectAdmin or Plesk 8 for a control panel; if you’re going to resell space, go with Plesk. It’s a pain to set up initially and it has some pretty lame defaults, but it is super sweet afterwards.
I’ve had no complaints from any of the clients I host.