The hard disk is where all data is stored - the operating system, ancillary programs, and the HTML, images, movies, and other files for every webpage.
The hard disk is an often overlooked bottleneck in the server architecture. Many people correctly pay attention to CPU and memory constraints, but when it comes to the hard disk they look only at its size. This misunderstands how a hard disk operates.
Just like RAM, a hard disk varies not only in size but also in how quickly its data can be accessed. Unlike RAM, hard disk space is cheap - adding another hard disk or getting a bigger one is not a big expense. What really matters is how fast the hard disk responds.
A hard disk has three main stats - its storage space, its seek time (how long it takes to position the read head over the requested data), and its RPM (how fast the platters spin). In a server situation, where many requests hit the disk at once, seek time and RPM become especially important.
Before elaborating further, we have to quickly mention IDE/SATA vs SCSI. Most desktop computers feature IDE or SATA hard disks. The IDE/SATA nomenclature refers to the interface through which the hard disk talks to the rest of the computer. A higher performance solution is SCSI (pronounced 'scuzzy'). For servers, SCSI drives are recommended.
Now, going back to our earlier discussion of seek time and RPM, most desktop computers have hard disks with seek times of roughly 8 ms. SCSI drives clock in at around 3 ms - this means data is found in less than 40% of the time (3 / 8 = 37.5%)!
Regarding RPM (revolutions per minute), an IDE/SATA drive usually spins at 7200 RPM, with some high-end versions reaching 10,000 RPM. SCSI hard disks come in at 10,000 and 15,000 RPM. A faster spin means the requested data passes under the read head sooner and is read off the platter more quickly. Combined with the shorter seek time, this means that SCSI drives not only find data faster, but also retrieve it faster.
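To see how seek time and RPM combine, here is a minimal sketch in Python. It assumes the common rule of thumb that, on average, the platter has to turn half a revolution before the requested data reaches the head. The function names are made up for illustration, the seek times and RPM values are the rough figures quoted above, and pairing the 3 ms seek time with a 15,000 RPM drive is an assumption.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to rotate under the head, in ms.

    Assumes the average wait is half of one full revolution.
    """
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


def average_access_ms(seek_ms: float, rpm: float) -> float:
    """Rough time for one random access: seek plus rotational wait."""
    return seek_ms + rotational_latency_ms(rpm)


if __name__ == "__main__":
    # Illustrative pairings of the figures from the text (assumed, not measured).
    drives = {
        "desktop IDE/SATA (8 ms, 7200 RPM)": (8.0, 7200),
        "SCSI (3 ms, 15000 RPM)": (3.0, 15000),
    }
    for name, (seek_ms, rpm) in drives.items():
        access = average_access_ms(seek_ms, rpm)
        per_second = 1000.0 / access
        print(f"{name}: {access:.2f} ms per access, ~{per_second:.0f} accesses/s")
```

With these numbers the desktop drive needs roughly 12.2 ms per random access and the SCSI drive about 5.0 ms, so the SCSI drive can serve well over twice as many random requests per second. Real drives will differ; treat this as a back-of-the-envelope estimate only.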
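For the sake of completeness, a fourth factor to consider is throughput - the speed at which data is transferred from the hard disk to the rest of the system. IDE/SATA interfaces peak at around 100 MB/s (megabytes per second). SCSI interfaces can reach speeds of up to 320 MB/s. These are maximum interface rates; the sustained rate of an individual drive is lower.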
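As a rough illustration of what throughput means in practice, the sketch below estimates how long a file takes to transfer at a given rate. It treats the 100 and 320 figures as megabytes per second (an assumption, matching the nominal ATA/100 and Ultra320 SCSI bus rates) and pretends the peak rate is sustained for the whole transfer, which real drives do not manage. The function name and the 700 MB file size are hypothetical.

```python
def transfer_seconds(size_mb: float, throughput_mb_per_s: float) -> float:
    """Best-case time to move size_mb megabytes at a constant rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_mb / throughput_mb_per_s


if __name__ == "__main__":
    movie_mb = 700.0  # hypothetical file size, for illustration only
    for label, rate in (("IDE/SATA", 100.0), ("SCSI", 320.0)):
        seconds = transfer_seconds(movie_mb, rate)
        print(f"{label} at {rate:.0f} MB/s: {seconds:.2f} s")
```

Under these assumptions the file takes about 7 seconds at 100 MB/s and just over 2 seconds at 320 MB/s. For a busy web server serving many small files, the seek time and RPM discussed earlier usually matter more than this peak figure.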
SCSI solutions also have other advantages. These include a higher mean time between failures (MTBF - an estimate of how long the hard disk will work properly), more advanced controls for data integrity, fewer server resources used, and larger caches. Lastly, SCSI drives can be chained together much more easily than IDE/SATA drives.