Grid Computing and its Advantages

Grid computing has been around for a few years now, and its advantages are many. It can be defined in many ways, but for this discussion let’s simply call it a way to execute compute jobs (e.g. Perl scripts, database queries) across a distributed set of resources instead of on one central resource. In the past, most computing was done in silos or on large SMP-like boxes. Even today you’ll still see companies perform calculations on large SMP boxes (e.g. E10Ks, HP Superdomes). But this model can be quite expensive and doesn’t scale well.

Along comes grid computing, and now we have the ability to distribute jobs to many smaller servers using load-sharing software that spreads the work according to available resources and policies. Instead of one heavily burdened server, the load is spread evenly across many smaller computers, which can sit in various locations.
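To make the load-sharing idea concrete, here is a minimal sketch of a dispatcher that always hands the next job to the least-loaded node that still has a free slot. It is not how any particular grid product works internally; the `Node` class, the `dispatch` function and the idea of a fixed number of "slots" per node are all assumptions made up for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Node:
    # Hypothetical model of a grid node; only `load` is used for ordering.
    load: float
    name: str = field(compare=False)
    slots: int = field(compare=False)
    running: list = field(default_factory=list, compare=False)


def dispatch(jobs, nodes):
    """Assign each job to the least-loaded node that still has a free slot."""
    heap = list(nodes)
    heapq.heapify(heap)
    placement = {}
    for job in jobs:
        if not heap:
            raise RuntimeError("no free slots left in the grid")
        node = heapq.heappop(heap)              # node with the lowest load
        node.running.append(job)
        node.load = len(node.running) / node.slots
        placement[job] = node.name
        if len(node.running) < node.slots:      # keep it only if it has room
            heapq.heappush(heap, node)
    return placement
```

Calling `dispatch(["j1", "j2", "j3"], [Node(0.0, "a", 4), Node(0.0, "b", 2)])` returns a dictionary mapping each job name to the node it was placed on. A real scheduler would also weigh memory, queue priorities and site policies, which this sketch leaves out.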

Some advantages are quite obvious.

1) No need to buy large SMP servers for applications that can be split up and farmed out to smaller servers (which cost far less than SMP servers). Results can then be concatenated and analyzed once the jobs complete.

2) Much more efficient use of idle resources. Jobs can be farmed out to idle servers or even idle desktops. Many of these resources sit idle, especially outside business hours.

3) Grid environments are much more modular and don’t have single points of failure. If one of the servers or desktops within the grid fails, there are plenty of other resources able to pick up the load. Jobs can automatically restart if a failure occurs.

4) Policies can be managed by the grid software. Some of the most popular grid-enabling software packages include Platform LSF, Sun Grid Engine, DataSynapse, PBS, Condor and Univa UD, among others. Each does a good job of monitoring resources and managing job submissions based on its internal policy engine.

5) This model scales very well. Need more compute resources? Just plug them in by installing the grid client on additional desktops or servers. They can be removed just as easily on the fly.

6) Upgrading can be done on the fly without scheduling downtime. Since there are so many resources, some can be taken offline while leaving enough for work to continue. This way upgrades can be cascaded so as not to affect ongoing projects.

7) Jobs can be executed in parallel, which speeds up results. Using something like MPI allows message passing among the compute resources.
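Several of the points above (splitting an application up, farming the pieces out, restarting failed jobs and concatenating the results) can be sketched in a few lines. The example below is only an illustration: it uses a local thread pool from the Python standard library as a stand-in for grid nodes, and the helper names `split`, `run_with_retry` and `scatter_gather` are hypothetical, not part of any grid product.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_chunks):
    """Split items into at most n_chunks contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:                      # skip empty chunks
            chunks.append(items[start:end])
        start = end
    return chunks


def run_with_retry(task, chunk, max_attempts=3):
    """Run task(chunk); rerun it if it raises (a stand-in for a node failure)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(chunk)
        except Exception:
            if attempt == max_attempts:
                raise                        # give up after the last attempt


def scatter_gather(task, items, n_workers=4):
    """Farm chunks out to workers, then concatenate results in input order."""
    chunks = split(items, n_workers)
    results = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map yields partial results in the same order as the chunks
        for part in pool.map(lambda c: run_with_retry(task, c), chunks):
            results.extend(part)
    return results
```

For example, `scatter_gather(lambda chunk: [x * x for x in chunk], list(range(10)))` squares ten numbers in up to four parallel pieces and returns them as one list. On a real grid the "workers" would be jobs submitted through the scheduler rather than threads in one process.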

Some disadvantages:

1) For memory-hungry applications that can’t take advantage of MPI, you may be forced to run on a large SMP box.

2) You may need a fast interconnect between compute resources (gigabit Ethernet at a minimum, and InfiniBand for MPI-intensive applications).

3) Some applications may need to be tweaked to take full advantage of the new model.

4) Licensing across many servers may be cost-prohibitive for some apps, although vendors are starting to be more flexible with environments like this.
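To get a feel for why the interconnect matters, a back-of-the-envelope estimate helps. The function below is a rough sketch, with a hypothetical name, that computes the ideal time to move a payload over a link of a given speed. It ignores latency, protocol overhead and contention, so real transfers will be slower.

```python
def transfer_seconds(payload_bytes, link_gbit_per_s):
    """Ideal time to move payload_bytes over a link, ignoring latency and overhead."""
    bits = payload_bytes * 8
    return bits / (link_gbit_per_s * 1e9)
```

With this estimate, shipping 10 GB of input data over gigabit Ethernet takes at least 80 seconds (`transfer_seconds(10 * 10**9, 1.0)`), which can easily outweigh the compute time saved on a short job.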

Areas that are already taking good advantage of grid computing include bioinformatics, cheminformatics, oil & drilling, and financial applications.

With the advantages listed above, you’ll start to see much larger adoption of grids, which should benefit everyone involved. I believe the biggest barrier right now is education.