In August 2009, Penguin Computing announced “Penguin on Demand” (POD), which it billed as HPC in the cloud. In practice, POD meant remotely accessing Penguin’s pre-installed cluster and submitting your jobs; virtual machine images were not an option at the time. Today, Amazon announced Cluster Compute Instances (CCIs) for EC2, which let you boot Linux-based VMs on a new “Cluster Compute Quadruple Extra Large” instance type to form virtual clusters with better performance characteristics than previously available EC2 instance types.
The main differences from other EC2 instances:
- if you boot multiple CCIs, the instances are linked more closely together, offering lower inter-node network latency and full-bisection 10 Gigabit/s bandwidth
- you can identify the processor architecture, so your code can be tuned accordingly
From Amazon’s HPC applications page:
The Cluster Compute instance family currently contains a single instance type, the Cluster Compute Quadruple Extra Large with the following specifications:
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
As of the announcement on July 13:
- each instance cost USD $1.60 per hour on demand, or USD $0.56 per hour with a 1-year or 3-year reserved instance
- only Linux VMs are supported on these instances
- by default, up to eight of these instances (a 64-core cluster) were available without filling out a special request form
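To make the arithmetic behind those figures concrete, here is a rough Python sketch that estimates core count and hourly cost for a virtual cluster. The constants come from the numbers quoted above; the function name `cluster_estimate` is my own invention for illustration, and the sketch ignores any one-time reservation fee, storage, or data transfer charges, none of which are listed here.

```python
# Figures quoted in this post for cc1.4xlarge at launch.
CORES_PER_INSTANCE = 8           # 2 x quad-core Xeon X5570
ON_DEMAND_USD_PER_HOUR = 1.60
RESERVED_USD_PER_HOUR = 0.56
DEFAULT_INSTANCE_LIMIT = 8       # without the special request form


def cluster_estimate(instances, hours, usd_per_hour=ON_DEMAND_USD_PER_HOUR):
    """Return (total_cores, total_cost_usd) for a virtual cluster run.

    Hypothetical helper: only the hourly instance rate is counted.
    """
    if instances < 1 or hours < 0:
        raise ValueError("need at least one instance and non-negative hours")
    total_cores = instances * CORES_PER_INSTANCE
    total_cost = instances * hours * usd_per_hour
    return total_cores, total_cost


if __name__ == "__main__":
    cores, cost = cluster_estimate(DEFAULT_INSTANCE_LIMIT, hours=1)
    print(f"{cores} cores at ${cost:.2f} per hour on demand")
```

At the default limit, this works out to 20 cents per core-hour on demand, which is a handy number to keep in mind when comparing against the cost of running your own cluster.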
Here are a couple of articles that point to some cluster management providers specifically geared toward provisioning/interfacing with EC2-based virtual clusters:
- Cycle Computing’s CycleCloud (also covered on insideHPC)
- Clustercorp’s ROCKS+ on EC2 (also covered on HPCwire)
It is interesting that the instances are listed as providing 33.5 EC2 Compute Units. I wonder what method Amazon uses to establish these measurements. Since the new Cluster Compute Quadruple Extra Large instances let you know the processor architecture, you may effectively get more out of them than the rated Compute Units suggest, depending on how much your code benefits from compiler optimizations that target the Nehalem cores specifically.
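If you want to confirm what processor a booted instance actually exposes before choosing compiler settings, one simple approach on Linux is to read `/proc/cpuinfo`. The sketch below is only an illustration: the helper names `parse_cpuinfo` and `supports_sse42` are hypothetical, and it assumes the usual x86 Linux layout with `model name` and `flags` fields. It checks for the `sse4_2` flag because SSE4.2 is one of the instruction set extensions Nehalem supports.

```python
from pathlib import Path


def parse_cpuinfo(text):
    """Return (model_name, flags) for the first processor listed.

    `text` is the content of /proc/cpuinfo; the "model name" and "flags"
    field names are assumed to follow the x86 Linux layout.
    """
    model, flags = None, set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key = key.strip()
        if key == "model name" and model is None:
            model = " ".join(value.split())  # collapse padding spaces
        elif key == "flags" and not flags:
            flags = set(value.split())
    return model, flags


def supports_sse42(flags):
    """True if the CPU advertises SSE4.2."""
    return "sse4_2" in flags


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        model, flags = parse_cpuinfo(cpuinfo.read_text())
        print(f"CPU: {model}")
        print(f"SSE4.2 available: {supports_sse42(flags)}")
    else:
        print("/proc/cpuinfo not found; this sketch only applies to Linux")
```

Which compiler flags to use from there depends on your compiler and its version, so check its documentation rather than assuming a particular option exists.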
If you’ve run across other helpful articles with more details, please do leave a comment with the info.