Sun’s Colorado Consolidation Saves Millions describes how Sun used Liebert’s XD rack cooling, clear vinyl cold aisle curtains and flywheels to increase the density of its data center while also reducing energy consumption. They consolidated 165,000 square feet of data center space into 700 square feet while reducing their monthly power usage by one million kilowatt-hours.
When we considered the XD cooling units, there were two options: chilled water or refrigerant. With chilled water, there was the question of potential water leaks in these rack-attached units. With refrigerant, there was the question of how many more condensers would be needed, where they would be placed and how much more maintenance they would require. Either option also means more maintenance work inside the server room, amongst the servers, storage, switches, etc. The obvious benefit of the XD units is that they can provide enough cooling for up to 30 kW in a single rack. If I recall correctly, however, the refrigerant-based version limits the total number of racks because of limits on the maximum pressure or capacity of the refrigerant in a single system.
As for the vinyl curtains, the usual objection is to their aesthetics. Personally, I would like to see them installed to help keep the cold air completely contained in the cold aisle, where it is intended. This matters especially in raised floor environments with high velocity air flow, where the cold air might otherwise be pushed outside the confines of the cold aisle.
One question about Sun’s use of the flywheel: How large are your flywheels? Flywheels generally supply on the order of ten seconds or so of power, which is usually enough time for generators to kick on but cuts it very close. What type of services run out of Sun’s Colorado facility?

First-principles simulations have been used to directly determine the miscibility of helium (gold balls) in dense metallic hydrogen (white balls) under the extreme conditions that are present in the interiors of the Jovian planets. Illustration by Kwei-Yu Chu.
Physicists at Lawrence Livermore National Laboratory and the University of Illinois at Urbana-Champaign have done First-Principles Molecular Dynamics (FPMD) simulations on LLNL’s high performance computing systems to “determine the equation of state of the hydrogen-helium system at extremely high temperatures (4,000-10,000 degrees Kelvin), similar to what would be found in the interior of Saturn and Jupiter.” Read more here.
Out of necessity, we have some file systems that are 25 TB. As of 2009, we consider 25 TB a large file system and we are concerned about the potential downtime that may result if an fsck is needed.
Some storage vendors advertise that they can have single file systems that are hundreds of terabytes or even a petabyte. Often, however, there is no mention of when or if fsck or similar operations would be needed and how long they take.
ZFS claims to eliminate the need for fsck and Chunkfs (ext2 enhancements from around 2006) claims to reduce fsck times by splitting the repair domain. Further, “journaling file systems only speed fsck time in the case of a system crash, disconnected disk, or other interruptions in the middle of file system updates. They do not speed recovery in the case ‘real’ metadata corruption” (see third paragraph here).
1.) What do you consider a large file system? (What file system do you use for them?)
2.) Are you concerned about fsck times? (Why or why not?)
3.) Can you predict fsck times based on some parameters (e.g., inodes used, disk size, etc.)?
4.) Any special cases related to fsck or similar operations for clustered file systems?
LightHaus Logic offers video analytics appliances that analyze store security camera video to help assess customer behavior in stores. The appliances are multi-processor nodes that can each process five to twenty video streams, and adding more appliances to a cluster increases capacity. The end result is supposed to be “to condense hundreds or thousands of hours of video into actionable intelligence.” The HPCwire article has more information.
Two other recent posts talked about using HPC for other types of video processing:
Here’s another interesting new market for HPC: HD content delivery.
AMD plans to build a computing system (their “Fusion Render Cloud”) based on more than 1,000 GPUs along with some Phenom II CPUs. The GPUs alone (ATI Radeon 4870) will provide over one petaflop. The system will be used to render HD video and stream it to devices with limited battery and/or processing capabilities, such as some smart phones and ultra-thin notebooks.
The article doesn’t go into great detail on the architecture of this system though (e.g., is it 1,000 nodes each with a Radeon and a Phenom?). Also, the term “cloud” is mentioned so perhaps this will be sold as a cloud service to content providers?
Four of these drives deliver 4TB in the same space as a standard 3.5-inch HDD.
via HPCwire: pureSilicon Debuts 1TB 2.5-Inch SSD.
Is that right? Four 2.5-inch drives in the place of one 3.5-inch drive?
If so, in systems that can hold 48 3.5-inch drives, then could we fit 192 of these 2.5-inch, 1 TB drives? If those 48-drive systems fit in 4U of rack space and we put 10 of them in one rack, we could get 1,920 TB in one rack. That’s incredible density.
According to the stats in the article above, this rack would require about 9.2 kW of power when active and only 192 Watts (yes, Watts) when idle. Of course, this considers only the drives’ power consumption.
At 240 MB/s read and 215 MB/s write per drive, we’d have incredible I/O rates per 192-drive system. Imagine the performance of such systems for large OLTP databases, for example.
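For anyone who wants to check the back-of-the-envelope numbers above, here is a small sketch of the arithmetic. The figures come straight from the post and the article's stats; the variable names are mine, and the aggregate I/O rates assume every drive can stream at its rated speed at the same time, which the controller and bus may well not allow.

```python
# Back-of-the-envelope rack math for the 1 TB, 2.5-inch SSDs discussed above.
# All inputs are the figures quoted in the post; names are illustrative only.
DRIVES_PER_35_INCH_BAY = 4      # four 2.5-inch drives per 3.5-inch slot
BAYS_PER_SYSTEM = 48            # a 4U system that holds 48 3.5-inch drives
SYSTEMS_PER_RACK = 10
TB_PER_DRIVE = 1
READ_MB_S_PER_DRIVE = 240
WRITE_MB_S_PER_DRIVE = 215
RACK_ACTIVE_WATTS = 9200        # "about 9.2 kW" when active (drives only)
RACK_IDLE_WATTS = 192           # when idle (drives only)

drives_per_system = DRIVES_PER_35_INCH_BAY * BAYS_PER_SYSTEM
drives_per_rack = drives_per_system * SYSTEMS_PER_RACK
rack_capacity_tb = drives_per_rack * TB_PER_DRIVE

# Per-drive power implied by the rack totals.
active_watts_per_drive = RACK_ACTIVE_WATTS / drives_per_rack
idle_watts_per_drive = RACK_IDLE_WATTS / drives_per_rack

# Idealized aggregate throughput per 192-drive system, in GB/s (1 GB = 1000 MB).
# This is an upper bound that ignores controller and bus limits.
system_read_gb_s = drives_per_system * READ_MB_S_PER_DRIVE / 1000
system_write_gb_s = drives_per_system * WRITE_MB_S_PER_DRIVE / 1000

print(f"{drives_per_system} drives per system, {rack_capacity_tb} TB per rack")
print(f"{active_watts_per_drive:.2f} W active, {idle_watts_per_drive:.2f} W idle per drive")
print(f"{system_read_gb_s:.2f} GB/s read, {system_write_gb_s:.2f} GB/s write per system")
```

So the rack totals work out to roughly 4.8 W per drive when active and 0.1 W when idle, and a single 192-drive system would top out around 46 GB/s of reads if nothing else got in the way.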
So what are the challenges with such a system (besides price, I’d imagine)? With one drive potentially nearly saturating the theoretical SATA II bus capability, how could we take advantage of so many drives?
Instead of a 192-drive system in 4U, then, what about 48 drives in a 1U system? Are the same technical challenges there as far as getting more of the I/O potential out of these drives?
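To put a rough number on “nearly saturating”: SATA II signals at 3 Gbit/s and uses 8b/10b encoding, which leaves about 300 MB/s of payload bandwidth per link. The sketch below compares the drive's rated speeds against that ceiling and shows the idealized aggregate read rate for the 48-drive and 192-drive cases. The function name is hypothetical, and protocol overhead beyond the encoding is ignored.

```python
# Rough SATA II headroom check, using the drive speeds quoted in the post.
SATA2_LINE_RATE_MBIT_S = 3000   # 3 Gbit/s signaling rate
DATA_BITS_PER_10_LINE_BITS = 8  # 8b/10b encoding: 8 data bits per 10 line bits

# Payload bandwidth of one SATA II link in MB/s (8 bits per byte).
link_payload_mb_s = SATA2_LINE_RATE_MBIT_S * DATA_BITS_PER_10_LINE_BITS / 10 / 8

READ_MB_S_PER_DRIVE = 240
WRITE_MB_S_PER_DRIVE = 215

read_utilization = READ_MB_S_PER_DRIVE / link_payload_mb_s
write_utilization = WRITE_MB_S_PER_DRIVE / link_payload_mb_s


def aggregate_read_gb_s(drive_count, per_drive_mb_s=READ_MB_S_PER_DRIVE):
    """Idealized aggregate read rate in GB/s if every drive streams at full speed."""
    return drive_count * per_drive_mb_s / 1000


print(f"link payload: {link_payload_mb_s:.0f} MB/s")
print(f"read uses {read_utilization:.0%} of a link, write uses {write_utilization:.0%}")
for count in (48, 192):
    print(f"{count} drives: {aggregate_read_gb_s(count):.2f} GB/s aggregate read")
```

Each drive on its own link has some headroom left, so the harder problem is whatever sits behind the links: the controllers and host buses would have to absorb tens of GB/s to keep all the drives busy.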
CultureVis is a growing number of projects using information visualization to graph cultural patterns, relationships, and dynamics. According to this article, they have been awarded 330,000 hours of time “to explore the full potential of cultural analytics in a project on ‘Visualizing Patterns in Databases of Cultural Images and Video.’ The grant is one of three inaugural awards from a new Humanities High Performance Computing Program established jointly by DOE and NEH.”
It is an interesting new application of HPC. Do you know of other humanities projects utilizing HPC?
Here’s a notable quote from Not All Apps Are Fit for the Cloud | The Intelligent Enterprise Blog
With cloud computing the trick is not to follow the hype and the crowd, but to understand your own issues and applications first. From there you can make an educated call as to what applications make sense to outsource to a good cloud computing platform, and what applications to keep local. Keep in mind that this should be an evolving process, and you can always relocate applications as the cloud computing resources improve, and clearly they will.
Which applications have you put in the cloud?
The article Forecast Mostly Sunny for Company Opting for Cloud Computing describes some of the common cloud computing benefits and potential drawbacks. One of the common drawbacks listed is that you are completely reliant on your cloud computing vendor (in this case Amazon). Near the end of the article, two comments from Stevie Clifton, Animoto’s CTO, match two that I’ve often thought when hearing the drawback about vendor reliance:
- It’s no different than using any other web or application hosting as companies have done for some time.
- At some point, you could use server instances across more than one vendor’s cloud to limit your risk of reliance on one vendor.
Do you see using services in the cloud as being any different from utilizing other web/application hosting, server co-location or managed services in terms of reliance on other vendors?