Posts tagged: yahoo

A Look at the Yahoo Systems Used to Sort a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds

By Gary Stiehr, July 2, 2009 10:01 pm

The Yahoo! Developer Network Blog reported that Yahoo! was able to sort a petabyte of data in 16.25 hours and a terabyte in 62 seconds using Apache Hadoop.  I was curious what type of hardware they needed to pull this off.  From their post, they mentioned using this hardware (their Hammer cluster):

  • approximately 3800 nodes (in such a large cluster, nodes are always down)
  • 2 quad core Xeons @ 2.5ghz per node
  • 4 SATA disks per node
  • 8G RAM per node (upgraded to 16GB before the petabyte sort)
  • 1 gigabit ethernet on each node
  • 40 nodes per rack
  • 8 gigabit ethernet uplinks from each rack to the core

Here are my observations/inferences and questions about this:

  1. They used 95 racks of equipment dedicated to this test (3800 nodes with 40 nodes per rack)
  2. An 8 Gb/s uplink must consist of eight trunked gigabit connections
  3. From 1&2 above, we see they’d need 95 * 8 = 760 Gigabit Ethernet ports at their core.
    • What type of switches do you think they have at the core?
  4. Assuming 90% efficiency on the GigE link, 760 Gb/s over 16.25 hours could push about 5 PB of data.
    • How much data do you think is transferred when sorting 1 PB of data?
  5. 3800 nodes were used each with two quad core Xeon processors means 30,400 cores were used for this (?!)
    • What percentage of CPU time do you think they used to sort 1 PB of data on 30,400 (?!) cores?
  6. 3800 nodes were used each with 16 GB of memory means 60.8 TB of memory were used to sort 1 PB.
    • How much of the 60.8 TB of memory do you think was required to do the sort of 1 PB of data?
    • Are these Nehalems?  If not, if they were using Nehalem-based systems, do you think the run times would have changed due to increased memory bandwidth?
  7. Assuming “40 nodes per rack” is implying 40 1U servers, I think it is safe to assume that each rack (with 320 cores in 40 systems) uses around 10 kW of power.  With 95 racks like this, it seems this system would require just under 1 megawatt to operate.  Of course the various switches also contribute here.  Assuming $0.04 per kWh, it would have cost about $650 in electricity to do this sort.  This cost may double given the electricity needed to provide the required cooling.

Ok, so is this right?  Perhaps they meant to say 3800 cores instead of 3800 nodes?  Does Yahoo have a 30,400 core cluster sitting around to do these benchmarks?  I’m assuming this a cluster that they also use in their day-to-day operations?

    Panorama theme by Themocracy