Storage Node Details Part II
In a previous post, Storage Node Basics, we went through how the storage for each compute node is set up. Today, we will discuss data rack capacities, because SQL Server Parallel Data Warehouse (PDW) offers two capacity models to choose from. When ordering the SQL PDW appliance, an organization can choose between the Performance Model and the Capacity Model for storage.
Currently, Microsoft is working with two hardware vendors, Dell and HP, with two more to come in the near future: IBM and Bull. Each Dell data rack includes eight compute nodes, and the Dell appliance uses one EMC AX4 storage array for each compute node. Because the HP MSA storage arrays are slightly smaller (2U for HP vs. 3U for EMC), ten compute nodes fit into each HP data rack. Both the EMC and HP storage arrays house 10 disks.
Performance Model
The Performance Model includes ten 450 GB SAS disks with spindle speeds of 15K RPM. SQL PDW compresses the data using SQL Server compression, which gives us additional capacity. For a Dell system, we can estimate about 36 TB of capacity per data rack at 2.5x compression. SQL PDW built on an HP system will achieve approximately 45 TB of capacity per data rack. A 40-node system could manage approximately 180 TB of previously uncompressed data.
Capacity Model
The Capacity Model includes 1 TB SATA disks with 7.2K RPM spindle speeds. For a Dell system, one can store about 80 TB of data in each data rack, again assuming a 2.5x compression ratio. SQL PDW built on an HP system has a capacity of about 100 TB for each data rack. A 40-node system would store about 400 TB of previously uncompressed data.
Dell/EMC AX4 (8 Arrays/Rack)
| Drive Capacity | Spindle Speed | Bus | Rack Capacity with 2.5x Compression | 40 Node System (5 Racks) |
|---|---|---|---|---|
| 450 GB | 15K | SAS | 36 TB | 180 TB |
| 1 TB | 7.2K | SATA | 80 TB | 400 TB |
HP MSA (10 Arrays/Rack)
| Drive Capacity | Spindle Speed | Bus | Rack Capacity with 2.5x Compression | 40 Node System |
|---|---|---|---|---|
| 450 GB | 15K | SAS | 45 TB | 180 TB |
| 1 TB | 7.2K | SAS | 100 TB | 400 TB |
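The figures in both tables reduce to a single per-node number. Dividing any rack capacity by its array count gives roughly 4.5 TB per node for the Performance Model and 10 TB per node for the Capacity Model, both already at 2.5x compression. Here is a minimal Python sketch of that arithmetic. The per-node values are inferred from the tables rather than taken from a published specification, and all names are illustrative.

```python
# Per-node capacity in TB of pre-compression data, assuming 2.5x compression.
# Inferred by dividing the rack capacities in the tables by nodes per rack;
# these are not official specifications.
TB_PER_NODE = {"performance": 4.5, "capacity": 10.0}

# Compute nodes (one storage array each) per data rack, by vendor.
NODES_PER_RACK = {"dell": 8, "hp": 10}


def rack_capacity_tb(vendor: str, model: str) -> float:
    """Capacity of a single data rack for the given vendor and storage model."""
    return NODES_PER_RACK[vendor] * TB_PER_NODE[model]


def system_capacity_tb(nodes: int, model: str) -> float:
    """Capacity of a whole appliance with the given number of compute nodes."""
    return nodes * TB_PER_NODE[model]


if __name__ == "__main__":
    for vendor in NODES_PER_RACK:
        for model in TB_PER_NODE:
            print(f"{vendor:5s} {model:12s} {rack_capacity_tb(vendor, model):6.1f} TB per rack")
    print("40 nodes, performance:", system_capacity_tb(40, "performance"), "TB")
    print("40 nodes, capacity:   ", system_capacity_tb(40, "capacity"), "TB")
```

Because the per-node figure is the same for both vendors, a 40-node system lands on the same total whether it is built from five Dell racks or four HP racks.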
Expanding PDW
No matter which storage model you choose, adding capacity to the system is as straightforward as adding another data rack when you need one. An added benefit is that each extra rack brings more processing power and memory along with the additional storage. Currently, Microsoft plans on supporting up to 40 nodes in an appliance. This limit exists because the development team is not able to build a test appliance beyond 40 nodes. Even within Microsoft, there are monetary limits to what we get to play with. Architecturally, there is no reason why the number of nodes supported can't increase in the future.
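Since capacity grows one data rack at a time, a rough sizing estimate is just a ceiling division checked against the 40-node limit. The sketch below assumes the rack capacities quoted earlier (at 2.5x compression) and ignores growth headroom and working space. The function name and parameters are hypothetical and not part of any PDW tooling.

```python
import math


def racks_needed(data_tb: float, rack_capacity_tb: float,
                 nodes_per_rack: int, max_nodes: int = 40) -> int:
    """Return the number of data racks needed to hold data_tb of
    pre-compression data, or raise if that would exceed max_nodes."""
    racks = math.ceil(data_tb / rack_capacity_tb)
    if racks * nodes_per_rack > max_nodes:
        raise ValueError(
            f"{racks} racks ({racks * nodes_per_rack} nodes) exceeds "
            f"the {max_nodes}-node limit"
        )
    return racks


if __name__ == "__main__":
    # 120 TB on Dell Performance Model racks (36 TB and 8 nodes each).
    print(racks_needed(120, 36, 8))
    # 100 TB on HP Performance Model racks (45 TB and 10 nodes each).
    print(racks_needed(100, 45, 10))
```

A request that would need more than 40 nodes, such as 200 TB on Dell Performance Model racks, raises an error rather than returning a rack count the appliance cannot currently support.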

