Content Database Storage Provisioning for SharePoint – where’s my SQL DBA?!?!

May 4, 2011

So, you are a SharePoint Admin….

You want to create a new content Database …
It will be very large….you want it on separate storage…

What do you need to do to achieve this?
Well, from Central Admin/Stsadm/PowerShell….

…you cannot specify which storage SQL will create the content database on (it will use the SQL instance defaults for Data and Log files)….
…you cannot specify the initial size of the content database…
…you need to find your SQL DBA….
…and your storage admin….who needs to arrange a time to create and unmask the storage….

…ever try building a house and installing a kitchen?  You need the kitchen guy, the plumber and electrician….
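
In the meantime, the manual route looks roughly like this: have the DBA pre-create the content database in SQL Server with the file placement and initial size you want, then attach it, and SharePoint will happily use the existing empty database instead of creating its own.  A minimal sketch (the server name, database name, paths and sizes below are made-up examples, and it assumes the SQL Server PowerShell tools plus the SharePoint 2010 Management Shell):

# 1) DBA side: pre-create the content database with explicit file
#    placement and initial sizes (all names/paths/sizes are examples).
Invoke-Sqlcmd -ServerInstance "SQL01" -Query @"
CREATE DATABASE [WSS_Content_Large]
ON PRIMARY ( NAME = N'WSS_Content_Large',
             FILENAME = N'H:\SQLData\WSS_Content_Large.mdf',
             SIZE = 100GB, FILEGROWTH = 10GB )
LOG ON     ( NAME = N'WSS_Content_Large_log',
             FILENAME = N'L:\SQLLogs\WSS_Content_Large_log.ldf',
             SIZE = 10GB, FILEGROWTH = 1GB );
"@

# 2) SharePoint side: attach the pre-created database; because it
#    already exists, SharePoint uses it rather than creating a new one.
New-SPContentDatabase -Name "WSS_Content_Large" -DatabaseServer "SQL01" `
                      -WebApplication "http://intranet"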

Well, not any more.  Our Microsoft platform-focused development teams at EMC have listened and are working diligently on creating integrated, lightweight tools to help…

Announcing….. the EMC Storage Integrator Tool ….as part of a bigger Microsoft management release from EMC – life just got easier for us…

EMC Enhances Management of Virtualized Microsoft Applications

I can't say much more at this point, but I will have answers, details and demos hopefully next week….


We’ll be at Microsoft Tech-Ed 2010

June 4, 2010

Hello all,

Just in case you find yourself at Microsoft Tech-Ed 2010 New Orleans next week….we'll be there.

Eyal Sharon, Brian Henderson, and I will be there amongst other crack Microsoft folks, showcasing how EMC technology definitely enhances the experience of a Microsoft solution.

* Hyper-V deployments, from small and medium to seriously enterprise-scale
* Automated disaster recovery
* EMC integration and management
* Move your SQL database off 60 FC 15k drives to just 5 EFD (Flash) drives and get better performance!….

…just tiny bits of what's on offer, so…

If yer at it…please call by the EMC booth. 

Thanks – James.


Best Practice for Hyper-V with iSCSI

March 10, 2010

Hello folks.

In EMC Proven Solution testing for a given use case, I have come across a serious issue in relation to iSCSI responses, which inherently causes slow storage response times and very slow cluster polling and enumeration.

Test config
Windows 2008 R2 with Hyper-V.  6-node Hyper-V cluster.  65 disks.  2 iSCSI NICs per node.

What I saw was slowness in the cluster bringing a VM's Virtual Machine Configuration cluster resource online when the VM had a large number of disks configured.  The resources would time out as they passed their default pending and deadlock time-outs.

Firstly, when you online, refresh (or fail over) a VM configuration, Hyper-V performs a sanity check to ensure the underlying components of the VM (network, storage, etc.) are available.  This means scanning all the disks.
In my case, a VM with 25 disks took over 10 minutes to come online!

Why?  Well, working with Microsoft OEM Support, they asked me to try tightening TcpAckFrequency to 1 (millisecond).  I said, OK, I'll try it.
This brought the online time from 10 minutes down to 19 seconds!  (That drop makes sense: 10 minutes dominated by 200ms waits works out to roughly 3,000 delayed ACKs; shrink each wait to 1ms and you are left with little more than the real I/O time.)  Result…or maybe…

I needed to fully understand the issue, so out came Wireshark in order to run some Ethernet traces…

Let me explain what the actual issue is…

The problem is basically that iSCSI is a victim of Nagle's algorithm, a TCP optimization that minimizes congestion by coalescing small sends, and of the delayed-ACK timer it interacts with.  iSCSI is essentially stung by send coalescing.

From Windows 2003 onwards, the default TCP acknowledge time is 200ms.
This means that if a TCP segment (1462 bytes) is not full, the data may have to wait up to 200ms before it is actually sent from the Windows host.

This is a problem when using iSCSI for two reasons:
1) You need the fastest possible response time from storage for your application.
2) iSCSI payloads can be very small, especially SCSI control OpCodes (CDBs).

Now, the iSCSI CDB (control/query) OpCode commands involved in enumerating the disks during the online action have a tiny payload (10 bytes).

From looking at the Ethernet trace, the cluster disk driver is performing SCSI(10) read commands (LUN read inquiry, read capacity, etc.).  It does this sequentially, at least twice for each disk involved in the virtual machine.

With the default TCP ACK time of 200ms, each SCSI read command issued by the cluster node carries a payload of just 10 bytes (66 bytes on the wire).  That does not fill a TCP segment, so while the TCP payload is sent to the storage controller and the controller responds, the node holds back its ACK until either a segment on that NIC is filled or the 200ms maximum ACK time is hit.

So, let’s say I am the cluster disk driver and want to read LUN metadata as part of onlining a cluster disk…
…I send the command via the iSCSI Initiator, the storage controller responds…wait….wait….wait…the TcpAckFrequency trigger fires after a maximum of 200ms….only then is my ACK sent to storage, the TCP transmission completes, and the data is returned to the cluster disk driver process.
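
You can see this for yourself in a trace: filter on the iSCSI port (3260) and on Wireshark's measured ACK round-trip time, and the 200ms delayed ACKs jump right out.  Something like this against a saved capture (the filename is a placeholder):

# Read the saved capture and show only iSCSI traffic (TCP port 3260)
# whose ACK round-trip time is at or near the 200ms delayed-ACK timer.
tshark -r iscsi-trace.pcap -R "tcp.port == 3260 && tcp.analysis.ack_rtt >= 0.19"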

This means that for each SCSI command we send to storage (target and target LUN), we will typically end up waiting on the 200ms timer!  This happens regardless of how busy SCSI data I/O is, because the delayed-ACK mechanism operates per TCP connection, with Winsock-level granularity.

In a Windows iSCSI cluster this issue really comes to light, because the cluster operates on (controls/validates) its resources in a sequential manner.
So, for say 65 LUNs, each LUN takes a significantly longer time to validate/control because the 200ms ACK delay is incurred multiple times per LUN, and those delays add up across all 65 LUNs.

Storage response times should be in the sub-10ms range, not 200ms 🙂

This is also why I did not see this issue in Fibre Channel environments.

So, while hard-coding the TCP ACK frequency to 1ms per NIC helps, it may have adverse performance implications in terms of wire and storage-port congestion.

But…for iSCSI networks this should not really be a concern, because by best practice you should have isolated iSCSI networks with minimal hops between host and storage.


The real fix, I believe, is to use the TCP_NODELAY socket option for the iSCSI Initiator's connections.  This bypasses the Nagle algorithm completely for that process, so you don't even need to wait for that 1ms (it seems a short time, but it is still a trigger time); the SCSI command will fire immediately.
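
You cannot set this on the Microsoft iSCSI Initiator yourself (it lives in a kernel-mode driver), but for illustration, here is what the same knob looks like from user mode on a .NET socket; a throwaway sketch, with a made-up target address and the standard iSCSI port:

# Illustration only: TCP_NODELAY on a user-mode .NET socket.  The real
# fix would be Microsoft setting the equivalent inside the Initiator.
$client = New-Object System.Net.Sockets.TcpClient
$client.NoDelay = $true                  # sets TCP_NODELAY under the covers
$client.Connect("192.168.50.20", 3260)   # made-up target, iSCSI port 3260
# ...small writes now hit the wire immediately, no Nagle coalescing...
$client.Close()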

I have a design change request in with Microsoft to consider this as the way forward.  We probably won't see it until Windows 8, but hey!

How to set TcpAckFrequency on your iSCSI NICs

Regedit – back up your registry before proceeding 🙂

Subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\<Interface GUID>
Entry: TcpAckFrequency
Value Type: REG_DWORD, number
Valid Range: 0-255
Set to 1.

Do this only for your iSCSI NICs, unless directed otherwise.
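
If you would rather script it than click through Regedit, the same change can be made with PowerShell; a sketch, assuming you locate your iSCSI NICs by the static IP you gave them (the address below is a placeholder):

# Find the TCP/IP interface key (GUID) belonging to the iSCSI NIC,
# then add TcpAckFrequency = 1 to it.  Run from an elevated shell.
$base = "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"
Get-ChildItem $base | ForEach-Object {
    $props = Get-ItemProperty $_.PSPath
    # Match the interface on the static IP assigned to the iSCSI NIC.
    if ($props.IPAddress -contains "192.168.50.10") {
        New-ItemProperty -Path $_.PSPath -Name "TcpAckFrequency" `
                         -Value 1 -PropertyType DWord -Force
    }
}

A reboot is typically required before the new ACK behaviour takes effect.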

Hope this helps

James.