EMC and Microsoft at the SharePoint conference!

October 4, 2011

Time:  Yesterday, 8.30 Pacific
Where:  SharePoint Conference 2011 in Anaheim, California
Venue: Conference kick-off keynote

On stage: A DJ and a high performance SharePoint farm in two racks.

Backbone: EMC VNX 5700 Unified Storage Array with NEC high density Servers.

So what? How about a 14TB content database with FAST Search and 100 million documents, that's what!

Delighted to say that EMC was center stage at the SharePoint Conference and a main focus of the main SharePoint demo.
The demo, which lasted 10 minutes, showed a large-scale SharePoint farm with an extremely large 14TB content database and FAST Search Server, running at full load, suffering a SQL Server node outage.
The environment used SQL Server Denali CTP3, and SQL’s AlwaysOn technology was key in failing over the SQL content databases in seconds.

EMC was key in maintaining performance around the clock with this solution, and the VNX array was not stressed at all even though a very high IOPS load was generated.

SharePoint 2010 SP1 brings much larger supported content database sizes: up to 4TB for general use
and UNLIMITED for Document Center and Records Center sites in specific circumstances (less than 5% of content accessed and less than 1% of content modified per month, on average)

More information
================
www.emc.com/sharepoint

whitepaper – Managing Multi-Terabyte Environments – http://go.microsoft.com/fwlink/?LinkId=223599

SP1 Announcement
http://blogs.technet.com/b/office_sustained_engineering/archive/2011/05/11/announcing-service-pack-1-for-office-2010-and-sharepoint-2010.aspx
Joel’s quick blog entry for the new limits –  http://www.sharepointjoel.com/Lists/Posts/Post.aspx?ID=457
Technet article on the details –  http://technet.microsoft.com/en-us/library/cc262787.aspx

SharePoint Conference Season

September 12, 2011

Hi all,

While May, June and August are the season for the big platform events such as EMC World, TechEd, and VMworld…
October is the season for my two major application events.

I am happy to announce that EMC are proud Gold Sponsors at:-

  • SharePoint Conference USA                    Anaheim, California – Oct 3-6
  • SQL PASS Summit                                    Seattle, WA – Oct 11-14

EMC @ the SharePoint Conference

  • Large booth where key experts from the EMC Business Units will be able to describe to you how to make your life easier with SharePoint
  • Demonstrations, mini-lectures, and Q&As
  • Free give-aways.  Yes, again, like TechEd, we will have free t-shirts and on the final day many, many cash spot-prizes for wearing your EMC T-shirt

      Two Sessions

Speaker(s):  James Baldwin, Eyal Sharon  (James & Eyal show)
Level: 200
Understand technical best practices to design and deploy a virtualized SharePoint that leverages FAST Search. Understand how to design a flexible and robust architecture that supports your advanced collaboration requirements. Understand how to architect a solution that addresses IT challenges for data growth, application availability and simplified management that also enables your users to find and leverage the right business information to make better decisions.

Speakers:-  Matt Roberts, Nate Treloar
Level 300
Demonstrate how to integrate external video metadata generation services with native SharePoint Search capabilities

Don't forget Europe!

The European SharePoint Conference is taking place in Berlin, Germany   – October 17-20.

I will be there presenting the following session:-

Optimize, Store and Protect SharePoint 2010 Server…Best Practices     Wednesday 15:00 – Session W21

Learn about the critical best practices and considerations for optimizing and growing SharePoint farms and storing user data efficiently and securely, while backing up terabytes of data in minutes. RBS (Remote BLOB Storage) and virtualization are just two of the many techniques discussed in this session. Realize the considerations for providing fast, automated disaster recovery for the entire SharePoint environment through SAN-based technology.

EMC @ SQL PASS Summit

We will have something kinda special at the SQL PASS Summit.  Can’t say more.

But what I can say…

  • Large booth area in the Pavilion, with SQL experts from EMC including two heroes from our team, Tony Wu and Bruce Ye, travelling all the way from Shanghai.
  • Demos, booths, best practices and, most importantly, application-led conversations around:
  • SQL Server scalability – Infrastructure
  • Optimized Data Protection
  • High availability to where? Same SAN? Same site? next door? next state? next country?  – All of the above <—
  • Something Flashy
  • Proven Solutions around high-speed SQL deployments, one of which is in build right now with Michael and David in our Cork labs.

Hope to see you there.

James.


EMC World 2010 – Boston

May 10, 2010

Hi all,

If you find yourself at EMC World, why not drop into the Solutions Pavilion, where we are showcasing SharePoint and SQL solutions.

I’m also presenting the following sessions

Tuesday 08:00         SharePoint Storage Best Practices

Wednesday 08:00   Birds of a Feather – Expert Panel – SharePoint, SQL, Oracle and SAP

Thursday 13:00   SharePoint Storage Best Practices (repeat)

I’ll drop the slides in here once we are done.

Cheers!

J


The iSCSI performance issue…

April 9, 2010

Just an update to the iSCSI performance issue.

Microsoft has released the KB article we worked on last week

http://support.microsoft.com/kb/2020559

Due to the significant change required, the design change request has been submitted but will not make it into Windows 7; it is currently being considered for the Windows 8 platform.


Best Practice for Hyper-V with iSCSI

March 10, 2010

Hello folks.

In EMC Proven Solution testing for a given use case, I have come across a serious issue in relation to iSCSI responses, which inherently causes slow storage response times and very slow cluster polling and enumeration.

Test config
Windows 2008 R2 with Hyper-V.   6-node Hyper-V cluster.  65 Disks.  2x iSCSI NIC per node.

What I saw was the cluster being slow to bring online the Virtual Machine Configuration cluster resource of any VM that had a large number of disks configured.  The resources would time out as they passed their default pending and deadlock time-outs.

Firstly, when you online, refresh (or failover) a VM configuration, Hyper-V performs a sanity check to ensure the underlying components of the VM (network, storage, etc) are available.  This means scanning all the disks.
So, say I had a VM with 25 disks (in my case), the VM config took over 10 minutes to online!

Why?  Well, working with Microsoft OEM Support, they asked me to try tightening TcpAckFrequency to 1 (so that every received segment is acknowledged immediately).  I say, OK, I’ll try it.
This brought the online time from 10 minutes to 19 seconds!  Result…or maybe…

I needed to fully understand the issue, so out came Wireshark in order to run some Ethernet traces…

Let me explain what the actual issue is…

The problem is basically that iSCSI is a victim of Nagle’s algorithm and TCP delayed acknowledgments, optimizations that cut congestion from TCP overhead by coalescing small sends and holding back ACKs.  iSCSI is essentially stung by send coalescing.

From Windows 2003 onwards, the default TCP delayed-acknowledgment timer is 200ms.
This means that if a TCP segment (1462 bytes) is not full, it may need to wait up to 200ms before the data is actually sent from the Windows host.

This is a problem when using iSCSI for two reasons:
1) You need the fastest response time possible from storage for your application
2) iSCSI payloads can be very small, especially SCSI control commands (CDBs).

Now, the iSCSI CDB (control/query) OpCode commands involved in enumerating the disks during the online action have a tiny payload (10 bytes).

From looking at the ethernet trace, the cluster disk driver is performing SCSI(10) read commands (LUN read inquiry, read capacity, etc).  It does this sequentially, at least twice for each disk involved in the virtual machine.

With the default delayed-ACK time of 200ms, each SCSI Read command issued by the cluster node has a payload of 10 bytes (66 bytes on the wire).  The SCSI Read command does not fill a TCP segment, so although the TCP payload is sent to the storage controller and the controller responds, the node waits to send the ACK until a segment on that NIC is filled or the maximum ACK delay of 200ms is hit.

So, let’s say I am the cluster disk driver and want to read LUN metadata as part of onlining a cluster disk…
…I send the command via the iSCSI Initiator, Storage Controller responds…wait….wait….wait…TcpAckFrequency trigger fires after a max of 200ms….only then is my ACK sent to storage….and the TCP transmission completes and the data is returned to the Cluster disk driver process.

This means that for each SCSI command we attempt to send to storage (target and target LUN), we will typically end up waiting for the 200ms timer!  This is regardless of how busy SCSI data I/O is, because the algorithm works at per-socket (Winsock) granularity.

In a Windows iSCSI cluster, this issue really comes to light because the cluster operates (controls/validates) on resources in the cluster in a sequential manner.
So, for say 65 LUNs, each LUN takes significantly longer to validate/control because the 200ms ACK timer delay is incurred multiple times per LUN.

Storage response times should be in the sub-10ms range, not 200ms 🙂
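
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes (my assumption, not something counted in the traces) that every stalled command waits the full 200ms and that commands are issued strictly one after another; the function names are made up for illustration.

```python
# Rough estimate only. Assumes each small SCSI command stalls for the full
# 200ms timer and that commands are issued strictly sequentially.
DELAYED_ACK_TIMER_MS = 200  # default timer quoted in the post


def stalled_commands(online_time_ms: int, timer_ms: int = DELAYED_ACK_TIMER_MS) -> int:
    """Number of full-timer stalls that would add up to the observed time."""
    return online_time_ms // timer_ms


def per_command_ms(online_time_ms: int, commands: int) -> float:
    """Average time per command for a given total online time."""
    return online_time_ms / commands


if __name__ == "__main__":
    before_ms = 10 * 60 * 1000   # "over 10 minutes" to online the 25-disk VM
    after_ms = 19 * 1000         # 19 seconds with TcpAckFrequency set to 1
    stalls = stalled_commands(before_ms)
    print(f"stalls in total: {stalls}")
    print(f"stalls per disk (25 disks): {stalls // 25}")
    print(f"ms per command after the change: {per_command_ms(after_ms, stalls):.1f}")
```

On those assumptions, the 10-minute online time corresponds to about 3,000 full-timer stalls, or roughly 120 per disk for the 25-disk VM, and the 19-second result works out to around 6ms per command, which is in line with the sub-10ms response times you would expect from storage.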

This is also why I did not see this issue in Fibre Channel environments.

So, while hardcoding TcpAckFrequency to 1 per NIC helps, it may have adverse performance implications in terms of wire and storage port congestion.

But…for iSCSI Networks, this should not really be of concern because by best practice, you should have isolated iSCSI networks with minimal hops between host and storage.

 

The real fix, I believe, is to use the TCP_NODELAY socket option for the iSCSI Initiator.  This bypasses the Nagle algorithm completely for that process, so you don't need to wait on any trigger at all – the SCSI command will fire immediately.
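
For readers who have not met TCP_NODELAY before, this is what the option looks like at the socket level. It is a minimal Python sketch of the general technique only; it is not the Microsoft iSCSI Initiator's code, and the helper names are hypothetical.

```python
import socket


def make_no_delay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 tells the stack to send small writes immediately
    # instead of coalescing them while earlier data is unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_no_delay_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

The option is per socket, which is why it is attractive here: only the connections that carry the small iSCSI commands change behaviour, and nothing else on the host is affected.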

I have a design change request in for Microsoft to consider this as the way forward. Probably won't see it until Windows 8, but hey!

How to set the TCPAckFrequency on your iSCSI NICs

Regedit – back up your registry before proceeding 🙂

Subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\<Interface GUID>
Entry: TcpAckFrequency
Value Type: REG_DWORD, number
Valid Range: 0-255
Set to 1.

Do this only for your iSCSI NICs, unless directed otherwise.
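
If you have several nodes to touch, the same setting can be expressed as a .reg file and imported with regedit. The Python sketch below only builds the file text; the function name and the all-zero GUID are placeholders, so substitute the real interface GUIDs of your iSCSI NICs, and still back up the registry first.

```python
# Builds the text of a .reg file that sets TcpAckFrequency on the given
# interfaces. It does not touch the registry itself.
TCPIP_INTERFACES_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
    r"\Tcpip\Parameters\Interfaces"
)


def build_ack_frequency_reg(interface_guids, value=1):
    """Return .reg file text setting TcpAckFrequency for each interface GUID."""
    if not 0 <= value <= 255:
        raise ValueError("TcpAckFrequency must be in the range 0-255")
    lines = ["Windows Registry Editor Version 5.00", ""]
    for guid in interface_guids:
        lines.append(f"[{TCPIP_INTERFACES_KEY}\\{guid}]")
        # REG_DWORD values are written as eight hexadecimal digits.
        lines.append(f'"TcpAckFrequency"=dword:{value:08x}')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Placeholder GUID for illustration only.
    print(build_ack_frequency_reg(["{00000000-0000-0000-0000-000000000000}"]))
```

Generating the file rather than editing each key by hand keeps the change limited to the interfaces you list, which matches the advice to apply it to the iSCSI NICs only.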

Hope this helps

James.


EMC Live! WebCast Series – SharePoint

March 4, 2010

Hi Folks

Back again – been a while, aye but have been working on some exciting stuff, which I will share with you in due course.

So today if anyone has an hour to kill, Eyal and I are presenting an EMC webcast (part 1 of a 4-part series) on SharePoint and touching on what’s coming in SharePoint 2010.

http://info.emc.com/mk/get/DBM6473-4099_raf_lp?reg_src=JamesB

 

Today will be

Thursday, March 4, 2010 – 12 pm PT / 3 pm ET
Learn how to design your SharePoint infrastructure to ensure optimal performance and scalability, as well as leverage the benefits of virtualization.

Part 2 is Dave from our SourceOne engineering wing

Thursday, March 11, 2010 – 12 pm PT / 3 pm ET
Find out how to mitigate risk, reduce costs, and improve SharePoint performance with EMC SourceOne.

Part 3 is Eyal and I again

Thursday, March 18, 2010 – 12 pm PT / 3 pm ET
Learn how to design, deploy, and manage your SharePoint infrastructure to ensure availability and rapid recovery, as well as understand what options are available—from native SQL Server functionality to array-based replication.

Part 4 is another Dave from another engineering wing, EMC Documentum
 
Enhancing SharePoint to Meet Your Information Management Needs

Thursday, March 25, 2010 – 12 pm PT / 3 pm ET
Discover how EMC Documentum integrates SharePoint into your broader information infrastructure, enabling you to cut operational costs and rein in server sprawl.

Hope to see ye there !

But I will share the content with ye later anyways 🙂

Thanks

James.


EMC Replication Manager for SharePoint – watch this!!!

October 16, 2009

As I mentioned before, I have been working on a special project; well, this is it.  SharePoint DBAs SHOULD be excited.

Imagine
* configuring backup protection for the whole SharePoint farm in less than 5 minutes!

* full backup of an active (240,000 heavy users) SharePoint farm
   – 3 hours 11 minutes, online, no disruption
   – 1.5TB of user content, 2.5TB of SharePoint files

* incremental backup (with a daily change rate of 1%)
   – 11 minutes

* restore a 100GB content database
   – 7 minutes

* perform item-level recovery from a backup in minutes
   – without disruption, without a recovery farm
 

This blog is all about making life easier for the SharePoint Admin, users and architects.  This product brings that thought much closer to reality.  EMC Replication Manager for SharePoint 5.2 SP2.

A single application, central console, simple, easy. A storage guy, a DBA, a Windows guy – they can all relate to it and understand it….

While the Blueprint documentation has not yet hit EMC.com, I share it here with you now.

h6600-backup-recovery-ms-sharepoint-clariion-cx4-replication-manager-ontrack-hyper-v-blueprint

For more information on what EMC can offer for SharePoint, go to http://tinyurl.com/EMCMOSS
Here is a diagram of the production farm.

 

To illustrate the point, I have created 4 video demonstrations:

1) Creating that application protection (backup configuration & scheduling)

2) Running a backup against a very busy SharePoint farm (worst-case scenario test)

3) Restoring a content database from a single user interface

4) Using the combination of Kroll Ontrack PowerControls and EMC Replication Manager to simplify item-level recovery