Before I start writing ….
I want to set the stage here…
I didn’t think I liked blogs; I actually thought they were a bit self-indulgent!
I certainly STILL know I don’t like writing about myself!
But, every day I now see how blogs _help_ people like you and me.
To me, they are a great way for people to distil information in a friendly format.
The reason for this blog is you.
I feel like I am in a privileged position in what I do as a profession and I want to be able to share my experiences and information in order to help you in your daily endeavours. I had been asked before by people if I had a blog, why I didn’t and would I consider writing one. I resisted, in hindsight, wrongly. I said to myself, the next person who asks me, that will be the trigger-pull I need to start…that was today…
To state clearly, I am not out to promote myself. Some of my blog posts might be total hogwash or not what you see in your environment (I want to hear about that), and some will indecently try to promote some of the technology my company has to offer.
My name is James Baldwin and I work in EMC Corporation, in Cork, Ireland. I am (wait for this title!) EMC’s Global Solutions SQL and SharePoint Lead Engineer. I’m waiting on the business cards; it will be a riot 🙂
As you can guess, I don’t take myself seriously, but I DO take what I do very seriously. I suppose people call me, amongst other things, an EMC and application evangelist; I actually prefer to call myself a customer evangelist. The former then falls into place. Being a customer evangelist is far more important in my eyes, and I hope that is reflected in my subsequent blog posts.
I started working in EMC in 2001, when the shares were still soaring and business class travel on flights was standard.
Before that, I came from 3 years in a special wing of Dell engineering where we built custom or complex desktop and server builds. We handled all OSes and all hardware, engineered the first Red Hat 6.0.x orderable on a PE server, drowned in OS/2 Warp for a very special customer for a bit, and, importantly, delved into all kinds of challenges which customers had.
If you really want to rewind further, I did a Bart Simpson on my dad to force him to buy me a Spectrum 48K at the age of 13. Now he gets his own back on me any time he has an “anomaly” with his home PC. IT Karma. I remember loading VMware on Slackware 4.0, seeing my own PC booting inside itself, displaying a gammy pseudo Phoenix BIOS et al, saying to myself “Jeez that’ll never catch on!” 🙂 Think of the shares…Think of the shares…Forget the Sports Almanac, if I get a working DeLorean, I want the IT Almanac to go to 1996 with. Enough of that…
I arrived into EMC in a technical support capacity, supporting their enterprise backup product at the time, EMC Data Manager (EDM), which ran on Solaris, a slightly different beast to Linux, but a great OS, I must say, for multi-threaded applications, with some really well thought-out debugging tools. We backed up everything: all mainstream applications, all mainstream OSes. I quickly understood that regardless of the severity of a call, absolutely nothing is trivial to a customer.
It may well be trivial to someone preaching the topic, but when you are the customer, responsible for a live user environment where a critical business application depends on you and your team members, it’s a whole lot more serious. Go on, see if you can crack a joke with a customer who called you looking for help and guidance because their SAP instance is down and they need to recover ASAP. My record for affected users: 325,000. Won’t say why, who or how, but we got it fixed and afterwards figured out what went wrong, why, and how to prevent it. I can easily say quality and customer focus were driven into us in technical support. While the job was sometimes stressful, I loved every minute of it because of the satisfaction of helping solve problems for people. That and the fact the next support call was like a box of chocolates…yes, the guy on the bench…
I remember having a customer call me directly and say “James, don’t laugh, I just blicked SG2 on Exch-04”. 1,400 users. These things happen every single day.
Along came technology, disk costs lowered, and this funky thing called point-in-time replication became a household name (well, OK, in the storage nerd’s house).
EMC Replication Manager came along and changed things for us in support.
I changed role slightly and had more of a free hand in making things better for the support team in documentation, training and mentoring. In this role, I now understood more. I understood the customer’s problems, but as importantly, I understood the challenges of my fellow technical support people in dealing with such events, in gaining experience and knowledge and being able to apply it. Just as we stabilized our perceptions on Replication Manager and these Virtual Tape Library (VTL) Units, along came RecoverPoint! Again, another significant step in technology. Time to write more “uncovered” documentation to help ourselves figure this stuff out and to chase engineering groups to make their products more sustainable.
I started to understand how technology could actually help us (you and me) in our daily grind. Those poor customers who lost data once had to recover from tape, then more recently could recover from a point in time, and could now recover to any point in time. Ignore vendors, the story is there. Technology was getting better. In some circumstances, it was really helping, i.e. recovery. But technology was not helping in many respects: added complexity, the same old human traits (mistakes), misplacement of technology, and so on. It’s there today and will be there tomorrow.
I began to travel to customer sites to assist on the ground with highly technical issues, nothing in a specific area, but more around what environment or “solution” the customer was operating in. I now had the four angles: the customer, the technical support person, the product, and the person who designed the environment (be that person a customer, fellow employee or a third-party consultant).
I thought I had a pretty good idea and, more importantly, an appreciation of what problems exist in the IT industry…now how could I apply my knowledge more effectively, to prevent those customer issues in the first place?…
Wind forward to today…
I joined the EMC Global Solutions Center in 2007, after 6 good years in support. When I saw the job posting and read what it was about, I said to myself “this is for me”. The position was for a SharePoint solutions engineer.
The Global Solutions Centers (6 of them scattered around the world) are engineering centers of people who take a given solution, design it, build it, test it, break it, analyze it and document it. In a nutshell. We hope to find bugs, and when we do, we work them out with the product owners (EMC, VMware, Microsoft, Oracle, etc). We test the environment to scale. We document the best practices. We document what not to do, and why not. We try to take the guesswork out of a solution for a customer.
“What do I need to run 240,000 heavy SharePoint users in a tiered environment on Hyper-V with rapid backup?” – we answer these questions.
To apply that in context…
We work with customers, all EMC practice and engineering groups and application vendors (Microsoft, Oracle, SAP, etc) in trying to understand what the most common use cases are for certain applications and environments. Through customer and field feedback, we then decide what demands and requirements customers have on such an environment.
My team is focused on SQL and SharePoint solutions, but don’t let your questioning stop there. Half the battle is knowing I am in a good team of multi-disciplined people; I’ll get the answer.
We go ahead and build out the racks, servers, network & fibre channel switches, storage etc in our customer integration labs. “For this project, I’ll have 6x 24-core servers please :-)”
Using industry-standard load generation tools, we perform iterative tests on a good datacenter day and on a bad armageddon datacenter day: functional testing, core switches dying, clustered physical or virtual servers being unplugged (oops!), half the storage array going down – would never happen 🙂 You get the picture.
We test and profile the bad things, so you as a customer don’t have to guess what if…
Remember, we test with two things in the forefront of our mind at all times – application and customer.
As a customer, I don’t really care that an EMC switch, disk, cable or software component has died, I just care about my app.
We take that approach. If we see something we don’t like, we let the relevant people know. “You’ve got to fix this, don’t let a customer suffer this” is the message.
We are the “external” customer that product groups always wanted, but they don’t really know it.
I will share our experiences with you. I will share “look what we found” with you in concise, technical detail. I want you to share your experiences with me. The more I know about your struggles, the more I hopefully will be able to help. I am entirely willing to test something you have hit, in our labs, if we have the time and capacity.
In summary, I’m here to help…