Do We Have To Sneak Audit Site Hosts Now?
After all, what happened to American Eagle should have been impossible. The retailer and IBM, its hosting provider, had the right plan for dealing with an outage. Disk drive fails? Storage array recovers automatically. Second drive in the array fails? A quick restore brings the data back. Complete failure at the main datacenter? Switch to the backup site. The plan should have been bulletproof. Instead, it was a failure at every level, and American Eagle was crippled for more than a week.
Should American Eagle have sent a steady stream of IT employees to constantly check on IBM’s hosting work–or, better still, contracted an elite SWAT team to spy on Big Blue? That team probably would have spotted the faulty backups and the fact that they weren’t being routinely verified. It might have noted more subtle problems, like an aging storage array or a datacenter running just a bit too warm.
That approach would also have cost a small fortune, wiping out the cost advantages of outsourcing. It probably would have made American Eagle the kind of customer that just isn’t worth keeping. It would have been overkill.
But overkill isn’t necessary. Just a little more attention to how an outsourcer is doing–at only a little greater cost–can work wonders.
That means monitoring service levels to confirm they meet the SLA. It means making sure you have a clause in your contract that lets you audit, so you can confirm that backups are being made and they work, that datacenter operations appear to be crisp and professional, and that important plans (such as American Eagle’s planned-but-not-ready backup site) aren’t shelved. And it means actually doing those audits; not constantly, but often enough to make sure you’re getting what you’ve paid for–and often enough to make an impression.
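As a rough illustration of what such a periodic check might look for, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `BackupRecord` fields, the one-day and 90-day thresholds, and the idea that the provider hands over a report in this shape. A real audit would work from whatever reports and targets the SLA actually specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupRecord:
    """One row of a hypothetical backup report supplied by the provider."""
    system: str
    completed_at: datetime                 # when the most recent backup finished
    last_restore_test: Optional[datetime]  # when a restore was last shown to work


def audit_backups(
    records: List[BackupRecord],
    now: datetime,
    max_backup_age: timedelta = timedelta(days=1),         # assumed threshold
    max_restore_test_age: timedelta = timedelta(days=90),  # assumed threshold
) -> List[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    for r in records:
        if now - r.completed_at > max_backup_age:
            findings.append(f"{r.system}: most recent backup is too old")
        if r.last_restore_test is None:
            findings.append(f"{r.system}: backup has never been restore-tested")
        elif now - r.last_restore_test > max_restore_test_age:
            findings.append(f"{r.system}: restore test is out of date")
    return findings


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Measured availability over a reporting period, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_sla(total_minutes: float, downtime_minutes: float,
              target_percent: float) -> bool:
    """Compare measured availability against the target written into the SLA."""
    return availability_percent(total_minutes, downtime_minutes) >= target_percent
```

The point of the sketch is that "the backup ran" and "the backup can be restored" are tracked separately. A report can look healthy on the first count and still fail on the second.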
Such attention does more than just identify disasters-in-the-making. It also gooses providers just enough to remind them they can’t cut too many corners or become complacent. It shows them you’re serious and can’t be taken for granted, but without implying that the outsourcer is incompetent and completely untrustworthy.
Remember, you don’t really want to find a mess when you check on an outsourcer. You want to find that everything is working the way it should. That may mean a little more expense and a little less trust than you’d like.
But in the wake of American Eagle’s crash, it’s a tradeoff you can’t afford not to make.
August 5th, 2010 at 2:25 pm
Outsourcing is a lot like franchising. You franchise your business so you can scale it. In doing so, however, you end up depending on hundreds or thousands of independent franchisees to represent your brand before the public. If you don’t verify the execution at store level in a franchise business, you are walking dead. Corporate programs need to be executed consistently, health and safety standards adhered to, training taken, etc. Do retail chains rely solely on each franchisee’s goodwill and the training program? No, they don’t. Goodwill and training are necessary but not sufficient. Large retail chains, particularly those in food service, send district managers to their stores to do “store walks,” “store visits” or “store audits,” as they are sometimes called. Likewise, I don’t think you can or should blindly outsource. Surprise visits? Absolutely. Furthermore, visits should be scripted and recorded digitally. Any non-compliance issues should be noted and remedied. Trust your partners, yes, but verify.
August 17th, 2010 at 1:13 pm
Monitoring and responding to failed log backups at the disaster recovery site is easy and basic. The alarm message must be somewhere. The problem is who will respond, and how the issue gets escalated if no one responds. Those questions should be answered in the SLA.
Do we need a special on-site audit? Not necessary! However, we should review our SLA and make it clear how to respond to and escalate these kinds of failures.