
Down For 8 Days: American Eagle’s Site Disaster

Written by Frank Hayes and Evan Schuman
July 29th, 2010

In one of the longest site outages ever for a multi-billion-dollar retailer, Tuesday (July 27) saw the apparent end of more than a week of Web problems and days of an outright crashed site for Pittsburgh-based clothing chain American Eagle Outfitters, which outsources much of its Web operations to IBM. The site crashed last Monday (July 19) and stayed dark until Friday (July 23), when it limped along with various parts not functioning until Tuesday afternoon (July 27).

The site’s problems, though, shed light on an interesting strategy. During the many days of complete Web site death, the $2.7 billion apparel chain’s mobile site was still up. But it apparently was not able to perform purchases. Officials at American Eagle Outfitters, IBM and Usablenet—which handles the chain’s mobile site—wouldn’t comment on the mobile site’s functionality during the crash.



But this raises the question: Should retailers look to their mobile sites as emergency backups for their Web sites? Should pages indicating that a site is down automatically include a link to the site’s mobile version?

Mobile sites, of course, work just as well on desktop machines as they do on phones. American Eagle Outfitters, which has the admirably short URL of ae.com, also exists as a mobile site.

Before we dive into that mobile-as-site-backup issue, let’s look at exactly what happened with American Eagle’s site. None of the players involved would get specific as to what was wrong with the site, other than to say that there was no upgrade going on at the time and that the site experienced “a hardware issue.”

A server failure almost certainly would not have caused this problem; redundant servers would likely have kicked in while the defective machine was replaced with a new server and a backup was restored. That process would have taken a few hours, not almost eight days.

This delay suggests some sort of storage problem. Say the storage array begins to fail. OK, no problem, we’ll just find the bad drive and replace it. Whoops, looks like something has corrupted multiple drives. (That could happen if power gets flaky inside the array.) Now we have a catastrophic failure of the storage array. No problem, we’ll just fix the hardware and restore.

Whoops, new problem: Turns out this problem has been going on for a while. The last set of backups is corrupted. So is the set of backups before that. Sorting through to reconstruct good data is going to take time.
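To make that sorting step concrete, here is a minimal Python sketch of walking backward through backup sets until one still matches the checksum recorded when it was written. Everything in it is an assumption for illustration: the function names, the backup labels and the sample data are hypothetical, and nothing here reflects how American Eagle or IBM actually store or verify backups.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a backup's contents."""
    return hashlib.sha256(data).hexdigest()


def latest_good_backup(backups):
    """Find the newest backup whose contents still match its recorded checksum.

    `backups` is a list of (label, data, recorded_checksum) tuples ordered
    oldest first. Returns the label of the newest intact backup, or None if
    every set is corrupted.
    """
    for label, data, recorded in reversed(backups):
        if sha256_of(data) == recorded:
            return label
    return None


# Hypothetical backup sets: the two most recent were silently corrupted
# after their checksums were recorded.
backups = [
    ("set-1", b"orders-v1", sha256_of(b"orders-v1")),
    ("set-2", b"orders-v2-damaged", sha256_of(b"orders-v2")),
    ("set-3", b"\x00\x00garbage", sha256_of(b"orders-v3")),
]

print(latest_good_backup(backups))
```

A check like this only helps if the checksums were recorded somewhere the failing array could not touch, and if someone runs the check before the day the restore is needed.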

Alternatively: All recent backup sets are toast. Maybe nobody was verifying that the data was actually being written. However, all the transactions are being logged. No problem, then: All it takes is a lot of time and special expertise to essentially rerun all the recent transactions (since the last good backup) into an empty database, merge the new stuff with the old stuff and then load it all back into the replacement hardware.
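The replay idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any real database's recovery tooling: the log format (a sequence number, an operation, a key and a value), the `replay` function and the sample orders are all hypothetical. For brevity it applies the logged transactions directly on top of a copy of the last good backup, which folds the "merge" step into the replay.

```python
def replay(snapshot, log, snapshot_seq):
    """Rebuild current state from the last good backup plus the transaction log.

    snapshot     -- dict holding the data as of the last good backup
    log          -- list of dicts with "seq", "op" ("put" or "delete"),
                    "key" and, for puts, "value"
    snapshot_seq -- sequence number of the last transaction already
                    contained in the snapshot
    """
    db = dict(snapshot)  # work on a copy; never modify the backup itself
    for entry in sorted(log, key=lambda e: e["seq"]):
        if entry["seq"] <= snapshot_seq:
            continue  # already reflected in the backup
        if entry["op"] == "put":
            db[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            db.pop(entry["key"], None)
        else:
            raise ValueError(f"unknown operation: {entry['op']!r}")
    return db


# Hypothetical data: the backup is good through transaction 2.
snapshot = {"order-1": "shipped", "order-2": "pending"}
log = [
    {"seq": 2, "op": "put", "key": "order-2", "value": "pending"},
    {"seq": 4, "op": "delete", "key": "order-1"},
    {"seq": 3, "op": "put", "key": "order-3", "value": "pending"},
    {"seq": 5, "op": "put", "key": "order-2", "value": "shipped"},
]

restored = replay(snapshot, log, snapshot_seq=2)
print(restored)
```

Even in this toy version, the order of the log entries matters and every entry after the backup has to be intact. On a production retail database, with far more transactions and interdependent tables, that is where the time and special expertise go.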

By the way, it seems that American Eagle was recently searching for a “Manager – Business Continuity & Disaster Recovery”. The job was still an active posting on May 25 but has since been filled. Not a moment too soon, eh? (Thanks, Google cache!)



One Comment

  1. Gareth Evans Says:

    Contingency planning is fraught with all sorts of pitfalls. The suggestion about running your mobile site on “mirrored versions of the key databases” sounds great, apart from the fact that, in AE’s case, the gradual corruption of the main site’s databases due to the array problem would also be “mirrored” onto the mobile site.
    You could handle bandwidth issues by locating in the same datacentre and sharing the main site’s bandwidth. But that leaves both sites vulnerable to either a bandwidth outage or a datacentre failure (say, the power supply fails).
    It reminds me of the phrase currently very popular with politicians (certainly over here in the UK): “it’s a problem of unintended consequences”.
