Recovery Disaster: PayPal Crash Strands Merchants

Written by Frank Hayes
November 4th, 2010

Two major technology glitches in a row knocked PayPal offline on Friday (Oct. 29), preventing the alternative payment giant from processing any E-tailer transactions for 80 minutes. First a network hardware failure shut down all PayPal payments. Then the backup plan failed when a handoff to a secondary datacenter didn’t go smoothly. The result was a worldwide shutdown of PayPal’s $40 billion merchant-services business that left E-tailers scrambling to limit damage from the outage.

PayPal’s outage again spotlights the problem of backup strategies that simply don’t. It’s painfully reminiscent of recent datacenter fiascos at American Eagle Outfitters and Wal-Mart. And while some major retailers were kept apprised of the progress of PayPal’s outage and disabled PayPal payment functionality on their E-Commerce sites to minimize problems, most of PayPal’s customers got the word late or not at all. Apparently there was no effective plan for dealing with that side of the outage, either.

PayPal isn’t saying much about the outage beyond its official statement by Scott Guilfoyle, the company’s senior VP for platform services: “At around 8:07 AM [San Francisco time Friday], a network hardware failure in one of our datacenters resulted in a service interruption for all PayPal users worldwide. Everyone in our organization was immediately engaged to identify the issue and get PayPal back up and running. We were not able to switch over to our backup systems as quickly as planned. We partially restored service by approximately 8:45 AM and the issue was fully resolved by 9:24 AM. A second service interruption started at around 11:30 AM and was partially resolved at 11:55 AM with full recovery at 12:21 PM.”

But the company’s “Live Site Status” blog tells a more detailed story. According to the technical blog, the incident (and PayPal) went down like this:

  • 8:06 AM (San Francisco time): Networking hardware failed in a PayPal datacenter, cutting off service for all PayPal users worldwide. Ordinary customers received a “Sorry—your last action could not be completed” message. E-tailers using the PayPal APIs got timeouts. PayPal won’t say exactly what happened (a backhoe cable cut? a datacenter fire? someone playing tip-the-cow with a rack of switches?).

  • 8:07-8:44 AM: Merchants and ordinary PayPal customers remained completely cut off as PayPal attempted to switch over to its Denver datacenter. PayPal won’t explain why the handoff failed for so long.

  • 8:45 AM: The PayPal Web site partially recovered, so some consumers could make payments. Merchant APIs remained down.

  • 9:24 AM: Merchant APIs began to recover, running out of the Denver datacenter.
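For E-tailers on the receiving end of those API timeouts, one way to limit the damage without waiting for word from the payment provider is a simple circuit breaker that hides the payment option after repeated failures and retries it after a cooldown. The following is only a minimal sketch of that idea, not anything PayPal or the retailers mentioned here are known to use; the class name, thresholds, and method names are all hypothetical.

```python
import time


class PaymentOptionBreaker:
    """Hypothetical helper: hides a payment option after repeated API failures."""

    def __init__(self, max_failures=3, cooldown_seconds=300, clock=time.monotonic):
        self.max_failures = max_failures          # failures tolerated before disabling
        self.cooldown_seconds = cooldown_seconds  # how long the option stays hidden
        self.clock = clock                        # injectable for testing
        self.failures = 0
        self.opened_at = None                     # when the option was disabled, or None

    def record_success(self):
        # Any successful call means the provider is reachable again.
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        # Count timeouts and errors; (re)start the cooldown once the limit is hit.
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

    def option_enabled(self):
        # Show the option if it was never disabled or the cooldown has elapsed.
        # After the cooldown, the next failure hides it again immediately.
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown_seconds
```

A checkout page would call `option_enabled()` before rendering the payment button, and the code that talks to the payment API would call `record_failure()` on each timeout and `record_success()` on each completed call.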

    One Comment

    1. Bill Bittner Says:

      There are two thoughts this whole incident inspires. The first is just the whole idea of backups in general. The simple answer is “practice, practice, practice”. Backup plans have to be exercised on a regular basis and they must go full circle, transferring to the backup site and also bringing services back on location.

      But the other thing that comes to mind is “too big to fail”, to borrow a phrase from the financial crisis. A lot of retailers are considering Cloud Computing, and they should. Cloud Computing makes significant sense economically, but it also introduces a whole new set of risk factors. The backup plan becomes even more significant because the retailer is counting on its service provider to be practicing it. As processing becomes more centralized, the impact of a single outage becomes more significant. At the same time, the processes necessary to ensure adequate backup are becoming more opaque. Retailers considering Cloud solutions should weigh this in their evaluations.

