New PCI Changes: Network Segmentation, One-Way PAN Hashing
One-Way Hashing Of PANs
Another change will be a clarification of strong one-way hashing of PANs. Merchants can remove PAN data from PCI scope either by truncating it (deleting all but the first 6 and last 4 digits) or by using a secure one-way hash that cannot be reversed.
A controversy is simmering among some in the cryptographic community regarding the reversibility of one-way hashes, and it looks like this change is designed to address it. The details of the arguments are mathematical, and they have to do with the limited size and randomness of PANs (e.g., the BINs are a limited set of numbers, PANs are 16 digits, one digit is a calculated check digit, etc.).
Once again, this clarification promises to be a welcome step by the Council to help merchants and their QSAs determine what is and what is not in scope. We’ll have to see the details to know how well the controversy has been laid to rest.
Moving To A Three-Year PCI Lifecycle
One of the more significant changes we will likely see is the Council moving to a three-year PCI lifecycle. At this time it is only a recommendation to the Technical Working Group; no final decision has been made to move from the current two-year lifecycle to a three-year lifecycle for PCI revisions. The longer period reflects the lower number of new requirements expected, and it would allow more time for feedback. Emerging threats and new technologies would be addressed through a combination of interim revisions, supplements and the FAQ on the Council’s Web site.
This change is a positive one for merchants, and it further defuses the criticism that PCI is a shifting standard (an argument I disagree with). A longer lifecycle gives more stability to the requirements while still preserving the flexibility to respond to sudden developments or new attack vectors. I hope this change is adopted, but I have one concern.
If the Council is going to rely on its Web site and FAQ to communicate changes for the three years between updates, it needs to find a better way to distribute new information to merchants. For example, I wish the Council would provide an RSS feed on the FAQ (it has done so for announcements) so merchants and QSAs could learn about updates as they happen.
Although the FAQ is currently searchable, it can’t notify you when a new or revised answer is posted. A real-life example of this difficulty was the recent revision, and then re-revision, of the Council’s position on voice recordings. If the PCI Council is going to use its FAQ and its Web site to update merchants, then I hope it will find an automated way for all of us to learn when new FAQs or clarifications are posted.
Acceptable Network Segmentation Definition
In another change, we can expect the Council to clarify what constitutes acceptable network segmentation. Although segmenting your cardholder data environment from the rest of your network is not required by PCI, it is the only way to approach compliance that preserves both your budget and your sanity.
Segmentation sounds simple, but there are frequently questions about whether particular implementations are “segmented enough” to reduce scope. I am hoping the revised version will have examples, but there are so many possible alternatives that I am not holding my breath. Nevertheless, any additional guidance will be very welcome.
End-To-End Encryption, Tokenization, EMV
As part of the changes, we can expect to see position papers that provide guidance on a range of emerging technologies. One of the first will address end-to-end encryption, and we can expect it this summer or fall. Other papers will address tokenization and even the Europay-MasterCard-Visa (EMV) chip-card standard.
It will be interesting to see if the EMV paper addresses the most recent compromises, especially with the recent conversion to chip-and-PIN in the Canadian market.
The position papers will be welcome, especially if they fulfill the promise of combining a primer on the topic with recommendations for implementing the technology in a PCI-compliant manner. Don’t expect the implementation recommendations to be too specific, though.
Also among its clarifications, the Council will update its table of what constitutes cardholder data and, interestingly, address the applicability of PCI for card issuers. This latter issue is one that has annoyed and upset merchants since the dawn of PCI.
I am looking forward to this clarification knocking down some of the urban myths about whether PCI applies to issuers. The Council’s official position (FAQ #5391) is that PCI applies to everyone. But enforcement is a brand issue, not a PCI Council issue.
April 15th, 2010 at 9:33 am
Changing the way secure hashed PANs are used is going to cost retailers millions in updates to systems they just stood up. I bet most will find a way to sweep the compliance issue under the rug without addressing the actual security problem.
April 15th, 2010 at 11:19 am
Who was the chump that came up with “strong one-way hashes that cannot be reversed”? There’s an oxymoron.
Who is kidding whom here? How long does anyone think it takes to hash every possible number and pull out the combinations that are MOD10 numbers? You are only dealing with a 13-16 digit PAN space! How difficult does anyone think it is to put these numbers into a lookup table?
HASH in, PAN out. It’s that simple. Perfect application of a rainbow table. This hashing “solution” solves virtually nothing.
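To make that concrete, here is a minimal sketch of such a lookup table, assuming an unsalted SHA-256 hash and a known six-digit BIN. The BIN, digit counts, and function names are illustrative, and the demo uses a tiny 3-digit account space so it runs instantly; scaling the same loop to the full 9-digit space is only a matter of compute time.

```python
import hashlib

def luhn_check_digit(partial: str) -> str:
    """Compute the trailing MOD10 (Luhn) check digit for a partial PAN."""
    total = 0
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:          # every second digit, counting from the check digit
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

def build_lookup_table(bin_prefix: str, account_digits: int) -> dict:
    """Precompute hash -> PAN for every Luhn-valid PAN under one BIN."""
    table = {}
    for n in range(10 ** account_digits):
        partial = bin_prefix + str(n).zfill(account_digits)
        pan = partial + luhn_check_digit(partial)
        table[hashlib.sha256(pan.encode()).hexdigest()] = pan
    return table

# Demo scale: 3 unknown digits instead of 9, so this runs in milliseconds.
table = build_lookup_table("411111", account_digits=3)
stolen_hash = next(iter(table))        # pretend this came from a breached database
print(stolen_hash[:16], "->", table[stolen_hash])   # HASH in, PAN out
```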
There is no replacement for properly implemented strong encryption algorithms. While on this topic, how many are encrypting a 16-digit (16-byte) number with a 256-bit (32-byte) encryption key? Brilliant; ever heard of probable-plaintext cryptanalysis?
Perhaps it is time to start rewarding creative people who are able to poke holes in “requirements” like this. Practical views of attacker capability need to drive standards that adequately manage risk, which in turn properly protects cardholder data.
Perhaps a points system needs to be developed for protection measures implemented which translate to a risk score. Perhaps the rules for the risk score need to be based upon merchant levels. Bottom line is always how many cards are at risk, and how at risk are they.
PCI claims to be a standard that requires all answers to be YES for compliance. Think of it as binary AND logic across every control: there is one state of YES, and all other answers are NO.
Ultimately a business can do so much to get a YES on compliance and still totally miss the boat on managing risk. In a sense this is an exercise in copying someone’s paper in school only to find out that they flunked.
April 15th, 2010 at 12:28 pm
I disagree with security manager’s comment. A salted hash approach is a great way to make a rainbow table impractical. Hashing can securely enable great functionality when done right.
April 15th, 2010 at 12:56 pm
I question where the term “strong one-way hash” came from. I assume they selected this term to disqualify older MD4 and MD5 algorithms that have been compromised. To me it would have been better to define an acceptable list of hash algorithms. By excluding “weaker” hash algorithms, they may be shooting themselves in the foot, because a weaker algorithm with limited resolution produces multiple false positives in a lookup table. By this I mean that a simple CRC-16 or CRC-32 would suffice for most merchants, especially if combined with the last 4 digits of the card number. While these hash methods would be considered “weak,” both will generate duplicate hits for multiple card numbers, making a lookup table virtually useless.
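The collision point is easy to demonstrate. A minimal sketch follows; a CRC-32 truncated to 16 bits stands in for CRC-16 (Python’s standard library only ships CRC-32), and the BIN and counts are illustrative:

```python
import zlib
from collections import defaultdict

# Map each 16-bit checksum value to every candidate PAN that produces it.
buckets = defaultdict(list)
for n in range(100_000):                       # a small slice of one BIN's account space
    pan = "411111" + str(n).zfill(9) + "0"     # structure only; check digit faked as '0'
    crc16 = zlib.crc32(pan.encode()) & 0xFFFF  # truncate CRC-32 to 16 bits
    buckets[crc16].append(pan)

ambiguous = sum(1 for pans in buckets.values() if len(pans) > 1)
print(f"{ambiguous} of {len(buckets)} checksum values match more than one PAN")
```

With 100,000 PANs squeezed into 65,536 possible values, most entries collide, which is exactly why a reverse lookup against a short checksum returns false positives.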
As far as “strong” hashes go, hopefully they will require a hash salt of some sort and limit the scope of a particular salt value (e.g., not all customers of one payment processor should share the same salt value).
April 15th, 2010 at 4:19 pm
Can someone point me to the official article that states that PCI will become the law of the land (I’m assuming in the US) as of October 2010?
Or am I reading this wrong, and the author is merely saying that there will be another official PCI release in October, not that PCI becomes a legal requirement for all?
April 15th, 2010 at 4:28 pm
Editor’s Note: The term “law of the card-processing land” was not meant literally. It simply meant that it will be the PCI standard as of that point.
April 16th, 2010 at 6:30 am
The new version of the standard will come into effect Oct 1st, 2010, on a global level. However, merchants, service providers, etc., do not have to be audited against that version until Oct 1st, 2011. So if your audit is due in Jan 2011, you can still be assessed under v1.2.
April 16th, 2010 at 8:58 am
“Can someone point to me the official article that states that PCI will become the law of the land….”
This is the link to the Life Cycle. About as close as you can get at the moment to a schedule.
https://www.pcisecuritystandards.org/pdfs/OS_PCI_Lifecycle.pdf
April 16th, 2010 at 6:03 pm
This issue is not “simmering” in the cryptographic community. It’s a simple mathematical fact that the level of attainable security with hashing has a distinct limit if it’s not implemented and used correctly.
Security has to be correct; otherwise you are achieving nothing, let alone meeting a compliance requirement.
Luther Martin posted on the specific facts around this last year in a blog post, which is well worth reading for those who want the details:
Some people want to use hashing to render cardholder information unreadable, but a closer look at hash functions shows that this technique ends up either being insecure or, if it’s done in a secure way, equivalent to encryption, because the security depends on the secrecy of secret information.
The FAQ for the PCI DSS has the following to say about using a cryptographic hash function to render cardholder data unreadable:
Are hashed Primary Account Numbers (PAN) considered cardholder data that must be protected in accordance with PCI DSS?
One-way hashing meets the intent of rendering the PAN unreadable in storage; however the hashing process and results, as well as the system(s) that perform the hashing, would still be in scope to assure that the PAN cannot be recovered. If the hashing result is transferred and stored within a separate environment, the hashed data in that separate environment would no longer be considered cardholder data and the system(s) storing the hashed data would be out of scope of PCI DSS. If however, the system hashes and stores the data on the same system, that system is considered to be storing cardholder data and is within PCI DSS scope. The difference lies in where the data is hashed and then stored.
More on hashing: A hash is intended to be irreversible by taking a variable-length input and producing a fixed-length string of cipher text. As the PAN has been ‘replaced’, it should most often be considered out of scope in the same manner receipt of truncated PANs are out of scope. However, PCI DSS Requirement 3.4 also states that the hash must be strong and one-way. This implies that the algorithm must use strong cryptography (e.g. collisions would not occur frequently) and the hash cannot be recovered or easily determined during an attack. It is also a recommended practice, but not specified requirement, that a salt be included.
Since the intent of hashing is that the merchant or service provider will never need to recover the PAN again, a recommended practice is to simply remove the PAN rather than allowing the possibility of a compromise cracking the hash and revealing the original PAN. If the merchant or service provider intends to recover and use the PAN, then hashing is not an option and they should evaluate a strong encryption method.
Note that including a salt is recommended but not required. The PCI SSC should consider revising this to require a salt and to reconsider how this affects determining exactly which systems are in scope and which ones are not for a PCI DSS assessment.
A hash function H takes a message M and calculates a message digest or hash D=H(M) from it. A cryptographic hash function is one in which the following three operations are adequately hard:
1. Finding two messages M1 and M2 such that H(M1)=H(M2). This is called finding a collision.
2. Given a message digest D, finding a message M with H(M)=D. This is called finding a preimage.
3. Given a message M1 and its digest D=H(M1), finding another message M2 that produces the same digest, i.e., D=H(M2). This is called finding a second preimage.
When a hash function is used to render cardholder data unreadable, we’re really saying that it needs to be hard to find a preimage for a given message digest. If it’s easy to do that, then an attacker can recover a PAN from a hash of the PAN, which means that the PAN wasn’t really unreadable. Making a PAN unreadable really requires more than just running it through a cryptographic hash function. This is because there really aren’t that many possible PANs.
You can divide a 16-digit PAN into three parts. The first six digits are the Issuer Identification Number (IIN). The next nine digits are an account number. The last digit is a checksum that’s calculated from the previous 15 digits.
With a 16-digit PAN, there are 10^16 possible PANs. Calculating all 10^16 possible message digests for these PANs sounds hard, but it doesn’t require anywhere near the effort needed to break other forms of cryptography. It’s roughly equivalent to the work required to break a 53-bit cryptographic key. That’s a non-trivial amount of work, but not enough to be considered secure against today’s hackers.
On the other hand, because the first six digits of a PAN can often be guessed, reversing a hash of a PAN is probably even easier than that; it’s very reasonable for a hacker to be able to guess the IIN.
The IIN just tells you what type of card a PAN is from and what bank issued the card. If you’re a hacker that manages to breach the security of a particular bank, for example, then it’s very easy to greatly limit the range of possible IINs, leaving only the account number and the checksum that are unknown.
If you know the first six digits of a PAN, then recovering the PAN from its hash is very easy. You only have to calculate 10^10 possible message digests, which is roughly the work required to break a 33-bit cryptographic key. That’s an amount of work that’s fairly easy with today’s computers, and one that’s feasible for many hackers to do.
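For readers who want to check the arithmetic, both bit-strength figures fall straight out of a base-2 logarithm; a quick sketch:

```python
import math

# Work factor of exhausting a decimal digit space, expressed in key bits.
print(math.log2(10 ** 16))   # ~53.2 bits: all 16-digit PANs
print(math.log2(10 ** 10))   # ~33.2 bits: IIN known, 10 digits left to guess
```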
This means that if an attacker knows the IIN part of a PAN then replacing the PAN by a hash of the PAN doesn’t really provide that much security for the PAN. It provides some security, but not enough to really defeat a moderately-determined attacker.
One way to make it harder for an attacker to recover a PAN from a hash of the PAN is to add additional information called a salt to the PAN when it’s used to calculate a hash of it. So instead of calculating D=H(PAN), you might calculate D=H(PAN||SALT) instead. This makes it much harder for an attacker, but it also requires keeping the value of the salt secret to make it difficult for a hacker to find the value of a PAN from a hash of the PAN.
If the salt isn’t secret, then using it doesn’t make it harder for an attacker to find a preimage of D, which means that it’s no more difficult to recover a PAN from a hash of the PAN. If this is the case, then the reason behind replacing a PAN with a hash of the PAN no longer makes sense, because the hash is no longer effectively irreversible.
On the other hand, if the difficulty of recovering a PAN from a hash of the PAN depends on the secrecy of a salt, then there’s no real difference between the protection provided by replacing a PAN with a hash of the PAN and replacing a PAN with an encrypted version of the PAN. In the case of using encryption, we call this value a cryptographic key. In the case of using a salted hash, we call this value a salt. In both cases, reversing the transformation is easy if an attacker has access to a secret. This means that for the purposes of complying with the PCI DSS, the two probably ought to be considered equivalent.
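To illustrate the equivalence being described here: hashing with a secret salt is, in effect, keyed hashing, and HMAC is the standard construction for that. A minimal sketch, with the key value a placeholder:

```python
import hmac
import hashlib

# A secret salt used this way must be protected exactly like an encryption key.
SECRET_SALT = b"32-random-bytes-guarded-like-a-key"

def pan_digest(pan: str) -> str:
    """H(PAN || SALT) with a secret salt, done properly via HMAC-SHA256."""
    return hmac.new(SECRET_SALT, pan.encode(), hashlib.sha256).hexdigest()

# Without the secret, an attacker cannot precompute digests for the 10^10
# candidate PANs; with it, the transformation is as reversible as encryption
# with a stolen key.
print(pan_digest("4111111111111111"))
```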
April 16th, 2010 at 7:26 pm
I’d like to respond to a couple of the comments.
For those following the hashing discussion, I’d refer you to the PCI Council FAQ 8718 (http://selfservice.talisma.com/display/2n/kb/article.aspx?aid=8718). It has an excellent summary of how hashing can meet the intent of Requirement 3.4 and what is acceptable. Give it a read.
As for the effective date of the new release, it is expected to be 1 October 2010 per published plans. That is when it becomes “PCI law of the land.” For merchants and processors in the process of compliance validation, if the past is a guide they can expect to have a three-month (*not* one-year) grace period. That is, you might expect to have until December 2010 to use version 1.2 in some cases. Check the Council website and ask your QSA if you have questions.
Remember, compliance enforcement is up to the card brands and not a PCI Council decision.
April 19th, 2010 at 6:39 pm
Walt, the above link to the FAQ does not work. However, I was able to get to the cited text by starting at the pcisecuritystandards.org site, clicking on the FAQs link in the left navigation bar, and entering 8718 into the search box.
April 19th, 2010 at 7:19 pm
Editor’s Note: David and Walt, there comes a time when even the power of a good HREF link must give way to common sense. The FAQ in question is barely 12 lines! Here it is, in its entirety, from PCI’s site:
“Are hashed Primary Account Numbers (PAN) considered cardholder data that must be protected in accordance with PCI DSS?
ANSWER: One-way hashing meets the intent of rendering the PAN unreadable in storage; however the hashing process and results, as well as the system(s) that perform the hashing, would still be in scope to assure that the PAN cannot be recovered. If the hashing result is transferred and stored within a separate environment, the hashed data in that separate environment would no longer be considered cardholder data and the system(s) storing the hashed data would be out of scope of PCI DSS. If however, the system hashes and stores the data on the same system, that system is considered to be storing cardholder data and is within PCI DSS scope. The difference lies in where the data is hashed and then stored.
More on hashing: A hash is intended to be irreversible by taking a variable-length input and producing a fixed-length string of cipher text. As the PAN has been ‘replaced’, it should most often be considered out of scope in the same manner receipt of truncated PANs are out of scope. However, PCI DSS Requirement 3.4 also states that the hash must be strong and one-way. This implies that the algorithm must use strong cryptography (e.g. collisions would not occur frequently) and the hash cannot be recovered or easily determined during an attack. It is also a recommended practice, but not specified requirement, that a salt be included. Since the intent of hashing is that the merchant or service provider will never need to recover the PAN again, a recommended practice is to simply remove the PAN rather than allowing the possibility of a compromise cracking the hash and revealing the original PAN. If the merchant or service provider intends to recover and use the PAN, then hashing is not an option and they should evaluate a strong encryption method.”
April 22nd, 2010 at 1:42 pm
Who else is hungry? Dinner time — to salt or not to salt.
On a more serious note – I am glad my concerns around this were expanded upon.
On solving “normal” hashing (without the use of salt), the expansion of the concern was well put; better than I could have done (which is why I elected not to), so GREAT JOB.
I considered attempting to write something on the number of bits you actually have to solve for, since the BINs are all well known and the last 4 may be known in some environments, but I didn’t want the main point to be missed.
Great explanation of what salted hashes and encryption have in common.
If the salt is not sufficiently complex or long, it provides little to no value.
The times when someone should select hashing instead of encryption are really the times when they should probably choose secure deletion instead of retention.
If someone truly believes they don’t need the data anymore (which is what they are saying when they hash it instead of encrypting it), they will have a hard time making a business case for why they have the PCI data in any form at all.
Perhaps this is a case of giving too much guidance to people who are not actually qualified to make smart decisions about it.
The law of nature says the stupid shall be punished — indeed — along with the rest of us who trusted “the stupid” with our credit card number in good faith.
In a tokenization world I’ve seen hashing as well as salted hashing applied. I can intelligently argue that the designers going down this path have implemented a completely flawed system. It reminds me of the original design of WEP. Perhaps someone was cheating off the kid that flunked.
In every case where I’ve come across hashing, I’ve tried to have conversations with the vendor. There is no argument about whether tokenization systems function; clearly they do. The issue is that the vendors have absolutely no argument for what value the hash or salted hash provides for the purpose of tokenization that a foreign key relationship doesn’t also provide. Foreign key relationships have another value too: they have absolutely no mathematical relationship with the cardholder data.
(Note: I have chosen not to explain the inner workings of foreign-key-based database relationships. Foreign key here is used in that context. Google it if you need to.)
Allowing hashing to move forward at large scale is a drastic reduction in implemented security and represents a huge win *for attackers*. It drastically reduces the security puzzle they have to solve: compare a work factor of 2^128, 2^168, or 2^256 (for strong encryption) to a work factor of roughly 2^33.
For those of you who haven’t thought about it lately: how long did it take your old Pentium 133 to break a 40-bit WEP key? Have you considered how much faster the PC on your desk 10 years later is, and how long it takes now? Answer: practically speaking, some wireless attack programs out there now can (and do) break the key and start using it faster than a legitimate user or administrator could type it in.
A rainbow table that can solve for MD5 or even SHA can be constructed in a matter of hours if you reduce the problem correctly, as Mark Bower suggested first.
Adding salt should make it take more time, but an attacker can deduce within a matter of only a few lookups whether yours is salted or not. They might move on if it is, but then again they might just find your salt value and use it.
Effort vs. reward says those who do not salt are at higher risk than those who do.
Comfort-wise, ponder this: an attacker was able to get all the way to the data while going undetected. What are the chances they will just assume it is not salted? (Really good.) What are the chances that they won’t get to the salt value? (Answer that for yourselves, merchants, as it always depends. Invest in an HSM, or use PEM encoding with strong pass-phrases guarded by split-knowledge dual control, and you may have a shot at keeping the secret safe; then again, choose too short or simple a phrase and see how long it takes a simple PC to break it.)
Ultimately security always must be a balancing of risk vs business case. Just make sure it is not a trade-off between compliance and security. It’s not about what you CAN do– it is a matter of what you NEED to do. Make sure you’re not cheating off the kid that flunked.
Sarcastically speaking, I wonder how the evildoers of the world keep getting their hands on the money needed to do evil.
If more experience-related information were out there about people who have suffered through the painful process of having their credit cards breached, maybe more people would actually listen to the security professionals who have chosen to wear the white hat.
I would put the skills of many of the other professionals I’ve known in my career up against the best of them wearing a black hat. The problem I’ve seen time and time again is that it is not the guys wearing the hat on the “good guy” side who are calling the shots.
Businesses need to trust their best security professionals, not the ones who like to compromise for compromise’s sake. Ask intelligent questions, like which option is the lower risk.
Weed out the guys and gals that have no business in a profession where the attackers are dead serious.
Allow those of us wearing white hats who know our stuff (and I mean KNOW our stuff) to do our jobs.
Support us, stop undermining us, and get out of the way, for this is a war of good vs. evil, and your society of laws and other things only applies to the bad guys who got caught.
Anything less is stupid, and as history has shown, is punished accordingly.
April 29th, 2010 at 5:00 pm
There is a lot of confusion about applied cryptography in this thread. Here are some items that should help a reader understand:
“Hashes” really are one way. There is no way to “reverse” the hash back to plain text. However, attackers can guess the plain text, hash the plain text, then compare the hash to the one in question to see if it matches. They can keep doing this over and over to “brute force” guess.
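A minimal sketch of that guess-hash-compare loop, assuming an unsalted SHA-256 hash and a known BIN; the PAN and digit counts are illustrative, and the search space is shrunk so the demo finishes quickly:

```python
import hashlib

# Hash an attacker might find in a breached table (unsalted SHA-256 of a PAN).
target = hashlib.sha256(b"4111110000123456").hexdigest()

for n in range(10 ** 7):                 # demo-sized slice of the account space
    guess = "411111" + str(n).zfill(10)  # known BIN + guessed remaining digits
    if hashlib.sha256(guess.encode()).hexdigest() == target:
        print("recovered:", guess)       # a real attacker would also skip non-Luhn guesses
        break
```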
“Rainbow tables” are precalculated hash values for common hashing algorithms (e.g., MD5 and SHA-1) over a set of plaintexts, saved in a database. When there is a limited number of plaintext values, as is the case with valid credit card numbers (aka PANs), such a table can save an attacker time. It’s a lot of work to initially create the database, but you only do it once and use the database as a huge time-saver from then on. Rainbow tables are already popular for breaking passwords built from words and slight deviations from word dictionaries (e.g., password, p@ssw0rd, password1).
To defend against rainbow tables, the crypto implementer should add random “salt” data to the plain text before hashing, then store that plain text salt with the resulting hash. When you need to compare against the stored hash, you programmatically read the plain text salt, combine with the input plain text, hash, then compare to the saved hash. Sure an attacker that compromises the stored data would have access to the salt too, but they would still have to go through the trouble of calculating the hash value for every possible plain text + salt. The whole benefit of a rainbow table is out the window. Additionally, two identical plain texts will result in different hashes due to having different salt data applied which protects against frequency analysis.
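A minimal sketch of that salt-then-hash pattern; SHA-256 and the 16-byte salt size are illustrative choices:

```python
import hashlib
import hmac
import os

def hash_with_salt(plain: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)                              # fresh random salt per record
    return salt, hashlib.sha256(salt + plain.encode()).digest()

def verify(plain: str, salt: bytes, digest: bytes) -> bool:
    # Recombine the stored salt with the input, hash, and compare.
    candidate = hashlib.sha256(salt + plain.encode()).digest()
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_with_salt("4111111111111111")
print(verify("4111111111111111", salt, digest))   # True
# Two identical plaintexts now produce different digests, so precomputed
# rainbow tables and frequency analysis both fail.
```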
If you’re dealing with limited key space such as credit card PANs, brute forcing is still a valid concern even without salt. The common way to defeat brute force guessing is to make it computationally expensive. “Iterations” involve hashing the plain text, then repeatedly hashing the hash. 2,000 iterations is a default used by many implementations out there. Attackers trying to brute force are slowed down to 1/2,000th their original speed.
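Rather than hand-rolling that loop, the standard library already packages salt plus iterations; a minimal sketch using PBKDF2, with the 2,000 count mirroring the default mentioned above:

```python
import hashlib
import os

salt = os.urandom(16)
# 2,000 HMAC-SHA256 iterations: one cheap call for the defender,
# but a brute-forcer now pays that cost on every single guess.
digest = hashlib.pbkdf2_hmac("sha256", b"4111111111111111", salt, 2000)
print(digest.hex())
```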
It is a common myth that MD5 is broken. MD2 and newer, and all of the SHA family, are not broken in the sense that figuring out the plaintext has become easier. “Collisions” naturally occur with any algorithm, since the number of possible hash values is limited by the size of the hash: if the input is even one bit larger than the hash, there is room for two plaintext values that produce the same hash. An attacker ideally has to brute-force guess to find one, and across a full hash keyspace that is computationally infeasible.
The weakness with MD5 and older algorithms is that methods have been found that make it easier than chance to find a collision. The danger is that it’s now possible for attackers to create malicious data that assumes the identity of the data it’s pretending to be; integrity is lost. A while back, someone used a bunch of PS3s for their GPU power to exploit the MD5 weakness and create a working fake Certificate Authority certificate that produced the same MD5 hash as the real one. Another possible attack is that a malicious exe could be created with the same MD5 hash as the whitelisted exe it’s replacing; software that looks at file MD5 hashes to determine whether files are trusted would be fooled.
As you can see, collisions have nothing to do with figuring out the plaintext. Sure, someone could find a collision, but in the case of a PAN it’s going to look nothing like a credit card number, and it would actually be more work to find the collision than to brute-force guess the original plaintext. The same applies to finding a collision for a password.
For the record, WEP was not broken due to an insecure algorithm. It was broken because of an insecure implementation like any other crypto that gets hacked. From Wikipedia, “The way the IV was used also opened WEP to a related key attack. For a 24-bit IV, there is a 50% probability the same IV will repeat after 5000 packets.”
The end result is that for PAN hashing, salt and iteration count are more important than algorithm selection. And if an attacker is in your systems, none of this really matters much, since they’ll be grabbing card data as it floats through RAM.
Personally, I wish people would spend as much time analyzing their security policies and procedures as they spend analyzing interpretation of each technical PCI DSS requirement.