[Cryptography] Amongst the requirements for digests...
I was just looking through some articles on cryptographic digests setting out the criteria for acceptance, and it occurs to me that as specified they are necessary but not sufficient.

*Pre-image resistance*
Given a hash value *h*, it should be difficult to find any message *m* such that *h* = hash(*m*).

That is necessary, but I would want *h* to effectively disclose no useful information whatsoever about *m*. Not the number of bits, not the parity, nothing.

So I have been designing deranged hash functions that have the correct work factor but are wrong, wrongety wrong. SHA2-Yuk is defined as

    H(x) = (SHA2(x OR 1) OR 1) XOR (x AND 1)

where the boolean operators act only on the last bit of their operands.

The intent here is to ensure that the hash values of adjacent inputs are paired such that H(x) XOR H(x XOR 1) = 1. This would obviously be a highly undesirable property in a digest, but it is not excluded by our traditional definition of second pre-image resistance.
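For concreteness, here is a small Python sketch of SHA2-Yuk as just defined, taking "SHA2" to be SHA-256; the function and variable names are illustrative, not from the post. It checks the pairing property on a pair of adjacent inputs.

    import hashlib

    def sha2_yuk(msg: bytes) -> bytes:
        # the bit operations act only on the last bit of the message
        # and of the digest, per the definition above
        last_bit = msg[-1] & 1                          # x AND 1
        forced = msg[:-1] + bytes([msg[-1] | 1])        # x OR 1
        d = bytearray(hashlib.sha256(forced).digest())  # SHA2(x OR 1)
        d[-1] |= 1                                      # ... OR 1
        d[-1] ^= last_bit                               # ... XOR (x AND 1)
        return bytes(d)

    # adjacent inputs (differing only in the last bit) get paired digests:
    # H(x) XOR H(x XOR 1) = 1
    x = b"some message"
    y = x[:-1] + bytes([x[-1] ^ 1])
    hx, hy = sha2_yuk(x), sha2_yuk(y)
    assert hx[:-1] == hy[:-1] and (hx[-1] ^ hy[-1]) == 1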
> I was just looking through some articles on cryptographic digests setting
> out the criteria for acceptance and it occurs to me that as specified they
> are necessary but not sufficient.
> [...]
> The intent here is to ensure that the hash values of adjacent inputs are
> paired such that H(x) XOR H(x XOR 1) = 1.

There are simpler "deranged" hash functions. For example, you could take:

    H_Leaky(x) = first k bits of x || SHA2(x)

You can modify this to prepend any fixed-length boolean function of x instead.

(Granted, this has "the wrong work factor", as it's only as hard as SHA2, while it "should be" 2^k times harder. But the requirement of maximality - i.e., that an n-bit hash output should have difficulty 2^n for pre-image resistance, 2^(n/2) for collisions - isn't in the explicit requirements anywhere, and in fact we accept cryptographic functions that are "a few bits shy" of the theoretical maximum.)

The conditions of first and second pre-image resistance relate to the *cryptographically strong* part of the notion of a *cryptographically strong hash function*. Historically we don't write down a definition for the "hash function" part - which is where SHA2-Yuk and H_Leaky fail. I'd guess there's a good definition out there, but it's not obvious. People often try "changing any one bit, on average, changes half the bits of the output", but I don't think that works, though a counterexample isn't clear to me.

                                                        -- Jerry
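H_Leaky is equally easy to sketch. A minimal Python version with k fixed at 16 bits; the names and the choice of k are mine, chosen only for the demo:

    import hashlib

    K_BYTES = 2   # k = 16 bits, an arbitrary choice

    def h_leaky(msg: bytes) -> bytes:
        # prepend the first k bits of the input to an ordinary SHA-256 digest;
        # collision/pre-image work factor is inherited from SHA-256, yet the
        # output plainly discloses part of the message
        return msg[:K_BYTES] + hashlib.sha256(msg).digest()

    d = h_leaky(b"attack at dawn")
    print(d[:K_BYTES])   # b'at' -- the "hash" hands back the message prefix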
I have an application that requires that no one can ever produce a hash collision on two data blocks of moderate size.

Seems to me that Blake2b 160 suffices, and Blake2b 256 is overkill.

Someone claimed it's easy to produce collisions and sprayed what sounded to me like random technobabble in support of that claim.

So, for a given number of bits in the hash, how much time and resources are required to produce two blocks of data of moderate size such that the last n bits of the hash are the same for both blocks?

If someone wants to say it is easy, that it does not take all that much time and resources, I would like to hear an explanation that sounds like a description of an algorithm, an algorithm described in sufficient detail that I could test it to produce a collision in the last sixty or so bits of the hash.
On Sun, Dec 30, 2018 at 01:33:29PM +0800, jamesd@echeque.com wrote:
> I have an application that requires that no one can ever produce a hash
> collision on two data blocks of moderate size.
>
> Seems to me that Blake2b 160 suffices, and Blake2b 256 is overkill.

What's your threat model?

If your threat model is truly "no one" and "ever", this is going to be very, very hard.

Ok, so let's relax the threat model to say that (as an approximation to "no one" and "ever") you want resistance to (say) the sort of effort the NSA might mount against a major Russian or Chinese cryptosystem, i.e., several hundred PhD-level researchers, a budget measured in billions of dollars, and a decade of effort. And you don't care about the CPU/memory cost of computing the hash function.

So now you probably want to think very carefully before using the word "overkill" in your system design.

Are there any constraints on the colliding data blocks? I.e., does the attacker have complete freedom to choose *any* data blocks in trying to construct a collision?

Can you combine multiple "good" hash functions in a way that makes the combination much harder to break than any individual one? What sort of breaks are now known against (say) MD5 (a "good" hash function from a while back, which is now rather broken), and how can you combine multiple (today's-state-of-the-art) hash functions in a way which would defeat those breaks? Can you inject nonces in key places?

As a non-crypto-guru, I would think along the lines of

    # random constants have lengths which make the different hash
    # functions' blocks straddle each other
    H1 = hash_fn1(data)
    H2 = hash_fn2(H1, 0xarandomconstant, data)
    H3 = hash_fn3(H1, 0xsomerandomconstant, H2, 0xanotherrandomconstant, data)
    final_hash = H1 || H2 || H3

or even

    # nonces also have varying lengths chosen to make hash function
    # blocks straddle
    H1 = hash_fn1(nonce1, data, nonce2)
    H2 = hash_fn2(nonce3, H1, nonce4, data, nonce5)
    H3 = hash_fn3(nonce6, H1, nonce7, H2, nonce8, data, nonce9)
    final_hash = H1 || H2 || H3
    final_nonce = nonce1 || nonce2 || nonce3 || nonce4 || nonce5 ||
                  nonce6 || nonce7 || nonce8 || nonce9

-- 
"Jonathan Thornburg [remove -animal to reply]" <jthorn@astro.indiana-zebra.edu>
Dept of Astronomy & IUCSS, Indiana University, Bloomington, Indiana, USA
"There was of course no way of knowing whether you were being watched
at any given moment.  How often, or on what system, the Thought Police
plugged in on any individual wire was guesswork.  It was even conceivable
that they watched everybody all the time."  -- George Orwell, "1984"
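For reference, the first construction above is directly runnable. A sketch assuming three unrelated functions from Python's hashlib (SHA-256, SHA3-256, BLAKE2b); the constants are arbitrary placeholders standing in for the "random constants", and simple concatenation stands in for the comma-separated arguments:

    import hashlib

    C1 = bytes.fromhex("9e3779b97f4a7c15")          # arbitrary; lengths chosen
    C2 = bytes.fromhex("6a09e667f3bcc908b2fb1366")  # to straddle block boundaries

    def combined_hash(data: bytes) -> bytes:
        h1 = hashlib.sha256(data).digest()
        h2 = hashlib.sha3_256(h1 + C1 + data).digest()
        h3 = hashlib.blake2b(h1 + C1 + h2 + C2 + data).digest()
        return h1 + h2 + h3   # final_hash = H1 || H2 || H3

An attacker would then need a simultaneous collision in all three chained functions, which is the intent of the layering.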
jamesd@echeque.com wrote on 30/12/18 6:33 PM:
> So, for a given number of bits in the hash, how much time and resources
> are required to produce two blocks of data of moderate size such that
> the last n bits of the hash are the same for both blocks?
[...]
> sufficient I could test it to produce a collision in the last sixty or
> so bits of the hash.

If the "technobabble" tried to say that hashes could be broken in less than brute-force time without using some specific cryptanalysis of the hash algorithm, it is probably nonsense. But if you really do intend to use only the last n bits of the hash, and you really are trying to protect against collisions, then 60 bits may not be good enough.

Brute force lets you find some two blocks with a collision on an n-bit hash by trying on the order of 2^(n/2) hashes on random blocks. The algorithm is to generate blocks, hash each, and save the results in a way that lets you quickly detect when you have encountered the hash before and which data block produced it. The details are left as an exercise. Because of the birthday "paradox", the exponent is n/2 rather than n when you don't care which two blocks collide.

If what you are concerned about is second pre-image resistance, i.e., given a specific block of data, find a different block that hashes to the same value as the first, that would take on the order of 2^n hashes to brute force, but you would not have to use the memory to save all the hashes you compute along the way.

Which of the two you are concerned with depends on your threat model. Collision resistance protects against somebody producing two things that have the same hash and using them in some kind of bait-and-switch scam. But if what you want to use the hash for is to prove that an item you supply has not been tampered with, all you need is second pre-image resistance.

When you say "the last n bits of the hash", that's the n to use in those estimates. Finding a collision in just the last 60 bits would take only on the order of a bit more than a billion hashes and a billion entries in memory. What would that be - a few hours on just one core and tens of GB of RAM? But if you really intended to use the full 160 bits of a 160-bit hash, then you are looking at 2^80, so multiply that by another 2^50, which would be bigger than is feasible.

And if you really meant second pre-image, even using only the last 60 bits would require a brute-force attack to calculate 2^60 hashes, which my really rough Fermi estimate tells me would be more than a thousand years running on a thousand cores.

 Sidney Markowitz sidney@sidney.com
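Sidney's "exercise" is a few lines of Python. A version truncated to the last 32 bits so that it finishes in seconds (~2^16 trials expected); scaling the same loop to 60 bits gives the billion-hash, tens-of-GB figures above. Counter values stand in for "random blocks", which is fine since distinct inputs are all the birthday argument needs:

    import hashlib, itertools

    N_BITS = 32
    seen = {}   # truncated digest -> block that produced it
    for i in itertools.count():
        block = i.to_bytes(8, "big")
        tag = hashlib.blake2b(block, digest_size=20).digest()[-N_BITS // 8:]
        if tag in seen and seen[tag] != block:
            print("collision on last", N_BITS, "bits:",
                  seen[tag].hex(), "vs", block.hex())
            break
        seen[tag] = block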
On 31/12/2018 00:03, Jonathan Thornburg wrote:
> What's your threat model?
> [...]
> Are there any constraints on the colliding data blocks? I.e., does the
> attacker have complete freedom to choose *any* data blocks in trying to
> construct a collision?

He has complete freedom to choose a twenty-byte or so nonce. And "he" might well have the computational resources of Google or the NSA.

The threat is that he constructs two short messages, one of the transaction he wants, with a twenty-byte nonce that he has complete freedom to choose, and one of the form the other guy wants, with another twenty-byte nonce that he has complete freedom to choose, and needs to get them to have the same hash.

> Can you combine multiple "good" hash functions in a way that makes the
> combination much harder to break than any individual one?

You are raising a vast pile of unsolved and unsolvable theoretical issues with no obvious relevance to the question, the question being specifically about Blake2b 160.
On 31/12/2018 00:39, Alexander Kjeldaas wrote:
> 2^80 random blocks should give one collision, but for a given hash 2^160
> blocks are needed to find a collision.

I know that; everyone knows that. It does not seem to have the slightest relevance to the question I asked.

Obviously the NSA can calculate 2^80 blocks without raising a sweat, but they cannot store and sort 2^80 blocks, which on the face of it they would need to do in order to find the collision that they might have generated.
On 31/12/2018 02:23, John Pacific wrote:
> Blake2b160 only has half the collision security of blake2b256. Those are
> 2**80 and 2**128.
>
> I would consider 2**80 within "vulnerable range" of a nation state.

Well, I would consider you full of hot air unless you actually do some resource calculations.

Obviously a nation state can calculate 2^80 hashes easily enough, but in order to find which two of them collide it would need to store and sort 2^80 hashes, which looks to me to be far beyond the resources of any present nation state, or the likely resources of any nation state in the reasonably foreseeable future.
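The arithmetic behind the store-and-sort objection is easy to make explicit (an editorial illustration, not from the thread):

    # Naive birthday attack: store 2^80 digests and look for a repeat.
    # At 20 bytes per Blake2b-160 digest, the table alone is ~24 yottabytes,
    # ignoring all indexing overhead.
    entries = 2**80
    table_bytes = entries * 20
    print(table_bytes / 1e24, "yottabytes")   # ~24.2

Note, though, that memoryless collision searches such as Pollard's rho (raised later in the thread) avoid this table entirely, so the storage argument alone is not conclusive.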
On 31/12/2018 04:51, Ray Dillinger wrote:
> On 12/29/18 9:33 PM, jamesd@echeque.com wrote:
>> I have an application that requires that no one can ever produce a hash
>> collision on two data blocks of moderate size.
>>
>> Seems to me that Blake2b 160 suffices, and Blake2b 256 is overkill.
>> [...]
>> So, for a given number of bits in the hash, how much time and resources
>> are required to produce two blocks of data of moderate size such that
>> the last n bits of the hash are the same for both blocks?
>
> Pretty hard. I don't think you have anything to worry about, although
> I'd probably have reflexively picked 256-bit. Is a post-quantum
> future part of your design requirements? I think quantum gives a
> marginal speedup on Blake, but not by more than a fairly small constant
> factor.
>
> I worry about hashes because the vast majority, including BLAKE and
> variants, are subject to length-extension attacks. That is, if you can
> find a string that matches the hash of any prefix of the message, then
> the remainder of the message can be appended to it, resulting in a
> collision on the whole message. So there is an attack surface at every
> block (at the end of every prefix), not just at the final block.

Thank you. This confirms my expectation that Blake2b 160 will suffice, since all messages will be short, and all messages will have a prefix that defines their schema, and thus implicitly their length, and, as in SHA-256, an affix specifying their actual length. (In Blake2b, a length affix is mixed into every chunk, not just at the end, so that hashing is not position invariant.)
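A minimal sketch of the framing James describes, using hashlib's BLAKE2b with a 20-byte digest. The schema tag and the layout of the length field here are hypothetical placeholders; only the digest_size mechanism is the library's actual API:

    import hashlib

    SCHEMA_TAG = b"\x01TXN"   # hypothetical schema identifier; the schema
                              # implies the message layout and length

    def message_hash(payload: bytes) -> bytes:
        # schema prefix, then the payload, then an explicit length affix
        framed = SCHEMA_TAG + payload + len(payload).to_bytes(4, "big")
        return hashlib.blake2b(framed, digest_size=20).digest()  # 160 bits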
On Sat, Jan 05, 2019 at 10:16:40AM +0800, jamesd@echeque.com wrote:
> Obviously a nation state can calculate 2^80 hashes easily enough, but in
> order to find which two of them collide it would need to store and sort
> 2^80 hashes, which looks to me to be far beyond the resources of any
> present nation state [...]

Would the following method provide a practical-for-the-NSA attack?

https://www.semanticscholar.org/paper/Parallel-hash-collision-search-by-Rho-method-with-Weber-Zhang/a953b65f6feb9dae15f5cb0d9458579836a1199e

    Parallel hash collision search by Rho method with distinguished points
    Brian Weber, Xiaowen Zhang
    Published 2018 in 2018 IEEE Long Island Systems, Applications and...

    In this paper, we realized a memory-efficient general parallel Pollard's
    rho method for collision search on hash functions, introduced by Van
    Oorschot and Wiener in 1996. This utilizes the principles of the birthday
    paradox to greatly increase the probability of finding a collision, while
    using significantly less memory than the classic birthday attack, and
    allowing a larger portion of the subject hash function to be searched
    before running out of memory, by saving only a few select digests called
    distinguished points. Using our implementation, we are able to find an
    average of 50 MD5 half-collisions in the first hour of searching, using a
    distributed-memory high performance computing system called Penzias (one
    of the CUNY HPC systems) on 32 processors. We then extend the technique
    with Cyrillic character replacement to search for meaningful MD5
    half-collisions. Next we analyze and measure how the performance of our
    implementation scales with different processor counts. Finally, we
    experiment with how the rarity of distinguished points affects the rate
    at which collisions are found at varying numbers of processors.

Full paper (paywalled) at https://ieeexplore.ieee.org/document/8378028

-- 
"Jonathan Thornburg [remove -animal to reply]" <jthorn@astro.indiana-zebra.edu>
Dept of Astronomy & IUCSS, Indiana University, Bloomington, Indiana, USA
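The van Oorschot-Wiener idea in that abstract can be sketched compactly. Below is a toy, single-threaded Python version (all names and parameters are illustrative, and a 32-bit truncation keeps the demo fast): trails of the iterated hash are stored only when they hit a "distinguished point" (low byte zero), and when two trails meet at the same distinguished point they are re-walked to locate the colliding pair. Memory use is ~1/256 of the naive table; the real method farms trails out to many parallel workers.

    import hashlib, os

    N_BYTES = 4   # toy 32-bit hash so the demo finishes in seconds

    def f(x: bytes) -> bytes:
        # truncated hash used as the iteration function
        return hashlib.blake2b(x, digest_size=20).digest()[:N_BYTES]

    def is_distinguished(x: bytes) -> bool:
        return x[-1] == 0   # low byte zero: 1 point in 256 is stored

    def trail(start: bytes):
        # iterate f to a distinguished point; return it and the step count
        x, steps = start, 0
        while not is_distinguished(x):
            x, steps = f(x), steps + 1
        return x, steps

    def walk(x: bytes, steps: int) -> bytes:
        for _ in range(steps):
            x = f(x)
        return x

    table = {}   # distinguished point -> (trail start, trail length)
    while True:
        s = os.urandom(N_BYTES)
        dp, n = trail(s)
        if dp in table:
            s2, n2 = table[dp]
            # advance the longer trail so both are equidistant from dp,
            # then step the two trails together until they merge
            a = walk(s, max(0, n - n2))
            b = walk(s2, max(0, n2 - n))
            if a != b:   # a == b means one trail simply ran into the other
                while f(a) != f(b):
                    a, b = f(a), f(b)
                print("collision:", a.hex(), "and", b.hex(), "->", f(a).hex())
                break
        table[dp] = (s, n)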
On 06/01/2019 00:29, Jonathan Thornburg wrote:
> Would the following method provide a practical-for-the-NSA attack?
>
> Parallel hash collision search by Rho method with distinguished points
> Brian Weber, Xiaowen Zhang
> [...]

Looks like it, for the NSA. My rough guess is that 160 bits would fail against custom, massively parallel hardware using this method, and that 256 bits would resist the attack. But that is just my back-of-the-envelope calculation resting on a wild-assed guess.

Does anyone have any good justification that 256 bits is safe?