On Thu, 19 May 2011 01:01:43 BST, Heath Jones said:
My point here is it IS possible to transfer just a hash and counter value and effectively generate identical data at the remote end.
Nope. Let's use phone numbers as an example. I want to send you the phone number 540-231-6000. The hash function is "number mod 17 plus 5". So 5402316000 mod 17 plus 5 is '7'. Yes, it's a poor hash function, except it has two nice features - it can be worked with pencil and paper or a calculator, and it has similar output distributions to really good hash functions (math geeks would say it's an "onto function", but not a "one-to-one" function). http://www.regentsprep.org/Regents/math/algtrig/ATP5/OntoFunctions.htm Go read that, and get your mind wrapped around onto and one-to-one. Almost all good hashes are onto, and almost none are one-to-one. OK. counter = 0. Hash that, we got 5. increment and hash, we get 6. Increment and hash, we got 7. If we keep incrementing and hashing, we'll also get 7 for 19, 36, 53, 70, and roughly 317,783,289 other numbers before you get to my phone number. Now if I send you 2 and 7, how do you get that phone number back out, and be sure you wanted *that* phone number and not 212-555-3488, which *also* ends up with a hash of 7, so you'd send a counter of 2? Or a number in Karubadam, Tajikistan that starts with +992 3772 but also hashes to 7? The problem is that if the number of input values is longer than the hash output, there *will* be collisions. The hash function above generates 17 numbers from 5 to 22 - if you try to hash an 18th number, it *has* to collide with a number already used. Think a game of musical chairs, which is interesting only because it's an "onto" function (every chair gets a butt mapped to it), but it's not one-to-one (not all chairs have *exactly one* butt aimed at them). (And defining the hash function so that it's one-to-one and every possible input value generates a different output value doesn't work either - because at that point, the only counter that generates the same hash as the number you're trying to send *is that number*. So if 5552316000 generates a hash value of 8834253743, you'll hash 0, 1, 2,3, ... and only get that same hash again when you get to the phone number. Then you send me "5552316000,8834253743" and I hash some 5,552,315,999 other numbers till I reach the phone number.. which you sent me already as the counter value. tl;dr: If the hash function is onto but not one-to-one, you get collisions that you can't resolve. And if the hash function *is* one-to-one, you end up sending a counter that's equal to the data.