md5 is a hash function, and hash functions are designed to have two properties: ...

ABrandt · on Dec 13, 2010

Okay thank you for the explanation. Let me try and apply my rudimentary knowledge here...

So a hash function is used to encrypt data by translating it with a certain rule set--I've learned about a simple key%b type function before. But with md5 this hashing function isn't the same each time a new code is created? How is the system able to decode it then? _Something_ out there has to know how to translate that back into a readable string right?

And collision is when different strings end up with the same encrypted code (except if you use a hash chain structure). So how is this used in an attack?

Sorry for all the questions. I know I could probably google this but I always learn better through instruction. Thanks!

edanm · on Dec 13, 2010

"_Something_ out there has to know how to translate that back into a readable string right?"

Wrong. That's exactly your misunderstanding - MD5 is not an encryption function, but a hashing function.

The way it works is, given some string, it will output a new, random-looking string of fixed length. Going backwards is computationally infeasible, i.e. given the output of running MD5, you can't work out the input.
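To make that concrete, here's a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` and the sample strings are made up for illustration. It only shows that the same input always gives the same digest, that a tiny change gives an unrelated-looking digest, and that the output length doesn't depend on the input length.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical sample inputs, chosen only for illustration.
same_a = md5_hex("hunter2")
same_b = md5_hex("hunter2")
different = md5_hex("hunter3")

print(same_a == same_b)     # identical input -> identical digest
print(same_a == different)  # one character changed -> different digest
print(len(md5_hex("")), len(md5_hex("a" * 10_000)))  # digest length is fixed
```

Note there is no `un_md5` to call: the library offers nothing that maps a digest back to its input.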

In a nutshell, the way password authentication works is this: when you sign up to a site, a hash of your password is saved. At this point no one, not even the site itself, can tell what your password was.

When you want to log in, you send the password over to the site, they hash it again, and compare the output with the saved hash. If you put in the same password, the hash will come out the same. And it's very, very hard to find a different string, one that isn't your password, that produces the same hash output.
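The sign-up and login steps can be sketched roughly like this. Everything here is hypothetical: the in-memory `stored_hashes` dict stands in for a site's database, and `sign_up` / `log_in` are made-up names. MD5 is used only because it's the function under discussion; real systems typically use a salted, deliberately slow password hash instead.

```python
import hashlib
import hmac

# Hypothetical stand-in for the site's database: username -> stored hash.
# The password itself is never stored.
stored_hashes: dict[str, str] = {}


def hash_password(password: str) -> str:
    """Hash the password (MD5 here only to match the discussion)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def sign_up(username: str, password: str) -> None:
    """Save only the hash of the password at account creation."""
    stored_hashes[username] = hash_password(password)


def log_in(username: str, password: str) -> bool:
    """Re-hash the submitted password and compare it with the saved hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(stored, hash_password(password))


sign_up("alice", "correct horse")
print(log_in("alice", "correct horse"))  # True
print(log_in("alice", "wrong guess"))    # False
```

The site never needs to reverse anything: it only repeats the same hashing step and checks for equality.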

bigiain · on Dec 13, 2010

"The way password authentication works is this: when you sign up to a site, a hash of your password is saved."

That's an assumption that's been proven wrong _way_ too many times...

ajju · on Dec 13, 2010

>How is the system able to decode it then? _Something_ out there has to know how to translate that back into a readable string right?

Wrong. Password hashes are meant to be one-way and are chosen specifically so that getting the plaintext (readable string) back from the hash (gibberish-looking output) is very hard. When you create an account, the plaintext of your password is hashed and the hash is stored. When you subsequently want to log in, the system only needs to apply the exact same hashing steps and check whether they produce a hash identical to the one stored.

Things are done this way specifically so that if a compromise such as the one at Gawker happens, it is harder for the attacker to get people's actual passwords.

This is the primary difference between encryption, where you want to be able to recover the plaintext, and hashing, where you want to make it very hard to recover the plaintext.