On Wed, May 18, 2011 at 4:07 PM, Landon Stewart <lstewart@superb.net> wrote:
Let's say you had a file that was 1,000,000,000 characters, consisting of 8,000,000,000 bits. What if, instead of transferring that file through the interwebs, you transmitted a mathematical equation telling a computer on the other end how to *construct* that file? First you'd feed the file into a cruncher of some type to reduce the pattern of 8,000,000,000 bits into an equation somehow. Sure, this would take time, I realize that. The equation would then be transmitted to the other computer, where it would use its mad-math-skillz to *figure out the answer*, which would theoretically be the same pattern of bits. Thus the same file would emerge on the other end.
The real question here is: how long would it take a regular computer to do this kind of math?
The real question is whether this is possible. And the short answer is No, at least not in general.
Now, if your file has patterns that make it compressible, you can make it smaller, but not all files can be compressed this way, at least not in a way that makes them smaller. To understand why, consider a file of one byte, or 8 bits. There are 256 possible files of this size: 00000000, 00000001, 00000010, ..., 11111101, 11111110, 11111111. Each code we send must generate a unique file (or what's the point?), so we need 256 different codes to represent the 256 possible files, but the shortest general way to write 256 different codes is still 8 bits long.

Now, we can use coding schemes and say that the one-bit value 1 represents 11111111 because that file occurs a lot. Then we could use 01 to represent something else, but no other code can start with 1, because the decoder couldn't tell it apart from the file named by 1. Bottom line: for some codes to be shorter than the files they represent, others must be longer. So if files have a lot of repetition you can get a win, but for "random" data, not so much :(
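
If you want to watch both halves of that argument run, here's a rough Python sketch (the one-megabyte sizes and the choice of zlib are just illustrative picks, nothing special about them):

    import os
    import zlib

    # Pigeonhole count: how many distinct codes are strictly shorter
    # than 8 bits?  2^0 + 2^1 + ... + 2^7 = 255, one short of the 256
    # one-byte files, so at least one file can't get a shorter code.
    short_codes = sum(2**k for k in range(8))
    print("codes shorter than 8 bits:", short_codes, "(files to name: 256)")

    # Patterned data compresses; "random" data generally doesn't.
    patterned = b"abcdefgh" * 125000     # 1,000,000 bytes of repetition
    noise = os.urandom(1000000)          # 1,000,000 bytes of noise

    for name, data in (("patterned", patterned), ("random", noise)):
        squeezed = zlib.compress(data, 9)
        print(name + ":", len(data), "->", len(squeezed), "bytes")

On a typical run the patterned megabyte should collapse to a few kilobytes, while the random one should come out slightly *larger* than it went in, since the compressor still pays for headers and block framing on data it can't actually shorten.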