
181 points ekiauhce | 6 comments
ccleve ◴[] No.42224858[source]
I wonder if it would have been possible to win the challenge legitimately?

If a randomly-generated file happened to contain some redundancy through sheer chance, you could hand-craft a compressor to take advantage of it. This compressor would not work in general for random data, but it could work for this one particular case.

It's a bet worth taking, because the payoff, 50:1 ($5,000 to $100), is pretty good. Play the game 50 times and you might get a file you could compress.

The challenge, then, would be for the person offering the bet to generate a really random file that contained no such redundancy. That might not be easy.
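
A minimal sketch of what that hunt might look like (my illustration, not the commenter's): a hypothetical scan() you would point at the provided file, checking for byte-frequency skew and repeated blocks, two of the places accidental redundancy could show up. Whether a hit translates into an actual win depends on how the challenge counts the decompressor's size.

    import collections, math, sys

    def scan(path):
        data = open(path, "rb").read()

        # Byte-frequency skew, measured as Shannon entropy in bits per byte.
        counts = collections.Counter(data)
        entropy = -sum(c / len(data) * math.log2(c / len(data)) for c in counts.values())
        print(f"entropy: {entropy:.4f} bits/byte (8.0 means no slack to exploit)")

        # Any repeated 8-byte block is a hook a one-off, file-specific trick could try.
        seen = {}
        for i in range(len(data) - 7):
            chunk = data[i:i + 8]
            if chunk in seen:
                print(f"8-byte repeat at offsets {seen[chunk]} and {i}")
                return
            seen[chunk] = i
        print("no 8-byte repeats found")

    if __name__ == "__main__":
        scan(sys.argv[1])  # path to the challenge file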

replies(3): >>42224907 #>>42225027 #>>42225057 #
Retr0id ◴[] No.42225057[source]
Somewhere (discussed on HN) someone devised a "better-than-perfect" compressor. Most inputs get compressed (smaller than input), except for one input that does not. That one input is cryptographically impossible to find - or something along those lines.

Unfortunately I can't find the article I'm describing here, maybe someone else can? It was a long time ago so I might be misrepresenting it slightly.

replies(3): >>42225132 #>>42225240 #>>42232781 #
1. spencerflem ◴[] No.42225132[source]
That's how all compressors work: likely files (e.g. ASCII, obvious patterns, etc.) become smaller and unlikely files become bigger.
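
(A quick counting sketch of why that trade-off is forced, mine rather than from the comment: a lossless compressor is an injective map, and there simply aren't enough shorter strings to go around.)

    # All bit strings of length n vs. all strictly shorter bit strings.
    n = 16
    inputs = 2 ** n
    shorter_outputs = sum(2 ** k for k in range(n))  # lengths 0 .. n-1
    print(inputs, shorter_outputs)  # 65536 vs 65535: at least one input cannot shrink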
replies(3): >>42225455 #>>42225556 #>>42237470 #
2. Dylan16807 ◴[] No.42225455[source]
> likely files (eg. ASCII, obvious patterns, etc) become smaller

Likely files for a real human workload are like that, but if "most inputs" is talking about the set of all possible files, then that's a whole different ball game and "most inputs" will compress very badly.

> unlikely files become bigger

Yes, but when compressors can't find useful patterns they generally only increase size by a small fraction of a percent. There aren't files that get significantly bigger.
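
A hedged zlib illustration of that overhead (exact numbers vary by library version):

    import os, zlib

    random_data = os.urandom(1_000_000)        # effectively incompressible
    compressed = zlib.compress(random_data, 9)
    print(len(compressed) - len(random_data))  # growth is just the framing
    # DEFLATE falls back to "stored" blocks here, so the expansion is only the
    # per-block headers plus the zlib wrapper -- a tiny fraction of a percent.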

replies(1): >>42226654 #
3. PaulHoule ◴[] No.42225556[source]
In some cases it can be certain: ASCII encoded in the usual 8 bits per character has fat to trim even if it is random within that space.
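
For instance (my sketch, not PaulHoule's): uniformly random printable ASCII carries only about log2(95) ≈ 6.6 bits of information per 8-bit byte, so even a general-purpose compressor trims it.

    import math, random, zlib

    text = bytes(random.choice(range(32, 127)) for _ in range(1_000_000))
    print(math.log2(95))                            # ~6.57 bits per 8-bit byte
    print(len(zlib.compress(text, 9)) / len(text))  # ratio comfortably below 1.0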
4. ◴[] No.42226654[source]
5. Retr0id ◴[] No.42237470[source]
Right, but the point was, the case where it became bigger was ~impossible to find.
replies(1): >>42249551 #
6. spencerflem ◴[] No.42249551[source]
Yeah, good point, I kinda glossed over that part of the original post. I don't think that's possible, fwiw.

IMO, the fun part of compression algorithms is that the set of files that become smaller is necessarily narrow while the set of files that become bigger is huge, so _most_ files don't compress well! The trick is to get the set of files that get smaller to be just the useful files and nothing else.
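
A toy version of that design tradeoff (my sketch, not from the thread): a compressor whose "useful" set is just long single-byte runs; every other file grows by exactly one flag byte.

    def compress(data: bytes) -> bytes:
        # Only one shape of input shrinks: a single byte repeated throughout.
        if data and data.count(data[0]) == len(data) and len(data) < 2 ** 32:
            return b"\x01" + data[:1] + len(data).to_bytes(4, "big")
        return b"\x00" + data  # every other file pays a one-byte flag

    def decompress(blob: bytes) -> bytes:
        if blob[:1] == b"\x01":
            return blob[1:2] * int.from_bytes(blob[2:6], "big")
        return blob[1:]

    assert decompress(compress(b"a" * 1000)) == b"a" * 1000            # 1000 -> 6 bytes
    assert decompress(compress(bytes([7, 8, 9]))) == bytes([7, 8, 9])  # 3 -> 4 bytes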