←back to thread

1901 points l2silver | 1 comments | | HN request time: 0.21s | source

Maybe you've created your own AR program for wearables that shows the definition of a word when you highlight it IRL, or you've built a personal calendar app for your family to display on a monitor in the kitchen. Whatever it is, I'd love to hear it.
Show context
vintermann ◴[] No.35738839[source]
Not something terribly impressive or useful, but I wrote an image-scrambling (anything-scrambling, really) program which is quite unique.

I was fascinated by the story of David A. Scott, who was obsessed with "bijective compression". It means compression programs where all files are valid archives, and moreover no two archives decompress to the same file. So no magic number file signatures, no checksums, no redundancy whatsoever. Scott felt that compression algorithms that didn't have this property were wasteful, and of course, in a narrow technical sense he was right. There are of course a number of practical reasons why we tolerate a little redundancy.

But he wouldn't let practicality stop him. He made bijective versions of many common compression algorithms. He made a bijective Huffman encoder (one where you'll never get "unexpected end of file"), a bijective arithmetic encoder, and even a bijective LZ variant. But most impressive of all, he made a bijective BWT version.

The Burrows-Wheeler transform is fascinating on its own, and it's almost bijective. It sorts letters in a text by their context, so that letters with similar context appear close to each other. In a strange vaguely DFT-like way, it switches long-distance and short-distance patterns around. The result is, in a typical text, long runs of the same letter, which can be easily compressed.

But the traditional BWT technically works only up to rotation. You get a rotation of the original string back when reversing it, but you don't know if it's the right rotation. You need to store a tiny piece of extra information, either the index of the rotation, or a single sentinel character known to be the last (or first) letter in the original string. Getting rid of that last piece of information seemed impossible, but Scott figured out a way to do it!

The result is that we have a truly bijective version of the BWT transform. Now I'm no mathematician, but surely that is beautiful? It's a true permutation now, that still does the weird low-order higher-order swapping thing, that you could surely analyse with many algebraic approaches that wouldn't work for the original.

Anyway, what I did was implement this transformation on the lines or pixels of an image. So you get an effect similar to the "pixel sort" effect that glitch artists were into for a while, but it's reversible. I guess it's not really useful for anything other than making glitch art, but it's at least a program that does something pretty unique, and which only a very specific kind of weirdo would have the skills and inclination to write (namely me).

replies(1): >>35740231 #
jacknews ◴[] No.35740231[source]
I agree the BWT is genius.

"Glitch art"

It sounds like the real art is the algorithm and code. Would love to see it.

replies(1): >>35744105 #
1. vintermann ◴[] No.35744105[source]
A very basic implementation of the algorithm I have here:

https://gist.github.com/HaraldKorneliussen/2bf20ca4f4f28c1aa...

I implemented a slightly more efficient version of it that uses a prefix doubling strategy to do the string sorting step, as well as some glue code to make it work on lines and pixels, but that code is too messy to share for now.