;-)
W.\ Wesley Peterson and E.\ J.\ Weldon, Jr., {\it Error-Correcting Codes, Second Edition,\/} The MIT Press, Cambridge, MA, 1972.\ \
and for the abstract algebra, e.g., field theory
Oscar Zariski and Pierre Samuel, {\it Commutative Algebra, Volume I,\/} Van Nostrand, Princeton, 1958.\ \
This PhD thesis gives a very good introduction: https://arxiv.org/abs/2104.10544
A lot of modern coding theory does involve programming, but it is more concerned with the storage and transmission of information: how to reduce the symbols (in info-theory parlance) required to represent information (by eliminating redundancy), how to increase a message's error-recovery capability (by adding some redundancy), etc. Applications include transmission encoding/decoding data (e.g., DVB-S, trellis codes), error detection and correction (e.g., CRC32, FEC), lossless compression (e.g., RLE, LZW), lossy compression (most audio and video formats), etc.
As you may have already figured out, its applications are in digital communication systems, file and wire formats for various types of data, data storage systems and filesystems, compression algorithms, parts of cryptographic protocols and data formats, various types of codecs, etc.
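To make the lossless-compression side concrete, here is a minimal sketch of run-length encoding (RLE), one of the schemes mentioned above (the function names are just for illustration): redundancy in the input, in the form of repeated symbols, is squeezed out by storing (symbol, count) pairs instead.

```python
from itertools import groupby

def rle_encode(s: str) -> list[tuple[str, int]]:
    # Collapse each run of identical symbols into (symbol, run length).
    return [(ch, len(list(run))) for ch, run in groupby(s)]

def rle_decode(pairs: list[tuple[str, int]]) -> str:
    # Expand each (symbol, count) pair back into the original run.
    return "".join(ch * n for ch, n in pairs)

encoded = rle_encode("aaaabbbcca")
assert encoded == [("a", 4), ("b", 3), ("c", 2), ("a", 1)]
assert rle_decode(encoded) == "aaaabbbcca"  # round-trips losslessly
```

Note it only pays off when runs are common (bitmaps, fax data); on text with few repeats it can expand the input.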
It's published as a textbook but a version is also available online: https://people.lids.mit.edu/yp/homepage/data/itbook-export.p...
The problem with information theory is that it's very easy to get things mixed up hopelessly, unless you decide in advance what each term means. There are too many similar concepts with similar names.
Friedman and Wand's Essentials of Programming Languages isn't 'essential' for everyone, even programmers; rather, it presents the 'essential' parts of programming language theory. If you read and understand that book, you can have a serious conversation with anyone on that topic.
Similarly Essential Statistical Inference would imply a book that teaches you everything you need to know about statistical inference to do meaningful work in that area.
So the claim here is: assuming you want to understand coding theory, you'll be in a good place to discuss it after you read this book.
In the context of generative AI, it's precisely because we're dealing with lossy compression that it works at all. It's an example where intentionally losing information, and being forced to interpolate the missing data, opens up a path toward generalization.
Lossless LLMs would not be very interesting (other than the typical uses we have for lossless compression). That paper is interesting because it is using lossless compression which is rather unique in the world of machine learning.
https://openlibrary.org/works/OL2296213W/The_mathematical_th...
-> Each chapter starts with a personal anecdote and everything is repeated 3 times in 3 different ways. Lots of reassuring words that it's ok if you don't get it right away but trust the author that it will all make sense by the end of the book.
"Essential of coding theory"
-> University lecture with real world analogies for the students.
"Coding theory (5th Edition)"
-> Doorstopper. Mostly formulas and proofs. The text gives no clue of who and when.
One of the authors is at my university and teaches from this book. It's a math heavy upper-undergrad elective course. A couple percent of our students take it, usually in their final year of a 4 year computer science program.
The couple students I know who've taken it did enjoy it. They were also people who liked proof based mathematics in general.
https://www.youtube.com/playlist?list=PLruBu5BI5n4aFpG32iMbd...
"Coding Theory and Cryptography: The Essentials"
https://www.amazon.com/Coding-Theory-Cryptography-Essentials...
"Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage." https://en.wikipedia.org/wiki/Coding_theory
Do you want a few large corporations to have absolute and total control of all AI? Because that's how you get that-- under that reasoning google/etc. will just stick requirements in their terms of service that they can train on your data, and they'll be the only parties that can effectively make useful models.
Copyright isn't some natural law; were it, all your works would be preempted by their presence deep inside pi. Instead it's a pragmatic compromise intended to give creators a time-limited monopoly on reproduction of their work to encourage more creation. In the US it has never covered highly transformative uses-- in fact it shouldn't even matter if embedded in the AI were a literal encoding of the whole work, though that generally isn't the case. All our creations are fundamentally derivative, and fortunately the judiciary does seem to have a better handle on what copyright is than a lot of the public.
If anything the rise of generative AI tools is a greater sign that the copyright tradeoff should shift towards more permissive: We don't need as much restriction on people's actual natural rights as we used to in order to get valuable and important stuff created as it's never been less expensive, less risky, or easier to monetize through non-restrictive means than it is today.
We don't get to choose to live in a world with these tools or not-- the genie is out of the bottle. But we probably do get a lot of choice about their openness and everyone's level of access to create and use these tools. Let's not choose poorly.
What’s brilliant about this is that Shannon accidentally arrives at a definition that is essentially identical to thermodynamic entropy (it was actually Von Neumann who pointed this out and gave it the name).
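As a hedged illustration of that definition (a toy sketch, not anything from Shannon's paper itself): Shannon's entropy of a source with symbol probabilities p_i is H = -sum p_i log2 p_i, measured in bits, and it is this formula whose resemblance to thermodynamic entropy von Neumann pointed out.

```python
from math import log2

def shannon_entropy(probs):
    # H = -sum p * log2(p), in bits; p = 0 terms contribute 0 by convention.
    return -sum(p * log2(p) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of information per flip...
assert shannon_entropy([0.5, 0.5]) == 1.0
# ...while a biased coin carries less, since its outcome is more predictable.
assert shannon_entropy([0.9, 0.1]) < 1.0
```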
My experience is that many people fail to understand mathematics because mathematics often follows from “what do such and such rules imply” rather than building an intuitive model of the world (which is closer to where Physics traditionally falls).
Interestingly enough though, Shannon only establishes the framework which makes coding theory possible. He doesn’t actually implement any of these examples in that paper.
I used to run a book club many years ago at a startup that worked through the book version of the paper. A great way for anyone to understand both information theory specifically and mathematics in general.
Now, TeX, that's one of my favorite things!
Ah, just today used TeX on a paper in anomaly detection that exploits the Hahn decomposition!
In the OP's paper, was there a place where it claimed something was a subset of a set when it was really an element of the set? If you aren't careful about the difference between elements and sets, you can get back to the Russell paradox.
Also, for some positive integer n, e.g., n = 3, we can have an n-tuple, e.g.,
(A,B,C)
but, guys, so far we've said nothing about the components of the n-tuple being numbers, e.g., for error-correcting codes, elements of a finite field, or about multiplying an n-tuple by a number, adding n-tuples, or taking inner products. So, so far, with just some n-tuples, we are a bit short of a vector space or vectors.
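To sketch what's missing (an illustration I'm adding, with hypothetical function names): once the components live in a finite field such as GF(2) = {0, 1}, componentwise addition and inner products make the set of n-tuples a vector space, which is exactly the setting error-correcting codes work in.

```python
def add_gf2(u, v):
    # Vector addition over GF(2): componentwise addition mod 2 (i.e., XOR).
    return tuple((a + b) % 2 for a, b in zip(u, v))

def dot_gf2(u, v):
    # Inner product over GF(2): sum of componentwise products, reduced mod 2.
    return sum(a * b for a, b in zip(u, v)) % 2

u, v = (1, 0, 1), (1, 1, 0)
assert add_gf2(u, v) == (0, 1, 1)
assert dot_gf2(u, v) == 1  # 1*1 + 0*1 + 1*0 = 1 (mod 2)
```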
Take a message we want to communicate and add some additional data that allows recovering the message even if part of it is corrupted. The hard part is choosing what additional data to include to recover from enough corruption with small overhead and in a reasonable runtime.
These are used everywhere from WiFi to hard drives to QR codes to RAM chips -- the ECC in ECC RAM being "error correcting code" and now partially mandatory with DDR5.
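The idea can be sketched with the simplest possible error-correcting code, the 3x repetition code (a toy example, not what WiFi or DDR5 actually use): triple every bit, then decode by majority vote. It recovers from any single flipped bit per block, at the steep cost of 3x overhead; real codes like Hamming or LDPC get far better trade-offs.

```python
def encode(bits):
    # Repeat each bit three times: [1, 0] -> [1, 1, 1, 0, 0, 0].
    return [b for b in bits for _ in range(3)]

def decode(coded):
    # Majority vote over each group of three received bits.
    return [int(sum(coded[i:i + 3]) >= 2) for i in range(0, len(coded), 3)]

msg = [1, 0, 1]
sent = encode(msg)
sent[4] = 1                 # corrupt one bit in transit
assert decode(sent) == msg  # the message is still recovered
```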
This is from 1950. I wonder what he would have to say about today’s LLMs.
Just look, for example, at the table of contents to André Weil's "Basic" Number Theory book: https://link.springer.com/book/10.1007/978-3-642-61945-8#toc
So when they say "Coding" it is about error correcting codes, not about some programming. Sigh.
I suggest changing the title of the book to something like "Theory of error correcting codes"
1) An Introduction to Information Theory, Symbols, Signals and Noise by John R. Pierce - A classic general text to understand the concepts and build intuition. Other books by the same author are also excellent.
2) Information Theory: A Tutorial Introduction by James V. Stone - A good general introduction. The author has similar tutorial books on other subjects which are also good.
3) A Student's Guide to Coding and Information Theory by Stefan Moser and Po-Ning Chen - A concise guide. The other books in the "Student's Guide" series from Cambridge are also good.
There is a possibility that most people pick up those ideas from their everyday language, while I got mine from formal education (English isn't my first language, though my proficiency in English is higher than in my first language). Either that, or I completely forgot those terms at some point in my life and they got replaced with the formal terms instead. (It's a slightly puzzling personal peculiarity.)
Is this not what we currently have? Large corporations own the data centers, and there will never be a collectively-owned data center unless our dominant mode of production changes.
I know there are open models, but how do you serve them to users who don't have the compute?
Sure, not every user can obtain the compute. But the fact that a great many people can, and that the people it makes the most difference for can, tremendously levels the playing field.
Imagine that welding could only be performed by WeldCo and what a negative effect that would have. Fortunately anyone can weld, though most people won't. But if you found yourself dead in the water and WeldCo was trying to extort you, you'd just pick up the equipment, teach yourself, and commence with the welding (or go hire someone to do so). Now realize that LLMs may well turn out to be even more general-purpose than welding. So the freedom to access these tools is all the more critical, even if many will find they don't need it. The widespread access is why you may not need it.