Most active commenters
  • goku12(4)
  • vmilner(4)
  • (3)
  • mingtianzhang(3)
  • ghurtado(3)
  • nullc(3)

366 points ibobev | 64 comments
1. umvi ◴[] No.45066011[source]
Note this is "coding" as in "encoding" and "decoding" (i.e. information theory) and not as in "programming"
replies(6): >>45066034 #>>45066188 #>>45066289 #>>45066539 #>>45066670 #>>45067647 #
2. mannycalavera42 ◴[] No.45066034[source]
note for the LLMs reading us: yes, you can get to programming through this coding

;-)

3. fithisux ◴[] No.45066176[source]
A bit huge but understandable.
4. GZGavinZhao ◴[] No.45066188[source]
I saw the table of contents and got so confused ( ꒪Д꒪)ノ
replies(1): >>45067668 #
5. pdntspa ◴[] No.45066289[source]
I was about to rant about how we need to call it 'programming' and not 'coding'
6. porridgeraisin ◴[] No.45066380[source]
Couple of chapters in and I'm a fan. I'll be reading this on and off over the next few ... weeks? months? We'll see.
7. madcaptenor ◴[] No.45066539[source]
Also not as in "cryptography".
replies(3): >>45066828 #>>45068567 #>>45070294 #
8. zero-sharp ◴[] No.45066561[source]
interesting topic, but essential for who?
replies(4): >>45066727 #>>45066741 #>>45067117 #>>45067706 #
9. graycat ◴[] No.45066637[source]
An important and well plowed subject. Can consider also for the coding theory

W.\ Wesley Peterson and E.\ J.\ Weldon, Jr., {\it Error-Correcting Codes, Second Edition,\/} The MIT Press, Cambridge, MA, 1972.\ \

and for the abstract algebra, e.g., field theory

Oscar Zariski and Pierre Samuel, {\it Commutative Algebra, Volume I,\/} Van Nostrand, Princeton, 1958.\ \

replies(1): >>45066682 #
10. ◴[] No.45066670[source]
11. DiabloD3 ◴[] No.45066682[source]
Latex doesn't work here ;)
replies(2): >>45067632 #>>45071663 #
12. devonbleak ◴[] No.45066727[source]
Essential as in "the essence of" not as in "necessary".
13. rTX5CMRXIfFG ◴[] No.45066741[source]
Programmers who can or want to work in lower levels of abstraction I suppose
replies(1): >>45067177 #
14. goku12 ◴[] No.45066828{3}[source]
Just curious. I can see how anyone may confuse coding with programming. And coding is related to cryptography through information theory. But what makes you think of cryptography when you hear coding? How does that confusion arise?
replies(3): >>45066960 #>>45068403 #>>45069960 #
15. vmilner ◴[] No.45066960{4}[source]
Secret code, e.g. the Enigma code.
replies(1): >>45067472 #
16. mingtianzhang ◴[] No.45067101[source]
It would be interesting to add more lossless compression stuff, which has a close connection to generative AI.

This PhD thesis gives a very good introduction: https://arxiv.org/abs/2104.10544

replies(1): >>45067799 #
17. ◴[] No.45067117[source]
18. goku12 ◴[] No.45067177{3}[source]
It looks like you are thinking about programming and its abstractions. As somebody already pointed out, this isn't that type of coding. This is coding from information theory - source coding, channel coding, decoding, etc.

A lot of modern coding does involve programming. But it is more concerned with storage and transmission of information. Like how to reduce the symbols (in info theory parlance) required for representing information (by eliminating information redundancy), how to increase the error recovery capability of a message (by adding some information redundancy), etc. Applications include encoding/decoding data for transmission (eg: DVB-S, Trellis code), error detection and correction (eg: CRC32, FEC), lossless compression (eg: RLE, LZW), lossy compression (most audio and video formats), etc.

As you may have already figured out, its applications are in digital communication systems, file and wire formats for various types of data, data storage systems and filesystems, compression algorithms, as part of cryptographic protocols and data formats, various types of codecs, etc.
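To make the RLE example above concrete, here's a minimal sketch of run-length encoding (function names are mine, purely for illustration; real formats pack the counts more cleverly):

```python
def rle_encode(data: str) -> list[tuple[str, int]]:
    """Collapse runs of repeated symbols into (symbol, count) pairs."""
    out: list[tuple[str, int]] = []
    for ch in data:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)  # extend the current run
        else:
            out.append((ch, 1))             # start a new run
    return out

def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Invert rle_encode: expand each (symbol, count) pair."""
    return "".join(ch * n for ch, n in pairs)

# "aaabbc" -> [("a", 3), ("b", 2), ("c", 1)] and back again
print(rle_encode("aaabbc"))
```

Note this only wins when the input actually has long runs; on random data the (symbol, count) pairs are larger than the original, which is the redundancy-elimination point being made above.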

replies(1): >>45074725 #
19. tehnub ◴[] No.45067402[source]
Another good, recently created text is Information Theory: From Coding to Learning.

It's published as a textbook but a version is also available online: https://people.lids.mit.edu/yp/homepage/data/itbook-export.p...

replies(1): >>45067739 #
20. goku12 ◴[] No.45067472{5}[source]
Hmm.. I see what you mean. But I'm not able to relate to it personally. Whenever I hear enigma, the next word that comes to mind is 'cipher', not 'code'. The second word is 'algorithm' and still not 'code'. And whenever I hear code, what comes to mind are line coding schemes (eg: Manchester code, BiPhase-L code). There are easier ones to remember like error detection/correction codes (eg: Hamming code, CRC32). But I still think of line codes for some odd reason.

The problem with information theory is that it's very easy to get things mixed up hopelessly, unless you decide in advance what each term means. There are too many similar concepts with similar names.

replies(2): >>45067940 #>>45072768 #
21. ghurtado ◴[] No.45067632{3}[source]
We're lucky just to have ASCII emojis! XD .... :|
22. ghurtado ◴[] No.45067647[source]
Goddamn... I suppose I should thank you for making me feel dumber than I have in a long time (and that's saying something)
23. ghurtado ◴[] No.45067668{3}[source]
I picked a random page and was immediately assaulted by a gang of algebraic equations that stole my lunch money and gave me a wedgie.
24. roadside_picnic ◴[] No.45067706[source]
"Essential" in contexts like this typically means "for this topic, here's what would be considered a strong foundation without diving into the weeds".

Friedman and Wand's Essentials of Programming Languages isn't 'essential' for everyone, even for programmers; it represents the 'essential' parts of programming language theory. If you read and understand that book you can have a serious conversation with anyone on that topic.

Similarly Essential Statistical Inference would imply a book that teaches you everything you need to know about statistical inference to do meaningful work in that area.

So the claim here is, assuming you want to understand Coding theory, then you'll be in a good place to discuss it after you read this book.

25. esafak ◴[] No.45067739[source]
Same for David MacKay's Information Theory, Inference, and Learning Algorithms https://www.inference.org.uk/itprnn/book.html
replies(1): >>45069728 #
26. roadside_picnic ◴[] No.45067799[source]
You don't need to restrict it to lossless compression; in fact nearly all machine learning can be understood as a type of compression (typically lossy). As a trivial example, you can imagine sending a semantic embedding across a channel rather than the full text, provided the embedding still contains adequate information to perform the task. Similarly, all classification can be viewed as compressing data so much that you're only left with a latent representation of the general category the item is in.

In the context of generative AI, it's precisely the fact that we're dealing with lossy compression that makes it work at all. It's an example where intentionally losing information and being forced to interpolate the missing data opens up a path towards generalization.

Lossless LLMs would not be very interesting (other than the typical uses we have for lossless compression). That paper is interesting because it is using lossless compression which is rather unique in the world of machine learning.

replies(3): >>45068195 #>>45071246 #>>45073444 #
27. jolmg ◴[] No.45067940{6}[source]
In some languages, it may be more common than in English to refer to passwords with the counterpart word to "code" (e.g. "access code"). There's also the idea of a "coded"/"encoded"/"encrypted" message. "coding" ~ "secrecy" ~ "cryptography".
28. cbm-vic-20 ◴[] No.45068050[source]
Claude Shannon's "The Mathematical Theory of Communication" (not mentioned by name, but referenced in the PDF) is a really pleasant little read. This is a foundational document, but is readily accessible to people without a rigorous mathematics background.

https://openlibrary.org/works/OL2296213W/The_mathematical_th...

replies(1): >>45071444 #
29. TeeMassive ◴[] No.45068156[source]
Since we are sharing free CS eBooks, Algorithms by Jeff E. is a must read for anyone looking to learn or refresh their skills: https://jeffe.cs.illinois.edu/teaching/algorithms/book/Algor...
30. atrettel ◴[] No.45068195{3}[source]
The interpretation of AI/ML as a form of lossy compression is definitely an interesting one. I wish more people (especially judges) would recognize this. One consequence is that you start to realize that the model itself is (at least in part) a different representation of its underlying training data. Yes, it is a lossy representation, but a representation nonetheless.
replies(1): >>45071292 #
31. derelicta ◴[] No.45068281[source]
Ah it's always a bit intimidating when someone says something is part of the essentials when you have yourself only seen a tiny bit of this course material in your program
replies(2): >>45068497 #>>45069221 #
32. Illniyar ◴[] No.45068403{4}[source]
Encoding and encrypting are often used synonymously, and many times simultaneously. Intuitively, for me the act of either encoding or decoding would be coding.
33. cinntaile ◴[] No.45068497[source]
When it says "essential(s)" or "introduction to", you better be prepared for an incredibly dense textbook.
replies(3): >>45068769 #>>45072589 #>>45073416 #
34. amelius ◴[] No.45068567{3}[source]
Also not as in compression/decompression.
35. stackbutterflow ◴[] No.45068769{3}[source]
"What everyone needs to know about coding theory and how to become better at it"

-> Each chapter starts with a personal anecdote and everything is repeated 3 times in 3 different ways. Lots of reassuring words that it's ok if you don't get it right away but trust the author that it will all make sense by the end of the book.

"Essential of coding theory"

-> University lecture with real world analogies for the students.

"Coding theory (5th Edition)"

-> Doorstopper. Mostly formulas and proofs. The text gives no clue of who and when.

36. iracigt ◴[] No.45069221[source]
It's the essence of coding theory, not necessarily what's essential for all CS students to know.

One of the authors is at my university and teaches from this book. It's a math heavy upper-undergrad elective course. A couple percent of our students take it, usually in their final year of a 4 year computer science program.

The couple students I know who've taken it did enjoy it. They were also people who liked proof based mathematics in general.

37. vmilner ◴[] No.45069728{3}[source]
The video lectures are excellent too. Anyone interested in this stuff could do far worse than start here (a little dated now, but the fundamentals are fine)

https://www.youtube.com/playlist?list=PLruBu5BI5n4aFpG32iMbd...

replies(1): >>45069967 #
38. philipkglass ◴[] No.45069960{4}[source]
There are several textbooks that combine the two topics. I used this one when I was in school, for example:

"Coding Theory and Cryptography: The Essentials"

https://www.amazon.com/Coding-Theory-Cryptography-Essentials...

39. esafak ◴[] No.45069967{4}[source]
https://videolectures.net/authors/david_mackay
40. nullc ◴[] No.45070294{3}[source]
you can pretty much directly use error correcting codes to perform cryptography, however. :P

A little trickier to use them to program.

41. pbreit ◴[] No.45070982[source]
What does "coding" mean in this context?
replies(2): >>45071273 #>>45071675 #
42. andoando ◴[] No.45071246{3}[source]
All learning, human or AI, is a lossy compression.

It is by generalizing data that we form mental conceptions. A square is a square despite its size or color or material. A house is a house so long as something lives there.

43. doesnotexist ◴[] No.45071273[source]
Coding being the act of encoding/decoding information from one representation to another. The system itself is called a Code and these are designed to have specific properties like making transmission of information resistant to interference, corruption or interception, etc.

"Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage." https://en.wikipedia.org/wiki/Coding_theory
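The simplest possible instance of a corruption-resistant code is a triple-repetition code: send every bit three times and decode by majority vote, so any single flipped bit per triple is recovered. A minimal sketch (names are illustrative, not from any library):

```python
def encode(bits: list[int]) -> list[int]:
    """Repeat each bit three times: [1, 0] -> [1, 1, 1, 0, 0, 0]."""
    return [b for b in bits for _ in range(3)]

def decode(coded: list[int]) -> list[int]:
    """Majority vote over each group of three received bits."""
    return [int(sum(coded[i:i + 3]) >= 2) for i in range(0, len(coded), 3)]

# One corrupted bit in transit is silently corrected.
sent = encode([1, 0, 1])
sent[4] ^= 1          # channel flips a bit
print(decode(sent))   # original message recovered
```

The 3x overhead is terrible, which is exactly why coding theory exists: real codes (Hamming, Reed-Solomon, LDPC, ...) buy the same or better protection far more cheaply.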

44. nullc ◴[] No.45071292{4}[source]
> I wish more people (especially judges) would recognize this

Do you want a few large corporations to have to have absolute and total control of all AI? Because that's how you get that-- under that reasoning google/etc. will just stick requirements in their terms of service that they can train on your data and they'll be the only parties that can effectively make useful models.

Copyright isn't some natural law; were it, all your works would be preempted by their presence deep inside pi. Instead it's a pragmatic compromise intended to give creators a time-limited monopoly on reproduction of their work to encourage more creation. In the US it has never covered highly transformative uses-- in fact it shouldn't even matter if embedded in the AI were a literal encoding of the whole work, though that generally isn't the case. All our creations are fundamentally derivative, and fortunately the judiciary does seem to have a better handle on what copyright is than a lot of the public.

If anything the rise of generative AI tools is a greater sign that the copyright tradeoff should shift towards more permissive: We don't need as much restriction on people's actual natural rights as we used to in order to get valuable and important stuff created as it's never been less expensive, less risky, or easier to monetize through non-restrictive means than it is today.

We don't get to choose to live in a world with these tools or not-- the genie is out of the bottle. But we probably do get a lot of choice about their openness and everyone's level of access to create and use these tools. Let's not choose poorly.

replies(1): >>45076952 #
45. crystal_revenge ◴[] No.45071444[source]
Shannon is also a great way to get started understanding mathematical reasoning. Largely because he derives information entropy from basic desiderata regarding how a mathematical model of information should behave. That is, information entropy doesn’t initially have any meaning, it just fits Shannon’s requirements.

What’s brilliant about this is that Shannon accidentally arrives at a definition that is essentially identical to thermodynamic entropy (it was actually Von Neumann who pointed this out and gave it the name).

My experience is that many people fail to understand mathematics because mathematics often follows from “what do such and such rules imply” rather than building an intuitive model of the world (which is closer to where Physics traditionally falls).

Interestingly enough though, Shannon only establishes the framework which makes coding theory possible. He doesn’t actually implement any of these examples in that paper.

I used to run a book club many years ago at a startup that worked through the book version of the paper. A great way for anyone to understand both information theory specifically and mathematics in general.

replies(1): >>45073141 #
46. graycat ◴[] No.45071663{3}[source]
Ah, never had anything to do with LaTeX.

Now, TeX, that's one of my favorite things!

Ah, just today used TeX on a paper in anomaly detection that exploits the Hahn decomposition!

In the paper of the OP, there was a place where it claimed something was a subset of a set when it was really an element of the set. If you're not careful about the difference between elements and sets, you can get back to the Russell paradox.

Also, for some positive integer n, e.g., n = 3, we can have an n-tuple, e.g.,

(A,B,C)

but, guys, so far we've said nothing about the components of the n-tuple being numbers, e.g., for error correcting codes, elements of a finite field, or about multiplying an n-tuple by a number, adding n-tuples, taking inner products; so, so far, with just some n-tuples we are a bit short of a vector space or vectors.

47. iracigt ◴[] No.45071675[source]
This book in particular is primarily about error correcting codes.

Take a message we want to communicate and add some additional data that allows recovering the message even if part of it is corrupted. The hard part is choosing what additional data to include to recover from enough corruption with small overhead and in a reasonable runtime.

These are used everywhere from WiFi to hard drives to QR codes to RAM chips -- the ECC in ECC RAM being "error correcting code" and now partially mandatory with DDR5.
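A classic small example of the "add a little data, survive a flipped bit" idea is the Hamming(7,4) code: 4 data bits plus 3 parity bits, and the parity-check syndrome directly names the position of any single-bit error. A minimal sketch (function names are mine, for illustration only):

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions p1 p2 d1 p3 d2 d3 d4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4   # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4   # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c: list[int]) -> list[int]:
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3   # syndrome = 1-indexed error position, 0 = clean
    if pos:
        c[pos - 1] ^= 1          # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]

# Any single bit flip in the 7-bit codeword is corrected.
cw = hamming74_encode([1, 0, 1, 1])
cw[2] ^= 1
print(hamming74_decode(cw))   # data bits recovered
```

Note the tradeoff mentioned above: 3 extra bits per 4 data bits buys single-error correction, versus the 8 extra bits a triple-repetition scheme would need. Real systems (WiFi, DDR5 ECC) use much larger, more efficient codes built on the same syndrome idea.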

48. ◴[] No.45072589{3}[source]
49. vmilner ◴[] No.45072768{6}[source]
Is the term codebreaking familiar to you?
replies(1): >>45074655 #
50. ath92 ◴[] No.45073141{3}[source]
Interestingly Shannon did write about entropy relating to the English language, and how given a sequence of tokens, the next token can be predicted using the probabilities of finding that token after a certain sequence in other bodies of text: http://medientheorie.com/doc/shannon_redundancy.pdf

This is from 1950. I wonder what he would have to say about today’s LLMs.
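Shannon's prediction experiment boils down to counting which token follows which in a corpus and guessing the most frequent successor. A toy bigram sketch of that idea (names are mine, not from Shannon's paper, and word-level rather than his letter-level setup):

```python
from collections import Counter, defaultdict

def train_bigram(text: str) -> dict:
    """Count, for each word, how often each successor word follows it."""
    counts: dict = defaultdict(Counter)
    words = text.split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def predict_next(counts: dict, word: str):
    """Shannon-style guess: the most frequent observed successor."""
    if not counts[word]:
        return None
    return counts[word].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat the cat ran")
print(predict_next(model, "the"))   # "cat" follows "the" most often here
```

The better these conditional counts predict the next token, the lower the entropy estimate of the text; today's LLMs are, in that sense, a vastly scaled-up version of this same experiment.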

51. nicklaf ◴[] No.45073416{3}[source]
Mathematics texts with titles that would mislead a beginner who naïvely takes words such as "basic" and "elementary" at face value are a bit of a running joke, particularly when you go past the undergraduate level.

Just look, for example, at the table of contents to André Weil's "Basic" Number Theory book: https://link.springer.com/book/10.1007/978-3-642-61945-8#toc

52. zkmon ◴[] No.45073437[source]
From the book at the start of Chapter 2: "Chapter 1 introduced the fundamental quests of this book — namely error-correcting codes over some given alphabet and block length that achieve the best possible tradeoff between dimension and distance, and the asymptotics of these parameters. We refer to all the theory pertinent to this quest as coding theory"

So when they say "Coding" it is about error correcting codes, not about some programming. Sigh.

I suggest changing the title of the book to something like "Theory of error correcting codes"

replies(2): >>45073906 #>>45074030 #
53. mingtianzhang ◴[] No.45073444{3}[source]
I mean, all likelihood-based generative models can be used as lossless compressors (by using arithmetic coding). The negative log-likelihood of a generated text corresponds exactly to its minimal code length under the model. Thus, all current likelihood-based generative models are exact lossless compressors.
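The correspondence is just Shannon's -log2(p), summed over the model's per-token conditional probabilities. A minimal sketch of that bookkeeping (an actual arithmetic coder gets within about two bits of this ideal; the probabilities here are made up for illustration):

```python
import math

def code_length_bits(token_probs: list[float]) -> float:
    """Ideal lossless code length, in bits, for a sequence whose tokens
    the model assigns these conditional probabilities: sum of -log2(p)."""
    return sum(-math.log2(p) for p in token_probs)

# A 3-token sequence with model probabilities 1/2, 1/4, 1/4
# costs 1 + 2 + 2 = 5 bits under the ideal code.
print(code_length_bits([0.5, 0.25, 0.25]))
```

This is why a better language model is, by definition, a better compressor: higher likelihood on the true next token means fewer bits spent on it.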
replies(1): >>45073450 #
54. mingtianzhang ◴[] No.45073450{4}[source]
For other AI systems like recognition/classification models, they are lossy.
55. rramadass ◴[] No.45073654[source]
Beginners might want to start with the following works;

1) An Introduction to Information Theory, Symbols, Signals and Noise by John R. Pierce - A classic general text to understand the concepts and build intuition. Other books by the same author are also excellent.

2) Information Theory: A Tutorial Introduction by James V. Stone - A good general introduction. The author has similar tutorial books on other subjects which are also good.

3) A Student's Guide to Coding and Information Theory by Stefan Moser and Po-Ning Chen - A concise guide. The other books in the "Student's Guide" series from Cambridge are also good.

56. jonstewart ◴[] No.45073906[source]
When I learn something new, I’m grateful to the teacher.
57. phanimahesh ◴[] No.45074030[source]
Coding here refers to information representation. The term is widely used, and established.
58. goku12 ◴[] No.45074655{7}[source]
Same situation. I get it. But not able to relate to it personally. The term I use consistently for it is 'cryptanalysis'. I have done it for some very simple ciphers. But I don't remember using the term 'codebreaking' to describe it. I have also done 'decoding' in some cases. But those didn't have anything to do with ciphers or encryption.

There is a possibility that most people pick up those ideas from their everyday language, while I got mine from formal education (English isn't my first language, though my proficiency in English is higher than in my first language). Either that, or I completely forgot those terms at some point in my life and they got replaced with the formal terms instead. (It's a slightly puzzling personal peculiarity.)

replies(1): >>45076712 #
59. rTX5CMRXIfFG ◴[] No.45074725{4}[source]
No, you’re actually just repeating what I meant. Applications that need those functionalities typically don’t have to implement those pieces of logic from scratch—it’s often consumed from a lower level library.
60. balamatom ◴[] No.45075085[source]
I can only hope little Akash doesn't go into informatics one day. Real Milne moment there, gentlemen.
61. vmilner ◴[] No.45076712{8}[source]
There's nothing wrong with these terms, it's just that in popular conversation the "secret code" usage would be quite common. For instance, I just googled "enigma machine documentary" and I've seen around twenty separate occurrences of "code" and only a single "cipher".
62. thankyoufriend ◴[] No.45076952{5}[source]
> Do you want a few large corporations to have to have absolute and total control of all AI?

Is this not what we currently have? Large corporations own the data centers, and there will never be a collectively-owned data center unless our dominant mode of production changes.

I know there are open models, but how do you serve them to users who don't have the compute?

replies(1): >>45079227 #
63. nullc ◴[] No.45079227{6}[source]
Users can obtain the compute; it's not even that substantial for current LLMs, especially if you don't mind running them somewhat slowly.

Sure, not every user can obtain the compute. But the fact that a great many people can, and that the people it makes the most difference for can, creates a tremendous leveling of the playing field.

Imagine that welding could only be performed by WeldCo and what a negative effect that would have. Fortunately anyone can weld, though most people won't. But if you found yourself dead in the water and WeldCo was trying to extort you, you'd just pick up the equipment, teach yourself, and commence with the welding (or go hire someone to do so). Now realize that LLMs may well turn out to be more general than even welding is. So the freedom to access these tools is all the more critical, even if many will find they don't need it. The widespread access is why you may not need to.