Most active commenters
  • bironran(9)
  • ninetyninenine(6)
  • fuzztester(5)

←back to thread

-2000 Lines of code (2004)

(www.folklore.org)
542 points xeonmc | 56 comments | | HN request time: 1.49s | source | bottom
1. bironran ◴[] No.44382555[source]
One of my best commits was removing about 60K lines of code, a whole "server" (it was early 2000's) with that had to hold all of its state in memory and replacing them with about 5k of logic that was lightweight enough to piggyback into another service and had no in-memory state at all. That was pure a algorithmic win - figuring out that a specific guided subgraph isomorphism where the target was a tree (directed, non cyclic graph with a single root) was possible by a single walk through the origin (general) directed bi-graph while emitting vertices and edges to the output graph (tree) and maintaining only a small in-process peek-able stack of steps taken from the root that can affect the current generation step (not necessarily just parent path).

I still remember the behemoth of a commit that was "-60,000 (or similar) lines of code". Best commit I ever pushed.

Those were fun times. Hadn't done anything algorithmically impressive since.

replies(14): >>44382607 #>>44383577 #>>44383660 #>>44384143 #>>44384528 #>>44384875 #>>44385261 #>>44385550 #>>44385861 #>>44386549 #>>44386714 #>>44386823 #>>44388515 #>>44395012 #
2. ddejohn ◴[] No.44382607[source]
Sounds interesting. Have you written about it in more detail somewhere?
replies(1): >>44390739 #
3. bbkane ◴[] No.44383577[source]
What did the software product do?
replies(1): >>44390742 #
4. bravesoul2 ◴[] No.44383660[source]
Nice when you turn an entire server into a library/executable.
5. pech0rin ◴[] No.44384341[source]
I'm sick and tired of all these AI generated comments. Oh you got the AI to use lower case! Wow it still writes the exact same way.
replies(3): >>44384429 #>>44384553 #>>44384607 #
6. cess11 ◴[] No.44384364[source]
On a medium sized system that isn't young and fresh deleting 60 KLOC is highly unlikely to reflect a "system rethink".

Is this, from elsewhere in the thread, a system rethink, https://github.com/dotnet/runtime/pull/36715/files ?

I've worked on a product that reinvented parts of the standard library in confusing and unexpected ways, meaning that a lot of the code could easily be compacted 10-50 times in many place, i.e. 20-50 lines could be turned into 1-5 or so. I argued for doing this and deleting a lot of the code base, which didn't take hold before me and every other dev left except one. Nine months after that they had deleted half the code base out of necessity, roughly 2 MLOC to 1 MLOC, because most of it wasn't actually used much by the customers and the lone developer just couldn't manage the mess on his own.

I wouldn't call that a system rethink.

7. generalizations ◴[] No.44384429{3}[source]
sounds like the “eigenprompt”
8. ifellover ◴[] No.44384528[source]
I’m a hobby programmer and lucky enough to script a lot of things at work. I consider myself fairly adept at some parts of programming, but comments like these make it so clear to me that I have an absolutely massive universe of unknowns that I’m not sure I have enough of a lifetime left to learn about.
replies(7): >>44384974 #>>44385333 #>>44385697 #>>44385739 #>>44386666 #>>44388151 #>>44390810 #
9. lukan ◴[] No.44384553{3}[source]
Hm. Not convinced. What makes you so sure?

Otherwise just downvote or flag I guess, but this comment of yours just reads as an insult to a person that maybe did not put the most effort into writing their comment, but seems genuine to me at least.

replies(2): >>44384616 #>>44384651 #
10. cb5r ◴[] No.44384607{3}[source]
I advise checking out the users other comments before jumping to conclusions. Doesn't look AI generated to me, rather just an "individual" writing style. Only because it's possible doesn't mean its true. Maybe user can confirm?
11. JdeBP ◴[] No.44384616{4}[source]
The now removed stuff, in the original, talking about a blue whale was somewhat odd.
replies(1): >>44384629 #
12. lukan ◴[] No.44384629{5}[source]
Ok, if there was more and weird stuff, that now got edited out(after being called out?), that would be a different story.
13. ◴[] No.44384651{4}[source]
14. ccppurcell ◴[] No.44384875[source]
Hi I'm a mathematician with a background in graph theory and algorithms. I'm trying to find a job outside academia. Can you elaborate on the kind of work you were doing? Sounds like I could fruitfully apply my skills to something like that. Cheers!
replies(3): >>44387541 #>>44390718 #>>44391739 #
15. PaulRobinson ◴[] No.44384974[source]
Read some good books on data structures and algorithms, and you'll be catching up with this sort of comment in no time. And then realise there will always be a universe of unknowns to you. :-) Good luck, and keep going.
replies(2): >>44385334 #>>44385463 #
16. fuzztester ◴[] No.44385261[source]
>Those were fun times. Hadn't done anything algorithmically impressive since.

the select-a-bunch-of-code-and-then-zap-it-with-the-Del-key is the best hardware algorithm.

17. dev0p ◴[] No.44385333[source]
I've been coding for a living for 10 years and that comment threw me for a loop as well. Gotta get to studying some graph theory I guess?
18. fuzztester ◴[] No.44385334{3}[source]
zen comment :)

uncatchable, so I won't even try.

replies(2): >>44386177 #>>44388478 #
19. sensanaty ◴[] No.44385550[source]
I guess you're the reason we get asked all those "Invert a binary tree" type questions these days!

Jokes aside, could I get a layman's explanation of the graph theory stuff here? Sounds pretty cool but the terminology escapes me

20. neilv ◴[] No.44385697[source]
You could've figured out this one with basic familiarity with how graphs are represented, constructed, and navigated, and just working through it.

One way to often arrive at it is to just draw some graphs, on paper/whiteboard, and manually step through examples, pointing with your finger/pen, drawing changes, and sometimes drawing a table. You'll get a better idea of what has to happen, and what the opportunities are.

This sounds "Then draw the rest of the owl," but it can work, once you get immersed.

Then code it up. And when you spot a clever opportunity, and find the right language to document your solution, it can sound like a brilliant insight that you could just pull out of the air, because you are so knowledgeable and smart in general. When you actually had to work through that specific problem, to the point you understood it, like Feynman would want you to.

I think Feynman would tell us to work through problems. And that Feynman would really f-ing hate Leetcode performance art interviews (like he was dismayed when he found students who'd rote-memorize the things to say). Don't let Leetcode asshattery make you think you're "not good at" algorithms.

replies(2): >>44390614 #>>44403724 #
21. amake ◴[] No.44385739[source]
(More than?) half of the difficulty comes from the vocabulary. It’s very much a shibboleth—learn to talk the talk and people will assume you are a genius who walks the walk.
replies(1): >>44390764 #
22. chamomeal ◴[] No.44385861[source]
I would love a little more context on this, cause it sounds super interesting and I also have zero clue what you’re talking about. But translating a stateful program into a stateless one sounds like absolute magic that I would love to know about
replies(1): >>44386758 #
23. HenryBemis ◴[] No.44386177{4}[source]
do try (so you get the joy of 'small' wins), also do know that it's untouchable (so you don't despair when you don't master quantum mechanics in one lifetime)

:)

replies(1): >>44400721 #
24. ninetyninenine ◴[] No.44386549[source]
I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.

It’s because every task was doing a database call but they had a whole repo and aws lambdas for running it. Stupidest thing I’ve ever seen.

replies(1): >>44393913 #
25. Cthulhu_ ◴[] No.44386666[source]
I want to believe a lot of these algorithms will "come to you" if you're ever in a similar situation; only later will you learn that they have a name, or there's books written about it, etc.

But a lot is opportunity. Like, I had the opportunity to work on an old PHP backend, 500ms - 1 second response times (thanks in part to it writing everything to a giant XML string which was then parsed and converted to a JSON blob before being sent back over the line). Simply rewriting it in naive / best practices Go changed response times to 10 ms. In hindsight the project was far too big to rewrite on my own and I should have spent six months to a year trying to optimize and refactor it, but, hindsight.

replies(2): >>44388521 #>>44429144 #
26. ninetyninenine ◴[] No.44386714[source]
The target being a tree is irrelevant right? It’s the “guided” part that makes a single walk through possible?

You are starting at a specific node in the graph and saying that if there’s an isomorphism the target tree root node must be equivalent to that specific starting node in the original graph.

You just walk through the original graph following the pattern of the target tree and if something doesn’t match it’s false otherwise true? Am I mistaken here? Again the target being a tree is a bit irrelevant. This will work for any subgraph as long as as you are also given starting point nodes for both the target and the original graph?

replies(1): >>44390749 #
27. ninetyninenine ◴[] No.44386758[source]
He has two graphs. He wants to determine if one graph is a subset of another graph.

The graph that is to be determined as a subset is a tree. From there he says it can be done in an algorithm that only traverses every node at most one time.

I’m assuming he’s also given a starting node in the original graph and the algorithm just traverses both graphs at the same time starting from the given start node in the original graph and the root in the tree to see if they match? Standard DFS or BFS works here.

I may be mistaken. Because I don’t see any other way to do it in one walk through unless you are given a starting node in the original graph but I could be mistaken.

To your other point, The algorithm inherently has to also be statefull. All traversal algorithms for graphs have to have long term state. Simply because if your at a node in a graph and it has like 40 paths to other places you can literally only go down one path at a time and you have to statefully remember that node has another 39 paths that you have to come back to later.

replies(1): >>44390701 #
28. ninetyninenine ◴[] No.44386823[source]
Am I mistaken? Is what you say even possible?

Given two graphs one is a tree you cannot determine if the tree is a subgraph of the other graph in one walk through?

It’s only possible if you’re given additional information? Like a starting node to search from? I’m genuinely confused?

replies(2): >>44388505 #>>44390732 #
29. hershey890 ◴[] No.44387541[source]
Look into quantitative analyst roles at finance firms if you’re that smart.

There’s also a role called being an algorithms engineer in standard tech companies (typically for lower level work like networking, embedded systems, graphics, or embedded systems) but the lack of an engineering background may hamstring you there. Engineers working in crypto also use a fair bit of algorithms knowledge.

I do low level work at a top company, and you only use algorithms knowledge on the job a couple of times a year at best.

30. weaksauce ◴[] No.44388151[source]
it’s just graph theory nomenclature. if you study an intro to graph algorithms it would get you most of the way there.
31. rangerelf ◴[] No.44388478{4}[source]
The more you know, the more you know you don't know.
replies(1): >>44453092 #
32. jcynix ◴[] No.44388505[source]
Take a look at Carl Hewitt's Same-Fringe solution, which flattens structures concurrently and compares the final (aka leave) nodes:

http://www.nsl.com/papers/samefringe.htm

If you flatten both of your trees/graphs and regard the output as strings of nodes, you reduce your task to a substring search.

Now if you want to verify if the structures and not just the leave nodes are identical, you might be able to encode structure information into you strings.

replies(1): >>44388735 #
33. antihero ◴[] No.44388515[source]
I'm sure with impending tide of slop-code, we'll have many more things to delete in our lifetimes.
34. geon ◴[] No.44388521{3}[source]
Yes. I invented the Trie data structure when I was 19. It was very exciting finding out it had a name, and it was indeed considered a good fit for my use case.
replies(1): >>44389189 #
35. ninetyninenine ◴[] No.44388735{3}[source]
Thanks. Good solution.

I was thinking in terms of finding all subgraph isomorphisms. But this definitely is O(N) if all you need is one solution.

But then I thought about it even further and this reduces to sliding window problem. In this case you still need to travel to each node in the window to see if there’s a match.

So it cannot be that you traverse each node once. Not if you want to find all possible subgraph isomorphisms.

Imagine a string that is a fractal of substrings:

     rrrrrrrrrrrrrrrrrrrrrrrrrrrr
And the other one:

     rrrrrrr
Right? The sliding window for rrrrrrr will be 7 in length and you need to traverse that entire window every time you move it. So by that fact alone every node is traversed at least 7 times.
36. tired-chimp ◴[] No.44389189{4}[source]
Thats so funny, I had the exact same experience. And when I was 16 I "invented" csv's because I was too lazy to setup SQL for my discord bot. I like to think I've gotten better at searching for the correct solution to things rather than just jumping in with my best guess.
replies(1): >>44389734 #
37. bhaak ◴[] No.44389734{5}[source]
LLMs are pretty good at providing names and search terms for very vague prompts.

Although that's also often an invitation for hallucinations so you have to be even more careful than usual.

replies(1): >>44390615 #
38. bironran ◴[] No.44390614{3}[source]
I despise leetcode interviews. These days, with coding LLMs, I see them as even less relevant than they were before.

Yet, you ask someone "how do you build an efficient LFU" and get blank stares (I just LOVE the memcache solution of regions and probabilistic promotion/demotion).

39. mikepurvis ◴[] No.44390615{6}[source]
I was just going to say the same— LLMs are great for giving a name to a described concept, architecture, or phenomenon. And I would argue that hallucinations don't actually much matter for this usage as you're going to turn around and google the name anyway, once you've been told it.
40. bironran ◴[] No.44390701{3}[source]
kindaaaa....

I oversimplified the problem :). Really it was about generating an isomporhic-ish view, based on some user defined rules, of an existing graph, itself generated by a subgraph isomorphism by a query language.

Think a computer network as a graph, with various other configuration items like processes, attached drives, etc (something also known as a CMDB). Query that graph to generate a subgraph out of it. Then use rules to make that subgraph appear as a tree of layers (tree but in each layer you may have additional edges between the vertices) because trees are efficient, non-complex representation on 2d space (i.e. monitors).

However, a child node in that tree isn't necessarily connected directly to the parent node. E.g. one of the rules may be "display the sub network and the attached drives in a single layer", so now the parent node, the gateway, has both network nodes (directly connected to it) and attached drives (indirectly connected to it) as direct descendants.

Extend this to be able to connect through any descendant, direct or indirect (gateway -> network node -> disk -> config file -> config value - but put the config value on the level of the network node and build a link between them to represent the compound relationship).

Walk through the original subgraph while evaluating the rules and build a "trace back" stack to let you understand how to build each layer even in the presence of compound links while performing a single walkthrough instead of nm (original vertices rules for generation).

As I said, that was a lot of fun. I miss those days.

41. bironran ◴[] No.44390718[source]
That was about 20 years ago. Not much translates to today's world. I was in the algorithms team working on a CMDB product. Great tech, terrible marketing.

These days it's very different, mostly large-ish distributed systems.

replies(1): >>44394156 #
42. bironran ◴[] No.44390732[source]
I oversimplified. See https://news.ycombinator.com/item?id=44390701
43. bironran ◴[] No.44390739[source]
See https://news.ycombinator.com/item?id=44390701
44. bironran ◴[] No.44390742[source]
The product was a CMDB, with great tech and terrible marketing.
45. bironran ◴[] No.44390749[source]
I oversimplified. See https://news.ycombinator.com/item?id=44390701
46. bironran ◴[] No.44390764{3}[source]
That! It took me a while to start. My education of graph theory wasn't much better than your average college grad. But I found that fascinating and started reading. I was also very lucky to have had two great mentors - my TL and the product's architect, the former helped me to expend my understanding of the field.
47. meistertigran ◴[] No.44390810[source]
A lot if it is just technical jargon. Which doesn't mean it's bad, one has to have a way to talk about things, but the underlying logic, I've found, is usually graspable for most people.

It's the difference between hearing a lecture from a "bad" professor in Uni and watching a lecture video by Feynman, where he tries to get rid of scientific terms, when explaining things in simple terms to the public.

As long as you get a definition for your terms, things are manageable.

48. fuzztester ◴[] No.44391739[source]
You can try to get a job at an investment bank, if you're okay with heavy slogging, i.e., in terms of hours, which I have heard is the case, although that could be wrong.

I heard from someone who was in that field, that the main qualification for such a job is analytical ability and mathematics knowledge, apart from programming skills, of course.

49. motorest ◴[] No.44393913[source]
> I deleted an entire micro service of task runners and replaced it with a library that uses setTimeout as the primitive driving tasks from our main server.

Your example raises some serious red flags. Did it ever dawned upon you that the reason these background tasks were offloaded to a dedicated service might have been to shed this load from your main server and protect it from handling sudden peaks in demand?

replies(1): >>44397141 #
50. ccppurcell ◴[] No.44394156{3}[source]
Thanks for replying anyway!
51. ◴[] No.44395012[source]
52. ninetyninenine ◴[] No.44397141{3}[source]
There’s no red flag.

These background tasks are all database calls. That means the cpu is just waiting on the database for the majority of the call. Most modern servers can handle 10k of these calls concurrently. And you can do this off of one not so powerful CPU. Even half a cpu can handle this. Of course it depends on the CPU but you get my point.

The database is the bottleneck. The database is the thing that needs to be scaled first before you scale servers. This is the most common web application pattern. One way is providing more compute to the database (sharding is better then increasing cpu power as the bottleneck in the database is usually filesystem access not cpu power). Another way is to have a queue buffer the traffic spikes. Both of these are addressing an issue with the database first.

In most web apps. All the server does is wait for a database. The database is doing compute. You never want the server to do compute as that becomes what we call a “blocking call.” These blocking calls are the ones you offload to an external service as these calls “block” entire cpu threads. database calls do not “block” as the server will context switch to another green thread during database calls.

If you work somewhere where you’re scaling crud servers but not after scaling a central database it usually means you’re in a company that doesn’t get it and overemphasizes on “architecture” over common sense. It’s actually extremely common in lower tier small companies to have not so smart people build things like this that don’t make any sense. They aren’t thinking coherently and I’ve seen tons of people who just miss this common sense notion.

I’ll be Frank. It’s stupid and defies common sense. It’s likely you are doing this? But it’s also extremely commonplace.

53. fuzztester ◴[] No.44400721{5}[source]
oh, i only meant that rhetorically.

no worries.

54. marksbrown ◴[] No.44403724{3}[source]
Funnily enough, a common idea! What would Feynman do? | Fabulous adventures in coding https://share.google/iSEhAqL9NhAstSlRE
55. nomel ◴[] No.44429144{3}[source]
This is my experience, and my favorite way to learn: go in blindly, look things up when you get stuck/run out of ideas. I think it forces a deeper understanding of the topic, and really helps things "stick". I assume it's related to the massive dumps of dopamine that come from the "Eureka!" moments.

I've "invented" all sorts of old patents, all sorts of algorithms, including the PID algorithm. I think it helped form a very practically useful "intuition".

But, I've noticed that some people are passionately against this type of self exploration.

56. fuzztester ◴[] No.44453092{5}[source]
right.

it's "you [don't] know" all the way down.