1087 points smartmic | 14 comments
anthomtb ◴[] No.44303941[source]
So many gems in here but this one about microservices is my favorite:

grug wonder why big brain take hardest problem, factoring system correctly, and introduce network call too

replies(8): >>44304390 #>>44304916 #>>44305299 #>>44305300 #>>44306811 #>>44306862 #>>44306886 #>>44309515 #
default-kramer ◴[] No.44304916[source]
I'm convinced that some people don't know any other way to break down a system into smaller parts. To these people, if it's not exposed as an API call it's just some opaque blob of code that cannot be understood or reused.
replies(5): >>44304992 #>>44305050 #>>44307611 #>>44308060 #>>44310571 #
1. isoprophlex ◴[] No.44308060[source]
I swear I'm not making this up: a guy at my current client needed to join two CSV files. A one-off thing for some business request. He wrote a REST API in Java, where you get the merged CSV after POSTing your inputs.

I must scream but I'm in a vacuum. Everyone is fine with this.

(Also it takes a few seconds to process a 500 line test file and runs for ten minutes on the real 20k line input.)

replies(4): >>44308237 #>>44308645 #>>44309926 #>>44309942 #
2. withinboredom ◴[] No.44308237[source]
I mean, it would be faster to just import them into an in-memory sqlite database, run a `union all` query and then dump it to a csv...

That's still probably the wrong way to do it, but 10 minutes for a 20k line file? That seems like poor engineering in the most basic sense.
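
A minimal sketch of the in-memory SQLite route described above, using Python's standard `sqlite3` and `csv` modules rather than the `sqlite3` command-line tool. The function name `union_csv` and the table names `a` and `b` are made up for illustration, and the sketch assumes both files share the same header and that every row has the same number of fields.

```python
import csv
import io
import sqlite3


def union_csv(text_a: str, text_b: str) -> str:
    """Stack two CSV texts that share a header, via an in-memory SQLite database."""
    rows_a = list(csv.reader(io.StringIO(text_a)))
    rows_b = list(csv.reader(io.StringIO(text_b)))
    header = rows_a[0]
    if rows_b[0] != header:
        raise ValueError("both files must have the same header")

    # Quote column names so headers with spaces or punctuation stay valid SQL.
    cols = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    marks = ", ".join("?" for _ in header)

    con = sqlite3.connect(":memory:")
    try:
        for table, rows in (("a", rows_a), ("b", rows_b)):
            con.execute(f"CREATE TABLE {table} ({cols})")
            con.executemany(f"INSERT INTO {table} VALUES ({marks})", rows[1:])
        merged = con.execute("SELECT * FROM a UNION ALL SELECT * FROM b").fetchall()
    finally:
        con.close()

    # Dump the combined rows back out as CSV with a single header row.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(merged)
    return out.getvalue()
```

Calling `union_csv(open("file1.csv").read(), open("file2.csv").read())` would return the combined text. For 20k rows this kind of approach should finish in well under a second on ordinary hardware.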

replies(3): >>44308362 #>>44308551 #>>44397638 #
3. strken ◴[] No.44308362[source]
I'd probably think of xsv, go to its GitHub repo, remember it's unmaintained and got replaced by qsv, and then use qsv.
4. isoprophlex ◴[] No.44308551[source]
It's a twenty line bash script. Pipe some shit into sqlite, done.

But the guy 'is known to get the job done' apparently.

replies(1): >>44308836 #
5. cfiggers ◴[] No.44308645[source]
I'm really dumb, genuinely asking the question—when people do such things, where are they generally running the actual code? Would it be in a VM on generally available infra that their company provides...? Or like... On a spare laptop under their desk? I have use cases for similar things (more valid use cases than this one, at least my smooth brain likes to think) but I literally don't know how to deploy it once it's written. I've never been shown or done it before.
replies(1): >>44309179 #
6. bee_rider ◴[] No.44308836{3}[source]
Maybe he’s recognized something brilliant. Management doesn’t know that the program he wrote was just a reimplementation of the Unix “cut” and “paste” commands, so he might as well get rewarded for their ignorance.

And to be fair, if folks didn’t get paid for reinventing basic Unix utilities with extra steps, the economy would probably collapse.

replies(1): >>44309238 #
7. marifjeren ◴[] No.44309179[source]
Typically you run both the client program and the server program on your computer during development. Even though they're running on the same machine they can talk with one another using http as if they were both on the world wide web.

Then you deploy the server program, and then you deploy the client program, to another machine, or machines, where they continue to talk to one another over http, maybe over the public Internet or maybe not.

Deploying can mean any one of umpteen possible things. In general, you (use automations that) copy your programs over to dedicated machines that then run your programs.
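
To make the "client and server on the same machine" part concrete, here is a small sketch using only Python's standard library. The handler, the helper names `start_server` and `call_server`, and the upper-casing behaviour are all invented for illustration; a real service would do something more useful with the request body.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class UpperHandler(BaseHTTPRequestHandler):
    """Toy server program: replies to a POST with the body upper-cased."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).upper()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server() -> HTTPServer:
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), UpperHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_server(port: int, payload: bytes) -> bytes:
    """Toy client program: POSTs the payload over HTTP and returns the reply."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/", data=payload, method="POST"
    )
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.read()
```

During development both halves run on one machine, as here. After deployment only the host and port in the URL change; the client code talks HTTP the same way.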

8. isoprophlex ◴[] No.44309238{4}[source]
Clearly I'm the dumbass in this story, as we're all paid by the hour...
replies(1): >>44309265 #
9. bee_rider ◴[] No.44309265{5}[source]
Clearly! He’s found a magic portal to the good old days when the fruit was all low hanging, and you keep showing up with a ladder.
10. fredrikholm ◴[] No.44309926[source]
The worst part of stories like this is how much potential there is in gaslighting you, the negative person, on just how professional and wonderful this solution is:

  * Information hiding by exposing a closed interface via the API
  * Isolated, scalable, fault tolerant service
  * Iterable, understandable and super agile
You should be a team player isoprophlex, but it's ok, I didn't understand these things either at some point. Here, you can borrow my copy of Clean Code, I suggest you give it a read, I'm sure you'll find it helpful.
11. pbohun ◴[] No.44309942[source]
Was it joining on some columns or just concatenating the files?

I'm going to laugh pretty hard if it could just be done with: cat file1.csv file2.csv > combined.csv

replies(2): >>44310462 #>>44397619 #
12. Xenoamorphous ◴[] No.44310462[source]
You need to account for the headers, which many (most?) csv files I've encountered have.

So I guess something like this to skip the headers in the second file (this also assumes that headers don't have line breaks):

  cp file1.csv combined.csv && tail -n+2 file2.csv >> combined.csv
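
If the header might contain quoted line breaks, a CSV-aware version avoids counting physical lines. This is a rough sketch with Python's standard `csv` module; the function name `concat_csv` is made up, and it assumes both files have the same columns in the same order.

```python
import csv
import io


def concat_csv(text_a: str, text_b: str) -> str:
    """Append the data rows of text_b to text_a, keeping a single header row."""
    reader_a = csv.reader(io.StringIO(text_a, newline=""))
    reader_b = csv.reader(io.StringIO(text_b, newline=""))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # Copy the first file whole, header included.
    writer.writerows(reader_a)

    # Skip one parsed record (the header) from the second file. Because the
    # reader parses records, this works even if the header spans several lines.
    next(reader_b, None)
    writer.writerows(reader_b)
    return out.getvalue()
```

The result of `concat_csv(open("file1.csv", newline="").read(), open("file2.csv", newline="").read())` can then be written to `combined.csv`.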
13. xnx ◴[] No.44397619[source]
There are also a lot of command-line options for joining by column, like csvkit.
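
For comparison, joining on a column is also only a few lines without any extra tools. The sketch below is not csvkit; it is a plain hash join using Python's standard `csv` module. The name `join_csv` is made up, and it assumes the key column exists in both files and that the other column names do not collide.

```python
import csv
import io


def join_csv(left_text: str, right_text: str, key: str) -> str:
    """Inner-join two CSV texts on a shared column."""
    left = csv.DictReader(io.StringIO(left_text))
    right = csv.DictReader(io.StringIO(right_text))

    # Index the right-hand rows by key so each lookup is a dictionary hit.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    # Output columns: everything from the left, then the right minus the key.
    extra = [name for name in (right.fieldnames or []) if name != key]
    fields = list(left.fieldnames or []) + extra

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            writer.writerow({**lrow, **{name: rrow[name] for name in extra}})
    return out.getvalue()
```

Since each file is read once and lookups go through a dictionary, the work grows roughly linearly with the number of rows rather than with their product.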
14. xnx ◴[] No.44397638[source]
csvkit and duckdb would also be good options. Any LLM will spit out a one-liner for any type of join you can describe.