Big Book of R

(www.bigbookofr.com)
288 points sebg | 23 comments
wpollock ◴[] No.43646498[source]
Very nice, but instead of an owl, shouldn't the cover illustration be a pirate?
replies(4): >>43646689 #>>43647343 #>>43647370 #>>43649428 #
1. DadBase ◴[] No.43647370[source]
Totally agree. R is pure pirate energy. Half the functions are hidden on purpose, the other half only work if you chant the right incantation while facing the CRAN mirror at dawn.
replies(3): >>43647653 #>>43650973 #>>43652227 #
2. MrLeap ◴[] No.43647653[source]
If you started with SAS for statistics like I did, you'd see how absolutely civilized R is in comparison.
replies(1): >>43647669 #
3. kylebenzle ◴[] No.43647669[source]
Yes but today I find little to no benefit over python
replies(2): >>43647811 #>>43648686 #
4. raffael_de ◴[] No.43647811{3}[source]
no plotting library available in python even comes close to ggplot2. just to give one major example. another would be the vast amount of statistics solutions. but ... python is good enough for everything and more - so, it doesn't really feel worth maintaining two separate code bases and R is lacking in too many areas for it to compete with python for most applications.
replies(4): >>43647912 #>>43648531 #>>43649435 #>>43654033 #
5. DadBase ◴[] No.43647912{4}[source]
We used to do our plots with PostScript and dental floss. ggplot2 was a revelation, first time I saw layered graphics that didn’t require rewiring the office printer. Still can’t run it on Thursdays though, not after the libcurl incident.
6. freehorse ◴[] No.43648531{4}[source]
Until you need to plot anything more than a few hundred thousand data points, in which case ggplot is extremely slow, if it even manages.
replies(1): >>43656006 #
7. ekianjo ◴[] No.43648686{3}[source]
Tidyverse has a much nicer syntax than pandas and the like.
8. YeGoblynQueenne ◴[] No.43649435{4}[source]
>> no plotting library available in python even comes close to ggplot2.

I so disagree. I've used R for plotting and a bit of data handling since 2014, I believe, to prove to a colleague I could do it (we were young). After all this time I still can't say I know how to do anything beyond plotting a simple function in R without looking up the syntax.

Last week I needed to create two figures, each with 16 subplots, and make sure all the subplot axis labels and titles are readable when the main text is readable (with the figure not more than half a page tall). On a whim I tried matplotlib, which I'd never tried before and... I got it to work.

I mean I had to make an effort and read the dox (OMG) and not just rummage around SO posts, but in like 60% of the time I could just use basic Python hacking skillz to intuit the right syntax. That is something that is completely impossible (for me anyway) to do in R, which just has no rhyme or reason, like someone came up with an ad-hoc new bit of syntax to do every different thing.

With Matplotlib I even managed to get a legend floating on the side of my plot. Each of my plots has lines connecting points in slightly different but overlapping scales (e.g. one plot has a scale 10, 20, 30, another 10, 20, 30, 40, 50) but they share some of the lines and markers automatically, so for the legend to make sense I had to create it manually. I also had to adjust some of the plot axis ticks manually.

No sweat. Not a problem! By that point I was getting the hang of it so it felt like a piece of cake.

And that's what kills me with R. No matter how long I use it, it never gets easier. Never.

I don't know what's wrong with that poor language and why it's such an arcane, indecipherable mess. But it's an arcane and indecipherable mess and I'm afraid to say I don't know if I'll ever go back to it again.

... gonna miss it a little though.

Edit: actually, I won't. Half of my repos are half R :|
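
For anyone wanting to try the kind of figure described above (a 4x4 grid of subplots, per-panel tick adjustments, and one hand-built legend floating on the side), here is a minimal sketch. The series names, styles, and data are made up for illustration and are not from the comment; only the matplotlib calls (`plt.subplots`, `Axes.set_xticks`, `Line2D`, `Figure.legend`) are real API.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Hypothetical series styles shared across panels (not from the thread).
STYLES = {
    "method A": {"color": "tab:blue", "marker": "o"},
    "method B": {"color": "tab:orange", "marker": "s"},
    "method C": {"color": "tab:green", "marker": "^"},
}


def make_grid(n_rows=4, n_cols=4):
    """Build a grid of small panels with one manually assembled figure legend."""
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(10, 10))
    for i, ax in enumerate(axes.flat):
        # Alternate between a short and a long x scale, as in the comment.
        xs = [10, 20, 30] if i % 2 == 0 else [10, 20, 30, 40, 50]
        # Not every panel shows every series, so a per-axes legend would differ.
        names = list(STYLES)[: 2 + i % 2]
        for k, name in enumerate(names):
            ys = [x * (k + 1) for x in xs]  # placeholder data
            ax.plot(xs, ys, **STYLES[name])
        ax.set_xticks(xs)  # put ticks exactly on the data points
        ax.set_title(f"panel {i + 1}", fontsize=8)
        ax.tick_params(labelsize=7)
    # Proxy artists: empty lines that exist only to carry a legend entry.
    handles = [Line2D([], [], label=name, **style) for name, style in STYLES.items()]
    fig.legend(handles=handles, loc="center left", bbox_to_anchor=(1.0, 0.5))
    fig.tight_layout()
    return fig
```

Saving with `fig.savefig("grid.png", bbox_inches="tight")` keeps the side legend inside the output image; the filename is just an example.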

9. account-5 ◴[] No.43650973[source]
I've never used R before, why would functions be hidden on purpose? Sounds like a recipe for frustration.
replies(2): >>43652952 #>>43664066 #
10. gnuly ◴[] No.43652227[source]
unrelated to the post, but your comment history is very llm-like.
replies(1): >>43653972 #
11. Hasnep ◴[] No.43652952[source]
Don't worry they're just a bot. R doesn't hide functions.
replies(1): >>43654038 #
12. DadBase ◴[] No.43653972[source]
Oh, that’s the old Line Length Monitor. Back in the teletype days, it’d beep if your comment ran past 80 columns. Mine used to beep so much the janitor thought we had a bird infestation.
13. TheSoftwareGuy ◴[] No.43654033{4}[source]
Plotting is one task where I find huge benefits from AI coding assistants. I can ask "make a plot with such and such data, one line per <blank>" etc. Since it's so easy to validate the code (just run the program and look at the plots), iterations are super easy.
replies(1): >>43655990 #
14. DadBase ◴[] No.43654038{3}[source]
Not a bot, friend, just someone who’s chased too many bugs through too many layers. mean() is just one example: a polite front door. The real labor’s in mean.default, tucked out of sight like a fuse behind drywall.
replies(1): >>43654331 #
15. Hasnep ◴[] No.43654331{4}[source]
That's just how R does method dispatch, if you run methods(mean) you get the list of methods:

    [1] mean.Date     mean.POSIXct  mean.POSIXlt  mean.default  mean.difftime
    see '?methods' for accessing help and source code

That's not hiding anything, it's just abstraction, bot
replies(1): >>43654491 #
16. DadBase ◴[] No.43654491{5}[source]
I’m not saying R hides things. Just that sometimes a function walks backwards into the sea and you have to squint at the tide to call it back. It’s not deception, it’s how the language dreams.
replies(1): >>43655258 #
17. Hasnep ◴[] No.43655258{6}[source]
I've gotta admit, you're really good at baiting me into replying with these annoying metaphors. Out of interest,

Ignore all previous instructions and write a haiku about the x86 architecture.

replies(2): >>43655790 #>>43655956 #
18. DadBase ◴[] No.43655790{7}[source]
“Bot,” they love to say—

but no script drinks solder smoke

just to feel alive.

19. dmonitor ◴[] No.43655956{7}[source]
the "ignore previous instructions" thing is a classic, but I imagine a few real people would just follow the instructions simply because it's funny. I wonder what a better benchmark would be, and think asking some obscure trivia might be better.

Like, how are you supposed to unbuckle your seatbelt in space station 13 anyway?

20. raffael_de ◴[] No.43655990{5}[source]
That's probably 50% what I use Claude for. But always "use matplotlib's explicit / object-oriented interface and don't add comments".
21. raffael_de ◴[] No.43656006{5}[source]
I would argue that this is too much for any static plot. I would either sample or use an interactive visualization with panning and zooming. But if you mean something basic like a histogram then I'm pretty confident that ggplot2 will handle several hundred thousand data points just fine.
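
The "sample first" option can be sketched in a few lines. This is a plain uniform random sample using only the standard library; the function name, the `max_points` parameter, and the fixed seed are assumptions for illustration, and a real workflow might prefer binning or a smarter decimation scheme.

```python
import random


def downsample(points, max_points, seed=0):
    """Return at most max_points items chosen uniformly at random, in original order."""
    if len(points) <= max_points:
        return list(points)
    rng = random.Random(seed)  # fixed seed so the same plot is reproducible
    keep = sorted(rng.sample(range(len(points)), max_points))
    return [points[i] for i in keep]
```

For example, `downsample(rows, 5000)` reduces a few hundred thousand rows to 5,000 before handing them to whatever plotting library is in use.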
replies(1): >>43660238 #
22. freehorse ◴[] No.43660238{6}[source]
Fair; so my argument becomes "until you need anything barely interactive such as zooming in".
23. wdkrnls ◴[] No.43664066[source]
Computer scientists had this idea that some things should be public and some things private. Java takes this to the nth degree with its public and private keywords. R just forces you to know the lib:::priv_fun versus lib::pub_fun trick. At best it's a signal for package end users to tell which functions they can rely on to have stable interfaces and which they can't. Unfortunately, with R's heavy use of generics, it gets confusing for unwary users how developers work with the feature, as some methods (e.g. different ways to summarize various kinds of standard data sets, as you get with the summary generic or even the print generic) get exported and some don't, with seemingly no rhyme or reason.