Big Book of R

(www.bigbookofr.com)
288 points sebg | 1 comments
cye131 ◴[] No.43649039[source]
R, especially dplyr/tidyverse, is so underrated. Working in ML engineering, I see a lot of my coworkers suffering through pandas (or occasionally polars, or even base Python without dataframes) to do basic analytics or debugging. It takes eons and gets complex so quickly that only the most rudimentary checks get done. Anyone doing data-adjacent engineering work would benefit from having R/dplyr in their toolkit.
replies(6): >>43649143 #>>43649208 #>>43649881 #>>43650319 #>>43650677 #>>43683325 #
aquafox ◴[] No.43650677[source]
Why not mix R and Python in interactive analysis workflows?
1) Download Positron: https://github.com/posit-dev/positron
2) Set up a Quarto (.qmd) notebook.
3) Set up R and Python code chunks in your Quarto document.
4a) Use reticulate to spawn a Python session inside R and exchange objects between both languages (https://github.com/posit-dev/positron/pull/4603).
4b) Write a few helper functions that pass objects between R and Python by reading/writing a temporary file.
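For what it's worth, the Python half of option 4b could look roughly like the sketch below. The function names `write_exchange_file` and `read_exchange_file` are made up for illustration, and CSV is just one assumed interchange format; nothing here comes from Positron, Quarto, or reticulate. It uses only the standard library and expects the data as a list of dicts.

```python
import csv
import os
import tempfile


def write_exchange_file(rows, path=None):
    """Write a list of dicts to a CSV file and return the file's path.

    Hypothetical helper: the R side would load the result with
    read.csv(path).
    """
    if not rows:
        raise ValueError("rows must contain at least one record")
    if path is None:
        # mkstemp returns an open OS-level handle; close it and reopen by
        # name so the csv module can manage the file itself.
        fd, path = tempfile.mkstemp(suffix=".csv")
        os.close(fd)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_exchange_file(path):
    """Read a CSV file (for example one written from R) into a list of dicts.

    CSV carries no type information, so every value comes back as a string.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

On the R side, `read.csv()` and `write.csv(..., row.names = FALSE)` would be the matching calls. Because CSV drops column types, a typed format may be a better fit if types matter for the analysis.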
replies(5): >>43650688 #>>43653111 #>>43656358 #>>43657369 #>>43690598 #
1. p00dles ◴[] No.43690598[source]
Is this what tools like Nextflow or Snakemake aim to do? I don't know, and I'm genuinely curious: I'm starting to work in bioinformatics, where doing different parts of an analysis pipeline in R and Python seems common, and really necessary if you want to use certain packages.

I'm wondering whether I should devote time to learning Nextflow/Snakemake, or whether the solution you outlined is "sufficient" (in quotes because, of course, it depends on the use case).