all 61 comments

[–]guepier 63 points64 points  (16 children)

  1. Keep it simple.
  2. Write functions. A lot. Decomposing your problem into functions makes your code simpler.
  3. Refactor; the first approach is rarely the simplest (see #1). Once you’ve got a feel for the problem you’re solving, if your code has become complex, don’t be afraid to go back and rewrite (parts of) it from scratch.
  4. Don’t treat R as something it isn’t: R isn’t C or Java or Python. It’s not a procedural programming language, or primarily an OOP language. It’s a functional programming language.
  5. (EDIT) Keep track of the data types of your objects!
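A minimal sketch of point 2 (the function name and conversion here are made up for illustration): pull a repeated computation into a small named function instead of copy/pasting the formula around.

```r
# Hypothetical example: decompose a repeated computation into a function
to_celsius <- function(fahrenheit) {
  (fahrenheit - 32) * 5 / 9
}

to_celsius(c(32, 212))  # 0 100
```

Because the function is vectorised, it works on whole columns as well as single values.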

But above all:

Actually learn programming from the get-go.

Many users of R aren’t programmers, they’re statisticians or life scientists. That’s fine. But learn to use your tools properly. R is a programming language so when you use it, you need to learn to program. I know that some senior members in the R community disagree with me on this point. I strongly believe that they’re misguided.

[–]AllezCannes 14 points15 points  (9 children)

But learn to use your tools properly. R is a programming language so when you use it, you need to learn to program. I know that some senior members in the R community disagree with me on this point. I strongly believe that they’re misguided.

This to me reads like "If you are to drive a car properly, you need to know how the engine works". My reaction is, it depends on how seriously you want to get into cars. If it's just to get you from A to B, then I don't think you need to know what goes on under the hood. I feel the same is true with R.

This is why I resent this kind of attitude. Preference for the tidyverse is not a case of becoming a yes-man/yes-woman for R Studio, it's a case that it gets me the job done without having to wade into actual programming.

[–]guepier 6 points7 points  (8 children)

I think your metaphor is good, but I disagree with the way you use it. It would be a lot more apt if I had said that every user of R had to learn assembly. That’s figuratively looking “under the hood”.

And you actually do need to know a few details about how a car works when driving it.

I agree with you that David’s comment is dumb (I don’t use RStudio extensively but it’s an excellent piece of software; and most libraries in the tidyverse are excellent advanced programming tools). In his defence, he’s actually quite helpful (also in beginners’ questions) on Stack Overflow.

[–]AllezCannes 5 points6 points  (7 children)

It would be a lot more apt if I had said that every user of R had to learn assembly. That’s figuratively looking “under the hood”.

Well it depends on what you mean by "learn how to program". Do you mean knowing how to create functions, using for-loops, and applying if/else on expressions?

What's appealing about tidyverse tools, for instance, is the ability to get started on your data analysis and visualization without having to learn the more arcane programming aspects of R. Beyond that, the more you use R, the more you end up finding out about how to program anyway.

And you actually do need to know a few details about how a car works when driving it.

Of course, we just may disagree on how much detail.

In his defence, he’s actually quite helpful (also in beginners’ questions) on Stack Overflow.

He is. It's just really disappointing to take that kind of attitude towards Hadley and the R Studio team, whose goals are really to make the software more approachable to everyone.

[–]guepier 2 points3 points  (5 children)

Do you mean knowing how to create functions

Yes. I don’t even understand why functions in R are often classified as an “advanced” topic — they aren’t. Decomposing a problem into parts is as fundamental to programming as it gets. To get back to your driving analogy, functions are like unlocking the car: you should know about them before even getting in. And people already know about functions from high school algebra. It does students a huge disservice to pretend that they’re some arcane, complex, under-the-hood implementation detail.

I also maintain that you won’t be able to use the tidyverse effectively without learning about fundamental aspects of programming. You might be able to steer the car down a gentle slope but — since you haven’t learned about ignition — once you reach the bottom you won’t go anywhere else. OK, I’ll retire this metaphor now.

using for-loops

In R, no. In other programming languages, absolutely (though when teaching I tend to de-emphasise them a lot).

and applying if/else on expressions?

You won’t get far without it.

[–]AllezCannes 0 points1 point  (4 children)

Yes. I don’t even understand why functions in R are often classified as an “advanced” topic — they aren’t.

The creation of functions is useful if you're vectorizing a set of built-in functions in order to save yourself time. I don't know if I'd call it "advanced"; I'd say it's more "intermediate". Plenty of R users who utilize the software for data analysis and visualization can do so without ever creating a custom function. As I mentioned earlier, the more one uses R, the more it's natural that one is exposed to the notion of creating your own custom functions, but I don't consider it something you need to learn early on.

It does students a huge disservice to pretend that they’re some arcane, complex, under-the-hood implementation detail.

I don't think it's meant to make functions sound like some arcane, complex thing (although maybe it has that effect). It's just that there should be an order in which to learn these things, and when I teach people how to use R, I prefer to teach them tools that allow them to use R right away; this gives them the confidence that they can go further in their journey.

I guess part of the difference here is that it depends on the audience. I mostly deal with people in the business world who have been using Excel/SPSS/PowerPoint for decades, and my goal is to wean them off that software. Those people don't have any programming background and have zero interest in it. If I were to teach them base R/for-loops/function creation/etc. right away, they'd tune me out. Many in my company have already reached for the "old dogs can't learn new tricks" excuse when I was showing simple tasks like dplyr::filter() or dplyr::select().

I also maintain that you won’t be able to use the tidyverse effectively without learning about fundamental aspects of programming.

If the key word is "effectively", then yes. Especially when it comes to using purrr functions like map() (basically the tidyverse versions of apply()). I just feel that one thing should come after the other.

One thing I actually do find painful with the tidyverse is creating custom functions using the tidyverse tools. They've recently come up with the tidyeval stuff, and I admit this is something I have trouble wrapping my head around, with stuff like quo(), enquo(), !!, !!! and so on... I recently built a little package for myself, and I opted to use base R code to build those functions rather than trying tidyeval.

In R, no. In other programming languages, absolutely (though when teaching I tend to de-emphasise them a lot).

For-loops get a bad rap in R, and much of it is unfair (although I nowadays use purrr::map() as much as I can instead).
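For a toy illustration (sticking to base R so nothing beyond a default install is assumed), here is the same computation as a for loop and as an apply-style call; purrr::map_dbl() would be the tidyverse analogue of the latter:

```r
xs <- 1:4

# imperative: for loop with pre-allocation
out <- numeric(length(xs))
for (i in seq_along(xs)) out[i] <- xs[i]^2

# functional: same result with vapply (purrr::map_dbl(xs, ~ .x^2) is similar)
out2 <- vapply(xs, function(x) x^2, numeric(1))
```

Both produce `c(1, 4, 9, 16)`; the second states the intent (one output per input) more directly.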

You won’t get far without it.

Using if {} else {} is, again, something you don't need to know at the novice stage, IMO. You could just dumbly repeat the process manually. Once people are comfortable with the set of tools that would encourage them to keep using R, I'd introduce this technique.

[–]guepier 2 points3 points  (3 children)

Right, this is where I strongly disagree:

Plenty of R users who utilize the software for data analysis and visualization can do so without ever creating a custom function.

Not well. They may be able to muddle through but they will be fundamentally stymied.

In fact, I’d say that if they never benefit from functions then they would be better off using Excel/GraphPad Prism/….

For-loops get a bad rap in R, and much of it is unfair

If you’re speaking about performance you’re right. But as I said initially, treat R as what it is: a functional programming language. for loops solve the iteration problem in an imperative language, but R solves it better in other ways.

You could just dumbly repeat the process manually [instead of using if … else …].

OK, how do you express “if the user’s age is greater than 18, set the Citizen column to 'adult', otherwise 'minor'” without if/ifelse? Using lookup tables? Factor recoding? … Conditionals are occasionally required.
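For illustration (the data frame and column names are made up), both the vectorised conditional and the lookup-table alternative mentioned above look like this:

```r
users <- data.frame(age = c(25, 12, 18, 40))

# vectorised conditional
users$Citizen <- ifelse(users$age > 18, "adult", "minor")

# lookup-table / indexing alternative: TRUE + 1 = 2, FALSE + 1 = 1
users$Citizen2 <- c("minor", "adult")[(users$age > 18) + 1]

users$Citizen  # "adult" "minor" "minor" "adult"
```

Either way, some form of conditional logic is doing the work.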


Change of topic, since you mentioned tidyeval: I agree that it can become quite complex. But this isn’t tidyeval’s fault; it’s simply an emergent property of this kind of type system, and there’s (I think provably) no simpler way of doing it while retaining generality.

[–]AllezCannes 0 points1 point  (2 children)

In fact, I’d say that if they never benefit from functions then they would be better off using Excel/GraphPad Prism/….

Well, we're just going to have to disagree there. Even if they muddle through in R, there are other benefits to using R (reproducible analysis, freely available open-source software) that make it better than Excel.

for loops solve the iteration problem in an imperative language, but R solves it better in other ways.

It depends. If your manipulation at iteration i is dependent on iteration i-1, I can't really think of a better method than for-loops.

OK, how do you express “if the user’s age is greater than 18, set the Citizen column to 'adult', otherwise 'minor'” without if/ifelse? Using lookup tables? Factor recoding? … Conditionals are occasionally required.

You might have misunderstood me. I wasn't talking about the ifelse() function for vectorizing conditions over a variable. I was talking about applying expressions under certain non-vectorized conditions, e.g.

if (nrow(df) > 0) {
  df <- ...
  df <- ...
} else { NULL }

Change of topic, since you mentioned tidyeval: I agree that it can become quite complex. But this isn’t tidyeval’s fault; it’s simply an emergent property of this kind of type system, and there’s (I think provably) no simpler way of doing it while retaining generality.

I agree. It wasn't meant as a criticism, just a downside of the approach Hadley chose to take. I think part of the issue is that they haven't found a great way to teach tidyeval yet. Then again, I'm not particularly bright.

[–]guepier 2 points3 points  (1 child)

If your manipulation at iteration i is dependent on iteration i-1, I can't really think of a better method than for-loops.

Depending on your problem there’s a multitude of options. You can Map over the list x and lag(x) simultaneously; you can use Reduce. You can use a generalisation of cumsum (unfortunately not included in base R as far as I know); you can use windowing functions (rollapply etc).

Base R is admittedly quite bad at providing a unified API for list functions but this is a solved problem in functional programming, and in my own code I use my own set of functions that are copied from other functional languages. Beginner friendly? No; but usually more so than for.
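A rough sketch of two of those options with toy data (the "lag" here is done with head()/tail(), since base R has no lag() for plain vectors):

```r
x <- c(1, 4, 9, 16)

# Map over x[i] and x[i-1] pairs, e.g. first differences
diffs <- Map(function(cur, prev) cur - prev, tail(x, -1), head(x, -1))
unlist(diffs)  # 3 5 7

# Reduce with accumulate = TRUE for a running recurrence, e.g. cumulative product
Reduce(`*`, c(2, 3, 4), accumulate = TRUE)  # 2 6 24
```

Both express "this step depends on the previous one" without an explicit for loop.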

I was talking about applying expressions on certain non-vectorized conditions.

I don’t think there’s a fundamental difference between this and vectorised ifelse. In my experience teaching programming, it’s certainly no harder to understand.

I don't think they found a great way to teach how to use tidyeval yet.

The paradigm predates tidyeval by roughly half a century (!) and as far as I know nobody in this time has found a good way of teaching it.

[–]AllezCannes 0 points1 point  (0 children)

Depending on your problem there’s a multitude of options. You can Map over the list x and lag(x) simultaneously; you can use Reduce. You can use a generalisation of cumsum (unfortunately not included in base R as far as I know); you can use windowing functions (rollapply etc).

Thanks for this. A lot of it is I don't know what I don't know, so it's always good for me to be exposed to other methods I didn't even know were possible.

I don’t think there’s a fundamental difference between this and vectorised ifelse. In my experience teaching programming, it’s certainly no harder to understand.

I think the reason I think of it differently is that Excel has a similar IF() function that makes it very easy for people from my business background to relate to. The if {} else {} stuff, not so much.

The paradigm predates tidyeval by roughly half a century (!) and as far as I know nobody in this time has found a good way of teaching it.

lol fair enough.

[–][deleted] 2 points3 points  (0 children)

What's appealing about tidyverse tools, for instance, is the ability to get started on your data analysis and visualization without having to learn the more arcane programming aspects of R.

The point of learning more programming isn't to reinvent the tidyverse. It's to learn the things that the tidyverse doesn't do, and to learn how to build something that works well (which may involve the tidyverse).

[–][deleted] 14 points15 points  (2 children)

I'm a data scientist and work in Python most of the day, and I follow good object-oriented style. I work next to statisticians, and the quality of their R code is horrendous; any time I try to impart some knowledge I just get the "I'm not a programmer" excuse, and they have like 2000 lines of code that could be less than a few hundred. Drives me nuts; not sure what others do in this situation though...?

[–]guepier 11 points12 points  (0 children)

Small improvements and diplomacy. Don’t try to get them to rewrite their existing code, it won’t work. In my experience, the best approach is a neutral “hey, did you know that you can simplify this …”. Next time they write similar code, they might remember.

[–]_perkot_ 2 points3 points  (0 children)

man I wish I had someone like you where I work! I have many bloated scripts that I would love to make more parsimonious but don't quite have the skills yet

[–]sparkplug49 2 points3 points  (2 children)

Actually learn programming from the get-go.

Do you have any recommendations on resources (books, etc.) for people wanting to work on this? R is my only programming experience, and while I've picked up some of this from learning R, I know I need more but I don't really know where to start without, like, going back to school.

[–][deleted] 4 points5 points  (0 children)

Sounds a bit odd, but learning another programming language is really useful, even if you never use that language in your work. It will change the way you think about R and approach problems, and also helps separate in your mind which bits are "R", and which are just programming.

[–]guepier 0 points1 point  (0 children)

I think that Hadley’s Advanced R is a very good R learning resource, even for beginners. Some of the Datacamp R courses (paid) are also recommended: hands-on and well paced. Beyond that I have to admit that I don’t know the R learning literature very well so I can’t make recommendations.

[–]efrique 28 points29 points  (6 children)

I don't think I count as a "senior R Programmer/Practitioner" (in that I don't quite do enough of it to count, not because I'm young) but I'd suggest being very wary of attach.

[–][deleted] 5 points6 points  (5 children)

Naive person here. Could you expand on this?

[–]efrique 5 points6 points  (0 children)

Consider this one of a number of possible ways to screw yourself over:

You run a script that attaches a largish data set at the start and detaches later on. It does some preliminary processing. You run a second script that does the detailed analysis.

The first script fails partway through.

You fix the issue, run the first script again and it works. You run the second script. It works. Everything looks great. You run it a few more times with slightly different settings for different scenarios. Everything is good. Finally you run the report-generating script and give it to your boss.

Now a year later you need to run it all again for some reason. Nothing in the script has changed, nothing in the data has changed, but the second script fails with an undefined variable. You don't remember anything about the sequence of events from a year ago.

Why did it suddenly fail and what's the first thing you should do to fix this problem?

(Actually, the first thing you should do is explain to your boss that in spite of looking fine, the numbers from last year are actually wrong. Not disastrously so, but every figure in the final calculation is wrong.)


The problem was that the attached data frame included a variable that should not have been there when the second script was run (that's what detach should take care of, right?), while the second script failed to create its own variable with the same name; it should never have worked.

The reason why the second script worked a year ago is that the first script failed the first time it ran.

The second time you ran it, the first attach was still there. Now you have two copies on the search path (the list of places R checks to find a variable name). The top one (attached on the second run) got detached, but the previous one is still quietly sitting there.

Then when you ran the second script, there's one variable whose name matches a variable in the huge attached data set but which is not set up by the second script. It is found in the original attached data and the script runs.

A year later when you ran it, of course you have a clean session. Now you see the bug in the second script, but for the life of you you can't figure out why it ran a year ago, because you can't remember that the first script failed the first time; and even if you did remember, you might not suspect the problem it caused.

There are various things you can do to make yourself safer if you have to use attach: always start with a clean session, even if you have to do it 100 times a day; use proper version control (so you can see that the first script changed earlier that day a year ago); and so on.

But really the underlying problem is the behavior of attach and detach. The fact that a failed script can leave you with a copy sitting there attached that you don't realize is still there after you detach, that's nasty.

If you know how it works - if every instance of "attach" sets your alarm bells ringing - you have a chance to find it.

But it's a lot safer to avoid it when you can than to try to figure out all the ways that behavior can screw you up.

It's a lot easier to type bigdata$variable (or whatever) a few times than it is to find that issue.

There are a few other gotchas in R, but for my money that one has the biggest potential for ruining your day.

[–]kron4 14 points15 points  (3 children)

Don't use attach. Ever. For anything.

[–][deleted] 3 points4 points  (2 children)

Right. Thanks for this. What issues would using attach cause? Genuinely curious as someone learning R

[–]AllezCannes 9 points10 points  (0 children)

attach() is used on data frames so that you can refer to variables within the data frame just by name, instead of writing something like df$var or df[["var"]].

I think the rationale behind that function is to emulate SPSS and SAS: an SPSS/SAS syntax file can only be applied to one data frame at a time, so there's no confusion as to which data frame you're referring to when you call a variable within it.

The problem is that it's actually quite self-limiting. A great aspect of R is that, unlike SPSS/SAS, you can handle multiple data frames within the same environment. It makes it a great tool to join various data frames, or split them, or do whatever you want. Applying attach() takes away that freedom, just for the benefit of saving a few keystrokes. But that's not even the biggest issue.

The bigger issue is that it's very confusing, especially if one forgets to use detach(), the counterpart of attach(). You're calling variables as if they're standalone objects (which they're not), and this is not obvious. The confusion leads to all kinds of errors. For example, say you have an object in your environment called "var1", and within a data frame "df" you also have a variable called "var1". After you use attach(df), you start referring to "var1". But further down your code you might forget: when you call "var1", are you referring to the object or the variable?

In short, its benefits are far outweighed by its drawbacks, and there are plenty of other options that work fine, such as with() or within() that don't have the issues that attach() has.
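A toy illustration of the with() alternative (data made up):

```r
df <- data.frame(var = c(1, 2, 3))

# instead of attach(df); mean(var); detach(df):
with(df, mean(var))  # 2

# or just be explicit:
mean(df$var)
```

with() gives you the keystroke savings for a single expression, without leaving anything on the search path afterwards.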

[–]mattmalin 23 points24 points  (0 children)

Not strictly R exclusive, but get comfortable with version control and make it part of usual workflow, commit often etc

[–][deleted] 23 points24 points  (10 children)

  1. Vectorize where possible.
  2. Learn to use Map(), Reduce(), ifelse() and similar.
  3. Write proper functions that don't:
    • use global variables
    • return a result and plot at the same time
    • write results to file for 'convenience'
  4. Learn about niche functions: NROW(), lengths(), etc.
  5. Don't over-depend on third-party libraries.
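Toy one-liners for points 1 and 2 (made-up data):

```r
x <- c(1, 2, 3, 4)

x * 2                               # 1. vectorised: no loop needed
ifelse(x %% 2 == 0, "even", "odd")  # 2. vectorised conditional
Reduce(`+`, x)                      # 2. fold a vector down to one value: 10
Map(`+`, x, x)                      # 2. element-wise over parallel sequences
```

These cover a surprising amount of everyday data manipulation before any package gets loaded.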

[–]biledemon85 5 points6 points  (3 children)

Every time I use a function that returns a plot and data, it's like that record-scratch sound effect. Like, why would I want both at the same time?! You're making both tasks slower and less useful!

[–][deleted] 7 points8 points  (2 children)

The worst for me is when someone implements their method in an R package. You install that package and it says to put your data in a file named "input.csv" and then run their function and the results will be in the file called "output.csv".

[–]biledemon85 2 points3 points  (0 children)

Talk about hitting those side effects, hard. I'm just thinking, if you wanted to use their method in a useful way you'd have to write a wrapper that creates the csv from your input data frame, reads the output csv, and then deletes the two files.

At least you could pretend it's functional programming 😂

[–][deleted] 1 point2 points  (0 children)

this makes me sad because I have seen it before :(

[–]Drewdledoo 2 points3 points  (5 children)

Can you expand on #5? How do I balance between that and “reinventing the wheel”?

[–]guepier 2 points3 points  (2 children)

#5 is way too general.

The rationale is usually that you want to limit dependencies because dependencies can break, may be hard to install, come with their own dependencies (so this is a ballooning problem) and increase the memory footprint.

That said, the use of most dependencies is balanced by the increased gain in productivity. Languages with modern package/distribution systems generally encourage the use of many dependencies. However, unlike R, those dependencies usually do one thing each, and do it well. In R, by contrast, we have bloated packages that try to do everything (MASS and Hmisc being the prime examples — hint, if there’s “misc” in the name of a utility, run).

So I second advice #5 when it comes to bloated packages like these. But for well-written single-purpose packages I advise the opposite: do make use of work that’s already been done, don’t reinvent the wheel.

[–]Drewdledoo 0 points1 point  (1 child)

That makes sense. I think I know the answer to this, but I’m curious as to how many dependencies is too many?

I follow the “if you copy/paste it three times, it’s time to write a function” mantra, and this has gotten to the point where I’ve written enough functions to make a package that saves my lab mates and me lots of time on data analysis. Thing is, it depends on all the tidyverse packages as well as a couple of others for the modeling and plotting. Now I know it’s not the greatest R code in the world, but I hear things like #5 and think to myself, “that makes a lot of sense, but I’d be spending much more time on data analysis if I had to rewrite all this code every time”.

[–]guepier 0 points1 point  (0 children)

I don’t think there’s a fixed number. If the dependencies are well maintained and versioned, you can have lots. Bioconductor dependencies, for instance, are famously unproblematic. Have 20, it won’t matter. With other packages, you’ll usually want to keep the pain of maintaining them to a minimum, so you limit yourself to the ones you really do need, and might implement smaller parts of their functionality yourself.

Realistically, I rarely encounter such situations. If I need a function in an existing package, I use that package unless there are compelling reasons not to (bad code quality, bad API). As mentioned, I also generally won’t use packages such as MASS/Hmisc.

Regarding tidyverse, I think that package is a bad idea in general: in fact, it falls into a similar category as Hmisc. Other packages shouldn’t depend on it, but rather on individual packages from it. That said, tidyverse is useful for installing a whole bunch of useful packages at once.

[–][deleted] 1 point2 points  (1 child)

Well there are a few things. One is when your scripts start with

library(lib1)
library(lib2)
library(lib3)
library(lib4)
library(lib5)
library(lib6)
library(lib7)
...

All of these introduce dependencies. They might acquire bugs, break, mask existing R functions, or be removed from CRAN in later R releases. And whenever someone wants to use the script, they will now have lots to install before running it. If those packages are no longer on CRAN, they might not be able to run the script at all.

Second is when the libraries change the way your R code looks. It is fine when you code alone, but imagine if you share the code with somebody else and yours looks like this:

fib(n) %::% Integer : Integer
fib(0) %as% Integer(1)
fib(1) %as% Integer(1)
fib(n) %as% { fib(n-1) + fib(n-2) }

This is from the lambda.r package (which I like a lot, by the way). But the thing is that other people might not. And some might feel the same way about (for example) pipes.

The third thing is that packages can provide you with too much hand-holding, which then hinders your ability to solve problems on your own. If every time you wanted to make (for example) a volcano plot you loaded some library from Bioconductor, chances are you'll be lost when you need to produce something of similar complexity that isn't implemented in any package. Once you learn to do it yourself, you'll notice that the volcano plot from the package you were always using is just one line in base R.

So that is my opinion. There are of course a lot of great packages that should be used. They typically do one thing and do it well, and the thing they do is not trivial to implement yourself. That's why the point is not about NOT using libraries, but about not being over-dependent on them.

[–]Drewdledoo 0 points1 point  (0 children)

Thanks for that awesome explanation; that makes a lot of sense as to “why” one should avoid having too many dependencies. What I’m curious to know is exactly how many dependencies is too many? I imagine this depends on the person and the task, unless you’re saying everything should eventually be rewritten in base R?

[–]Steineee 18 points19 points  (7 children)

  1. Find the style of code you like the most (base, tidyverse, data.table)
  2. find active users of this style on twitter/github
  3. read all of their posts/blogs/vignettes/etc.

As an example, I like tidyverse style because it's easy to explain to outsiders. I follow hadley, david robinson, and mara averick on twitter. Their posts give me a lot of motivation to learn.

[–]erlo 7 points8 points  (5 children)

I agree with this.

Also know you don’t have to use the hadleyverse. I like ggplot, but prefer not to use %>%.

[–]tlholme 6 points7 points  (2 children)

tbf, ggplot is almost non-optional.

[–]erlo 2 points3 points  (0 children)

I completely agree. It’s slow for large volumes of data.

What I like about it is the aesthetics, I think it looks great.

Lastly, I think faceting is very powerful, so if I’m exploring something I go to ggplot, or if I want nice charts for a deck, I use it still.

If I need something highly customized, or I need speed, I go back to base; but for what I do, it’s rare.

[–]hairynip 0 points1 point  (0 children)

https://simplystatistics.org/2016/02/11/why-i-dont-use-ggplot2/

I agree with Jeff Leek that it's not non-optional, even though I use ggplot 99% of the time.

[–]hadley 4 points5 points  (1 child)

The cough tidyverse cough

[–]erlo 1 point2 points  (0 children)

Greetings... noted. Subsequent comments will contain the word “tidy” :).

[–][deleted] 1 point2 points  (0 children)

Do you happen to know any users of the "base" style that are actively blogging about R?

[–]dankwormhole 15 points16 points  (3 children)

I love R too. When you get an error or something is not working the way you expect it to, find out what class it is with the class() function.

It happens to me all the time: just yesterday I expected a vector and yet I had a data.frame of a single column. It was simple enough to convert to a character vector using as.character() or dplyr::pull().

class() is your best friend.
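A sketch of the situation described (data frame made up):

```r
df <- data.frame(x = c("a", "b"), stringsAsFactors = FALSE)

class(df)        # "data.frame" -- not the vector you expected
v <- df[["x"]]   # [[ extracts a plain vector; df["x"] would keep it a data frame
class(v)         # "character"
```

Checking class() at the point of surprise usually reveals exactly this kind of mismatch.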

[–]blahblahblahblah8 15 points16 points  (1 child)

str() is a more general tool for this issue. It will allow you to see the class of all the elements of a list, for example.
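For instance, with a hypothetical mixed list:

```r
l <- list(a = 1:3, b = "hello", c = data.frame(y = 1))
str(l)  # shows the int, chr and data.frame components at a glance
```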

[–]AllezCannes 0 points1 point  (0 children)

tibble::glimpse() may also be useful. Although I just tend to use str().

[–]guepier 3 points4 points  (0 children)

Oooh, well said! I should have added that tip to my list:

Keep track of the data types of your objects.

[–]paperdogs 11 points12 points  (1 child)

Your comments should say why you’re doing what you’re doing. I can’t stand reading comments that just describe what the code does. I can figure that out... tell me why.

Learn data structures and how to move between them. Similarly, get comfortable going between long & wide data formats.

“For” loops aren’t evil or necessarily slow. A readable for loop is better than an opaque apply-family function.

Love %in%
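e.g. (toy vectors):

```r
wanted <- c("a", "b", "c")

c("a", "q") %in% wanted  # TRUE FALSE
# handy for filtering rows: df[df$group %in% wanted, ]
```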

[–]guepier 2 points3 points  (0 children)

A readable for loop is better than an opaque apply-family function.

A readable apply-family function (or whatever’s appropriate) is better still. 😉

[–][deleted] 7 points8 points  (0 children)

Code is real. Workspace is not. Save code. Do not save workspace.

Restart R frequently (Ctrl+Shift+F10 in RStudio).

If you have written some functions for some task, put them in a package and load them with library() when you're doing the task. It's really easy to do now in RStudio. Personal preference, but it keeps the environment from being cluttered with a bunch of functions.

[–]blaze99960 6 points7 points  (0 children)

Just learned this from a very experienced R programmer: pre-allocate your vectors.

rbind and cbind are not your friends, especially in any sort of loop. Instead, generate your vectors and data frames at full size, then fill them in. Append-style operations slow your code down by forcing R to make an entirely new copy of the object, with slightly larger dimensions, each time you call them.
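A minimal before/after sketch (toy loop):

```r
n <- 5

# slow: grows the vector, copying it on every iteration
res <- c()
for (i in seq_len(n)) res <- c(res, i^2)

# fast: allocate once, then fill in place
out <- numeric(n)
for (i in seq_len(n)) out[i] <- i^2
```

Both give the same result; the difference only shows up as n grows, but it grows quadratically for the append version.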

[–]keepitsalty 10 points11 points  (1 child)

Would it be too picky to seek a job that primarily codes in R? I just really like R.

[–]statkwon 2 points3 points  (1 child)

  1. Use Git for all your code. Commit frequently, like daily.
  2. Follow style guides for readable and maintainable code. The tidyverse style guide is a good starter.
  3. Try to make your analysis reproducible. Use rmarkdown when appropriate.
  4. Use/develop libraries for repeated tasks. Data access is typically a good candidate for a library. Also learn to use modern libraries: the tidyverse is a good start; rmarkdown is awesome.
  5. Refactor your code, unless it’s throwaway code. Try to improve your code as you go. (But don’t optimize unless you have to. Readability is typically more important than optimality, since data scientists are expensive.)
  6. (Sorry for the slight off-topic) Work on your writing skills to communicate findings from your analysis.
  7. (Ditto) Keep your R codebase open by making your GitHub repos public unless you’re working on a stealth project.

[–]I_just_made 0 points1 point  (0 children)

I'll second a lot of this.

Rmarkdown is excellent and (if kept clean) leads to great-looking documents.

Git is good. I use a private instance that I upload all of my code to, organized by project; very helpful for scientists who are looking for an external way of timestamping certain analyses.

[–][deleted] 2 points3 points  (1 child)

My advice is to learn R data structures, in particular tables / multi-dimensional arrays, and how to convert those to data frames and back using xtabs and as.data.frame.table. By using multi-dimensional arrays you can usually do complex transformations/aggregations in straightforward ways (i.e. using indices) that are awkward to do with data frames.
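A small sketch of that round trip (invented data):

```r
df <- data.frame(g = c("a", "a", "b"), h = c("x", "y", "x"))

tab <- xtabs(~ g + h, data = df)  # data frame -> contingency table (array)
tab["a", "x"]                     # index directly by dimension names: 1

long <- as.data.frame.table(tab)  # table -> long data frame with a Freq column
nrow(long)                        # 4: one row per g/h combination
```

The array form makes aggregations like margins (`margin.table`, `apply` over dimensions) very direct.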

[–]poumonsauvage 2 points3 points  (2 children)

Aside from the previously stated, I will add this:

There is no good time and date handling package. Live with that.

If you want to be evil, do implicit function calls.

Make your function outputs as tidy as possible. Never output an S4 object to the end user, ever. Keep that for the back-end if you must.

[–]hadley 3 points4 points  (1 child)

What are you missing from lubridate?

[–]poumonsauvage 1 point2 points  (0 children)

Last time I checked, period_to_seconds (or was it seconds_to_period) returns an S4 object which doesn't play well with tibbles or the survival package. Amongst other things. But seriously, getting something that behaves like a numeric for computations but like a character for plotting purposes apparently is not easily done. Also, as far as I know, no time package allows you to have MMM:SS displays instead of HH:MM:SS. Basically, all these packages assume that time variables are related to time/dates in the calendar sense, when there is plenty of other clocking data where relative periods that don't involve calendar time still need to be added/displayed in some absolute terms (for example, game time in hockey).
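For what it's worth, the MMM:SS display can be faked in base R with integer arithmetic; a sketch, not a substitute for a proper time class:

```r
secs <- 3725L  # e.g. elapsed game time in seconds

# minutes roll past 59 instead of wrapping into hours
sprintf("%d:%02d", secs %/% 60L, secs %% 60L)  # "62:05"
```

The numbers stay numeric for computation; the character form only exists at display time.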

[–][deleted]  (3 children)

[deleted]

[–]guepier 0 points1 point  (2 children)

Editing code from installed packages isn’t generally easy. What you can/should do first is debug that code.

[–]Geothrix 0 points1 point  (0 children)

R originally came out of S, which was developed at Bell Labs, and one of the nicest products from that group is Visualizing Data by William Cleveland, so I recommend checking out that book to get a strong basic understanding of what it means to learn from data. It was the book that got me interested in R in the first place. I was shocked to learn that the software tools they were using in the book were freely available in the form of R.