R: Best Practices of R

This is the latest post in the R series that started with R for Developers. It covers best practices for developers using R, focusing on code and organisation as opposed to statistical approach!

Why Best Practices

Best practices exist across all programming languages. Some apply across languages, such as the SOLID Principles (admittedly not so relevant to R), while others are specific to one language.

These practices are important for consistency and readability within your own projects but become far more relevant when projects become shared and people start collaborating on code. It's at that point that being able to understand other people's code can make or break projects!

R has the added challenge that many users come from a variety of backgrounds and may not have had much experience in coding best practices before. 

Code Styles

A common code style can go a long way to helping with consistency across code bases and projects. It governs such things as variable names, syntax and layout. It's normally better to have a lightweight style guide, because long arguments about the correct casing for variables can result in much lost time!

You'll find it's more important to have a consistent style than to worry about which one in particular to use.
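As a rough illustration of what consistency looks like in practice, the sketch below applies one convention throughout: snake_case names, `<-` for assignment and spaces around operators. The names monthly_sales and mean_sales are hypothetical, and the convention itself is just one possible choice.

```r
# Hypothetical data and function names, chosen only to illustrate a style.
# Convention used here: snake_case, `<-` for assignment, spaces around operators.
monthly_sales <- c(120, 95, 143)

mean_sales <- function(sales) {
  sum(sales) / length(sales)
}

average_monthly_sales <- mean_sales(monthly_sales)
```

A different team might prefer another casing; the point is that every name in the file follows the same rule.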

Google has a well-established style guide at https://google.github.io/styleguide/Rguide.xml and Hadley Wickham published an updated version of it at http://adv-r.had.co.nz/Style.html

Specifics to R

As R is generally used as a scripting or procedural language, it becomes important to break your code down into smaller sections and reuse what you can.

Breaking R functions into their own .R files, pushing code into functions when it makes sense and breaking long operations into discrete steps are all part of this.
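The idea of splitting a long operation into discrete, reusable steps could be sketched as below. All function and variable names are hypothetical. In a real project each function might live in its own file (for example an assumed R/remove_missing.R) and be loaded with source().

```r
# Hypothetical helpers: each does one small job and can be reused or tested alone.
remove_missing <- function(scores) {
  scores[!is.na(scores)]
}

# Rescale values linearly so the minimum becomes 0 and the maximum becomes 1.
rescale_to_unit <- function(scores) {
  (scores - min(scores)) / (max(scores) - min(scores))
}

summarise_scores <- function(scores) {
  c(mean = mean(scores), sd = sd(scores))
}

# The long operation now reads as a sequence of named steps.
raw_scores <- c(10, NA, 20, 30)
clean_scores <- remove_missing(raw_scores)
scaled_scores <- rescale_to_unit(clean_scores)
score_summary <- summarise_scores(scaled_scores)
```

Keeping the intermediate results in named variables makes each step easy to inspect when something goes wrong.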

In more recent times, many languages have advocated eliminating or vastly reducing comments, in favour of better function names and breaking code out into units small enough that their intention is easy to read.

You'll find comments are still very relevant to R because functions are often used less and code chunks may be longer. But the comments should be kept concise and as near to the relevant code as possible.
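A minimal sketch of concise comments placed directly above the code they explain follows. The data and the reasons given in the comments are invented for illustration.

```r
# Hypothetical sensor readings in Fahrenheit.
temperatures_f <- c(68, 72, NA, 75)

# The sensor reports NA when it loses power; drop those readings.
valid_f <- temperatures_f[!is.na(temperatures_f)]

# Convert to Celsius for the downstream report.
valid_c <- (valid_f - 32) * 5 / 9
```

Each comment says why the line exists rather than restating what the code does, which keeps it short.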

Further Reading

There is another article on R Best Practices here that is worth taking a look at.

In addition, naming is one of the hardest things in programming and one of the most relevant for readability. There is a blog post by Robin Lovelace on Consistent naming conventions in R that is particularly interesting.

Hopefully we'll wrap up the R series soon and move on to one about Python!

Author

Duncan Thomson

A Remote Software and Database Contractor specialised in Umbraco, Duncan works from wherever he finds himself. He is the co-organiser of the Python Exeter and Data Science Exeter meetup groups and speaks about Remote Working, Umbraco, Python and .NET. Outside of work, he is keen on travel, random generation, foreign languages and good food.
