15: How to write R script

Mengcheng edited this page May 11, 2015 · 4 revisions

R Script coding standards/guidelines

Prefer to split data-analysis code into two files/scripts: one that defines “helper” functions and another that uses those helper functions to perform the task at hand. The following are example templates for these two types of files.

Sample code "Script_HelperFunctions.R"

The purpose of this file is to load any necessary libraries for the script, and to define helpful functions to simplify the main script file, improving readability and reuse of code.

# Load necessary packages
library(foreign)

# Define individual helper functions, defining defaults when possible
helperFunction1 <- function(functionData, parameter1=TRUE)
{
     # Do something to the data
     if(parameter1)
     {
          result <- functionData*1
     }
     else
     {
          result <- functionData*2
     }
     
     # Return the result
     return(result)
}

helperFunction2 <- function(functionData, parameter1=1, parameter2=2)
{
     # Do something to the data
     result <- parameter1 * functionData / parameter2
     
     # Return the result
     return(result)
}
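
To see how the default arguments behave, the short sketch below calls both helpers with and without explicit parameters. It is only an illustration: the helpers are redefined in compact form so the snippet runs on its own, and the input vector `x` is a made-up example, not data from JEX.

```r
# Compact stand-alone copies of the two helpers (same behavior as the template)
helperFunction1 <- function(functionData, parameter1 = TRUE) {
  if (parameter1) functionData * 1 else functionData * 2
}
helperFunction2 <- function(functionData, parameter1 = 1, parameter2 = 2) {
  parameter1 * functionData / parameter2
}

# Hypothetical input used only for this illustration
x <- c(2, 4, 6)

r1 <- helperFunction1(x)                      # default parameter1 = TRUE: values unchanged
r2 <- helperFunction1(x, parameter1 = FALSE)  # values doubled
r3 <- helperFunction2(x)                      # defaults: 1 * x / 2
r4 <- helperFunction2(x, parameter1 = 3, parameter2 = 6)  # named arguments: 3 * x / 6

print(r1)
print(r2)
print(r3)
print(r4)
```

Passing arguments by name, as in the last call, keeps the main script readable and guards against mixing up the order of parameters.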

Sample code "Script.R"

The purpose of this file is to succinctly perform a set of data manipulations (e.g., data normalization, statistical analysis) and reporting (e.g., via plotting or table generation). This file should make use of a helper functions file, unless the script is simple enough that splitting it into two files would make things more complicated or confusing.

# Source the helper functions file to pull in necessary libraries and define necessary functions
source('<path to helper functions R-file>')

# Create a succinct file that makes generous use of helper functions to increase readability of the code.

# Read in data
data <- read.arff('<path to data file from JEX>')

# Apply first data manipulation
data <- helperFunction1(data, parameter1=TRUE)

# Apply second data manipulation
data <- helperFunction2(data, parameter1=1, parameter2=2)

# Plot the data
plot(data$x, data$y)

# Print the data to the console
print(data)
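
The template above can be tried without a real JEX export. The sketch below assumes a small, made-up numeric table: it writes the table to a temporary ARFF file with `write.arff` from the `foreign` package, reads it back with `read.arff`, and applies one helper. The temporary file, the column names `x` and `y`, and the stand-alone copy of `helperFunction2` are assumptions for illustration only.

```r
library(foreign)  # provides read.arff() and write.arff()

# Hypothetical stand-in for a JEX export: a small numeric table in a temp file
arffPath <- tempfile(fileext = ".arff")
write.arff(data.frame(x = c(1, 2, 3), y = c(10, 20, 30)), file = arffPath)

# Stand-alone copy of the helper so this snippet runs without the helper file
helperFunction2 <- function(functionData, parameter1 = 1, parameter2 = 2) {
  parameter1 * functionData / parameter2
}

# Read in data
data <- read.arff(arffPath)

# Apply the manipulation; arithmetic on an all-numeric data frame is element-wise
data <- helperFunction2(data, parameter1 = 1, parameter2 = 2)

# Print the data to the console
print(data)

# Clean up the temporary file
unlink(arffPath)
```

Note that the element-wise arithmetic in the helpers only works when every column is numeric; a table with nominal (factor) attributes would need those columns selected out or handled separately first.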