R-stat Cheatsheet


To make an R script you simply put all commands into a file and give it an .r or .R extension.

Then you can execute the script from the command line, or from inside a running R session.

From the command line, add the --no-save and --no-restore options so that R neither writes nor reloads a workspace image (the potentially "bulky" .RData file):

bash# R CMD BATCH --no-save --no-restore scriptname.r
In the second case, once you're inside R you type:
R> source("scriptname.r")
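As a minimal sketch of the second route, the snippet below writes a tiny script to a temporary file and runs it with source(). The script contents and the variable names (x, x_mean) are made up for illustration only.

```r
# Write a small, hypothetical script to a temporary file so the example is self-contained.
script_path <- file.path(tempdir(), "scriptname.r")
writeLines(c(
  "x <- c(2, 4, 6, 8)",
  "x_mean <- mean(x)",
  "cat('mean of x:', x_mean, '\\n')"
), script_path)

# Run the script from inside R; by default its objects end up in the global environment.
source(script_path)
```

The same file could be run non-interactively with R CMD BATCH, in which case the printed output goes to an .Rout file instead of the console.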
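In scripts where graphic output should go to an image file rather than to the screen, open a file device before plotting and close it when you are done:
R> postscript("somefilename.ps")                              # PostScript output, or
R> png(filename="somefilename.png", width=800, height=600)    # PNG output, size in pixels
R> # ... plotting commands ...
R> dev.off()                                                  # close the device so the file is written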
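A complete round trip might look like the sketch below. It uses postscript() because that device needs no display; the output path in tempdir() and the plotted data are placeholders, not part of the original recipe.

```r
# Hypothetical output location; replace with the file name you actually want.
out_file <- file.path(tempdir(), "somefilename.ps")

postscript(out_file)                                   # open the file device
plot(1:10, (1:10)^2, type = "b", main = "Example plot")
invisible(dev.off())                                   # close it so the file is complete
```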
To install packages into a local folder when you do not have root (sudo) privileges:
R> install.packages("packagename", lib="~/R/library")
R> install.packages("packagename.tar.gz", lib="~/R/library", repos=NULL, type="source")  # if you downloaded the source tarball
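For packages installed this way to be found later, the local folder has to be on R's library search path (or be passed to library() through its lib.loc argument). The sketch below assumes a stand-in directory under tempdir() in place of ~/R/library.

```r
# Stand-in for ~/R/library so the sketch does not touch your home directory.
local_lib <- file.path(tempdir(), "R", "library")

# The folder should exist before packages are installed into it.
dir.create(local_lib, recursive = TRUE, showWarnings = FALSE)

# Put the local folder first on the library search path for this session.
.libPaths(c(local_lib, .libPaths()))
print(.libPaths())
```

Adding the .libPaths() call to your ~/.Rprofile would make the setting persistent across sessions.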

Plotting Tricks

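To overlay one plot on top of another:
plot(shift, main="Shift Parameter for the Ribosome")
lines(shift)   # lines() draws on the existing plot, so no par(new=T) is needed here
par(new=T) is only required before a second high-level plot() call, as in the dual ordinate example that follows.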
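The two overlay mechanisms can be compared in a short sketch. The data here are random stand-ins for shift, and pdf(NULL) is used only so the code runs without a display; drop that line in an interactive session.

```r
set.seed(1)
shift <- cumsum(rnorm(50))   # hypothetical stand-in for the 'shift' data
other <- (1:50)^2            # hypothetical second series on a different scale

pdf(NULL)                    # null device: nothing is written to disk
plot(shift, main = "Shift Parameter for the Ribosome")
lines(shift, col = "red")    # low-level call: added to the existing plot

par(new = TRUE)              # the next high-level plot() draws over the current figure
plot(other, type = "l", col = "blue", axes = FALSE, xlab = "", ylab = "")
invisible(dev.off())
```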
To plot two ordinates (left and right y axes) in one plot:
upvar<-rnorm(10)+seq(1,1.9,by=0.1)
downvar<-rnorm(20)*5+19:10
par(mar=c(5,4,4,4))
plot(6:15,upvar,pch=1,col=3,xlim=c(1,20),
xlab="Occasion",ylab="",main="Dual ordinate plot")
mtext("upvar",side=2,line=2,col=3)
abline(lm(upvar~I(6:15)),col=3)   # regress on the x values actually plotted (6:15)
par(new=T)
plot(downvar,axes=F,xlab="",ylab="",pch=2,col=4)
axis(side=4)
abline(lm(downvar~I(1:20)),col=4)
mtext("downvar",side=4,line=2,col=4)
To draw circular (polar) histograms and density plots with ggplot2:
Histogram:
library(ggplot2)
library(circular)
set.seed(123) 
X = rbeta(100, shape1 = 2, shape2 = 4) 
X = 2 * pi * X 
X = circular(X, type = "angle", units = "radians", rotation = "clock") 
X = data.frame(x=unclass(X)) # drop unnecessary attributes 
p <- ggplot(X, aes(x = x)) 
p <- p + geom_histogram(aes(y = after_stat(count)), binwidth=pi/6)  # after_stat() replaces the older ..count.. notation
p <- p + coord_polar(theta = "x", start = 2*pi, direction = 1)  # current coord_polar() has no 'expand' argument
print(p) 
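To see what binwidth=pi/6 means for the data, the angles can be binned by hand into twelve sectors in base R. This is only an approximate illustration: ggplot2 may place its bin edges differently, and the variable names below are not from the original recipe.

```r
set.seed(123)
angles <- 2 * pi * rbeta(100, shape1 = 2, shape2 = 4)   # same recipe as X above

# Twelve sectors of width pi/6 covering the full circle.
breaks <- seq(0, 2 * pi, length.out = 13)
sector <- cut(angles, breaks = breaks, include.lowest = TRUE)

counts <- table(sector)   # number of observations per sector
print(counts)
```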
Density estimation:
# estimate a von Mises kernel density and convert it into a data.frame before calling ggplot2
# X is a plain data.frame at this point, so rebuild the circular object from its x column
Xc = circular(X$x, type = "angles", units = "radians", rotation = "clock")
vonmises = density.circular(Xc, kernel = "vonmises", n = 512, bw = 300)
D <- data.frame(lapply(vonmises[c("x", "y")], as.numeric))
p <- ggplot(D, aes(x, y))
p + geom_line() + coord_polar(theta = "x", start = 2*pi, direction = 1) + ylim(-1, max(D$y))
This recipe comes from a recommendation posted on the ggplot2 list.

Scripting Tricks

Another pleasant way to interact with R is to call it as a command line program through Rscript. Rscript is similar to little r, another effort at direct command line interaction with R.

Here are some examples that use R as a command line calculator for the mean, standard deviation, summary and sum of numbers arranged in the columns of a field-separated text file.

Rscript -e "colMeans(read.table(\"file.tab\"))"        # mean() no longer accepts a whole data frame
Rscript -e "sapply(read.table(\"file.tab\"), sd)"      # sd() needs one column at a time
Rscript -e "summary(read.table(\"file.tab\"))"
Rscript -e "sum(read.table(\"file\")[,3])"
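The same calculations can be tried inside R on a small inline table. The three-row data set below is invented so that the results are easy to check by eye.

```r
# Hypothetical stand-in for the contents of file.tab: three whitespace-separated columns.
tab_text <- c("1 10 100",
              "2 20 200",
              "3 30 300")
dat <- read.table(text = tab_text)

col_means <- colMeans(dat)     # per-column mean
col_sds   <- sapply(dat, sd)   # per-column standard deviation
third_sum <- sum(dat[, 3])     # sum of the third column only

print(col_means)
print(col_sds)
print(third_sum)
```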

Rscript can also draw a text-mode (stem-and-leaf) histogram of your data in a one-liner:

awk '{print $2}' filewithdataincol2.dat | Rscript -e "fsizes <- as.numeric(readLines('/dev/stdin')); summary(fsizes); stem(fsizes, width=10, scale=2)"

ls -l /usr/bin | awk '!/^total/ {print $5}' | Rscript -e "fsizes <- as.numeric(readLines('/dev/stdin')); summary(fsizes)"

You can also put the following one-liner in a shell script and call it with the file name as first argument and the column number as second.

awk -v col="$2" '{print $col}' "$1" | Rscript -e "fsizes <- as.numeric(readLines('stdin')); summary(fsizes); stem(fsizes, width=10, scale=2)"
On some systems you need /dev/stdin to read standard input in readLines, on others just stdin.
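The awk step can also be skipped by letting an R script pick the column itself. The sketch below assumes a whitespace-separated file without a header; the function name stem_column is hypothetical, and the script would be called as Rscript script.R file column.

```r
# Read one column of a whitespace-separated file and print a summary plus a
# stem-and-leaf plot. 'path' is the file name, 'col' the column number.
stem_column <- function(path, col) {
  dat <- read.table(path)
  values <- as.numeric(dat[[col]])
  print(summary(values))
  stem(values, width = 10, scale = 2)
  invisible(values)
}

# When run through Rscript, take the file name and column from the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 2) {
  stem_column(args[1], as.integer(args[2]))
}
```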

You can also load data directly from, say, awk's standard output. The following one-liners pass the file allenes.ene through awk to keep every fourth row starting at row one, read the result as a table in R, and plot the second column against the fifth. The first form opens no graphics device, so Rscript writes the plot to Rplots.pdf; the second opens an x11() window, which closes as soon as the script ends.
awk 'NR%4==1' allenes.ene | Rscript -e "data <- read.table(pipe('cat /dev/stdin'), header=F, sep=''); plot(data[,2],data[,5])"
awk 'NR%4==1' allenes.ene | Rscript -e 'A <- read.table("stdin"); x11();  plot(A[,2],A[,5], pch=".", type="o")'
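The row selection done by awk 'NR%4==1' can also be done after reading the whole table into R. The data frame below is an invented stand-in for allenes.ene, with made-up column names.

```r
# Hypothetical stand-in for the table read from allenes.ene.
dat <- data.frame(step = 1:12, energy = (1:12)^2)

# Keep rows 1, 5, 9, ... which is what awk 'NR%4==1' selects.
every_fourth <- dat[seq(1, nrow(dat), by = 4), ]
print(every_fourth)
```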


Another one-liner parses data from a MOLARIS run and plots it on screen; locator(1) keeps the x11() window open until the user clicks in it.
sed -n '/bond atom1 atom2/,/not good/p' checkbonds.out | sed 1,2d | grep -v "WARNING" \
       | Rscript -e 'A <- read.table("stdin"); x11(); plot(A[,2],A[,5], pch=".", type="o"); locator(1)'

- R and octave -

- R Astrostats -

- Kickstarting R -

- little r -

- command line average -

- data types -