R-stat Cheatsheet
To make an R script you simply put all commands into a file and give it an .r or .R extension.
Then you can execute the script from the command line, or you can execute it once R is called.
In the command-line case, add the --no-save and --no-restore options so that R neither writes nor reloads a "bulky" workspace backup file:
bash# R CMD BATCH --no-save --no-restore scriptname.r
In the second case, once you're inside R you type:
R> source("scriptname.r")
In scripts where you don't want graphic output sent to the screen but written to an image file instead, you can do:
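R> postscript("somefilename.ps")  # PostScript output, or
R> png(filename="somefilename.png", width=800, height=600)  # PNG output
R> # ... plotting commands ...
R> dev.off()  # close the device so the file is written
To install packages in a local folder when you are not in sudoers: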
R> install.packages("packagename", lib="~/R/library")
R> install.packages("packagename.tar.gz", lib="~/R/library", repos=NULL, type="source")  # if you downloaded the source
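Packages installed into a personal library are only found if R is told to look there. The following is a minimal sketch, assuming the same ~/R/library location as above; use_local_lib is a hypothetical helper name, while .libPaths() and dir.create() are base R.

```r
# Hypothetical helper: put a personal library first on R's library search path.
use_local_lib <- function(path = "~/R/library") {
  path <- path.expand(path)
  # .libPaths() silently drops directories that do not exist, so create it first.
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(path, .libPaths()))
  invisible(.libPaths())
}

# Typical use at the top of a script (or in ~/.Rprofile):
# use_local_lib()
# library(packagename)   # "packagename" is a placeholder
```

Alternatively, library(packagename, lib.loc="~/R/library") loads a single package from that folder without changing the search path.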
Plotting Tricks
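To overlay a plot on top of another:
plot(shift, main="Shift Parameter for the Ribosome")
par(new=T)  # the next high-level plot() call draws over this one instead of starting a new page
lines(shift)  # low-level functions such as lines() always add to the current plot
To plot dual ordinates in one plot: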
upvar<-rnorm(10)+seq(1,1.9,by=0.1)
downvar<-rnorm(20)*5+19:10
par(mar=c(5,4,4,4))  # widen the right margin for the second axis
plot(6:15,upvar,pch=1,col=3,xlim=c(1,20), xlab="Occasion",ylab="",main="Dual ordinate plot")
mtext("upvar",side=2,line=2,col=3)
abline(lm(upvar~I(1:10)),col=3)
par(new=T)  # draw the second data set over the first
plot(downvar,axes=F,xlab="",ylab="",pch=2,col=4)
axis(side=4)  # right-hand ordinate for downvar
abline(lm(downvar~I(1:20)),col=4)
mtext("downvar",side=4,line=2,col=4)
To plot a circular density plot using ggplot:
Histogram:
library(ggplot2)
library(circular)
set.seed(123)
X = rbeta(100, shape1 = 2, shape2 = 4)
X = 2 * pi * X
X = circular(X, type = "angles", units = "radians", rotation = "clock")
X = data.frame(x=unclass(X)) # drop unnecessary attributes
p <- ggplot(X, aes(x = x))
p <- p + geom_histogram(aes(y = ..count..), binwidth=pi/6)
p <- p + coord_polar(theta = "x", start = 2*pi, direction = 1)
print(p)
Density estimation:
# use vonmises and convert it into data.frame before calling ggplot2
# density.circular() needs circular data, so rebuild it from the data.frame column
Xc <- circular(X$x, type = "angles", units = "radians", rotation = "clock")
vonmises = density.circular(Xc, kernel = "vonmises", n = 512, bw = 300)
D <- data.frame(lapply(vonmises[c("x", "y")], as.numeric))
p <- ggplot(D, aes(x, y))
p + geom_line() + coord_polar(theta = "x", start = 2*pi, direction = 1) + ylim(-1, max(D$y))
This recommendation comes from the ggplot2 mailing list.
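To see roughly what a von Mises kernel estimate computes, here is a hand-rolled base-R sketch. It assumes that the bandwidth acts as the concentration parameter kappa of the kernel, as the circular documentation describes it; vonmises_kde is a hypothetical helper, and kappa = 20 is an arbitrary illustrative value rather than the bw = 300 used above.

```r
# Von Mises kernel density estimate on the circle, base R only.
vonmises_kde <- function(theta, grid, kappa) {
  # Scaled Bessel function avoids overflow for large kappa:
  # besselI(kappa, 0, expon.scaled = TRUE) equals exp(-kappa) * I0(kappa).
  norm_const <- 2 * pi * besselI(kappa, nu = 0, expon.scaled = TRUE)
  # Average the kernel centred on every observation, at each grid angle.
  sapply(grid, function(g) mean(exp(kappa * (cos(g - theta) - 1)) / norm_const))
}

set.seed(123)
theta <- 2 * pi * rbeta(100, shape1 = 2, shape2 = 4)  # angles in radians
grid  <- seq(0, 2 * pi, length.out = 513)[-513]       # 512 angles in [0, 2*pi)
dens  <- vonmises_kde(theta, grid, kappa = 20)

# Data frame in the same x/y layout used for the ggplot2 line plot.
D <- data.frame(x = grid, y = dens)
```

A larger kappa gives narrower kernels and therefore a rougher curve; a smaller kappa smooths more.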
Scripting Tricks
Another fun and pleasing way to interact with R is to call it as a command-line program, which is done using Rscript. Rscript is similar to little r, another effort at direct interaction with R from the command line.
Here are some examples of running R as a command-line calculator for the mean, standard deviation, summary and sum of numbers arranged in a column of a field-separated text file.
Rscript -e "(mean(read.table(\"file.tab\")[,1]))"
Rscript -e "(sd(read.table(\"file.tab\")[,1]))"
Rscript -e "(summary(read.table(\"file.tab\")))"
Rscript -e "(sum(read.table(\"file\")[,3]))"
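read.table() returns a data frame, so functions such as mean() and sd() need a single column picked out of it, while colMeans() handles every column at once. A small sketch with made-up numbers written to a temporary file (the file and its contents are assumptions for illustration only):

```r
# Build a small two-column table in a temporary file (made-up numbers).
tab_file <- tempfile(fileext = ".tab")
writeLines(c("1 10", "2 20", "3 30", "4 40"), tab_file)

tab <- read.table(tab_file)    # columns are named V1, V2 by default

col_mean  <- mean(tab[, 2])    # mean of one column
col_sd    <- sd(tab[, 2])      # standard deviation of one column
all_means <- colMeans(tab)     # means of all columns at once

print(col_mean)
print(col_sd)
print(all_means)
```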
You can also produce a text-mode histogram (a stem-and-leaf plot) of your data in a one-liner with Rscript:
awk '{print $2}' filewithdataincol2.dat | Rscript -e "fsizes <- as.numeric(readLines('/dev/stdin')); summary(fsizes); stem(fsizes, width=10, scale=2)"
ls -l /usr/bin | awk '!/^total/ {print $5}' | Rscript -e "fsizes <- as.integer(readLines('/dev/stdin')); summary(fsizes)"
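If the piped column may contain blank or non-numeric entries, as.numeric() turns them into NA, which is worth dropping before plotting. A minimal sketch, using a textConnection with made-up values as a stand-in for the piped input:

```r
# Stand-in for piped input; in a real one-liner read from file("stdin") instead.
con <- textConnection(c("4096", "120", "n/a", "88213", "", "512"))
raw_lines <- readLines(con)
close(con)

# as.numeric() yields NA for anything it cannot parse; drop those entries.
fsizes <- suppressWarnings(as.numeric(raw_lines))
fsizes <- fsizes[!is.na(fsizes)]

print(summary(fsizes))
stem(fsizes, width = 10, scale = 2)
```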
You can also put the following one-liner in a script and call it with the file name as the first argument and the column number as the second.
awk -v col="$2" '{print $col}' "$1" | Rscript -e "fsizes <- as.numeric(readLines('stdin')); summary(fsizes); stem(fsizes, width=10, scale=2)"
On some systems you need /dev/stdin to read standard input in the readLines call; on others plain stdin works. Inside the script, $1 is the file name and $2 is the column holding the data for the stem-and-leaf plot.
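The same wrapper can also be written entirely in R, which sidesteps the /dev/stdin versus stdin difference. This is only a sketch: the script name colstem.R and the function stem_column are hypothetical, and it assumes a whitespace-separated file without a header line.

```r
#!/usr/bin/env Rscript
# colstem.R (hypothetical name): stem-and-leaf plot of one column of a file.
# Usage: Rscript colstem.R <file> <column>

stem_column <- function(file, col) {
  values <- read.table(file)[, col]   # pick the requested column
  print(summary(values))
  stem(values, width = 10, scale = 2)
  invisible(values)
}

# Only run when both arguments are given on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 2) {
  stem_column(args[1], as.integer(args[2]))
}
```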
One can also load data directly from, say, awk's stdout. The following one-liner passes the file allenes.ene through awk to keep every fourth row starting at row one, reads the result as a table in R, and then plots the fifth column against the second.
awk 'NR%4==1' allenes.ene | Rscript -e "data <- read.table(pipe('cat /dev/stdin'), header=F, sep=''); plot(data[,2],data[,5])"
awk 'NR%4==1' allenes.ene | Rscript -e 'A <- read.table("stdin"); x11(); plot(A[,2],A[,5], pch=".", type="o")'
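The row selection can also be done inside R rather than in awk. The sketch below uses made-up numbers in place of allenes.ene (assumed here to be whitespace-separated with at least five columns) and writes the plot to a PDF file, since a screen device may not be available under Rscript.

```r
# Made-up stand-in for allenes.ene: 9 rows, 5 whitespace-separated columns.
txt <- sprintf("%d %d %d %d %d", 1:9, 11:19, 21:29, 31:39, 41:49)
data <- read.table(text = txt, header = FALSE)

# Same selection as awk 'NR%4==1': rows 1, 5, 9, ...
keep <- data[seq(1, nrow(data), by = 4), ]

# Plot column 5 against column 2 into a file instead of the screen.
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(keep[, 2], keep[, 5], pch = ".", type = "o")
dev.off()
```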
Yet another one-liner parses data from a MOLARIS run and plots it to the screen; locator(1) keeps the x11() window open until the user clicks in it.
sed -n '/bond atom1 atom2/,/not good/p' checkbonds.out | sed 1,2d | grep -v "WARNING" \
| Rscript -e 'A <- read.table("stdin"); x11(); plot(A[,2],A[,5], pch=".", type="o"); locator(1)'