Some snippets to jog my memory.

install local lib

In .bashrc or similar file define a R_LIBS dir.

export R_LIBS=/home/me/local/R/

When installing use the -l switch to install locally.

R CMD INSTALL -l $R_LIBS Some_R_package.tar.gz

recode missing variables

Recoding missing data, example:

data_e$V6[is.na(data_e$V6)] <- 0

recode non-missing data

Recoding non-missing data, example:

data_e$V6[data_e$V6==20] <- 21.2

use factor to recode variables

Here, APOE has 3 levels or doses, and we’re arbitrarily naming those doses as A, B, or C.

test <- factor(df$APOE, labels=c("A", "B", "C"))

summarizing data

For tabular data, the builtin function table and sumamry functions are helpful.

table (lab$apoe, lab$eoad)

For continuous data, the builtin summary and Hmisc package’s describe functions can help.

library(Hmisc)
describe(lab$age)

For summarizing groups, the psych package has a describeBy function that is helpful.

library(psych)
describeBy(lab$age, lab$load)

plotting

density plot example:

test <- rnorm(100);
d <- density(test, from=0, to=0.5)
plot(d, xlim=c(0,0.5), xlab="x label", ylab ="y label", main="title")

saving PDF of a plot:

pdf("maf_density.ps")
plot...
dev.off()

ggplot

ggplot(fig_two, aes(x=fig_two$male_prev, y=fig_two$fem_prev, fill = fig_two$round_fin_h2)) + geom_tile()
xlab("Male Prevalence") + ylab("Female Prevalence")
opts(title = "Figure 1: Mean Late-onset Alzheimer's Disease\n Heritability as a Function of Prevalence")
scale_fill_gradient(limits = c(0.78, 0.51)
ggsave("test.pdf", height=8, width=8, unit="cm") # default is in

glm

lm_r <- lm(formula = disease ~ AGE + SEX, family = binomial(), data = df1)
summary(lm_r)
plot(lm_r) # 4 diagnostic plots