[med-svn] [r-bioc-limma] 01/08: Imported Upstream version 3.22.0+dfsg
Andreas Tille
tille at debian.org
Thu Oct 16 13:05:03 UTC 2014
This is an automated email from the git hooks/post-receive script.
tille pushed a commit to branch master
in repository r-bioc-limma.
commit ae681316ee8d04987e34eb6515ddcf1472b557b0
Author: Andreas Tille <tille at debian.org>
Date: Thu Oct 16 13:38:52 2014 +0200
Imported Upstream version 3.22.0+dfsg
---
DESCRIPTION | 8 +-
NAMESPACE | 10 ++
R/arrayWeights.R | 384 ++++++++++++++++++++++------------------
R/barcodeplot.R | 382 ++++++++++++++++++++++++++++++++-------
R/diffSplice.R | 121 ++++++++-----
R/geneset-roast.R | 12 +-
R/geneset-romer.R | 8 +-
R/goana.R | 236 ++++++++++++++++++++++++
R/lmfit.R | 3 +-
R/plot.R | 82 ++++-----
R/plotExons.R | 100 +++++++++++
R/plotWithHighlights.R | 91 ++++++++++
R/plotdensities.R | 28 +--
R/plots-ma.R | 243 ++++++++-----------------
R/read-ilmn.R | 20 ++-
R/read-maimages.R | 46 ++---
R/voom.R | 2 +-
R/vooma.R | 77 ++++----
inst/NEWS.Rd | 109 ++++++++++++
inst/doc/changelog.txt | 125 ++++++++++++-
inst/doc/intro.pdf | Bin 46191 -> 46191 bytes
inst/doc/usersguide.pdf | Bin 1017047 -> 0 bytes
man/10GeneSetTests.Rd | 4 +-
man/EList.Rd | 32 ++--
man/TestResults.Rd | 2 +-
man/alias2Symbol.Rd | 3 +
man/anova-method.Rd | 4 +-
man/arrayWeights.Rd | 8 +-
man/asdataframe.Rd | 2 +-
man/backgroundcorrect.Rd | 3 +-
man/barcodeplot.Rd | 87 +++++++--
man/camera.Rd | 8 +-
man/diffSplice.Rd | 24 ++-
man/genas.Rd | 4 +-
man/geneSetTest.Rd | 6 +-
man/gls.series.Rd | 4 +-
man/goana.MArrayLM.Rd | 89 ++++++++++
man/goana.Rd | 77 ++++++++
man/heatdiagram.Rd | 6 +-
man/ids2indices.Rd | 46 +++++
man/imageplot.Rd | 4 +-
man/imageplot3by2.Rd | 4 +-
man/kooperberg.Rd | 2 +-
man/lmFit.Rd | 6 +-
man/loessfit.Rd | 2 +-
man/ma3x3.Rd | 9 +-
man/mdplot.Rd | 15 +-
man/mrlm.Rd | 4 +-
man/nec.Rd | 10 +-
man/normalizeCyclicLoess.Rd | 2 +-
man/normalizeMedianAbsValues.Rd | 2 +-
man/normalizeRobustSpline.Rd | 3 +-
man/normalizeVSN.Rd | 4 +-
man/normalizeWithinArrays.Rd | 2 +-
man/normalizebetweenarrays.Rd | 7 +-
man/normalizeprintorder.Rd | 2 +-
man/normalizequantiles.Rd | 5 +-
man/plot.Rd | 87 +++++----
man/plotDensities.Rd | 20 +--
man/plotExons.Rd | 48 +++++
man/plotFB.Rd | 6 +-
man/plotRLDF.Rd | 4 +-
man/plotSA.Rd | 4 +-
man/plotSplice.Rd | 14 +-
man/plotlines.Rd | 4 +-
man/plotma.Rd | 106 ++++++-----
man/plotma3by2.Rd | 10 +-
man/plotprinttiploess.Rd | 4 +-
man/qqt.Rd | 3 +-
man/qualwt.Rd | 2 +-
man/read.columns.Rd | 6 +-
man/read.idat.Rd | 3 +-
man/read.ilmn.Rd | 2 +-
man/read.ilmn.targets.Rd | 5 +-
man/read.maimages.Rd | 2 +-
man/readGPRHeader.Rd | 7 +-
man/readSpotTypes.Rd | 2 +-
man/readTargets.Rd | 4 +-
man/readgal.Rd | 8 +-
man/removeBatchEffect.Rd | 4 +-
man/roast.Rd | 8 +-
man/romer.Rd | 7 +-
man/selectmodel.Rd | 4 +-
man/strsplit2.Rd | 2 +-
man/summary.Rd | 4 +-
man/symbols2indices.Rd | 28 ---
man/topGO.Rd | 38 ++++
man/topRomer.Rd | 4 +-
man/topSplice.Rd | 20 ++-
man/tricubeMovingAverage.Rd | 12 +-
man/volcanoplot.Rd | 4 +-
man/voom.Rd | 6 +-
man/voomWithQualityWeights.Rd | 73 ++++++++
man/vooma.Rd | 8 +-
man/weightedLowess.Rd | 2 +-
man/writefit.Rd | 2 +-
man/zscore.Rd | 4 +-
tests/limma-Tests.R | 5 +
tests/limma-Tests.Rout.save | 163 +++++++++--------
99 files changed, 2356 insertions(+), 972 deletions(-)
diff --git a/DESCRIPTION b/DESCRIPTION
index 67a4a30..ea6a945 100755
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
Package: limma
-Version: 3.20.9
-Date: 2014/08/27
+Version: 3.22.0
+Date: 2014/10/10
Title: Linear Models for Microarray Data
Description: Data analysis, linear models and differential expression for microarray data.
Author: Gordon Smyth [cre,aut], Matthew Ritchie [ctb], Jeremy Silver [ctb], James Wettenhall [ctb], Natalie Thorne [ctb], Davis McCarthy [ctb], Di Wu [ctb], Yifang Hu [ctb], Wei Shi [ctb], Belinda Phipson [ctb], Alicia Oshlack [ctb], Carolyn de Graaf [ctb], Mette Langaas [ctb], Egil Ferkingstad [ctb], Marcus Davy [ctb], Francois Pepin [ctb], Dongseok Choi [ctb]
@@ -8,7 +8,7 @@ Maintainer: Gordon Smyth <smyth at wehi.edu.au>
License: GPL (>=2)
Depends: R (>= 2.3.0), methods
Suggests: statmod (>= 1.2.2), splines, locfit, MASS, ellipse, affy,
- vsn, AnnotationDbi, org.Hs.eg.db, illuminaio
+ vsn, AnnotationDbi, org.Hs.eg.db, GO.db, illuminaio, BiasedUrn
LazyLoad: yes
URL: http://bioinf.wehi.edu.au/limma
biocViews: ExonArray, GeneExpression, Transcription,
@@ -19,4 +19,4 @@ biocViews: ExonArray, GeneExpression, Transcription,
ProprietaryPlatforms, TwoChannel, RNASeq, BatchEffect,
MultipleComparison, Normalization, Preprocessing,
QualityControl
-Packaged: 2014-08-28 02:01:52 UTC; biocbuild
+Packaged: 2014-10-14 00:27:12 UTC; biocbuild
diff --git a/NAMESPACE b/NAMESPACE
index 357432b..daf3545 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -23,12 +23,15 @@ S3method(as.matrix,RGList)
S3method(as.matrix,ExpressionSet)
S3method(as.matrix,LumiBatch)
S3method(as.matrix,vsn)
+S3method(avedups,default)
S3method(avedups,MAList)
S3method(avedups,EList)
+S3method(avereps,default)
S3method(avereps,RGList)
S3method(avereps,MAList)
S3method(avereps,EListRaw)
S3method(avereps,EList)
+S3method(avearrays,default)
S3method(avearrays,MAList)
S3method(avearrays,EList)
S3method(cbind,MAList)
@@ -50,6 +53,8 @@ S3method("dimnames<-",RGList)
S3method("dimnames<-",EList)
S3method("dimnames<-",EListRaw)
S3method(fitted,MArrayLM)
+S3method(goana,default)
+S3method(goana,MArrayLM)
S3method(length,MAList)
S3method(length,MArrayLM)
S3method(length,RGList)
@@ -70,19 +75,24 @@ S3method(summary,RGList)
S3method(summary,EList)
S3method(summary,EListRaw)
S3method(summary,TestResults)
+S3method(normalizeVSN,default)
S3method(normalizeVSN,RGList)
S3method(normalizeVSN,EListRaw)
S3method(plot,RGList)
S3method(plot,MAList)
S3method(plot,EList)
S3method(plot,MArrayLM)
+S3method(plotMA,default)
S3method(plotMA,RGList)
S3method(plotMA,MAList)
S3method(plotMA,EList)
S3method(plotMA,MArrayLM)
+S3method(plotMDS,default)
S3method(plotMDS,MDS)
+S3method(plotFB,default)
S3method(plotFB,RGList)
S3method(plotFB,EListRaw)
+S3method(plotDensities,default)
S3method(plotDensities,RGList)
S3method(plotDensities,MAList)
S3method(plotDensities,EListRaw)
diff --git a/R/arrayWeights.R b/R/arrayWeights.R
index 6a7060c..1b72a68 100644
--- a/R/arrayWeights.R
+++ b/R/arrayWeights.R
@@ -1,66 +1,73 @@
-arrayWeights <- function(object, design=NULL, weights=NULL, method="genebygene", maxiter=50, tol = 1e-10, trace = FALSE)
-# Compute array quality weights
-# Matt Ritchie
-# 7 Feb 2005. Last revised 16 Jan 2008.
-# Gordon Smyth simplified argument checking to use getEAWP, 9 Mar 2008.
-{
-# Check arguments
- y <- getEAWP(object)
- if(is.null(design))
- design <- matrix(1,ncol(y$exprs),1)
- else
- design <- as.matrix(design)
- if(mode(design) != "numeric") stop("design must be a numeric matrix")
- ne <- nonEstimable(design)
- if(!is.null(ne)) cat("Coefficients not estimable:",paste(ne,collapse=" "),"\n")
- if(missing(weights) && !is.null(y$weights)) weights <- y$weights
- method <- match.arg(method,c("genebygene","reml"))
-
- M <- y$exprs
+arrayWeights <- function(object, design=NULL, weights=NULL, var.design=NULL, method="genebygene", maxiter=50, tol = 1e-10, trace = FALSE)
+# Compute array quality weights
+# Matt Ritchie 7 Feb 2005.
+# Gordon Smyth simplified argument checking to use getEAWP, 9 Mar 2008.
+# Cynthia Liu added var.design argument so that variance model can be modified by user, 22 Sep 2014
+# Last modified 7 Oct 2014.
+{
+# Check arguments
+ y <- getEAWP(object)
+ if(is.null(design))
+ design <- matrix(1,ncol(y$exprs),1)
+ else
+ design <- as.matrix(design)
+ if(mode(design) != "numeric") stop("design must be a numeric matrix")
+ ne <- nonEstimable(design)
+ if(!is.null(ne)) cat("Coefficients not estimable:",paste(ne,collapse=" "),"\n")
+ if(missing(weights) && !is.null(y$weights)) weights <- y$weights
+ method <- match.arg(method,c("genebygene","reml"))
+
+ M <- y$exprs
p <- ncol(design)
- QR <- qr(design)
- nparams <- QR$rank # ncol(design)
- ngenes <- nrow(M)
- narrays <- ncol(M)
- if(narrays < 3) stop("too few arrays")
- if(ngenes < narrays) stop("too few probes")
-
- # Set up design matrix for array variance model
- Z <- contr.sum(narrays)
-
- # Intialise array variances to zero
- arraygammas <- rep(0, (narrays-1))
-
- switch(method, genebygene = { # Estimate array variances via gene-by-gene update algorithm
- Zinfo <- 10*(narrays-nparams)/narrays*crossprod(Z, Z)
+ QR <- qr(design)
+ nparams <- QR$rank # ncol(design)
+ ngenes <- nrow(M)
+ narrays <- ncol(M)
+ if(narrays < 3) stop("too few arrays")
+ if(ngenes < narrays) stop("too few probes")
+
+ Z <- var.design
+ if(!is.null(Z)){
+ if(ncol(Z)<2) stop("design matrix Z must have at least 2 columns")
+ Z=Z
+ }else{
+ # Set up design matrix for array variance model
+ Z <- contr.sum(narrays)
+ }
+
+ # Intialise array variances to zero
+ arraygammas <- rep(0, ncol(Z))
+
+ switch(method, genebygene = { # Estimate array variances via gene-by-gene update algorithm
+ Zinfo <- 10*(narrays-nparams)/narrays*crossprod(Z, Z)
for(i in 1:ngenes) {
if(!all(is.finite(arraygammas)))
- stop("convergence problem at gene ", i, ": array weights not estimable")
- vary <- exp(Z%*%arraygammas)
-
- if(!is.null(weights)) { # combine spot weights with running weights
- if(max(weights[i,], na.rm=TRUE) > 1) {
- weights[i,] <- weights[i,]/max(weights[i,])
- }
- w <- as.vector(1/vary*weights[i,])
- } else {
- w <- as.vector(1/vary)
- }
- y <- as.vector(M[i,])
- obs <- is.finite(y) & w!=0
- if (sum(obs) > 1) {
- if(sum(obs) == narrays) {
- X <- design
- } else { # remove missing/infinite values
- X <- design[obs, , drop = FALSE]
- y <- y[obs]
- vary <- vary[obs]
- Z2 <- Z[obs,]
- }
-
- out <- lm.wfit(X, y, w[obs])
- d <- rep(0, narrays)
- d[obs] <- w[obs]*out$residuals^2
+ stop("convergence problem at gene ", i, ": array weights not estimable")
+ vary <- exp(Z%*%arraygammas)
+
+ if(!is.null(weights)) { # combine spot weights with running weights
+ if(max(weights[i,], na.rm=TRUE) > 1) {
+ weights[i,] <- weights[i,]/max(weights[i,])
+ }
+ w <- as.vector(1/vary*weights[i,])
+ } else {
+ w <- as.vector(1/vary)
+ }
+ y <- as.vector(M[i,])
+ obs <- is.finite(y) & w!=0
+ if (sum(obs) > 1) {
+ if(sum(obs) == narrays) {
+ X <- design
+ } else { # remove missing/infinite values
+ X <- design[obs, , drop = FALSE]
+ y <- y[obs]
+ vary <- vary[obs]
+ Z2 <- Z[obs,]
+ }
+
+ out <- lm.wfit(X, y, w[obs])
+ d <- rep(0, narrays)
+ d[obs] <- w[obs]*out$residuals^2
s2 <- sum(d[obs])/out$df.residual
Q <- qr.Q(out$qr)
if(ncol(Q)!=out$rank)
@@ -72,88 +79,88 @@ arrayWeights <- function(object, design=NULL, weights=NULL, method="genebygene",
# }
# else
# Q <- qr.Q(out$qr)
- h <- rep(1, narrays)
- h[obs] <- rowSums(Q^2)
- Agam <- crossprod(Z, (1-h)*Z)
- Agam.del <- crossprod(t(rep(h[narrays], narrays-1)-h[1:(length(narrays)-1)]))
- Agene.gam <- (Agam - 1/out$df.residual*Agam.del) # 1/(narrays-nparams)
- if(is.finite(sum(Agene.gam)) && sum(obs) == narrays) {
- Zinfo <- Zinfo + Agene.gam
- R <- chol(Zinfo)
- Zinfoinv <- chol2inv(R)
- zd <- d/s2 - 1 + h
- Zzd <- crossprod(Z, zd)
- gammas.iter <- Zinfoinv%*%Zzd
- arraygammas <- arraygammas + gammas.iter
- }
- if(is.finite(sum(Agene.gam)) && sum(obs) < narrays && sum(obs) > 2) {
- Zinfo <- Zinfo + Agene.gam
- A1 <- (diag(1, sum(obs))-1/sum(obs)*matrix(1, sum(obs), sum(obs)))%*%Z2
- A1 <- A1[-sum(obs),] # remove last row
- R <- chol(Zinfo)
- Zinfoinv <- chol2inv(R)
- zd <- d/s2 - 1 + h
- Zzd <- A1%*%crossprod(Z, zd)
- Zinfoinv.A1 <- A1%*%Zinfoinv%*%t(A1)
- alphas.old <- A1%*%arraygammas
- alphas.iter <- Zinfoinv.A1%*%Zzd
- alphas.new <- alphas.old + alphas.iter
- Us <- rbind(diag(1, sum(obs)-1), -1)
- Usalphas <- Us%*%(alphas.new-alphas.old)
- Usgammas <- Z%*%arraygammas
- Usgammas[obs] <- Usgammas[obs] + Usalphas
- arraygammas <- Usgammas[1:(narrays-1)]
- }
+ h <- rep(1, narrays)
+ h[obs] <- rowSums(Q^2)
+ Agam <- crossprod(Z, (1-h)*Z)
+ Agam.del <- crossprod(t(rep(h[narrays], length(arraygammas))-h[1:(length(narrays)-1)]))
+ Agene.gam <- (Agam - 1/out$df.residual*Agam.del) # 1/(narrays-nparams)
+ if(is.finite(sum(Agene.gam)) && sum(obs) == narrays) {
+ Zinfo <- Zinfo + Agene.gam
+ R <- chol(Zinfo)
+ Zinfoinv <- chol2inv(R)
+ zd <- d/s2 - 1 + h
+ Zzd <- crossprod(Z, zd)
+ gammas.iter <- Zinfoinv%*%Zzd
+ arraygammas <- arraygammas + gammas.iter
+ }
+ if(is.finite(sum(Agene.gam)) && sum(obs) < narrays && sum(obs) > 2) {
+ Zinfo <- Zinfo + Agene.gam
+ A1 <- (diag(1, sum(obs))-1/sum(obs)*matrix(1, sum(obs), sum(obs)))%*%Z2
+ A1 <- A1[-sum(obs),] # remove last row
+ R <- chol(Zinfo)
+ Zinfoinv <- chol2inv(R)
+ zd <- d/s2 - 1 + h
+ Zzd <- A1%*%crossprod(Z, zd)
+ Zinfoinv.A1 <- A1%*%Zinfoinv%*%t(A1)
+ alphas.old <- A1%*%arraygammas
+ alphas.iter <- Zinfoinv.A1%*%Zzd
+ alphas.new <- alphas.old + alphas.iter
+ Us <- rbind(diag(1, sum(obs)-1), -1)
+ Usalphas <- Us%*%(alphas.new-alphas.old)
+ Usgammas <- Z%*%arraygammas
+ Usgammas[obs] <- Usgammas[obs] + Usalphas
+ arraygammas <- Usgammas[1:(narrays-1)]
+ }
if(trace && (i==1 || i%%1001==0)) {
x2 <- crossprod(Zzd, gammas.iter) / narrays
cat("Iter =", i, " X2 =", x2, " Array gammas", arraygammas, "\n")
}
- }
- }
- }, reml = { # Estimate array variances via reml
-# const <- narrays * log(2 * pi)
- iter <- 0
- dev <- 0
- repeat {
-# devold <- dev
-# dev <- 0
- iter <- iter + 1
- zd <- matrix(0, narrays, 1)
- sum1minush <- matrix(0, narrays, 1)
- K <- matrix(0, ngenes, narrays)
-
- for(i in 1:ngenes) {
- vary <- exp(Z%*%arraygammas)
-
- if(!is.null(weights)) { # combine spot weights with running weights
- if(max(weights[i,], na.rm=TRUE) > 1) {
- weights[i,] <- weights[i,]/max(weights[i,])
- }
- w <- as.vector(1/vary*weights[i,])
- } else {
- w <- as.vector(1/vary)
- }
-
- y <- as.vector(M[i,])
- obs <- is.finite(y) & w!=0
- n <- sum(obs)
- if (n > 0) {
- if(n == narrays) {
- X <- design
- #Z2 <- Z
- } else { # remove missing/infinite values
- X <- design[obs, , drop = FALSE]
- y <- y[obs]
- vary <- vary[obs]
- w <- w[obs]
- const <- sum(obs) * log(2 * pi)
- }
- # cat(i)
- out <- lm.wfit(X, y, w)
- d <- rep(0, narrays)
- d[obs] <- w*out$residuals^2
- s2 <- sum(d[obs])/out$df.residual
+ }
+ }
+ }, reml = { # Estimate array variances via reml
+# const <- narrays * log(2 * pi)
+ iter <- 0
+ dev <- 0
+ repeat {
+# devold <- dev
+# dev <- 0
+ iter <- iter + 1
+ zd <- matrix(0, narrays, 1)
+ sum1minush <- matrix(0, narrays, 1)
+ K <- matrix(0, ngenes, narrays)
+
+ for(i in 1:ngenes) {
+ vary <- exp(Z%*%arraygammas)
+
+ if(!is.null(weights)) { # combine spot weights with running weights
+ if(max(weights[i,], na.rm=TRUE) > 1) {
+ weights[i,] <- weights[i,]/max(weights[i,])
+ }
+ w <- as.vector(1/vary*weights[i,])
+ } else {
+ w <- as.vector(1/vary)
+ }
+
+ y <- as.vector(M[i,])
+ obs <- is.finite(y) & w!=0
+ n <- sum(obs)
+ if (n > 0) {
+ if(n == narrays) {
+ X <- design
+ #Z2 <- Z
+ } else { # remove missing/infinite values
+ X <- design[obs, , drop = FALSE]
+ y <- y[obs]
+ vary <- vary[obs]
+ w <- w[obs]
+ const <- sum(obs) * log(2 * pi)
+ }
+ # cat(i)
+ out <- lm.wfit(X, y, w)
+ d <- rep(0, narrays)
+ d[obs] <- w*out$residuals^2
+ s2 <- sum(d[obs])/out$df.residual
Q <- qr.Q(out$qr)
if(ncol(Q)!=out$rank)
Q <- Q[,-((out$rank+1):ncol(Q)),drop=FALSE]
@@ -161,41 +168,72 @@ arrayWeights <- function(object, design=NULL, weights=NULL, method="genebygene",
# Q <- qr.Q(qr(X[,-cols[out$qr$pivot[(out$qr$rank + 1):p,drop=FALSE]]]))
# else
# Q <- qr.Q(out$qr)
- h <- rowSums(Q^2)
- zd[obs] <- zd[obs] + d[obs]/s2 - 1 + h
- sum1minush[obs,1] <- sum1minush[obs,1] + 1-h
- K[i,][obs] <- as.vector(h[n]-h)
-# dev <- dev + sum(d[obs]/vary) + sum(log(vary)) + const + 2 * log(prod(abs(diag(out$qr$qr))))
- }
- }
- Zzd <- crossprod(Z, zd)
- Zinfo <- diag(sum1minush[1:(narrays-1)]) + sum1minush[narrays] - crossprod(K[,-narrays])/out$df.residual #(narrays-nparams)
- R <- chol(Zinfo)
- Zinfoinv <- chol2inv(R)
- gammas.iter <- Zinfoinv%*%Zzd
- arraygammas <- arraygammas + gammas.iter
-# arrayw <- drop(exp(Z %*% (-arraygammas)))
- x2 <- crossprod(Zzd, gammas.iter) / narrays
-
- if(trace)
- cat("Iter =", iter, " X2 =", x2, " Array gammas", arraygammas, "\n")
+ h <- rowSums(Q^2)
+ zd[obs] <- zd[obs] + d[obs]/s2 - 1 + h
+ sum1minush[obs,1] <- sum1minush[obs,1] + 1-h
+ K[i,][obs] <- as.vector(h[n]-h)
+# dev <- dev + sum(d[obs]/vary) + sum(log(vary)) + const + 2 * log(prod(abs(diag(out$qr$qr))))
+ }
+ }
+ Zzd <- crossprod(Z, zd)
+# Zinfo <- diag(sum1minush[1:(narrays-1)]) + sum1minush[narrays] - crossprod(K[,-narrays])/out$df.residual #(narrays-nparams)
+ Zinfo <- diag(sum1minush[1:(length(arraygammas))]) + sum1minush[length(arraygammas)] - crossprod(K[,])[1:length(arraygammas),1:length(arraygammas)]/out$df.residual
+
+ R <- chol(Zinfo)
+ Zinfoinv <- chol2inv(R)
+ gammas.iter <- Zinfoinv%*%Zzd
+ arraygammas <- arraygammas + gammas.iter
+# arrayw <- drop(exp(Z %*% (-arraygammas)))
+ x2 <- crossprod(Zzd, gammas.iter) / narrays
+
+ if(trace)
+ cat("Iter =", iter, " X2 =", x2, " Array gammas", arraygammas, "\n")
if(!all(is.finite(arraygammas)))
- stop("convergence problem at iteration ", iter, ": array weights not estimable")
-
-# if (dev < devold - 1e-50)
-# break
-
- if (x2 < tol)
- break
-
- if (iter == maxiter) {
- warning("Maximum iterations ", maxiter, " reached", sep="")
- break
- }
- }
-
- })
-# matrix(rep(1/exp(Z%*%arraygammas), each=ngenes), ngenes, narrays)
- drop(exp(Z %*% (-arraygammas)))
-}
+ stop("convergence problem at iteration ", iter, ": array weights not estimable")
+
+# if (dev < devold - 1e-50)
+# break
+
+ if (x2 < tol)
+ break
+
+ if (iter == maxiter) {
+ warning("Maximum iterations ", maxiter, " reached", sep="")
+ break
+ }
+ }
+
+ })
+# matrix(rep(1/exp(Z%*%arraygammas), each=ngenes), ngenes, narrays)
+ drop(exp(Z %*% (-arraygammas)))
+}
+
+voomWithQualityWeights <- function(counts, design=NULL, lib.size=NULL, normalize.method="none",
+ plot=FALSE, span=0.5, var.design=NULL, method="genebygene", maxiter=50,
+ tol=1e-10, trace=FALSE, replace.weights=TRUE, ...)
+# Combine voom weights with sample-specific weights estimated by arrayWeights() function for RNA-seq data
+# Matt Ritchie and Cynthia Liu, 22 Sept 2014.
+# Last modified 7 Oct 2014.
+{
+ if(plot) {
+ oldpar <- par(mfrow=c(1,2))
+ on.exit(par(oldpar))
+ }
+ v <- voom(counts, design=design, lib.size=lib.size, normalize.method=normalize.method, plot=FALSE, span=span, ...)
+ aw <- arrayWeights(v, design=design, method=method, maxiter=maxiter, tol=tol, var.design=var.design)
+ v <- voom(counts, design=design, weights=aw, lib.size=lib.size, normalize.method=normalize.method, plot=plot, span=span, ...)
+ aw <- arrayWeights(v, design=design, method=method, maxiter=maxiter, tol=tol, trace=trace, var.design=var.design)
+ wts <- asMatrixWeights(aw, dim(v))*v$weights
+ attr(wts, "arrayweights") <- NULL
+ if(plot) {
+ barplot(aw, names=seq(1:length(aw)), main="Sample-specific weights", ylab="Weight", xlab="Sample")
+ abline(h=1, col=2, lty=2)
+ }
+ if(replace.weights) {
+ v$weights <- wts
+ return(v)
+ } else {
+ return(wts)
+ }
+}
diff --git a/R/barcodeplot.R b/R/barcodeplot.R
index a222e43..f1b7d20 100644
--- a/R/barcodeplot.R
+++ b/R/barcodeplot.R
@@ -1,24 +1,119 @@
## BARCODEPLOT.R
-barcodeplot <- function (statistics, index, index2 = NULL, labels = c("Up", "Down"), quantiles = c(-1, 1), col.bars = NULL, worm=TRUE, span.worm=0.45, ...)
+barcodeplot <- function (statistics, index = NULL, index2 = NULL, gene.weights = NULL, weights.label = "Weight", labels = c("Up", "Down"), quantiles = c(-1, 1), col.bars = NULL, worm = TRUE, span.worm = 0.45,...)
# Barcode plot of one or two gene sets.
# Gordon Smyth, Di Wu and Yifang Hu
-# 20 October 2008. Last revised 4 Feb 2014.
+# 20 October 2008. Last revised 7 October 2014.
{
+# Check statistics
+ if(!is.vector(statistics, mode = "numeric")) stop("statistics should be a numeric vector")
+ nstat <- length(statistics)
+
+# Check index
+ if(is.null(index)) {
+ if(is.null(gene.weights)) {
+ stop("Must specify at least one of index or gene.weights")
+ } else {
+ if(length(gene.weights) == nstat) {
+ index <- rep_len(TRUE, nstat)
+ index2 <- NULL
+ } else {
+ stop("No index and length(gene.weights) doesn't equal length(statistics)")
+ }
+ }
+ } else {
+ if(any(is.na(index))) stop("Need to provide index without NAs")
+ if(is.logical(index)) if(length(index) != nstat) stop("Length of index disagrees with statistics")
+ if(length(index) > nstat) stop("Length of index disagrees with statistics")
+ }
+
+# Check index2
+ if(!is.null(index2)) {
+ if(!is.null(gene.weights)) warning("gene.weights ignored")
+ gene.weights <- statistics
+ gene.weights[] <- 0
+ gene.weights[index] <- 1
+ gene.weights[index2] <- -1
+ index <- rep_len(TRUE, nstat)
+ index2 <- NULL
+ }
+
+# Check gene.weights
+ if(!is.null(gene.weights)){
+
+ if(!is.vector(gene.weights, mode = "numeric")) stop("gene.weights should be a numeric vector")
+ if(any(is.na(gene.weights))) stop("Need to provide gene.weights without NAs")
+ if(all(gene.weights == 0)) stop("gene.weights equal to zero: no selected genes to plot")
+ if(length(gene.weights) != length(statistics[index])) stop("Length of gene.weights disagrees with size of set")
+
+ one <- all(gene.weights >= 0) | all(gene.weights <= 0)
+
+ if(one){
+
+ index2 <- NULL
+ gene.weights1 <- rep_len(0, nstat)
+ names(gene.weights1) <- names(statistics)
+ gene.weights1[index] <- gene.weights
+
+ index <- rep_len(FALSE, nstat)
+ names(index) <- names(statistics)
+ index[gene.weights1 != 0] <- TRUE
+
+ gene.weights1 <- gene.weights1[gene.weights1 != 0]
+ gene.weights <- gene.weights1
+
+ } else {
+
+ gene.weights12 <- rep_len(0, nstat)
+ names(gene.weights12) <- names(statistics)
+ gene.weights12[index] <- gene.weights
+
+ index <- index2 <- rep_len(FALSE, nstat)
+ names(index) <- names(index2) <- names(statistics)
+ index[gene.weights12 > 0] <- TRUE
+ index2[gene.weights12 < 0] <- TRUE
+
+ gene.weights1 <- gene.weights12[gene.weights12 > 0]
+ gene.weights2 <- gene.weights12[gene.weights12 < 0]
+
+ gene.weights <- gene.weights1
+
+ }
+
+ }
+
# Are there up and down sets?
TWO <- !is.null(index2)
-# Convert indexes to logical
- set1 <- set2 <- rep.int(FALSE,length(statistics))
- names(set1) <- names(set2) <- names(statistics)
- set1[index] <- TRUE
- if(TWO) set2[index2] <- TRUE
+# Convert indexes to logical and add gene.weights
+ set1 <- set2 <- data.frame(idx = rep.int(FALSE,nstat), weight = NA, wt = NA)
+ rownames(set1) <- rownames(set2) <- names(statistics)
+ set1$idx[index] <- TRUE
+ if(TWO) set2$idx[index2] <- TRUE
+
+ if(length(gene.weights)) {
+
+ set1$weight <- 0
+ set1$weight[index]<- gene.weights
+ set1$wt <- abs(set1$weight)/sum(abs(set1$weight))
+
+ if(TWO) {
+
+ set2$weight <- 0
+ set2$weight[index2] <- gene.weights2
+
+ SUM <- sum(abs(set1$weight), abs(set2$weight))
+ set1$wt <- abs(set1$weight)/SUM
+ set2$wt <- abs(set2$weight)/SUM
+
+ }
+ }
# Sort statistics and indexes
ostat <- order(statistics, na.last = TRUE, decreasing=TRUE)
statistics <- statistics[ostat]
- set1 <- set1[ostat]
- if(TWO) set2 <- set2[ostat]
+ set1 <- set1[ostat,]
+ if(TWO) set2 <- set2[ostat,]
# Trim off missing values
n <- sum(!is.na(statistics))
@@ -26,16 +121,17 @@ barcodeplot <- function (statistics, index, index2 = NULL, labels = c("Up", "Dow
message("No valid statistics")
return(invisible())
}
- if (n < length(statistics)) {
+ if (n < nstat) {
statistics <- statistics[1:n]
- set1 <- set1[1:n]
- if (TWO) set2 <- set2[1:n]
+ set1 <- set1[1:n,]
+ if (TWO) set2 <- set2[1:n,]
}
# Convert indexes to integer
- r <- which(set1)
+ r <- which(set1$idx)
+
if (TWO) {
- r2 <- which(set2)
+ r2 <- which(set2$idx)
if(!length(r2)) TWO <- FALSE
}
@@ -45,118 +141,272 @@ barcodeplot <- function (statistics, index, index2 = NULL, labels = c("Up", "Dow
r <- r2
set1 <- set2
TWO <- FALSE
- message("Using index2 as primary set")
+
} else {
message("No selected genes to plot")
return(invisible())
}
+# Are there unequal weights?
+ WTS <- FALSE
+ wt1 <- set1$wt[r]
+ len.up <- 1
+
+ if(all(!is.na(wt1))){
+
+ len.up <- set1$weight[r]/max(abs(set1$weight[r]))
+
+ anydifferent <- function(x) {
+ if(length(x) < 2) return(FALSE)
+ r <- range(x)
+ (r[2] > r[1])
+ }
+
+ if(!TWO) if(anydifferent(wt1)) WTS <- TRUE
+
+ if(TWO){
+ wt12 <- c(set1$wt[r], abs(set2$wt[r2]))
+ if(anydifferent(wt12)) WTS <- TRUE
+
+ max.wt <- max(set1$wt[r], set2$wt[r2])
+ len.up <- set1$wt[r]/max.wt
+ len.down <- set2$wt[r2]/max.wt
+ }
+
+ }
+
+ pos.dir <- all(len.up > 0)
+
+# Plot setting
+ if(WTS) shift <- 0.1 else shift <- 0
+
# Check other arguments
quantiles <- sort(quantiles)
- if (is.null(col.bars))
- if (TWO)
- col.bars = c("red", "blue")
- else
- col.bars = c("black", "black")
+
+# color settting
+ ALP <- 0.4
+
+ if (is.null(col.bars)) {
+
+ if (TWO) {
+
+ col.bars <- c("red", "blue")
+ if(WTS) col.bars.alpha <- c(rgb(1,0,0,alpha=ALP), rgb(0,0,1,alpha=ALP))
+ else col.bars.alpha <- col.bars
+
+ } else {
+
+ col.bars <- "black"
+ if(WTS) col.bars.alpha <- rgb(0,0,0,alpha=ALP)
+ else col.bars.alpha <- col.bars
+
+ }
+
+ } else {
+
+ if(TWO) {
+
+ if(length(col.bars) == 1) col.bars <- rep(col.bars, 2)
+ RGB <- col2rgb(col.bars)/255
+ red <- RGB[1,1]
+ green <- RGB[2,1]
+ blue <- RGB[3,1]
+ red2 <- RGB[1,2]
+ green2 <- RGB[2,2]
+ blue2 <- RGB[3,2]
+ if(WTS) col.bars.alpha <- c(rgb(red, green, blue, alpha=ALP), rgb(red2, green2, blue2, alpha=ALP))
+ else col.bars.alpha <- col.bars
+
+ } else {
+
+ RGB <- col2rgb(col.bars)/255
+ red <- RGB[1,1]
+ green <- RGB[2,1]
+ blue <- RGB[3,1]
+ if(WTS) col.bars.alpha <- rgb(red, green, blue, alpha=ALP)
+ else col.bars.alpha <- col.bars
+
+ }
+
+ }
# worm plot setting
ylim.worm <- ylim <- c(-1, 1)
ylab.worm <- ""
xlab.worm <- "statistics"
-
+
if(!TWO) ylim.worm <- c(0, 1)
-
+
if(worm) {
ylim.worm <- c(-2.1, 2.1)
if(!TWO) ylim.worm <- c(0, 2.1)
}
-
+
ylim[2] <- ylim[2] + 0.5
if (TWO) ylim[1] <- ylim[1] - 0.5
-
- plot(1:n, xlim = c(0, n), ylim = ylim.worm, type = "n", axes = FALSE, xlab = xlab.worm, ylab = ylab.worm, ...)
-
+
+ if(TWO) plot(1:n, xlim=c(0,n), ylim=c(ylim.worm[1]-shift, ylim.worm[2]+shift), type="n", axes=FALSE, xlab=xlab.worm, ylab=ylab.worm, ...)
+ if(!TWO) plot(1:n, xlim=c(0,n), ylim=c(ylim.worm[1]-shift*(!pos.dir), ylim.worm[2]+shift*pos.dir), type="n", axes=FALSE, xlab=xlab.worm,ylab=ylab.worm, ...)
+
npos <- sum(statistics > quantiles[2])
nneg <- sum(statistics < quantiles[1])
- rect.yb <- -0.5
- if(!TWO) rect.yb <- 0
-
- rect(npos + 0.5, rect.yb, n - nneg + 0.5, 0.5, col = "lightgray", border = NA)
- if (npos) rect(0.5, rect.yb, npos + 0.5, 0.5, col = "pink", border = NA)
- if (nneg) rect(n - nneg + 0.5, rect.yb, n + 0.5, 0.5, col = "lightblue", border = NA)
-
lwd <- 50/length(r)
- lwd <- min(2, lwd)
- lwd <- max(0.1, lwd)
+ lwd <- min(1.9, lwd)
+ lwd <- max(0.2, lwd)
+
+ if(TWO){
+ lwd2 <- 50/length(r2)
+ lwd2 <- min(1.9, lwd2)
+ lwd2 <- max(0.2, lwd2)
+
+ lwd <- lwd2 <- min(lwd, lwd2)
+ }
+
+ barlim <- ylim[2] - c(1.5, 0.5)
+
+ if(!pos.dir) {
+
+ rect.yb <- 0.5
+ rect.yt <- 1
+
+ rect(npos + 0.5, rect.yb, n - nneg + 0.5, rect.yt, col = "lightgray", border = NA)
+ if (npos) rect(0.5, rect.yb, npos + 0.5, rect.yt, col = "pink", border = NA)
+ if (nneg) rect(n - nneg + 0.5, rect.yb, n + 0.5, rect.yt, col = "lightblue", border = NA)
+
+ segments(r, barlim[2]/2, r, barlim[2], lwd = lwd, col = col.bars.alpha[1])
+ segments(r, barlim[2]/2-shift, r, barlim[2]/2*(1+len.up)-shift, lwd = lwd, col = col.bars[1])
+ }
- barlim <- ylim[2] - c(1.5, 0.5)
- segments(r, barlim[1], r, barlim[2], lwd = lwd, col = col.bars[1])
+ if(pos.dir) {
+
+ rect.yb <- -0.5
+ if(!TWO) rect.yb <- 0
+ rect.yt <- 0.5
+
+ rect(npos + 0.5, rect.yb, n - nneg + 0.5, rect.yt, col = "lightgray", border = NA)
+ if (npos) rect(0.5, rect.yb, npos + 0.5, rect.yt, col = "pink", border = NA)
+ if (nneg) rect(n - nneg + 0.5, rect.yb, n + 0.5, rect.yt, col = "lightblue", border = NA)
+
+ segments(r, barlim[1], r, barlim[2]/2, lwd = lwd, col = col.bars.alpha[1])
+ segments(r, barlim[2]/2+shift, r, barlim[2]/2*(1+len.up)+shift, lwd = lwd, col = col.bars[1])
+ }
if(TWO) {
- lwd2 <- 50/length(r2)
- lwd2 <- min(2, lwd2)
- lwd2 <- max(0.1, lwd2)
+
barlim2 <- ylim[1] + c(0.5, 1.5)
- segments(r2, barlim2[1], r2, barlim2[2], lwd = lwd2, col = col.bars[2])
+
+ segments(r2, barlim2[1]/2, r2, barlim2[2], lwd = lwd2, col = col.bars.alpha[2])
+ segments(r2, barlim2[1]/2*(1+len.down)-shift, r2, barlim2[1]/2-shift, lwd = lwd2, col = col.bars[2])
}
-
+
lab.at <- 0
if(!TWO) lab.at <- 0.5
- axis(side = 2, at = lab.at, padj = 3, cex.axis = 0.85, labels = labels[1], tick = FALSE)
- axis(side = 4, at = lab.at, padj = -3, cex.axis = 0.85, labels = labels[2], tick = FALSE)
-
+ axis(side = 2, at = lab.at, padj = 3.8, cex.axis = 0.85, labels = labels[1], tick = FALSE)
+ axis(side = 4, at = lab.at, padj = -3.8, cex.axis = 0.85, labels = labels[2], tick = FALSE)
+
# label statistics on x axis
prob <- (10:0)/10
axis(at = seq(1,n,len=11), side = 1, cex.axis = 0.7, las = 2, labels = format(quantile(statistics, p = prob), digits = 1))
# create worm
if(worm) {
+
# rescale x to new range
rescale <- function(x, newrange, oldrange=range(x)) {
newrange[1] + (x-oldrange[1]) / (oldrange[2]-oldrange[1]) * (newrange[2] - newrange[1])
}
# calculate enrichment
- ave.enrich1 <- length(r)/n
- worm1 <- tricubeMovingAverage(set1,span=span.worm)/ave.enrich1
-
- if(TWO) {
- ave.enrich2 <- length(r2)/n
- worm2 <- tricubeMovingAverage(-set2,span=span.worm)/ave.enrich2
+ if(!WTS){
+
+ ave.enrich1 <- length(r)/n
+ worm1 <- tricubeMovingAverage(set1$idx, span = span.worm)/ave.enrich1
+ if(TWO) {
+ ave.enrich2 <- length(r2)/n
+ worm2 <- tricubeMovingAverage(-set2$idx, span = span.worm)/ave.enrich2
+ }
+ }
+
+ # calculate weighted enrichment
+ if(WTS){
+
+ ave.enrich1 <- mean(set1$wt)
+ worm1 <- tricubeMovingAverage(set1$wt, span = span.worm)/ave.enrich1
+
+ if(TWO) {
+ ave.enrich2 <- mean(set2$wt)
+ worm2 <- tricubeMovingAverage(-set2$wt, span = span.worm)/ave.enrich2
+ }
}
-
+
# rescale worm
max.worm1 <- max(worm1)
r.worm1 <- c(0,max.worm1)
- worm1.scale <- rescale(worm1, newrange=c(1.1,2.1), oldrange=r.worm1)
-
+ worm1.scale <- rescale(worm1, newrange=c(1.1+shift*pos.dir,2.1+shift*pos.dir), oldrange=r.worm1)
+
if(TWO) {
min.worm2 <- min(worm2)
r.worm2 <- c(min.worm2,0)
- worm2.scale <- rescale(worm2, newrange=c(-2.1,-1.1), oldrange=r.worm2)
+ worm2.scale <- rescale(worm2, newrange=c(-2.1-shift,-1.1-shift), oldrange=r.worm2)
}
# plot worms
-
if(!TWO) {
+
lines(x = 1:n, y = worm1.scale, col = col.bars[1], lwd = 2)
- abline(h = rescale(1,newrange=c(1.1,2.1),oldrange=r.worm1), lty=2)
- axis(side = 2, at = c(1.1, 2.1), cex.axis = 0.8, labels = c(0, format(max.worm1, digits = 2)))
- axis(side = 2, labels = "Enrichment", at = 1.6, padj = -0.6, tick = FALSE, cex.axis = 0.8)
+ abline(h = rescale(1,newrange=c(1.1+shift*pos.dir,2.1+shift*pos.dir),oldrange=r.worm1), lty=2)
+ axis(side = 2, at = c(1.1+shift*pos.dir, 2.1+shift*pos.dir), cex.axis = 0.8, labels = c(0, format(max.worm1, digits = 2)))
+ axis(side = 2, labels = "Enrichment", at = 1.6+shift*pos.dir, padj = -0.6, tick = FALSE, cex.axis = 0.8)
+
}
-
+
if(TWO) {
+
lines(x = 1:n, y = worm1.scale, col = col.bars[1], lwd = 2)
- abline(h = rescale(1,newrange=c(1.1,2.1),oldrange=r.worm1), lty=2)
+ abline(h = rescale(1,newrange=c(1.1+shift,2.1+shift),oldrange=r.worm1), lty=2)
+
lines(x = 1:n, y = worm2.scale, col = col.bars[2], lwd = 2)
- abline(h = rescale(-1,newrange=c(-2.1,-1.1),oldrange=r.worm2), lty=2)
- axis(side = 2, at = c(1.1, 2.1), cex.axis = 0.7, labels = c(0, format(max.worm1, digits = 2)))
- axis(side = 2, at = c(-1.1, -2.1), cex.axis = 0.7, labels = c(0, format(-min.worm2, digits = 2)))
- axis(side = 2, labels = "Enrichment", at = -1.6, tick = FALSE, padj = -0.6, cex.axis = 0.7)
- axis(side = 2, labels = "Enrichment", at = 1.6, tick = FALSE, padj = -0.6, cex.axis = 0.7)
+ abline(h = rescale(-1,newrange=c(-2.1-shift,-1.1-shift),oldrange=r.worm2), lty=2)
+
+ axis(side = 2, at = c(1.1+shift, 2.1+shift), cex.axis = 0.7, labels = c(0, format(max.worm1, digits = 2)))
+ axis(side = 2, at = c(-1.1-shift, -2.1-shift), cex.axis = 0.7, labels = c(0, format(-min.worm2, digits = 2)))
+
+ axis(side = 2, labels = "Enrichment", at = 1.6+shift, tick = FALSE, padj = -0.6, cex.axis = 0.7)
+ axis(side = 2, labels = "Enrichment", at = -1.6-shift, tick = FALSE, padj = -0.6, cex.axis = 0.7)
}
}
+ # add gene.weights axis
+ if(WTS){
+
+ if(!TWO){
+
+ if(pos.dir){
+
+ axis(side = 2, at = c(0.5+shift, 1+shift), cex.axis = 0.48, padj = 1.6, labels = c(0, format(max(set1$weight[r]), digits = 2)))
+ axis(side = 2, labels = weights.label[1], at = 0.75+shift, padj = 1, tick = FALSE, cex.axis = 0.5)
+
+ }
+
+ if(!pos.dir){
+
+ axis(side = 2, at = c(0-shift, 0.5-shift), cex.axis = 0.48, padj = 1.6, labels = c(format(min(set1$weight[r]), digits = 2), 0))
+ axis(side = 2, labels = weights.label[1], at = 0.25-shift, padj = 1, tick = FALSE, cex.axis = 0.5)
+
+ }
+ }
+
+ if(TWO){
+
+ max.weight <- max(set1$weight[r], abs(set2$weight[r2]))
+ axis(side = 2, at = c(0.5+shift, 1+shift), cex.axis = 0.43, labels = c(0, format(max.weight, digits = 2, scientific = FALSE)), padj = 1.6)
+ axis(side = 2, labels = weights.label[1], at = 0.75+shift, padj = 1, tick = FALSE, cex.axis = 0.46)
+ axis(side = 2, at = c(-0.5-shift, -1-shift), cex.axis = 0.43, labels = c(0, format(-max.weight, digits = 2, scientific = FALSE)), padj = 1.6)
+ axis(side = 2, labels = weights.label[1], at = -0.75-shift, padj = 1, tick = FALSE, cex.axis = 0.46)
+ }
+
+ }
+
invisible()
}
diff --git a/R/diffSplice.R b/R/diffSplice.R
index c2981a5..cc1b1ca 100644
--- a/R/diffSplice.R
+++ b/R/diffSplice.R
@@ -1,8 +1,8 @@
-diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
+diffSplice <- function(fit,geneid,exonid=NULL,robust=FALSE,verbose=TRUE)
# Test for splicing variants between conditions
# using linear model fit of exon data.
-# Charity Law and Gordon Smyth
-# Created 13 Dec 2013. Last modified 18 Feb 2014.
+# Gordon Smyth and Charity Law
+# Created 13 Dec 2013. Last modified 22 Sep 2014.
{
exon.genes <- fit$genes
if(is.null(exon.genes)) exon.genes <- data.frame(ExonID=1:nrow(fit))
@@ -51,7 +51,7 @@ diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
}
# Posterior genewise variances
- squeeze <- squeezeVar(var=gene.s2, df=gene.df.residual)
+ squeeze <- squeezeVar(var=gene.s2, df=gene.df.residual, robust=robust)
# Remove genes with only 1 exon
gene.keep <- gene.nexons>1
@@ -68,6 +68,7 @@ diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
gene.nexons <- gene.nexons[gene.keep]
gene.df.test <- gene.nexons-1
gene.df.residual <- gene.df.residual[gene.keep]
+ if(robust) squeeze$df.prior <- squeeze$df.prior[gene.keep]
gene.df.total <- gene.df.residual+squeeze$df.prior
gene.df.total <- pmin(gene.df.total,sum(gene.df.residual))
gene.s2.post <- squeeze$var.post[gene.keep]
@@ -106,105 +107,135 @@ diffSplice <- function(fit,geneid,exonid=NULL,verbose=TRUE)
out$gene.F.p.value <- gene.F.p.value
# Which columns of exon.genes contain gene level annotation?
- exon.lastexon <- cumsum(gene.nexons)
- exon.firstexon <- exon.lastexon-gene.nexons+1
+ gene.lastexon <- cumsum(gene.nexons)
+ gene.firstexon <- gene.lastexon-gene.nexons+1
no <- logical(nrow(exon.genes))
- isdup <- vapply(exon.genes,duplicated,no)[-exon.firstexon,,drop=FALSE]
+ isdup <- vapply(exon.genes,duplicated,no)[-gene.firstexon,,drop=FALSE]
isgenelevel <- apply(isdup,2,all)
- out$gene.genes <- exon.genes[exon.lastexon,isgenelevel, drop=FALSE]
+ out$gene.genes <- exon.genes[gene.lastexon,isgenelevel, drop=FALSE]
out$gene.genes$NExons <- gene.nexons
+ out$gene.firstexon <- gene.firstexon
+ out$gene.lastexon <- gene.lastexon
+
+# Simes adjustment of exon level p-values
+ penalty <- rep_len(1L,length(g))
+ penalty[gene.lastexon] <- 1L-gene.nexons
+ penalty <- cumsum(penalty)[-gene.lastexon]
+ penalty <- penalty / rep(gene.nexons-1L,gene.nexons-1L)
+ g2 <- g[-gene.lastexon]
+
+ out$gene.simes.p.value <- gene.F.p.value
+ for (j in 1:ncol(fit)) {
+ o <- order(g,exon.p.value[,j])
+ p.adj <- pmin(exon.p.value[o,j][-gene.lastexon] / penalty, 1)
+ o <- order(g2,p.adj)
+ out$gene.simes.p.value[,j] <- p.adj[o][gene.firstexon-0L:(ngenes-1L)]
+ }
out
}
-topSplice <- function(fit, coef=ncol(fit), level="exon", number=10, FDR=1)
-# Collate voomex results in data.frame, ordered from most significant at top
+topSplice <- function(fit, coef=ncol(fit), test="simes", number=10, FDR=1)
+# Collate diffSplice results into data.frame, ordered from most significant at top
# Gordon Smyth
-# Created 18 Dec 2013. Last modified 17 March 2014.
+# Created 18 Dec 2013. Last modified 18 Aug 2014.
{
coef <- coef[1]
- exon <- match.arg(level,c("exon","gene"))
- if(level=="exon") {
+ test <- match.arg(test,c("simes","F","f","t"))
+ if(test=="f") test <- "F"
+ switch(test,
+ "t" = {
number <- min(number,nrow(fit$coefficients))
P <- fit$p.value[,coef]
BH <- p.adjust(P, method="BH")
if(FDR<1) number <- min(number,sum(BH<FDR))
o <- order(P)[1:number]
data.frame(fit$genes[o,,drop=FALSE],logFC=fit$coefficients[o,coef],t=fit$t[o,coef],P.Value=P[o],FDR=BH[o])
- } else {
+ },
+ F = {
number <- min(number,nrow(fit$gene.F))
P <- fit$gene.F.p.value[,coef]
BH <- p.adjust(P, method="BH")
if(FDR<1) number <- min(number,sum(BH<FDR))
o <- order(P)[1:number]
data.frame(fit$gene.genes[o,,drop=FALSE],F=fit$gene.F[o,coef],P.Value=P[o],FDR=BH[o])
+ },
+ simes = {
+ number <- min(number,nrow(fit$gene.F))
+ P <- fit$gene.simes.p.value[,coef]
+ BH <- p.adjust(P, method="BH")
+ if(FDR<1) number <- min(number,sum(BH<FDR))
+ o <- order(P)[1:number]
+ data.frame(fit$gene.genes[o,,drop=FALSE],P.Value=P[o],FDR=BH[o])
}
+ )
}
-plotSplice <- function(fit, coef=ncol(fit), geneid=NULL, rank=1L, FDR = 0.05)
-# Plot exons of most differentially spliced gene
+plotSplice <- function(fit, coef=ncol(fit), geneid=NULL, genecolname=NULL, rank=1L, FDR = 0.05)
+# Plot exons of chosen gene
+# fit is output from diffSplice
# Gordon Smyth and Yifang Hu
-# Created 3 Jan 2014. Last modified 19 March 2014.
+# Created 3 Jan 2014. Last modified 7 October 2014.
{
- # Gene labelling including gene symbol
- genecolname <- fit$genecolname
- genelab <- grep(paste0(genecolname,"|Symbol|symbol"), colnames(fit$gene.genes), value = T)
-
+ if(is.null(genecolname))
+ genecolname <- fit$genecolname
+ else
+ genecolname <- as.character(genecolname)
+
if(is.null(geneid)) {
+# Find gene from specified rank
if(rank==1L)
i <- which.min(fit$gene.F.p.value[,coef])
else
i <- order(fit$gene.F.p.value[,coef])[rank]
-
- geneid <- paste(fit$gene.genes[i,genelab], collapse = ".")
-
+ geneid <- paste(fit$gene.genes[i,genecolname], collapse = ".")
} else {
- i <- which(fit$gene.genes[,fit$genecolname]==geneid)
-
- geneid <- paste(fit$gene.genes[i,genelab], collapse = ".")
-
+# Find gene from specified name
+ geneid <- as.character(geneid)
+ i <- which(fit$gene.genes[,genecolname]==geneid)[1]
if(!length(i)) stop(paste("geneid",geneid,"not found"))
}
- exon.lastexon <- cumsum(fit$gene.genes$NExons[1:i])
- j <- (exon.lastexon[i-1]+1):exon.lastexon[i]
+
+# Row numbers containing exons
+ j <- fit$gene.firstexon[i]:fit$gene.lastexon[i]
exoncolname <- fit$exoncolname
- if(is.null(exoncolname)){
+# Get strand if possible
+ strcol <- grepl("strand", colnames(fit$gene.genes), ignore.case = TRUE)
+ if(any(strcol)) geneid <- paste0(geneid, " (", as.character(fit$gene.genes[i, strcol])[1], ")")
- plot(fit$coefficients[j,coef], xlab = "Exon", ylab = "logFC (this exon vs rest)", main = geneid, type = "b")
+ if(is.null(exoncolname)) {
- }
+ plot(fit$coefficients[j,coef], xlab = "Exon", ylab = "logFC (this exon vs rest)", main = geneid, type = "b")
- # Plot exons and mark exon ids on the axis
- if(!is.null(exoncolname)) {
+ } else {
exon.id <- fit$genes[j, exoncolname]
xlab <- paste("Exon", exoncolname, sep = " ")
plot(fit$coefficients[j, coef], xlab = "", ylab = "logFC (this exon vs rest)", main = geneid,type = "b", xaxt = "n")
- axis(1, at = 1:length(j), labels = exon.id, las = 2, cex.axis = 0.6)
+ axis(1, at = 1:length(j), labels = exon.id, las = 2, cex.axis = 0.5)
mtext(xlab, side = 1, padj = 5.2)
- # Mark the topSpliced exons
- top <- topSplice(fit, coef = coef, number = Inf, level = "exon", FDR = FDR)
+# Mark the topSpliced exons
+ top <- topSplice(fit, coef = coef, number = Inf, test = "t", FDR = FDR)
m <- which(top[,genecolname] %in% fit$gene.genes[i,genecolname])
-
if(length(m) > 0){
-
- if(length(m) == 1) cex <- 1.5 else{
-
+ if(length(m) == 1)
+ cex <- 1.5
+ else {
abs.fdr <- abs(log10(top$FDR[m]))
from <- range(abs.fdr)
to <- c(1,2)
cex <- (abs.fdr - from[1])/diff(from) * diff(to) + to[1]
-
}
mark <- match(top[m, exoncolname], exon.id)
points((1:length(j))[mark], fit$coefficients[j[mark], coef], col = "red", pch = 16, cex = cex)
}
+
}
abline(h=0,lty=2)
-
-}
\ No newline at end of file
+ invisible()
+}
diff --git a/R/geneset-roast.R b/R/geneset-roast.R
index 5f1e78c..64383c8 100644
--- a/R/geneset-roast.R
+++ b/R/geneset-roast.R
@@ -16,7 +16,7 @@ roast <- function(y,...) UseMethod("roast")
roast.default <- function(y,index=NULL,design=NULL,contrast=ncol(design),set.statistic="mean",gene.weights=NULL,array.weights=NULL,weights=NULL,block=NULL,correlation,var.prior=NULL,df.prior=NULL,trend.var=FALSE,nrot=999,approx.zscore=TRUE,...)
# Rotation gene set testing for linear models
# Gordon Smyth and Di Wu
-# Created 24 Apr 2008. Last modified 18 June 2014.
+# Created 24 Apr 2008. Last modified 19 Sep 2014.
{
# Check index
if(is.list(index)) return(mroast(y=y,index=index,design=design,contrast=contrast,set.statistic=set.statistic,gene.weights=gene.weights,array.weights=array.weights,weights=weights,block=block,correlation=correlation,var.prior=var.prior,df.prior=df.prior,trend.var=trend.var,nrot=nrot))
@@ -97,10 +97,18 @@ roast.default <- function(y,index=NULL,design=NULL,contrast=ncol(design),set.sta
}
# Check contrast
+ if(is.character(contrast)) {
+ if(length(contrast)>1L) {
+ warning("using only first entry for contrast")
+ contrast <- contrast[1]
+ }
+ contrast <- which(contrast==colnames(design))
+ if(length(contrast)==0L) stop("coef ",contrast," not found")
+ }
if(all(contrast == 0)) stop("contrast all zero")
# Reform design matrix so that contrast is last coefficient
- if(length(contrast) == 1) {
+ if(length(contrast) == 1L) {
contrast <- as.integer(contrast)
if(contrast < p)
X <- cbind(design[,-contrast,drop=FALSE],design[,contrast,drop=FALSE])
diff --git a/R/geneset-romer.R b/R/geneset-romer.R
index 19f178d..a258740 100644
--- a/R/geneset-romer.R
+++ b/R/geneset-romer.R
@@ -1,12 +1,12 @@
## ROMER.R
-symbols2indices <- function(gene.sets, symbols, remove.empty=TRUE)
-# Make a list of gene sets symbols into a list of gene sets indices used to create input for romer
+ids2indices <- function(gene.sets, identifiers, remove.empty=TRUE)
+# Make a list of gene identifiers into a list of indices for gene sets
# Gordon Smyth and Yifang Hu
-# 25 March 2009. Last modified 21 March 2011.
+# 25 March 2009. Last modified 19 June 2014.
{
gene.sets <- as.list(gene.sets)
- index <- lapply(gene.sets, function(x) which(symbols %in% x))
+ index <- lapply(gene.sets, function(x) which(identifiers %in% x))
if(remove.empty)
for (i in length(index):1) if(!length(index[[i]])) index[[i]] <- NULL
index
diff --git a/R/goana.R b/R/goana.R
new file mode 100644
index 0000000..851d275
--- /dev/null
+++ b/R/goana.R
@@ -0,0 +1,236 @@
+goana <- function(de,...) UseMethod("goana")
+
+goana.MArrayLM <- function(de, coef = ncol(de), geneid = rownames(de), FDR = 0.05, species = "Hs", trend = FALSE, ...)
+# Gene ontology analysis of DE genes from linear model fit
+# Gordon Smyth and Yifang Hu
+# Created 20 June 2014. Last modified 23 July 2014.
+{
+ # Check fit
+ if(is.null(de$p.value)) stop("p value not found in fit object (from eBayes).")
+ if(is.null(de$coefficients)) stop("coefficient not found in fit object.")
+ if(length(coef) != 1) stop("coef length needs to be 1.")
+ ngenes <- nrow(de)
+
+ # Check geneid
+ # Can be either a vector of IDs or a column name
+ geneid <- as.character(geneid)
+ if(length(geneid) == ngenes) {
+ universe <- geneid
+ } else
+ if(length(geneid) == 1L) {
+ universe <- de$genes[[geneid]]
+ if(is.null(universe)) stop(paste("Column",geneid,"not found in de$genes"))
+ } else
+ stop("geneid has incorrect length")
+
+ # Check trend
+ # Can be logical, or a numeric vector of covariate values, or the name of the column containing the covariate values
+ if(is.logical(trend)) {
+ if(trend) {
+ covariate <- de$Amean
+ if(is.null(covariate)) stop("Amean not found in fit")
+ }
+ } else
+ if(is.numeric(trend)) {
+ if(length(trend) != ngenes) stop("If trend is numeric, then length must equal nrow(de)")
+ covariate <- trend
+ trend <- TRUE
+ } else {
+ if(is.character(trend)) {
+ if(length(trend) != 1L) stop("If trend is character, then length must be 1")
+ covariate <- de$genes[[trend]]
+ if(is.null(covariate)) stop(paste("Column",trend,"not found in de$genes"))
+ trend <- TRUE
+ } else
+ stop("trend is neither logical, numeric nor character")
+ }
+
+ # Check FDR
+ if(!is.numeric(FDR) | length(FDR) != 1) stop("FDR must be numeric and of length 1.")
+ if(FDR < 0 | FDR > 1) stop("FDR should be between 0 and 1.")
+
+ # Get up and down DE genes
+ fdr.coef <- p.adjust(de$p.value[,coef], method = "BH")
+ EG.DE.UP <- universe[fdr.coef < FDR & de$coef[,coef] > 0]
+ EG.DE.DN <- universe[fdr.coef < FDR & de$coef[,coef] < 0]
+ de.gene <- list(Up=EG.DE.UP, Down=EG.DE.DN)
+
+ # Fit monotonic cubic spline for DE genes vs. gene.weights
+ if(trend) {
+ isDE <- as.numeric(fdr.coef < FDR)
+ o <- order(covariate)
+ PW <- rep(0,nrow(de))
+ PW[o] <- tricubeMovingAverage(isDE[o],span=0.5,full.length=TRUE)
+ }
+ if(!trend) PW <- NULL
+
+ NextMethod(de = de.gene, universe = universe, species = species, prior.prob = PW, ...)
+}
+
+goana.default <- function(de, universe = NULL, species = "Hs", prior.prob = NULL, ...)
+# Gene ontology analysis of DE genes
+# Gordon Smyth and Yifang Hu
+# Created 20 June 2014. Last modified 28 August 2014.
+{
+ # Ensure de is a list
+ if(!is.list(de)) de <- list(DE = de)
+
+ # Stop if components of de are not vectors
+ if(!all(vapply(de,is.vector,TRUE))) stop("components of de should be vectors")
+
+ # Ensure gene IDs are of character mode
+ de <- lapply(de, as.character)
+
+ # Ensure all gene sets have names
+ nsets <- length(de)
+
+ names(de) <- trimWhiteSpace(names(de))
+ NAME <- names(de)
+ i <- NAME == "" | is.na(NAME)
+ NAME[i] <- paste0("DE", 1:nsets)[i]
+ names(de) <- makeUnique(NAME)
+
+ # Select species
+ species <- match.arg(species, c("Hs", "Mm", "Rn", "Dm"))
+
+ # Load package of GO terms
+ if(!suppressPackageStartupMessages(require("GO.db", character.only = TRUE))) stop("GO.db package not available")
+
+ # Load species annotation package
+ DB <- paste("org", species, "eg", "db", sep = ".")
+ if(!suppressPackageStartupMessages(require(DB, character.only = TRUE))) stop(DB,"package not available")
+
+ # Get gene-GOterm mappings, and remove duplicate entries
+ GO2ALLEGS <- paste("org", species, "egGO2ALLEGS", sep = ".")
+ if(is.null(universe)) {
+ EG.GO <- toTable(get(GO2ALLEGS))
+ d <- duplicated(EG.GO[,c("gene_id", "go_id", "Ontology")])
+ EG.GO <- EG.GO[!d, ]
+ universe <- unique(EG.GO$gene_id)
+ universe <- as.character(universe)
+ } else {
+
+ universe <- as.character(universe)
+
+ dup <- duplicated(universe)
+ if(!is.null(prior.prob)) {
+ if(length(prior.prob)!=length(universe)) stop("length(prior.prob) must equal length(universe)")
+ prior.prob <- prior.prob[!dup]
+ }
+ universe <- universe[!dup]
+
+ GO2ALLEGS <- get(GO2ALLEGS)
+ m <- match(Lkeys(GO2ALLEGS),universe,0L)
+ universe <- universe[m]
+ if(!is.null(prior.prob)) prior.prob <- prior.prob[m]
+
+ Lkeys(GO2ALLEGS) <- universe
+ EG.GO <- toTable(GO2ALLEGS)
+ d <- duplicated(EG.GO[,c("gene_id", "go_id", "Ontology")])
+ EG.GO <- EG.GO[!d, ]
+ }
+
+ Total <- length(unique(EG.GO$gene_id))
+ if(Total<1L) stop("No genes found in universe")
+
+ # Check prior probabilities
+ if(!is.null(prior.prob)) {
+ if(length(prior.prob)!=length(universe)) stop("length(prior.prob) must equal length(universe)")
+ }
+
+ # Overlap with DE genes
+ isDE <- lapply(de, function(x) EG.GO$gene_id %in% x)
+ TotalDE <- lapply(isDE, function(x) length(unique(EG.GO$gene_id[x])))
+ nDE <- length(isDE)
+
+ if(length(prior.prob)) {
+ # Probability weight for each gene
+ m <- match(EG.GO$gene_id, universe)
+ PW2 <- list(prior.prob[m])
+ X <- do.call(cbind, c(N=1, isDE, PW=PW2))
+ } else
+ X <- do.call(cbind, c(N=1, isDE))
+
+ group <- paste(EG.GO$go_id, EG.GO$Ontology, sep=".")
+ S <- rowsum(X, group=group, reorder=FALSE)
+
+ P <- matrix(0, nrow = nrow(S), ncol = nsets)
+
+ if(length(prior.prob)) {
+
+ # Calculate weight
+ require("BiasedUrn", character.only = TRUE)
+ PW.ALL <- sum(prior.prob[universe %in% EG.GO$gene_id])
+ AVE.PW <- S[,"PW"]/S[,"N"]
+ W <- AVE.PW*(Total-S[,"N"])/(PW.ALL-S[,"N"]*AVE.PW)
+
+ # Wallenius' noncentral hypergeometric test
+ for(j in 1:nsets) for(i in 1:nrow(S))
+ P[i,j] <- pWNCHypergeo(S[i,1+j], S[i,"N"], Total-S[i,"N"], TotalDE[[j]], W[i],lower.tail=FALSE) + dWNCHypergeo(S[i,1+j], S[i,"N"], Total-S[i,"N"], TotalDE[[j]], W[i])
+
+ S <- S[,-ncol(S)]
+
+ } else {
+
+ # Fisher's exact test
+ for(j in 1:nsets)
+ P[,j] <- phyper(q=S[,1+j]-0.5,m=TotalDE[[j]],n=Total-TotalDE[[j]], k=S[,"N"],lower.tail=FALSE)
+
+ }
+
+ # Assemble output
+ g <- strsplit2(rownames(S),split="\\.")
+ TERM <- select(GO.db,keys=g[,1],columns="TERM")
+ Results <- data.frame(Term = TERM[[2]], Ont = g[,2], S, P, stringsAsFactors=FALSE)
+ rownames(Results) <- g[,1]
+
+ # Name p-value columns
+ colnames(Results)[3+nsets+(1L:nsets)] <- paste0("P.", names(de))
+
+ Results
+}
+
+topGO <- function(results, ontology = c("BP", "CC", "MF"), sort = NULL, number = 20L)
+# Extract sorted results from goana output
+# Gordon Smyth and Yifang Hu
+# Created 20 June 2014. Last modified 29 August 2014.
+{
+ # Check results
+ if(!is.data.frame(results)) stop("results should be a data.frame.")
+
+ # Number of gene sets
+ nsets <- (ncol(results)-3L) %/% 2L
+ setnames <- colnames(results)[4L:(3L+nsets)]
+
+ # Check ontology
+ ontology <- match.arg(unique(ontology), c("BP", "CC", "MF"), several.ok = TRUE)
+
+ # Check sort and find p-value column
+ if(is.null(sort)) {
+ P.col <- 4L+nsets
+ } else {
+ sort <- as.character(sort[1])
+ if(sort=="up") sort="Up"
+ if(sort=="down") sort="Down"
+ sort <- paste0("^",sort,"$")
+ P.col <- grep(sort, setnames)
+ if(!length(P.col)) stop("sort column not found")
+ P.col <- 3L+nsets+P.col
+ }
+
+ # Check number
+ if(!is.numeric(number)) stop("Need to input number.")
+ if(number < 1L) return(data.frame())
+
+ # Limit results to specified ontologies
+ if(length(ontology) < 3L) {
+ sel <- results$Ont %in% ontology
+ results <- results[sel,]
+ }
+
+ # Sort by p-value
+ o <- order(results[,P.col], rownames(results))
+ if(number < length(o)) o <- o[1:number]
+
+ results[o,,drop=FALSE]
+}
diff --git a/R/lmfit.R b/R/lmfit.R
index 4ba39ab..4908841 100644
--- a/R/lmfit.R
+++ b/R/lmfit.R
@@ -390,7 +390,7 @@ getEAWP <- function(object)
# Given any microarray data object, extract basic information needed
# for linear modelling.
# Gordon Smyth
-# 9 March 2008. Last modified 26 June 2013.
+# 9 March 2008. Last modified 19 Sep 2014.
{
y <- list()
@@ -412,6 +412,7 @@ getEAWP <- function(object)
y$exprs <- exprs(object)
if(length(object at featureData@data)) y$probes <- object at featureData@data
y$Amean <- rowMeans(y$exprs,na.rm=TRUE)
+ if("weights" %in% assayDataElementNames(object)) y$weights <- assayDataElement(object,"weights")
} else {
if(is(object,"PLMset")) {
y$exprs <- object at chip.coefs
diff --git a/R/plot.R b/R/plot.R
index 252a43d..904f8ec 100644
--- a/R/plot.R
+++ b/R/plot.R
@@ -1,70 +1,64 @@
## PLOT.R
-## plot() produces MA plots on expression objects
+## plot() produces MA plots (aka mean difference plots) on expression objects
-plot.RGList <- function(x, y, array=1, xlab="A", ylab="M", main=colnames(x)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plot.RGList <- function(x, y, array=1, xlab="A", ylab="M", main=colnames(x)[array], status=x$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth
-# Created 21 March 2014
+# Created 21 March 2014. Last modified 18 Sep 2014.
{
MA <- MA.RG(x[,array])
- plot.MAList(MA,array=1,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend,zero.weights=zero.weights,...)
+ plot.MAList(x=MA,array=1,xlab=xlab,ylab=ylab,main=main,status=status,zero.weights=zero.weights,...)
}
-plot.MAList <- function(x, y, array=1, xlab="A", ylab="M", main=colnames(x)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plot.MAList <- function(x, y, array=1, xlab="A", ylab="M", main=colnames(x)[array], status=x$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth
-# Created 21 March 2014
+# Created 21 March 2014. Last modified 18 Sep 2014.
{
- MA <- x
- x <- as.matrix(MA$A)[,array]
- y <- as.matrix(MA$M)[,array]
- if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,array]
- if(missing(status)) status <- MA$genes$Status
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+ A <- as.matrix(x$A)[,array]
+ M <- as.matrix(x$M)[,array]
+ if(!zero.weights && !is.null(x$weights)) {
+ w <- as.matrix(x$weights)[,array]
+ M[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+ plotWithHighlights(x=A,y=M,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
-plot.MArrayLM <- function(x, y, coef=ncol(x), xlab="AveExpr", ylab="logFC", main=colnames(x)[coef], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plot.MArrayLM <- function(x, y, coef=ncol(x), xlab="Average log-expression", ylab="log-fold-change", main=colnames(x)[coef], status=x$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth
-# Created 21 March 2014
+# Created 21 March 2014. Last modified 18 Sep 2014.
{
- fit <- x
- if(is.null(fit$Amean)) stop("MA-plot not possible because Amean component is absent.")
- x <- fit$Amean
- y <- as.matrix(fit$coef)[,coef]
- if(is.null(fit$weights)) w <- NULL else w <- as.matrix(fit$weights)[,coef]
- if(missing(status)) status <- fit$genes$Status
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+ if(is.null(x$Amean)) stop("Amean component is absent.")
+ logFC <- as.matrix(x$coef)[,coef]
+ if(!zero.weights && !is.null(x$weights)) {
+ w <- as.matrix(x$weights)[,coef]
+ logFC[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+ plotWithHighlights(x=x$Amean,y=logFC,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
-plot.EList <- function(x, y, array=1, xlab="A", ylab="M", main=colnames(x)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plot.EList <- function(x, y, array=1, xlab="Average log-expression", ylab="Expression log-ratio (this sample vs others)", main=colnames(x)[array], status=x$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth
-# Created 21 March 2014
+# Created 21 March 2014. Last modified 18 Sep 2014.
{
- E <- x
- E$E <- as.matrix(E$E)
- narrays <- ncol(E$E)
- if(narrays < 2) stop("Need at least two arrays")
- if(narrays > 5)
- x <- apply(E$E,1,median,na.rm=TRUE)
- else
- x <- rowMeans(E$E,na.rm=TRUE)
- y <- E$E[,array]-x
- if(is.null(E$weights)) w <- NULL else w <- as.matrix(E$weights)[,array]
- if(missing(status)) status <- E$genes$Status
+ E <- as.matrix(E$E)
+ if(ncol(E) < 2) stop("Need at least two arrays")
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+# Convert array to integer if not already
+ j <- 1L:ncol(E)
+ names(j) <- colnames(E)
+ array <- j[array[1]]
+
+ AveOfOthers <- rowMeans(E[,-array,drop=FALSE],na.rm=TRUE)
+ Diff <- E[,array]-AveOfOthers
+ Mean <- (E[,array]+AveOfOthers)/2
+
+ if(!zero.weights && !is.null(x$weights)) {
+ w <- as.matrix(x$weights)[,array]
+ Diff[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+
+ plotWithHighlights(x=Mean,y=Diff,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
diff --git a/R/plotExons.R b/R/plotExons.R
new file mode 100644
index 0000000..c680a75
--- /dev/null
+++ b/R/plotExons.R
@@ -0,0 +1,100 @@
+plotExons <- function(fit, coef = ncol(fit), geneid = NULL, genecolname = "GeneID", exoncolname = NULL, rank = 1L, FDR = 0.05)
+# Plot log2 fold changes of the exons and mark the differentially expressed exons
+# Yifang Hu and Gordon Smyth.
+# Created 1 May 2014. Last modified 7 October 2014.
+{
+
+ # Check input
+ if(!is(fit, "MArrayLM")) stop("fit is not MArrayLM object")
+ if(is.null(fit$p.value)) stop("fit object should contain p value from eBayes")
+ if(is.null(fit$method)) stop("fit object should have fitting method from least square or robust regression")
+
+ genecolname <- as.character(genecolname)[1]
+ if(! (genecolname %in% colnames(fit$genes))) stop(paste("genecolname", genecolname, "not found"))
+
+ if( ! is.null(exoncolname)) {
+
+ exoncolname <- as.character(exoncolname)[1]
+ if( ! (exoncolname %in% colnames(fit$genes))) stop(paste("exoncolname", exoncolname, "not found"))
+
+ }
+
+ # Gene to plot using rank
+ if (is.null(geneid)) {
+
+ if (rank == 1L) igene <- which.min(fit$p.value[, coef])
+
+ else {
+
+ o.p <- order(fit$p.value[, coef])
+ fit.o <- fit[o.p,]
+ fit.uniq <- fit.o[!duplicated(fit.o$genes[, genecolname]),]
+ igene <- match(fit.uniq$genes[rank, genecolname], fit$genes[, genecolname])
+
+ }
+
+ # Gene to plot using geneid
+ } else {
+
+ geneid <- as.character(geneid)
+ igene <- match(geneid[1], as.character(fit$genes[, genecolname]))
+ if (is.na(igene)) stop(paste("geneid", geneid, "not found"))
+
+ }
+
+ # Gene annotation
+ ilab <- grep(paste0(genecolname, "|geneid|symbol"), colnames(fit$genes), ignore.case = TRUE)
+ genecollab <- colnames(fit$genes)[ilab]
+ genelab <- paste(sapply(fit$genes[igene,genecollab], as.character), collapse = ", ")
+
+ # Get strand if possible
+ strcol <- grepl("strand", colnames(fit$genes), ignore.case = TRUE)
+ if(any(strcol)) genelab <- paste0(genelab, " (", as.character(fit$genes[igene, strcol])[1], ")")
+
+ # Exon level DE results
+ fdr <- p.adjust(fit$p.value[, coef], method = "BH")
+ de <- data.frame(fit$genes, logFC = fit$coefficients[, coef], fdr)
+ m <- which(de[, genecolname] %in% fit$genes[igene, genecolname])
+ exon.de <- de[m,]
+
+ # Add exon identifier
+ if(is.null(exoncolname)) {
+
+ exoncolname <- "ID"
+ exon.de$ID <- 1:nrow(exon.de)
+
+ }
+
+ # Order exon level results by exon identifier
+ exon.de <- exon.de[order(exon.de[,exoncolname]),]
+
+ # Plot exons
+ plot(exon.de$logFC, xlab = "", ylab = "Log2 Fold Change", main = genelab, type = "b", xaxt = "n")
+ axis(1, at = 1:nrow(exon.de), labels = exon.de[, exoncolname], las = 2, cex.axis = 0.5)
+ mtext(text = paste("Exon", exoncolname), side = 1, padj = 5.2)
+
+ # Mark DE exons
+ mark <- exon.de$fdr < FDR
+
+ if (sum(mark) > 0) {
+
+ col <- rep(NA, nrow(exon.de))
+ col[mark & (exon.de$logFC > 0)] <- "red"
+ col[mark & (exon.de$logFC < 0)] <- "blue"
+
+ if (sum(mark) == 1) cex <- 1.5 else {
+
+ abs.fdr <- abs(log10(exon.de$fdr[mark]))
+ from <- range(abs.fdr)
+ to <- c(1, 2)
+ cex <- (abs.fdr - from[1])/diff(from) * diff(to) + to[1]
+
+ }
+
+ points((1:nrow(exon.de))[mark], exon.de$logFC[mark], col = col[mark], pch = 16, cex = cex)
+
+ }
+
+ abline(h = mean(exon.de$logFC), lty = 2)
+
+}
diff --git a/R/plotWithHighlights.R b/R/plotWithHighlights.R
new file mode 100644
index 0000000..fa4691c
--- /dev/null
+++ b/R/plotWithHighlights.R
@@ -0,0 +1,91 @@
+plotWithHighlights <- function(x, y, status=NULL, values=NULL, pch=16, col=NULL, cex=1, legend="topleft", pch.bg=16, col.bg="black", cex.bg=0.3, ...)
+# Scatterplot with color coding for special points
+
+# Replaces the earlier function .plotMAxy, which in turn was based the original plotMA
+# created by Gordon Smyth 7 April 2003 and modified by James Wettenhall 27 June 2003.
+
+# Gordon Smyth
+# Last modified 17 April 2014.
+{
+# If no status information, just plot all points normally
+ if(is.null(status) || all(is.na(status))) {
+ plot(x,y,pch=pch.bg,col=col.bg,cex=cex.bg,...)
+ return(invisible())
+ }
+# From here, status is not NULL and not all NA
+
+# Check values
+ if(is.null(values)) {
+
+# Check for values and graphics parameters set as attributes by controlStatus()
+ if(!is.null(attr(status,"values"))) {
+ values <- attr(status,"values")
+ if(!is.null(attr(status,"pch"))) pch <- attr(status,"pch")
+ if(!is.null(attr(status,"col"))) col <- attr(status,"col")
+ if(!is.null(attr(status,"cex"))) cex <- attr(status,"cex")
+ }
+
+# Default is to set the most frequent status value as background, and to highlight other status values in decreasing order of frequency
+ if(is.null(values)) {
+ status.values <- names(sort(table(status),decreasing=TRUE))
+ status <- as.character(status)
+ values <- status.values[-1]
+ }
+ }
+
+# If no values, then just plot all points normally
+ nvalues <- length(values)
+ if(nvalues==0L) {
+ plot(x,y,pch=pch.bg,col=col.bg,cex=cex.bg,...)
+ return(invisible())
+ }
+# From here, values has positive length
+
+# Setup plot axes
+ plot(x,y,type="n",...)
+
+# Plot background (non-highlighted) points
+ bg <- !(status %in% values)
+ bg[is.na(bg)] <- TRUE
+ nonhi <- any(bg)
+ if(nonhi) points(x[bg],y[bg],pch=pch.bg[1],col=col.bg[1],cex=cex.bg[1])
+
+# Check graphical parameters for highlighted points
+ pch <- rep_len(unlist(pch),length.out=nvalues)
+ cex <- rep_len(unlist(cex),length.out=nvalues)
+ if(is.null(col)) col <- nonhi + 1L:nvalues
+ col <- rep_len(unlist(col),length.out=nvalues)
+
+# Plot highlighted points
+ for (i in 1:nvalues) {
+ sel <- status==values[i]
+ points(x[sel],y[sel],pch=pch[i],cex=cex[i],col=col[i])
+ }
+
+# Check legend
+ if(is.logical(legend)) {
+ legend.position <- "topleft"
+ } else {
+ legend.position <- as.character(legend)
+ legend <- TRUE
+ }
+ legend.position <- match.arg(legend.position,c("bottomright","bottom","bottomleft","left","topleft","top","topright","right","center"))
+
+# Plot legend
+ if(legend) {
+ if(nonhi) {
+# Include background value in legend
+ bg.value <- unique(status[bg])
+ if(length(bg.value) > 1) bg.value <- "Other"
+ values <- c(bg.value,values)
+ pch <- c(pch.bg,pch)
+ col <- c(col.bg,col)
+ cex <- c(cex.bg,cex)
+ }
+ h <- cex>0.5
+ cex[h] <- 0.5+0.8*(cex[h]-0.5)
+ legend(legend.position,legend=values,pch=pch,,col=col,cex=0.9,pt.cex=cex)
+ }
+
+ invisible()
+}
diff --git a/R/plotdensities.R b/R/plotdensities.R
index 874ff7e..97bf294 100755
--- a/R/plotdensities.R
+++ b/R/plotdensities.R
@@ -3,12 +3,12 @@
plotDensities <- function(object,...)
UseMethod("plotDensities")
-plotDensities.RGList <- function(object,log=TRUE,group=NULL,col=NULL,main="RG Densities",...)
+plotDensities.RGList <- function(object,log=TRUE,group=NULL,col=NULL,main="RG Densities",bc.method="subtract",...)
# Plot empirical single-channel densities
# Original version by Natalie Thorne, 9 September 2003
-# Modified by Gordon Smyth. Last modified 18 Nov 2013.
+# Modified by Gordon Smyth. Last modified 10 Sep 2014.
{
- object <- backgroundCorrect(object,method="subtract")
+ object <- backgroundCorrect(object,method=bc.method)
narray <- ncol(object)
E <- cbind(object$R,object$G)
@@ -23,13 +23,13 @@ plotDensities.RGList <- function(object,log=TRUE,group=NULL,col=NULL,main="RG De
group2 <- c(group,group)
}
- plotDensities(E,group=group2,col=col2,main=main)
+ NextMethod(object=E,group=group2,col=col2,main=main,...)
}
plotDensities.MAList <- function(object,log=TRUE,group=NULL,col=NULL,main="RG Densities",...)
# Plot empirical single-channel densities
# Original version by Natalie Thorne, 9 September 2003
-# Modified by Gordon Smyth. Last modified 18 Nov 2013.
+# Modified by Gordon Smyth. Last modified 10 Sep 2014.
{
narray <- ncol(object)
E <- cbind(object$A+object$M/2, object$A-object$M/2)
@@ -44,28 +44,32 @@ plotDensities.MAList <- function(object,log=TRUE,group=NULL,col=NULL,main="RG De
group2 <- c(group,group)
}
- plotDensities(E,group=group2,col=col2,main=main)
+ NextMethod(object=E,group=group2,col=col2,main=main,...)
}
-plotDensities.EListRaw <- function(object,log=TRUE,group=NULL,col=NULL,main=NULL,...)
+plotDensities.EListRaw <- function(object,log=TRUE,group=NULL,col=NULL,main=NULL,bc.method="subtract",...)
+# Gordon Smyth.
+# Created 23 March 2009. Last modified 10 Sep 2014.
{
- object <- backgroundCorrect(object,method="subtract")
+ object <- backgroundCorrect(object,method=bc.method)
E <- object$E
if(log) E <- log2(E+1)
- plotDensities(E,group=group,col=col,main=main)
+ NextMethod(object=E,group=group,col=col,main=main,...)
}
plotDensities.EList <- function(object,log=TRUE,group=NULL,col=NULL,main=NULL,...)
+# Gordon Smyth.
+# Created 23 March 2009. Last modified 10 Sep 2014.
{
E <- object$E
if(!log) E <- 2^E
- plotDensities(E,group=group,col=col,main=main)
+ NextMethod(object=E,group=group,col=col,main=main,...)
}
plotDensities.default <- function(object,group=NULL,col=NULL,main=NULL,...)
# Plot empirical single-channel densities
# Gordon Smyth
-# 18 Nov 2013. Last modified 18 Nov 2013.
+# 18 Nov 2013. Last modified 10 Sep 2014.
{
# Coerce object to matrix
E <- as.matrix(object)
@@ -89,7 +93,7 @@ plotDensities.default <- function(object,group=NULL,col=NULL,main=NULL,...)
npoint <- 512
X <- Y <- matrix(0,npoint,narray)
for (a in 1:ncol(E)) {
- d <- density(E[,a],n=npoint)
+ d <- density(E[,a],n=npoint,na.rm=TRUE,...)
X[,a] <- d$x
Y[,a] <- d$y
}
diff --git a/R/plots-ma.R b/R/plots-ma.R
index 954456c..1ce2b44 100755
--- a/R/plots-ma.R
+++ b/R/plots-ma.R
@@ -1,206 +1,111 @@
## PLOTS-MA.R
## M-A PLOTS
-plotMA <- function(MA,...) UseMethod("plotMA")
+plotMA <- function(object,...) UseMethod("plotMA")
-plotMA.RGList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plotMA.RGList <- function(object, array=1, xlab="A", ylab="M", main=colnames(object)[array], status=object$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 23 April 2013.
+# Last modified 18 Sep 2014.
{
- MA <- MA.RG(MA[,array])
- plotMA(MA,array=1,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend,zero.weights=zero.weights,...)
+ object <- MA.RG(object[,array])
+ plotMA.MAList(object=object,array=1,xlab=xlab,ylab=ylab,main=main,status=status,zero.weights=zero.weights,...)
}
-plotMA.MAList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plotMA.MAList <- function(object, array=1, xlab="A", ylab="M", main=colnames(object)[array], status=object$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 23 April 2013.
+# Last modified 18 Sep 2014.
{
- x <- as.matrix(MA$A)[,array]
- y <- as.matrix(MA$M)[,array]
- if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,array]
- if(missing(status)) status <- MA$genes$Status
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+ A <- as.matrix(object$A)[,array]
+ M <- as.matrix(object$M)[,array]
+ if(!zero.weights && !is.null(object$weights)) {
+ w <- as.matrix(object$weights)[,array]
+ M[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+ plotWithHighlights(x=A,y=M,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
-plotMA.MArrayLM <- function(MA, coef=ncol(MA), xlab="AveExpr", ylab="logFC", main=colnames(MA)[coef], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plotMA.MArrayLM <- function(object, coef=ncol(object), xlab="Average log-expression", ylab="log-fold-change", main=colnames(object)[coef], status=object$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 21 March 2014.
+# Last modified 18 Sep 2014.
{
- if(is.null(MA$Amean)) stop("MA-plot not possible because Amean component is absent.")
- x <- MA$Amean
- y <- as.matrix(MA$coef)[,coef]
- if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,coef]
- if(missing(status)) status <- MA$genes$Status
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+ if(is.null(object$Amean)) stop("Amean component is absent.")
+ logFC <- as.matrix(object$coef)[,coef]
+ if(!zero.weights && !is.null(object$weights)) {
+ w <- as.matrix(object$weights)[,array]
+ logFC[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+ plotWithHighlights(x=object$Amean,y=logFC,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
-plotMA.EList <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
+plotMA.EList <- function(object, array=1, xlab="Average log-expression", ylab="Expression log-ratio (this sample vs others)", main=colnames(object)[array], status=object$genes$Status, zero.weights=FALSE, ...)
# MA-plot with color coding for controls
# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 23 April 2013.
+# Last modified 18 Sep 2014.
{
- MA$E <- as.matrix(MA$E)
- narrays <- ncol(MA$E)
- if(narrays < 2) stop("Need at least two arrays")
- if(narrays > 5)
- x <- apply(MA$E,1,median,na.rm=TRUE)
- else
- x <- rowMeans(MA$E,na.rm=TRUE)
- y <- MA$E[,array]-x
- if(is.null(MA$weights)) w <- NULL else w <- as.matrix(MA$weights)[,array]
- if(missing(status)) status <- MA$genes$Status
+ E <- as.matrix(object$E)
+ if(ncol(E) < 2) stop("Need at least two arrays")
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
- }
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
-}
+# Convert array to integer if not already
+ j <- 1L:ncol(E)
+ names(j) <- colnames(E)
+ array <- j[array[1]]
-plotMA.default <- function(MA, array=1, xlab="A", ylab="M", main=colnames(MA)[array], xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, zero.weights=FALSE, ...)
-# MA-plot with color coding for controls
-# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 23 April 2013.
-{
-# Data is assumed to be single-channel
- MA <- as.matrix(MA)
- narrays <- ncol(MA)
- if(narrays < 2) stop("Need at least two arrays")
- if(narrays > 5)
- x <- apply(MA,1,median,na.rm=TRUE)
- else
- x <- rowMeans(MA,na.rm=TRUE)
- y <- MA[,array]-x
- w <- NULL
- if(missing(status)) status <- NULL
+ AveOfOthers <- rowMeans(E[,-array,drop=FALSE],na.rm=TRUE)
+ Diff <- E[,array]-AveOfOthers
+ Mean <- (E[,array]+AveOfOthers)/2
- if(!is.null(w) && !zero.weights) {
- i <- is.na(w) | (w <= 0)
- y[i] <- NA
+ if(!zero.weights && !is.null(object$weights)) {
+ w <- as.matrix(object$weights)[,array]
+ Diff[ is.na(w) | (w <= 0) ] <- NA
}
- .plotMAxy(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,status=status,values=values,pch=pch,col=col,cex=cex,legend=legend, ...)
+
+ plotWithHighlights(x=Mean,y=Diff,xlab=xlab,ylab=ylab,main=main,status=status,...)
}
-.plotMAxy <- function(x, y, xlab="A", ylab="M", main=NULL, xlim=NULL, ylim=NULL, status, values, pch, col, cex, legend=TRUE, ...)
+plotMA.default <- function(object, array=1, xlab="Average log-expression", ylab="Expression log-ratio (this sample vs others)", main=colnames(object)[array], status=NULL, ...)
# MA-plot with color coding for controls
# Gordon Smyth 7 April 2003, James Wettenhall 27 June 2003.
-# Last modified 23 April 2013.
+# Last modified 18 Sep 2014.
{
-# Check legend
- legend.position <- "topleft"
- if(!is.logical(legend)) {
- legend.position <- legend
- legend <- TRUE
- }
- legend.position <- match.arg(legend.position,c("bottomright","bottom","bottomleft","left","topleft","top","topright","right","center"))
-
-# Check xlim and ylim
- if(is.null(xlim)) xlim <- range(x,na.rm=TRUE)
- if(is.null(ylim)) ylim <- range(y,na.rm=TRUE)
-
-# Setup plot axes
- plot(x,y,xlab=xlab,ylab=ylab,main=main,xlim=xlim,ylim=ylim,type="n",...)
-
-# If no status information, just plot points normally
- if(is.null(status) || all(is.na(status))) {
- if(missing(pch)) pch <- 16
- if(missing(cex)) cex <- 0.3
- points(x,y,pch=pch[[1]],cex=cex[1])
- return(invisible())
- }
-
-# From here, status is not NULL and not all missing
-
-# Check values
- if(missing(values)) {
- if(is.null(attr(status,"values")))
- values <- names(sort(table(status),decreasing=TRUE))
- else
- values <- attr(status,"values")
- }
- nvalues <- length(values)
-
-# Plot non-highlighted points
- sel <- !(status %in% values)
- nonhi <- any(sel)
- if(nonhi) points(x[sel],y[sel],pch=16,cex=0.3)
-
- if(missing(pch)) {
- if(is.null(attr(status,"pch")))
- pch <- rep(16,nvalues)
- else
- pch <- attr(status,"pch")
- }
-
- if(missing(cex)) {
- if(is.null(attr(status,"cex"))) {
- cex <- rep(1,nvalues)
- if(!nonhi) cex[1] <- 0.3
- } else
- cex <- attr(status,"cex")
- }
-
- if(missing(col)) {
- if(is.null(attr(status,"col"))) {
- col <- nonhi + 1:nvalues
- } else
- col <- attr(status,"col")
- }
-
- pch <- rep(pch,length=nvalues)
- col <- rep(col,length=nvalues)
- cex <- rep(cex,length=nvalues)
-
-# Plot highlighted classes of points
- for (i in 1:nvalues) {
- sel <- status==values[i]
- points(x[sel],y[sel],pch=pch[[i]],cex=cex[i],col=col[i])
- }
-
- if(legend) {
- if(is.list(pch))
- legend(legend.position,legend=values,fill=col,col=col,cex=0.9)
- else
- legend(legend.position,legend=values,pch=pch,,col=col,cex=0.9)
- }
- invisible()
+# Data is assumed to be single-channel
+ object <- as.matrix(object)
+ narrays <- ncol(object)
+ if(narrays<2) stop("Need at least two columns")
+ array <- as.integer(array[1L])
+ Ave <- rowMeans(object[,-array,drop=FALSE],na.rm=TRUE)
+ y <- object[,array]-Ave
+ x <- (object[,array]+Ave)/2
+
+ plotWithHighlights(x,y,xlab=xlab,ylab=ylab,main=main,status=status, ...)
}
-plotMA3by2 <- function(MA, prefix="MA", path=NULL, main=colnames(MA), zero.weights=FALSE, common.lim=TRUE, device="png", ...)
+plotMA3by2 <- function(object, prefix="MA", path=NULL, main=colnames(object), zero.weights=FALSE, common.lim=TRUE, device="png", ...)
# Make files of MA-plots, six to a page
# Gordon Smyth 27 May 2004. Last modified 12 Feb 2014.
{
- if(is(MA,"RGList")) MA <- MA.RG(MA)
- if(is(MA,"EListRaw")) {
- A <- rowMeans(log2(MA$E))
- MA$A <- array(A,dim(MA))
- MA$M <- log2(MA$E)-MA$A
- MA <- new("MAList",unclass(MA))
- }
- if(is(MA,"EList")) {
- A <- rowMeans(MA$E)
- MA$A <- array(A,dim(MA))
- MA$M <- MA$E-MA$A
- MA <- new("MAList",unclass(MA))
- }
- if(is.matrix(MA)) {
- A <- rowMeans(MA)
- MA <- new("MAList",list(M=MA-A, A=array(A,dim(MA))))
+ if(is(object,"RGList")) object <- MA.RG(object)
+ if(is(object,"EListRaw")) {
+ A <- rowMeans(log2(object$E))
+ object$A <- array(A,dim(object))
+ object$M <- log2(object$E)-object$A
+ object <- new("MAList",unclass(object))
+ }
+ if(is(object,"EList")) {
+ A <- rowMeans(object$E)
+ object$A <- array(A,dim(object))
+ object$M <- object$E-object$A
+ object <- new("MAList",unclass(object))
+ }
+ if(is.matrix(object)) {
+ A <- rowMeans(object)
+ object <- new("MAList",list(M=object-A, A=array(A,dim(object))))
}
if(is.null(path)) path <- "."
prefix <- file.path(path,prefix)
- narrays <- ncol(MA)
+ narrays <- ncol(object)
npages <- ceiling(narrays/6)
device <- match.arg(device, c("png","jpeg","pdf","postscript"))
if(device=="png" & !capabilities(what="png")) stop("png not available. Try another device.")
@@ -214,10 +119,10 @@ plotMA3by2 <- function(MA, prefix="MA", path=NULL, main=colnames(MA), zero.weigh
height <- height * 140
}
- if(!zero.weights && !is.null(MA$weights)) MA$M[MA$weights<=0] <- NA
+ if(!zero.weights && !is.null(object$weights)) object$M[object$weights<=0] <- NA
if(common.lim) {
- xlim <- range(MA$A,na.rm=TRUE)
- ylim <- range(MA$M,na.rm=TRUE)
+ xlim <- range(object$A,na.rm=TRUE)
+ ylim <- range(object$M,na.rm=TRUE)
} else {
xlim <- ylim <- NULL
}
@@ -227,7 +132,7 @@ plotMA3by2 <- function(MA, prefix="MA", path=NULL, main=colnames(MA), zero.weigh
fdevice(file=paste(prefix,"-",i1,"-",i2,".",ext,sep=""),width=width,height=height)
par(mfrow=c(3,2))
for (i in i1:i2) {
- plotMA(MA,array=i,xlim=xlim,ylim=ylim,legend=(i%%6==1),zero.weights=TRUE,main=main[i],...)
+ plotMA(object,array=i,xlim=xlim,ylim=ylim,legend=(i%%6==1),zero.weights=TRUE,main=main[i],...)
}
dev.off()
}
@@ -251,12 +156,12 @@ plotPrintTipLoess <- function(object,layout,array=1,span=0.4,...) {
coplot(y~x|gr*gc,data=na.omit(df),xlab=c("A","Tip Column"),ylab=c("M","Tip Row"),pch=".",span=span,show.given=FALSE,panel=panel.smooth)
}
-mdplot <- function(x,...)
+mdplot <- function(x,xlab="Mean",ylab="Difference",...)
# Mean-difference plot
# Gordon Smyth
-# 16 March 2005
+# 16 March 2005. Last modified 30 Sep 2014.
{
d <- x[,1]-x[,2]
m <- (x[,1]+x[,2])/2
- plot(m,d,xlab="Mean",ylab="Difference",...)
+ plotWithHighlights(x=m,y=d,xlab=xlab,ylab=ylab,...)
}
diff --git a/R/read-ilmn.R b/R/read-ilmn.R
index 8530aba..568594b 100644
--- a/R/read-ilmn.R
+++ b/R/read-ilmn.R
@@ -3,7 +3,7 @@
read.ilmn <- function(files=NULL, ctrlfiles=NULL, path=NULL, ctrlpath=NULL, probeid="Probe", annotation=c("TargetID", "SYMBOL"), expr="AVG_Signal", other.columns="Detection",sep="\t", quote="\"", verbose=TRUE, ...)
# Read one or more files of Illumina BeadStudio output
# Wei Shi and Gordon Smyth.
-# Created 15 July 2009. Last modified 27 November 2013.
+# Created 15 July 2009. Last modified 21 July 2014.
{
if(!is.null(files)){
f <- unique(files)
@@ -34,12 +34,22 @@ read.ilmn <- function(files=NULL, ctrlfiles=NULL, path=NULL, ctrlpath=NULL, prob
else
elist.ctrl <- cbind(elist.ctrl, elist.ctrl1)
}
- elist.ctrl$genes$Status <- elist.ctrl$genes[,ncol(elist.ctrl$genes)]
+
+ if(is.null(elist.ctrl$genes)) elist.ctrl$genes <- data.frame(Status = rep("negative", nrow(elist.ctrl)))
+ else {
+ ctrl.status.negative <- apply(elist.ctrl$genes, 2, function(x) sum(tolower(x) %in% "negative"))
+ STATUS.col <- which.max(ctrl.status.negative)
+ if(ctrl.status.negative[STATUS.col] > 0) elist.ctrl$genes$Status <- elist.ctrl$genes[[STATUS.col]]
+ else elist.ctrl$genes$Status <- "negative"
+ }
}
if(!is.null(files))
if(!is.null(ctrlfiles)){
- colnames(elist.ctrl$genes) <- colnames(elist$genes)
+ REG.col <- setdiff(colnames(elist$genes), colnames(elist.ctrl$genes))
+ if(length(REG.col)) for(i in REG.col) elist.ctrl$genes[[i]] <- NA
+ CTRL.col <- setdiff(colnames(elist.ctrl$genes), colnames(elist$genes))
+ if(length(CTRL.col)) for(i in CTRL.col) elist$genes[[i]] <- NA
return(rbind(elist, elist.ctrl))
}
else
@@ -64,7 +74,7 @@ read.ilmn.targets <- function(targets, ...)
.read.oneilmnfile <- function(fname, probeid, annotation, expr, other.columns, sep, quote, verbose, ...)
# Read a single file of Illumina BeadStudio output
# Wei Shi and Gordon Smyth
-# Created 15 July 2009. Last modified 16 June 2014.
+# Created 15 July 2009. Last modified 21 July 2014.
{
h <- readGenericHeader(fname,columns=expr,sep=sep)
skip <- h$NHeaderRecords
@@ -101,7 +111,7 @@ read.ilmn.targets <- function(targets, ...)
# Add probe annotation
if(length(anncol)) {
elist$genes <- x[,anncol,drop=FALSE]
- if(!any(duplicated(pids))) row.names(elist$genes) <- pids
+ if(length(pids) & !any(duplicated(pids))) row.names(elist$genes) <- pids
}
# elist$targets <- data.frame(SampleNames=snames, stringsAsFactors=FALSE)
diff --git a/R/read-maimages.R b/R/read-maimages.R
index f5fd19e..5fb725e 100644
--- a/R/read-maimages.R
+++ b/R/read-maimages.R
@@ -1,13 +1,21 @@
# READ IMAGE ANALYSIS FILES INTO RGList or EListRaw
read.maimages <- function(files=NULL,source="generic",path=NULL,ext=NULL,names=NULL,columns=NULL,other.columns=NULL,annotation=NULL,green.only=FALSE,wt.fun=NULL,verbose=TRUE,sep="\t",quote=NULL,...)
-# Extracts an RG list from a set of two-color image analysis output files
-# or an EListRaw from a set of one-color files
+# Extracts either an RGList object from a set of two-color image analysis output files
+# or an EListRaw object from a set of one-color files
# Gordon Smyth.
-# 1 Nov 2002. Last revised 27 August 2012.
+# 1 Nov 2002. Last revised 26 August 2014.
{
-# Begin checking input arguments
+# Determine type of input file
+ source <- match.arg(source,c("generic","agilent","agilent.mean","agilent.median","arrayvision","arrayvision.ARM","arrayvision.MTM","bluefuse","genepix","genepix.mean","genepix.median","genepix.custom","imagene","imagene9","quantarray","scanarrayexpress","smd.old","smd","spot","spot.close.open"))
+# source2 is the source type with qualifications removed
+ source2 <- strsplit(source,split=".",fixed=TRUE)[[1]][1]
+
+# ImaGene is special case
+ if(source2=="imagene") return(read.imagene(files=files,path=path,ext=ext,names=names,columns=columns,other.columns=other.columns,wt.fun=wt.fun,verbose=verbose,sep=sep,quote=quote,...))
+
+# Get list of files and associated targets frame
if(is.null(files)) {
if(is.null(ext))
stop("Must specify input files")
@@ -16,29 +24,25 @@ read.maimages <- function(files=NULL,source="generic",path=NULL,ext=NULL,names=N
files <- dir(path=ifelse(is.null(path),".",path),pattern=extregex)
files <- sub(extregex,"",files)
}
- }
-
- source <- match.arg(source,c("generic","agilent","agilent.mean","agilent.median","arrayvision","arrayvision.ARM","arrayvision.MTM","bluefuse","genepix","genepix.mean","genepix.median","genepix.custom","imagene","imagene9","quantarray","scanarrayexpress","smd.old","smd","spot","spot.close.open"))
-# source2 is the source type with qualifications removed
- source2 <- strsplit(source,split=".",fixed=TRUE)[[1]][1]
- if(is.null(quote)) if(source2=="agilent") quote <- "" else quote <- "\""
- if(source2=="imagene") return(read.imagene(files=files,path=path,ext=ext,names=names,columns=columns,other.columns=other.columns,wt.fun=wt.fun,verbose=verbose,sep=sep,quote=quote,...))
-
- if(is.data.frame(files)) {
- targets <- files
- files <- files$FileName
- if(is.null(files)) stop("targets frame doesn't contain FileName column")
- if(is.null(names)) names <- targets$Label
- } else {
- targets <- NULL
- }
-
+ } else
+ if(is.data.frame(files)) {
+ targets <- files
+ files <- files$FileName
+ if(is.null(files)) stop("targets frame doesn't contain FileName column")
+ if(is.null(names)) names <- targets$Label
+ } else {
+ targets <- NULL
+ }
slides <- as.vector(as.character(files))
if(!is.null(ext)) slides <- paste(slides,ext,sep=".")
nslides <- length(slides)
+# Default sample names
if(is.null(names)) names <- removeExt(files)
+# Default for quote
+ if(is.null(quote)) if(source2=="agilent") quote <- "" else quote <- "\""
+
if(is.null(columns)) {
if(source2=="generic") stop("must specify columns for generic input")
columns <- switch(source,
diff --git a/R/voom.R b/R/voom.R
index f2a6d68..9cf34cc 100644
--- a/R/voom.R
+++ b/R/voom.R
@@ -1,5 +1,5 @@
voom <- function(counts,design=NULL,lib.size=NULL,normalize.method="none",plot=FALSE,span=0.5,...)
-# Linear modelling of count data mean-variance modelling at the observational level.
+# Linear modelling of count data with mean-variance modelling at the observational level.
# Creates an EList object for entry to lmFit() etc in the limma pipeline.
# Gordon Smyth and Charity Law
# Created 22 June 2011. Last modified 5 June 2013.
diff --git a/R/vooma.R b/R/vooma.R
index e7373d0..0b051be 100644
--- a/R/vooma.R
+++ b/R/vooma.R
@@ -1,5 +1,5 @@
vooma <- function(y,design=NULL,correlation,block=NULL,plot=FALSE,span=NULL)
-# Linear modelling of microarray data mean-variance modelling at the observational level.
+# Linear modelling of microarray data with mean-variance modelling at the observational level.
# Creates an EList object for entry to lmFit() etc in the limma pipeline.
# Gordon Smyth and Charity Law
# Created 31 July 2012. Last modified 5 Aug 2012.
@@ -65,50 +65,45 @@ vooma <- function(y,design=NULL,correlation,block=NULL,plot=FALSE,span=NULL)
y
}
-
-voomaByGroup <- function(y,group,design=NULL,correlation,block=NULL,plot=FALSE,span=NULL,col=NULL)
-# Voom by group
-# Linear modelling of microarray data mean-variance modelling at the observational level by fitting group-specific trends.
-# Creates an EList object for entry to lmFit() etc in the limma pipeline.
-# Charity Law and Gordon Smyth
-# Created 13 February2013.
+voomaByGroup <- function(y,group,design=NULL,correlation,block=NULL,plot=FALSE,span=NULL,col=NULL,lwd=1,alpha=0.5,pch=16,cex=0.3,legend="topright")
+# Vooma by group
+# Linear modelling of microarray data with mean-variance modelling at the observational level by fitting group-specific trends.
+# Creates an EList object for entry to lmFit() etc in the limma pipeline.
+# Charity Law and Gordon Smyth
+# Created 13 Feb 2013. Modified 8 Sept 2014.
{
# Check y
if(!is(y,"EList")) y <- new("EList",list(E=as.matrix(y)))
+ ngenes <- nrow(y)
+ narrays <- ncol(y)
# Check group
group <- as.factor(group)
+ intgroup <- as.integer(group)
+ levgroup <- levels(group)
+ ngroups <- length(levgroup)
# Check design
if(is.null(design)) design <- y$design
- if(is.null(design)) {
- design <- model.matrix(~group)
- colnames(design) <- c("Int", levels(group)[-1])
- }
+ if(is.null(design)) design <- model.matrix(~group)
# Check color
- if(is.null(col)) col <- rainbow(nlevels(group))
- color <- rep(col,length=nlevels(group))
+ if(is.null(col))
+ if(ngroups==2L)
+ col <- c("red","blue")
+ else
+ col <- 1L+1L:ngroups
+ col <- rep_len(col,ngroups)
- w <- matrix(nrow=nrow(y), ncol=ncol(y))
- colnames(w) <- colnames(y)
- rownames(w) <- rownames(y)
- sx <- matrix(nrow=nrow(y), ncol=nlevels(group))
- colnames(sx) <- levels(group)
+ w <- y$E
+ sx <- sy <- matrix(0, nrow=nrow(y), ncol=nlevels(group))
+ colnames(sx) <- levgroup
rownames(sx) <- rownames(y)
- sy <- matrix(nrow=nrow(y), ncol=nlevels(group))
- colnames(sy) <- levels(group)
- rownames(sy) <- rownames(y)
- for (lev in 1:nlevels(group)) {
- i <- group==levels(group)[lev]
+ for (lev in 1L:ngroups) {
+ i <- intgroup==lev
yi <- y[,i]
designi <- design[i,,drop=FALSE]
- if(!is.null(block)) {
- blocki <- block[i]
- voomi <- vooma(y=yi,design=designi,correlation=correlation, block=blocki, plot=FALSE, span=span)
- } else {
- voomi <- vooma(y=yi,design=designi,plot=FALSE,span=span)
- }
+ voomi <- vooma(y=yi,design=designi,correlation=correlation, block=block[i], plot=FALSE, span=span)
w[,i] <- voomi$weights
sx[,lev] <- voomi$meanvar.trend$x
sy[,lev] <- voomi$meanvar.trend$y
@@ -116,22 +111,16 @@ voomaByGroup <- function(y,group,design=NULL,correlation,block=NULL,plot=FALSE,s
span <- voomi$span
# Voom plot
- if(plot) {
- lev <- 1
- colori <- color[lev]
- colori.transparent <- paste(rgb(col2rgb(colori)[1], col2rgb(colori)[2], col2rgb(colori)[3], maxColorValue=255), "10", sep="")
- plot(sx[,lev],sy[,lev],xlab="Average log2 expression",ylab="Sqrt( standard deviation )", xlim=c(min(as.vector(sx))*0.99, max(as.vector(sx))*1.01),ylim=c(min(as.vector(sy))*0.99, max(as.vector(sy))*1.01),pch=16, cex=0.25, col=colori.transparent)
- title("voom: Mean-variance trend")
- l <- lowess(sx[,lev],sy[,lev],f=span)
- lines(l,col=colori)
- for (lev in 2:nlevels(group)) {
- colori <- color[lev]
- colori.transparent <- paste(rgb(col2rgb(colori)[1], col2rgb(colori)[2], col2rgb(colori)[3], maxColorValue=255), "10", sep="")
- points(sx[,lev], sy[,lev], pch=15, cex=0.25, col=colori.transparent)
+ if(plot) {
+ RGB <- col2rgb(col)/255
+ plot(sx,sy,xlab="Average log2 expression",ylab="Sqrt( standard deviation )",main="voom: Mean-variance trend",type="n")
+ for (lev in 1:nlevels(group)) {
+ coli.transparent <- rgb(RGB[1,lev],RGB[2,lev],RGB[3,lev],alpha=alpha)
+ points(sx[,lev],sy[,lev],pch=pch,cex=cex,col=coli.transparent)
l <- lowess(sx[,lev],sy[,lev],f=span)
- lines(l,col=colori)
+ lines(l,col=col[lev],lwd=lwd)
}
- legend("topright", levels(group), col=color, lty=1)
+ if(is.character(legend)) legend(legend, levels(group), col=col, lty=1, lwd=lwd)
}
# Output
diff --git a/inst/NEWS.Rd b/inst/NEWS.Rd
index db768f7..d352e7c 100644
--- a/inst/NEWS.Rd
+++ b/inst/NEWS.Rd
@@ -2,6 +2,115 @@
\title{limma News}
\encoding{UTF-8}
+\section{Version 3.22.0}{\itemize{
+\item
+New functions goana() and topGO() provide gene ontology analyses of differentially genes from a linear model fit.
+The tests include the ability to adjust for gene length or abundance biases in differential expression detection, similar to the goseq package.
+
+\item
+Improvements to diffSplice.
+diffSplice() now calculates Simes adjusted p-values for gene level inferences, in addition to the exon level t-tests and gene level F-tests.
+topSplice() now has three ranking methods ("simes", "F" or "t"), with "simes" now becoming the default.
+diffSplice() also has a new argument 'robust' giving access to robust empirical Bayes variance moderation.
+
+\item
+New function plotExons() to plot log-fold-changes by exon for a given gene.
+
+\item
+New function voomWithQualityWeights() allows users to estimate sample quality weights or allow for heteroscedasticity between treatment groups when doing an RNA-seq analysis.
+
+\item
+Improvement to arrayQualightyWeights().
+It now has a new argument 'var.design' which allows users to model variability by treatment group or other covariates.
+
+\item
+Improved plotting for voomaByGroup().
+
+\item
+barcodeplot() can now plot different weights for different genes in the set.
+
+\item
+Improvements to roast() and mroast().
+The directional (up and down) tests done by roast() now use both the original rotations and their opposite signs, effectively doubling the number of effective rotations for no additional computational cost.
+The two-sided tests are now done explicitly by rotation instead of doubling the smallest one-sided p-value.
+The two-sided p-value is now called "UpOrDown" in the roast() output.
+Both functions now use a fast approximation to convert t-statistics into z-scores,
+making the functions much faster when the number of rotations or the number of genes is large.
+The contrast argument can now optionally be a character string giving a column name of the design matrix.
+
+\item
+zscoreT() can optionally use a fast approximation instead of the slower exact calculation.
+
+\item
+symbols2indices() renamed to ids2indices().
+
+\item
+Improvements to removeBatchEffect().
+It can now take into account weights and other arguments that will affect the linear model fit.
+It can now accept any arguments that would be acceptable for lmFit().
+The behavior of removeBatchEffect() with design supplied has also changed so that it is now consistent with that of lmFit() when modelling batches as additive effects.
+Previously batch adjustments were made only within the treatment levels defined by the design matrix.
+
+\item
+New function plotWithHighlights(), which is now used as the low-level function for plotMA() and plot() methods for limma data objects.
+
+\item
+The definition of the M and A axes for an MA-plot of single channel
+ data is changed slightly. Previously the A-axis was the average of
+ all arrays in the dataset -- this has been definition since MA-plots
+ were introduced for single channel data in April 2003. Now an
+ artificial array is formed by averaging all arrays other than the
+ one to be plotted. Then a mean-difference plot is formed from the
+ specified array and the artificial array. This change ensures the
+ specified and artificial arrays are computed from independent data,
+ and ensures the MA-plot will reduce to a correct mean-difference
+ plot when there are just two arrays in the dataset.
+
+\item
+plotMDS() can now optionally plot samples using symbols instead of
+ text labels.
+It no longer has a 'col' argument, which instead is handled by ....
+
+\item
+vennDiagram() now supports circles of different colors for any
+ number of circles. Previously this was supported only up to three
+ sets.
+
+\item
+getEAWP() will now find a weights matrix in an ExpressionSet object
+ if it exists.
+
+\item
+update to helpMethods().
+
+\item
+Substantial updates to the two RNA-seq case studies in the User's Guide.
+In both cases, the short read data has been realigned and resummarized.
+
+\item
+Improvements to many Rd files.
+Many keyword entries have been revised.
+Many usage and example lines been reformated to avoid over long lines.
+
+\item
+biocViews keywords updated.
+
+\item
+Subsetting columns of a MArrayLM object no longer subsets the design matrix.
+
+\item
+Bug fix for read.maimages: default value for 'quote' was not being
+ set correctly for source="agilent.mean" or source="agilent.median".
+
+\item
+Bug fix to topTableF() and topTable().
+The ordering of Amean values was sometimes incorrect when sorting by F-statistic and a lfc or p.value filter had been set.
+
+\item
+Bug fix to read.ilmn() when sep=",".
+}}
+
+
\section{Version 3.20.0}{\itemize{
\item
diff --git a/inst/doc/changelog.txt b/inst/doc/changelog.txt
index 42fda6b..03a2d2d 100755
--- a/inst/doc/changelog.txt
+++ b/inst/doc/changelog.txt
@@ -1,9 +1,100 @@
-27 August 2014: limma 3.20.8
+10 Oct 2014: limma 3.21.21
+
+- arrayWeights() with plot=TRUE now resets graphics parameters to
+ original settings on exit.
+
+10 Oct 2014: limma 3.21.20
+
+- barcodeplot() is now able to plot different weights for different
+ genes, as specified by the new argument 'gene.weights'.
+
+- new function plotExons() to plot log-fold-changes by exon for a
+ given gene.
+
+- plotSplice() now plots strand information if this can be found.
+
+- first argument of plotMA() and plotMA3by2() is now 'object' instead
+ of 'MA'.
+
+- mdplot() now accepts arguments 'xlab' and 'ylab'.
+
+- all default methods for S3 generic functions are now registered in
+ NAMESPACE.
+
+22 Sep 2014: limma 3.21.19
+
+- Section on "How to get help" in User's Guide now refers to
+ Bioconductor support site.
+
+19 Sep 2014: limma 3.21.18
+
+- new argument 'robust' for diffSplice(), giving access to robust
+ empirical Bayes variance moderation.
+
+19 Sep 2014: limma 3.21.17
+
+- getEAWP() will now find a weights matrix in an ExpressionSet object
+ if it exists.
+
+- The contrast argument to roast() and mroast() can now optionally be
+ a character string giving a column name of the design matrix.
+
+18 Sep 2014: limma 3.21.16
+
+- new function plotWithHighlights(), which is now used as the
+ low-level function for plotMA() and for the plot methods for
+ limma data objects.
+
+- Updates to the two RNA-seq case studies in the User's Guide.
+
+ 8 Sep 2014: limma 3.21.15
+
+- speed improvement for goana().
+
+- improved plotting for voomaByGroup().
+
+26 Aug 2014: limma 3.21.14
+
+- Improvements to goana() and topGO() help files. Argument 'weights'
+ renamed to 'prior.prob'.
- Bug fix for read.maimages: default value for 'quote' was not being
set correctly for source="agilent.mean" or source="agilent.median".
-27 June 2014: limma 3.20.8
+18 Aug 2014: limma 3.21.13
+
+- Speed improvement for the calculation of Simes adjusted p-values by
+ diffSplice().
+
+- argument 'level' for diffSplice() renamed to 'test', with possible
+ values "simes", "F" or "t".
+
+ 8 Aug 2014: limma 3.21.12
+
+- The definition of the M and A axes for an MA-plot of single channel
+ data is changed slightly. Previously the A-axis was the average of
+ all arrays in the dataset -- this has been definition since MA-plots
+ were introduced for single channel data in April 2003. Now an
+ artificial array is formed by averaging all arrays other than the
+ one to be plotted. Then a mean-difference plot is formed from the
+ specified array and the artificial array. This change ensures the
+ specified and artificial arrays are computed from independent data,
+ and ensures the MA-plot will reduce to a correct mean-difference
+ plot when there are just two arrays in the dataset.
+
+ 7 Aug 2014: limma 3.21.11
+
+- diffSplice() now includes Simes adjusted p-values
+
+- topSplice() has a new level="hybrid" argument for ranking genes by
+ Simes adjusted p-values.
+
+- goana() is now an S3 generic function.
+
+- goana() can now optionally adjust for gene length or abundance
+ bias.
+
+27 June 2014: limma 3.21.10
- remove col argument from plotMDS(), as it handled by ... as are
other graphics arguments.
@@ -12,14 +103,21 @@
number of circles. Previously this was supported only up to three
sets.
-24 June 2014: limma 3.20.7
+23 June 2014: limma 3.21.9
+
+- new function goana() for gene ontology analysis of DE results from
+ a linear model fit.
+
+- symbols2indices() renamed to ids2indices().
- update to helpMethods().
+- keyword entries revised for many Rd files.
+
- bug fix to topTableF() and topTable() ordering of Amean values when
ranking is by F-statistic and a lfc or p.value filter has been set.
-18 June 2014: limma 3.20.6
+18 June 2014: limma 3.21.8
- improvements to roast() and mroast(). The directional (up and
down) tests done by roast() now use both the original rotations
@@ -29,12 +127,14 @@
The two-sided p-value is now called "UpOrDown" in the roast()
output.
-16 June 2014: limma 3.20.5
+16 June 2014: limma 3.21.7
- bug fix to read.ilmn() when sep=",".
- biocViews keywords updated.
+ 3 June 2014: limma 3.21.6
+
- roast() and mroast() have a new argument 'approx.zscore'. By
default they now use approx=TRUE when calling zscoreT().
@@ -47,6 +147,8 @@
- many Rd files have been reformated to avoid over long lines in the
pdf manual.
+ 1 June 2014: limma 3.21.5
+
- Further change to removeBatchEffect(), which now calls lmFit() to
estimate the batch effect. This means that removeBatchEffect() can
now take into account weights and other arguments that will affect
@@ -54,7 +156,7 @@
the previous change, when the batch effect was confounded with the
design matrix.
-20 May 2014: limma 3.20.4
+20 May 2014: limma 3.21.4
- Change to removeBatchEffect() when there is a non-trivial design
matrix. Previously batch adjustments were made only within the
@@ -62,20 +164,21 @@
removeBatchEffect() is now more consistent that of linear modelling
with lmFit() with batches as additive effects.
-19 May 2014: limma 3.20.3
+19 May 2014: limma 3.21.3
- Remove unnecessary call to eBayes() from topTable when ranking by
F-statistic for a subset of the coefficients.
-13 May 2014: limma 3.20.2
+13 May 2014: limma 3.21.2
- Subsetting columns of a MArrayLM object should not subset the
design matrix. Now fixed.
-12 April 2014: limma 3.20.1
+12 April 2014: limma 3.21.1
- Update to Pasilla case study in User's Guide.
+12 April 2014: limma 3.21.0 (Bioconductor 2.15 Developmental Branch)
12 April 2014: limma 3.20.0 (Bioconductor 2.14 Release Branch)
7 April 2014: limma 3.19.30
@@ -3441,6 +3544,10 @@ Apr 25 2003: limma 0.9.7
The smawehi package was renamed to limma, with the title "Linear Models
for Microarray Data" and became part of the Bioconductor project.
+Apr 7, 2003
+
+MA plots added for both two-color and single channel data.
+
11 November 2002: smawehi 0.1
smawehi package made publicly available for the first time, through
diff --git a/inst/doc/intro.pdf b/inst/doc/intro.pdf
index 10e4cb8..1c6a399 100644
Binary files a/inst/doc/intro.pdf and b/inst/doc/intro.pdf differ
diff --git a/inst/doc/usersguide.pdf b/inst/doc/usersguide.pdf
deleted file mode 100644
index 7508e02..0000000
Binary files a/inst/doc/usersguide.pdf and /dev/null differ
diff --git a/man/10GeneSetTests.Rd b/man/10GeneSetTests.Rd
index f991624..03557d0 100644
--- a/man/10GeneSetTests.Rd
+++ b/man/10GeneSetTests.Rd
@@ -18,8 +18,8 @@ This page gives an overview of the LIMMA functions for gene set testing and path
\item{ \code{\link{romer}} }{
Gene set enrichment analysis.}
-\item{ \code{\link{symbols2indices}} }{
- Convert gene sets consisting of vectors of symbols into a list of indices suitable for use in the above functions.}
+\item{ \code{\link{ids2indices}} }{
+ Convert gene sets consisting of vectors of gene identifiers into a list of indices suitable for use in the above functions.}
\item{ \code{\link{alias2Symbol}} and \code{\link{alias2SymbolTable}} }{
Convert gene symbols or aliases to current official symbols.}
diff --git a/man/EList.Rd b/man/EList.Rd
index 1f05e52..e1af7d6 100644
--- a/man/EList.Rd
+++ b/man/EList.Rd
@@ -5,30 +5,34 @@
\title{Expression List - class}
\description{
-A list-based S4 classes for storing expression values (E-values) for a set of one-channel microarrays.
+A list-based S4 classes for storing expression values (E-values), for example for a set of one-channel microarrays or a set of RNA-seq samples.
\code{EListRaw} holds expression values on the raw scale.
\code{EList} holds expression values on the log scale, usually after background correction and normalization.
-\code{EListRaw} objects are usually created by \code{\link{read.maimages}}.
-\code{EList} objects are usually created by \code{\link{normalizeBetweenArrays}}.
+
+\code{EListRaw} objects are often created by \code{\link{read.maimages}}, while
+\code{EList} objects are often created by \code{\link{normalizeBetweenArrays}} or by \code{\link{voom}}.
+Alternatively, an \code{EList} object can be created directly by \code{new("EList",x)}, where \code{x} is a list.
}
-\section{Components}{
-\code{EList} objects can be created by \code{new("EList",x)} where \code{x} is a list.
-These classes contains no slots (other than \code{.Data}), but objects should contain a list component \code{E},
-which is a numeric matrix containing expression values.
-Rows correspond to probes and columns to arrays.
-In an \code{EListRaw} object, \code{E} contains unlogged values.
-In an \code{EList} object, \code{E} contains log2 values.
+\section{Required Components}{
+These classes contains no slots (other than \code{.Data}), but objects should contain a list component \code{E}:
+\describe{
+ \item{\code{E}}{numeric matrix containing expression values.
+ In an \code{EListRaw} object, the expression values are unlogged, while in an \code{EList} object, they are log2 values.
+ Rows correspond to probes and columns to samples.}
+}
+}
+\section{Optional Components}{
Optional components include:
\describe{
\item{\code{weights}}{numeric matrix of same dimensions as \code{E} containing relative spot quality weights. Elements should be non-negative.}
\item{\code{other}}{list containing other matrices, all of the same dimensions as \code{E}.}
\item{\code{genes}}{data.frame containing probe information. Should have one row for each probe. May have any number of columns.}
- \item{\code{targets}}{data.frame containing information on the target RNA samples. Rows correspond to arrays. May have any number of columns.}
+ \item{\code{targets}}{data.frame containing information on the target RNA samples. Rows correspond to samples. May have any number of columns.}
}
-Valid \code{EList} or \code{EListRaw} objects may contain other optional components, but all probe or array information should be contained in the above components.
+Valid \code{EList} or \code{EListRaw} objects may contain other optional components, but all probe or sample information should be contained in the above components.
}
\section{Methods}{
@@ -43,8 +47,8 @@ In addition, \code{EList} objects can be \link[limma:subsetting]{subsetted} and
\seealso{
\link{02.Classes} gives an overview of all the classes defined by this package.
- \code{\link[Biobase:class.ExpressionSet]{ExpressionSet}} is a more formal class in the Biobase package.
+ \code{\link[Biobase:class.ExpressionSet]{ExpressionSet}} is a more formal class in the Biobase package used for the same purpose.
}
\keyword{classes}
-\keyword{data}
+
diff --git a/man/TestResults.Rd b/man/TestResults.Rd
index 4a65da4..c1b385c 100755
--- a/man/TestResults.Rd
+++ b/man/TestResults.Rd
@@ -11,7 +11,7 @@ A matrix-based class for storing the results of simultanous tests.
}
\usage{
-\S3method{summary}{TestResults}(object, ...)
+\S3method{summary}{TestResults}(object, \dots)
}
\arguments{
diff --git a/man/alias2Symbol.Rd b/man/alias2Symbol.Rd
index c1e1510..29d93c5 100644
--- a/man/alias2Symbol.Rd
+++ b/man/alias2Symbol.Rd
@@ -39,3 +39,6 @@ This function is often used to assist gene set testing, see
\examples{
if(require("org.Hs.eg.db")) alias2Symbol(c("PUMA","NOXA","BIM"))
}
+
+\keyword{gene annotation}
+
diff --git a/man/anova-method.Rd b/man/anova-method.Rd
index 76ffaad..254b849 100755
--- a/man/anova-method.Rd
+++ b/man/anova-method.Rd
@@ -8,14 +8,14 @@ Produces an ANOVA table useful for quality assessment by decomposing between and
This method produces a single ANOVA Table rather than one for each gene and is not used to identify differentially expressed genes.
}
\section{Usage}{
-\code{anova(object,design=NULL,ndups=2,...)}
+\code{anova(object,design=NULL,ndups=2,\dots)}
}
\section{Arguments}{
\describe{
\item{\code{object}}{object of class \code{MAList}. Missing values in the M-values are not allowed.}
\item{\code{design}}{numeric vector or single-column matrix containing the design matrix for linear model. The length of the vector or the number of rows of the matrix should agree with the number of columns of M.}
\item{\code{ndups}}{number of duplicate spots. Each gene is printed ndups times in adjacent spots on each array.}
- \item{\code{...}}{other arguments are not used}
+ \item{\code{\dots}}{other arguments are not used}
}
}
\section{Details}{
diff --git a/man/arrayWeights.Rd b/man/arrayWeights.Rd
index 18fc209..5f2ce5c 100644
--- a/man/arrayWeights.Rd
+++ b/man/arrayWeights.Rd
@@ -7,8 +7,8 @@ Estimates relative quality weights for each array in a multi-array
experiment.
}
\usage{
-arrayWeights(object, design = NULL, weights = NULL, method = "genebygene", maxiter = 50,
- tol = 1e-10, trace=FALSE)
+arrayWeights(object, design = NULL, weights = NULL, var.design = NULL,
+ method = "genebygene", maxiter = 50, tol = 1e-10, trace=FALSE)
arrayWeightsSimple(object, design = NULL,
maxiter = 100, tol = 1e-6, maxratio = 100, trace=FALSE)
}
@@ -21,6 +21,8 @@ arrayWeightsSimple(object, design = NULL,
estimated. Defaults to the unit vector meaning that the
arrays are treated as replicates.}
\item{weights}{optional numeric matrix containing prior weights for each spot.}
+ \item{var.design}{design matrix for the variance model. Defaults to the sample-specific model
+ (i.e. each sample has a distinct variance) when \code{NULL}.}
\item{method}{character string specifying the estimating algorithm to be used. Choices
are \code{"genebygene"} and \code{"reml"}.}
\item{maxiter}{maximum number of iterations allowed.}
@@ -60,6 +62,8 @@ BMC Bioinformatics 7, 261.
\url{http://www.biomedcentral.com/1471-2105/7/261/abstract}
}
\seealso{
+\code{\link{voomWithQualityWeights}}
+
An overview of linear model functions in limma is given by \link{06.LinearModels}.
}
\examples{
diff --git a/man/asdataframe.Rd b/man/asdataframe.Rd
index 5a2f5a9..423f76e 100644
--- a/man/asdataframe.Rd
+++ b/man/asdataframe.Rd
@@ -8,7 +8,7 @@
Turn a \code{MArrayLM} object into a \code{data.frame}.
}
\usage{
-\method{as.data.frame}{MArrayLM}(x, row.names = NULL, optional = FALSE, ...)
+\method{as.data.frame}{MArrayLM}(x, row.names = NULL, optional = FALSE, \dots)
}
\arguments{
\item{x}{an object of class \code{MArrayLM}}
diff --git a/man/backgroundcorrect.Rd b/man/backgroundcorrect.Rd
index f87d67f..172cd36 100755
--- a/man/backgroundcorrect.Rd
+++ b/man/backgroundcorrect.Rd
@@ -87,4 +87,5 @@ backgroundCorrect(RG, offset=5)
An overview of background correction functions is given in \code{\link{04.Background}}.
}
-\keyword{models}
+
+\keyword{background correction}
diff --git a/man/barcodeplot.Rd b/man/barcodeplot.Rd
index 216f91b..d64126b 100644
--- a/man/barcodeplot.Rd
+++ b/man/barcodeplot.Rd
@@ -1,22 +1,33 @@
\name{barcodeplot}
\alias{barcodeplot}
-\title{Barcode Plot}
+\title{Barcode Enrichment Plot}
\description{
-Plot positions of one or two gene sets in a ranked list of statistics.
+Display the enrichment of one or two gene sets in a ranked gene list.
}
\usage{
-barcodeplot(statistics, index, index2=NULL, labels=c("Up","Down"), quantiles=c(-1,1),
- col.bars=NULL, worm=TRUE, span.worm=0.45, ...)
+barcodeplot(statistics, index = NULL, index2 = NULL, gene.weights = NULL,
+ weights.label = "Weight", labels = c("Up","Down"), quantiles = c(-1,1),
+ col.bars = NULL, worm = TRUE, span.worm=0.45, \dots)
}
\arguments{
- \item{index}{index vector for the gene set. This can be a vector of indices, or a logical vector of the same length as \code{statistics} or, in general, any vector such that \code{statistic[index]} gives the statistic values for the gene set to be tested.}
- \item{index2}{index vector for a second gene set. Usually used to specify down-regulated genes when \code{index} is used for up-regulated genes.vector specifying the elements of \code{statistic} in the test group.}
\item{statistics}{numeric vector giving the values of statistics to rank genes by.}
- \item{labels}{character vector of labels for the two groups of RNA samples that are being compared by the statistics. First label is associated with high statistics and is displayed at the left end of the plot. Second label is associated with low or negative statistics and is displayed at the right end of the plot.}
+ \item{index}{index vector for the gene set.
+ This can be a vector of indices, or a logical vector of the same length as \code{statistics} or, in general, any vector such that \code{statistic[index]} gives a subset of the statistic values.
+ Can be omitted if \code{gene.weights} has same length as \code{statistics}, in which case positive values of \code{gene.weights} indicate to members of the positive set and negative weights correspond to members of the negative set.}
+ \item{index2}{optional index vector for a second (negative) gene set.
+ If specified, then \code{index} and \code{index2} specify positive and negative genes respectively.
+ Usually used to distinguish down-regulated genes from up-regulated genes.}
+ \item{gene.weights}{numeric vector giving directional weights for the genes in the (first) set.
+ Postive and negative weights correspond to positive and negative genes.
+ Ignored if \code{index2} is non-null.}
+ \item{weights.label}{label describing the entries in \code{gene.weights}.}
+ \item{labels}{character vector of labels for high and low statistics. First label is associated with high statistics and is displayed at the left end of the plot. Second label is associated with low or negative statistics and is displayed at the right end of the plot.}
\item{quantiles}{numeric vector of length 2, giving cutoff values for \code{statistics} considered small or large respectively. Used to color the rectangle of the barcodeplot.}
- \item{col.bars}{character vector giving colors for the bars on the barcodeplot. Defaults to \code{"black"} for one set or \code{c("red","blue")} for two sets.}
+ \item{col.bars}{character vector giving colors for the bars on the barcodeplot.
+ Defaults to \code{"black"} for one set or \code{c("red","blue")} for two sets.
+ Defaults to semitransparent color for the bars inside the rectangle when variable gene weights are given which is intended to distinguish the positional bars from the weighted bars and also to show the density of the genes.}
\item{worm}{logical, should enrichment worms be plotted?}
- \item{span.worm}{loess span for enrichment worms.}
+ \item{span.worm}{loess span for enrichment worms. Larger spans give smoother worms.}
\item{\ldots}{other arguments are passed to \code{plot}.}
}
@@ -25,16 +36,31 @@ No value is returned but a plot is produced as a side effect.
}
\details{
-Rank statistics left to right from largest to smallest, and show positions of one or two specified subsets.
-This plot is intended to be used in conjunction with gene set tests, and shows the enrichment of gene sets amongst high or low ranked genes.
-It first appeared in the literature in Lim et al (2009).
-It was inspired by the set location plot of Subramanian et al (2005).
+This function plots the positions of one or two gene sets in a ranked list of statistics.
+If there are two sets, then one is considered to be the positive set and the other the down set.
+For example, the first set and second sets often correspond to genes that are expected to be up- or down-regulated respectively.
+The function can optionally display varying weights for different genes, for example log-fold-changes from a previous experiment.
-The enrichment worm is calculated by tricube-weighted moving average.
+The statistics are ranked left to right from largest to smallest.
+The ranked statistics are represented by a shaded bar or bed, and the positions of the specified subsets are marked by vertical bars, forming a pattern like a barcode.
+An enrichment worm optionally shows the relative enrichment of the vertical bars in each part of the plot.
+
+Barcode plots are often used in conjunction with gene set tests, and show the enrichment of gene sets amongst high or low ranked genes.
+They were inspired by the set location plot of Subramanian et al (2005), with a number of enhancements, especially the ability to plot positive and negative sets simultaneously.
+Barcode plots first appeared in the literature in Lim et al (2009).
+More recent examples can be seen in Liu et al (2014).
+
+The function can be used with any of four different calling sequences:
+ \itemize{
+ \item \code{index} is specified, but not \code{index2} or \code{gene.weights}. Single direction plot.
+ \item \code{index} and \code{index2} are specified. Two directional plot.
+ \item \code{index} and \code{gene.weights} are specified. \code{gene.weights} must have same length as \code{statistics[index]}. Plot will be two-directional if \code{gene.weights} contains positive and negative values.
+ \item \code{gene.weights} is specified by not \code{index} or \code{index2}. \code{gene.weights} must have same length as \code{statistics}. Plot will be two-directional if \code{gene.weights} contains positive and negative values.
+ }
}
\seealso{
-\code{\link{tricubeMovingAverage}}, \code{\link{camera}}, \code{\link{wilcox.test}}
+\code{\link{tricubeMovingAverage}}, \code{\link{roast}}, \code{\link{camera}}, \code{\link{wilcox.test}}
}
\author{Gordon Smyth, Di Wu and Yifang Hu}
@@ -42,7 +68,12 @@ The enrichment worm is calculated by tricube-weighted moving average.
\references{
Lim E, Vaillant F, Wu D, Forrest NC, Pal B, Hart AH, Asselin-Labat ML, Gyorki DE, Ward T, Partanen A, Feleppa F, Huschtscha LI, Thorne HJ; kConFab; Fox SB, Yan M, French JD, Brown MA, Smyth GK, Visvader JE, Lindeman GJ (2009).
Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers.
-\emph{Nat Med}, 15, 907-913..
+\emph{Nat Med}, 15, 907-913.
+
+Liu, GJ, Cimmino, L, Jude, JG, Hu, Y, Witkowski, MT, McKenzie, MD, Kartal-Kaess, M, Best, SA, Tuohey, L, Liao, Y, Shi, W, Mullighan, CG, Farrar, MA, Nutt, SL, Smyth, GK, Zuber, J, and Dickins, RA (2014).
+Pax5 loss imposes a reversible differentiation block in B progenitor acute lymphoblastic leukemia.
+\emph{Genes & Development} 28, 1337-1350.
+\url{http://www.ncbi.nlm.nih.gov/pubmed/24939936}
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP (2005).
Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.
@@ -52,9 +83,27 @@ Gene set enrichment analysis: a knowledge-based approach for interpreting genome
\examples{
stat <- rnorm(100)
sel <- 1:10
-stat[sel] <- stat[sel]+1
-barcodeplot(stat,sel)
sel2 <- 11:20
+stat[sel] <- stat[sel]+1
stat[sel2] <- stat[sel2]-1
-barcodeplot(stat,sel,sel2)
+
+# One directional
+barcodeplot(stat, index = sel)
+
+# Two directional
+barcodeplot(stat, index = sel, index2 = sel2)
+
+# Second set can be indicated by negative weights
+barcodeplot(stat, index = c(sel,sel2), gene.weights = c(rep(1,10), rep(-1,10)))
+
+# Two directional with unequal weights
+w <- rep(0,100)
+w[sel] <- runif(10)
+w[sel2] <- -runif(10)
+barcodeplot(stat, gene.weights = w, weights.label = "logFC")
+
+# One directional with unequal weights
+w <- rep(0,100)
+w[sel2] <- -runif(10)
+barcodeplot(stat, gene.weights = w, weights.label = "logFC", col.bars = "dodgerblue")
}
diff --git a/man/camera.Rd b/man/camera.Rd
index 204eaec..5a41a08 100644
--- a/man/camera.Rd
+++ b/man/camera.Rd
@@ -16,7 +16,7 @@ interGeneCorrelation(y, design)
Rows correspond to probes and columns to samples.
Any type of object that can be processed by \code{\link{getEAWP}} is acceptable.
}
- \item{index}{an index vector or a list of index vectors. Can be any vector such that \code{y[index,]} selects the rows corresponding to the test set. The list can be made using \code{\link{symbols2indices}}.}
+ \item{index}{an index vector or a list of index vectors. Can be any vector such that \code{y[index,]} selects the rows corresponding to the test set. The list can be made using \code{\link{ids2indices}}.}
\item{design}{design matrix.}
\item{contrast}{contrast of the linear model coefficients for which the test is required. Can be an integer specifying a column of \code{design}, or else a numeric vector of same length as the number of columns of \code{design}.}
\item{weights}{can be a numeric matrix of individual weights, of same size as \code{y}, or a numeric vector of array weights with length equal to \code{ncol(y)}, or a numeric vector of gene weights with length equal to \code{nrow(y)}.}
@@ -76,7 +76,7 @@ Analyzing gene expression data in terms of gene sets: methodological issues.
\code{\link{geneSetTest}},
\code{\link{roast}},
\code{\link{romer}},
-\code{\link{symbols2indices}}.
+\code{\link{ids2indices}}.
There is a topic page on \link{10.GeneSetTests}.
}
@@ -97,5 +97,7 @@ camera(y, index2, design)
camera(y, list(set1=index1,set2=index2), design)
}
-\keyword{htest}
+\keyword{gene set test}
+\concept{gene set test}
+\concept{gene set enrichment analysis}
diff --git a/man/diffSplice.Rd b/man/diffSplice.Rd
index a94e4fa..bf684c4 100644
--- a/man/diffSplice.Rd
+++ b/man/diffSplice.Rd
@@ -3,27 +3,39 @@
\title{Test for Differential Splicing}
\description{Given a linear model fit at the exon level, test for differences in exon retention between experimental conditions.}
\usage{
-diffSplice(fit, geneid, exonid=NULL, verbose=TRUE)
+diffSplice(fit, geneid, exonid=NULL, robust=FALSE, verbose=TRUE)
}
\arguments{
\item{fit}{an \code{MArrayLM} fitted model object produced by \code{lmFit} or \code{contrasts.fit}. Rows should correspond to exons.}
\item{geneid}{gene identifiers. Either a vector of length \code{nrow(fit)} or the name of the column of \code{fit$genes} containing the gene identifiers. Rows with the same ID are assumed to belong to the same gene.}
\item{exonid}{exon identifiers. Either a vector of length \code{nrow(fit)} or the name of the column of \code{fit$genes} containing the exon identifiers.}
+ \item{robust}{logical, should the estimation of the empirical Bayes prior parameters be robustified against outlier sample variances?}
\item{verbose}{logical, if \code{TRUE} some diagnostic information about the number of genes and exons is output.}
}
\value{
An object of class \code{MArrayLM} containing both exon level and gene level tests.
Results are sorted by geneid and by exonid within gene.
\item{coefficients}{numeric matrix of coefficients of same dimensions as \code{fit}. Each coefficient is the difference between the log-fold-change for that exon versus the average log-fold-change for all other exons for the same gene.}
- \item{t}{numeric matrix of moderated t-statistics}
+ \item{t}{numeric matrix of moderated t-statistics, of same dimensions as \code{fit}.}
\item{p.value}{numeric vector of p-values corresponding to the t-statistics}
\item{genes}{data.frame of exon annotation}
\item{genecolname}{character string giving the name of the column of \code{genes} containing gene IDs}
- \item{gene.F}{numeric vector of moderated F-statistics, one for each gene.}
- \item{gene.F.p.value}{numeric vector of p-values corresponding to \code{gene.F}}
+ \item{gene.F}{numeric matrix of moderated F-statistics, one row for each gene.}
+ \item{gene.F.p.value}{numeric matrix of p-values corresponding to \code{gene.F}}
+ \item{gene.simes.p.value}{numeric matrix of Simes adjusted p-values, one row for each gene.}
+ \item{gene.genes}{data.frame of gene annotation.}
}
\details{
-This function is often used in conjunction with voom.
+This function tests for differential exon usage for each gene and for each column of \code{fit}.
+
+Testing for differential exon usage is equivalent to testing whether the log-fold-changes in the \code{fit} differ between exons for the same gene.
+Two different tests are provided.
+The first is an F-test for differences between the log-fold-changes.
+The other is a series of t-tests in which each exon is compared to the average of all other exons for the same gene.
+The exon-level t-tests are converted into a genewise test by adjusting the p-values for the same gene by Simes method.
+The minimum adjusted p-value is then used for each gene.
+
+This function can be used on data from an exon microarray or can be used in conjunction with voom for exon-level RNA-seq counts.
}
\seealso{
\code{\link{voom}}
@@ -39,3 +51,5 @@ topSplice(ex)
plotSplice(ex)
}
}
+
+\keyword{rna-seq}
diff --git a/man/genas.Rd b/man/genas.Rd
index c1efe81..ea5d08b 100644
--- a/man/genas.Rd
+++ b/man/genas.Rd
@@ -30,10 +30,10 @@ The method is explained briefly in Majewski et al (2010) and in full detail in P
The \code{subset} argument specifies whether and how the fit object should be subsetted.
Ideally, only genes that are truly differentially expressed for one or both of the contrasts should be used estimate the biological correlation.
The default is \code{"all"}, which uses all genes in the fit object to estimate the biological correlation.
-The option \code{"Fpval"} chooses genes based on how many F-test p-values are estimated to be truly significant using the method \code{propNotDE}.
+The option \code{"Fpval"} chooses genes based on how many F-test p-values are estimated to be truly significant using the function \code{propTrueNull}.
This should capture genes that display any evidence of differential expression in either of the two contrasts.
The options \code{"p.union"} and \code{"p.int"} are based on the moderated t p-values from both contrasts.
-From the \code{propNotDE} method an estimate of the number of p-values truly significant in either of the two contrasts can be obtained.
+From the \code{propTrueNull} function an estimate of the number of p-values truly significant in either of the two contrasts can be obtained.
"p.union" takes the union of these genes and \code{"p.int"} takes the intersection of these genes.
The other options, \code{"logFC"} and \code{"predFC"} subsets on genes that attain a logFC or predFC at least as large as the 90th percentile of the log fold changes or predictive log fold changes on the absolute scale.
diff --git a/man/geneSetTest.Rd b/man/geneSetTest.Rd
index 25b5ad5..f708c53 100755
--- a/man/geneSetTest.Rd
+++ b/man/geneSetTest.Rd
@@ -9,7 +9,7 @@ Genes are assumed to be independent.
\usage{
geneSetTest(index, statistics, alternative = "mixed", type= "auto",
ranks.only = TRUE, nsim=9999)
-wilcoxGST(index, statistics, ...)
+wilcoxGST(index, statistics, \dots)
}
\arguments{
@@ -104,3 +104,7 @@ stat <- rnorm(100)
sel <- 1:10; stat[sel] <- stat[sel]+1
wilcoxGST(sel,stat)
}
+
+\keyword{gene set test}
+\concept{gene set test}
+\concept{gene set enrichment analysis}
diff --git a/man/gls.series.Rd b/man/gls.series.Rd
index 63fa209..43ccc0b 100755
--- a/man/gls.series.Rd
+++ b/man/gls.series.Rd
@@ -6,7 +6,7 @@ Fit a linear model genewise to expression data from a series of microarrays.
The fit is by generalized least squares allowing for correlation between duplicate spots or related arrays.
This is a utility function for \code{lmFit}.
}
-\usage{gls.series(M,design=NULL,ndups=2,spacing=1,block=NULL,correlation=NULL,weights=NULL,...)}
+\usage{gls.series(M,design=NULL,ndups=2,spacing=1,block=NULL,correlation=NULL,weights=NULL,\dots)}
\arguments{
\item{M}{numeric matrix containing log-ratio or log-expression values for a series of microarrays, rows correspond to genes and columns to arrays.}
\item{design}{numeric design matrix defining the linear model, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of \code{M}. Defaults to the unit vector meaning that the arrays are treated as replicates.}
@@ -16,7 +16,7 @@ This is a utility function for \code{lmFit}.
Same length as \code{ncol(M)}.}
\item{correlation}{numeric value specifying the inter-duplicate or inter-block correlation.}
\item{weights}{an optional numeric matrix of the same dimension as \code{M} containing weights for each spot. If it is of different dimension to \code{M}, it will be filled out to the same size.}
- \item{...}{other optional arguments to be passed to \code{dupcor.series}.}
+ \item{\dots}{other optional arguments to be passed to \code{dupcor.series}.}
}
\value{
A list with components
diff --git a/man/goana.MArrayLM.Rd b/man/goana.MArrayLM.Rd
new file mode 100644
index 0000000..975b81c
--- /dev/null
+++ b/man/goana.MArrayLM.Rd
@@ -0,0 +1,89 @@
+\name{goana.MarrayLM}
+\alias{goana.MArrayLM}
+\title{Gene Ontology Analysis of Differentially Expressed Genes}
+
+\description{
+Test for over-representation of gene ontology (GO) terms in the up and down differentially expressed genes from a linear model fit.
+}
+
+\usage{
+\method{goana}{MArrayLM}(de, coef = ncol(de), geneid = rownames(de), FDR = 0.05,
+ species = "Hs", trend = FALSE, \dots)
+}
+
+\arguments{
+ \item{de}{an \code{MArrayLM} fit object.}
+ \item{coef}{column number or column name specifying for which coefficient or contrast differential expression should be assessed.}
+ \item{geneid}{Entrez Gene identifiers. Either a vector of length \code{nrow(de)} or the name of the column of \code{de$genes} containing the Entrez Gene IDs.}
+ \item{FDR}{false discovery rate cutoff for differentially expressed genes. Numeric value between 0 and 1.}
+ \item{species}{species identifier. Possible values are \code{"Hs"}, \code{"Mm"}, \code{"Rn"} or \code{"Dm"}.}
+ \item{trend}{adjust analysis for gene length or abundance?
+ Can be logical, or a numeric vector of covariate values, or the name of the column of \code{de$genes} containing the covariate values.
+ If \code{TRUE}, then \code{de$Amean} is used as the covariate.}
+ \item{\dots}{any other arguments are passed to \code{goana.default}.}
+}
+
+\details{
+Performs Gene Ontology enrichment analyses for the up and down differentially expressed genes from a linear model analysis.
+The Entrez Gene ID must be supplied for each gene.
+
+If \code{trend=FALSE}, the function computes one-sided hypergeometric tests equivalent to Fisher's exact test.
+
+If \code{trend=TRUE} or a covariate is supplied, then a trend is fitted to the differential expression results and the method of Young et al (2010) is used to adjust for this trend.
+The adjusted test uses Wallenius' noncentral hypergeometric distribution.
+}
+
+\value{
+A data frame with a row for each GO term and the following columns:
+ \item{Term}{GO term.}
+ \item{Ont}{ontology that the GO term belongs to. Possible values are \code{"BP"}, \code{"CC"} and \code{"MF"}.}
+ \item{N}{number of genes in the GO term.}
+ \item{Up}{number of up-regulated differentially expressed genes.}
+ \item{Down}{number of down-regulated differentially expressed genes.}
+ \item{P.Up}{p-value for over-representation of GO term in up-regulated genes.}
+ \item{P.Down}{p-value for over-representation of GO term in down-regulated genes.}
+The row names of the data frame give the GO term IDs.
+}
+
+\references{
+ Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010).
+ Gene ontology analysis for RNA-seq: accounting for selection bias.
+ \emph{Genome Biology} 11, R14.
+ \url{http://genomebiology.com/2010/11/2/R14}
+}
+
+\seealso{
+\code{\link{goana.default}}, \code{\link{topGO}}
+
+The goseq package implements a similar GO analysis.
+The goseq version will work with a variety of gene identifiers, not only Entrez Gene as here, and includes a database of gene length information for various species.
+
+The gostats package also does GO analyses with some different options.
+}
+\author{Gordon Smyth and Yifang Hu}
+\examples{
+
+\dontrun{
+
+fit <- lmFit(y, design)
+fit <- eBayes(fit)
+
+# Standard GO analysis
+go.fisher <- goana(fit)
+topGO(go.fisher, sort = "up")
+topGO(go.fisher, sort = "down")
+
+# GO analysis adjusting for gene abundance
+go.abund <- goana(fit, geneid = "GeneID", trend = TRUE)
+topGO(go.abund, sort = "up")
+topGO(go.abund, sort = "down")
+
+# GO analysis adjusting for gene length bias
+# (assuming that y$genes$Length contains gene lengths)
+go.len <- goana(fit, geneid = "GeneID", trend = "Length")
+topGO(go.len, sort = "up")
+topGO(go.len, sort = "down")
+}
+}
+
+\keyword{gene set test}
diff --git a/man/goana.Rd b/man/goana.Rd
new file mode 100644
index 0000000..2e14d89
--- /dev/null
+++ b/man/goana.Rd
@@ -0,0 +1,77 @@
+\name{goana}
+\alias{goana}
+\alias{goana.default}
+\title{Gene Ontology Analysis}
+
+\description{
+Test for over-representation of gene ontology (GO) terms in one or more sets of genes.
+}
+
+\usage{
+\method{goana}{default}(de, universe = NULL, species = "Hs", prior.prob = NULL, \dots)
+}
+
+\arguments{
+ \item{de}{a vector of Entrez Gene IDs, or a list of such vectors.}
+ \item{universe}{vector specifying the set of Entrez Gene identifiers to be the background universe.
+ If \code{NULL} then all Entrez Gene IDs associated with any gene ontology term will be used as the universe.}
+ \item{species}{species identifier. Possible values are \code{"Hs"}, \code{"Mm"}, \code{"Rn"} or \code{"Dm"}.}
+ \item{prior.prob}{numeric vector giving the prior probability that each gene in the universe appears in a gene set.}
+ \item{\dots}{other arguments are not currently used.}
+}
+
+\details{
+\code{goana} is an S3 generic function.
+The default method performs a Gene Ontology enrichment analysis for one for more gene lists using the appropriate Bioconductor organism package.
+The gene lists must be supplied as Entrez Gene IDs.
+
+If \code{prior.prob=NULL}, the function computes one-sided hypergeometric tests equivalent to Fisher's exact test.
+
+The \code{prior.prob} vector can be used to specify the prior probability that each gene in the universe appears in a gene set.
+If prior probabilities are specified, then a test based on the Wallenius' noncentral hypergeometric distribution is used to adjust for the relative probability that each gene will appear in a gene set, following the approach of Young et al (2010).
+}
+
+\value{
+A data frame with a row for each GO term and the following columns:
+ \item{Term}{GO term.}
+ \item{Ont}{ontology that the GO term belongs to. Possible values are \code{"BP"}, \code{"CC"} and \code{"MF"}.}
+ \item{N}{number of genes in the GO term.}
+ \item{DE1}{number of genes in the \code{DE1} set.}
+ \item{P.DE1}{p-value for over-representation of the GO term in the set.}
+The last two column names above assume one gene set with the name \code{DE1}.
+In general, there will be a pair of such columns for each gene set and the name of the set will appear in place of \code{"DE1"}.
+
+The row names of the data frame give the GO term IDs.
+}
+
+\references{
+ Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010).
+ Gene ontology analysis for RNA-seq: accounting for selection bias.
+ \emph{Genome Biology} 11, R14.
+ \url{http://genomebiology.com/2010/11/2/R14}
+}
+
+\seealso{
+\code{\link{goana.MArrayLM}}, \code{\link{topGO}}
+
+The goseq package implements a similar GO analysis.
+The goseq version will work with a variety of gene identifiers, not only Entrez Gene as here, and includes a database of gene length information for various species.
+
+The gostats package also does GO analyses with some different options.
+}
+
+\author{Gordon Smyth and Yifang Hu}
+
+\examples{
+\dontrun{
+
+go.de <- goana(list(DE1 = EG.DE1, DE2 = EG.DE2, DE3 = EG.DE3))
+topGO(go.de, sort = "DE1")
+topGO(go.de, sort = "DE2")
+topGO(go.de, ontology = "BP", sort = "DE3")
+topGO(go.de, ontology = "CC", sort = "DE3")
+topGO(go.de, ontology = "MF", sort = "DE3")
+}
+}
+
+\keyword{gene set test}
diff --git a/man/heatdiagram.Rd b/man/heatdiagram.Rd
index 0f39551..91e4640 100755
--- a/man/heatdiagram.Rd
+++ b/man/heatdiagram.Rd
@@ -8,10 +8,10 @@ Creates a heat diagram showing the co-regulation of genes under one condition wi
\usage{
heatDiagram(results, coef, primary=1, names=NULL, treatments=colnames(coef), limit=NULL,
orientation="landscape", low="green", high="red", cex=1, mar=NULL,
- ncolors=123, ...)
+ ncolors=123, \dots)
heatdiagram(stat, coef, primary=1, names=NULL, treatments=colnames(stat),
critical.primary=4, critical.other=3, limit=NULL, orientation="landscape",
- low="green", high="red", cex=1, mar=NULL, ncolors=123, ...)
+ low="green", high="red", cex=1, mar=NULL, ncolors=123, \dots)
}
\arguments{
\item{results}{\code{TestResults} matrix, containing elements -1, 0 or 1, from \code{\link{decideTests}}}
@@ -30,7 +30,7 @@ heatdiagram(stat, coef, primary=1, names=NULL, treatments=colnames(stat),
\item{cex}{factor to increase or decrease size of column and row text}
\item{mar}{numeric vector of length four giving the size of the margin widths.
Default is \code{cex*c(5,6,1,1)} for landscape and \code{cex*c(1,1,4,3)} for portrait.}
- \item{...}{any other arguments will be passed to the \code{image} function}
+ \item{\dots}{any other arguments will be passed to the \code{image} function}
}
\details{
Users are encouraged to use \code{heatDiagram} rather than \code{heatdiagram} as the later function may be removed in future versions of limma.
diff --git a/man/ids2indices.Rd b/man/ids2indices.Rd
new file mode 100644
index 0000000..8ca3074
--- /dev/null
+++ b/man/ids2indices.Rd
@@ -0,0 +1,46 @@
+\name{ids2indices}
+\alias{ids2indices}
+\title{Convert Gene Identifiers to Indices for Gene Sets}
+\description{
+Make a list of gene identifiers into a list of indices for gene sets.
+}
+\usage{
+ids2indices(gene.sets, identifiers, remove.empty=TRUE)
+}
+\arguments{
+ \item{gene.sets}{list of character vectors, each vector containing the gene identifiers for a set of genes.}
+ \item{identifiers}{character vector of gene identifiers.}
+ \item{remove.empty}{logical, should sets of size zero be removed from the output?}
+}
+\value{
+list of integer vectors, each vector containing the indices of a gene set in the vector \code{identifiers}.
+}
+\details{
+This function used to create input for \code{romer}, \code{mroast} and \code{camera} function.
+Typically, \code{identifiers} is the vector of Entrez Gene IDs, and \code{gene.sets} is obtained constructed from a database of gene sets,
+for example a representation of the Molecular Signatures Database (MSigDB) downloaded from \url{http://bioinf.wehi.edu.au/software/MSigDB}.
+}
+
+\seealso{
+\code{\link{romer}}, \code{\link{mroast}}, \code{\link{camera}}
+
+There is a topic page on \link{10.GeneSetTests}.
+}
+
+\examples{
+
+\dontrun{
+
+download.file("http://bioinf.wehi.edu.au/software/MSigDB/human_c2_v4.rdata",
+ "human_c2_v4.rdata", mode = "wb")
+
+load("human_c2_v4.rdata")
+c2.indices <- ids2indices(Hs.c2, y$genes$GeneID)
+camera(y, c2.indices, design)
+
+}
+
+}
+\author{Gordon Smyth and Yifang Hu}
+
+\keyword{gene set test}
diff --git a/man/imageplot.Rd b/man/imageplot.Rd
index 3ae6975..12e4e45 100755
--- a/man/imageplot.Rd
+++ b/man/imageplot.Rd
@@ -8,7 +8,7 @@ This function can be used to explore any spatial effects across the microarray.
}
\usage{
imageplot(z, layout, low = NULL, high = NULL, ncolors = 123, zerocenter = NULL,
-zlim = NULL, mar=c(2,1,1,1), legend=TRUE, ...)
+zlim = NULL, mar=c(2,1,1,1), legend=TRUE, \dots)
}
\arguments{
\item{z}{numeric vector or array. This vector can contain any spot
@@ -32,7 +32,7 @@ the interval \code{zlim} will be truncated to the relevant limit.}
\item{mar}{numeric vector of length 4 specifying the width of the margin around the plot.
This argument is passed to \code{\link[graphics]{par}}.}
\item{legend}{logical, if \code{TRUE} the range of \code{z} and \code{zlim} is shown in the bottom margin}
-\item{...}{any other arguments will be passed to the function image}
+\item{\dots}{any other arguments will be passed to the function image}
}
\details{
This function may be used to plot the values of any spot-specific statistic, such as the log intensity ratio, background intensity or a quality
diff --git a/man/imageplot3by2.Rd b/man/imageplot3by2.Rd
index 9902511..2765255 100755
--- a/man/imageplot3by2.Rd
+++ b/man/imageplot3by2.Rd
@@ -6,7 +6,7 @@ Write imageplots to files in PNG format, six plots to a file in a 3 by 2 grid ar
}
\usage{
imageplot3by2(RG, z="Gb", prefix=paste("image",z,sep="-"), path=NULL,
- zlim=NULL, common.lim=TRUE, ...)
+ zlim=NULL, common.lim=TRUE, \dots)
}
\arguments{
\item{RG}{an \code{RGList} or \code{MAList} object, or any list with component named by \code{z}}
@@ -15,7 +15,7 @@ imageplot3by2(RG, z="Gb", prefix=paste("image",z,sep="-"), path=NULL,
\item{path}{character string specifying directory for output files}
\item{zlim}{numeric vector of length 2, giving limits of response vector to be associated with saturated colors}
\item{common.lim}{logical, should all plots on a page use the same axis limits}
- \item{...}{any other arguments are passed to \code{imageplot}}
+ \item{\dots}{any other arguments are passed to \code{imageplot}}
}
\details{
diff --git a/man/kooperberg.Rd b/man/kooperberg.Rd
index 3b09b20..e5ed572 100755
--- a/man/kooperberg.Rd
+++ b/man/kooperberg.Rd
@@ -65,4 +65,4 @@ MA <- normalizeWithinArrays(RGmodel)
}
}
-\keyword{models}
+\keyword{background correction}
diff --git a/man/lmFit.Rd b/man/lmFit.Rd
index d8fb620..dd26fb9 100755
--- a/man/lmFit.Rd
+++ b/man/lmFit.Rd
@@ -4,10 +4,10 @@
\description{Fit linear model for each gene given a series of arrays}
\usage{
lmFit(object, design=NULL, ndups=1, spacing=1, block=NULL, correlation, weights=NULL,
- method="ls", ...)
+ method="ls", \dots)
}
\arguments{
- \item{object}{object of class \code{numeric}, \code{matrix}, \code{MAList}, \code{EList}, \code{marrayNorm}, \code{ExpressionSet} or \code{PLMset} containing log-ratios or log-values of expression for a series of microarrays}
+ \item{object}{any data object that can be processed by \code{\link{getEAWP}} containing log-ratios or log-values of expression for a series of microarrays.}
\item{design}{the design matrix of the microarray experiment, with rows corresponding to arrays and columns to coefficients to be estimated. Defaults to the unit vector meaning that the arrays are treated as replicates.}
\item{ndups}{positive integer giving the number of times each distinct probe is printed on each array.}
\item{spacing}{positive integer giving the spacing between duplicate occurrences of the same probe, \code{spacing=1} for consecutive rows.}
@@ -15,7 +15,7 @@ lmFit(object, design=NULL, ndups=1, spacing=1, block=NULL, correlation, weights=
\item{correlation}{the inter-duplicate or inter-technical replicate correlation}
\item{weights}{non-negative observation weights. Can be a numeric matrix of individual weights, of same size as the object expression matrix, or a numeric vector of array weights with length equal to \code{ncol} of the expression matrix, or a numeric vector of gene weights with length equal to \code{nrow} of the expression matrix.}
\item{method}{fitting method; \code{"ls"} for least squares or \code{"robust"} for robust regression}
- \item{...}{other optional arguments to be passed to \code{lm.series}, \code{gls.series} or \code{mrlm}}
+ \item{\dots}{other optional arguments to be passed to \code{lm.series}, \code{gls.series} or \code{mrlm}}
}
\value{
diff --git a/man/loessfit.Rd b/man/loessfit.Rd
index 65e0dbb..3f7473d 100755
--- a/man/loessfit.Rd
+++ b/man/loessfit.Rd
@@ -39,7 +39,7 @@ When unequal \code{weights} are provided, this function calls \code{weightedLowe
\code{weightedLowess} implements a similar algorithm to \code{lowess} except that it uses the prior weights both in the local regressions and in determining which other observations to include in the local neighbourhood of each observation.
Two alternative algorithms for weighted lowess curve fitting are provided as options.
-If \code{method="loess"}, then a call is made to \code{loess(y~x,weights=weights,span=span,degree=1,family="symmetric",...)}.
+If \code{method="loess"}, then a call is made to \code{loess(y~x,weights=weights,span=span,degree=1,family="symmetric",\dots)}.
This method differs from \code{weightedLowess} in that the prior weights are ignored when determining the neighbourhood of each observation.
If \code{method="locfit"}, then repeated calls are made to \code{locfit:::locfit.raw} with \code{deg=1}.
diff --git a/man/ma3x3.Rd b/man/ma3x3.Rd
index f898fd7..9f445b5 100755
--- a/man/ma3x3.Rd
+++ b/man/ma3x3.Rd
@@ -6,14 +6,14 @@
Apply a specified function to each to each value of a matrix and its immediate neighbors.
}
\usage{
-ma3x3.matrix(x,FUN=mean,na.rm=TRUE,...)
-ma3x3.spottedarray(x,printer,FUN=mean,na.rm=TRUE,...)
+ma3x3.matrix(x,FUN=mean,na.rm=TRUE,\dots)
+ma3x3.spottedarray(x,printer,FUN=mean,na.rm=TRUE,\dots)
}
\arguments{
\item{x}{numeric matrix}
\item{FUN}{function to apply to each window of values}
\item{na.rm}{logical value, should missing values be removed when applying \code{FUN}}
- \item{...}{other arguments are passed to \code{FUN}}
+ \item{\dots}{other arguments are passed to \code{FUN}}
\item{printer}{list giving the printer layout, see \code{\link{PrintLayout-class}}}
}
\details{
@@ -33,4 +33,5 @@ ma3x3.matrix(x,FUN="mean")
ma3x3.matrix(x,FUN="min")
}
\author{Gordon Smyth}
-\keyword{smooth}
+
+\keyword{background correction}
diff --git a/man/mdplot.Rd b/man/mdplot.Rd
index bb66f77..cedcf46 100644
--- a/man/mdplot.Rd
+++ b/man/mdplot.Rd
@@ -5,23 +5,26 @@
Creates a mean-difference plot.
}
\usage{
-mdplot(x, ...)
+mdplot(x, xlab="Mean", ylab="Difference", \dots)
}
\arguments{
- \item{x}{numeric \code{matrix} with at least two columns}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{x}{numeric \code{matrix} with at least two columns.}
+ \item{xlab}{label for the x-axis.}
+ \item{ylab}{label for the y-axis.}
+ \item{\dots}{any other arguments are passed to \code{\link{plotWithHighlights}}.}
}
\details{
Plots differences vs means for a set of bivariate values.
This is useful to contrast expression values for two microarrays.
-Note that an MA-plot \code{\link{plotMA}} is a type of mean-difference plot.
+An MA-plot \code{\link{plotMA}} is a type of mean-difference plot.
}
\value{A plot is created on the current graphics device.}
-\references{Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983). Graphical Methods of Data Analysis. Wadsworth (pp. 48-57).
+\references{
+Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983). Graphical Methods of Data Analysis. Wadsworth (pp. 48-57).
Cleveland, W. S., (1993). Visualizing Data. Hobart Press.
@@ -31,6 +34,8 @@ See also \url{http://www.statsci.org/micrarra/refs/maplots.html}
}
\author{Gordon Smyth}
\seealso{
+\code{\link{plotMA}}, \code{\link{plotWithHighlights}}
+
An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
}
\keyword{hplot}
diff --git a/man/mrlm.Rd b/man/mrlm.Rd
index 5a4b201..a32a294 100644
--- a/man/mrlm.Rd
+++ b/man/mrlm.Rd
@@ -5,7 +5,7 @@
The fit is by robust M-estimation allowing for a small proportion of outliers.
This is a utility function for \code{lmFit}.}
\usage{
-mrlm(M,design=NULL,ndups=1,spacing=1,weights=NULL,...)
+mrlm(M,design=NULL,ndups=1,spacing=1,weights=NULL,\dots)
}
\arguments{
\item{M}{numeric matrix containing log-ratio or log-expression values for a series of microarrays, rows correspond to genes and columns to arrays.}
@@ -13,7 +13,7 @@ mrlm(M,design=NULL,ndups=1,spacing=1,weights=NULL,...)
\item{ndups}{a positive integer giving the number of times each gene is printed on an array. \code{nrow(M)} must be divisible by \code{ndups}.}
\item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots.}
\item{weights}{numeric matrix of the same dimension as \code{M} containing weights. If it is of different dimension to \code{M}, it will be filled out to the same size. \code{NULL} is equivalent to equal weights.}
- \item{...}{any other arguments are passed to \code{rlm.default}.}
+ \item{\dots}{any other arguments are passed to \code{rlm.default}.}
}
\value{
A list with components
diff --git a/man/nec.Rd b/man/nec.Rd
index d746d71..39ea504 100644
--- a/man/nec.Rd
+++ b/man/nec.Rd
@@ -2,12 +2,13 @@
\alias{nec}
\alias{neqc}
\title{NormExp Background Correction and Normalization Using Control Probes}
-\description{Perform normexp background correction using negative control probes and quantile normalization using negative and positive control probes.}
+\description{Perform normexp background correction using negative control probes and quantile normalization using negative and positive control probes.
+Particularly useful for Illumina BeadChips.}
\usage{
nec(x, status=NULL, negctrl="negative", regular="regular", offset=16,
robust=FALSE, detection.p="Detection")
neqc(x, status=NULL, negctrl="negative", regular="regular", offset=16,
- robust=FALSE, detection.p="Detection", ...)
+ robust=FALSE, detection.p="Detection", \dots)
}
\arguments{
\item{x}{object of class \code{EListRaw} or \code{matrix} containing raw intensities for regular and control probes from a series of microarrays.}
@@ -17,7 +18,7 @@ neqc(x, status=NULL, negctrl="negative", regular="regular", offset=16,
\item{offset}{numeric value added to the intensities after background correction.}
\item{robust}{logical. Should robust estimators be used for the background mean and standard deviation?}
\item{detection.p}{dection p-values. Only used when no negative control probes can be found in the data. Can be a numeric matrix or a character string giving the name of the component of \code{x$other} containing the matrix.}
- \item{...}{any other arguments are passed to \code{normalizeBetweenArrays.}}
+ \item{\dots}{any other arguments are passed to \code{normalizeBetweenArrays.}}
}
\details{
\code{neqc} performs background correction followed by quantile normalization, using negative control probes for background correction and both negative and positive controls for normalization (Shi et al, 2010).
@@ -82,4 +83,5 @@ yr <- neqc(xr)
}
}
-\keyword{models}
+\keyword{background correction}
+\keyword{illumina beadchips}
diff --git a/man/normalizeCyclicLoess.Rd b/man/normalizeCyclicLoess.Rd
index dda429e..6c5e430 100644
--- a/man/normalizeCyclicLoess.Rd
+++ b/man/normalizeCyclicLoess.Rd
@@ -55,4 +55,4 @@ An overview of LIMMA functions for normalization is given in \link{05.Normalizat
\link[affy]{normalize.loess} in the affy package also implements cyclic loess normalization, without weights.
}
-\keyword{models}
+\keyword{normalization}
diff --git a/man/normalizeMedianAbsValues.Rd b/man/normalizeMedianAbsValues.Rd
index b1fe191..8e3c35f 100644
--- a/man/normalizeMedianAbsValues.Rd
+++ b/man/normalizeMedianAbsValues.Rd
@@ -30,4 +30,4 @@ Here the median-absolute value is used for preference to as to not re-center the
M <- cbind(Array1=rnorm(10),Array2=2*rnorm(10))
normalizeMedianAbsValues(M)
}
-\keyword{array}
+\keyword{normalization}
diff --git a/man/normalizeRobustSpline.Rd b/man/normalizeRobustSpline.Rd
index afa96ed..640f4fa 100755
--- a/man/normalizeRobustSpline.Rd
+++ b/man/normalizeRobustSpline.Rd
@@ -50,4 +50,5 @@ normalized.M <- normalizeRobustSpline(M,A)
# Usual usage
\dontrun{MA <- normalizeWithinArrays(RG, method="robustspline")}
}
-\keyword{models}
+
+\keyword{normalization}
diff --git a/man/normalizeVSN.Rd b/man/normalizeVSN.Rd
index 03803c9..8309ae8 100644
--- a/man/normalizeVSN.Rd
+++ b/man/normalizeVSN.Rd
@@ -10,12 +10,12 @@ Apply variance stabilizing normalization (vsn) to limma data objects.
}
\usage{
-normalizeVSN(x, ...)
+normalizeVSN(x, \dots)
}
\arguments{
\item{x}{a numeric \code{matrix}, \code{EListRaw} or \code{\link[limma:rglist]{RGList}} object.}
- \item{...}{other arguments are passed to \code{vsn}}
+ \item{\dots}{other arguments are passed to \code{vsn}}
}
\details{
diff --git a/man/normalizeWithinArrays.Rd b/man/normalizeWithinArrays.Rd
index 7bb879e..dcb8e98 100755
--- a/man/normalizeWithinArrays.Rd
+++ b/man/normalizeWithinArrays.Rd
@@ -84,4 +84,4 @@ The original loess normalization function was the \code{statma} funtion in the s
A different implementation of loess normalization methods, with potentially different behavior, is provided by the \code{\link[marray:maNorm]{maNorm}} in the marray package.
}
-\keyword{models}
+\keyword{normalization}
diff --git a/man/normalizebetweenarrays.Rd b/man/normalizebetweenarrays.Rd
index 1479cf9..46f18e5 100755
--- a/man/normalizebetweenarrays.Rd
+++ b/man/normalizebetweenarrays.Rd
@@ -7,7 +7,7 @@ Normalizes expression intensities so that the intensities or log-ratios have sim
}
\usage{
-normalizeBetweenArrays(object, method=NULL, targets=NULL, cyclic.method="fast", ...)
+normalizeBetweenArrays(object, method=NULL, targets=NULL, cyclic.method="fast", \dots)
}
\arguments{
@@ -20,7 +20,7 @@ normalizeBetweenArrays(object, method=NULL, targets=NULL, cyclic.method="fast",
The default is \code{"Aquantile"} for two-color data objects or \code{"quantile"} for single-channel objects.}
\item{targets}{vector, factor or matrix of length twice the number of arrays, used to indicate target groups if \code{method="Tquantile"}}
\item{cyclic.method}{character string indicating the variant of \code{normalizeCyclicLoess} to be used if \code{method=="cyclicloess"}, see \code{\link{normalizeCyclicLoess}} for possible values.}
- \item{...}{other arguments are passed to \code{normalizeQuantiles} or \code{normalizeCyclicLoess}}
+ \item{\dots}{other arguments are passed to \code{normalizeQuantiles} or \code{normalizeCyclicLoess}}
}
\details{
@@ -100,5 +100,4 @@ x <- matrix(rnorm(ngenes*narrays),100,4)
y <- normalizeBetweenArrays(x)
}
-\keyword{models}
-\keyword{multivariate}
+\keyword{normalization}
diff --git a/man/normalizeprintorder.Rd b/man/normalizeprintorder.Rd
index a8b5bce..058552f 100755
--- a/man/normalizeprintorder.Rd
+++ b/man/normalizeprintorder.Rd
@@ -75,4 +75,4 @@ plotPrintorder(RG,layout,slide=1,separate=TRUE)
RG <- normalizeForPrintorder(mouse.data,mouse.setup)
}
}
-\keyword{models}
+\keyword{normalization}
diff --git a/man/normalizequantiles.Rd b/man/normalizequantiles.Rd
index 8dc6c11..246e2a8 100755
--- a/man/normalizequantiles.Rd
+++ b/man/normalizequantiles.Rd
@@ -30,5 +30,6 @@ Bolstad, B. M., Irizarry R. A., Astrand, M., and Speed, T. P. (2003), A comparis
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for normalization is given in \link{05.Normalization}.
-}
-\keyword{models}
+}
+
+\keyword{normalization}
diff --git a/man/plot.Rd b/man/plot.Rd
index 3e7b953..19825fa 100644
--- a/man/plot.Rd
+++ b/man/plot.Rd
@@ -4,17 +4,23 @@
\alias{plot.MAList}
\alias{plot.EList}
\alias{plot.MArrayLM}
+\alias{plotWithHighlights}
\description{
-Plot methods for data or model fit objects.
+Plot data or model fit objects.
These represent the object by creating an MA-plot, with optional color coding for control spots.
}
\usage{
-\method{plot}{MAList}(x, y, array = 1, xlab= "A", ylab= "M", main = colnames(x)[array],
- xlim = NULL, ylim = NULL, status, values, pch, col, cex, legend = TRUE,
- zero.weights = FALSE, \dots)
-\method{plot}{MArrayLM}(x, y, coef = ncol(x), xlab="AveExpr", ylab="logFC",
- main = colnames(x)[coef], xlim=NULL, ylim=NULL, status, values, pch, col, cex,
- legend = TRUE, zero.weights = FALSE, \dots)
+plotWithHighlights(x, y, status = NULL, values = NULL, pch = 16, col = NULL, cex = 1,
+ legend = "topleft", pch.bg = 16, col.bg = "black", cex.bg = 0.3, \dots)
+\method{plot}{EList}(x, y, array = 1, xlab = "Average log-expression",
+ ylab = "Expression log-ratio (this sample vs others)", main = colnames(x)[array],
+ status=x$genes$Status, zero.weights = FALSE, \dots)
+\method{plot}{RGList}(x, y, array = 1, xlab = "A", ylab = "M", main = colnames(x)[array],
+ status=x$genes$Status, zero.weights = FALSE, \dots)
+\method{plot}{MAList}(x, y, array = 1, xlab = "A", ylab = "M", main = colnames(x)[array],
+ status=x$genes$Status, zero.weights = FALSE, \dots)
+\method{plot}{MArrayLM}(x, y, coef = ncol(x), xlab = "Average log-expression", ylab = "log-fold-change",
+ main = colnames(x)[coef], status = x$genes$Status, zero.weights = FALSE, \dots)
}
\arguments{
\item{x}{an \code{RGList}, \code{MAList}, \code{EList} or \code{MArrayLM} object.}
@@ -24,22 +30,28 @@ These represent the object by creating an MA-plot, with optional color coding fo
\item{xlab}{character string giving label for x-axis}
\item{ylab}{character string giving label for y-axis}
\item{main}{character string giving title for plot}
- \item{xlim}{numeric vector of length 2 giving limits for x-axis, defaults to min and max of the data}
- \item{ylim}{numeric vector of length 2 giving limits for y-axis, defaults to min and max of the data}
\item{status}{character vector giving the control status of each spot on the array, of same length as the number of rows of \code{MA$M}.
- If omitted, all points are plotted in the default color, symbol and size.}
- \item{values}{character vector giving values of \code{status} to be highlighted on the plot. Defaults to unique values of \code{status}.
+ If \code{NULL}, then all points are plotted in the default color, symbol and size.}
+ \item{values}{character vector giving values of \code{status} to be highlighted on the plot.
+ Defaults to unique values of \code{status} in decreasing order of frequency, with the most frequent value set as the background value.
Ignored if there is no \code{status} vector.}
- \item{pch}{vector or list of plotting characters. Default is integer code 16 which gives a solid circle.
+ \item{pch}{vector or list of plotting characters.
Ignored is there is no \code{status} vector.}
- \item{col}{numeric or character vector of colors, of the same length as \code{values}. Defaults to \code{1:length(values)}.
+ \item{col}{numeric or character vector of colors, of the same length as \code{values}.
+ Defaults to \code{1+1:length(values)}.
Ignored if there is no \code{status} vector.}
- \item{cex}{numeric vector of plot symbol expansions, of the the same length as \code{values}.
- Defaults to 0.3 for the most common status value and 1 for the others.
+ \item{cex}{numeric vector of plot symbol expansions.
+ If a vector, then of the same length as \code{values}.
+ Ignored if there is no \code{status} vector.}
+ \item{legend}{character string giving position to place legend.
+ See \code{\link{legend}} for possible values.
+ Can also be logical, with \code{FALSE} meaning no legend.
Ignored if there is no \code{status} vector.}
- \item{legend}{logical, should a legend of plotting symbols and colors be included. Can also be a character string giving position to place legend. Ignored if there is no \code{status} vector.}
\item{zero.weights}{logical, should spots with zero or negative weights be plotted?}
- \item{\dots}{any other arguments are passed to \code{plot.default}}
+ \item{pch.bg}{plotting character for background (non-highlighted) points.}
+ \item{col.bg}{color for background (non-highlighted) points.}
+ \item{cex.bg}{plot symbol expansion for background (non-highlighted) points.}
+ \item{\dots}{The \code{plot} methods pass other arguments to \code{plotWithHighlights}, and \code{plotWithHighlights} passes other arguments to \code{plot.default}.}
}
\details{
@@ -48,12 +60,12 @@ If \code{x} is an \code{RGList} or \code{MAList} then this function produces an
If \code{x} is an \code{MArrayLM} object, then the plot is an fitted model MA-plot in which the estimated coefficient is on the y-axis and the average A-value is on the x-axis.
If \code{x} is a \code{EList} object, then this function produces a between-array MA-plot.
-In this case the A-values in the plot are the average log-intensities across the arrays and the M-values are the deviations of the log-intensities for the specified array from the average.
-If there are more than five arays, then the average is computed robustly using medians.
-With five or fewer arrays, it is computed by means.
+An articifial array is produced by averaging all the arrays other than the array specified.
+A mean-difference plot is then producing from the specified array and the artificial array.
+Note that this procedure reduces to an ordinary mean-difference plot when there are just two arrays total.
-The \code{status} vector is intended to specify the control status of each spot, for example "gene", "ratio control", "house keeping gene", "buffer" and so on.
-The vector is usually computed using the function \code{\link{controlStatus}} and a spot-types file.
+The \code{status} vector is intended to specify the control status of each spot, for example \code{"gene"}, \code{"ratio control"}, \code{"house keeping gene"}, \code{"buffer"} and so on.
+The vector is often computed using the function \code{\link{controlStatus}} and a spot-types file.
However the function may be used to highlight any subset of spots.
The \code{status} can be included as the component \code{x$genes$Status} instead of being passed as an argument to \code{plot}.
@@ -65,31 +77,16 @@ See \code{\link[graphics]{points}} for possible values for \code{pch}, \code{col
\value{A plot is created on the current graphics device.}
\references{See \url{http://www.statsci.org/micrarra/refs/maplots.html}}
\author{Gordon Smyth}
-\examples{
-# See also lmFit example
-
-MA <- new("MAList")
-MA$A <- runif(300,4,16)
-MA$M <- rt(300,df=3)
-status <- rep("Gene",300)
-status[1:3] <- "M=0"
-MA$M[1:3] <- 0
-status[4:6] <- "M=3"
-MA$M[4:6] <- 3
-status[7:9] <- "M=-3"
-MA$M[7:9] <- -3
-plot(MA,main="MA-Plot with Simulated Data",status=status,
- values=c("M=0","M=3","M=-3"),col=c("blue","red","green"))
-# Same as above
-attr(status,"values") <- c("M=0","M=3","M=-3")
-attr(status,"col") <- c("blue","red","green")
-plot(MA,main="MA-Plot with Simulated Data",status=status)
-
-# Same as above
-MA$genes$Status <- status
-plot(MA,main="MA-Plot with Simulated Data")
+\examples{
+A <- runif(1000,4,16)
+y <- A + matrix(rnorm(1000*3,sd=0.2),1000,3)
+status <- rep(c(0,-1,1),c(950,40,10))
+y[,1] <- y[,1] + status
+E <- new("EList",list(E=y))
+plot(E,array=1,status=status,values=c(-1,1),col=c("blue","red"))
}
+
\seealso{
\code{\link[limma:plotma]{plotMA}}, \code{\link{plotFB}}, \code{\link{plotMDS}}, \code{\link{plotSA}}
diff --git a/man/plotDensities.Rd b/man/plotDensities.Rd
index f470a32..dbec96b 100755
--- a/man/plotDensities.Rd
+++ b/man/plotDensities.Rd
@@ -10,24 +10,22 @@
Plot the density of expression values for multiple arrays on the same plot.
}
\usage{
-\method{plotDensities}{RGList}(object, log=TRUE, group=NULL, col=NULL, main="RG Densities",...)
-\method{plotDensities}{MAList}(object, log=TRUE, group=NULL, col=NULL, main="RG Densities",...)
-\method{plotDensities}{EListRaw}(object, log=TRUE, group=NULL, col=NULL, main=NULL,...)
-\method{plotDensities}{EList}(object, log=TRUE, group=NULL, col=NULL, main=NULL,...)
-\method{plotDensities}{default}(object, group=NULL, col=NULL, main=NULL,...)
+\method{plotDensities}{RGList}(object, log=TRUE, group=NULL, col=NULL, main="RG Densities",
+ bc.method="subtract", \dots)
+\method{plotDensities}{MAList}(object, log=TRUE, group=NULL, col=NULL, main="RG Densities", \dots)
+\method{plotDensities}{EListRaw}(object, log=TRUE, group=NULL, col=NULL, main=NULL,
+ bc.method="subtract", \dots)
+\method{plotDensities}{EList}(object, log=TRUE, group=NULL, col=NULL, main=NULL, \dots)
+\method{plotDensities}{default}(object, group=NULL, col=NULL, main=NULL, \dots)
}
\arguments{
\item{object}{an \code{RGList}, \code{MAList}, \code{EListRaw} or \code{EList} object containing expression data. Or any data object that can be coerced to a matrix.}
-
\item{log}{logical, should densities be plotted on the log2 scale?}
-
\item{group}{optional vector or factor classifying the arrays into groups. Should be same length as \code{ncol(object)}.}
-
\item{col}{optional vector of colors of the same length as the number of groups.}
-
\item{main}{the main title for the plot.}
-
- \item{\ldots}{Not currently used.}
+ \item{bc.method}{background subtraction method passed to \code{\link{backgroundCorrect}}.}
+ \item{\ldots}{other arguments are passed to \code{\link{density}}.}
}
\details{
diff --git a/man/plotExons.Rd b/man/plotExons.Rd
new file mode 100644
index 0000000..2478dd2
--- /dev/null
+++ b/man/plotExons.Rd
@@ -0,0 +1,48 @@
+\title{Plot exons of differentially expressed gene}
+\name{plotExons}
+\alias{plotExons}
+\description{
+Plot exons of differentially expressed gene and mark the differentially expressed exons.
+}
+\usage{
+plotExons(fit, coef = ncol(fit), geneid = NULL, genecolname = "GeneID",
+ exoncolname = NULL, rank = 1L, FDR = 0.05)
+}
+\arguments{
+ \item{fit}{\code{MArrayLM} fit object produced by \code{eBayes}.}
+ \item{coef}{the coefficient (column) of fit for which differential expression is assessed.}
+ \item{geneid}{character string, ID of the gene to plot.}
+ \item{genecolname}{character string for the column name of \code{fit$genes} containing gene IDs. Defaults to "GeneID" for Entrez Gene ID.}
+ \item{exoncolname}{character string for the column name of \code{fit$genes} containing exon IDs.}
+ \item{rank}{integer, if \code{geneid=NULL} then this ranked gene will be plotted.}
+ \item{FDR}{numeric, mark differentially expressed exons with false discovery rate less than this cutoff.}
+}
+
+\details{
+Plots log2-fold-change by exon for the specified gene and highlight the differentially expressed exons.
+Show annotations such as GeneID, Symbol and Strand if available as title for the gene to plot.
+The significantly differentially expressed individual exons are highlighted as red dots for up-regulation and as blue dots for down-regulation.
+The size of the dots are weighted by its significance.
+}
+
+\value{A plot is created on the current graphics device.}
+\author{Yifang Hu and Gordon Smyth}
+\seealso{
+\code{\link{lmFit}}, \code{\link{eBayes}}, \code{\link{plotSplice}}
+
+An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
+}
+\examples{
+
+\dontrun{
+fit <- lmFit(y,design)
+fit <- eBayes(fit)
+plotExons(fit)
+plotExons(fit, exoncolname = "Start", rank = 1)
+plotExons(fit, geneid = "ps", genecolname = "Symbol", exoncolname = "Start")
+}
+
+}
+
+\keyword{hplot}
+\keyword{rna-seq}
diff --git a/man/plotFB.Rd b/man/plotFB.Rd
index 301e1c6..44265be 100644
--- a/man/plotFB.Rd
+++ b/man/plotFB.Rd
@@ -8,8 +8,8 @@
Creates foreground-background plots.
}
\usage{
-\method{plotFB}{RGList}(x, array=1, lim="separate", pch=16, cex=0.2, ...)
-\method{plotFB}{EListRaw}(x, array=1, pch=16, cex=0.2, ...)
+\method{plotFB}{RGList}(x, array=1, lim="separate", pch=16, cex=0.2, \dots)
+\method{plotFB}{EListRaw}(x, array=1, pch=16, cex=0.2, \dots)
}
\arguments{
\item{x}{an \code{RGList} or \code{EListRaw} object.}
@@ -17,7 +17,7 @@ Creates foreground-background plots.
\item{lim}{character string indicating whether the red and green plots should have \code{"separate"} or \code{"common"} x- and y- co-ordinate limits.}
\item{pch}{vector or list of plotting characters. Defaults to integer code 16.}
\item{cex}{numeric vector of plot symbol expansions.}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{\dots}{any other arguments are passed to \code{plot}}
}
\details{
diff --git a/man/plotRLDF.Rd b/man/plotRLDF.Rd
index ceead36..a280d06 100644
--- a/man/plotRLDF.Rd
+++ b/man/plotRLDF.Rd
@@ -6,7 +6,7 @@ Plot of regularized linear discriminant functions for microarray data.
}
\usage{
plotRLDF(y,design=NULL,z=NULL,labels.y=NULL,labels.z=NULL,col.y=1,col.z=1,
-df.prior=5,show.dimensions=c(1,2),main=NULL,nprobes=500,...)}
+df.prior=5,show.dimensions=c(1,2),main=NULL,nprobes=500,\dots)}
\arguments{
\item{y}{any data object which can be coerced to a matrix, such as \code{ExpressionSet} or \code{EList}. The training dataset.}
\item{z}{any data object which can be coerced to a matrix, such as \code{ExpressionSet} or \code{EList}. The dataset to be classified.}
@@ -19,7 +19,7 @@ df.prior=5,show.dimensions=c(1,2),main=NULL,nprobes=500,...)}
\item{show.dimensions}{which two dimensions should be plotted, numeric vector of length two.}
\item{main}{title of the plot.}
\item{nprobes}{number of probes to be used for the calculations. Selected by moderated F tests.}
- \item{...}{any other arguments are passed to \code{plot}.}
+ \item{\dots}{any other arguments are passed to \code{plot}.}
}
\details{
This function is a variation on the plot of usual linear discriminant fuction, in that the within-group covariance matrix is regularized to ensure that it is invertible, with eigenvalues bounded away from zero.
diff --git a/man/plotSA.Rd b/man/plotSA.Rd
index f75a3be..552cd04 100755
--- a/man/plotSA.Rd
+++ b/man/plotSA.Rd
@@ -6,7 +6,7 @@ Plot log residual standard deviation versus average log expression for a fitted
}
\usage{
plotSA(fit, xlab="Average log-expression", ylab="log2(sigma)",
- zero.weights=FALSE, pch=16, cex=0.2, ...)
+ zero.weights=FALSE, pch=16, cex=0.2, \dots)
}
\arguments{
\item{fit}{an \code{MArrayLM} object.}
@@ -16,7 +16,7 @@ plotSA(fit, xlab="Average log-expression", ylab="log2(sigma)",
\item{cex}{numeric expansion factor for plotting character.
Defaults to 0.2.}
\item{zero.weights}{logical, should spots with zero or negative weights be plotted?}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{\dots}{any other arguments are passed to \code{plot}}
}
\details{
diff --git a/man/plotSplice.Rd b/man/plotSplice.Rd
index 7be592e..3dcb346 100644
--- a/man/plotSplice.Rd
+++ b/man/plotSplice.Rd
@@ -1,22 +1,25 @@
-\title{Plot exons on differentially spliced gene}
+\title{Differential splicing plot}
\name{plotSplice}
\alias{plotSplice}
\description{
-Plot exons of differentially spliced gene.
+Plot relative log-fold changes by exons for the specified gene and highlight the significantly spliced exons.
}
\usage{
-plotSplice(fit, coef=ncol(fit), geneid=NULL, rank=1L, FDR = 0.05)
+plotSplice(fit, coef=ncol(fit), geneid=NULL, genecolname=NULL, rank=1L, FDR = 0.05)
}
\arguments{
\item{fit}{\code{MArrayLM} fit object produced by \code{diffSplice}.}
\item{coef}{the coefficient (column) of fit for which differentially splicing is assessed.}
\item{geneid}{character string, ID of the gene to plot.}
+ \item{genecolname}{column name of \code{fit$genes} containing gene IDs. Defaults to \code{fit$genecolname}.}
\item{rank}{integer, if \code{geneid=NULL} then this ranked gene will be plotted.}
- \item{FDR}{numeric, mark exons with false discovery rate less than this cutoff.}
+ \item{FDR}{numeric, highlight exons as red dots with false discovery rate less than this cutoff. The FDR of the individual exon is calculated based on the exon-level t-statistics test for differences between each exon and all other exons for the same gene.}
}
\details{
-Plots interaction log2-fold-change by exon for the specified gene.
+Plot relative log2-fold-changes by exon for the specified gene.
+The relative logFC is the difference between the exon's logFC and the overall logFC for the gene, as computed by \code{diffSplice}.
+The significantly spliced individual exons are highlighted as red dots. The size of the red dots are weighted by its significance.
}
\value{A plot is created on the current graphics device.}
@@ -27,3 +30,4 @@ Plots interaction log2-fold-change by exon for the specified gene.
An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
}
\examples{# See diffSplice}
+\keyword{rna-seq}
diff --git a/man/plotlines.Rd b/man/plotlines.Rd
index c60b1f8..766fa29 100644
--- a/man/plotlines.Rd
+++ b/man/plotlines.Rd
@@ -5,7 +5,7 @@
Time course style plot of expression data.
}
\usage{
-plotlines(x,first.column.origin=FALSE,xlab="Column",ylab="x",col="black",lwd=1,...)
+plotlines(x,first.column.origin=FALSE,xlab="Column",ylab="x",col="black",lwd=1,\dots)
}
\arguments{
\item{x}{numeric matrix or object containing expression data.}
@@ -14,7 +14,7 @@ plotlines(x,first.column.origin=FALSE,xlab="Column",ylab="x",col="black",lwd=1,.
\item{ylab}{y-axis label}
\item{col}{vector of colors for lines}
\item{lwd}{line width multiplier}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{\dots}{any other arguments are passed to \code{plot}}
}
\details{
diff --git a/man/plotma.Rd b/man/plotma.Rd
index f8924d2..139a609 100755
--- a/man/plotma.Rd
+++ b/man/plotma.Rd
@@ -6,90 +6,106 @@
\alias{plotMA.EList}
\alias{plotMA.MArrayLM}
\alias{plotMA.default}
+
\description{
Creates an MA-plot with color coding for control spots.
}
+
\usage{
-\method{plotMA}{default}(MA, array = 1, xlab = "A", ylab = "M", main = colnames(MA)[array],
- xlim = NULL, ylim = NULL, status, values, pch, col, cex, legend = TRUE,
- zero.weights = FALSE, ...)
-\method{plotMA}{MArrayLM}(MA, coef = ncol(MA), xlab = "AveExpr", ylab = "logFC",
- main = colnames(MA)[coef], xlim = NULL, ylim = NULL, status, values, pch, col,
- cex, legend = TRUE, zero.weights = FALSE, ...)
+\method{plotMA}{default}(object, array = 1, xlab = "Average log-expression",
+ ylab = "Expression log-ratio (this sample vs others)",
+ main = colnames(object)[array], status=NULL, \dots)
+\method{plotMA}{EList}(object, array = 1, xlab = "Average log-expression",
+ ylab = "Expression log-ratio (this sample vs others)",
+ main = colnames(object)[array], status=object$genes$Status,
+ zero.weights = FALSE, \dots)
+\method{plotMA}{RGList}(object, array = 1, xlab = "A", ylab = "M",
+ main = colnames(object)[array], status=object$genes$Status,
+ zero.weights = FALSE, \dots)
+\method{plotMA}{MAList}(object, array = 1, xlab = "A", ylab = "M",
+ main = colnames(object)[array], status=object$genes$Status,
+ zero.weights = FALSE, \dots)
+\method{plotMA}{MArrayLM}(object, coef = ncol(object), xlab = "Average log-expression",
+ ylab = "log-fold-change", main = colnames(object)[coef],
+ status=object$genes$Status, zero.weights = FALSE, \dots)
}
+
\arguments{
- \item{MA}{an \code{RGList}, \code{MAList}, \code{EList} or \code{MArrayLM} object.
- Alternatively a \code{matrix} or \code{ExpressionSet} object.}
+ \item{object}{an \code{RGList}, \code{MAList}, \code{EList}, \code{ExpressionSet} or \code{MArrayLM} object.
+ Alternatively a numeric \code{matrix}.}
\item{array}{integer giving the array to be plotted.}
\item{coef}{integer giving the linear model coefficient to be plotted.}
\item{xlab}{character string giving label for x-axis}
\item{ylab}{character string giving label for y-axis}
\item{main}{character string giving title for plot}
- \item{xlim}{numeric vector of length 2 giving limits for x-axis, defaults to min and max of the data}
- \item{ylim}{numeric vector of length 2 giving limits for y-axis, defaults to min and max of the data}
- \item{status}{character vector giving the control status of each spot on the array, of same length as the number of rows of \code{MA$M}.
- If omitted, all points are plotted in the default color, symbol and size.}
- \item{values}{character vector giving values of \code{status} to be highlighted on the plot. Defaults to unique values of \code{status}.
- Ignored if there is no \code{status} vector.}
- \item{pch}{vector or list of plotting characters. Default is integer code 16 which gives a solid circle.
- Ignored is there is no \code{status} vector.}
- \item{col}{numeric or character vector of colors, of the same length as \code{values}. Defaults to \code{1:length(values)}.
- Ignored if there is no \code{status} vector.}
- \item{cex}{numeric vector of plot symbol expansions, of the the same length as \code{values}.
- Defaults to 0.3 for the most common status value and 1 for the others.
- Ignored if there is no \code{status} vector.}
- \item{legend}{logical, should a legend of plotting symbols and colors be included. Can also be a character string giving position to place legend. Ignored if there is no \code{status} vector.}
+ \item{status}{vector giving the control status of each spot on the array, of same length as the number of rows of \code{object}.
+ If \code{NULL}, then all points are plotted in the default color, symbol and size.}
\item{zero.weights}{logical, should spots with zero or negative weights be plotted?}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{\dots}{other arguments are passed to \code{\link{plotWithHighlights}}.}
}
\details{
An MA-plot is a plot of log-intensity ratios (M-values) versus log-intensity averages (A-values).
-If \code{MA} is an \code{RGList} or \code{MAList} then this function produces an ordinary within-array MA-plot.
-If \code{MA} is an \code{MArrayLM} object, then the plot is an fitted model MA-plot in which the estimated coefficient is on the y-axis and the average A-value is on the x-axis.
+For two color data objects, a within-array MA-plot is produced with the M and A values computed from the two channels for the specified array.
+This is the same as a mean-difference plot (\code{\link{mdplot}}) with the red and green log2-intensities of the array providing the two columns.
-If \code{MA} is a \code{matrix} or \code{ExpressionSet} object, then this function produces a between-array MA-plot.
-In this case the A-values in the plot are the average log-intensities across the arrays and the M-values are the deviations of the log-intensities for the specified array from the average.
-If there are more than five arays, then the average is computed robustly using medians.
-With five or fewer arrays, it is computed by means.
+For single channel data objects, then a between-array MA-plot is produced.
+An articifial array is produced by averaging all the arrays other than the array specified.
+A mean-difference plot is then producing from the specified array and the artificial array.
+Note that this procedure reduces to an ordinary mean-difference plot when there are just two arrays total.
-The \code{status} vector is intended to specify the control status of each spot, for example "gene", "ratio control", "house keeping gene", "buffer" and so on.
-The vector is usually computed using the function \code{\link{controlStatus}} and a spot-types file.
+If \code{object} is an \code{MArrayLM} object, then the plot is an fitted model MA-plot in which the estimated coefficient is on the y-axis and the average A-value is on the x-axis.
+
+The \code{status} vector is intended to specify the control status of each spot, for example \code{"gene"}, \code{"ratio control"}, \code{"house keeping gene"}, \code{"buffer"} and so on.
+The vector is often computed using the function \code{\link{controlStatus}} and a spot-types file.
However the function may be used to highlight any subset of spots.
-The \code{status} can be included as the component \code{MA$genes$Status} instead of being passed as an argument to \code{plotMA}.
+The \code{status} can be included as the component \code{object$genes$Status} instead of being passed as an argument to \code{plotMA}.
The arguments \code{values}, \code{pch}, \code{col} and \code{cex} can be included as attributes to \code{status} instead of being passed as arguments to \code{plotMA}.
-
-See \code{\link[graphics]{points}} for possible values for \code{pch}, \code{col} and \code{cex}.
}
\value{A plot is created on the current graphics device.}
+
\references{See \url{http://www.statsci.org/micrarra/refs/maplots.html}}
+
\author{Gordon Smyth}
+
\examples{
MA <- new("MAList")
MA$A <- runif(300,4,16)
MA$M <- rt(300,df=3)
+
+# Spike-in values
+MA$M[1:3] <- 0
+MA$M[4:6] <- 3
+MA$M[7:9] <- -3
+
status <- rep("Gene",300)
status[1:3] <- "M=0"
-MA$M[1:3] <- 0
status[4:6] <- "M=3"
-MA$M[4:6] <- 3
status[7:9] <- "M=-3"
-MA$M[7:9] <- -3
-plotMA(MA,main="MA-Plot with Simulated Data",
- status=status,values=c("M=0","M=3","M=-3"),col=c("blue","red","green"))
+values <- c("M=0","M=3","M=-3")
+col <- c("blue","red","green")
-# Same as above
-attr(status,"values") <- c("M=0","M=3","M=-3")
-attr(status,"col") <- c("blue","red","green")
-plotMA(MA,main="MA-Plot with Simulated Data",status=status)
+plotMA(MA,main="MA-Plot with 12 spiked-in points",
+ status=status, values=values, col=col)
-# Same as above
+# Same as above but setting graphical parameters as attributes
+attr(status,"values") <- values
+attr(status,"col") <- col
+plotMA(MA, main="MA-Plot with 12 spiked-in points", status=status)
+
+# Same as above but passing status as part of object
+MA$genes$Status <- status
+plotMA(MA, main="MA-Plot with 12 spiked-in points")
+
+# Change settings for background points
MA$genes$Status <- status
-plotMA(MA,main="MA-Plot with Simulated Data")
+plotMA(MA, pch.bg=1, cex.bg=0.5)
}
+
\seealso{
An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
}
+
\keyword{hplot}
diff --git a/man/plotma3by2.Rd b/man/plotma3by2.Rd
index 20d0a2d..2ba7465 100755
--- a/man/plotma3by2.Rd
+++ b/man/plotma3by2.Rd
@@ -5,18 +5,18 @@
Write MA-plots to files in PNG format, six plots to a file in a 3 by 2 grid arrangement.
}
\usage{
-plotMA3by2(MA, prefix="MA", path=NULL, main=colnames(MA),
- zero.weights=FALSE, common.lim=TRUE, device="png", ...)
+plotMA3by2(object, prefix="MA", path=NULL, main=colnames(object),
+ zero.weights=FALSE, common.lim=TRUE, device="png", \dots)
}
\arguments{
- \item{MA}{an \code{MAList}, \code{RGList}, \code{EListRaw} or \code{EList} object, or a matrix containing log-intensities.}
+ \item{object}{an \code{MAList}, \code{RGList}, \code{EListRaw} or \code{EList} object, or a matrix containing log-intensities.}
\item{prefix}{character string giving prefix to attach to file names}
\item{path}{character string specifying directory for output files}
\item{main}{character vector giving titles for plots}
\item{zero.weights}{logical, should points with non-positive weights be plotted}
\item{common.lim}{logical, should all plots on a page use the same axis limits}
\item{device}{device driver for the plot. Choices are \code{"png"}, \code{"jpeg"}, \code{"pdf"}, \code{"postscript"}.}
- \item{...}{any other arguments are passed to \code{plotMA}}
+ \item{\dots}{any other arguments are passed to \code{plotMA}}
}
\details{
@@ -31,7 +31,7 @@ Note that \code{"pdf"} or \code{"postscript"} may produce very large files.
\value{
No value is returned, but one or more files are written to the working directory.
-The number of files is determined by the number of columns of \code{MA}.
+The number of files is determined by the number of columns of \code{object}.
}
\author{Gordon Smyth}
diff --git a/man/plotprinttiploess.Rd b/man/plotprinttiploess.Rd
index 13c249a..e576ebb 100755
--- a/man/plotprinttiploess.Rd
+++ b/man/plotprinttiploess.Rd
@@ -5,7 +5,7 @@
Creates a coplot giving MA-plots with loess curves by print-tip groups.
}
\usage{
-plotPrintTipLoess(object,layout,array=1,span=0.4,...)
+plotPrintTipLoess(object,layout,array=1,span=0.4,\dots)
}
\arguments{
\item{object}{\code{MAList} or \code{RGList} object or list with components \code{M} containing log-ratios and \code{A} containing average intensities}
@@ -13,7 +13,7 @@ plotPrintTipLoess(object,layout,array=1,span=0.4,...)
Defaults to \code{MA$printer} if that is non-null.}
\item{array}{integer giving the array to be plotted. Corresponds to columns of \code{M} and \code{A}.}
\item{span}{span of window for \code{lowess} curve}
- \item{...}{other arguments passed to \code{panel.smooth}}
+ \item{\dots}{other arguments passed to \code{panel.smooth}}
}
\details{
Note that spot quality weights in \code{object} are not used for computing the loess curves for this plot even though such weights would be used for loess normalization using \code{normalizeWithinArrays}.
diff --git a/man/qqt.Rd b/man/qqt.Rd
index cde8fd5..dd620e9 100755
--- a/man/qqt.Rd
+++ b/man/qqt.Rd
@@ -37,4 +37,5 @@ y <- rt(50,df=4)
qqt(y,df=4)
abline(0,1)
}
-\keyword{distribution}
+
+\keyword{distributions}
diff --git a/man/qualwt.Rd b/man/qualwt.Rd
index 7c60fb0..16e4bb9 100755
--- a/man/qualwt.Rd
+++ b/man/qualwt.Rd
@@ -47,4 +47,4 @@ RG <- read.maimages(fnames,source="spot",wt.fun=wtarea(165))
MA <- normalizeWithinArrays(RG,layout)
}
}
-\keyword{regression}
+\keyword{reading data}
diff --git a/man/read.columns.Rd b/man/read.columns.Rd
index 16b4b08..d42e662 100644
--- a/man/read.columns.Rd
+++ b/man/read.columns.Rd
@@ -6,7 +6,7 @@ Reads specified columns from a file in table format and creates a data frame fro
}
\usage{
read.columns(file, required.col=NULL, text.to.search="", sep="\t", quote="\"", skip=0,
- fill=TRUE, blank.lines.skip=TRUE, comment.char="", allowEscapes=FALSE, ...)
+ fill=TRUE, blank.lines.skip=TRUE, comment.char="", allowEscapes=FALSE, \dots)
}
\arguments{
\item{file}{the name of the file which the data are to be read from.}
@@ -47,5 +47,5 @@ A data frame (data.frame) containing a representation of the data in the file.
An overview of LIMMA functions for reading data is given in \link{03.ReadingData}.
}
-\keyword{IO}
-\keyword{file}
+\keyword{reading data}
+
diff --git a/man/read.idat.Rd b/man/read.idat.Rd
index 7a6f716..a43967b 100644
--- a/man/read.idat.Rd
+++ b/man/read.idat.Rd
@@ -65,6 +65,7 @@ propexpr(data)
datanorm = neqc(data)
}
}
-\keyword{file}
+
+\keyword{reading data}
\concept{illumina microarrays}
\concept{microarray data file}
diff --git a/man/read.ilmn.Rd b/man/read.ilmn.Rd
index 56b10a5..aaa4812 100644
--- a/man/read.ilmn.Rd
+++ b/man/read.ilmn.Rd
@@ -64,6 +64,6 @@ x <- read.ilmn(files="sample probe profile.txt",
# See neqc and beadCountWeights for other examples using read.ilmn
}
-\keyword{file}
+\keyword{reading data}
\concept{illumina microarrays}
\concept{microarray data file}
diff --git a/man/read.ilmn.targets.Rd b/man/read.ilmn.targets.Rd
index e547f61..966430d 100644
--- a/man/read.ilmn.targets.Rd
+++ b/man/read.ilmn.targets.Rd
@@ -3,11 +3,11 @@
\title{Read Illumina Data from a Target Dataframe}
\description{Read Illumina data from a target dataframe}
\usage{
-read.ilmn.targets(targets, ...)
+read.ilmn.targets(targets, \dots)
}
\arguments{
\item{targets}{ data frame including names of profile files.}
- \item{...}{ any other parameters are passed on to \code{\link{read.ilmn}}.}
+ \item{\dots}{ any other parameters are passed on to \code{\link{read.ilmn}}.}
}
\details{
\code{targets} is often created by calling the function \code{\link{readTargets}}.
@@ -27,4 +27,5 @@ An \code{\link{EListRaw-class}} object. See return value of the function \code{\
\code{\link{read.ilmn}}
}
+\keyword{reading data}
\concept{illumina microarrays}
diff --git a/man/read.maimages.Rd b/man/read.maimages.Rd
index 90a820e..44f955d 100755
--- a/man/read.maimages.Rd
+++ b/man/read.maimages.Rd
@@ -143,5 +143,5 @@ RG <- read.maimages(files,"genepix",wt.fun=wtflags(0.1))}
RG <- read.maimages(files,"spot",wt.fun=wtarea(150))}
}
-\keyword{file}
+\keyword{reading data}
\concept{microarray data file}
diff --git a/man/readGPRHeader.Rd b/man/readGPRHeader.Rd
index f404fe7..0930cfd 100755
--- a/man/readGPRHeader.Rd
+++ b/man/readGPRHeader.Rd
@@ -29,7 +29,11 @@ A key component is \code{NHeaderRecords} which gives the number of lines in the
All other components are character vectors.
}
\references{
-See \url{http://www.moleculardevices.com/Products/Software/GenePix-Pro.html} for GenePix formats.
+See
+\url{http://www.moleculardevices.com/Products/Software/GenePix-Pro.html}
+or
+\url{http://www.cryer.co.uk/file-types/a/atf/genepix_file_formats.htm}
+for GenePix formats.
See \url{http://http://smd.princeton.edu} for the SMD.
}
@@ -38,6 +42,7 @@ See \url{http://http://smd.princeton.edu} for the SMD.
An overview of LIMMA functions to read data is given in \link{03.ReadingData}.
}
+\keyword{IO}
\keyword{file}
\concept{microarray data file}
diff --git a/man/readSpotTypes.Rd b/man/readSpotTypes.Rd
index ecc3476..b1554fd 100755
--- a/man/readSpotTypes.Rd
+++ b/man/readSpotTypes.Rd
@@ -5,7 +5,7 @@
Read a table giving regular expressions to identify different types of spots in the gene-dataframe.
}
\usage{
-readSpotTypes(file="SpotTypes.txt",path=NULL,sep="\t",check.names=FALSE,...)
+readSpotTypes(file="SpotTypes.txt",path=NULL,sep="\t",check.names=FALSE,\dots)
}
\arguments{
\item{file}{character string giving the name of the file specifying the spot types.}
diff --git a/man/readTargets.Rd b/man/readTargets.Rd
index 0cb347d..1de83ed 100755
--- a/man/readTargets.Rd
+++ b/man/readTargets.Rd
@@ -5,7 +5,7 @@
Read targets file for a microarray experiment into a dataframe.
}
\usage{
-readTargets(file="Targets.txt", path=NULL, sep="\t", row.names=NULL, quote="\"",...)
+readTargets(file="Targets.txt", path=NULL, sep="\t", row.names=NULL, quote="\"",\dots)
}
\arguments{
\item{file}{character string giving the name of the targets file.}
@@ -14,7 +14,7 @@ readTargets(file="Targets.txt", path=NULL, sep="\t", row.names=NULL, quote="\"",
\item{sep}{field separator character}
\item{row.names}{character string giving the name of a column from which to obtain row names}
\item{quote}{the set of quoting characters}
- \item{...}{other arguments are passed to \code{\link{read.table}}}
+ \item{\dots}{other arguments are passed to \code{\link{read.table}}}
}
\details{
The targets file is a text file containing information about the RNA samples used as targets in the microarray experiment.
diff --git a/man/readgal.Rd b/man/readgal.Rd
index 6b3fb12..cd2ccb0 100755
--- a/man/readgal.Rd
+++ b/man/readgal.Rd
@@ -5,7 +5,7 @@
Read a GenePix Array List (GAL) file into a dataframe.
}
\usage{
-readGAL(galfile=NULL,path=NULL,header=TRUE,sep="\t",quote="\"",skip=NULL,as.is=TRUE,...)
+readGAL(galfile=NULL,path=NULL,header=TRUE,sep="\t",quote="\"",skip=NULL,as.is=TRUE,\dots)
}
\arguments{
\item{galfile}{character string giving the name of the GAL file. If \code{NULL} then a file with extension \code{.gal} is found in the directory specified by \code{path}.}
@@ -15,7 +15,7 @@ readGAL(galfile=NULL,path=NULL,header=TRUE,sep="\t",quote="\"",skip=NULL,as.is=T
\item{quote}{the set of quoting characters}
\item{skip}{number of lines of the GAL file to skip before reading data. If \code{NULL} then this number is determined by searching the file for column headings.}
\item{as.is}{logical variable, if \code{TRUE} then read in character columns as vectors rather than factors.}
- \item{...}{any other arguments are passed to \code{read.table}}
+ \item{\dots}{any other arguments are passed to \code{read.table}}
}
\details{
A GAL file is a list of genes IDs and associated information produced by an Axon microarray scanner.
@@ -43,7 +43,7 @@ The data frame will be sorted so that \code{Column} is the fastest moving index,
An overview of LIMMA functions for reading data is given in \link{03.ReadingData}.
}
\references{
-\url{http://www.axon.com/gn_GenePix_File_Formats.html}
+\url{http://www.cryer.co.uk/file-types/a/atf/genepix_file_formats.htm}
}
\examples{
# readGAL()
@@ -51,3 +51,5 @@ An overview of LIMMA functions for reading data is given in \link{03.ReadingData
# found in the current working directory
}
\keyword{IO}
+\keyword{file}
+
diff --git a/man/removeBatchEffect.Rd b/man/removeBatchEffect.Rd
index 1899149..6bc2bc1 100644
--- a/man/removeBatchEffect.Rd
+++ b/man/removeBatchEffect.Rd
@@ -9,7 +9,7 @@ removeBatchEffect(x, batch=NULL, batch2=NULL, covariates=NULL,
design=matrix(1,ncol(x),1), \dots)
}
\arguments{
- \item{x}{numeric matrix, or any object that can be coerced to a matrix by \code{as.matrix(x)}, containing log-expression values for a series of samples.
+ \item{x}{numeric matrix, or any data object that can be processed by \code{\link{getEAWP}} containing log-expression values for a series of samples.
Rows correspond to probes and columns to samples.}
\item{batch}{factor or vector indicating batches.}
\item{batch2}{factor or vector indicating batches.}
@@ -29,7 +29,7 @@ The design matrix is used to describe comparisons between the samples, for examp
The function (in effect) fits a linear model to the data, including both batches and regular treatments, then removes the component due to the batch effects.
-The data object \code{x} can be of any class for which \code{lmFit} and \code{as.matrix} work.
+The data object \code{x} can be of any class for which \code{lmFit} works.
If \code{x} contains weights, then these will be used in estimating the batch effects.
}
diff --git a/man/roast.Rd b/man/roast.Rd
index 1e97d67..05e3982 100644
--- a/man/roast.Rd
+++ b/man/roast.Rd
@@ -25,7 +25,7 @@ Rotation gene set testing for linear models.
Rows correspond to probes and columns to samples.
If either \code{var.prior} or \code{df.prior} are null, then \code{y} should contain values for all genes on the arrays. If both prior parameters are given, then only \code{y} values for the test set are required.}
\item{index}{index vector specifying which rows (probes) of \code{y} are in the test set. This can be a vector of indices, or a logical vector of the same length as \code{statistics}, or any vector such as \code{y[index,]} contains the values for the gene set to be tested.
- For \code{mroast}, \code{index} is a list of index vectors. The list can be made using \link{symbols2indices}.}
+ For \code{mroast}, \code{index} is a list of index vectors. The list can be made using \link{ids2indices}.}
\item{design}{design matrix}
\item{contrast}{contrast for which the test is required. Can be an integer specifying a column of \code{design}, or else a contrast vector of length equal to the number of columns of \code{design}.}
\item{set.statistic}{summary set statistic. Possibilities are \code{"mean"},\code{"floormean"},\code{"mean50"} or \code{"msq"}.}
@@ -110,7 +110,7 @@ The default setting for the set statistic was changed in limma 3.5.9 (3 June 201
}
\seealso{
-\code{\link{camera}}, \code{\link{romer}}, \code{\link{geneSetTest}}, \code{\link{symbols2indices}}.
+\code{\link{camera}}, \code{\link{romer}}, \code{\link{geneSetTest}}, \code{\link{ids2indices}}.
There is a topic page on \link{10.GeneSetTests}.
}
@@ -154,3 +154,7 @@ roast(y,index1,design,contrast=2)
mroast(y,index1,design,contrast=2)
mroast(y,list(set1=index1,set2=index2),design,contrast=2)
}
+
+\keyword{gene set test}
+\concept{gene set test}
+\concept{gene set enrichment analysis}
diff --git a/man/romer.Rd b/man/romer.Rd
index fc03f6a..b15171c 100644
--- a/man/romer.Rd
+++ b/man/romer.Rd
@@ -9,7 +9,7 @@ romer(index, y, design, contrast=ncol(design), array.weights=NULL, block=NULL,
correlation, set.statistic="mean", nrot=9999)
}
\arguments{
- \item{index}{list of indices specifying the rows of \code{y} in the gene sets. The list can be made using \link{symbols2indices}.}
+ \item{index}{list of indices specifying the rows of \code{y} in the gene sets. The list can be made using \link{ids2indices}.}
\item{y}{numeric matrix giving log-expression values.}
\item{design}{design matrix}
\item{contrast}{contrast for which the test is required. Can be an integer specifying a column of \code{design}, or else a contrast vector of length equal to the number of columns of \code{design}.}
@@ -60,7 +60,7 @@ This statistic performs well in practice but is slightly slower to compute.
\seealso{
\code{\link{topRomer}},
-\code{\link{symbols2indices}},
+\code{\link{ids2indices}},
\code{\link{roast}},
\code{\link{wilcoxGST}}
@@ -101,4 +101,7 @@ r
topRomer(r,alt="up")
topRomer(r,alt="down")
}
+
+\keyword{gene set test}
\concept{gene set test}
+\concept{gene set enrichment analysis}
diff --git a/man/selectmodel.Rd b/man/selectmodel.Rd
index 924141a..d915d07 100644
--- a/man/selectmodel.Rd
+++ b/man/selectmodel.Rd
@@ -3,7 +3,7 @@
\title{Select Appropriate Linear Model}
\description{Select the best fitting linear model for each gene by minimizing an information criterion.}
\usage{
-selectModel(y, designlist, criterion="aic", df.prior=0, s2.prior=NULL, s2.true=NULL, ...)
+selectModel(y, designlist, criterion="aic", df.prior=0, s2.prior=NULL, s2.true=NULL, \dots)
}
\arguments{
\item{y}{a matrix-like data object, containing log-ratios or log-values of expression for a series of microarrays.
@@ -13,7 +13,7 @@ selectModel(y, designlist, criterion="aic", df.prior=0, s2.prior=NULL, s2.true=N
\item{df.prior}{prior degrees of freedom for residual variances. See \code{\link{squeezeVar}}}
\item{s2.prior}{prior value for residual variances, to be used if \code{df.prior}>0.}
\item{s2.true}{numeric vector of true variances, to be used if \code{criterion="mallowscp"}.}
- \item{...}{other optional arguments to be passed to \code{lmFit}}
+ \item{\dots}{other optional arguments to be passed to \code{lmFit}}
}
\value{
diff --git a/man/strsplit2.Rd b/man/strsplit2.Rd
index c7ad38c..3046f55 100644
--- a/man/strsplit2.Rd
+++ b/man/strsplit2.Rd
@@ -6,7 +6,7 @@
Split a vector of composite names into a matrix of simple names.}
\usage{
-strsplit2(x, split, ...)
+strsplit2(x, split, \dots)
}
\arguments{
diff --git a/man/summary.Rd b/man/summary.Rd
index 7578b0f..52b5477 100755
--- a/man/summary.Rd
+++ b/man/summary.Rd
@@ -9,11 +9,11 @@
Briefly summarize microarray data objects.
}
\usage{
-\method{summary}{RGList}(object, ...)
+\method{summary}{RGList}(object, \dots)
}
\arguments{
\item{object}{an object of class \code{RGList}, \code{MAList} or \code{MArrayLM}}
- \item{...}{other arguments are not used}
+ \item{\dots}{other arguments are not used}
}
\details{
The data objects are summarized as if they were lists, i.e., brief information about the length and type of the components is given.
diff --git a/man/symbols2indices.Rd b/man/symbols2indices.Rd
deleted file mode 100644
index 6a2d3b4..0000000
--- a/man/symbols2indices.Rd
+++ /dev/null
@@ -1,28 +0,0 @@
-\name{symbols2indices}
-\alias{symbols2indices}
-\title{Convert Gene Set Symbols to Indices}
-\description{
-Make a list of gene set symbols into a list of gene sets indices.
-}
-\usage{
-symbols2indices(gene.sets, symbols, remove.empty=TRUE)
-}
-\arguments{
- \item{gene.sets}{list of character vectors, each vector containing the symbols for a set of genes.}
- \item{symbols}{character vector of gene symbols.}
- \item{remove.empty}{logical, should sets of size zero be removed from the output?}
-}
-\value{
-list of integer vectors, each vector containing the indices of a gene set in the vector \code{symbols}.
-}
-\details{
-This function used to create input for \code{romer} function.
-Typically, \code{symbols} is the vector of symbols of genes on a microarray, and \code{gene.sets} is obtained constructed from a database of gene sets, for example a representation of the Molecular Signatures Database (MSigDB) downloaded from \url{http://bioinf.wehi.edu.au/software/MSigDB}.
-}
-
-\seealso{
-\code{\link{romer}}, \code{\link{mroast}}, \code{\link{camera}}
-
-There is a topic page on \link{10.GeneSetTests}.
-}
-\author{Gordon Smyth and Yifang Hu}
\ No newline at end of file
diff --git a/man/topGO.Rd b/man/topGO.Rd
new file mode 100644
index 0000000..360448c
--- /dev/null
+++ b/man/topGO.Rd
@@ -0,0 +1,38 @@
+\name{topGO}
+\alias{topGO}
+\title{Table of Top GO Terms}
+
+\description{
+Extract top GO terms from goana results.
+}
+
+\usage{
+topGO(results, ontology = c("BP", "CC", "MF"), sort = NULL, number = 20L)
+}
+
+\arguments{
+ \item{results}{data frame produced by \code{\link{goana}}.}
+ \item{ontology}{character vector of ontologies to be included in output. Elements should be one or more of \code{"BP"}, \code{"CC"} or \code{"MF"}.}
+ \item{sort}{name of gene set for which results are required. Should be one of the column names of \code{results}. Defaults to first set.}
+ \item{number}{maximum number of top GO terms to list. For all terms, set \code{number=Inf}.}
+}
+
+\details{
+This function is organize the output from \code{\link{goana}} into top-tables of the most significant GO terms.
+}
+
+\value{
+Same as \code{results} but with rows subsetted by Ontology and sorted by the specified p-value.
+}
+
+\seealso{
+\code{\link{goana}}, \code{\link{goana.MArrayLM}}
+}
+
+\author{Gordon Smyth and Yifang Hu}
+
+\examples{
+# See goana and goana.MArrayLM examples
+}
+
+\keyword{gene set test}
diff --git a/man/topRomer.Rd b/man/topRomer.Rd
index 62f9f51..12f1fee 100644
--- a/man/topRomer.Rd
+++ b/man/topRomer.Rd
@@ -30,4 +30,6 @@ This function takes the results from romer and returns a number of top gene set
There is a topic page on \link{10.GeneSetTests}.
}
-\author{Gordon Smyth and Yifang Hu}
\ No newline at end of file
+\author{Gordon Smyth and Yifang Hu}
+
+\keyword{gene set test}
diff --git a/man/topSplice.Rd b/man/topSplice.Rd
index 3362bc8..92207dc 100644
--- a/man/topSplice.Rd
+++ b/man/topSplice.Rd
@@ -5,18 +5,30 @@
Top table ranking the most differentially spliced genes or exons.
}
\usage{
-topSplice(fit, coef=ncol(fit), level="exon", number=10, FDR=1)
+topSplice(fit, coef=ncol(fit), test="simes", number=10, FDR=1)
}
\arguments{
\item{fit}{\code{MArrayLM} fit object produced by \code{diffSplice}.}
\item{coef}{the coefficient (column) of fit for which differentially splicing is assessed.}
- \item{level}{character string, should the table be by \code{"exon"} or by \code{"gene"}.}
+ \item{test}{character string, possible values are \code{"simes"}, \code{"F"} or \code{"t"}.
+ \code{"F"} gives F-tests for each gene.
+ \code{"t"} gives t-tests for each exon.
+ \code{"simes"} gives genewise p-values derived from the t-tests after Simes adjustment for each gene.}
\item{number}{integer, maximum number of rows to output.}
\item{FDR}{numeric, only show exons or genes with false discovery rate less than this cutoff.}
}
\details{
-Ranks genes by the Plots interaction log2-fold-change by exon for the specified gene.
+Ranks genes or exons by evidence for differential splicing.
+The F-statistic tests for any differences in exon usage between experimental conditions.
+The exon-level t-statistics test for differences between each exon and all other exons for the same gene.
+
+The Simes processes the exon-level p-values to give an overall call of differential splicing for each gene.
+It returns the minimum Simes-adjusted p-values for each gene.
+
+The F-tests are likely to be powerful for genes in which several exons are differentially splices.
+The Simes p-values is likely to be more powerful when only a minority of the exons for a gene are differentially spliced.
+The exon-level t-tests are not recommended for formal error rate control.
}
\value{A data.frame with any annotation columns found in \code{fit} plus the following columns
@@ -32,3 +44,5 @@ Ranks genes by the Plots interaction log2-fold-change by exon for the specified
An overview of diagnostic functions available in LIMMA is given in \link{09.Diagnostics}.
}
\examples{# See diffSplice}
+
+\keyword{rna-seq}
diff --git a/man/tricubeMovingAverage.Rd b/man/tricubeMovingAverage.Rd
index 25b8661..e742412 100644
--- a/man/tricubeMovingAverage.Rd
+++ b/man/tricubeMovingAverage.Rd
@@ -3,7 +3,7 @@
\alias{tricubeMovingAverage}
\title{Moving Average Smoother With Tricube Weights}
\description{
-Apply a moving average smoother to the columns of a matrix.
+Apply a moving average smoother with tricube distance weights to the columns of a matrix.
}
\usage{
tricubeMovingAverage(x,span=0.5,full.length=TRUE)
@@ -14,11 +14,13 @@ tricubeMovingAverage(x,span=0.5,full.length=TRUE)
\item{full.length}{logical value, should output have same number of length as input?}
}
\details{
-This function smooths a vector (considered as a time series) using moving average with tricube weights.
-It is similar to a loess curve of degree zero, with a couple of differences:
+This function smooths a vector (considered as a time series) using a moving average with tricube weights.
+This is similar to a loess curve of degree zero, with a couple of differences:
a continuity correction is applied when computing the neighbouring points and, when \code{full.length=TRUE}, the span halves at the end points.
The \code{filter} function in the stats package is called to do the low-level calculations.
+
+This function is used by \code{\link{barcodeplot}} to compute enrichment worms.
}
\value{
Numeric vector of smoothed values.
@@ -30,7 +32,9 @@ x <- rbinom(100,size=1,prob=0.5)
plot(1:100,tricubeMovingAverage(x))
}
\seealso{
-\code{\link{filter}}
+\code{\link{filter}}, \code{\link{barcodeplot}}
}
\author{Gordon Smyth}
\keyword{smooth}
+\concept{Loess curve}
+
diff --git a/man/volcanoplot.Rd b/man/volcanoplot.Rd
index 046e765..799c688 100755
--- a/man/volcanoplot.Rd
+++ b/man/volcanoplot.Rd
@@ -6,7 +6,7 @@ Creates a volcano plot of log-fold changes versus log-odds of differential expre
}
\usage{
volcanoplot(fit, coef=1, highlight=0, names=fit$genes$ID,
- xlab="Log Fold Change", ylab="Log Odds", pch=16, cex=0.35, ...)
+ xlab="Log Fold Change", ylab="Log Odds", pch=16, cex=0.35, \dots)
}
\arguments{
\item{fit}{an \code{MArrayLM} fitted linear model object}
@@ -17,7 +17,7 @@ volcanoplot(fit, coef=1, highlight=0, names=fit$genes$ID,
\item{ylab}{character string giving label for y-axis}
\item{pch}{vector or list of plotting characters. Default is integer code 16 which gives a solid circle.}
\item{cex}{numeric vector of plot symbol expansions. Default is 0.35.}
- \item{...}{any other arguments are passed to \code{plot}}
+ \item{\dots}{any other arguments are passed to \code{plot}}
}
\details{
diff --git a/man/voom.Rd b/man/voom.Rd
index b8cf5c6..bafa114 100644
--- a/man/voom.Rd
+++ b/man/voom.Rd
@@ -8,7 +8,7 @@ The data are then ready for linear modelling.
\usage{
voom(counts, design = NULL, lib.size = NULL, normalize.method = "none",
- plot = FALSE, span=0.5, ...)
+ plot = FALSE, span=0.5, \dots)
}
\arguments{
\item{counts}{a numeric \code{matrix} containing raw counts, or an \code{ExpressionSet} containing raw counts, or a \code{DGEList} object.}
@@ -20,7 +20,7 @@ voom(counts, design = NULL, lib.size = NULL, normalize.method = "none",
Choices are as for the \code{method} argument of \code{normalizeBetweenArrays} when the data is single-channel.}
\item{plot}{\code{logical}, should a plot of the mean-variance trend be displayed?}
\item{span}{width of the lowess smoothing window as a proportion.}
- \item{...}{other arguments are passed to \code{lmFit}.}
+ \item{\dots}{other arguments are passed to \code{lmFit}.}
}
\details{
@@ -65,3 +65,5 @@ A \code{voom} case study is given in the User's Guide.
\code{\link{vooma}} is a similar function but for microarrays instead of RNA-seq.
}
+
+\keyword{rna-seq}
diff --git a/man/voomWithQualityWeights.Rd b/man/voomWithQualityWeights.Rd
new file mode 100644
index 0000000..b59255f
--- /dev/null
+++ b/man/voomWithQualityWeights.Rd
@@ -0,0 +1,73 @@
+\name{voomWithQualityWeights}
+\alias{voomWithQualityWeights}
+\title{Combining observational-level with sample-specific quality weights for RNA-seq analysis}
+\description{
+Combine voom observational-level weights with sample-specific quality weights in a designed experiment.
+}
+\usage{
+voomWithQualityWeights(counts, design=NULL, lib.size=NULL, normalize.method="none",
+ plot=FALSE, span=0.5, var.design=NULL, method="genebygene", maxiter=50,
+ tol=1e-10, trace=FALSE, replace.weights=TRUE, ...)
+}
+\arguments{
+ \item{counts}{a numeric \code{matrix} containing raw counts, or an \code{ExpressionSet} containing raw counts, or a \code{DGEList} object.}
+ \item{design}{design matrix with rows corresponding to samples and columns to coefficients to be estimated.
+ Defaults to the unit vector meaning that samples are treated as replicates.}
+ \item{lib.size}{numeric vector containing total library sizes for each sample.
+ If \code{NULL} and \code{counts} is a \code{DGEList} then, the normalized library sizes are taken from \code{counts}.
+ Otherwise library sizes are calculated from the columnwise counts totals.}
+ \item{normalize.method}{normalization method to be applied to the logCPM values.
+ Choices are as for the \code{method} argument of \code{normalizeBetweenArrays} when the data is single-channel.}
+ \item{plot}{\code{logical}, should a plot of the mean-variance trend and sample-specific weights be displayed?}
+ \item{span}{width of the lowess smoothing window as a proportion.}
+ \item{var.design}{design matrix for the variance model. Defaults to the sample-specific model (i.e. each sample has a distinct variance) when \code{NULL}.}
+ \item{method}{character string specifying the estimating algorithm to be used. Choices
+ are \code{"genebygene"} and \code{"reml"}.}
+ \item{maxiter}{maximum number of iterations allowed.}
+ \item{tol}{convergence tolerance.}
+ \item{trace}{logical variable. If true then output diagnostic information
+ at each iteration of the '"reml"' algorithm, or at every 1000th iteration of the
+ '"genebygene"' algorithm.}
+ \item{replace.weights}{logical variable. If TRUE then the weights in the voom object will be replaced with
+ the combined voom and sample-specific weights and the \code{\link[limma:EList]{EList}} object from voom is returned.
+ If FALSE, then a matrix of combined weights is returned.}
+ \item{\dots}{other arguments are passed to \code{lmFit}.}
+}
+\details{
+This function is intended to process RNA-Seq data prior to linear modelling in limma.
+
+It combines observational-level weights from \code{voom} with sample-specific weights estimated using the \code{arrayWeights} function.
+}
+\value{
+ Either an \code{\link[limma:EList]{EList}} object with the following components:
+\item{E}{numeric matrix of normalized expression values on the log2 scale}
+\item{weights}{numeric matrix of inverse variance weights}
+\item{design}{design matrix}
+\item{lib.size}{numeric vector of total normalized library sizes}
+\item{genes}{dataframe of gene annotation extracted from \code{counts}}
+ or a matrix of combined voom and sample-specific weights with same dimension as \code{counts}.
+}
+
+\author{Matthew Ritchie and Cynthia Liu}
+
+\references{
+Ritchie, M. E., Diyagama, D., Neilson, van Laar, R., J., Dobrovic, A., Holloway, A., and Smyth, G. K. (2006).
+Empirical array quality weights in the analysis of microarray data.
+BMC Bioinformatics 7, 261.
+\url{http://www.biomedcentral.com/1471-2105/7/261/abstract}
+
+Law, CW, Chen, Y, Shi, W, Smyth, GK (2014).
+Voom: precision weights unlock linear model analysis tools for RNA-seq read counts.
+\emph{Genome Biology} 15, R29.
+\url{http://genomebiology.com/2014/15/2/R29}
+}
+
+\seealso{
+\code{\link{voom}}, \code{\link{arrayWeights}}
+
+An overview of linear model functions in limma is given by \link{06.LinearModels}.
+A \code{voomWithQualityWeights} case study is given in the User's Guide.
+}
+
+\keyword{regression}
+\keyword{rna-seq}
diff --git a/man/vooma.Rd b/man/vooma.Rd
index f09e3e8..6b1b6fc 100644
--- a/man/vooma.Rd
+++ b/man/vooma.Rd
@@ -11,7 +11,8 @@ Estimate the mean-variance relationship and use this to compute appropriate obse
\usage{
vooma(y, design=NULL, correlation, block=NULL, plot=FALSE, span=NULL)
voomaByGroup(y, group, design=NULL, correlation, block=NULL,
- plot=FALSE, span=NULL, col=NULL)
+ plot=FALSE, span=NULL, col=NULL, lwd=1, alpha=0.5,
+ pch=16, cex=0.3, legend="topright")
}
\arguments{
@@ -23,6 +24,11 @@ voomaByGroup(y, group, design=NULL, correlation, block=NULL,
\item{plot}{\code{logical} value indicating whether a plot of mean-variance trend should be displayed.}
\item{group}{categorical vector or factor giving group membership of columns of \code{y}.}
\item{col}{vector of colors for plotting group trends}
+ \item{lwd}{line width for plotting group trends}
+ \item{pch}{plotting character. Default is integer code 16 which gives a solid circle. If a vector, then should be of length \code{nrow(y)}.}
+ \item{cex}{numeric vector of plot symbol expansions. If a vector, then should be of length equal to number of groups.}
+ \item{alpha}{transparancy of points, on scale from \code{0} for fully transparant to \code{1} for fully opaque.}
+ \item{legend}{character string giving position to place legend.}
}
\details{
diff --git a/man/weightedLowess.Rd b/man/weightedLowess.Rd
index 091697c..31922da 100644
--- a/man/weightedLowess.Rd
+++ b/man/weightedLowess.Rd
@@ -29,7 +29,7 @@ This function extends the lowess algorithm to handle non-negative prior weights.
used during span calculations such that the span distance for each point must include the
specified proportion of all weights. They are also applied during weighted linear regression to
compute the fitted value (in addition to the tricube weights determined by \code{span}). For integer
-weights, the prior weights are equivalent to using \code{rep(..., w)} on \code{x} and \code{y} prior to fitting.
+weights, the prior weights are equivalent to using \code{rep(\dots, w)} on \code{x} and \code{y} prior to fitting.
For large vectors, running time is reduced by only performing locally weighted regression for
several points. Fitted values for all points adjacent to the chosen points are computed by
diff --git a/man/writefit.Rd b/man/writefit.Rd
index 5bf6d16..41848b3 100755
--- a/man/writefit.Rd
+++ b/man/writefit.Rd
@@ -9,7 +9,7 @@ Write a microarray linear model fit to a file.
\usage{
write.fit(fit, results=NULL, file, digits=3, adjust="none", method="separate",
- F.adjust="none", sep="\t", ...)
+ F.adjust="none", sep="\t", \dots)
}
\arguments{
diff --git a/man/zscore.Rd b/man/zscore.Rd
index edb8368..77612f3 100755
--- a/man/zscore.Rd
+++ b/man/zscore.Rd
@@ -12,7 +12,7 @@ Compute z-score equivalents of non-normal random deviates.
}
\usage{
-zscore(q, distribution, ...)
+zscore(q, distribution, \dots)
zscoreGamma(q, shape, rate = 1, scale = 1/rate)
zscoreT(x, df, approx=FALSE)
tZscore(x, df)
@@ -80,3 +80,5 @@ zscoreGamma(c(1,2.5), shape=0.5, scale=2)
zscoreT(2, df=3)
tZscore(2, df=3)
}
+
+\keyword{distributions}
diff --git a/tests/limma-Tests.R b/tests/limma-Tests.R
index 7dc83d3..0a715e9 100755
--- a/tests/limma-Tests.R
+++ b/tests/limma-Tests.R
@@ -270,3 +270,8 @@ v <- voom(y,design)
names(v)
summary(v$E)
summary(v$weights)
+
+### goana
+
+go <- goana(fit,FDR=0.6,geneid=1:100)
+topGO(go,n=10)
diff --git a/tests/limma-Tests.Rout.save b/tests/limma-Tests.Rout.save
index 06f2f23..ed546a1 100755
--- a/tests/limma-Tests.Rout.save
+++ b/tests/limma-Tests.Rout.save
@@ -1,5 +1,5 @@
-R version 3.1.0 (2014-04-10) -- "Spring Dance"
+R Under development (unstable) (2014-07-19 r66202) -- "Unsuffered Consequences"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
@@ -1180,87 +1180,7 @@ set1 5 0 1 Up 0.008 0.0150 0.008 0.0150
set2 5 0 0 Up 0.959 0.9585 0.687 0.6865
> mroast(y=y,list(set1=iset1,set2=iset2),design,contrast=2,array.weights=c(0.5,1,0.5,1))
NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
-set1 5 0 1 Up 0.004 0.0070 0.004 0.0070
-set2 5 0 0 Up 0.679 0.6785 0.658 0.6575
-> mroast(y=y,list(set1=iset1,set2=iset2),design,contrast=2,weights=w)
- NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
-set1 5 0.0 1 Up 0.003 0.0050 0.003 0.0050
-set2 5 0.2 0 Down 0.950 0.9495 0.250 0.2495
-> mroast(y=y,list(set1=iset1,set2=iset2),design,contrast=2,weights=w,array.weights=c(0.5,1,0.5,1))
- NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
-set1 5 0 1 Up 0.001 0.0010 0.001 0.0010
-set2 5 0 0 Down 0.791 0.7905 0.146 0.1455
->
-> ### camera
->
-> camera(y=y,iset1,design,contrast=2,weights=c(0.5,1,0.5,1))
- NGenes Correlation Direction PValue
-set1 5 -0.2481655 Up 0.001050253
-> camera(y=y,list(set1=iset1,set2=iset2),design,contrast=2)
- NGenes Correlation Direction PValue FDR
-set1 5 -0.2481655 Up 0.0009047749 0.00180955
-set2 5 0.1719094 Down 0.9068364381 0.90683644
->
-> ### with EList arg
->
-> y <- new("EList",list(E=y))
-> roast(y=y,iset1,design,contrast=2)
- Active.Prop P.Value
-Down 0 0.997498749
-Up 1 0.003001501
-UpOrDown 1 0.006000000
-Mixed 1 0.006000000
-> camera(y=y,iset1,design,contrast=2)
- NGenes Correlation Direction PValue
-set1 5 -0.2481655 Up 0.0009047749
->
-> ### eBayes with trend
->
-> fit <- lmFit(y,design)
-> fit <- eBayes(fit,trend=TRUE)
-Loading required package: splines
-> topTable(fit,coef=2)
- logFC AveExpr t P.Value adj.P.Val B
-3 3.488703 1.03931081 4.860410 0.0002436118 0.01647958 0.6722078
-2 3.729512 1.73488969 4.700998 0.0003295917 0.01647958 0.3777787
-4 2.696676 1.74060725 3.280613 0.0053915597 0.17971866 -2.3313104
-1 2.391846 1.72305203 3.009776 0.0092611288 0.23152822 -2.8478458
-5 2.387967 1.63066783 2.786529 0.0144249169 0.26573834 -3.2671364
-33 -1.492317 -0.07525287 -2.735781 0.0159443006 0.26573834 -3.3613142
-80 -1.839760 -0.32802306 -2.594532 0.0210374835 0.30053548 -3.6207072
-95 -1.907074 1.26297763 -2.462009 0.0272186263 0.33449167 -3.8598265
-39 1.366141 -0.27360750 2.409767 0.0301042507 0.33449167 -3.9527943
-70 -1.789476 0.21771869 -2.184062 0.0462410739 0.46241074 -4.3445901
-> fit$df.prior
-[1] 12.17481
-> fit$s2.prior
- [1] 0.7108745 0.7186517 0.3976222 0.7224388 0.6531157 0.3014062 0.3169880
- [8] 0.3149772 0.3074632 0.2917431 0.3329334 0.3378027 0.2900500 0.3031741
- [15] 0.3221763 0.2981580 0.2897078 0.2925188 0.2924234 0.3042822 0.2923686
- [22] 0.2897022 0.3251669 0.2929813 0.4922090 0.2902725 0.3018205 0.3029119
- [29] 0.3030051 0.3331358 0.3259651 0.2939051 0.3077824 0.3553515 0.3139985
- [36] 0.3181689 0.3197601 0.4687993 0.3316536 0.2897621 0.2910744 0.2907116
- [43] 0.2907966 0.3265722 0.3240487 0.3241126 0.3003970 0.3064187 0.3645035
- [50] 0.2994391 0.3295512 0.2901076 0.2898658 0.3086659 0.2897209 0.2982976
- [57] 0.3043910 0.2900320 0.3006936 0.2935101 0.3646949 0.3596385 0.3064203
- [64] 0.3027439 0.3076483 0.3363356 0.3504336 0.3496698 0.2897618 0.2898810
- [71] 0.3182290 0.3121707 0.2945001 0.2897549 0.3579410 0.3434376 0.3037970
- [78] 0.3201893 0.3048412 0.3394079 0.3516034 0.3034589 0.3120384 0.3007827
- [85] 0.3013925 0.2902524 0.3527793 0.2969359 0.3033756 0.3170187 0.2978833
- [92] 0.2908437 0.3139422 0.3050183 0.4727609 0.2897104 0.2931671 0.2904177
- [99] 0.3231607 0.2941699
-> summary(fit$s2.post)
- Min. 1st Qu. Median Mean 3rd Qu. Max.
- 0.2518 0.2746 0.3080 0.3425 0.3583 0.7344
->
-> y$E[1,1] <- NA
-> y$E[1,3] <- NA
-> fit <- lmFit(y,design)
-> fit <- eBayes(fit,trend=TRUE)
-> topTable(fit,coef=2)
- logFC AveExpr t P.Value adj.P.Val B
-3 3.488703 1.03931081 4.697008 0.0003209946 0.03209946 0.4203254
-2 3.729512 1.73488969 3.999120 0.0012579276 0.06289638 -0.9004934
+set1 5 79276 0.06289638 -0.9004934
4 2.696676 1.74060725 2.813904 0.0135288060 0.39858921 -3.1693877
33 -1.492317 -0.07525287 -2.731110 0.0159435682 0.39858921 -3.3232081
80 -1.839760 -0.32802306 -2.589351 0.0210816889 0.42163378 -3.5833549
@@ -1317,6 +1237,83 @@ Loading required package: splines
3rd Qu.:13.485 3rd Qu.:14.155 3rd Qu.:13.859 3rd Qu.:14.121
Max. :15.083 Max. :15.101 Max. :15.095 Max. :15.063
>
+> ### goana
+>
+> go <- goana(fit,FDR=0.6,geneid=1:100)
+> topGO(go,n=10)
+ Term Ont N
+GO:0001867 complement activation, lectin pathway BP 1
+GO:0001868 regulation of complement activation, lectin pathway BP 1
+GO:0001869 negative regulation of complement activation, lectin pathway BP 1
+GO:0002020 protease binding MF 1
+GO:0002673 regulation of acute inflammatory response BP 1
+GO:0002920 regulation of humoral immune response BP 1
+GO:0002921 negative regulation of humoral immune response BP 1
+GO:0006102 isocitrate metabolic process BP 1
+GO:0006956 complement activation BP 1
+GO:0006959 humoral immune response BP 1
+ Up Down P.Up P.Down
+GO:0001867 1 0 0.0625 1
+GO:0001868 1 0 0.0625 1
+GO:0001869 1 0 0.0625 1
+GO:0002020 1 0 0.0625 1
+GO:0002673 1 0 0.0625 1
+GO:0002920 1 0 0.0625 1
+GO:0002921 1 0 0.0625 1
+GO:0006102 1 0 0.0625 1
+GO:0006956 1 0 0.0625 1
+GO:0006959 1 0 0.0625 1
+>
> proc.time()
user system elapsed
- 1.23 0.12 1.34
+ 5.89 0.21 6.10
+ limma/vignettes/ 0000755 0001263 0001264 00000000000 12417066740 015253 5 ustar 00biocbuild phs_compbio [...]
+%\VignetteDepends{}
+%\VignetteKeywords{microarray linear model}
+%\VignettePackage{limma}
+\documentclass[12pt]{article}
+
+\textwidth=6.2in
+\textheight=8.5in
+\oddsidemargin=0.2in
+\evensidemargin=0.2in
+\headheight=0in
+\headsep=0in
+
+\begin{document}
+\title{Limma Package Introduction}
+\author{Gordon Smyth}
+\date{23 October 2004, Revised 21 October 2013}
+\maketitle
+
+Limma is an R package for the analysis of gene expression microarray data, especially the use of linear models for analysing designed experiments and the assessment of differential expression.
+Limma provides the ability to analyse comparisons between many RNA targets simultaneously in arbitrary complicated designed experiments.
+Empirical Bayesian methods are used to provide stable results even when the number of arrays is small.
+The normalization and data analysis functions are for two-color spotted microarrays.
+The linear model and differential expression functions apply to all microarray technologies including Affymetrix and other single-channel oligonucleotide platforms.
+
+The full Limma User's Guide is available as part of the online documentation.
+To reach the User's Guide you need to install the limma package.
+If you've installed the package and you're using Windows, type \texttt{library(limma)} at the R prompt then click on ``limma'' from the drop-down menu called ``Vignettes''.
+If you're not using Windows, you can type
+\begin{Schunk}
+\begin{Sinput}
+> library(limma)
+> limmaUsersGuide()
+\end{Sinput}
+\end{Schunk}
+or alternatively
+\begin{Schunk}
+\begin{Sinput}
+> help.start()
+\end{Sinput}
+\end{Schunk}
+and follow the links to the limma package help.
+
+\end{document}
+
+
+
+
+
+ [...]
\ No newline at end of file
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-bioc-limma.git
More information about the debian-med-commit
mailing list