[med-svn] [r-bioc-edger] 02/07: Imported Upstream version 3.14.0+dfsg

Andreas Tille tille at debian.org
Fri May 20 10:35:24 UTC 2016


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository r-bioc-edger.

commit bb9764f6f8405b8c43b2ba143208a3ce0762276f
Author: Andreas Tille <tille at debian.org>
Date:   Fri May 20 11:56:13 2016 +0200

    Imported Upstream version 3.14.0+dfsg
---
 DESCRIPTION                       |  10 +-
 NAMESPACE                         |  30 ++++-
 R/DGEList.R                       |  22 ++-
 R/calcNormFactors.R               |   6 +-
 R/cpm.R                           |   5 +-
 R/dispCoxReidInterpolateTagwise.R |   2 +-
 R/equalizeLibSizes.R              |  62 ++++++---
 R/estimateCommonDisp.R            |  75 +++++++----
 R/estimateDisp.R                  | 174 +++++++++++++++---------
 R/estimateGLMCommonDisp.R         |   5 +-
 R/estimateGLMTagwiseDisp.R        |  18 +--
 R/estimateGLMTrendedDisp.R        |  22 +--
 R/estimateTagwiseDisp.R           |  96 ++++++++-----
 R/estimateTrendedDisp.R           |  85 ++++++++----
 R/geneset-DGEList.R               |   9 ++
 R/gini.R                          |  10 ++
 R/glmQLFTest.R                    |  51 ++++---
 R/glmfit.R                        |   2 +
 R/gof.R                           |  46 +------
 R/plotBCV.R                       |  30 +++--
 R/splitIntoGroups.R               |  34 +++--
 R/subsetting.R                    |  10 +-
 build/vignette.rds                | Bin 229 -> 228 bytes
 inst/NEWS.Rd                      |  43 ++++++
 inst/doc/edgeR.pdf                | Bin 45781 -> 48664 bytes
 man/DGEList.Rd                    |   7 +-
 man/WLEB.Rd                       |  17 +--
 man/camera.DGEList.Rd             |   2 +-
 man/diffSpliceDGE.Rd              |   3 +-
 man/equalizeLibSizes.Rd           |  26 ++--
 man/estimateCommonDisp.Rd         |  22 ++-
 man/estimateDisp.Rd               |  45 ++++---
 man/estimateGLMTagwiseDisp.Rd     |   2 +-
 man/estimateGLMTrendedDisp.Rd     |   2 +-
 man/estimateTagwiseDisp.Rd        |  44 +++---
 man/estimateTrendedDisp.Rd        |  22 +--
 man/gini.Rd                       |  31 +++++
 man/glmQLFTest.Rd                 |  10 +-
 man/gof.Rd                        |  24 ++--
 man/movingAverageByCol.Rd         |   1 -
 man/processAmplicons.Rd           |   6 +-
 man/roast.DGEList.Rd              |   6 +-
 man/romer.DGEList.Rd              |   2 +-
 man/splitIntoGroups.Rd            |  27 ++--
 man/zscoreNBinom.Rd               |   2 +-
 tests/edgeR-Tests.R               |   7 +-
 tests/edgeR-Tests.Rout.save       | 274 +++++++++++++++++++++++++++-----------
 vignettes/edgeR.Rnw               |  48 +++++++
 48 files changed, 986 insertions(+), 491 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 9e4cc4c..245d4e8 100755
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,13 +1,13 @@
 Package: edgeR
-Version: 3.12.0
-Date: 2015/10/05
-Title: Empirical analysis of digital gene expression data in R
+Version: 3.14.0
+Date: 2016-04-19
+Title: Empirical Analysis of Digital Gene Expression Data in R
 Description: Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce counts, including ChIP-seq, SAGE and CAGE.
 Author: Yunshun Chen <yuchen at wehi.edu.au>, Aaron Lun <alun at wehi.edu.au>, Davis McCarthy <dmccarthy at wehi.edu.au>, Xiaobei Zhou <xiaobei.zhou at uzh.ch>, Mark Robinson <mark.robinson at imls.uzh.ch>, Gordon Smyth <smyth at wehi.edu.au>
 Maintainer: Yunshun Chen <yuchen at wehi.edu.au>, Aaron Lun <alun at wehi.edu.au>, Mark Robinson <mark.robinson at imls.uzh.ch>, Davis McCarthy <dmccarthy at wehi.edu.au>, Gordon Smyth <smyth at wehi.edu.au>
 License: GPL (>=2)
 Depends: R (>= 2.15.0), limma
-Imports: methods
+Imports: graphics, stats, utils, methods
 Suggests: MASS, statmod, splines, locfit, KernSmooth
 URL: http://bioinf.wehi.edu.au/edgeR
 biocViews: GeneExpression, Transcription, AlternativeSplicing,
@@ -16,4 +16,4 @@ biocViews: GeneExpression, Transcription, AlternativeSplicing,
         TimeCourse, SAGE, Sequencing, ChIPSeq, RNASeq, BatchEffect,
         MultipleComparison, Normalization, QualityControl
 NeedsCompilation: yes
-Packaged: 2015-10-14 01:06:43 UTC; biocbuild
+Packaged: 2016-05-04 03:09:43 UTC; biocbuild
diff --git a/NAMESPACE b/NAMESPACE
index a9cce9e..32babb5 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -8,7 +8,20 @@ exportClasses("DGEList","DGEExact","DGEGLM","DGELRT","TopTags")
 exportMethods("show")
 
 import(methods)
-importFrom("limma",camera,contrastAsCoef,decideTests,goana,kegga,is.fullrank,lmFit,loessFit,mroast,nonEstimable,plotMD,plotMDS,plotWithHighlights,removeExt,roast,romer,squeezeVar,subsetListOfArrays,zscoreGamma)
+importFrom("graphics", "abline", "axis", "curve", "grid", "legend",
+           "lines", "mtext", "plot", "points", "smoothScatter")
+importFrom("stats", "approxfun", "as.dist", "chisq.test", "cmdscale",
+           "coefficients", "dbinom", "dnbinom", "dnorm", "fitted",
+           "integrate", "lm.fit", "lm.wfit", "median", "model.matrix",
+           "optim", "optimize", "p.adjust", "pbeta", "pbinom",
+           "pchisq", "pf", "pgamma", "pnbinom", "pnorm", "predict",
+           "pt", "qbeta", "qgamma", "qnorm", "qqnorm", "quantile",
+           "rbinom", "rmultinom", "runif", "uniroot")
+importFrom("utils", "read.delim", "read.table", "write.table")
+
+importFrom("limma",camera,contrastAsCoef,decideTests,fry,goana,kegga,is.fullrank,
+     lmFit,loessFit,mroast,nonEstimable,plotMD,plotMDS,plotWithHighlights,removeExt,
+     roast,romer,squeezeVar,subsetListOfArrays,zscoreGamma)
 importClassesFrom("limma","LargeDataObject","Roast","MDS")
 
 S3method(as.data.frame,TopTags)
@@ -19,6 +32,7 @@ S3method(aveLogCPM,DGEGLM)
 S3method(calcNormFactors,default)
 S3method(calcNormFactors,DGEList)
 S3method(camera,DGEList)
+S3method(cpm,default)
 S3method(cpm,DGEList)
 S3method(dim,DGEList)
 S3method(dim,DGEExact)
@@ -33,12 +47,23 @@ S3method(dimnames,TopTags)
 S3method("dimnames<-",DGEList)
 S3method("dimnames<-",DGEGLM)
 S3method("dimnames<-",DGELRT)
+S3method(equalizeLibSizes,default)
+S3method(equalizeLibSizes,DGEList)
+S3method(estimateCommonDisp,default)
+S3method(estimateCommonDisp,DGEList)
+S3method(estimateTrendedDisp,default)
+S3method(estimateTrendedDisp,DGEList)
+S3method(estimateTagwiseDisp,default)
+S3method(estimateTagwiseDisp,DGEList)
 S3method(estimateGLMCommonDisp,default)
 S3method(estimateGLMCommonDisp,DGEList)
 S3method(estimateGLMTrendedDisp,default)
 S3method(estimateGLMTrendedDisp,DGEList)
 S3method(estimateGLMTagwiseDisp,default)
 S3method(estimateGLMTagwiseDisp,DGEList)
+S3method(estimateDisp,default)
+S3method(estimateDisp,DGEList)
+S3method(fry,DGEList)
 S3method(glmFit,default)
 S3method(glmFit,DGEList)
 S3method(glmQLFit,default)
@@ -62,5 +87,7 @@ S3method(roast,DGEList)
 S3method(romer,DGEList)
 S3method(rpkm,default)
 S3method(rpkm,DGEList)
+S3method(splitIntoGroups,default)
+S3method(splitIntoGroups,DGEList)
 S3method(sumTechReps,default)
 S3method(sumTechReps,DGEList)
diff --git a/R/DGEList.R b/R/DGEList.R
index e50565b..4f1717a 100644
--- a/R/DGEList.R
+++ b/R/DGEList.R
@@ -1,6 +1,6 @@
-DGEList <- function(counts=matrix(0,0,0), lib.size=colSums(counts), norm.factors=rep(1,ncol(counts)), group=rep(1,ncol(counts)), genes=NULL, remove.zeros=FALSE) 
+DGEList <- function(counts=matrix(0,0,0), lib.size=colSums(counts), norm.factors=rep(1,ncol(counts)), samples=NULL, group=NULL, genes=NULL, remove.zeros=FALSE) 
 #	Construct DGEList object from components, with some checking
-#	Last modified 8 Feb 2015
+#	Last modified 3 Dec 2015
 {
 #	Check counts
 	counts <- as.matrix(counts)
@@ -12,18 +12,32 @@ DGEList <- function(counts=matrix(0,0,0), lib.size=colSums(counts), norm.factors
 #	Check lib.size
 	if(is.null(lib.size)) lib.size <- colSums(counts)
 	if(nlib != length(lib.size)) stop("Length of 'lib.size' must equal number of columns in 'counts'")
+	if(any(lib.size==0L)) warning("Zero library size detected.")
 
 #	Check norm.factors
 	if(is.null(norm.factors)) norm.factors <- rep(1,ncol(counts))
 	if(nlib != length(norm.factors)) stop("Length of 'norm.factors' must equal number of columns in 'counts'")
 
+#	Check samples
+	if(!is.null(samples)) {
+		samples <- as.data.frame(samples)
+		if(nlib != nrow(samples)) stop("Number of rows in 'samples' must equal number of columns in 'counts'")
+	}
+	
 #	Check group
-	if(is.null(group)) group <- rep(1,ncol(counts))
+	if(is.null(group)) {
+		if(!is.null(samples))
+			group <- samples[,1]
+		else
+			group <- rep(1,ncol(counts))
+	}
 	group <- dropEmptyLevels(group)
 	if(nlib != length(group)) stop("Length of 'group' must equal number of columns in 'counts'")
 
 #	Make data frame of sample information
-	samples <- data.frame(group=group,lib.size=lib.size,norm.factors=norm.factors)
+	sam <- data.frame(group=group,lib.size=lib.size,norm.factors=norm.factors)
+	if(!is.null(samples)) sam <- data.frame(sam, samples)
+	samples <- sam
 	if(anyDuplicated(colnames(counts))) {
 		message("Repeated column names found in count matrix")
 		row.names(samples) <- 1L:nlib
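
A toy sketch of the new samples= argument (made-up counts, not part of the
patch): when group is not supplied, the constructor now takes it from the
first column of samples, and the supplied columns are appended to y$samples.

    library(edgeR)
    set.seed(1)
    counts <- matrix(rpois(40, lambda=10), nrow=10, ncol=4)    # 10 genes x 4 libraries
    sampleinfo <- data.frame(group=c("A","A","B","B"), batch=c(1,2,1,2))
    y <- DGEList(counts=counts, samples=sampleinfo)            # group taken from samples[,1]
    y$samples    # group, lib.size, norm.factors plus the supplied columns
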
diff --git a/R/calcNormFactors.R b/R/calcNormFactors.R
index c1145c6..c4c5be6 100644
--- a/R/calcNormFactors.R
+++ b/R/calcNormFactors.R
@@ -73,9 +73,8 @@ calcNormFactors.default <- function(object, lib.size=NULL, method=c("TMM","RLE",
 }
 
 .calcFactorWeighted <- function(obs, ref, libsize.obs=NULL, libsize.ref=NULL, logratioTrim=.3, sumTrim=0.05, doWeighting=TRUE, Acutoff=-1e10)
+#	TMM between two libraries
 {
-	if( all(obs==ref) ) return(1)
-
 	obs <- as.numeric(obs)
 	ref <- as.numeric(ref)
 
@@ -93,9 +92,10 @@ calcNormFactors.default <- function(object, lib.size=NULL, method=c("TMM","RLE",
 	absE <- absE[fin]
 	v <- v[fin]
 
+	if(max(abs(logR)) < 1e-6) return(1)
+
 #	taken from the original mean() function
 	n <- length(logR)
-
 	loL <- floor(n * logratioTrim) + 1
 	hiL <- n + 1 - loL
 	loS <- floor(n * sumTrim) + 1
diff --git a/R/cpm.R b/R/cpm.R
index 544e2b4..6a8f7d3 100644
--- a/R/cpm.R
+++ b/R/cpm.R
@@ -14,9 +14,12 @@ cpm.DGEList <- function(x, normalized.lib.sizes=TRUE, log=FALSE, prior.count=0.2
 cpm.default <- function(x, lib.size=NULL, log=FALSE, prior.count=0.25, ...)
 #	Counts per million for a matrix
 #	Davis McCarthy and Gordon Smyth.
-#	Created 20 June 2011. Last modified 6 July 2015
+#	Created 20 June 2011. Last modified 11 March 2016
 {
 	x <- as.matrix(x)
+	if (any(dim(x)==0L)) {
+		return(x)
+	}
 	if(is.null(lib.size)) lib.size <- colSums(x)
 	if(log) {
 		prior.count.scaled <- lib.size/mean(lib.size)*prior.count
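
With this guard, cpm() passes empty input through unchanged instead of
failing; a minimal sketch (toy matrices, not part of the patch):

    library(edgeR)
    cpm(matrix(0, nrow=0, ncol=3))       # empty 0 x 3 matrix is returned as-is
    cpm(matrix(1:12, 4, 3), log=TRUE)    # usual case: log2-CPM with prior.count=0.25
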
diff --git a/R/dispCoxReidInterpolateTagwise.R b/R/dispCoxReidInterpolateTagwise.R
index acd222f..462e77f 100644
--- a/R/dispCoxReidInterpolateTagwise.R
+++ b/R/dispCoxReidInterpolateTagwise.R
@@ -38,7 +38,7 @@ dispCoxReidInterpolateTagwise <- function(y, design, offset=NULL, dispersion, tr
 #	Apply min.row.sum and use input dispersion for small count tags
 	i <- (rowSums(y) >= min.row.sum)
 	if(any(!i)) {
-		if(any(i)) dispersion[i] <- Recall(y=y[i,],design=design,offset=offset[i,],dispersion=dispersion[i],AveLogCPM=AveLogCPM[i],grid.npts=grid.npts,min.row.sum=0,prior.df=prior.df,span=span,trend=trend,weights=weights[i,])
+		if(any(i)) dispersion[i] <- Recall(y=y[i,,drop=FALSE],design=design,offset=offset[i,,drop=FALSE],dispersion=dispersion[i],AveLogCPM=AveLogCPM[i],grid.npts=grid.npts,min.row.sum=0,prior.df=prior.df,span=span,trend=trend,weights=weights[i,,drop=FALSE])
 		return(dispersion)
 	}
 
diff --git a/R/equalizeLibSizes.R b/R/equalizeLibSizes.R
index ff123b6..3b9bd39 100644
--- a/R/equalizeLibSizes.R
+++ b/R/equalizeLibSizes.R
@@ -1,28 +1,58 @@
-equalizeLibSizes <- function(object, dispersion=NULL, common.lib.size=NULL)
-#	Normalize counts so that the library sizes can be treated as equal.
+# Calculate pseudo counts and pseudo library sizes.
+
+equalizeLibSizes <- function(y, ...)
+UseMethod("equalizeLibSizes")
+
+equalizeLibSizes.DGEList <- function(y, dispersion=NULL, ...)
+#	Yunshun Chen. Created 17 March 2016.
+{
+#	Check y
+	y <- validDGEList(y)
+
+#	Check dispersion
+	if(is.null(dispersion)) dispersion <- getDispersion(y)
+	
+	lib.size <- y$samples$lib.size * y$samples$norm.factors
+
+	out <- equalizeLibSizes(y=y$counts, group=y$samples$group, dispersion=dispersion, lib.size=lib.size)
+	y$pseudo.counts <- out$pseudo.counts
+	y$pseudo.lib.size <- out$pseudo.lib.size
+	y
+}
+
+equalizeLibSizes.default <- function(y, group=NULL, dispersion=NULL, lib.size=NULL, ...)
 #	Uses a quantile-to-quantile transformation so that new count counts are equivalent deviates on the equalized scale.
 #	Davis McCarthy, Gordon Smyth.
-#	Created July 2009. Last modified 24 September 2014.
+#	Created July 2009. Last modified 17 March 2016.
 {
-	d <- dim(object)
-	ntags <- d[1]
-	nlibs <- d[2]
-	lib.size <- object$samples$lib.size * object$samples$norm.factors
-	if(is.null(common.lib.size)) common.lib.size <- exp(mean(log(lib.size)))
-	levs.group <- unique(object$samples$group)
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	nlibs <- ncol(y)
 
-	if(is.null(dispersion)) dispersion <- getDispersion(object)
-	if(is.null(dispersion)) dispersion <- 0.05
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
 
-	input.mean <- output.mean <- matrix(0,ntags,nlibs)
+#	Check dispersion
+	if(is.null(dispersion)) dispersion <- 0.05
+	
+#	Check lib.size
+	if(is.null(lib.size)) lib.size <- colSums(y)
+	if(length(lib.size)!=nlibs) stop("Incorrect length of lib.size.")
+		
+	common.lib.size <- exp(mean(log(lib.size)))
+	levs.group <- unique(group)
+	input.mean <- output.mean <- matrix(0, ntags, nlibs)
 	for(i in 1:length(levs.group)) {
-		j <- object$samples$group==levs.group[i]
-		beta <- mglmOneGroup(object$counts[,j,drop=FALSE],dispersion=dispersion,offset=log(lib.size[j]))
+		j <- group==levs.group[i]
+		beta <- mglmOneGroup(y[,j,drop=FALSE], dispersion=dispersion, offset=log(lib.size[j]))
 		lambda <- exp(beta)
 		input.mean[,j] <- matrix(lambda,ncol=1) %*% matrix(lib.size[j],nrow=1)
 		output.mean[,j] <- matrix(lambda, ncol=1) %*% matrix(common.lib.size, nrow=1, ncol=sum(j))
 	}
-	pseudo <- q2qnbinom(object$counts, input.mean=input.mean, output.mean=output.mean, dispersion=dispersion)
+	pseudo <- q2qnbinom(y, input.mean=input.mean, output.mean=output.mean, dispersion=dispersion)
 	pseudo[pseudo<0] <- 0
-	list(pseudo.counts=pseudo, common.lib.size=common.lib.size)
+	list(pseudo.counts=pseudo, pseudo.lib.size=common.lib.size)
 }
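
A toy sketch of the refactored interface (made-up counts, not part of the
patch): the new default method takes a bare count matrix and returns the
pseudo-counts together with a single pseudo.lib.size, the geometric mean of
the library sizes.

    library(edgeR)
    set.seed(2)
    y <- matrix(rnbinom(1000, mu=10, size=2), ncol=4)    # 250 tags x 4 libraries
    out <- equalizeLibSizes(y, group=c(1,1,2,2), dispersion=0.1)
    str(out)    # $pseudo.counts: 250 x 4 matrix; $pseudo.lib.size: scalar
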
diff --git a/R/estimateCommonDisp.R b/R/estimateCommonDisp.R
index 2f44a1f..92ff364 100644
--- a/R/estimateCommonDisp.R
+++ b/R/estimateCommonDisp.R
@@ -1,43 +1,62 @@
-estimateCommonDisp <- function(object,tol=1e-06,rowsum.filter=5,verbose=FALSE)
-#	Estimate common dispersion using exact conditional likelihood
-#	Davis McCarthy, Mark Robinson, Gordon Smyth.
-#	Created 2009. Last modified 20 Nov 2013.
-{
-	if(!is(object,"DGEList")) stop("Currently supports DGEList objects")
-	object <- validDGEList(object)
-	group <- object$samples$group <- as.factor(object$samples$group)
+# Estimate common dispersion using exact conditional likelihood
+
+estimateCommonDisp <- function(y, ...)
+UseMethod("estimateCommonDisp")
 
+estimateCommonDisp.DGEList <- function(y, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
+# Yunshun Chen. Created 18 March 2016.
+{
+	y <- validDGEList(y)
+	group <- y$samples$group
+	lib.size <- y$samples$lib.size * y$samples$norm.factors
+	
 	if( all(tabulate(group)<=1) ) {
 		warning("There is no replication, setting dispersion to NA.")
-		object$common.dispersion <- NA
-		return(object)
+		y$common.dispersion <- NA
+		return(y)
 	}
 
-	tags.used <- rowSums(object$counts) > rowsum.filter
-	pseudo.obj <- object[tags.used,]
+	out <- estimateCommonDisp(y$counts, group=group, lib.size=lib.size, tol=tol, rowsum.filter=rowsum.filter, verbose=verbose, ...)	
+	y$common.dispersion <- out
+	y <- equalizeLibSizes(y, dispersion=out)
+	y$AveLogCPM <- aveLogCPM(y, dispersion=out)
+	y
+}
+
+
+estimateCommonDisp.default <- function(y, group=NULL, lib.size=NULL, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
+#	Davis McCarthy, Mark Robinson, Gordon Smyth.
+#	Created 2009. Last modified 18 March 2016.
+{
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	nlibs <- ncol(y)
+
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
+
+#	Check lib.size
+	if(is.null(lib.size)) lib.size <- colSums(y)
+	if(length(lib.size)!=nlibs) stop("Incorrect length of lib.size.")
 
+	sel <- rowSums(y) > rowsum.filter
+	
 #	Start from small dispersion
 	disp <- 0.01
 	for(i in 1:2) {
-		out <- equalizeLibSizes(object,dispersion=disp)
-		pseudo.obj$counts <- out$pseudo.counts[tags.used,,drop=FALSE]
-		y <- splitIntoGroups(pseudo.obj)
-		delta <- optimize(commonCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=y, der=0)
+		out <- equalizeLibSizes(y, group=group, dispersion=disp, lib.size=lib.size)
+		y.pseudo <- out$pseudo.counts[sel, , drop=FALSE]
+		y.split <- splitIntoGroups(y.pseudo, group=group)
+		delta <- optimize(commonCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=y.split, der=0)
 		delta <- delta$maximum
 		disp <- delta/(1-delta)
 	}
 	if(verbose) cat("Disp =",round(disp,5),", BCV =",round(sqrt(disp),4),"\n")
-	object$common.dispersion <- disp
-	object$pseudo.counts <- out$pseudo.counts
-	object$pseudo.lib.size <- out$common.lib.size
-
-#	Average logCPM
-#	Note different behaviour to estimateGLMCommonDisp:
-#	Here the estimated common.dispersion is used to compute AveLogCPM whereas
-#	estimateGLMCommonDisp calculates AveLogCPM prior to estimating the common
-#	dispersion using a pre-set dispersion of 0.05
-	object$AveLogCPM <- aveLogCPM(object)
-
-	object
+	
+	common.dispersion <- disp
+	common.dispersion
 }
 
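
A toy sketch of the split (made-up counts, not part of the patch): the
default method now returns just the scalar dispersion, while the DGEList
method additionally fills in pseudo.counts, pseudo.lib.size and AveLogCPM
as before.

    library(edgeR)
    set.seed(3)
    counts <- matrix(rnbinom(2000, mu=20, size=5), ncol=4)    # 500 tags x 4 libraries
    group <- factor(c(1,1,2,2))
    estimateCommonDisp(counts, group=group)       # default method: scalar estimate
    y <- DGEList(counts=counts, group=group)
    y <- estimateCommonDisp(y, verbose=TRUE)      # DGEList method: augmented object
    y$common.dispersion
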
diff --git a/R/estimateDisp.R b/R/estimateDisp.R
index 97c650c..149018d 100644
--- a/R/estimateDisp.R
+++ b/R/estimateDisp.R
@@ -1,72 +1,101 @@
-##############################################################
-########### Weighted Likelihood Empirical Bayes ##############
-##############################################################
+#  Estimating dispersion using weighted likelihood empirical Bayes.
 
-estimateDisp <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", tagwise=TRUE, span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06)
-#  Estimating dispersion using weighted conditional likelihood empirical Bayes.
+estimateDisp <- function(y, ...)
+UseMethod("estimateDisp")
+
+estimateDisp.DGEList <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", mixed.df=FALSE, tagwise=TRUE, span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06, ...)
+#  Yunshun Chen. Created 16 March 2016.
+{
+	y <- validDGEList(y)
+	group <- y$samples$group
+	lib.size <- y$samples$lib.size * y$samples$norm.factors
+
+	d <- estimateDisp(y=y$counts, design=design, group=group, lib.size=lib.size, offset=getOffset(y), prior.df=prior.df, trend.method=trend.method, mixed.df=mixed.df, tagwise=tagwise, span=span, min.row.sum=min.row.sum, grid.length=grid.length, grid.range=grid.range, robust=robust, winsor.tail.p=winsor.tail.p, tol=tol, weights=y$weights, ...)
+
+	y$common.dispersion <- d$common.dispersion
+	y$trended.dispersion <- d$trended.dispersion
+	if(tagwise) y$tagwise.dispersion <- d$tagwise.dispersion
+	y$AveLogCPM <- aveLogCPM(y)
+	y$trend.method <- trend.method
+	y$prior.df <- d$prior.df
+	y$prior.n <- d$prior.n
+	y$span <- d$span
+	y
+}
+
+estimateDisp.default <- function(y, design=NULL, group=NULL, lib.size=NULL, offset=NULL, prior.df=NULL, trend.method="locfit", mixed.df=FALSE, tagwise=TRUE, span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06, weights=NULL, ...)
 #  Use GLM approach if a design matrix is given, and classic approach otherwise.
 #  It calculates a matrix of likelihoods for each gene at a set of dispersion grid points, and then calls WLEB() to do the shrinkage.
-#  Yunshun Chen, Gordon Smyth. Created July 2012. Last modified 03 Feb 2015.
+#  Yunshun Chen, Gordon Smyth. Created July 2012. Last modified 18 March 2016.
 {
-	if( !is(y,"DGEList") ) stop("y must be a DGEList")
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	if(ntags==0) return(numeric(0))
+	nlibs <- ncol(y)
+	
+#	Check trend.method
 	trend.method <- match.arg(trend.method, c("none", "loess", "locfit", "movingave"))
-	ntags <- nrow(y$counts)
-	nlibs <- ncol(y$counts)
 	
-	offset <- getOffset(y)
-	AveLogCPM <- aveLogCPM(y)
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
+
+#	Check lib.size
+	if(is.null(lib.size)) lib.size <- colSums(y)
+	if(length(lib.size)!=nlibs) stop("Incorrect length of lib.size.")
+	
+#	Check offset
+	if(is.null(offset)) offset <- log(lib.size)
 	offset <- expandAsMatrix(offset, dim(y))
 
-	# Check for genes with small counts
-	sel <- rowSums(y$counts) >= min.row.sum
+#	Check for genes with small counts
+	sel <- rowSums(y) >= min.row.sum
 	
-	# Spline points
-	spline.pts <- seq(from=grid.range[1],to=grid.range[2],length=grid.length)
+#	Spline points
+	spline.pts <- seq(from=grid.range[1], to=grid.range[2], length=grid.length)
 	spline.disp <- 0.1 * 2^spline.pts
 	grid.vals <- spline.disp/(1+spline.disp)
 	l0 <- matrix(0, sum(sel), grid.length)
 
-	# Classic edgeR
+#	Classic edgeR
 	if(is.null(design)){
-		# One group
+		# One way
 		cat("Design matrix not provided. Switch to the classic mode.\n")
-		group <- y$samples$group <- as.factor(y$samples$group)
 		if(length(levels(group))==1)
-			design <- matrix(1,nlibs,1)
+			design <- matrix(1, nlibs, 1)
 		else
 			design <- model.matrix(~group)
 		if( all(tabulate(group)<=1) ) {
 			warning("There is no replication, setting dispersion to NA.")
-			y$common.dispersion <- NA
-			return(y)
+			return(list(common.dispersion=NA, trended.dispersion=NA, tagwise.dispersion=NA))
 		}
-		pseudo.obj <- y[sel, ]
-
-		q2q.out <- equalizeLibSizes(y[sel, ], dispersion=0.01)
-		pseudo.obj$counts <- q2q.out$pseudo
-		ysplit <- splitIntoGroups(pseudo.obj)
-		delta <- optimize(commonCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=ysplit, der=0)
+		
+		eq <- equalizeLibSizes(y, group=group, dispersion=0.01, lib.size=lib.size)
+		y.pseudo <- eq$pseudo.counts[sel, , drop=FALSE]
+		y.split <- splitIntoGroups(y.pseudo, group=group)
+		delta <- optimize(commonCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=y.split, der=0)
 		delta <- delta$maximum
 		disp <- delta/(1-delta)
 
-		q2q.out <- equalizeLibSizes(y[sel, ], dispersion=disp)
-		pseudo.obj$counts <- q2q.out$pseudo
-		ysplit <- splitIntoGroups(pseudo.obj)
+		eq <- equalizeLibSizes(y, group=group, dispersion=disp, lib.size=lib.size)
+		y.pseudo <- eq$pseudo.counts[sel, , drop=FALSE]
+		y.split <- splitIntoGroups(y.pseudo, group=group)
 	
-		for(j in 1:grid.length) for(i in 1:length(ysplit)) 
-			l0[,j] <- condLogLikDerDelta(ysplit[[i]], grid.vals[j], der=0) + l0[,j]
+		for(j in 1:grid.length) for(i in 1:length(y.split)) 
+			l0[,j] <- condLogLikDerDelta(y.split[[i]], grid.vals[j], der=0) + l0[,j]
 	}
 	# GLM edgeR 
 	else {
 		design <- as.matrix(design)
-		if(ncol(design) >= ncol(y$counts)) {
+		if(ncol(design) >= nlibs) {
 			warning("No residual df: setting dispersion to NA")
-			y$common.dispersion <- NA
-			return(y)
+			return(list(common.dispersion=NA, trended.dispersion=NA, tagwise.dispersion=NA))
 		}
-		
+
 		# Protect against zeros.
-		glmfit <- glmFit(y$counts[sel,], design, offset=offset[sel,], dispersion=0.05, prior.count=0)
+		glmfit <- glmFit(y[sel,], design, offset=offset[sel,], dispersion=0.05, prior.count=0)
 		zerofit <- (glmfit$fitted.values < 1e-4) & (glmfit$counts < 1e-4)
 		by.group <- .comboGroups(zerofit)
 
@@ -85,7 +114,7 @@ estimateDisp <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", t
 			# Using the last fit to hot-start the next fit
 			last.beta <- NULL
 			for(i in 1:grid.length) {
-				out <- adjustedProfileLik(spline.disp[i], y=y$counts[sel, ][subg,cur.nzero,drop=FALSE], design=redesign, 
+				out <- adjustedProfileLik(spline.disp[i], y=y[sel, ][subg,cur.nzero,drop=FALSE], design=redesign, 
 					offset=offset[sel,][subg,cur.nzero,drop=FALSE], start=last.beta, get.coef=TRUE)
 				l0[subg,i] <- out$apl
 				last.beta <- out$beta
@@ -93,21 +122,21 @@ estimateDisp <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", t
 		}
 	}
 
-	out.1 <- WLEB(theta=spline.pts, loglik=l0, covariate=AveLogCPM[sel], trend.method=trend.method, span=span, individual=FALSE, m0.out=TRUE)
+	overall <- maximizeInterpolant(spline.pts, matrix(colSums(l0), nrow=1))
+	common.dispersion <- 0.1 * 2^overall
+	AveLogCPM <- aveLogCPM(y, lib.size=lib.size, dispersion=common.dispersion, weights=weights)
 
-	y$common.dispersion <- 0.1 * 2^out.1$overall
+	out.1 <- WLEB(theta=spline.pts, loglik=l0, covariate=AveLogCPM[sel], trend.method=trend.method, 
+		mixed.df=mixed.df, span=span, overall=FALSE, individual=FALSE, m0.out=TRUE)
 	disp.trend <- 0.1 * 2^out.1$trend
-	y$trended.dispersion <- rep( disp.trend[which.min(AveLogCPM[sel])], ntags )
-	y$trended.dispersion[sel] <- disp.trend
-	y$trend.method <- trend.method
-	y$AveLogCPM <- AveLogCPM
-	y$span <- out.1$span
+	trended.dispersion <- rep( disp.trend[which.min(AveLogCPM[sel])], ntags )
+	trended.dispersion[sel] <- disp.trend
 
-	if(!tagwise) return(y)
+	if(!tagwise) return(list(common.dispersion=common.dispersion, trended.dispersion=trended.dispersion))
 
 	# Calculate prior.df
 	if(is.null(prior.df)){
-		glmfit <- glmFit(y$counts[sel,], design, offset=offset[sel,], dispersion=disp.trend, prior.count=0)
+		glmfit <- glmFit(y[sel,], design, offset=offset[sel,], dispersion=disp.trend, prior.count=0)
 
 		# Residual deviances
 		df.residual <- glmfit$df.residual
@@ -121,16 +150,15 @@ estimateDisp <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", t
 		s2[df.residual==0] <- 0
 		s2 <- pmax(s2,0)
 		s2.fit <- squeezeVar(s2, df=df.residual, covariate=AveLogCPM[sel], robust=robust, winsor.tail.p=winsor.tail.p)
-
 		prior.df <- s2.fit$df.prior
 	}
 	ncoefs <- ncol(design)
 	prior.n <- prior.df/(nlibs-ncoefs)
 
 	if (trend.method!='none') { 
-		y$tagwise.dispersion <- y$trended.dispersion
+		tagwise.dispersion <- trended.dispersion
 	} else {
-		y$tagwise.dispersion <- rep(y$common.dispersion, ntags)
+		tagwise.dispersion <- rep(common.dispersion, ntags)
 	}
 
 	# Checking if the shrinkage is near-infinite.
@@ -143,31 +171,28 @@ estimateDisp <- function(y, design=NULL, prior.df=NULL, trend.method="locfit", t
 
 		# Estimating tagwise dispersions	
 		out.2 <- WLEB(theta=spline.pts, loglik=l0, prior.n=temp.n, covariate=AveLogCPM[sel], 
-			trend.method=trend.method, span=span, overall=FALSE, trend=FALSE, m0=out.1$shared.loglik)
+			trend.method=trend.method, mixed.df=mixed.df, span=span, overall=FALSE, trend=FALSE, m0=out.1$shared.loglik)
 
 		if (!robust) { 
-			y$tagwise.dispersion[sel] <- 0.1 * 2^out.2$individual
+			tagwise.dispersion[sel] <- 0.1 * 2^out.2$individual
 		} else {
-			y$tagwise.dispersion[sel][!too.large] <- 0.1 * 2^out.2$individual[!too.large]
+			tagwise.dispersion[sel][!too.large] <- 0.1 * 2^out.2$individual[!too.large]
 		}
 	}
 
-	if(!robust){
-		# scalar prior.n
-		y$prior.df <- prior.df
-		y$prior.n <- prior.n
-	} else {
-		# vector prior.n
-		y$prior.df <- y$prior.n <- rep(Inf, ntags)
-		y$prior.df[sel] <- prior.df
-		y$prior.n[sel] <- prior.n
+	if(robust) {
+		temp.df <- prior.df
+		temp.n <- prior.n
+		prior.df <- prior.n <- rep(Inf, ntags)
+		prior.df[sel] <- temp.df
+		prior.n[sel] <- temp.n
 	}
-	y
-}
 
+	list(common.dispersion=common.dispersion, trended.dispersion=trended.dispersion, tagwise.dispersion=tagwise.dispersion, span=out.1$span, prior.df=prior.df, prior.n=prior.n)
+}
 
 
-WLEB <- function(theta, loglik, prior.n=5, covariate=NULL, trend.method="locfit", span=NULL, 
+WLEB <- function(theta, loglik, prior.n=5, covariate=NULL, trend.method="locfit", mixed.df=FALSE, span=NULL, 
 	overall=TRUE, trend=TRUE, individual=TRUE, m0=NULL, m0.out=FALSE)
 #  Weighted likelihood empirical Bayes for estimating a parameter vector theta
 #  given log-likelihood values on a grid of theta values
@@ -185,6 +210,8 @@ WLEB <- function(theta, loglik, prior.n=5, covariate=NULL, trend.method="locfit"
 	else
 		trend.method <- match.arg(trend.method, c("none", "loess", "locfit", "movingave"))
 
+	if(trend.method=="locfit" & mixed.df) trend.method <- "locfit_mix"
+
 #	Set span
 	if(is.null(span)) if(ntags<=50) span <- 1 else span <- 0.25+0.75*(50/ntags)^0.5
 
@@ -206,9 +233,26 @@ WLEB <- function(theta, loglik, prior.n=5, covariate=NULL, trend.method="locfit"
 		},
 		"loess" = loessByCol(loglik, covariate, span=span)$fitted.values,
 		"locfit" = locfitByCol(loglik, covariate, span=span, degree=0),
+		"locfit_mix" = {
+			deg0 <- locfitByCol(loglik, covariate, span=span, degree=0)
+			deg1 <- locfitByCol(loglik, covariate, span=span, degree=1)
+			r <- range(covariate)
+			w <- pbeta((covariate-r[1])/(r[2]-r[1]),shape1=2,shape2=2)
+			w*deg0 + (1-w)*deg1
+		},
 		"none" = matrix(colMeans(loglik), ntags, length(theta), byrow=TRUE)
 	)
 
+#	make sure each row of m0 is unimodal
+	if(trend.method=="locfit_mix"){
+		for(i in ncol(m0):3){
+			diff1_m0 <- m0[,i] - m0[,i-1]
+			diff2_m0 <- m0[,i-1] - m0[,i-2]
+			k <- which(diff1_m0>0 & diff2_m0<0)
+			m0[k,1:(i-2)] <- m0[k,(i-1)]
+		}
+	}
+
 	if(trend)
 		out$trend <- maximizeInterpolant(theta, m0)
 
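
A toy sketch of the new interface (made-up counts, not part of the patch;
assumes the locfit package is installed for the default trend): estimateDisp()
is now generic, the default matrix method returns a plain list, and the new
mixed.df argument switches WLEB() to the mixed degree-0/degree-1 locfit trend
added above.

    library(edgeR)
    set.seed(4)
    y <- DGEList(counts=matrix(rnbinom(5000, mu=20, size=5), ncol=5),
                 group=factor(c(1,1,1,2,2)))
    design <- model.matrix(~group, data=y$samples)
    y <- estimateDisp(y, design, mixed.df=TRUE)    # DGEList method, mixed-degree trend
    d <- estimateDisp(y$counts, design)            # default method returns a list
    names(d)    # common.dispersion, trended.dispersion, tagwise.dispersion, span, prior.df, prior.n
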
diff --git a/R/estimateGLMCommonDisp.R b/R/estimateGLMCommonDisp.R
index 7e60718..6b60a9e 100644
--- a/R/estimateGLMCommonDisp.R
+++ b/R/estimateGLMCommonDisp.R
@@ -5,11 +5,12 @@ estimateGLMCommonDisp.DGEList <- function(y, design=NULL, method="CoxReid", subs
 {
 #	Check y
 	y <- validDGEList(y)
-	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y,dispersion=NULL)
+	AveLogCPM <- aveLogCPM(y, dispersion=0.05)
 
-	disp <- estimateGLMCommonDisp(y=y$counts, design=design, offset=getOffset(y), method=method, subset=subset, AveLogCPM=y$AveLogCPM, verbose=verbose, weights=y$weights, ...)
+	disp <- estimateGLMCommonDisp(y=y$counts, design=design, offset=getOffset(y), method=method, subset=subset, AveLogCPM=AveLogCPM, verbose=verbose, weights=y$weights, ...)
 
 	y$common.dispersion <- disp
+	y$AveLogCPM <- aveLogCPM(y, dispersion=disp)
 	y
 }
 
diff --git a/R/estimateGLMTagwiseDisp.R b/R/estimateGLMTagwiseDisp.R
index 2706305..df7c596 100644
--- a/R/estimateGLMTagwiseDisp.R
+++ b/R/estimateGLMTagwiseDisp.R
@@ -12,13 +12,14 @@ estimateGLMTagwiseDisp.DGEList <- function(y, design=NULL, prior.df=10, trend=!i
 		if(is.null(dispersion)) stop("No common.dispersion found in data object. Run estimateGLMCommonDisp first.")
 	}
 
-	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y,dispersion=dispersion)
-
-	d <- estimateGLMTagwiseDisp(y=y$counts, design=design, offset=getOffset(y), dispersion=dispersion, trend=trend, prior.df=prior.df, AveLogCPM=y$AveLogCPM, weights=y$weights, ...)
-
+	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y)
+	ntags <- nrow(y$counts)
+	if(is.null(span)) if(ntags>10) span <- (10/ntags)^0.23 else span <- 1
+	y$span <- span
+	
+	d <- estimateGLMTagwiseDisp(y=y$counts, design=design, offset=getOffset(y), dispersion=dispersion, trend=trend, span=span, prior.df=prior.df, AveLogCPM=y$AveLogCPM, weights=y$weights, ...)
 	y$prior.df <- prior.df
-	y$span <- d$span
-	y$tagwise.dispersion <- d$tagwise.dispersion
+	y$tagwise.dispersion <- d
 	y
 }
 
@@ -51,10 +52,9 @@ estimateGLMTagwiseDisp.default <- function(y, design=NULL, offset=NULL, dispersi
 	if(is.null(span)) if(ntags>10) span <- (10/ntags)^0.23 else span <- 1
 
 #	Check AveLogCPM
-	if(is.null(AveLogCPM)) AveLogCPM <- aveLogCPM(y,offset=offset,dispersion=dispersion,weights=weights)
+	if(is.null(AveLogCPM)) AveLogCPM <- aveLogCPM(y, offset=offset, weights=weights)
 
 #	Call Cox-Reid grid method
 	tagwise.dispersion <- dispCoxReidInterpolateTagwise(y, design, offset=offset, dispersion, trend=trend, prior.df=prior.df, span=span, AveLogCPM=AveLogCPM, weights=weights,...)
-
-	list(tagwise.dispersion=tagwise.dispersion,span=span)
+	tagwise.dispersion
 }
diff --git a/R/estimateGLMTrendedDisp.R b/R/estimateGLMTrendedDisp.R
index d0830f9..2a6b4fe 100644
--- a/R/estimateGLMTrendedDisp.R
+++ b/R/estimateGLMTrendedDisp.R
@@ -5,12 +5,20 @@ estimateGLMTrendedDisp.DGEList <- function(y, design=NULL, method="auto", ...)
 {
 	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y)
 
-	d <- estimateGLMTrendedDisp(y=y$counts, design=design, offset=getOffset(y), AveLogCPM=y$AveLogCPM, method=method, weights=y$weights, ...)
+#	Check method
+	method <- match.arg(method, c("auto", "bin.spline", "bin.loess", "power", "spline"))
+	ntags <- nrow(y$counts)
+	if(method=="auto"){
+		if(ntags < 200) {
+			method <- "power"
+		} else {
+			method <- "bin.spline"
+		}
+	}
+	y$trend.method <- method
 
-	y$trended.dispersion <- d$dispersion
-	y$trend.method <- d$trend.method
-	y$bin.dispersion <- d$bin.dispersion
-	y$bin.AveLogCPM <- d$bin.AveLogCPM
+	d <- estimateGLMTrendedDisp(y=y$counts, design=design, offset=getOffset(y), AveLogCPM=y$AveLogCPM, method=method, weights=y$weights, ...)
+	y$trended.dispersion <- d
 	y
 }
 
@@ -59,8 +67,6 @@ estimateGLMTrendedDisp.default <- function(y, design=NULL, offset=NULL, AveLogCP
 		spline=dispCoxReidSplineTrend(y, design, offset=offset, AveLogCPM=AveLogCPM, ...)
 	)
 
-	trend$AveLogCPM <- AveLogCPM
-	trend$trend.method <- method
-	trend
+	trend$dispersion
 }
 
diff --git a/R/estimateTagwiseDisp.R b/R/estimateTagwiseDisp.R
index 49d1890..b840160 100644
--- a/R/estimateTagwiseDisp.R
+++ b/R/estimateTagwiseDisp.R
@@ -1,66 +1,100 @@
-estimateTagwiseDisp <- function(object, prior.df=10, trend="movingave", span=NULL, method="grid", grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE)
 #  Tagwise dispersion using weighted conditional likelihood empirical Bayes.
+
+estimateTagwiseDisp <- function(y, ...) 
+UseMethod("estimateTagwiseDisp")
+
+estimateTagwiseDisp.DGEList <- function(y, prior.df=10, trend="movingave", span=NULL, method="grid", grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)
+# Yunshun Chen. Created 18 March 2016.
+{
+	y <- validDGEList(y)
+	group <- y$samples$group
+	lib.size <- y$samples$lib.size * y$samples$norm.factors
+	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y)
+	dispersion <- y$common.dispersion
+	if(is.null(dispersion)) stop("No common.dispersion found in the DGEList object. Run estimateCommonDisp first.")
+
+	y$prior.df <- prior.df
+	nlibs <- ncol(y$counts)
+	ngroups <- length(unique(group))
+	y$prior.n <- prior.df/(nlibs - ngroups)
+	if(is.null(span)) if(trend=="movingave") span <- 0.3 else span <- 0.5
+	
+	out <- estimateTagwiseDisp(y$counts, group=group, lib.size=lib.size, dispersion=dispersion, AveLogCPM=y$AveLogCPM, prior.df=prior.df, trend=trend, span=span, method=method, grid.length=grid.length, grid.range=grid.range, tol=tol, verbose=verbose)
+	y$tagwise.dispersion <- out
+	y$span <- span
+	y
+}
+
+
+estimateTagwiseDisp.default <- function(y, group=NULL, lib.size=NULL, dispersion, AveLogCPM=NULL, prior.df=10, trend="movingave", span=NULL, method="grid", grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)
 #  Davis McCarthy, Mark Robinson, Yunshun Chen, Gordon Smyth.
-#  Created 2009. Last modified 11 March 2013.
+#  Created 2009. Last modified 18 March 2016.
 
 #  Notes 3 July 2012:
 #  - interpolating derivatives would be better than interpolating loglik values.
 #  - share code with estimateGLMTagwiseDisp?
 {
-	if( !is(object,"DGEList") ) stop("object must be a DGEList")
-	if(is.null(object$AveLogCPM)) object$AveLogCPM <- aveLogCPM(object)
-	if( is.null(object$common.dispersion) ) {
-		message("Running estimateCommonDisp() on DGEList object before proceeding with estimateTagwiseDisp().")
-		object <- estimateCommonDisp(object)
-	}
-	trend <- match.arg(trend,c("none","loess","movingave","tricube"))
-	if(trend=="tricube") trend="loess"
-	method <- match.arg(method,c("grid","optimize"))
-	ntags <- nrow(object$counts)
-	group <- object$samples$group <- as.factor(object$samples$group)
-	y <- splitIntoGroups(list(counts=object$pseudo.counts,samples=object$samples))
-	delta <- rep(0,ntags)
-	nlibs <- ncol(object$counts)
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	nlibs <- ncol(y)
+
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
+
+#	Check lib.size
+	if(is.null(lib.size)) lib.size <- colSums(y)
+	if(length(lib.size)!=nlibs) stop("Incorrect length of lib.size.")
+
+#	Check AveLogCPM
+	if(is.null(AveLogCPM)) AveLogCPM <- aveLogCPM(y, lib.size=lib.size, dispersion=dispersion)
+
+	trend <- match.arg(trend, c("none", "loess", "movingave", "tricube"))
+	if(trend=="tricube") trend <- "loess"
+	method <- match.arg(method, c("grid", "optimize"))
+
+	eq <- equalizeLibSizes(y, group=group, dispersion=dispersion, lib.size=lib.size)	
+	u <- splitIntoGroups(eq$pseudo.counts, group=group)
+	delta <- rep(0, ntags)
 	ngroups <- length(unique(group))
-	prior.n <- prior.df/(nlibs-ngroups)
+	prior.n <- prior.df/(nlibs - ngroups)
 
 	if(method=="grid"){  # do spline interpolation
 		if(verbose) message("Using interpolation to estimate tagwise dispersion. ")
-		spline.pts <- seq(from=grid.range[1],to=grid.range[2],length=grid.length)
-		spline.disp <- object$common.dispersion * 2^spline.pts
+		spline.pts <- seq(from=grid.range[1], to=grid.range[2], length=grid.length)
+		spline.disp <- dispersion * 2^spline.pts
 		grid.vals <- spline.disp/(1+spline.disp)
 	
-		l0 <- matrix(0,ntags,grid.length)
-		for(j in 1:grid.length) for(i in 1:length(y)) l0[,j] <- condLogLikDerDelta(y[[i]],grid.vals[j],der=0L)+l0[,j]
+		l0 <- matrix(0, ntags, grid.length)
+		for(j in 1:grid.length) for(i in 1:length(u)) l0[,j] <- condLogLikDerDelta(u[[i]], grid.vals[j], der=0L) + l0[,j]
 
 		if(is.null(span)) if(trend=="movingave") span <- 0.3 else span <- 0.5
 		m0 <- switch(trend,
  			"movingave" = {
- 				o <- order(object$AveLogCPM)
+ 				o <- order(AveLogCPM)
  				oo <- order(o)
  				movingAverageByCol(l0[o,], width=floor(span*ntags))[oo,]
  			},
-			"loess" = loessByCol(l0, object$AveLogCPM, span=span)$fitted.values,
-			"none" = matrix(colMeans(l0),ntags,grid.length,byrow=TRUE)
+			"loess" = loessByCol(l0, AveLogCPM, span=span)$fitted.values,
+			"none" = matrix(colMeans(l0), ntags, grid.length, byrow=TRUE)
 		)
 		l0a <- l0 + prior.n*m0
 		d <- maximizeInterpolant(spline.pts, l0a)
-		tagwise.dispersion <- object$common.dispersion * 2^d
+		tagwise.dispersion <- dispersion * 2^d
 	} else {	
 		if(trend != "none") stop("optimize method doesn't allow for abundance-dispersion trend")
 		if(verbose) message("Tagwise dispersion optimization begun, may be slow, progress reported every 100 tags")
 		for(tag in seq_len(ntags)) {
-			delta.this <- optimize(weightedCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=y, tag=tag, ntags=ntags, prior.n=prior.n, der=0L)
+			delta.this <- optimize(weightedCondLogLikDerDelta, interval=c(1e-4,100/(100+1)), tol=tol, maximum=TRUE, y=u, tag=tag, ntags=ntags, prior.n=prior.n, der=0L)
 			delta[tag] <- delta.this$maximum
 			if(verbose) if(tag%%100==0) message("tag ",tag)
 		}
 		tagwise.dispersion <- delta/(1-delta)
 	}
 	if(verbose) cat("\n")
-	
-#	Output
-	object$prior.n <- prior.n
-	object$tagwise.dispersion <- tagwise.dispersion
-	object
+
+	tagwise.dispersion
 }
 
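
A toy sketch of the new default method (made-up counts, not part of the
patch): it works on a count matrix, requires the common dispersion to be
supplied explicitly, and returns only the vector of tagwise estimates.

    library(edgeR)
    set.seed(5)
    counts <- matrix(rnbinom(2000, mu=20, size=5), ncol=4)
    group <- c(1,1,2,2)
    d0 <- estimateCommonDisp(counts, group=group)    # scalar common dispersion
    dtag <- estimateTagwiseDisp(counts, group=group, dispersion=d0)
    summary(dtag)
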
diff --git a/R/estimateTrendedDisp.R b/R/estimateTrendedDisp.R
index 47c0b36..f64c910 100644
--- a/R/estimateTrendedDisp.R
+++ b/R/estimateTrendedDisp.R
@@ -1,48 +1,77 @@
-estimateTrendedDisp <- function(object, method="bin.spline", df=5, span=2/3)
+# Estimate trended dispersions using exact conditional likelihood
+
+estimateTrendedDisp <- function(y, ...)
+UseMethod("estimateTrendedDisp")
+
+estimateTrendedDisp.DGEList <- function(y, method="bin.spline", df=5, span=2/3, ...)
+# Yunshun Chen. Created 18 March 2016.
+{
+	y <- validDGEList(y)
+	group <- y$samples$group
+	lib.size <- y$samples$lib.size * y$samples$norm.factors
+	if(is.null(y$AveLogCPM)) y$AveLogCPM <- aveLogCPM(y)
+	
+	out <- estimateTrendedDisp(y$counts, group=group, lib.size=lib.size, AveLogCPM=y$AveLogCPM, method=method, df=df, span=span)
+	y$trended.method <- method
+	y$trended.dispersion <- out
+	y
+}
+
+
+estimateTrendedDisp.default <- function(y, group=NULL, lib.size=NULL, AveLogCPM=NULL, method="bin.spline", df=5, span=2/3, ...)
 # Yunshun Chen, Gordon Smyth.
-# Created 2 Feb 2012, last modified on 24 Sep 2015.
+# Created 2 Feb 2012, last modified on 18 March 2016.
 {
-	if( !is(object,"DGEList") ) stop("object must be a DGEList")
-	if( is.null(object$pseudo.counts) ) {
-		message("Running estimateCommonDisp() on DGEList object before proceeding with estimateTrendedDisp().")
-		object <- estimateCommonDisp(object)
-	}
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	nlibs <- ncol(y)
+
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
+
+#	Check lib.size
+	if(is.null(lib.size)) lib.size <- colSums(y)
+	if(length(lib.size)!=nlibs) stop("Incorrect length of lib.size.")
 
-	ntags <- nrow(object$counts)	
-	logCPM <- object$AveLogCPM
-	if(is.null(logCPM)) logCPM <- aveLogCPM(object)
+#	Check AveLogCPM
+	if(is.null(AveLogCPM)) AveLogCPM <- aveLogCPM(y, lib.size=lib.size)
+
+#	Check method
+	method <- match.arg(method, c("bin.spline", "bin.loess"))
 	
 	nbins <- 50
-	if(nbins>ntags) stop("nbins greater than number of rows of data")
-	bins <- cutWithMinN(logCPM,intervals=nbins,min.n=floor(ntags/nbins))
-	disp.bins <- logCPM.bins <- rep(NA,nbins)
+	if(nbins > ntags) stop("nbins greater than number of rows of data")
+	bins <- cutWithMinN(AveLogCPM, intervals=nbins, min.n=floor(ntags/nbins))
+	disp.bins <- AveLogCPM.bins <- rep(NA, nbins)
 	
 	for(i in 1:nbins) {
 		tagsinbin <- bins$group==i
-		disp.bins[i] <- estimateCommonDisp(object[tagsinbin,],rowsum.filter=0)$common.dispersion
-		logCPM.bins[i] <- mean(logCPM[tagsinbin])
+		disp.bins[i] <- estimateCommonDisp(y[tagsinbin,], group=group, lib.size=lib.size, rowsum.filter=0)
+		AveLogCPM.bins[i] <- mean(AveLogCPM[tagsinbin])
 	}
 
 	if( method=="bin.spline" ) {
 		if(!requireNamespace("splines",quietly=TRUE)) stop("splines required but is not available")
 		p1 <- (1:(df-1))/df
-		knots1 <- quantile(logCPM.bins,p=p1)
-		r <- range(logCPM.bins)
-		knots2 <- r[1]+p1*(r[2]-r[1])
-		knots <- 0.3*knots1+0.7*knots2
-		basisbins <- splines::ns(logCPM.bins,df=df,knots=knots,intercept=TRUE)
+		knots1 <- quantile(AveLogCPM.bins, p=p1)
+		r <- range(AveLogCPM.bins)
+		knots2 <- r[1] + p1*(r[2]-r[1])
+		knots <- 0.3*knots1 + 0.7*knots2
+		basisbins <- splines::ns(AveLogCPM.bins, df=df, knots=knots, intercept=TRUE)
 		beta <- coefficients(lm.fit(basisbins, sqrt(disp.bins)))
-		basisall <- predict(basisbins,newx=logCPM)
-		dispersion <- drop(basisall %*% beta)^2
+		basisall <- predict(basisbins, newx=AveLogCPM)
+		trended.dispersion <- drop(basisall %*% beta)^2
 	}
 	
 	if( method=="bin.loess" ) {
-		fit <- loessFit(sqrt(disp.bins), logCPM.bins, span=span, iterations=1)
-		f <- approxfun(logCPM.bins, fit$fitted, rule=2)
-		dispersion <- f(logCPM)^2
+		fit <- loessFit(sqrt(disp.bins), AveLogCPM.bins, span=span, iterations=1)
+		f <- approxfun(AveLogCPM.bins, fit$fitted, rule=2)
+		trended.dispersion <- f(AveLogCPM)^2
 	}
-
-	object$trended.dispersion <- dispersion
-	object
+	
+	trended.dispersion
 }
 
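
A toy sketch of the new default method (made-up counts, not part of the
patch): it too accepts a bare matrix now; the fixed 50-bin scheme means at
least 50 rows are needed, and method="bin.spline" uses the splines package.

    library(edgeR)
    set.seed(6)
    counts <- matrix(rnbinom(4000, mu=20, size=5), ncol=4)    # 1000 tags, enough for 50 bins
    dtrend <- estimateTrendedDisp(counts, group=c(1,1,2,2), method="bin.spline", df=5)
    plot(aveLogCPM(counts), sqrt(dtrend))    # trended BCV against abundance
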
diff --git a/R/geneset-DGEList.R b/R/geneset-DGEList.R
index 5335589..0d74f33 100644
--- a/R/geneset-DGEList.R
+++ b/R/geneset-DGEList.R
@@ -16,6 +16,15 @@ mroast.DGEList <- function(y, index=NULL, design=NULL, contrast=ncol(design), ..
 	mroast(y=y, index=index, design=design, contrast=contrast, var.prior=1, df.prior=Inf, ...)
 }
 
+fry.DGEList <- function(y, index=NULL, design=NULL, contrast=ncol(design), ...)
+#	Rotation gene set testing for RNA-Seq data with multiple sets
+#	Yunshun Chen, Gordon Smyth
+#	Created 1 Dec 2015.  Last revised 15 April 2016.
+{
+	y <- .zscoreDGE(y=y, design=design, contrast=contrast)
+	fry(y=y, index=index, design=design, contrast=contrast, standardize="none", ...)
+}
+
 camera.DGEList <- function(y, index, design=NULL, contrast=ncol(design), ...)
 #	Rotation gene set testing for RNA-Seq data accounting for inter-gene correlation
 #	Yunshun Chen, Gordon Smyth
diff --git a/R/gini.R b/R/gini.R
new file mode 100644
index 0000000..e376a66
--- /dev/null
+++ b/R/gini.R
@@ -0,0 +1,10 @@
+gini <- function(x)
+#	Gini diversity index for columns of a numeric matrix
+#	Gordon Smyth
+#	5 Feb 2016
+{
+	d <- dim(x)
+	if(is.null(d)) d <- dim(x) <- c(length(x),1)
+	for (j in 1:d[2]) x[,j] <- sort.int(x[,j],na.last=TRUE)
+	(2*colSums((1:d[1])*x)/colSums(x)-d[1]-1)/d[1]
+}
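
A quick sketch of the new function (toy values, not part of the patch):
gini() returns one coefficient per column, 0 for perfectly even counts and
approaching 1 as a column becomes dominated by a few rows.

    library(edgeR)
    gini(rep(1, 4))          # 0: perfectly even
    gini(c(0, 0, 0, 100))    # 0.75: one tag dominates
    x <- matrix(rpois(800, lambda=10), ncol=4)
    gini(x)                  # one value per column
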
diff --git a/R/glmQLFTest.R b/R/glmQLFTest.R
index 47aa94f..cb41005 100644
--- a/R/glmQLFTest.R
+++ b/R/glmQLFTest.R
@@ -4,7 +4,7 @@ glmQLFit <- function(y, ...)
 UseMethod("glmQLFit")
 
 glmQLFit.DGEList <- function(y, design=NULL, dispersion=NULL, offset=NULL, abundance.trend=TRUE, robust=FALSE, winsor.tail.p=c(0.05, 0.1), ...)
-# 	Written by Yunshun Chen and Aaron Lun
+# 	Yunshun Chen and Aaron Lun
 #	Created 05 November 2014.
 {
 	if(is.null(dispersion)) {
@@ -26,14 +26,25 @@ glmQLFit.DGEList <- function(y, design=NULL, dispersion=NULL, offset=NULL, abund
 
 glmQLFit.default <- function(y, design=NULL, dispersion=NULL, offset=NULL, lib.size=NULL, abundance.trend=TRUE, AveLogCPM=NULL, robust=FALSE, winsor.tail.p=c(0.05, 0.1), ...)
 # 	Fits a GLM and computes quasi-likelihood dispersions for each gene.
-# 	Written by Yunshun Chen and Aaron Lun, based on code by Davis McCarthy and Gordon Smyth
-# 	Created 15 September 2014. Last modified 05 November 2014.
+# 	Davis McCarthy, Gordon Smyth, Yunshun Chen, Aaron Lun.
+# 	Originally part of glmQLFTest, as separate function 15 September 2014. Last modified 05 November 2014.
 {
 #	Check y
 	y <- as.matrix(y)
 	ntag <- nrow(y)
 	nlib <- ncol(y)
-
+	
+#	Check design
+	if(is.null(design)) {
+		design <- matrix(1,ncol(y),1)
+		rownames(design) <- colnames(y)
+		colnames(design) <- "Intercept"
+	} else {
+		design <- as.matrix(design)
+		ne <- nonEstimable(design)
+		if(!is.null(ne)) stop(paste("Design matrix not of full rank.  The following coefficients not estimable:\n", paste(ne, collapse = " ")))
+	}
+	
 #	Check dispersion
 	if(is.null(dispersion)) stop("No dispersion values provided.")
 
@@ -73,10 +84,10 @@ glmQLFit.default <- function(y, design=NULL, dispersion=NULL, offset=NULL, lib.s
 }
 
 
-glmQLFTest <- function(glmfit, coef=ncol(glmfit$design), contrast=NULL)
+glmQLFTest <- function(glmfit, coef=ncol(glmfit$design), contrast=NULL, poisson.bound=TRUE)
 #	Quasi-likelihood F-tests for DGE glms.
-#	Davis McCarthy and Gordon Smyth.
-#	Created 18 Feb 2011. Last modified 15 Sep 2014.
+#	Davis McCarthy, Gordon Smyth, Aaron Lun.
+#	Created 18 Feb 2011. Last modified 28 Jan 2016.
 {
     if(!is(glmfit,"DGEGLM")) stop("glmfit must be an DGEGLM object produced by glmQLFit") 
 	if(is.null(glmfit$var.post)) stop("need to run glmQLFit before glmQLFTest") 
@@ -91,11 +102,15 @@ glmQLFTest <- function(glmfit, coef=ncol(glmfit$design), contrast=NULL)
 #	Compute p-values from the QL F-statistic
 	F.pvalue <- pf(F.stat, df1=out$df.test, df2=df.total, lower.tail=FALSE, log.p=FALSE)
 
-#	Ensure is not more significant than chisquare test
-	i <- glmfit$var.post < 1
-	if(any(i)) {
-		chisq.pvalue <- pchisq(out$table$LR[i], df=out$df.test[i], lower.tail=FALSE, log.p=FALSE)
-		F.pvalue[i] <- pmax(F.pvalue[i],chisq.pvalue)
+#	Ensure is not more significant than chisquare test with Poisson variance
+	if(poisson.bound) {
+		i <- rowSums(glmfit$var.post * (1 + glmfit$fitted.values * glmfit$dispersion) < 1) > 0L
+		if(any(i)) {
+			pois.fit <- glmfit[i,]
+			pois.fit <- glmFit(pois.fit$counts, design=pois.fit$design, offset=pois.fit$offset, weights=pois.fit$weights, start=pois.fit$unshrunk.coefficients, dispersion=0)
+			pois.res <- glmLRT(pois.fit, coef=coef, contrast=contrast) 
+			F.pvalue[i] <- pmax(F.pvalue[i], pois.res$table$PValue)
+		}
 	}
 
 	out$table$LR <- out$table$PValue <- NULL
@@ -108,8 +123,8 @@ glmQLFTest <- function(glmfit, coef=ncol(glmfit$design), contrast=NULL)
 
 plotQLDisp <- function(glmfit, xlab="Average Log2 CPM", ylab="Quarter-Root Mean Deviance", pch=16, cex=0.2, col.shrunk="red", col.trend="blue", col.raw="black", ...)
 # 	Plots the result of QL-based shrinkage.
-#	written by Aaron Lun, based on code by Davis McCarthy and Gordon Smyth
-#	15 September 2014
+#	Davis McCarthy, Gordon Smyth, Aaron Lun, Yunshun Chen.
+#	Originally part of glmQLFTest, as separate function 15 September 2014.
 {
 	A <- glmfit$AveLogCPM
 	if(is.null(A)) A <- aveLogCPM(glmfit)
@@ -117,15 +132,15 @@ plotQLDisp <- function(glmfit, xlab="Average Log2 CPM", ylab="Quarter-Root Mean
 	if(is.null(glmfit$var.post)) { stop("need to run glmQLFit before plotQLDisp") }
 
 	plot(A, sqrt(sqrt(s2)),xlab=xlab, ylab=ylab, pch=pch, cex=cex, col=col.raw, ...)
-	points(A,sqrt(sqrt(glmfit$var.post)),pch=16,cex=0.2,col=col.shrunk)
+	points(A, sqrt(sqrt(glmfit$var.post)), pch=pch, cex=cex, col=col.shrunk)
 	if (length(glmfit$var.prior)==1L) { 
 		abline(h=sqrt(sqrt(glmfit$var.prior)), col=col.trend)
 	} else {
 		o <- order(A)
-		lines(A[o],sqrt(sqrt(glmfit$var.prior[o])),col=col.trend)
+		lines(A[o], sqrt(sqrt(glmfit$var.prior[o])), col=col.trend, lwd=2)
 	}
-
-	legend("topright",pch=16,col=c(col.raw,col.shrunk,col.trend),legend=c("Raw","Squeezed", "Trend"))
+	
+	legend("topright", lty=c(-1,-1,1), pch=c(pch,pch,-1), col=c(col.raw,col.shrunk,col.trend), pt.cex=0.7, lwd=2, legend=c("Raw","Squeezed", "Trend"))
 	invisible(NULL)
 }
 
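
A toy sketch of the new poisson.bound argument (made-up counts, not part of
the patch): with the default TRUE, the reported p-value can never be smaller
than that of a likelihood ratio test with NB dispersion zero; FALSE recovers
the old behaviour.

    library(edgeR)
    set.seed(7)
    y <- DGEList(counts=matrix(rnbinom(5000, mu=20, size=5), ncol=5),
                 group=factor(c(1,1,1,2,2)))
    design <- model.matrix(~group, data=y$samples)
    y <- estimateDisp(y, design)
    fit <- glmQLFit(y, design, robust=TRUE)
    res  <- glmQLFTest(fit, coef=2)                         # bounded (default)
    res2 <- glmQLFTest(fit, coef=2, poisson.bound=FALSE)    # old, unbounded behaviour
    topTags(res)
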
diff --git a/R/glmfit.R b/R/glmfit.R
index 785e74c..48e0928 100644
--- a/R/glmfit.R
+++ b/R/glmfit.R
@@ -48,8 +48,10 @@ glmFit.default <- function(y, design=NULL, dispersion=NULL, offset=NULL, lib.siz
 #	Check offset and lib.size
 	if(is.null(offset)) {
 		if(is.null(lib.size)) lib.size <- colSums(y)
+		if(any(lib.size==0L)) stop("Zero library size detected. Remove the empty libraries before proceeding to the next step.")
 		offset <- log(lib.size)
 	}
+	if(any(is.infinite(offset))) stop("Infinite offset value detected. Check the input offset.")
 	offset <- expandAsMatrix(offset,dim(y))
 
 #	weights are checked in lower-level functions
diff --git a/R/gof.R b/R/gof.R
index a7ebfbf..0a6c3c6 100644
--- a/R/gof.R
+++ b/R/gof.R
@@ -1,8 +1,7 @@
-gof <- function( glmfit, pcutoff=0.1, adjust="holm", plot=FALSE, main="qq-plot of genewise goodness of fit", ...)
-## Use LRT on deviance from a DGEGLM object
-## to identify dispersion outlier genes
-## Davis McCarthy, Gordon Smyth
-## 23 Mar 2011. Last modified 23 May 2012.
+gof <- function( glmfit, pcutoff=0.1, adjust="holm", plot=FALSE, main="qq-plot of residual deviances", ...)
+#	Compare residual deviances to chisquare distribution to identify dispersion outlier genes
+#	Davis McCarthy, Gordon Smyth
+#	23 Mar 2011. Last modified 23 May 2012.
 {
 	stopifnot( is(glmfit, "DGEGLM") )
 	gof.stats <- glmfit$deviance
@@ -19,41 +18,6 @@ gof <- function( glmfit, pcutoff=0.1, adjust="holm", plot=FALSE, main="qq-plot o
 		qqnorm(z,col=col,pch=pch,main=main,...)
 		abline(0,1)
 	}
-	invisible(list(gof.statistics=gof.stats, gof.pvalues=gof.pvals, outlier=outlier, df=glmfit$df.residual[1]))
-}
-
-
-.gof2 <- function( y, design, dispersion, offset=NULL, pcutoff=0.1, fit=NULL, method="LR" )
-    ## Calculate Deviance or Pearson goodness of fit statistics
-    ## for the dispersion parameter and find dispersion outliers
-    ## Davis McCarthy
-    ## 8 Feb 2011. Last modified 13 Nov 2012.
-{
-    y <- as.matrix(y)
-	nlibs <- ncol(y)
-	ntags <- nrow(y)
-    npar <- ncol(design)
-    if( is.null(fit) )
-        fit <- glmFit(y, design, dispersion, offset=offset, prior.count=0)
-    method <- match.arg(method, c("LR","Pearson"))
-
-    if( method=="LR") {
-        gof.stats <- fit$deviance
-    }
-    else {
-        means <- fitted(fit)
-        V <- means*(1+dispersion*means)
-        gof.stats <- rowSums( (y-means)^2/V )
-    }
 
-    right <- gof.stats > ( nlibs - npar )
-    gof.pvals <- rep(NA, ntags)
-    gof.pvals[right] <- pchisq(gof.stats[right], df=(nlibs - npar), lower.tail=FALSE, log.p=FALSE)
-    gof.pvals[!right] <- pchisq(gof.stats[!right], df=(nlibs - npar), lower.tail=TRUE, log.p=FALSE) 
-    outlier <- p.adjust(gof.pvals, method="holm") < pcutoff
-
-    new("list", list(gof.statistics=gof.stats, gof.pvalues=gof.pvals, outlier=outlier, right=right, fit=fit))
+	invisible(list(gof.statistics=gof.stats, gof.pvalues=gof.pvals, outlier=outlier, df=glmfit$df.residual[1]))
 }
-
-
-
diff --git a/R/plotBCV.R b/R/plotBCV.R
index 7a42442..f35790e 100644
--- a/R/plotBCV.R
+++ b/R/plotBCV.R
@@ -8,37 +8,39 @@ plotBCV <- function(y, xlab="Average log CPM", ylab="Biological coefficient of v
 
 #	Compute AveLogCPM if not found in y
 	A <- y$AveLogCPM
-	if(is.null(A)) A <- aveLogCPM(y$counts,offset=getOffset(y))
+	if(is.null(A)) A <- aveLogCPM(y$counts, offset=getOffset(y))
 
 #	Points to determine y axis limits
 	disp <- getDispersion(y)
 	if(is.null(disp)) stop("No dispersions to plot")
-	if(attr(disp,"type")=="common") disp <- rep(disp,length=length(A))
+	if(attr(disp,"type")=="common") disp <- rep(disp, length=length(A))
 
 #	Make plot
 	plot(A, sqrt(disp), xlab=xlab, ylab=ylab, type="n", ...)
-	labels <- cols <- NULL
+	labels <- cols <- lty <- pt <- NULL
 	if(!is.null(y$tagwise.dispersion)) {
 		points(A, sqrt(y$tagwise.dispersion), pch=pch, cex=cex, col=col.tagwise)
-		labels <- c(labels,"Tagwise")
-		cols <- c(cols,col.tagwise)
+		labels <- c(labels, "Tagwise")
+		cols <- c(cols, col.tagwise)
+		lty <- c(lty, -1)
+		pt <- c(pt, pch)
 	}
 	if(!is.null(y$common.dispersion)) {
 		abline(h=sqrt(y$common.dispersion), col=col.common, lwd=2)
-		labels <- c(labels,"Common")
-		cols <- c(cols,col.common)
+		labels <- c(labels, "Common")
+		cols <- c(cols, col.common)
+		lty <- c(lty, 1)
+		pt <- c(pt, -1)		
 	}
 	if(!is.null(y$trended.dispersion)) {
 		o <- order(A)
 		lines(A[o], sqrt(y$trended.dispersion)[o], col=col.trend, lwd=2)
-		labels <- c(labels,"Trend")
-		cols <- c(cols,col.trend)
+		labels <- c(labels, "Trend")
+		cols <- c(cols, col.trend)
+		lty <- c(lty, 1)
+		pt <- c(pt, -1)		
 	}
-	legend("topright",legend=labels,lty=c(-1,1,1),pch=c(pch,-1,-1),pt.cex=cex,lwd=2,col=cols)
-
-#	Add binned dispersions if appropriate
-#	if(!is.null(y$trend.method)) if(y$trend.method %in% c("bin.spline","bin.loess")) if(!is.null(y$bin.dispersion)) if(!is.null(y$bin.AveLogCPM))
-#		points(y$bin.AveLogCPM, sqrt(y$bin.dispersion), pch=16, cex=1, col="lightblue")
+	legend("topright", legend=labels, lty=lty, pch=pt, pt.cex=cex, lwd=2, col=cols)
 
 	invisible()
 }
diff --git a/R/splitIntoGroups.R b/R/splitIntoGroups.R
index 3d6ca02..8e6e7a2 100644
--- a/R/splitIntoGroups.R
+++ b/R/splitIntoGroups.R
@@ -1,11 +1,29 @@
-splitIntoGroups<-function(object) 
+# Split the data according to group
+
+splitIntoGroups <- function(y, ...)
+UseMethod("splitIntoGroups")
+
+splitIntoGroups.DGEList <- function(y, ...)
+# Yunshun Chen. Created 18 March 2016.
+{
+	group <- y$samples$group
+	splitIntoGroups(y$counts, group=group)
+}
+
+splitIntoGroups.default <- function(y, group=NULL, ...) 
 # Written by Davis McCarthy, February 2009, idea suggested by Mark Robinson
-# A function to split the data from a DGEList object according to group
+# Last modified 18 March 2016.
 {
-	nrows<-nrow(object$counts)
-	y<-lapply(split(t(object$counts),object$samples$group), FUN=function(u) matrix(u,nrow=nrows,byrow=TRUE))
-	for(i in 1:length(levels(object$samples$group))) {
-		rownames(y[[i]])<-rownames(object$counts)
-	}
-	y
+#	Check y
+	y <- as.matrix(y)
+	ntags <- nrow(y)
+	nlibs <- ncol(y)
+
+#	Check group
+	if(is.null(group)) group <- rep(1, nlibs)
+	if(length(group)!=nlibs) stop("Incorrect length of group.")
+	group <- dropEmptyLevels(group)
+	
+	out <- lapply(split(t(y), group), FUN=function(u) matrix(u, nrow=ntags, byrow=TRUE))
+	out
 }
\ No newline at end of file
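
A minimal sketch of the new default method above, which accepts a bare count matrix plus a group vector (simulated counts, illustrative only):

library(edgeR)
counts <- matrix(rnbinom(40, size=1, mu=10), nrow=10, ncol=4)
z <- splitIntoGroups(counts, group=c(1,1,2,2))
length(z)      # 2 matrices, one per group
dim(z[[1]])    # 10 x 2: all tags, libraries of group 1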
diff --git a/R/subsetting.R b/R/subsetting.R
index 1f0705f..41d79a8 100644
--- a/R/subsetting.R
+++ b/R/subsetting.R
@@ -23,7 +23,7 @@ function(object, i, j, keep.lib.sizes=TRUE)
 assign("[.DGEGLM",
 function(object, i, j)
 #  Subsetting for DGEGLM objects
-#  11 May 2011.  Last modified 09 May 2014.
+#  11 May 2011.  Last modified 20 April 2016.
 {
 	if(nargs() != 3) stop("Two subscripts required",call.=FALSE)
 	if(!missing(j)) stop("Subsetting columns not allowed for DGEGLM object.",call.=FALSE)
@@ -31,8 +31,8 @@ function(object, i, j)
 #	Recognized components
 	IJ <- character(0)
 	IX <- c("counts","offset","weights","genes","coefficients","fitted.values","unshrunk.coefficients")
-	I  <- c("AveLogCPM","trended.dispersion","tagwise.dispersion","prior.n","prior.df","dispersion","df.residual","df.residual.zeros","deviance","iter","failed")
-	JX <- c("samples","design")
+	I  <- c("AveLogCPM","dispersion","prior.n","prior.df","var.post","var.prior","df.prior","df.residual","df.residual.zeros","deviance","iter","failed")
+	JX <- character(0)
 
 	subsetListOfArrays(object,i,j,IJ=IJ,IX=IX,I=I,JX=JX)
 })
@@ -57,7 +57,7 @@ function(object, i, j)
 assign("[.DGELRT",
 function(object, i, j)
 #  Subsetting for DGELRT objects
-#  6 April 2011.  Last modified 09 May 2014.
+#  6 April 2011.  Last modified 20 April 2016.
 {
 	if(nargs() != 3) stop("Two subscripts required",call.=FALSE)
 	if(!missing(j)) stop("Subsetting columns not allowed for DGELRT object.",call.=FALSE)
@@ -65,7 +65,7 @@ function(object, i, j)
 #	Recognized components
 	IJ <- character(0)
 	IX <- c("counts","offset","weights","genes","coefficients","fitted.values","table","unshrunk.coefficients")
-	I  <- c("AveLogCPM","trended.dispersion","tagwise.dispersion","prior.n","prior.df","dispersion","df.residual","df.residual.zeros","deviance","iter","failed","df.test")
+	I  <- c("AveLogCPM","dispersion","prior.n","prior.df","var.post","var.prior","df.prior","df.residual","df.residual.zeros","deviance","iter","failed","df.test","df.total")
 	JX <- character(0)
 
 	subsetListOfArrays(object,i,j,IJ=IJ,IX=IX,I=I,JX=JX)
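
The component lists above control which elements are subset by row (I), by row and column (IX/IJ), and so on; DGEGLM and DGELRT objects can only be subset by rows. A minimal sketch (simulated data, illustrative only):

library(edgeR)
y <- matrix(rnbinom(400, mu=10, size=5), 100, 4)
design <- model.matrix(~factor(c(1,1,2,2)))
fit <- glmFit(DGEList(counts=y), design, dispersion=0.1)
fit10 <- fit[1:10, ]      # keep the first 10 genes
nrow(fit10$counts)        # 10
# fit[, 1:2] would stop with "Subsetting columns not allowed for DGEGLM object."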
diff --git a/build/vignette.rds b/build/vignette.rds
index 2aa495e..12bd1b9 100644
Binary files a/build/vignette.rds and b/build/vignette.rds differ
diff --git a/inst/NEWS.Rd b/inst/NEWS.Rd
index c7a76ae..c8d1272 100644
--- a/inst/NEWS.Rd
+++ b/inst/NEWS.Rd
@@ -2,6 +2,49 @@
 \title{edgeR News}
 \encoding{UTF-8}
 
+\section{Version 3.14.0}{\itemize{
+\item
+estimateDisp(), estimateCommonDisp(), estimateTrendedDisp(), estimateTagwiseDisp(), splitIntoGroups() and equalizeLibSizes() are now S3 generic functions.
+
+\item
+The default methods of estimateGLMTrendedDisp() and estimateGLMTagwiseDisp() now return only the dispersion estimates instead of a list.

+
+\item
+Add fry method for DGEList objects.
+
+\item
+Import R core packages explicitly.
+
+\item
+New function gini() to compute Gini coefficients.
+
+\item
+New argument poisson.bound for glmQLFTest().
+If TRUE (default), the p-value returned by glmQLFTest() will never be less than what would be obtained for a likelihood ratio test with NB dispersion equal to zero.
+
+\item
+New argument samples for DGEList(). It takes a data frame containing information for each sample.
+
+\item
+glmFit() now protects against zero library sizes and infinite offset values.
+
+\item
+glmQLFit.default() now avoids passing a NULL design to .residDF().
+
+\item
+cpm.default() now outputs a matrix of the same dimensions as the input even when the input has zero rows or zero columns.
+
+\item
+DGEList() now issues a warning when a zero lib.size is detected.
+
+\item
+Bug fix to calcNormFactors(method="TMM") when two libraries have identical counts but the lib.sizes have been set unequal.
+
+\item
+Add a CRISPR-Cas9 screen case study to the users' guide and rename Nigerian case study to Yoruba.
+}}
+
+
 \section{Version 3.12.0}{\itemize{
 \item
 New argument tagwise for estimateDisp(), allowing users not to estimate tagwise dispersions. 
diff --git a/inst/doc/edgeR.pdf b/inst/doc/edgeR.pdf
index 68b0de2..8ae117d 100644
Binary files a/inst/doc/edgeR.pdf and b/inst/doc/edgeR.pdf differ
diff --git a/man/DGEList.Rd b/man/DGEList.Rd
index 7d20f42..caa7aa1 100644
--- a/man/DGEList.Rd
+++ b/man/DGEList.Rd
@@ -11,14 +11,15 @@ Creates a \code{DGEList} object from a table of counts (rows=features, columns=s
 
 \usage{
 DGEList(counts = matrix(0, 0, 0), lib.size = colSums(counts),
-        norm.factors = rep(1,ncol(counts)), group = rep(1,ncol(counts)), genes = NULL,
-        remove.zeros = FALSE)
+        norm.factors = rep(1,ncol(counts)), samples = NULL,
+        group = NULL, genes = NULL, remove.zeros = FALSE)
 }
 
 \arguments{
   \item{counts}{numeric matrix of read counts.}
   \item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
   \item{norm.factors}{numeric vector of normalization factors that modify the library sizes.}
+  \item{samples}{data frame containing information for each sample.}
   \item{group}{vector or factor giving the experimental group/condition for each sample/library.}
   \item{genes}{data frame containing annotation information for each gene.}
   \item{remove.zeros}{logical, whether to remove rows that have 0 total count.}
@@ -26,7 +27,7 @@ DGEList(counts = matrix(0, 0, 0), lib.size = colSums(counts),
 
 \details{
 To facilitate programming pipelines,
-\code{NULL} values can be input for \code{lib.size}, \code{norm.factors} or \code{group}, in which case the default value is used as if the argument had been missing.
+\code{NULL} values can be input for \code{lib.size}, \code{norm.factors}, \code{samples} or \code{group}, in which case the default value is used as if the argument had been missing.
 }
 
 \value{a \code{\link[edgeR:DGEList-class]{DGEList}} object}
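
A minimal sketch of the new samples argument, assuming it simply supplies extra per-sample annotation columns alongside group, lib.size and norm.factors (simulated counts, illustrative only):

library(edgeR)
counts <- matrix(rpois(20, lambda=10), 5, 4)
info <- data.frame(batch=c("a","a","b","b"))
d <- DGEList(counts=counts, group=c(1,1,2,2), samples=info)
d$samples    # should contain group, lib.size, norm.factors and batch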
diff --git a/man/WLEB.Rd b/man/WLEB.Rd
index 7c270d5..b59ce5b 100644
--- a/man/WLEB.Rd
+++ b/man/WLEB.Rd
@@ -8,31 +8,22 @@ Estimates the parameters which maximize the given log-likelihood matrix using em
 }
 
 \usage{
-WLEB(theta, loglik, prior.n, covariate, trend.method="locfit", span=NULL, overall=TRUE,
-     trend=TRUE, individual=TRUE, m0=NULL, m0.out=FALSE)
+WLEB(theta, loglik, prior.n=5, covariate=NULL, trend.method="locfit", mixed.df=FALSE, 
+     span=NULL, overall=TRUE, trend=TRUE, individual=TRUE, m0=NULL, m0.out=FALSE)
 }
 
 \arguments{
 \item{theta}{numeric vector of values of the parameter at which the log-likelihoods are calculated.}
-
 \item{loglik}{numeric matrix of log-likelihood of all the candidates at those values of parameter.}
-
 \item{prior.n}{numeric scalar, estimate of the prior weight, i.e. the smoothing parameter that indicates the weight to put on the common likelihood compared to the individual's likelihood.}
-
 \item{covariate}{numeric vector of values across which a parameter trend is fitted}
-
 \item{trend.method}{method for estimating the parameter trend. Possible values are \code{"none"}, \code{"movingave"}, \code{"loess"} and \code{"locfit"}.}
-
+\item{mixed.df}{logical, only used when \code{trend.method="locfit"}. If \code{FALSE}, \code{locfit} uses a polynomial of degree 0. If \code{TRUE}, \code{locfit} uses a polynomial of degree 1 for rows with small covariate values. Care is taken to smooth the curve.}
 \item{span}{width of the smoothing window, as a proportion of the data set.}
-
 \item{overall}{logical, should a single value of the parameter which maximizes the sum of all the log-likelihoods be estimated?}
-
 \item{trend}{logical, should a parameter trend (against the covariate) which maximizes the local shared log-likelihoods be estimated?}
-
 \item{individual}{logical, should individual estimates of all the candidates after applying empirical Bayes method along the trend be estimated?}
-
-\item{m0}{numeric matrix of local shared log-likelihoods. If \code{Null}, they will be calculated using the method selected by \code{trend.method}.}
-
+\item{m0}{numeric matrix of local shared log-likelihoods. If \code{NULL}, it will be calculated using the method selected by \code{trend.method}.}
 \item{m0.out}{logical, should local shared log-likelihoods be included in the output?}
 }
 
diff --git a/man/camera.DGEList.Rd b/man/camera.DGEList.Rd
index 6979735..bb8725b 100644
--- a/man/camera.DGEList.Rd
+++ b/man/camera.DGEList.Rd
@@ -32,7 +32,7 @@ A data.frame. See \code{\link{camera}} for details.
 \author{Yunshun Chen, Gordon Smyth}
 
 \references{
-Dunn, K. P., and Smyth, G. K. (1996).
+Dunn, PK, and Smyth, GK (1996).
 Randomized quantile residuals.
 \emph{J. Comput. Graph. Statist.}, 5, 236-244. 
 \url{http://www.statsci.org/smyth/pubs/residual.html}
diff --git a/man/diffSpliceDGE.Rd b/man/diffSpliceDGE.Rd
index f65d9ff..014d854 100644
--- a/man/diffSpliceDGE.Rd
+++ b/man/diffSpliceDGE.Rd
@@ -4,7 +4,8 @@
 \title{Test for Differential Exon Usage}
 \description{Given a negative binomial generalized log-linear model fit at the exon level, test for differential exon usage between experimental conditions.}
 \usage{
-diffSpliceDGE(glmfit, coef=ncol(glmfit$design), contrast=NULL, geneid, exonid=NULL, prior.count=0.125, verbose=TRUE)
+diffSpliceDGE(glmfit, coef=ncol(glmfit$design), contrast=NULL, geneid, exonid=NULL,
+              prior.count=0.125, verbose=TRUE)
 }
 
 \arguments{
diff --git a/man/equalizeLibSizes.Rd b/man/equalizeLibSizes.Rd
index 7b03a95..1258bb1 100644
--- a/man/equalizeLibSizes.Rd
+++ b/man/equalizeLibSizes.Rd
@@ -1,26 +1,32 @@
 \name{equalizeLibSizes}
 \alias{equalizeLibSizes}
+\alias{equalizeLibSizes.DGEList}
+\alias{equalizeLibSizes.default}
 
 \title{Equalize Library Sizes by Quantile-to-Quantile Normalization}
 
 \description{Adjusts counts so that the effective library sizes are equal, preserving fold-changes between groups and preserving biological variability within each group.}
 
 \usage{
-equalizeLibSizes(object, dispersion=NULL, common.lib.size)
+\S3method{equalizeLibSizes}{DGEList}(y, dispersion=NULL, \dots)
+\S3method{equalizeLibSizes}{default}(y, group=NULL, dispersion=NULL, 
+            lib.size=NULL, \dots)
 }
 
 \arguments{
-\item{object}{\code{\link[edgeR:DGEList-class]{DGEList}} object}
-
-\item{dispersion}{numeric vector of dispersion parameters.
-By default, is extracted from \code{object} or, if \code{object} contains no dispersion information, is set to \code{0.05}.}
-
-\item{common.lib.size}{numeric scalar, the library size to normalize to; default is the geometric mean of the original effective library sizes}
+\item{y}{matrix of counts or a \code{DGEList} object.}
+\item{dispersion}{numeric scalar or vector of dispersion parameters.
+By default, is extracted from \code{y} or, if \code{y} contains no dispersion information, is set to \code{0.05}.}
+\item{group}{vector or factor giving the experimental group/condition for each library.}
+\item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
+\item{\dots}{other arguments that are not currently used.}
 }
 
-\value{A list with components
+\value{
+\code{equalizeLibSizes.DGEList} returns a \code{DGEList} object with the following new components:
 	\item{pseudo.counts}{numeric matrix of normalized pseudo-counts}
-	\item{common.lib.size}{normalized library size}
+	\item{pseudo.lib.size}{normalized library size}
+\code{equalizeLibSizes.default} returns a list with components \code{pseudo.counts} and \code{pseudo.lib.size}.
 }
 
 \details{
@@ -28,7 +34,7 @@ This function implements the quantile-quantile normalization method of Robinson
 It computes normalized counts, or pseudo-counts, used by \code{exactTest} and \code{estimateCommonDisp}.
 
 The output pseudo-counts are the counts that would have theoretically arisen had the effective library sizes been equal for all samples.
-The pseudo-counts are computed in such as way as to preserve fold-change differences beween the groups defined by \code{object$samples$group} as well as biological variability within each group.
+The pseudo-counts are computed in such a way as to preserve fold-change differences between the groups defined by \code{y$samples$group} as well as biological variability within each group.
 Consequently, the results will depend on how the groups are defined.
 
 Note that the column sums of the \code{pseudo.counts} matrix will not generally be equal, because the effective library sizes are not necessarily the same as actual library sizes and because the normalized pseudo counts are not equal to expected counts.
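
A minimal sketch of both methods (simulated counts, illustrative only):

library(edgeR)
counts <- matrix(rnbinom(40, mu=10, size=2), 10, 4)
# DGEList method: adds pseudo.counts and pseudo.lib.size to the object
d <- equalizeLibSizes(DGEList(counts=counts, group=c(1,1,2,2)))
names(d)
# default method on a bare matrix: returns a plain list
out <- equalizeLibSizes(counts, group=c(1,1,2,2), dispersion=0.1)
names(out)    # "pseudo.counts" "pseudo.lib.size"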
diff --git a/man/estimateCommonDisp.Rd b/man/estimateCommonDisp.Rd
index 0f12211..896606a 100644
--- a/man/estimateCommonDisp.Rd
+++ b/man/estimateCommonDisp.Rd
@@ -1,5 +1,8 @@
 \name{estimateCommonDisp}
 \alias{estimateCommonDisp}
+\alias{estimateCommonDisp.DGEList}
+\alias{estimateCommonDisp.default}
+
 
 \title{Estimate Common Negative Binomial Dispersion by Conditional Maximum Likelihood}
 
@@ -8,23 +11,28 @@ Maximizes the negative binomial conditional common likelihood to estimate a comm
 }
 
 \usage{
-estimateCommonDisp(object, tol=1e-06, rowsum.filter=5, verbose=FALSE)
+\method{estimateCommonDisp}{DGEList}(y, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
+\method{estimateCommonDisp}{default}(y, group=NULL, lib.size=NULL, tol=1e-06, 
+          rowsum.filter=5, verbose=FALSE, ...)
 }
 
 \arguments{
-\item{object}{a \code{DGEList} object.}
-
+\item{y}{matrix of counts or a \code{DGEList} object.}
 \item{tol}{the desired accuracy, passed to \code{\link{optimize}}.}
-
 \item{rowsum.filter}{genes with total count (across all samples) below this value will be filtered out before estimating the dispersion.}
-
 \item{verbose}{logical, if \code{TRUE} then the estimated dispersion and BCV will be printed to standard output.}
+\item{group}{vector or factor giving the experimental group/condition for each library.}
+\item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
+\item{\dots}{other arguments that are not currently used.}
 }
 
-\value{Returns \code{object} with the following added components:
+\value{
+\code{estimateCommonDisp.DGEList} adds the following components to the input \code{DGEList} object:
 	\item{common.dispersion}{estimate of the common dispersion.}
 	\item{pseudo.counts}{numeric matrix of pseudo-counts.}
-	\item{pseudo.lib.size}{the common library size to which the pseudo-counts have been adjusted}
+	\item{pseudo.lib.size}{the common library size to which the pseudo-counts have been adjusted.}
+	\item{AveLogCPM}{numeric vector giving log2(AveCPM) for each row of \code{y}.}
+\code{estimateCommonDisp.default} returns a numeric scalar of the common dispersion estimate.
 }
 
 \details{
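
A minimal sketch contrasting the two return types (simulated counts with true dispersion 1/5=0.2, illustrative only):

library(edgeR)
counts <- matrix(rnbinom(1000*4, mu=10, size=5), 1000, 4)
# default method on a matrix: returns a single numeric value
estimateCommonDisp(counts, group=c(1,1,2,2), verbose=TRUE)
# DGEList method: returns the DGEList with components added
d <- estimateCommonDisp(DGEList(counts=counts, group=c(1,1,2,2)))
d$common.dispersion    # should be close to 0.2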
diff --git a/man/estimateDisp.Rd b/man/estimateDisp.Rd
index b6f0ec7..eb35f09 100644
--- a/man/estimateDisp.Rd
+++ b/man/estimateDisp.Rd
@@ -1,5 +1,7 @@
 \name{estimateDisp}
 \alias{estimateDisp}
+\alias{estimateDisp.DGEList}
+\alias{estimateDisp.default}
 
 \title{Estimate Common, Trended and Tagwise Negative Binomial dispersions by weighted likelihood empirical Bayes}
 
@@ -8,45 +10,47 @@ Maximizes the negative binomial likelihood to give the estimate of the common, t
 }
 
 \usage{
-estimateDisp(y, design=NULL, prior.df=NULL, trend.method="locfit", tagwise=TRUE,
-             span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), 
-             robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06)
+\S3method{estimateDisp}{DGEList}(y, design=NULL, prior.df=NULL, trend.method="locfit", mixed.df=FALSE, 
+            tagwise=TRUE, span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), robust=FALSE, 
+            winsor.tail.p=c(0.05,0.1), tol=1e-06, ...)
+\S3method{estimateDisp}{default}(y, design=NULL, group=NULL, lib.size=NULL, offset=NULL, prior.df=NULL,
+            trend.method="locfit", mixed.df=FALSE, tagwise=TRUE, span=NULL, min.row.sum=5, grid.length=21, 
+            grid.range=c(-10,10), robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06, weights=NULL, ...)
 }
 
 \arguments{
-\item{y}{\code{DGEList} object}
-
+\item{y}{matrix of counts or a \code{DGEList} object.}
 \item{design}{numeric design matrix}
-
 \item{prior.df}{prior degrees of freedom. It is used in calculating \code{prior.n}.}
-
-\item{trend.method}{method for estimating dispersion trend. Possible values are \code{"none"}, \code{"movingave"}, \code{"loess"} and \code{"locfit"}.}
-
+\item{trend.method}{method for estimating dispersion trend. Possible values are \code{"none"}, \code{"movingave"}, \code{"loess"} and \code{"locfit"} (default).}
+\item{mixed.df}{logical, only used when \code{trend.method="locfit"}. If \code{FALSE}, \code{locfit} uses a polynomial of degree 0. If \code{TRUE}, \code{locfit} uses a polynomial of degree 1 for lowly expressed genes. Care is taken to smooth the curve.}
 \item{tagwise}{logical, should the tagwise dispersions be estimated?}
-
 \item{span}{width of the smoothing window, as a proportion of the data set.}
-
 \item{min.row.sum}{numeric scalar giving a value for the filtering out of low abundance tags. Only tags with total sum of counts above this value are used. Low abundance tags can adversely affect the dispersion estimation, so this argument allows the user to select an appropriate filter threshold for the tag abundance.}
-
 \item{grid.length}{the number of points on which the interpolation is applied for each tag.}
-
 \item{grid.range}{the range of the grid points around the trend on a log2 scale.}
-
 \item{robust}{logical, should the estimation of \code{prior.df} be robustified against outliers?}
-
 \item{winsor.tail.p}{numeric vector of length 1 or 2, giving left and right tail proportions of the deviances to Winsorize when estimating \code{prior.df}.}
-
 \item{tol}{the desired accuracy, passed to \code{\link{optimize}}}
+\item{group}{vector or factor giving the experimental group/condition for each library.}
+\item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
+\item{offset}{offset matrix for the log-linear model, as for \code{\link{glmFit}}.  Defaults to the log-effective library sizes.}
+\item{weights}{optional numeric matrix giving observation weights}
+\item{\dots}{other arguments that are not currently used.}
 }
 
-\value{Returns \code{object} with the following added components:
+\value{
+\code{estimateDisp.DGEList} adds the following components to the input \code{DGEList} object:
 	\item{common.dispersion}{estimate of the common dispersion.}
 	\item{trended.dispersion}{estimates of the trended dispersions.}
 	\item{tagwise.dispersion}{tagwise estimates of the dispersion parameter if \code{tagwise=TRUE}.}
-	\item{logCPM}{the average abundance of each tag, in log average counts per million.}
+	\item{AveLogCPM}{numeric vector giving log2(AveCPM) for each row of \code{y}.}
+	\item{trend.method}{method for estimating dispersion trend as given in the input.}
 	\item{prior.df}{prior degrees of freedom. It is a vector when robust method is used.}
 	\item{prior.n}{estimate of the prior weight, i.e. the smoothing parameter that indicates the weight to put on the common likelihood compared to the individual tag's likelihood.}
 	\item{span}{width of the smoothing window used in estimating dispersions.}
+
+\code{estimateDisp.default} returns a list containing \code{common.dispersion}, \code{trended.dispersion}, \code{tagwise.dispersion} (if \code{tagwise=TRUE}), \code{span}, \code{prior.df} and \code{prior.n}.
 }
 
 \details{
@@ -65,6 +69,11 @@ Differential expression analysis of complex RNA-seq experiments using edgeR.
 In: \emph{Statistical Analysis of Next Generation Sequence Data},
 Somnath Datta and Daniel S Nettleton (eds), Springer, New York.
 \url{http://www.statsci.org/smyth/pubs/edgeRChapterPreprint.pdf}
+
+Phipson, B, Lee, S, Majewski, IJ, Alexander, WS, and Smyth, GK (2016).
+Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression.
+\emph{Annals of Applied Statistics} 10. 
+\url{http://arxiv.org/abs/1602.08678}
 }
 
 \author{Yunshun Chen, Gordon Smyth}
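
A minimal sketch of the DGEList and default methods, including the robust option described in the Phipson et al (2016) reference (simulated counts, illustrative only):

library(edgeR)
y <- matrix(rnbinom(500*4, mu=20, size=5), 500, 4)
group <- factor(c(1,1,2,2))
design <- model.matrix(~group)
d <- estimateDisp(DGEList(counts=y, group=group), design, robust=TRUE)
summary(d$prior.df)    # a vector when robust=TRUE
# default method on a bare matrix returns a list instead
str(estimateDisp(y, design))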
diff --git a/man/estimateGLMTagwiseDisp.Rd b/man/estimateGLMTagwiseDisp.Rd
index d587cc4..149483b 100644
--- a/man/estimateGLMTagwiseDisp.Rd
+++ b/man/estimateGLMTagwiseDisp.Rd
@@ -23,7 +23,7 @@ Compute an empirical Bayes estimate of the negative binomial dispersion paramete
 \item{design}{numeric design matrix, as for \code{\link{glmFit}}.}
 \item{trend}{logical. Should the prior be the trended dispersion (\code{TRUE}) or the common dispersion (\code{FALSE})?}
 \item{offset}{offset matrix for the log-linear model, as for \code{\link{glmFit}}.  Defaults to the log-effective library sizes.}
-\item{dispersion}{common or trended dispersion estimates, used as an initial estimate for the tagwise estimates. By default uses values stored in the \code{DGEList} object.}
+\item{dispersion}{common or trended dispersion estimates, used as an initial estimate for the tagwise estimates.}
 \item{prior.df}{prior degrees of freedom.}
 \item{span}{width of the smoothing window, in terms of proportion of the data set. Default value decreases with the number of tags.}
 \item{AveLogCPM}{numeric vector giving average log2 counts per million for each tag}
diff --git a/man/estimateGLMTrendedDisp.Rd b/man/estimateGLMTrendedDisp.Rd
index 0eca628..4d0e650 100644
--- a/man/estimateGLMTrendedDisp.Rd
+++ b/man/estimateGLMTrendedDisp.Rd
@@ -29,7 +29,7 @@ Possible values are \code{"auto"} (default, switch to \code{"bin.spline"} method
 \value{
 When the input object is a \code{DGEList}, \code{estimateGLMTrendedDisp} produces a \code{DGEList} object, which contains the estimates of the trended dispersion parameter for the negative binomial model according to the method applied.
 
-When the input object is a numeric matrix, the output of one of the lower-level functions \code{dispBinTrend}, \code{dispCoxReidPowerTrend} of \code{dispCoxReidSplineTrend} is returned.
+When the input object is a numeric matrix, a vector of trended dispersion estimates is returned, calculated by one of the lower-level functions \code{dispBinTrend}, \code{dispCoxReidPowerTrend} or \code{dispCoxReidSplineTrend}.
 }
 
 \details{
diff --git a/man/estimateTagwiseDisp.Rd b/man/estimateTagwiseDisp.Rd
index fa168dd..c048b42 100644
--- a/man/estimateTagwiseDisp.Rd
+++ b/man/estimateTagwiseDisp.Rd
@@ -1,5 +1,7 @@
 \name{estimateTagwiseDisp}
 \alias{estimateTagwiseDisp}
+\alias{estimateTagwiseDisp.DGEList}
+\alias{estimateTagwiseDisp.default}
 
 \title{Estimate Empirical Bayes Tagwise Dispersion Values}
 
@@ -8,30 +10,29 @@ Estimates tagwise dispersion values by an empirical Bayes method based on weight
 }
 
 \usage{
-estimateTagwiseDisp(object, prior.df=10, trend="movingave", span=NULL, method="grid",
-            grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE)
+\method{estimateTagwiseDisp}{DGEList}(y, prior.df=10, trend="movingave", span=NULL, method="grid", 
+           grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)
+\method{estimateTagwiseDisp}{default}(y, group=NULL, lib.size=NULL, dispersion, AveLogCPM=NULL, 
+           prior.df=10, trend="movingave", span=NULL, method="grid", grid.length=11, 
+           grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)
 }
 
 \arguments{
-\item{object}{object of class \code{DGEList} containing (at least) the elements \code{counts} (table of raw counts), \code{group} (factor indicating group), \code{lib.size} (numeric vector of library sizes) and \code{pseudo.alt} (numeric matrix of quantile-adjusted pseudocounts calculated under the alternative hypothesis of a true difference between groups; recommended to use the \code{DGEList} object provided as the output of \code{estimateCommonDisp}}
-
+\item{y}{matrix of counts or a \code{DGEList} object.}
 \item{prior.df}{prior degrees of freedom.}
-
-\item{trend}{method for estimating dispersion trend. Possible values are \code{"none"}, \code{"movingave"} and \code{"loess"}.}
-
+\item{trend}{method for estimating dispersion trend. Possible values are \code{"movingave"} (default), \code{"loess"} and \code{"none"}.}
 \item{span}{width of the smoothing window, as a proportion of the data set.}
-
 \item{method}{method for maximizing the posterior likelihood.
-Possible values are \code{"grid"} for interpolation on grid points or \code{"optimize"} to call the function of the same name.
-}
-
+Possible values are \code{"grid"} (default) for interpolation on grid points or \code{"optimize"} to call the function of the same name.}
 \item{grid.length}{for \code{method="grid"}, the number of points on which the interpolation is applied for each tag.}
 \item{grid.range}{for \code{method="grid"}, the range of the grid points around the trend on a log2 scale.}
-
 \item{tol}{for \code{method="optimize"}, the tolerance for Newton-Raphson iterations.}
-
 \item{verbose}{logical, if \code{TRUE} then diagnostic output is produced during the estimation process.}
-
+\item{group}{vector or factor giving the experimental group/condition for each library.}
+\item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
+\item{dispersion}{common dispersion estimate, used as an initial estimate for the tagwise estimates.}
+\item{AveLogCPM}{numeric vector giving average log2 counts per million for each tag}
+\item{\dots}{other arguments that are not currently used.}
 }
 
 \details{
@@ -57,9 +58,12 @@ Note that the terms `tag' and `gene' are synonymous here. The function is only n
 }
 
 \value{
-An object of class \code{DGEList} with the same components as for \code{\link{estimateCommonDisp}} plus the following:
-	\item{prior.n}{estimate of the prior weight, i.e. the smoothing parameter that indicates the weight to put on the common likelihood compared to the individual tag's likelihood; prior.n of 10 means that the common likelihood is given 10 times the weight of the individual tag's likelihood in the estimation of the tagwise dispersion.}
-	\item{tagwise.dispersion}{tag-wise estimates of the dispersion parameter.}
+\code{estimateTagwiseDisp.DGEList} adds the following components to the input \code{DGEList} object:
+	\item{prior.df}{prior degrees of freedom.}
+	\item{prior.n}{estimate of the prior weight.}
+	\item{tagwise.dispersion}{numeric vector of the tagwise dispersion estimates.}
+	\item{span}{width of the smoothing window, in terms of proportion of the data set.}
+\code{estimateTagwiseDisp.default} returns a numeric vector of the tagwise dispersion estimates.
 }
 
 \references{
@@ -70,7 +74,11 @@ assessing differences in tag abundance. \emph{Bioinformatics} 23, 2881-2887.
 
 \author{Mark Robinson, Davis McCarthy, Yunshun Chen and Gordon Smyth}
 \examples{
-# See ?exactTest or ?estimateTrendedDisp for examples
+# True dispersion is 1/5=0.2
+y <- matrix(rnbinom(250*4,mu=20,size=5),nrow=250,ncol=4)
+dge <- DGEList(counts=y,group=c(1,1,2,2))
+dge <- estimateCommonDisp(dge)
+dge <- estimateTagwiseDisp(dge)
 }
 
 \seealso{
diff --git a/man/estimateTrendedDisp.Rd b/man/estimateTrendedDisp.Rd
index ffb8271..966967e 100644
--- a/man/estimateTrendedDisp.Rd
+++ b/man/estimateTrendedDisp.Rd
@@ -1,5 +1,7 @@
 \name{estimateTrendedDisp}
 \alias{estimateTrendedDisp}
+\alias{estimateTrendedDisp.DGEList}
+\alias{estimateTrendedDisp.default}
 
 \title{Estimate Empirical Bayes Trended Dispersion Values}
 
@@ -8,18 +10,20 @@ Estimates trended dispersion values by an empirical Bayes method.
 }
 
 \usage{
-estimateTrendedDisp(object, method="bin.spline", df=5, span=2/3)
+\S3method{estimateTrendedDisp}{DGEList}(y, method="bin.spline", df=5, span=2/3, ...)
+\S3method{estimateTrendedDisp}{default}(y, group=NULL, lib.size=NULL, AveLogCPM=NULL, 
+            method="bin.spline", df=5, span=2/3, ...)
 }
 
 \arguments{ 
-
-\item{object}{object of class \code{DGEList} containing (at least) the elements \code{counts} (table of raw counts), \code{group} (factor indicating group), \code{lib.size} (numeric vector of library sizes) and \code{pseudo.alt} (numeric matrix of quantile-adjusted pseudocounts calculated under the alternative hypothesis of a true difference between groups; recommended to use the \code{DGEList} object provided as the output of \code{estimateCommonDisp}}
-
-\item{method}{method used to estimated the trended dispersions. Possible values are \code{"spline"}, and \code{"loess"}.}
-
-\item{df}{integer giving the degrees of freedom of the spline function if \code{"spline"} method is used, see \code{ns} in the splines package. Default is 5.}
-
+\item{y}{matrix of counts or a \code{DGEList} object.}
+\item{method}{method used to estimate the trended dispersions. Possible values are \code{"bin.spline"} and \code{"bin.loess"}.}
+\item{df}{integer giving the degrees of freedom of the spline function if \code{"bin.spline"} method is used, see \code{ns} in the splines package. Default is 5.}
 \item{span}{scalar, passed to \code{loess} to determine the amount of smoothing for the loess fit when \code{"loess"} method is used. Default is \code{2/3}.}
+\item{group}{vector or factor giving the experimental group/condition for each library.}
+\item{lib.size}{numeric vector giving the total count (sequence depth) for each library.}
+\item{AveLogCPM}{numeric vector giving average log2 counts per million for each tag}
+\item{\dots}{other arguments that are not currently used.}
 }
 
 \details{
@@ -43,8 +47,6 @@ counts <- matrix(counts,ngenes,nlib)
 y <- DGEList(counts,lib.size=rep(lib.size,nlib))
 y <- estimateCommonDisp(y)
 y <- estimateTrendedDisp(y)
-y <- estimateTagwiseDisp(y)
-plotBCV(y)
 }
 
 \seealso{
diff --git a/man/gini.Rd b/man/gini.Rd
new file mode 100644
index 0000000..f2d7506
--- /dev/null
+++ b/man/gini.Rd
@@ -0,0 +1,31 @@
+\name{gini}
+\alias{gini}
+\title{Gini dispersion index}
+\description{
+Gini index for each column of a matrix.
+}
+\usage{
+gini(x)
+}
+\arguments{
+  \item{x}{non-negative numeric matrix}
+}
+\details{
+The Gini coefficient or index is a measure of inequality or diversity.
+It is zero if all the values of \code{x} are equal.
+It reaches a maximum value of \code{1-1/nrow(x)} when all values are zero except for one.
+
+The Gini index is only interpretable for non-negative quantities.
+It is not meaningful if \code{x} contains negative values.
+}
+\value{
+Numeric vector of length \code{ncol(x)}.
+}
+\references{
+\url{https://en.wikipedia.org/wiki/Gini_coefficient}.
+}
+\examples{
+x <- matrix(rpois(20,lambda=5),10,2)
+gini(x)
+}
+\author{Gordon Smyth}
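
For reference, the Gini index of a single column can be computed from its sorted values; a minimal sketch of that textbook formula (an assumption for illustration, not necessarily the package's internal code):

# Gini index of one non-negative vector via the sorted-values formula
giniVector <- function(x) {
	x <- sort(x)
	n <- length(x)
	2 * sum((1:n) * x) / (n * sum(x)) - (n + 1) / n
}
x <- c(0, 0, 0, 0, 10)
giniVector(x)              # 0.8 = 1 - 1/5, the maximum for n=5
gini(matrix(x, ncol=1))    # should agree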
diff --git a/man/glmQLFTest.Rd b/man/glmQLFTest.Rd
index 5d0f8af..ccc45ff 100644
--- a/man/glmQLFTest.Rd
+++ b/man/glmQLFTest.Rd
@@ -10,12 +10,12 @@
 Conduct genewise statistical tests for a given coefficient or contrast.}
 
 \usage{
-\method{glmQLFit}{DGEList}(y, design=NULL, dispersion=NULL, offset=NULL, abundance.trend=TRUE, 
+\method{glmQLFit}{DGEList}(y, design=NULL, dispersion=NULL, offset=NULL, abundance.trend=TRUE,
         robust=FALSE, winsor.tail.p=c(0.05, 0.1), \dots)
-\method{glmQLFit}{default}(y, design=NULL, dispersion=NULL, offset=NULL, lib.size=NULL, 
-        abundance.trend=TRUE, AveLogCPM=NULL, robust=FALSE, 
+\method{glmQLFit}{default}(y, design=NULL, dispersion=NULL, offset=NULL, lib.size=NULL,
+        abundance.trend=TRUE, AveLogCPM=NULL, robust=FALSE,
         winsor.tail.p=c(0.05, 0.1), \dots)
-glmQLFTest(glmfit, coef=ncol(glmfit$design), contrast=NULL) 
+glmQLFTest(glmfit, coef=ncol(glmfit$design), contrast=NULL, poisson.bound=TRUE)
 }
 
 \arguments{
@@ -44,6 +44,8 @@ glmQLFTest(glmfit, coef=ncol(glmfit$design), contrast=NULL)
 \item{coef}{integer or character index vector indicating which coefficients of the linear model are to be tested equal to zero. Ignored if \code{contrast} is not \code{NULL}.}
 
 \item{contrast}{numeric vector or matrix specifying one or more contrasts of the linear model coefficients to be tested equal to zero.}
+
+\item{poisson.bound}{logical, if \code{TRUE} then the p-value returned will never be less than would be obtained for a likelihood ratio test with NB dispersion equal to zero.}
 }
 
 \details{
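
A minimal sketch of the new poisson.bound argument (simulated counts, illustrative only):

library(edgeR)
y <- matrix(rnbinom(1000*4, mu=10, size=5), 1000, 4)
group <- factor(c(1,1,2,2))
design <- model.matrix(~group)
d <- estimateDisp(DGEList(counts=y, group=group), design)
fit <- glmQLFit(d, design, robust=TRUE)
qlf <- glmQLFTest(fit, coef=2)    # poisson.bound=TRUE is the default
# disabling the bound can only make p-values smaller
qlf2 <- glmQLFTest(fit, coef=2, poisson.bound=FALSE)
topTags(qlf)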
diff --git a/man/gof.Rd b/man/gof.Rd
index 4acc05c..66b0705 100644
--- a/man/gof.Rd
+++ b/man/gof.Rd
@@ -6,15 +6,13 @@
 \description{Conducts deviance goodness of fit tests for each fit in a \code{DGEGLM} object}
 
 \usage{
-gof(glmfit, pcutoff=0.1, adjust="holm", plot=FALSE,
-    main="qq-plot of genewise goodness of fit", \dots)
+gof(glmfit, pcutoff = 0.1, adjust = "holm", plot = FALSE,
+    main = "qq-plot of residual deviances", \dots)
 }
-\arguments{ 
-
-\item{glmfit}{\code{DGEGLM} object containing results from fitting NB GLMs to genes in a DGE dataset. Output from \code{glmFit}.}
 
+\arguments{ 
+\item{glmfit}{a \code{DGEGLM} object containing results from fitting NB GLMs to genes in a DGE dataset with a global dispersion model. Usually this is output from \code{glmFit}.}
 \item{pcutoff}{scalar giving the cut-off value for the Holm-adjusted p-value. Genes with Holm-adjusted p-values lower than this cutoff value are flagged as `dispersion outlier' genes.}
-
 \item{adjust}{method used to adjust goodness of fit p-values for multiple testing.}
 \item{plot}{logical, if \code{TRUE} a qq-plot is produced.}
 \item{main}{character, title for the plot.}
@@ -22,15 +20,23 @@ gof(glmfit, pcutoff=0.1, adjust="holm", plot=FALSE,
 }
 
 \details{
-If \code{plot=TRUE}, produces a plot similar to Figure 2 of McCarthy et al (2012).
+This function is useful for evaluating the adequacy of a global dispersion model, such as a constant or trended dispersion.
+If \code{plot=TRUE}, then it produces a qq-plot similar to those in Figure 2 of McCarthy et al (2012).
+}
+
+\note{
+This function should not be used with tagwise estimated dispersions such as those from \code{\link{estimateGLMTagwiseDisp}} or \code{\link{estimateDisp}}.
+\code{glmfit} should contain trended or constant dispersions.
 }
 
 \value{
-This function returns a list with the following components:
+A list with the following components:
 \item{gof.statistics}{numeric vector of deviance statistics, which are the statistics used for the goodness of fit test}
 \item{gof.pvalues}{numeric vector of p-values providing evidence of poor fit; computed from the chi-square distribution on the residual degrees of freedom from the GLM fits.}
 \item{outlier}{logical vector indicating whether or not each gene is a `dispersion outlier' (i.e., the model fit is poor for that gene indicating that the dispersion estimate is not good for that gene).} 
 \item{df}{scalar, the residual degrees of freedom from the GLM fit for which the goodness of fit statistics have been computed. Also the degrees of freedom for the goodness of fit statistics for the LR (chi-square) test for significance.}
+
+If \code{plot=TRUE}, then a plot is also produced on the current graphics device.
 }
 
 \author{Davis McCarthy and Gordon Smyth}
@@ -73,5 +79,3 @@ gof(fit)
 
 \code{\link{glmFit}} for more information on fitting NB GLMs to DGE data.
 }
-
-\keyword{algebra}
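
A minimal sketch following the note above: fit a global (common) dispersion model before calling gof() (simulated counts, illustrative only):

library(edgeR)
y <- matrix(rnbinom(1000*4, mu=10, size=5), 1000, 4)
design <- model.matrix(~factor(c(1,1,2,2)))
d <- estimateGLMCommonDisp(DGEList(counts=y), design)
fit <- glmFit(d, design)
out <- gof(fit, plot=TRUE)    # qq-plot of residual deviances
sum(out$outlier)              # number of flagged dispersion outliers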
diff --git a/man/movingAverageByCol.Rd b/man/movingAverageByCol.Rd
index da95114..ecf5763 100644
--- a/man/movingAverageByCol.Rd
+++ b/man/movingAverageByCol.Rd
@@ -1,6 +1,5 @@
 \name{movingAverageByCol}
 \alias{movingAverageByCol}
-\alias{movingAverageByCol}
 \title{Moving Average Smoother of Matrix Columns}
 \description{
 Apply a moving average smoother to the columns of a matrix.
diff --git a/man/processAmplicons.Rd b/man/processAmplicons.Rd
index 3235273..7bf7d17 100644
--- a/man/processAmplicons.Rd
+++ b/man/processAmplicons.Rd
@@ -5,6 +5,7 @@
 
 \description{
 Given a list of sample-specific index (barcode) sequences and hairpin/sgRNA-specific sequences from an amplicon sequencing screen, generate a DGEList of counts from the raw fastq file/(s) containing the sequence reads. 
+Assumes a fixed structure for the amplicon sequences (i.e. both the sample-specific index sequences and hairpin/sgRNA sequences can be found at particular locations within each read).
 }
 
 \usage{
@@ -47,7 +48,8 @@ processAmplicons(readfile, readfile2=NULL, barcodefile, hairpinfile,
 	\item{lib.size}{auto-calculated column sum of the counts matrix}
 }
 
-\details{
+\details{The \code{processAmplicons} function assumes the sequences in your fastq files have a fixed structure (as per Figure 1A of http://f1000research.com/articles/3-95/v2). It cannot be used if your hairpins/sgRNAs/sample index sequences are in random locations within each read. You will need to customise your own sequence processing pipeline if this is the case, but can still complete your downstream analysis using edgeR.
+
 The input barcode file and hairpin/sgRNA files are tab-separated text files with at least two columns (named 'ID' and 'Sequences') containing the sample or hairpin/sgRNA ids and a second column indicating the sample index or hairpin/sgRNA sequences to be matched. If \code{barcode2Start} and \code{barcode2End} are specified, a third column 'Sequences2' is expected in the barcode file. If \code{readfile2}, \code{barcodeStartRev} and \code{barcodeEndRev} are specified, another column 'Seque [...]
 
 To compute the count matrix, matching to the given barcodes and hairpins/sgRNAs is conducted in two rounds. The first round looks for an exact sequence match for the given barcode sequences and hairpin/sgRNA sequences at the locations specified. If \code{allowShifting} is set to \code{TRUE}, the program also checks if a given hairpin/sgRNA sequence can be found at a neighbouring position in the read. If a match isn't found, the program performs a second round of matching which allows for [...]
@@ -55,6 +57,8 @@ To compute the count matrix, matching to the given barcodes and hairpins/sgRNAs
 The program outputs a \code{\link[edgeR:DGEList-class]{DGEList}} object, with a count matrix indicating the number of times each barcode and hairpin/sgRNA combination could be matched in reads from input fastq file/(s).
 
 For further examples and data, refer to the Case studies available from http://bioinf.wehi.edu.au/shRNAseq/.
+
+Note: The \code{processAmplicons} function supersedes the earlier \code{processHairpinReads} function.
 }
 
 \author{Zhiyin Dai and Matthew Ritchie}
diff --git a/man/roast.DGEList.Rd b/man/roast.DGEList.Rd
index 3c64814..1aaf419 100644
--- a/man/roast.DGEList.Rd
+++ b/man/roast.DGEList.Rd
@@ -1,6 +1,7 @@
 \name{roast.DGEList}
 \alias{roast.DGEList}
 \alias{mroast.DGEList}
+\alias{fry.DGEList}
 \title{Rotation Gene Set Tests for Digital Gene Expression Data}
 \description{
 Rotation gene set testing for Negative Binomial generalized linear models.
@@ -9,6 +10,7 @@ Rotation gene set testing for Negative Binomial generalized linear models.
 \usage{
 \method{roast}{DGEList}(y, index=NULL, design=NULL, contrast=ncol(design), \dots)
 \method{mroast}{DGEList}(y, index=NULL, design=NULL, contrast=ncol(design), \dots)
+\method{fry}{DGEList}(y, index=NULL, design=NULL, contrast=ncol(design), \dots)
 }
 
 \arguments{
@@ -22,7 +24,7 @@ Rotation gene set testing for Negative Binomial generalized linear models.
 \value{
 \code{roast} produces an object of class \code{\link[limma:roast]{Roast}}. See \code{\link{roast}} for details.
 
-\code{mroast} produces a data.frame. See \code{\link{mroast}} for details.
+\code{mroast} and \code{fry} produce a data.frame. See \code{\link{mroast}} for details.
 }
 
 \details{
@@ -43,7 +45,7 @@ The design matrix defaults to the \code{model.matrix(~y$samples$group)}.
 \author{Yunshun Chen and Gordon Smyth}
 
 \references{
-Dunn, K. P., and Smyth, G. K. (1996).
+Dunn, PK, and Smyth, GK (1996).
 Randomized quantile residuals.
 \emph{J. Comput. Graph. Statist.}, 5, 236-244. 
 \url{http://www.statsci.org/smyth/pubs/residual.html}
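
A minimal sketch of the new fry method (simulated counts and hypothetical gene-set indices, illustrative only):

library(edgeR)
y <- matrix(rnbinom(1000*4, mu=10, size=5), 1000, 4)
rownames(y) <- paste0("Gene", 1:1000)
group <- factor(c(1,1,2,2))
design <- model.matrix(~group)
d <- estimateDisp(DGEList(counts=y, group=group), design)
index <- list(SetA=1:20, SetB=21:40)    # hypothetical gene sets
fry(d, index=index, design=design)      # returns a data.frame like mroast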
diff --git a/man/romer.DGEList.Rd b/man/romer.DGEList.Rd
index c22af37..e0423da 100644
--- a/man/romer.DGEList.Rd
+++ b/man/romer.DGEList.Rd
@@ -45,7 +45,7 @@ Opposing roles of polycomb repressive complexes in hematopoietic stem and progen
 \emph{Blood}, published online 5 May 2010.
 \url{http://www.ncbi.nlm.nih.gov/pubmed/20445021}
 
-Dunn, K. P., and Smyth, G. K. (1996).
+Dunn, PK, and Smyth, GK (1996).
 Randomized quantile residuals.
 \emph{J. Comput. Graph. Statist.}, 5, 236-244. 
 \url{http://www.statsci.org/smyth/pubs/residual.html}
diff --git a/man/splitIntoGroups.Rd b/man/splitIntoGroups.Rd
index c4d0ab0..8f9f726 100644
--- a/man/splitIntoGroups.Rd
+++ b/man/splitIntoGroups.Rd
@@ -1,25 +1,25 @@
 \name{splitIntoGroups}
 \alias{splitIntoGroups}
+\alias{splitIntoGroups.DGEList}
+\alias{splitIntoGroups.default}
 \alias{splitIntoGroupsPseudo}
 
-
 \title{Split the Counts or Pseudocounts from a DGEList Object According To Group}
 
 \description{Split the counts from a DGEList object according to group, creating a list where each element consists of a numeric matrix of counts for a particular experimental group. Given a pair of groups, split pseudocounts for these groups, creating a list where each element is a matrix of pseudocounts for a particular group.}
 
 \usage{
-splitIntoGroups(object)
+\S3method{splitIntoGroups}{DGEList}(y, ...)
+\S3method{splitIntoGroups}{default}(y, group=NULL, ...)
 splitIntoGroupsPseudo(pseudo, group, pair)
 }
 
-\arguments{ 
-\item{object}{\code{DGEList}, object containing (at least) the elements \code{counts} (table of raw counts), \code{group} (factor indicating group) and \code{lib.size} (numeric vector of library sizes)}
-
+\arguments{
+\item{y}{matrix of counts or a \code{DGEList} object.}
+\item{group}{vector or factor giving the experimental group/condition for each library.}
 \item{pseudo}{numeric matrix of quantile-adjusted pseudocounts to be split}
-
-\item{group}{factor indicating group to which libraries/samples (i.e. columns of \code{pseudo} belong; must be same length as ncol(pseudo)}
-
 \item{pair}{vector of length two stating pair of groups to be split for the pseudocounts}
+\item{\dots}{other arguments that are not currently used.}
 }
 
 \value{\code{splitIntoGroups} outputs a list in which each element is a matrix of counts for an individual group. \code{splitIntoGroupsPseudo} outputs a list with two elements, in which each element is a numeric matrix of (pseudo-)count data for one of the groups specified.}
@@ -27,12 +27,11 @@ splitIntoGroupsPseudo(pseudo, group, pair)
 \author{Davis McCarthy}
 \examples{
 # generate raw counts from NB, create list object
-y<-matrix(rnbinom(80,size=1,mu=10),nrow=20)
-d<-DGEList(counts=y,group=rep(1:2,each=2),lib.size=rep(c(1000:1001),2))
-rownames(d$counts)<-paste("gene",1:nrow(d$counts),sep=".")
-z1<-splitIntoGroups(d)
-
-z2<-splitIntoGroupsPseudo(d$counts,d$group,pair=c(1,2))
+y <- matrix(rnbinom(80, size=1, mu=10), nrow=20)
+d <- DGEList(counts=y, group=rep(1:2, each=2), lib.size=rep(c(1000:1001), 2))
+rownames(d$counts) <- paste("gene", 1:nrow(d$counts), sep=".")
+z1 <- splitIntoGroups(d)
+z2 <- splitIntoGroupsPseudo(d$counts, d$group, pair=c(1,2))
 }
 
 \keyword{algebra}
diff --git a/man/zscoreNBinom.Rd b/man/zscoreNBinom.Rd
index 1bacd9b..0672d0f 100644
--- a/man/zscoreNBinom.Rd
+++ b/man/zscoreNBinom.Rd
@@ -32,6 +32,6 @@ Care is taken to do the computations accurately in both tails of the distributio
 \code{\link{pnbinom}}, \code{\link{qnorm}} in the stats package.
 }
 \examples{
-zscoreNBinom(c(0,10,100), mu=10, size=1/10)
+zscoreNBinom(c(0,10,100), mu=10, size=10)
 }
 \keyword{distribution}
diff --git a/tests/edgeR-Tests.R b/tests/edgeR-Tests.R
index bb189f4..08eec7e 100644
--- a/tests/edgeR-Tests.R
+++ b/tests/edgeR-Tests.R
@@ -101,7 +101,7 @@ d <- calcNormFactors(d)
 fit <- glmFit(d, design, dispersion=dispersion.true, prior.count=0.5/3)
 results <- glmLRT(fit, coef=2)
 topTags(results)
-d <- estimateGLMCommonDisp(d, design, verbose=TRUE)
+d1 <- estimateGLMCommonDisp(d, design, verbose=TRUE)
 glmFit(d,design,dispersion=dispersion.true, prior.count=0.5/3)
 
 d2 <- estimateDisp(d, design)
@@ -135,11 +135,6 @@ fit <- glmFit(y,design,dispersion=2/3,prior.count=0.5/7)
 lrt <- glmLRT(fit,contrast=cbind(c(-1,1,0),c(0,-1,1),c(-1,0,1)))
 topTags(lrt)
 
-# spliceVariants
-z = matrix(c(2,0,4,6,4,3,7,1,1,0,1,1,0,3,1,2,0,1,2,1,0,3,1,0), 8, 3)
-gz = c(1,2,2,2,2,2,2,2)
-spliceVariants(DGEList(counts = z, group = c(1,2,2)), gz)
-
 # simple Good-Turing algorithm runs.
 test1<-1:9
 freq1<-c(2018046, 449721, 188933, 105668, 68379, 48190, 35709, 37710, 22280)
diff --git a/tests/edgeR-Tests.Rout.save b/tests/edgeR-Tests.Rout.save
index f91f8dd..9c951fb 100644
--- a/tests/edgeR-Tests.Rout.save
+++ b/tests/edgeR-Tests.Rout.save
@@ -1,7 +1,7 @@
 
-R version 3.1.1 (2014-07-10) -- "Sock it to Me"
-Copyright (C) 2014 The R Foundation for Statistical Computing
-Platform: i386-w64-mingw32/i386 (32-bit)
+R version 3.2.4 Revised (2016-03-16 r70336) -- "Very Secure Dishes"
+Copyright (C) 2016 The R Foundation for Statistical Computing
+Platform: x86_64-w64-mingw32/x64 (64-bit)
 
 R is free software and comes with ABSOLUTELY NO WARRANTY.
 You are welcome to redistribute it under certain conditions.
@@ -105,7 +105,7 @@ Tag.16  0.9324887 13.57074 0.24308201 0.9711975
 Tag.20  0.8543044 13.76364 0.35534649 0.9711975
 Tag.19 -0.7976535 13.31405 0.38873717 0.9711975
 Tag.3  -0.7300525 13.54155 0.40001438 0.9711975
-Tag.12  0.7080985 14.31389 0.43530228 0.9711975
+Tag.12  0.7080985 14.31389 0.43530227 0.9711975
 Tag.8  -0.7918376 12.86353 0.49782701 0.9711975
 > 
 > summary(exactTest(d2,rejection="smallp")$table$PValue)
@@ -264,8 +264,8 @@ Tag.8  -0.7917906 12.86353 0.7830377 0.37621371 0.8276702
 > topTags(lrt)
 Coefficient:  group2 
             logFC   logCPM        LR      PValue       FDR
-Tag.17  2.0450988 13.73726 6.8001118 0.009115216 0.2005348
-Tag.2   4.0861092 11.54121 4.8594088 0.027495756 0.2872068
+Tag.17  2.0450988 13.73727 6.8001118 0.009115216 0.2005348
+Tag.2   4.0861092 11.54122 4.8594088 0.027495756 0.2872068
 Tag.21 -1.7265904 13.38327 4.2537154 0.039164558 0.2872068
 Tag.6  -1.6329904 12.81479 3.1763761 0.074710253 0.4109064
 Tag.16  0.9324970 13.57074 1.4126709 0.234613512 0.8499599
@@ -283,7 +283,6 @@ Tag.8  -0.7918166 12.86353 0.7356185 0.391068049 0.8603497
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.1522  0.1676  0.1740  0.1887  0.2000  0.3469 
 > dglm <- estimateGLMTrendedDisp(dglm,design,method="spline")
-Loading required package: splines
 > summary(dglm$trended.dispersion)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.09353 0.11080 0.15460 0.19010 0.23050 0.52010 
@@ -302,33 +301,197 @@ Loading required package: splines
  0.1766  0.1789  0.1814  0.1846  0.1870  0.2119 
 > dglm2 <- estimateDisp(dglm, design, prior.df=20)
 > summary(dglm2$tagwise.dispersion)
-   Min. 1st Qu.  Median    Mean 3rd Qu.    .6684268  0.7357761 17.33529  1.7309951 0.420842115 0.90967967
-2  1.2080938 -0.1660740 18.24544  1.0496688 0.591653347 0.90967967
-4  0.5416704 -0.6923084 17.57744  0.3958596 0.820427427 0.90967967
-3  0.2876249 -0.4884392 18.06216  0.1893255 0.909679672 0.90967967
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+ 0.1527  0.1669  0.1814  0.1858  0.1951  0.2497 
+> dglm2 <- estimateDisp(dglm, design, robust=TRUE)
+> summary(dglm2$tagwise.dispersion)
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+ 0.1766  0.1789  0.1814  0.1846  0.1870  0.2119 
 > 
-> # spliceVariants
-> z = matrix(c(2,0,4,6,4,3,7,1,1,0,1,1,0,3,1,2,0,1,2,1,0,3,1,0), 8, 3)
-> gz = c(1,2,2,2,2,2,2,2)
-> spliceVariants(DGEList(counts = z, group = c(1,2,2)), gz)
-An object of class "DGEExact"
-$table
-  logFC   logCPM       LR     PValue
-1    NA 19.19460  0.00000 1.00000000
-2    NA 19.34082 11.47712 0.07470331
-
-$comparison
-NULL
-
-$genes
-  GeneID
-1      1
-2      2
+> # Continuous trend
+> nlibs <- 3
+> ntags <- 1000
+> dispersion.true <- 0.1
+> # Make first transcript respond to covariate x
+> x <- 0:2
+> design <- model.matrix(~x)
+> beta.true <- cbind(Beta1=2,Beta2=c(2,rep(0,ntags-1)))
+> mu.true <- 2^(beta.true %*% t(design))
+> # Generate count data
+> y <- rnbinom(ntags*nlibs,mu=mu.true,size=1/dispersion.true)
+> y <- matrix(y,ntags,nlibs)
+> colnames(y) <- c("x0","x1","x2")
+> rownames(y) <- paste("Gene",1:ntags,sep="")
+> d <- DGEList(y)
+> d <- calcNormFactors(d)
+> fit <- glmFit(d, design, dispersion=dispersion.true, prior.count=0.5/3)
+> results <- glmLRT(fit, coef=2)
+> topTags(results)
+Coefficient:  x 
+            logFC   logCPM        LR       PValue          FDR
+Gene1    2.907024 13.56183 38.738512 4.845536e-10 4.845536e-07
+Gene61   2.855317 10.27136 10.738307 1.049403e-03 5.247015e-01
+Gene62  -2.123902 10.53174  8.818703 2.981585e-03 8.334760e-01
+Gene134 -1.949073 10.53355  8.125889 4.363759e-03 8.334760e-01
+Gene740 -1.610046 10.94907  8.013408 4.643227e-03 8.334760e-01
+Gene354  2.022698 10.45066  7.826308 5.149118e-03 8.334760e-01
+Gene5    1.856816 10.45249  7.214238 7.232750e-03 8.334760e-01
+Gene746 -1.798331 10.53094  6.846262 8.882693e-03 8.334760e-01
+Gene110  1.623148 10.68607  6.737984 9.438120e-03 8.334760e-01
+Gene383  1.637140 10.75412  6.687530 9.708965e-03 8.334760e-01
+> d1 <- estimateGLMCommonDisp(d, design, verbose=TRUE)
+Disp = 0.10253 , BCV = 0.3202 
+> glmFit(d,design,dispersion=dispersion.true, prior.count=0.5/3)
+An object of class "DGEGLM"
+$coefficients
+      (Intercept)          x
+Gene1   -7.391745  2.0149958
+Gene2   -7.318483 -0.7611895
+Gene3   -6.831702 -0.1399478
+Gene4   -7.480255  0.5172002
+Gene5   -8.747793  1.2870467
+995 more rows ...
+
+$fitted.values
+             x0        x1          x2
+Gene1 2.3570471 18.954454 138.2791328
+Gene2 2.5138172  1.089292   0.4282107
+Gene3 4.1580452  3.750528   3.0690081
+Gene4 2.1012460  3.769592   6.1349937
+Gene5 0.5080377  2.136398   8.1502486
+995 more rows ...
+
+$deviance
+[1] 6.38037545 1.46644913 1.38532340 0.01593969 1.03894513
+995 more elements ...
+
+$iter
+[1] 8 4 4 4 6
+995 more elements ...
+
+$failed
+[1] FALSE FALSE FALSE FALSE FALSE
+995 more elements ...
+
+$method
+[1] "levenberg"
+
+$counts
+      x0 x1  x2
+Gene1  0 30 110
+Gene2  2  2   0
+Gene3  3  6   2
+Gene4  2  4   6
+Gene5  1  1   9
+995 more rows ...
+
+$unshrunk.coefficients
+      (Intercept)          x
+Gene1   -7.437763  2.0412762
+Gene2   -7.373370 -0.8796273
+Gene3   -6.870127 -0.1465014
+Gene4   -7.552642  0.5410832
+Gene5   -8.972372  1.3929679
+995 more rows ...
+
+$df.residual
+[1] 1 1 1 1 1
+995 more elements ...
+
+$design
+  (Intercept) x
+1           1 0
+2           1 1
+3           1 2
+attr(,"assign")
+[1] 0 1
+
+$offset
+         [,1]     [,2]     [,3]
+[1,] 8.295172 8.338525 8.284484
+[2,] 8.295172 8.338525 8.284484
+[3,] 8.295172 8.338525 8.284484
+[4,] 8.295172 8.338525 8.284484
+[5,] 8.295172 8.338525 8.284484
+995 more rows ...
 
 $dispersion
-           1            2 
-6.434416e-03 1.570902e-06 
+[1] 0.1
+
+$prior.count
+[1] 0.1666667
 
+$samples
+   group lib.size norm.factors
+x0     1     4001    1.0008730
+x1     1     4176    1.0014172
+x2     1     3971    0.9977138
+
+$AveLogCPM
+[1] 13.561832  9.682757 10.447014 10.532113 10.452489
+995 more elements ...
+
+> 
+> d2 <- estimateDisp(d, design)
+> summary(d2$tagwise.dispersion)
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+0.05545 0.09511 0.11620 0.11010 0.13330 0.16860 
+> d2 <- estimateDisp(d, design, prior.df=20)
+> summary(d2$tagwise.dispersion)
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+0.04203 0.08586 0.11280 0.11010 0.12370 0.37410 
+> d2 <- estimateDisp(d, design, robust=TRUE)
+> summary(d2$tagwise.dispersion)
+   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+0.05545 0.09511 0.11620 0.11010 0.13330 0.16860 
+> 
+> # Exact tests
+> y <- matrix(rnbinom(20,mu=10,size=3/2),5,4)
+> group <- factor(c(1,1,2,2))
+> ys <- splitIntoGroupsPseudo(y,group,pair=c(1,2))
+> exactTestDoubleTail(ys$y1,ys$y2,dispersion=2/3)
+[1] 0.1334396 0.6343568 0.7280015 0.7124912 0.3919258
+> 
+> y <- matrix(rnbinom(5*7,mu=10,size=3/2),5,7)
+> group <- factor(c(1,1,2,2,3,3,3))
+> ys <- splitIntoGroupsPseudo(y,group,pair=c(1,3))
+> exactTestDoubleTail(ys$y1,ys$y2,dispersion=2/3)
+[1] 1.0000000 0.4486382 1.0000000 0.9390317 0.4591241
+> exactTestBetaApprox(ys$y1,ys$y2,dispersion=2/3)
+[1] 1.0000000 0.4492969 1.0000000 0.9421695 0.4589194
+> 
+> y[1,3:4] <- 0
+> design <- model.matrix(~group)
+> fit <- glmFit(y,design,dispersion=2/3,prior.count=0.5/7)
+> summary(fit$coef)
+  (Intercept)         group2            group3        
+ Min.   :-1.817   Min.   :-5.0171   Min.   :-0.64646  
+ 1st Qu.:-1.812   1st Qu.:-1.1565   1st Qu.:-0.13919  
+ Median :-1.712   Median : 0.1994   Median :-0.10441  
+ Mean   :-1.625   Mean   :-0.9523   Mean   :-0.04217  
+ 3rd Qu.:-1.429   3rd Qu.: 0.3755   3rd Qu.:-0.04305  
+ Max.   :-1.356   Max.   : 0.8374   Max.   : 0.72227  
+> 
+> lrt <- glmLRT(fit,contrast=cbind(c(0,1,0),c(0,0,1)))
+> topTags(lrt)
+Coefficient:  LR test of 2 contrasts 
+     logFC.1    logFC.2   logCPM         LR      PValue        FDR
+1 -7.2381060 -0.0621100 17.19071 10.7712171 0.004582051 0.02291026
+5 -1.6684268 -0.9326507 17.33529  1.7309951 0.420842115 0.90967967
+2  1.2080938  1.0420198 18.24544  1.0496688 0.591653347 0.90967967
+4  0.5416704 -0.1506381 17.57744  0.3958596 0.820427427 0.90967967
+3  0.2876249 -0.2008143 18.06216  0.1893255 0.909679672 0.90967967
+> design <- model.matrix(~0+group)
+> fit <- glmFit(y,design,dispersion=2/3,prior.count=0.5/7)
+> lrt <- glmLRT(fit,contrast=cbind(c(-1,1,0),c(0,-1,1),c(-1,0,1)))
+> topTags(lrt)
+Coefficient:  LR test of 2 contrasts 
+     logFC.1    logFC.2   logCPM         LR      PValue        FDR
+1 -7.2381060  7.1759960 17.19071 10.7712171 0.004582051 0.02291026
+5 -1.6684268  0.7357761 17.33529  1.7309951 0.420842115 0.90967967
+2  1.2080938 -0.1660740 18.24544  1.0496688 0.591653347 0.90967967
+4  0.5416704 -0.6923084 17.57744  0.3958596 0.820427427 0.90967967
+3  0.2876249 -0.4884392 18.06216  0.1893255 0.909679672 0.90967967
 > 
 > # simple Good-Turing algorithm runs.
 > test1<-1:9
@@ -373,53 +536,4 @@ $n0
 > 
 > proc.time()
    user  system elapsed 
-   3.58    0.04    5.32 
\ No newline at end of file
+   3.58    0.03    3.61 
diff --git a/vignettes/edgeR.Rnw b/vignettes/edgeR.Rnw
new file mode 100755
index 0000000..af5781e
--- /dev/null
+++ b/vignettes/edgeR.Rnw
@@ -0,0 +1,48 @@
+%\VignetteIndexEntry{edgeR Vignette}
+%\VignetteKeyword{RNA-Seq}
+%\VignetteKeyword{differential expression}
+%\VignettePackage{edgeR}
+\documentclass[12pt]{article}
+
+\textwidth=6.2in
+\textheight=8.5in
+\oddsidemargin=0.2in
+\evensidemargin=0.2in
+\headheight=0in
+\headsep=0in
+
+\begin{document}
+
+\title{edgeR Package Introduction}
+\author{Yunshun Chen, Davis McCarthy, Aaron Lun,\\
+Xiaobei Zhou, Mark Robinson, Gordon K.\ Smyth}
+\date{10 October 2012\\
+Revised 8 October 2014}
+\maketitle
+
+
+edgeR is a package for the differential expression analysis of digital gene expression data,
+that is, of count data arising from DNA sequencing technologies.
+It is especially designed for differential expression analyses of RNA-Seq or SAGE data,
+or differential marking analyses of ChIP-Seq data.
+
+edgeR implements novel statistical methods based on the negative binomial distribution
+as a model for count variability, including empirical Bayes methods, exact tests, and generalized linear models.
+The package is especially suitable for analysing designed experiments with multiple
+experimental factors but possibly small numbers of replicates.
+It has unique abilities to model transcript specific variation even in small samples,
+a capability essential for prioritizing genes or transcripts that have consistent effects across replicates.
+
+The full edgeR User's Guide is available as part of the online documentation.
+To reach the User's Guide, install the edgeR package and load it into an R session by \texttt{library(edgeR)}.
+In R for Windows, the User's Guide will then be available from the drop-down menu called ``Vignettes''.
+In other operating systems, type
+\begin{Schunk}
+\begin{Sinput}
+> library(edgeR)
+> edgeRUsersGuide()
+\end{Sinput}
+\end{Schunk}
+at the R prompt to open the User's Guide in a pdf viewer.
+
+\end{document}

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-bioc-edger.git


