[med-svn] r17503 - in trunk/packages/R/r-bioc-genomicfeatures/trunk/debian: . patches

Andreas Tille tille at moszumanska.debian.org
Thu Jul 24 06:55:38 UTC 2014


Author: tille
Date: 2014-07-24 06:55:38 +0000 (Thu, 24 Jul 2014)
New Revision: 17503

Added:
   trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/
   trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/drop_tests_requiring_large_data_sets.patch
   trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/series
Modified:
   trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/README.test
   trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/changelog
Log:
Make test independent from TxDb.Hsapiens.UCSC.hg19.knownGene


Modified: trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/README.test
===================================================================
--- trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/README.test	2014-07-24 06:29:03 UTC (rev 17502)
+++ trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/README.test	2014-07-24 06:55:38 UTC (rev 17503)
@@ -10,10 +10,12 @@
 
 in order to confirm its integrity.
 
-As it is reported in bug #735548 to successfully run this test you some
-BioCOnductor databases as preconditions.  If you want to install these
-as Debian packages you can use:
+As it is reported in bug #735548 to successfully run this test you need some
+BioConductor databases as preconditions.  Since these are not packaged for
+Debian the according tests are removed from the test suite of this package.
 
+If you want to install them as Debian packages you can use:
+
    svn://anonscm.debian.org/debian-med/trunk/packages/R/r-bioc-txdb.hsapiens.ucsc.hg19.knowngene/trunk/
 
 A further database
@@ -21,3 +23,9 @@
    http://bioconductor.org/packages/release/data/annotation/html/BSgenome.Hsapiens.UCSC.hg19.html
 
 will be needed as well but it is not yet packaged.
+
+Finally you would need to re-activate the according tests by moving the
+original files from inst/unitTests to the installation directory or you
+rebuild this package by deactivating the patch in the series file.
+
+ -- Andreas Tille <tille@debian.org>  Thu, 24 Jul 2014 08:35:54 +0200

Modified: trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/changelog
===================================================================
--- trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/changelog	2014-07-24 06:29:03 UTC (rev 17502)
+++ trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/changelog	2014-07-24 06:55:38 UTC (rev 17503)
@@ -1,4 +1,4 @@
-r-bioc-genomicfeatures (1.16.2-1) UNRELEASED; urgency=medium
+r-bioc-genomicfeatures (1.16.2-1) unstable; urgency=medium
 
   [ Martin Pitt ]
   * debian/tests/control: Add missing r-cran-runit test dependency, and allow
@@ -8,11 +8,10 @@
   * New upstream version
   * (Build-)Depends: r-bioc-genomeinfodb
   * Add citation
-  TODO: Make test independent from TxDb.Hsapiens.UCSC.hg19.knownGene
-        see README.test
-   this would close #735548
+  * Make test independent from TxDb.Hsapiens.UCSC.hg19.knownGene
+    Closes: #735548
 
- -- Andreas Tille <tille@debian.org>  Tue, 10 Jun 2014 14:07:11 +0200
+ -- Andreas Tille <tille@debian.org>  Thu, 24 Jul 2014 08:35:54 +0200
 
 r-bioc-genomicfeatures (1.14.2-1) unstable; urgency=low
 

Added: trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/drop_tests_requiring_large_data_sets.patch
===================================================================
--- trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/drop_tests_requiring_large_data_sets.patch	                        (rev 0)
+++ trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/drop_tests_requiring_large_data_sets.patch	2014-07-24 06:55:38 UTC (rev 17503)
@@ -0,0 +1,542 @@
+Author: Andreas Tille <tille@debian.org>
+Last-Update: Thu, 24 Jul 2014 08:35:54 +0200
+Bug-Debian: http://bugs.debian.org/735548
+Description: Make test independent from TxDb.Hsapiens.UCSC.hg19.knownGene
+ (see debian/README.test)
+
+--- a/inst/unitTests/test_getPromoterSeq-methods.R
++++ /dev/null
+@@ -1,116 +0,0 @@
+-library(BSgenome.Hsapiens.UCSC.hg19)
+-library(TxDb.Hsapiens.UCSC.hg19.knownGene)
+-library(BSgenome.Dmelanogaster.UCSC.dm3)
+-library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
+-library(Rsamtools)
+-library(pasillaBamSubset)
+-
+-e2f3 <- "1871"   # human gene on the plus strand, chr6
+-grb2 <- "2885"   # human gene on the minus strand, chr17
+-
+-# a note on method: when the promoter sequence is 20 bases or more in length,
+-# uscs blat will find these sequences, and a quick visual inspection of the
+-# accompanying genome browser view at the right level of zoom, will
+-# confirm that the per-transcript sequences is indeed correct.
+-# there are a few tests of shorter sequences below as well, which
+-# I checked in the genome browser, but this required a little more effort
+-# than the length 20, blat approach.
+-
+-testGRangesListBSgenomeHumanGetPromoterSeq <- function() {
+-    genes <- c(e2f3, grb2)
+-    TxDb.Hsapiens.UCSC.hg19.knownGene <- restoreSeqlevels(TxDb.Hsapiens.UCSC.hg19.knownGene)  ## safety net
+-    transcriptCoordsByGene.GRangesList <-
+-      transcriptsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="gene") [genes]
+-    transcript.count <- length(unlist(transcriptCoordsByGene.GRangesList))
+-
+-    checkEquals(names(transcriptCoordsByGene.GRangesList), genes)
+-    promoter.seqs <- getPromoterSeq(transcriptCoordsByGene.GRangesList,
+-                                    Hsapiens, upstream=10, downstream=0)
+-    checkTrue(is(promoter.seqs, "DNAStringSetList"))
+-    checkEquals(length(promoter.seqs), 2)
+-    checkEquals(names(promoter.seqs), genes)
+-    checkEquals(width(unlist(promoter.seqs)), rep(10, transcript.count))
+-}
+-
+-testGRangesListBSgenomeFlyGetPromoterSeq <- function() {
+-     # two neighboring genes near beginning of chr3R, on opposite strands
+-     #  gene_id  flybase_id  symbol
+-     #    40524 FBgn0037215 CG12582
+-     #    40526 FBgn0037217 CG14636
+-     # in 2012, UCSC reported 4 total transcripts for these two genes
+-     # in 2013, 6.  there should be as many promoter.seqs as there
+-     # are transcripts, and they should each be of width
+-     # upstream + downstream.  it is risky to check for specific
+-     # sequence in the promoter.seqs since the annotation and sequence
+-     # may change
+-
+-    genes <- c("FBgn0037215", "FBgn0037217")
+-    transcriptCoordsByGene.GRangesList <-
+-       transcriptsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, by="gene") [genes]
+-  
+-    transcript.count <- length(unlist(transcriptCoordsByGene.GRangesList))
+-
+-    promoter.seqs <- getPromoterSeq(transcriptCoordsByGene.GRangesList,
+-                                    Dmelanogaster, upstream=10, downstream=10)
+-    checkTrue(is(promoter.seqs, "DNAStringSetList"))
+-    checkEquals(length(promoter.seqs), 2)
+-    checkEquals(names(promoter.seqs), genes)
+-  
+-    checkEquals(width(unlist(promoter.seqs)), rep(20, transcript.count))
+-}
+-
+-testGRangesListFastaFlyGetPromoterSeq <- function() {
+-      # two neighboring genes near beginning of chr3R, on opposite strands
+-      #  gene_id  flybase_id  symbol   chr
+-      #    43766 FBgn0025740  plexB    4
+-      #    43769 FBgn0085432  pan      4
+-    genes <- c("FBgn0025740", "FBgn0085432")
+-    transcriptCoordsByGene.GRangesList <-
+-       transcriptsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, by="gene") [genes]
+-
+-    transcript.count <- length(unlist(transcriptCoordsByGene.GRangesList))
+-
+-    fasta.file <- dm3_chr4 ()
+-    sequence.from.fasta <- open(FaFile(fasta.file))
+-    promoter.seqs <- getPromoterSeq(transcriptCoordsByGene.GRangesList,
+-                                    sequence.from.fasta, upstream=10,
+-                                    downstream=10)
+-    checkTrue(is(promoter.seqs, "DNAStringSetList"))
+-    checkEquals(length(promoter.seqs), 2)
+-    checkEquals(names(promoter.seqs), genes)
+-  
+-    checkEquals(width(unlist(promoter.seqs)), rep(20, transcript.count))
+-       # we are unable to check for specific DNA sequence, since
+-       # the UCSC annotation of these genes changes over time.
+-}
+-
+-testGRangesBSgenomeHumanGetPromoterSeq <- function() {
+-    TxDb.Hsapiens.UCSC.hg19.knownGene <- restoreSeqlevels(TxDb.Hsapiens.UCSC.hg19.knownGene)  ## safety net
+-    transcriptCoordsByGene.GRanges <-
+-      transcriptsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="gene") [[e2f3]]
+-    checkTrue(is(transcriptCoordsByGene.GRanges, "GRanges"))
+-       # would have names only if its a list:
+-    transcript.count <- length(transcriptCoordsByGene.GRanges)
+-    
+-    checkTrue(is.null(names(transcriptCoordsByGene.GRanges)))
+-    checkEquals(dim(mcols(transcriptCoordsByGene.GRanges)),
+-                c(transcript.count, 2))
+-    checkEquals(colnames(mcols(transcriptCoordsByGene.GRanges)),
+-                c("tx_id", "tx_name"))
+-    promoter.seqs <-
+-      getPromoterSeq(transcriptCoordsByGene.GRanges, Hsapiens,
+-                     upstream=10, downstream=0)
+-    checkTrue(is(promoter.seqs, "DNAStringSet"))
+-    checkEquals(length(promoter.seqs), transcript.count)
+-    checkTrue(is.null(names(promoter.seqs)))
+-    checkEquals(width(promoter.seqs), rep(10, transcript.count))
+-      # should be one more column in the metadata than in the metadata 
+-    checkEquals(dim(mcols(promoter.seqs)), c(transcript.count, 3))
+-    checkEquals(colnames(mcols(promoter.seqs)), c("tx_id", "tx_name", "geneID"))
+-       # the input, a GRanges, had no names -- which are the source
+-       # of geneID when the GRangesList version of this methods is called.
+-       # so ensure that this lack of information was passed along into the
+-       # metadata of the returned promoter.seqs
+-    checkTrue(all(is.na(mcols(promoter.seqs)$geneID)))
+-}
+-
+--- a/inst/unitTests/test_select-methods.R
++++ b/inst/unitTests/test_select-methods.R
+@@ -3,65 +3,14 @@
+ ## Why test the lower level helpers?  Because that way I will get a failure
+ ## point right at the location where the trouble occurs (high resolution for
+ ## trouble detection)._
+-require("TxDb.Hsapiens.UCSC.hg19.knownGene")
+-txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
+ require("RUnit")
+   
+-test_getTableColMapping <- function(){
+-  res <- GenomicFeatures:::.getTableColMapping(txdb)
+-  exp <- list(cds=c("_cds_id","cds_name","cds_chrom","cds_strand","cds_start",
+-                "cds_end"),
+-              exon=c("_exon_id","exon_name","exon_chrom","exon_strand",
+-                "exon_start","exon_end"),
+-              gene=c("gene_id","_tx_id"),
+-              splicing=c("_tx_id","exon_rank","_exon_id","_cds_id"),
+-              transcript=c("_tx_id","tx_name","tx_chrom","tx_strand",
+-                "tx_start","tx_end"))
+-  checkIdentical(res, exp)
+-}
+-
+-test_makeColAbbreviations <- function(){
+-  res <- GenomicFeatures:::.makeColAbbreviations(txdb)
+-  checkTrue(res[["_cds_id"]]=="CDSID")
+-  res2 <- GenomicFeatures:::.getTableColMapping(txdb)
+-  checkTrue(length(res)==20, length(unique(unlist(res2))))
+-}
+-
+-test_reverseColAbbreviations <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)
+-  res <- GenomicFeatures:::.reverseColAbbreviations(txdb, cnames)
+-  checkTrue(names(cnames)[[1]]==res[[1]])
+-  checkTrue(length(res) == length(cnames))
+-}
+-
+-test_getTableNames <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)
+-  res <- GenomicFeatures:::.getTableNames(txdb, cnames)
+-  ## Let's check the ones that fail more easily
+-  checkTrue(length(res[["_tx_id"]])==3)
+-  checkTrue(length(res[["_exon_id"]])==2)
+-  checkTrue(length(res[["_cds_id"]])==2)
+-}
+-
+-test_getSimpleTableNames <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)
+-  res <- GenomicFeatures:::.getSimpleTableNames(txdb, cnames)
+-  exp <- c("cds","splicing","exon","gene","transcript")
+-  checkIdentical(res, exp)
+-}
+-
+ test_encodeSortedTableKey <- function(){
+   sTNames <- c("s", "e", "t")
+   res <- GenomicFeatures:::.encodeSortedTableKey(sTNames)
+   checkIdentical(res,"tse")
+ }
+ 
+-test_makeTableKey <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)
+-  res <- GenomicFeatures:::.makeTableKey(txdb, cnames)
+-  checkIdentical(res,"gtsec")
+-}
+-
+ test_missingTableInterpolator <- function(){
+   tName <- "x"
+   res <- GenomicFeatures:::.missingTableInterpolator(tName)
+@@ -82,231 +31,4 @@ test_tableJoinSelector <- function(){
+   checkException(GenomicFeatures:::.tableJoinSelector(tName))
+ }
+ 
+-test_makeSelectList <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)[c("_cds_id","_tx_id")]
+-  res <- GenomicFeatures:::.makeSelectList(txdb, cnames)
+-  exp <- "c._cds_id, g._tx_id" ## 2nd one will be a "g."
+-  checkIdentical(res, exp)
+-}
+-
+-test_makeAsList <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)[c("_cds_id","_tx_id")]
+-  res <- GenomicFeatures:::.makeAsList(txdb, cnames)
+-  exp <- "cds AS c, splicing AS s, gene AS g, transcript AS t"
+-  checkIdentical(res, exp)
+-}
+-
+-test_makeJoinSQL <- function(){
+-  cnames <- GenomicFeatures:::.makeColAbbreviations(txdb)[c("_cds_id","_tx_id")]
+-  res <- GenomicFeatures:::.makeJoinSQL(txdb, cnames)
+-  exp <- "(SELECT * FROM transcript LEFT JOIN gene  ON (transcript._tx_id = gene._tx_id) INNER JOIN splicing  ON (transcript._tx_id = splicing._tx_id)  LEFT JOIN cds ON (splicing._cds_id = cds._cds_id) )"
+-  checkIdentical(res, exp)
+-}
+-
+-test_makeKeyList <- function(){
+-  ks <- 1:6
+-  kt <- "TXID"
+-  res <- GenomicFeatures:::.makeKeyList(txdb, keys=ks, keytype=kt)
+-  exp <- "g._tx_id IN ( '1','2','3','4','5','6' )"
+-  checkIdentical(res, exp)
+-}
+-
+-
+-test_keys <- function(){
+-  checkException(keys(txdb, keytype="CDSCHROM"))
+-}
+-
+-test_keys_advancedArgs <- function(){
+-    k1 <- keys(txdb, keytype="TXNAME")
+-    checkTrue("uc001aaa.3" %in% k1)
+-    
+-    k2 <- keys(txdb, keytype="TXNAME", pattern=".2$")
+-    checkTrue("uc001aaq.2" %in% k2)
+-    checkTrue(!("uc001aaa.3" %in% k2))
+-    checkTrue(length(k2) < length(k1))
+-
+-    l1 <- length(keys(txdb, keytype="TXID", column="GENEID"))
+-    l2 <- length(keys(txdb, keytype="TXID"))
+-    checkTrue(l1 < l2)
+-    
+-    k3 <- head(keys(txdb, keytype="GENEID", pattern=".2$",
+-                    column="TXNAME", fuzzy=TRUE))
+-    res <- suppressWarnings( select(txdb, k3, columns=c("GENEID","TXNAME"),
+-                                   keytype="GENEID"))
+-    checkTrue(any(grepl(".2$",res$TXNAME)))
+-}
+-
+-
+-
+-test_select <- function(){
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID"), colnames(res))
+-  checkTrue(length(res$GENEID)==length(keys))
+-  checkIdentical(res$GENEID, keys)
+-
+-  keys = head(keys(txdb, "TXID"))
+-  cols = c("TXID")
+-  res <- select(txdb, keys, cols, keytype="TXID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXID"), colnames(res))
+-  checkTrue(length(res$TXID)==length(keys))
+-  checkIdentical(res$TXID, keys)
+- 
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","TXID")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","TXID"), colnames(res))
+-  checkTrue(length(unique(res$GENEID))==length(keys))
+-
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","TXID", "EXONRANK")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","TXID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$GENEID))==length(keys))
+-
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","TXID", "EXONRANK","CDSID")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","CDSID","TXID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$GENEID))==length(keys))
+-
+-  ## It's really cosmetic but: should the order of the final data.frame match
+-  ## the order of the cols?
+-  ## I think so, except that we may add a col for keys (even if not requested)
+-  ## if added, such a col should be in front.
+-  
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","TXID", "EXONRANK", "EXONID")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","EXONID","TXID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$GENEID))==length(keys))
+-
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","TXID", "EXONRANK", "EXONID", "CDSID")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","CDSID","EXONID","TXID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$GENEID))==length(keys))
+-  
+-  keys = head(keys(txdb, "TXID"))
+-  cols = c("TXID", "EXONRANK", "EXONID", "CDSID")
+-  res <- select(txdb, keys, cols, keytype="TXID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXID","CDSID","EXONID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$TXID))==length(keys))
+-
+-  keys = head(keys(txdb, "EXONID"))
+-  cols = c("EXONRANK", "EXONID", "CDSID")
+-  res <- select(txdb, keys, cols, keytype="EXONID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("EXONID","CDSID","EXONRANK"), colnames(res))
+-  checkTrue(length(unique(res$EXONID))==length(keys))
+-  
+-  keys = head(keys(txdb, "TXNAME"))
+-  cols = c("GENEID","TXNAME", "CDSID", "EXONSTART")
+-  res <- select(txdb, keys, cols, keytype="TXNAME")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXNAME","CDSID","EXONSTART","GENEID"), colnames(res))  
+-  checkTrue(length(unique(res$TXNAME))==length(keys))
+-
+-  
+-  keys = head(keys(txdb, "TXNAME"))
+-  cols = c("GENEID", "EXONSTART","TXNAME")
+-  res <- select(txdb, keys, cols, keytype="TXNAME")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXNAME","EXONSTART","GENEID"), colnames(res))
+-  checkTrue(length(unique(res$TXNAME))==length(keys))
+-    
+-  
+-  keys = head(keys(txdb, "TXNAME"))
+-  cols = c("GENEID", "CDSID","TXNAME")
+-  res <- select(txdb, keys, cols, keytype="TXNAME")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXNAME","CDSID","GENEID"), colnames(res))
+-  checkTrue(length(unique(res$TXNAME))==length(keys))
+-    
+-  
+-  keys = head(keys(txdb, "TXID"))
+-  cols = c("GENEID","TXNAME", "TXID")
+-  res <- select(txdb, keys, cols, keytype="TXID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXID","GENEID","TXNAME"), colnames(res))
+-  checkTrue(length(unique(res$TXID))==length(keys))
+-  ## For this particular case, we want to make sure that the TXNAMES are not
+-  ## being copied (there should be one unique one for each ID in this range)
+-  checkTrue(length(unique(res$TXNAME)) == length(res$TXNAME))
+-  
+-  keys = head(keys(txdb, "CDSNAME"))
+-  cols = c("GENEID","TXNAME", "TXID", "CDSNAME")
+-  checkException(select(txdb, keys, cols, keytype="CDSNAME"), silent=TRUE)
+-  
+-  keys = head(keys(txdb, "CDSID"))
+-  cols = c("GENEID","TXNAME", "TXID", "CDSNAME")
+-  res <- select(txdb, keys, cols, keytype="CDSID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols)+1) ## this is one where we ADD an extra!
+-  checkIdentical(c("CDSID","CDSNAME","GENEID","TXID","TXNAME"), colnames(res))
+-  checkTrue(length(unique(res$CDSID))==length(keys))
+-
+-  
+-  ## stress test (this used to take way too long)
+-  keys = head(keys(txdb, "GENEID"))
+-  cols = c("GENEID","CDSSTART")
+-  res <- select(txdb, keys, cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","CDSSTART"), colnames(res))
+-    
+-}
+-
+-test_select_isActiveSeq <- function(){
+-  
+-  ## set isActiveSeq to only watch chr1
+- txdb <- restoreSeqlevels(txdb)  ## This is to reset things (safety measure)
+- isActiveSeq(txdb)[seqlevels(txdb)] <- FALSE
+- isActiveSeq(txdb) <- c("chr1"=TRUE)  
+-  
+-  ## then use select
+-  keys <- head(keys(txdb, "GENEID"))
+-  cols <- c("GENEID","CDSSTART", "CDSCHROM")
+-  res <- select(txdb, keys, columns = cols, keytype="GENEID")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("GENEID","CDSCHROM","CDSSTART"), colnames(res))
+-  uniqChrs <- unique(res$CDSCHROM)[!is.na(unique(res$CDSCHROM))]
+-  checkIdentical(c("chr1"),uniqChrs)
+-
+-  ## keys must contain keys that match to more than one thing
+-  keys <- c(head(keys(txdb,keytype="TXNAME")),
+-            tail(keys(txdb,keytype="TXNAME")))
+-  cols <- c("TXNAME","TXCHROM","TXSTRAND")
+-  res <- select(txdb, keys, columns = cols, keytype="TXNAME")
+-  checkTrue(dim(res)[1]>0)
+-  checkTrue(dim(res)[2]==length(cols))
+-  checkIdentical(c("TXNAME","TXCHROM","TXSTRAND"), colnames(res))
+-  uniqChrs <- unique(res$TXCHROM)[!is.na(unique(res$TXCHROM))]
+-  checkIdentical(c("chr1"),uniqChrs)  
+-}
+-
+-
+ 
+--- a/inst/unitTests/test_TranscriptDb_seqinfo.R
++++ /dev/null
+@@ -1,114 +0,0 @@
+-library(TxDb.Hsapiens.UCSC.hg19.knownGene);
+-txdb=TxDb.Hsapiens.UCSC.hg19.knownGene
+-
+-test_rename_seqlevels <- function(){
+-    seqlevels(txdb) <- as.character(1:length(seqlevels(txdb)))
+-    checkIdentical(as.character(1:length(seqlevels(txdb))),
+-                   seqlevels(txdb))
+-}
+-
+-test_restrict_seqlevels <- function(){
+-    ## This should work
+-    txdb <- restoreSeqlevels(txdb)    
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    checkTrue(length(seqinfo(txdb))==1)
+-    
+-    ## This should work
+-    txdb <- restoreSeqlevels(txdb)
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5", chr6="6", chr4="4")
+-    checkTrue(length(seqinfo(txdb))==3)
+-    checkIdentical(c('5','6','4'), seqlevels(txdb))
+-    checkTrue(seqlengths(txdb)[2] == min(seqlengths(txdb)))
+-    checkTrue(seqlengths(txdb)[3] == max(seqlengths(txdb)))
+-    
+-    ## And this should NOT work
+-    txdb <- restoreSeqlevels(txdb)
+-    checkException(seqlevels(txdb, force=TRUE) <- c(foo = "2"))
+-}
+-
+-
+-test_noChange_circ <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    foo = seqinfo(txdb)
+-    foo@is_circular = rep(TRUE, 93)
+-    ## This should throw an exception
+-    checkException(seqinfo(txdb, new2old=1:93) <- foo)    
+-}
+-
+-
+-test_noChange_genome <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    foo = seqinfo(txdb)
+-    foo@genome = rep("hg18", 93)
+-    ## This should throw an exception
+-    checkException(seqinfo(txdb, new2old=1:93) <- foo)
+-}
+-
+-
+-test_noChange_lengths <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    foo = seqinfo(txdb)
+-    foo@seqlengths = rep(1000L, 93)
+-    ## This should throw an exception
+-    checkException(seqinfo(txdb, new2old=1:93) <- foo)
+-}
+-
+-
+-test_transcripts_accessor <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    txs1 <- transcripts(txdb)
+-    seqlevels(txs1, force=TRUE) <- c(chr5 = "5")
+-    ## Then change seqlevels for txdb
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    txs2 <- transcripts(txdb)
+-    checkIdentical(txs1, txs2)
+-}
+-
+-test_exons_accessor <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    exs1 <- exons(txdb)
+-    seqlevels(exs1, force=TRUE) <- c(chr5 = "5")
+-    ## Then change seqlevels for txdb
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    exs2 <- exons(txdb)
+-    checkIdentical(exs1, exs2)
+-}
+-
+-test_cds_accessor <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    cds1 <- cds(txdb)
+-    seqlevels(cds1, force=TRUE) <- c(chr5 = "5")
+-    ## Then change seqlevels for txdb
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    cds2 <- cds(txdb)
+-    checkIdentical(cds1, cds2)
+-}
+-
+-test_promoters_accessor <- function(){
+-    txdb <- restoreSeqlevels(txdb)
+-    prm1 <- promoters(txdb)
+-    seqlevels(prm1, force=TRUE) <- c(chr5 = "5")
+-    ## Then change seqlevels for txdb
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    prm2 <- promoters(txdb)
+-    checkIdentical(prm1, prm2)
+-}
+-
+-
+-test_transcriptsBy_accessors <- function(){
+-    ## This one is a "fun" one.
+-    ## There are issues because some genes are annotated as being on
+-    ## TWO different chromosomes.  Such genes are filtered for txs3,
+-    ## but NOT for txs4...   Hmmmm.
+-    txdb <- restoreSeqlevels(txdb)
+-    txs3 <- transcriptsBy(txdb, by="gene")
+-    seqlevels(txs3, force=TRUE) <- c(chr5 = "5")
+-    ## Then change seqlevels for txdb
+-    seqlevels(txdb, force=TRUE) <- c(chr5 = "5")
+-    txs4 <- transcriptsBy(txdb, by="gene")
+-##    checkIdentical(txs3, txs4)  ## TROUBLE!!
+-    
+-}
+-
+-
+-## What to do about this?  The reason for the difference is because of order of operations.  txs3 gets all the ranges and then removes any that are not kosher (this is correct), txs4 OTOH gets only ranges from chr5 (efficient!), but then fails to filter out things that have hybrid seqnames (as they were pre-filtered).  I think I have to make the query less efficient to fix this, but I want to discuss it with Herve 1st to get a 2nd opinion.

Added: trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/series
===================================================================
--- trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/series	                        (rev 0)
+++ trunk/packages/R/r-bioc-genomicfeatures/trunk/debian/patches/series	2014-07-24 06:55:38 UTC (rev 17503)
@@ -0,0 +1 @@
+drop_tests_requiring_large_data_sets.patch
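
Note on re-activating the dropped tests: README.test mentions rebuilding the package with the patch deactivated in the series file. A minimal sketch of that step follows, assuming the usual quilt convention that a series line starting with "#" is treated as a comment. The helper name deactivate_patch is hypothetical and is not part of the package or of quilt; it only transforms the text of the series file and leaves reading and writing debian/patches/series to the caller.

```python
def deactivate_patch(series_text: str, patch_name: str) -> str:
    """Return the series file text with the entry for patch_name commented out.

    Hypothetical helper for illustration only. It relies on the assumption
    that quilt ignores series lines beginning with '#'.
    """
    out = []
    for line in series_text.splitlines():
        # A series entry is the patch file name, optionally followed by
        # options such as -p1, so compare only the first field.
        fields = line.split()
        if fields and fields[0] == patch_name:
            line = "# " + line
        out.append(line)
    # Re-terminate every line with a newline, as in the original file.
    return "".join(entry + "\n" for entry in out)


if __name__ == "__main__":
    series = "drop_tests_requiring_large_data_sets.patch\n"
    print(deactivate_patch(series, "drop_tests_requiring_large_data_sets.patch"), end="")
```

After rebuilding with the entry commented out, the removed unit tests would again require the TxDb.Hsapiens.UCSC.hg19.knownGene and BSgenome.Hsapiens.UCSC.hg19 databases described in README.test.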



