[med-svn] [r-bioc-ensembldb] 01/06: New upstream version 2.2.0

Andreas Tille tille at debian.org
Thu Nov 9 10:04:17 UTC 2017


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository r-bioc-ensembldb.

commit 616aa1532a157e64aa272a53eba93c10689f2b1e
Author: Andreas Tille <tille at debian.org>
Date:   Thu Nov 9 10:13:16 2017 +0100

    New upstream version 2.2.0
---
 DESCRIPTION                                  |  17 +-
 NAMESPACE                                    |  20 +-
 R/Classes.R                                  |  35 +++-
 R/Generics.R                                 |   6 +
 R/Methods-Filter.R                           |  86 ++++++++
 R/Methods.R                                  | 106 +++++++++-
 R/dbhelpers.R                                |  17 +-
 R/functions-EnsDb.R                          |  78 +++++++
 R/functions-Filter.R                         |  67 ++++--
 R/functions-create-EnsDb.R                   | 108 +++++++---
 build/vignette.rds                           | Bin 367 -> 376 bytes
 inst/NEWS                                    |  42 +++-
 inst/doc/MySQL-backend.R                     |   8 +-
 inst/doc/MySQL-backend.Rmd                   |  14 +-
 inst/doc/MySQL-backend.html                  |  13 +-
 inst/doc/ensembldb.R                         | 107 +++++++---
 inst/doc/ensembldb.Rmd                       | 242 ++++++++++++++-------
 inst/doc/ensembldb.html                      | 300 +++++++++++++++++++--------
 inst/doc/proteins.R                          |  30 ++-
 inst/doc/proteins.Rmd                        |  64 +++---
 inst/doc/proteins.html                       |  21 +-
 inst/perl/get_gene_transcript_exon_tables.pl |  47 ++++-
 inst/scripts/checkEnsDbs.R                   |   2 +
 man/EnsDb-class.Rd                           |  10 +-
 man/EnsDb-exonsBy.Rd                         |  17 +-
 man/EnsDb.Rd                                 |  17 +-
 man/Filter-classes.Rd                        |  34 ++-
 man/ProteinFunctionality.Rd                  |  46 ++--
 man/convertFilter.Rd                         |  64 ++++++
 man/global-filters.Rd                        |  94 +++++++++
 man/hasProteinData-EnsDb-method.Rd           |   4 +-
 man/listEnsDbs.Rd                            |  16 +-
 man/useMySQL-EnsDb-method.Rd                 |  16 +-
 tests/testthat/test_Classes.R                |   6 +
 tests/testthat/test_Methods-Filter.R         |  40 ++++
 tests/testthat/test_Methods.R                |  39 +++-
 tests/testthat/test_functions-EnsDb.R        |  59 ++++++
 tests/testthat/test_functions-Filter.R       |  52 +++++
 tests/testthat/test_functions-utils.R        |  22 ++
 vignettes/MySQL-backend.Rmd                  |  14 +-
 vignettes/MySQL-backend.org                  |   5 +-
 vignettes/ensembldb.Rmd                      | 242 ++++++++++++++-------
 vignettes/ensembldb.org                      |  73 ++++++-
 vignettes/images/dblayout.png                | Bin 204300 -> 389708 bytes
 vignettes/proteins.Rmd                       |  64 +++---
 vignettes/proteins.org                       |   2 +-
 46 files changed, 1855 insertions(+), 511 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 3d9115c..e42b11f 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: ensembldb
 Type: Package
 Title: Utilities to create and use Ensembl-based annotation databases
-Version: 2.0.4
+Version: 2.2.0
 Authors@R: c(person(given = "Johannes", family = "Rainer",
 	   email = "johannes.rainer at eurac.edu",
 	   role = c("aut", "cre")),
@@ -17,13 +17,13 @@ URL: https://github.com/jotsetung/ensembldb
 BugReports: https://github.com/jotsetung/ensembldb/issues
 Imports: methods, RSQLite (>= 1.1), DBI, Biobase, GenomeInfoDb,
         AnnotationDbi (>= 1.31.19), rtracklayer, S4Vectors,
-        AnnotationHub, Rsamtools, IRanges, ProtGenerics, Biostrings,
-        curl
+        AnnotationHub, Rsamtools, IRanges (>= 2.11.16), ProtGenerics,
+        Biostrings, curl
 Depends: BiocGenerics (>= 0.15.10), GenomicRanges (>= 1.23.21),
-        GenomicFeatures (>= 1.23.18), AnnotationFilter (>= 0.99.7)
+        GenomicFeatures (>= 1.29.10), AnnotationFilter (>= 1.1.9)
 Suggests: BiocStyle, knitr, rmarkdown, EnsDb.Hsapiens.v75 (>= 0.99.8),
         shiny, testthat, BSgenome.Hsapiens.UCSC.hg19, ggbio (>=
-        1.24.0), Gviz (>= 1.20.0)
+        1.24.0), Gviz (>= 1.20.0), magrittr
 Enhances: RMySQL
 VignetteBuilder: knitr
 Description: The package provides functions to create and use
@@ -37,10 +37,11 @@ Description: The package provides functions to create and use
     specific entries like genes encoded on a chromosome region or
     transcript models of lincRNA genes.
 Collate: Classes.R Generics.R functions-utils.R dbhelpers.R Methods.R
-        functions-Filter.R Methods-Filter.R functions-create-EnsDb.R
-        select-methods.R seqname-utils.R Deprecated.R zzz.R
+        functions-EnsDb.R functions-Filter.R Methods-Filter.R
+        functions-create-EnsDb.R select-methods.R seqname-utils.R
+        Deprecated.R zzz.R
 biocViews: Genetics, AnnotationData, Sequencing, Coverage
 License: LGPL
 RoxygenNote: 6.0.1
 NeedsCompilation: no
-Packaged: 2017-08-04 23:59:15 UTC; biocbuild
+Packaged: 2017-10-31 00:19:36 UTC; biocbuild
diff --git a/NAMESPACE b/NAMESPACE
index 011bef1..19df0b4 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -107,7 +107,8 @@ importMethodsFrom("AnnotationFilter",
                   "field",
                   "value",
                   "condition",
-                  "supportedFilters")
+                  "supportedFilters",
+                  "convertFilter")
 importFrom("AnnotationFilter",
            "AnnotationFilter",
            "ExonIdFilter",
@@ -144,7 +145,8 @@ export("ensDbFromAH",
        "listEnsDbs",
        "makeEnsemblSQLiteFromTables",
        "makeEnsembldbPackage",
-       "runEnsDbApp"
+       "runEnsDbApp",
+       "filter"
        )
 ## Classes
 exportClasses(
@@ -152,13 +154,15 @@ exportClasses(
               "ProtDomIdFilter",
               "UniprotDbFilter",
               "UniprotMappingTypeFilter",
-              "OnlyCodingTxFilter"
+              "OnlyCodingTxFilter",
+              "TxSupportLevelFilter"
               )
 ## Methods for EnsFilter
 exportMethods(
     "seqnames",
     "seqlevels",
-    "show"
+    "show",
+    "convertFilter"
 )
 ## Methods for class EnsDb:
 exportMethods("cdsBy",
@@ -190,7 +194,10 @@ exportMethods("cdsBy",
               "transcriptsByOverlaps",
               "updateEnsDb",
               "useMySQL",
-              "supportedFilters"
+              "supportedFilters",
+              "addFilter",
+              "activeFilter",
+              "dropFilter"
               )
 ## Protein data related stuff
 exportMethods("hasProteinData",
@@ -227,7 +234,8 @@ export(
     "UniprotDbFilter",
     "UniprotMappingTypeFilter",
     "OnlyCodingTxFilter",
-    "ProtDomIdFilter"
+    "ProtDomIdFilter",
+    "TxSupportLevelFilter"
 )
 
 
diff --git a/R/Classes.R b/R/Classes.R
index 1ca0314..6e6ba04 100644
--- a/R/Classes.R
+++ b/R/Classes.R
@@ -14,7 +14,12 @@ setClass("EnsDb",
 #'
 #' @description \code{ensembldb} supports most of the filters from the
 #'     \code{\link{AnnotationFilter}} package to retrieve specific content from
-#'     \code{\linkS4class{EnsDb}} databases.
+#'     \code{\linkS4class{EnsDb}} databases. These filters can be passed to
+#'     the methods such as \code{\link{genes}} with the \code{filter} parameter
+#'     or can be added as a \emph{global} filter to an \code{EnsDb} object
+#'     (see \code{\link{addFilter}} for more details). Use the
+#'     \code{\link{supportedFilters}} to list all filters supported for
+#'     \code{EnsDb} object.
 #'
 #' @note For users of \code{ensembldb} version < 2.0: in the
 #'     \code{\link[AnnotationFilter]{GRangesFilter}} from the
@@ -153,6 +158,14 @@ setClass("EnsDb",
 #'
 #' In addition, the following filters are defined by \code{ensembldb}:
 #' \describe{
+#'
+#' \item{TxSupportLevelFilter}{
+#'     allows to filter results using the provided transcript support level.
+#'     Support levels for transcripts are defined by Ensembl based on the
+#'     available evidence for a transcript, with 1 being the highest evidence
+#'     grade and 5 the lowest level. This filter is only supported on
+#'     \code{EnsDb} databases with a db schema version higher than 2.1.
+#' }
 #' 
 #' \item{UniprotDbFilter}{
 #'     allows to filter results based on the specified Uniprot database name(s).
@@ -195,6 +208,7 @@ setClass("EnsDb",
 #'     annotation filters.
 #' 
 #' @name Filter-classes
+#' 
 #' @seealso
 #' \code{\link{supportedFilters}} to list all filters supported for \code{EnsDb}
 #'     objects.
@@ -207,6 +221,9 @@ setClass("EnsDb",
 #'
 #'     \code{\link{genes}}, \code{\link{transcripts}}, \code{\link{exons}},
 #'     \code{\link{listGenebiotypes}}, \code{\link{listTxbiotypes}}.
+#'
+#'     \code{\link{addFilter}} for globally adding filters to an \code{EnsDb}
+#'     object.
 #' 
 #' @author Johannes Rainer
 #' @examples
@@ -366,3 +383,19 @@ UniprotMappingTypeFilter <- function(value, condition = "==") {
         value = as.character(value))
 }
 
+#' @rdname Filter-classes
+setClass("TxSupportLevelFilter", contains = "IntegerFilter",
+         prototype = list(
+             condition = "==",
+             values = 0L,
+             field = "tx_support_level"
+         ))
+#' @return For \code{TxSupportLevelFilter}: A
+#' \code{TxSupportLevelFilter} object.
+#' @rdname Filter-classes
+TxSupportLevelFilter <- function(value, condition = "==") {
+    if (!is.numeric(value))
+        stop("Parameter 'value' has to be numeric")
+    new("TxSupportLevelFilter", condition = condition,
+        value = as.integer(value))
+}
diff --git a/R/Generics.R b/R/Generics.R
index 859e1bc..9e6720d 100644
--- a/R/Generics.R
+++ b/R/Generics.R
@@ -4,6 +4,8 @@
 ##
 ##***********************************************************************
 ## A
+setGeneric("activeFilter", function(x, ...) standardGeneric("activeFilter"))
+setGeneric("addFilter", function(x, ...) standardGeneric("addFilter"))
 
 ## B
 setGeneric("buildQuery", function(x, ...)
@@ -16,6 +18,7 @@ setGeneric("cleanColumns", function(x, columns, ...)
 ## D
 setGeneric("dbSeqlevelsStyle", function(x, ...)
     standardGeneric("dbSeqlevelsStyle"))
+setGeneric("dropFilter", function(x, ...) standardGeneric("dropFilter"))
 
 ## E
 setGeneric("ensemblVersion", function(x)
@@ -26,6 +29,9 @@ setGeneric("ensDbQuery", function(object, ...)
     standardGeneric("ensDbQuery"))
 
 ## F
+## if (!isGeneric("filter"))
+##     setGeneric("filter", function(x, ...)
+##                standardGeneric("filter"))
 setGeneric("formatSeqnamesForQuery", function(x, sn, ...)
     standardGeneric("formatSeqnamesForQuery"))
 setGeneric("formatSeqnamesFromQuery", function(x, sn, ...)
diff --git a/R/Methods-Filter.R b/R/Methods-Filter.R
index edb08b2..a4ebe25 100644
--- a/R/Methods-Filter.R
+++ b/R/Methods-Filter.R
@@ -32,6 +32,70 @@ setMethod("ensDbColumn", "AnnotationFilterList",
                                    with.tables = with.tables)))
           })
 
+#' @title Convert an AnnotationFilter to a SQL WHERE condition for EnsDb
+#'
+#' @aliases convertFilter,AnnotationFilter,EnsDb-method
+#'     convertFilter,AnnotationFilterList,EnsDb-method
+#' 
+#' @description `convertFilter` converts an `AnnotationFilter::AnnotationFilter`
+#'     or `AnnotationFilter::AnnotationFilterList` to an SQL where condition
+#'     for an `EnsDb` database.
+#'
+#' @note This function *might* be used in direct SQL queries on the SQLite
+#'     database underlying an `EnsDb` but is more thought to illustrate the
+#'     use of `AnnotationFilter` objects in combination with SQL databases.
+#'     This method is used internally to create the SQL calls to the database.
+#'
+#' @param object `AnnotationFilter` or `AnnotationFilterList` objects (or
+#'     objects extending these classes).
+#'
+#' @param db `EnsDb` object.
+#'
+#' @param with.tables optional `character` vector specifying the names of the
+#'     database tables that are being queried.
+#'
+#' @return A `character(1)` with the SQL where condition.
+#' 
+#' @md
+#'
+#' @rdname convertFilter
+#' 
+#' @author Johannes Rainer
+#'
+#' @examples
+#'
+#' library(EnsDb.Hsapiens.v75)
+#' edb <- EnsDb.Hsapiens.v75
+#'
+#' ## Define a filter
+#' flt <- AnnotationFilter(~ genename == "BCL2")
+#'
+#' ## Use the method from the AnnotationFilter package:
+#' convertFilter(flt)
+#'
+#' ## Create a combination of filters
+#' flt_list <- AnnotationFilter(~ genename %in% c("BCL2", "BCL2L11") &
+#'     tx_biotype == "protein_coding")
+#' flt_list
+#'
+#' convertFilter(flt_list)
+#'
+#' ## Use the filters in the context of an EnsDb database:
+#' convertFilter(flt, edb)
+#'
+#' convertFilter(flt_list, edb)
+setMethod("convertFilter", signature = c(object = "AnnotationFilter",
+                                         db = "EnsDb"),
+          function(object, db, with.tables = character()) {
+              ensDbQuery(object, db, with.tables)
+          })
+#' @rdname convertFilter
+setMethod("convertFilter", signature = c(object = "AnnotationFilterList",
+                                         db = "EnsDb"),
+          function(object, db, with.tables = character()) {
+              ensDbQuery(object, db, with.tables)
+          })
+
 #' @description Build the \emph{where} query for an \code{AnnotationFilter} or
 #'     \code{AnnotationFilterList}.
 #'
@@ -290,3 +354,25 @@ setMethod("ensDbQuery", "UniprotMappingTypeFilter",
                        " be used.")
               .queryForEnsDbWithTables(object, db, with.tables)
           })
+
+setMethod("ensDbColumn", "TxSupportLevelFilter",
+          function(object, db, with.tables = character(), ...) {
+              if (missing(db))
+                  return(callNextMethod())
+              if (!any(listColumns(db) %in% "tx_support_level"))
+                  stop("The 'EnsDb' database used does not provide",
+                       " transcript support levels! A 'TxSupportLevelFilter' ",
+                       "can not be used.")
+              callNextMethod()
+          })
+
+setMethod("ensDbQuery", "TxSupportLevelFilter",
+          function(object, db, with.tables = character()) {
+              if (missing(db))
+                  return(callNextMethod())
+              if (!any(listColumns(db) %in% "tx_support_level"))
+                  stop("The 'EnsDb' database used does not provide",
+                       " transcript support levels! A 'TxSupportLevelFilter' ",
+                       "can not be used.")
+              .queryForEnsDbWithTables(object, db, with.tables)
+          })
diff --git a/R/Methods.R b/R/Methods.R
index 35b279c..e1a3d58 100644
--- a/R/Methods.R
+++ b/R/Methods.R
@@ -28,6 +28,11 @@ setMethod("show", "EnsDb", function(object) {
                    ".\n"))
         if (hasProteinData(object))
             cat("|Protein data available.\n")
+        flts <- .activeFilter(object)
+        if (is(flts, "AnnotationFilter") | is(flts, "AnnotationFilterList")) {
+            cat("|Active filter(s):\n")
+            show(flts)
+        }
     }
 })
 
@@ -1628,7 +1633,7 @@ setMethod("updateEnsDb", "EnsDb", function(x, ...){
 ##  GenomicFeature package, finetuning and adapting it for EnsDbs
 ####------------------------------------------------------------
 setMethod("transcriptsByOverlaps", "EnsDb",
-          function(x, ranges, maxgap = 0L, minoverlap = 1L,
+          function(x, ranges, maxgap = -1L, minoverlap = 0L,
                    type = c("any", "start", "end"),
                    columns = listColumns(x, "tx"),
                    filter = AnnotationFilterList()) {
@@ -1648,7 +1653,7 @@ setMethod("transcriptsByOverlaps", "EnsDb",
 ##
 ####------------------------------------------------------------
 setMethod("exonsByOverlaps", "EnsDb",
-          function(x, ranges, maxgap = 0L, minoverlap = 1L,
+          function(x, ranges, maxgap = -1L, minoverlap = 0L,
                    type = c("any", "start", "end"),
                    columns = listColumns(x, "exon"),
                    filter = AnnotationFilterList()) {
@@ -1991,17 +1996,106 @@ setMethod("listUniprotMappingTypes", "EnsDb", function(object) {
     return(res$uniprot_mapping_type)
 })
 
-#' @description \code{supportedFilters} returns the names of all supported
-#'     filters for the \code{EnsDb} object.
+#' @description \code{supportedFilters} returns a \code{data.frame} with the
+#'     names of all filters and the corresponding field supported by the
+#'     \code{EnsDb} object.
 #'
 #' @param object For \code{supportedFilters}: an \code{EnsDb} object.
 #'
 #' @param ... For \code{supportedFilters}: currently not used.
 #'
-#' @return For \code{supportedFilters}: the names of the supported filter
-#'     classes.
+#' @return For \code{supportedFilters}: a \code{data.frame} with the names and
+#'     the corresponding field of the supported filter classes.
 #' 
 #' @rdname Filter-classes
 setMethod("supportedFilters", "EnsDb", function(object, ...) {
     .supportedFilters(object)
 })
+
+#' @title Globally add filters to an EnsDb database
+#'
+#' @aliases addFilter addFilter,EnsDb-method
+#'
+#' @description These methods allow to set, delete or show globally defined
+#'     filters on an \code{\linkS4class{EnsDb}} object.
+#'
+#'     \code{addFilter}: adds an annotation filter to the \code{EnsDb} object.
+#'
+#' @details Adding a filter to an \code{EnsDb} object causes this filter to be
+#'     permanently active. The filter will be used for all queries to the
+#'     database and is added to all additional filters passed to the methods
+#'     such as \code{\link{genes}}.
+#'
+#' @param x The \code{\linkS4class{EnsDb}} object to which the filter should be
+#'     added.
+#'
+#' @param filter The filter as an
+#'     \code{\link[AnnotationFilter]{AnnotationFilter}},
+#'     \code{\link[AnnotationFilter]{AnnotationFilterList}} or filter
+#'     expression. See
+#'
+#' @return \code{addFilter} and \code{filter} return an \code{EnsDb} object
+#'     with the specified filter added.
+#' 
+#'     \code{activeFilter} returns an
+#'     \code{\link[AnnotationFilter]{AnnotationFilterList}} object being the
+#'     active global filter or \code{NA} if no filter was added.
+#'
+#'     \code{dropFilter} returns an \code{EnsDb} object with all eventually
+#'     present global filters removed.
+#' 
+#' @author Johannes Rainer
+#' 
+#' @rdname global-filters
+#'
+#' @seealso \code{\link{Filter-classes}} for a list of all supported filters.
+#' 
+#' @examples
+#' library(EnsDb.Hsapiens.v75)
+#' edb <- EnsDb.Hsapiens.v75
+#'
+#' ## Add a global SeqNameFilter to the database such that all subsequent
+#' ## queries will be applied on the filtered database.
+#' edb_y <- addFilter(edb, SeqNameFilter("Y"))
+#'
+#' ## Note: using the filter function is equivalent to a call to addFilter.
+#'
+#' ## Each call returns now only features encoded on chromosome Y
+#' gns <- genes(edb_y)
+#'
+#' seqlevels(gns)
+#'
+#' ## Get all lincRNA gene transcripts on chromosome Y
+#' transcripts(edb_y, filter = ~ gene_biotype == "lincRNA")
+#'
+#' ## Get the currently active global filter:
+#' activeFilter(edb_y)
+#'
+#' ## Delete this filter again.
+#' edb_y <- dropFilter(edb_y)
+#'
+#' activeFilter(edb_y)
+setMethod("addFilter", "EnsDb", function(x, filter = AnnotationFilterList()) {
+    .addFilter(x, filter)
+})
+
+#' @aliases dropFilter dropFilter,EnsDb-method
+#'
+#' @description \code{dropFilter} deletes all globally set filters from the
+#'     \code{EnsDb} object.
+#'
+#' @rdname global-filters
+setMethod("dropFilter", "EnsDb", function(x) {
+    .dropFilter(x)
+})
+
+#' @aliases activeFilter activeFilter,EnsDb-method
+#'
+#' @description \code{activeFilter} returns the globally set filter from an
+#'     \code{EnsDb} object.
+#' 
+#' @rdname global-filters
+setMethod("activeFilter", "EnsDb", function(x) {
+    .activeFilter(x)
+})
+
diff --git a/R/dbhelpers.R b/R/dbhelpers.R
index 4ec5b9c..0c67cf6 100644
--- a/R/dbhelpers.R
+++ b/R/dbhelpers.R
@@ -406,11 +406,14 @@ removePrefix <- function(x, split=".", fixed=TRUE){
                                  fetchColumns[fetchColumns != "tx_name"]))
     if (!is(filter, "AnnotationFilterList"))
         stop("parameter 'filter' has to be an 'AnnotationFilterList'!")
+    ## Add also the global filter if present.
+    global_filter <- .activeFilter(x)
+    if (is(global_filter, "AnnotationFilter") |
+        is(global_filter, "AnnotationFilterList"))
+        filter <- AnnotationFilterList(global_filter, filter)
     ## If any filter is a SymbolFilter, add "symbol" to the return columns.
     if (length(filter) > 0) {
-        if (any(unlist(lapply(filter, function(z) {
-            return(is(z, "SymbolFilter"))
-        }))))
+        if (any(.anyIs(filter, "SymbolFilter")))
             columns <- unique(c(columns, "symbol"))  ## append a filter column.
     }
     ## Catch also a "symbol" in columns
@@ -681,12 +684,14 @@ feedEnsDb2MySQL <- function(x, mysql, verbose = TRUE) {
                 idxL <- paste0("(", min(c(max(nchar(ids)), 20)), ")")
             else
                 idxL <- ""
-            dbGetQuery(con, paste0("create index ", tabname, "_", colname, "_idx ",
-                                   "on ", tabname, " (",colname, idxL,")"))
+            aff_rows <- dbExecute(
+                con, paste0("create index ", tabname, "_", colname, "_idx ",
+                            "on ", tabname, " (",colname, idxL,")"))
         }
     }
     ## Add the one on the numeric index:
-    dbGetQuery(con, "create index tx2exon_exon_idx_idx on tx2exon (exon_idx);")
+    aff_rows <- dbExecute(con, paste0("create index tx2exon_exon_idx_idx on ",
+                                      "tx2exon (exon_idx);"))
 }
 
 ############################################################
diff --git a/R/functions-EnsDb.R b/R/functions-EnsDb.R
new file mode 100644
index 0000000..b32a769
--- /dev/null
+++ b/R/functions-EnsDb.R
@@ -0,0 +1,78 @@
+## Functions related to EnsDb objects.
+
+#' @title Globally filter an EnsDb database
+#'
+#' @description \code{.addFilter} globally filters the provided
+#'     \code{EnsDb} database, i.e. it returns a \code{EnsDb} object with the
+#'     provided filter permanently set and active.
+#'
+#' @details Adding a filter to an \code{EnsDb} object causes this filter to be
+#'     permanently active. The filter will be used for all queries to the
+#'     databases and is added to all additional filters passed to the methods.
+#' 
+#' @param x An \code{EnsDb} object on which the filter(s) should be set.
+#'
+#' @param filter An \code{\link[AnnotationFilter]{AnnotationFilterList}},
+#'     \code{\link[AnnotationFilter]{AnnotationFilter}} object or a filter
+#'     expression providing the filter(s) to be set.
+#'
+#' @return For \code{.addFilter}: the \code{EnsDb} object with the filter
+#'     globally added and enabled.
+#'
+#'     For \code{.dropFilter}: the \code{EnsDb} object with all filters removed.
+#'
+#'     For \code{.activeFilter}: an \code{}
+#'
+#' @author Johannes Rainer
+#'
+#' @noRd
+.addFilter <- function(x, filter = AnnotationFilterList()) {
+    if (length(filter) == 0)
+        stop("No filter provided")
+    filter <- .processFilterParam(filter, x)
+    ## Now, if there was no error, filter is an AnnotationFilterList
+    got_filter <- getProperty(x, "FILTER")
+    if (is(got_filter, "AnnotationFilter") |
+        is(got_filter, "AnnotationFilterList")) {
+        ## Append the new filter.
+        filter <- AnnotationFilterList(got_filter, filter)
+    } else {
+        if (!is.na(got_filter))
+            stop("Globally set filter is not an 'AnnotationFilter' or ",
+                 "'AnnotationFilterList'")
+    }
+    x@.properties$FILTER <- filter
+    x
+}
+
+#' @description \code{.dropFilter} drops all (globally) added filters from the
+#'     submitted \code{EnsDb} object.
+#'
+#' @noRd
+.dropFilter <- function(x) {
+    dropProperty(x, "FILTER")
+}
+
+#' @description \code{.activeFilter} lists the globally set and active filter(s)
+#'     of an \code{EnsDb} object.
+#'
+#' @noRd
+.activeFilter <- function(x) {
+    getProperty(x, "FILTER")
+}
+
+
+#' @aliases filter
+#' 
+#' @description \code{filter} filters an \code{EnsDb} object. \code{filter} is
+#'     an alias for the \code{addFilter} function.
+#' 
+#' @rdname global-filters
+filter <- function(x, filter = AnnotationFilterList()) {
+    if (is(x, "EnsDb"))
+        addFilter(x, filter)
+    else
+        stop("ensembldb::filter requires an 'EnsDb' object as input. To call ",
+             "the filter function from the stats or dplyr package use ",
+             "stats::filter and dplyr::filter instead.")
+}
diff --git a/R/functions-Filter.R b/R/functions-Filter.R
index 592e897..12884a3 100644
--- a/R/functions-Filter.R
+++ b/R/functions-Filter.R
@@ -13,12 +13,14 @@
     seq_strand = "seq_strand",
     gene_start = "gene_seq_start",
     gene_end = "gene_seq_end",
+    description = "description",
     ## tx
     tx_id = "tx_id",
     tx_biotype = "tx_biotype",
     tx_name = "tx_id",
     tx_start = "tx_seq_start",
     tx_end = "tx_seq_end",
+    tx_support_level = "tx_support_level",
     ## exon
     exon_id = "exon_id",
     exon_rank = "exon_idx",
@@ -32,18 +34,46 @@
     prot_dom_id = "protein_domain_id"
 )
 
+## .supportedFilters <- function(x) {
+##     flts <- c(
+##         "EntrezFilter", "GeneBiotypeFilter", "GeneIdFilter", "GenenameFilter",
+##         "SymbolFilter", "SeqNameFilter", "SeqStrandFilter", "GeneStartFilter",
+##         "GeneEndFilter", "TxIdFilter", "TxBiotypeFilter", "TxNameFilter",
+##         "TxStartFilter", "TxEndFilter", "ExonIdFilter", "ExonRankFilter",
+##         "ExonStartFilter", "ExonEndFilter", "GRangesFilter"
+##     )
+##     if (hasProteinData(x))
+##         flts <- c(flts, "ProteinIdFilter", "UniprotFilter", "UniprotDbFilter",
+##                   "UniprotMappingTypeFilter", "ProtDomIdFilter")
+##     if (any(listColumns(x) == "tx_support_level"))
+##         flts <- c(flts, "TxSupportLevelFilter")
+##     return(sort(flts))
+## }
 .supportedFilters <- function(x) {
-    flts <- c(
-        "EntrezFilter", "GeneBiotypeFilter", "GeneIdFilter", "GenenameFilter",
-        "SymbolFilter", "SeqNameFilter", "SeqStrandFilter", "GeneStartFilter",
-        "GeneEndFilter", "TxIdFilter", "TxBiotypeFilter", "TxNameFilter",
-        "TxStartFilter", "TxEndFilter", "ExonIdFilter", "ExonRankFilter",
-        "ExonStartFilter", "ExonEndFilter", "GRangesFilter"
-    )
+    flds <- .filterFields(x)
+    flts <- c(.fieldToClass(flds), "GRangesFilter")
+    flds <- c(flds, NA)
+    idx <- order(flts)
+    data.frame(filter = flts[idx], field = flds[idx], stringsAsFactors = FALSE)
+}
+
+.filterFields <- function(x) {
+    flds <- c("entrez", "gene_biotype", "gene_id", "genename", "symbol",
+              "seq_name", "seq_strand", "gene_start", "gene_end", "tx_id",
+              "tx_biotype", "tx_name", "tx_start", "tx_end", "exon_id",
+              "exon_rank", "exon_start", "exon_end")
     if (hasProteinData(x))
-        flts <- c(flts, "ProteinIdFilter", "UniprotFilter", "UniprotDbFilter",
-                  "UniprotMappingTypeFilter", "ProtDomIdFilter")
-    return(sort(flts))
+        flds <- c(flds, "protein_id", "uniprot", "uniprot_db",
+                  "uniprot_mapping_type", "prot_dom_id")
+    if (any(listColumns(x) == "tx_support_level"))
+        flds <- c(flds, "tx_support_level")
+    sort(flds)
+}
+
+.fieldToClass <- function(field) {
+    class <- gsub("_([[:alpha:]])", "\\U\\1", field, perl=TRUE)
+    class <- sub("^([[:alpha:]])", "\\U\\1", class, perl=TRUE)
+    paste0(class, if (length(class)) "Filter" else character(0))
 }
 
 #' Utility function to map from the default AnnotationFilters fields to the
@@ -79,7 +109,7 @@
     }
     if (cond == "==")
         cond <- "="
-    if (cond %in% c("startsWith", "endsWith"))
+    if (cond %in% c("startsWith", "endsWith", "contains"))
         cond <- "like"
     cond
 }
@@ -100,6 +130,8 @@
         vals <- paste0("'", unique(x@value), "%'")
     if (condition(x) == "endsWith")
         vals <- paste0("'%", unique(x@value), "'")
+    if (condition(x) == "contains")
+        vals <- paste0("'%", unique(x@value), "%'")
     vals
 }
 
@@ -165,7 +197,7 @@
                  "or a valid filter expression!")
         }
     }
-    supp_filters <- supportedFilters(db)
+    supp_filters <- supportedFilters(db)$filter
     have_filters <- unique(.AnnotationFilterClassNames(res))
     if (!all(have_filters %in% supp_filters))
         stop("AnnotationFilter classes: ",
@@ -322,3 +354,14 @@ buildWhereForGRanges <- function(grf, columns, db = NULL){
     })
     unlist(classes, use.names = FALSE)
 }
+
+#' @description Test if any of the filter(s) is an SymbolFilter.
+#'
+#' @noRd
+.anyIs <- function(x, what = "SymbolFilter") {
+    if (is(x, "AnnotationFilter")) {
+        is(x, what)
+    } else {
+        unlist(lapply(x, .anyIs, what = what))
+    }
+}
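
Note on the R/functions-Filter.R changes above: the rewritten .supportedFilters derives filter class names from field names instead of hard-coding them. The sketch below is a standalone base-R copy of that conversion for illustration only. The name field_to_class and the three example fields are assumptions made here and are not part of the package. It relies on the Perl-style \U case conversion in the replacement string, which R supports when perl = TRUE.

```r
## Standalone illustration of the field-name to class-name conversion.
## 'field_to_class' is a hypothetical name used only in this sketch.
field_to_class <- function(field) {
    ## "_x" becomes "X": drop each underscore and upper-case the next letter.
    cls <- gsub("_([[:alpha:]])", "\\U\\1", field, perl = TRUE)
    ## Upper-case the first letter of the resulting name.
    cls <- sub("^([[:alpha:]])", "\\U\\1", cls, perl = TRUE)
    ## Append the "Filter" suffix; zero-length input stays zero-length.
    paste0(cls, if (length(cls)) "Filter" else character(0))
}

## Example fields (a small subset chosen for this sketch).
flds <- c("gene_id", "seq_name", "tx_support_level")
flts <- c(field_to_class(flds), "GRangesFilter")
## GRangesFilter has no single database field, hence the NA.
flds <- c(flds, NA)
idx <- order(flts)
res <- data.frame(filter = flts[idx], field = flds[idx],
                  stringsAsFactors = FALSE)
print(res)
```

Because the return value is now a data.frame rather than a character vector, callers that need only the class names have to select the filter column, as the changed .processFilterParam hunk does with supportedFilters(db)$filter.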
diff --git a/R/functions-create-EnsDb.R b/R/functions-create-EnsDb.R
index 043f009..c15b553 100644
--- a/R/functions-create-EnsDb.R
+++ b/R/functions-create-EnsDb.R
@@ -56,7 +56,8 @@ fetchTablesFromEnsembl <- function(version, ensemblapi, user="anonymous",
 
     ## we should now have the files:
     in_files <- c("ens_gene.txt", "ens_tx.txt", "ens_exon.txt",
-                  "ens_tx2exon.txt", "ens_chromosome.txt", "ens_metadata.txt")
+                  "ens_tx2exon.txt", "ens_chromosome.txt", "ens_metadata.txt",
+                  "ens_counts.txt")
     ## check if we have all files...
     all_files <- dir(pattern="txt")
     if(sum(in_files %in% all_files)!=length(in_files))
@@ -77,8 +78,15 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
         stop("Something went wrong! I'm missing some of the txt files the",
              " perl script should have generated.")
 
+    haveCounts <- file.exists(paste0(path,
+                                     .Platform$file.sep, "ens_counts.txt"))
+    ## read the counts - use these numbers to validate that we did read
+    ## everything
+    if (haveCounts)
+        counts <- read.table(paste0(path, .Platform$file.sep, "ens_counts.txt"),
+                             sep = "\t", as.is = TRUE, header = TRUE)
     ## read information
-    info <- read.table(paste0(path, .Platform$file.sep ,"ens_metadata.txt"),
+    info <- read.table(paste0(path, .Platform$file.sep, "ens_metadata.txt"),
                        sep="\t", as.is=TRUE, header=TRUE)
     species <- .organismName(info[ info$name=="Organism", "value" ])
     ##substring(species, 1, 1) <- toupper(substring(species, 1, 1))
@@ -107,10 +115,22 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
                       sep="\t", as.is=TRUE, header=TRUE,
                       quote="", comment.char="" )
     OK <- .checkIntegerCols(tmp)
+    ## Check that we have the expected number of rows:
+    if (haveCounts)
+        if (nrow(tmp) != counts[1, "gene"])
+            stop("The data read from the 'ens_gene.txt' file does not match ",
+                 "the expected number of entries.")
     dbWriteTable(con, name="gene", tmp, row.names=FALSE)
+    ## Check that we can read the correct number of entries
+    if (haveCounts) {
+        res <- dbGetQuery(con, "select count(*) from gene;")[1, 1]
+        if (res != counts[1, "gene"])
+            stop("The number of rows in the 'gene' database table does not ",
+                 "match the expected number.")
+    }
     rm(tmp)
     message("OK")
-
+    
     if (as.numeric(info[info$name == "DBSCHEMAVERSION", "value"]) > 1) {
         message("Processing 'entrezgene' table ... ", appendLF = FALSE)
         ## process genes: some gene names might have fancy names...
@@ -126,6 +146,11 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
     ## process transcripts:
     tmp <- read.table(paste0(path, .Platform$file.sep, "ens_tx.txt"),
                       sep="\t", as.is=TRUE, header=TRUE)
+    ## Check that we have the expected number of rows:
+    if (haveCounts)
+        if (nrow(tmp) != counts[1, "tx"])
+            stop("The data read from the 'ens_tx.txt' file does not match the",
+                 " expected number of entries.")
     ## Fix the tx_cds_seq_start and tx_cds_seq_end columns: these should be integer!
     suppressWarnings(
         tmp[, "tx_cds_seq_start"] <- as.integer(tmp[, "tx_cds_seq_start"])
@@ -134,6 +159,20 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
         tmp[, "tx_cds_seq_end"] <- as.integer(tmp[, "tx_cds_seq_end"])
     )
     OK <- .checkIntegerCols(tmp)
+    ## Fix the tx_support_level column to ensure it contains only INTEGER!
+    if (any(colnames(tmp) == "tx_support_level")) {
+        tsl <- strsplit(tmp$tx_support_level, split = " ", fixed = TRUE)
+        tsl <- lapply(tsl, function(z) {
+            if (length(z) > 1)
+                z <- z[1]
+            if (is.na(z))
+                return(NA_integer_)
+            if (z == "NA" | z == "NULL")
+                z <- NA
+            as.integer(z)
+        })
+        tmp$tx_support_level <- unlist(tsl, use.names = FALSE)
+    }
     dbWriteTable(con, name="tx", tmp, row.names=FALSE)
     rm(tmp)
     message("OK")
@@ -142,6 +181,10 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
     message("Processing 'exon' table ... ", appendLF = FALSE)
     tmp <- read.table(paste0(path, .Platform$file.sep, "ens_exon.txt"),
                       sep = "\t", as.is = TRUE, header = TRUE)
+    if (haveCounts)
+        if (nrow(tmp) != counts[1, "exon"])
+            stop("The data read from the 'ens_exon.txt' file does not match ",
+                 "the expected number of entries.")
     OK <- .checkIntegerCols(tmp)
     dbWriteTable(con, name="exon", tmp, row.names=FALSE)
     rm(tmp)
@@ -159,6 +202,10 @@ makeEnsemblSQLiteFromTables <- function(path=".", dbname){
     if (file.exists(prot_file)) {
         message("Processing 'protein' table ... ", appendLF = FALSE)
         tmp <- read.table(prot_file, sep = "\t", as.is = TRUE, header = TRUE)
+        if (haveCounts)
+            if (nrow(tmp) != counts[1, "protein"])
+                stop("The data read from the 'ens_protein.txt' file does not ",
+                     "match the expected number of entries.")
         OK <- .checkIntegerCols(tmp)
         dbWriteTable(con, name = "protein", tmp, row.names = FALSE)
         message("OK")
@@ -273,23 +320,23 @@ makeEnsembldbPackage <- function(ensdb,
 ## a Ensembl GTF file.
 ## Limitation:
 ## + There is no way to get the Entrezgene ID from this file.
-## + Assuming that the element 2 in a row for a transcript represents its biotype, since
-##   there is no explicit key transcript_biotype in element 9.
-## + The CDS features in the GTF are somewhat problematic, while we're used to get just the
-##   coding start and end for a transcript from the Ensembl perl API, here we get the coding
-##   start and end for each exon.
+## + Assuming that the element 2 in a row for a transcript represents its
+##   biotype, since there is no explicit key transcript_biotype in element 9.
+## + The CDS features in the GTF are somewhat problematic: while the Ensembl perl
+##   API provides just the coding start and end for a transcript, here we get
+##   the coding start and end for each exon.
 ensDbFromGtf <- function(gtf, outfile, path, organism, genomeVersion,
                          version, ...){
-    options(useFancyQuotes=FALSE)
-    message("Importing GTF file ... ", appendLF=FALSE)
+    options(useFancyQuotes = FALSE)
+    message("Importing GTF file ... ", appendLF = FALSE)
     ## wanted.features <- c("gene", "transcript", "exon", "CDS")
     wanted.features <- c("exon")
     ## GTF <- import(con=gtf, format="gtf", feature.type=wanted.features)
-    GTF <- import(con=gtf, format="gtf")
+    GTF <- import(con = gtf, format = "gtf")
     message("OK")
     ## check what we've got...
     ## all wanted features?
-    if(any(!(wanted.features %in% levels(GTF$type)))){
+    if (any(!(wanted.features %in% levels(GTF$type)))) {
         stop(paste0("One or more required types are not in the gtf file. Need ",
                     paste(wanted.features, collapse=","), " but got only ",
                     paste(wanted.features[wanted.features %in% levels(GTF$type)],
@@ -297,17 +344,17 @@ ensDbFromGtf <- function(gtf, outfile, path, organism, genomeVersion,
                     "."))
     }
     ## transcript biotype?
-    if(any(colnames(mcols(GTF))=="transcript_biotype")){
+    if (any(colnames(mcols(GTF)) == "transcript_biotype")) {
         txBiotypeCol <- "transcript_biotype"
-    }else{
+    } else {
         ## that's a little weird, but it seems that certain gtf files from Ensembl
         ## provide the transcript biotype in the element "source"
         txBiotypeCol <- "source"
     }
     ## processing the metadata:
     ## first read the header...
-    tmp <- readLines(gtf, n=10)
-    tmp <- tmp[grep(tmp, pattern="^#")]
+    tmp <- readLines(gtf, n = 10)
+    tmp <- tmp[grep(tmp, pattern = "^#")]
     haveHeader <- FALSE
     if (length(tmp) > 0) {
         ##message("GTF file has a header.")
@@ -330,7 +377,7 @@ ensDbFromGtf <- function(gtf, outfile, path, organism, genomeVersion,
     organism <- Parms["organism"]
     genomeVersion <- Parms["genomeVersion"]
 
-    if(haveHeader){
+    if (haveHeader) {
         if(genomeVersion!=Header[Header[, "name"] == "genome-version", "value"]){
             stop(paste0("The GTF file name is not as expected: <Organism>.",
                         "<genome version>.<Ensembl version>.gtf!",
@@ -353,9 +400,8 @@ ensDbFromGtf <- function(gtf, outfile, path, organism, genomeVersion,
     ## updating the Metadata information...
     lite <- dbDriver("SQLite")
     con <- dbConnect(lite, dbname = dbname )
-    bla <- dbGetQuery(con, paste0("update metadata set value='",
-                                  gtfFilename,
-                                  "' where name='source_file';"))
+    bla <- dbExecute(con, paste0("update metadata set value='",
+                                 gtfFilename, "' where name='source_file';"))
     dbDisconnect(con)
     return(dbname)
 }
@@ -414,9 +460,8 @@ ensDbFromAH <- function(ah, outfile, path, organism, genomeVersion, version){
     ## updating the Metadata information...
     lite <- dbDriver("SQLite")
     con <- dbConnect(lite, dbname = dbname )
-    bla <- dbGetQuery(con, paste0("update metadata set value='",
-                                  gtfFilename,
-                                  "' where name='source_file';"))
+    bla <- dbExecute(con, paste0("update metadata set value='",
+                                 gtfFilename, "' where name='source_file';"))
     dbDisconnect(con)
     return(dbname)
 }
@@ -625,9 +670,8 @@ ensDbFromGff <- function(gff, outfile, path, organism, genomeVersion,
     ## updating the Metadata information...
     lite <- dbDriver("SQLite")
     con <- dbConnect(lite, dbname = dbname )
-    bla <- dbGetQuery(con, paste0("update metadata set value='",
-                                  gtfFilename,
-                                  "' where name='source_file';"))
+    bla <- dbExecute(con, paste0("update metadata set value='",
+                                 gtfFilename, "' where name='source_file';"))
     dbDisconnect(con)
     return(dbname)
 }
@@ -982,9 +1026,9 @@ checkValidEnsDb <- function(x){
                                }))
     if(any(Different)){
         stop(paste0("Provided exon index in transcript does not match with",
-                    " ordering of the exons by chromosomal coordinates for",
-                    sum(Different), "of the", length(Different),
-                    "transcripts encoded on the + strand!"))
+                    " ordering of the exons by chromosomal coordinates for ",
+                    sum(Different), " of the ", length(Different),
+                    " transcripts encoded on the + strand!"))
     }
     extmp <- ex[ex$seq_strand==-1, c("exon_idx", "tx_id", "exon_seq_end")]
     extmp <- extmp[order(extmp$exon_seq_end, decreasing=TRUE), ]
@@ -994,9 +1038,9 @@ checkValidEnsDb <- function(x){
                                }))
     if(any(Different)){
         stop(paste0("Provided exon index in transcript does not match with",
-                    " ordering of the exons by chromosomal coordinates for",
-                    sum(Different), "of the", length(Different),
-                    "transcripts encoded on the - strand!"))
+                    " ordering of the exons by chromosomal coordinates for ",
+                    sum(Different), " of the ", length(Different),
+                    " transcripts encoded on the - strand!"))
     }
     message("OK")
     return(TRUE)
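
Note on the tx_support_level handling added to makeEnsemblSQLiteFromTables
above: the following is a minimal base-R sketch of the same idea, namely keeping
only the first space-separated token of each value and coercing it to integer.
The helper name normalize_tsl and the input values are made up for
illustration and are not part of ensembldb. Unlike the committed code, the
sketch also treats empty strings as missing, which is an assumption on my side.

```r
## Hypothetical helper (not part of ensembldb): reduce transcript support
## level strings to a single integer per transcript.
normalize_tsl <- function(x) {
    tokens <- strsplit(as.character(x), split = " ", fixed = TRUE)
    first <- vapply(tokens, function(z) {
        ## Empty strings split into character(0); treat them as missing.
        if (length(z) == 0L || is.na(z[1L]) || z[1L] %in% c("NA", "NULL"))
            return(NA_character_)
        z[1L]
    }, character(1))
    ## Anything that is not a plain integer becomes NA instead of an error.
    suppressWarnings(as.integer(first))
}

## Made-up input values for illustration.
tsl <- normalize_tsl(c("1", "5 (assigned to previous version 3)",
                       "NA", "NULL", NA))
```

Using vapply with a character(1) template keeps the result a plain vector, so
no unlist() call is needed before assigning it back to the column.
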
diff --git a/build/vignette.rds b/build/vignette.rds
index 256aa5c..5bc633e 100644
Binary files a/build/vignette.rds and b/build/vignette.rds differ
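
Note on the row-count checks added to makeEnsemblSQLiteFromTables in
R/functions-create-EnsDb.R: the repeated "if (haveCounts) if (nrow(tmp) != ...)"
pattern could be factored into one helper. The sketch below assumes that
ens_counts.txt is a tab-separated file with a header and a single row of
counts, which is inferred from the counts[1, "gene"] indexing in the diff. The
helper name check_row_count and the example data are hypothetical.

```r
## Hypothetical helper: compare the number of rows read from a table file
## with the count recorded when the file was exported.
check_row_count <- function(tab, counts, table_name) {
    expected <- counts[1, table_name]
    if (nrow(tab) != expected)
        stop("The data read for table '", table_name, "' has ", nrow(tab),
             " rows but ", expected, " were expected.")
    invisible(TRUE)
}

## Made-up stand-in for the content of ens_counts.txt.
counts <- read.table(text = "gene\ttx\texon\n3\t5\t12",
                     sep = "\t", header = TRUE)
genes <- data.frame(gene_id = c("g1", "g2", "g3"))
check_row_count(genes, counts, "gene")
```
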
diff --git a/inst/NEWS b/inst/NEWS
index 23d3646..176057f 100644
--- a/inst/NEWS
+++ b/inst/NEWS
@@ -1,9 +1,47 @@
-CHANGES IN VERSION 2.0.4
---------------------------
+CHANGES IN VERSION 2.1.12
+------------------------
+
+BUG FIXES:
+    o Use new defaults from the IRanges package for arguments maxgap = -1L,
+      minoverlap = 0L in transcriptsByOverlaps and exonsByOverlaps methods.
+
+CHANGES IN VERSION 2.1.12
+------------------------
+
+BUG FIXES:
+    o Remove RSQLite warnings (issue #54).
+
+CHANGES IN VERSION 2.1.11
+------------------------
 
 BUG FIXES:
     o ensDbFromGtf failed to parse header for GTF files with more than one
       white space.
+      
+
+CHANGES IN VERSION 2.1.10
+------------------------
+
+USER VISIBLE CHANGES
+     o supportedFilters returns a data frame with the filter class name and
+       corresponding field (column) name.
+
+
+CHANGES IN VERSION 2.1.9
+------------------------
+
+NEW FEATURES
+    o Support for global filters in an EnsDb object.
+    o Add filter function.
+    
+
+CHANGES IN VERSION 2.1.8
+------------------------
+
+NEW FEATURES
+    o New annotations available in EnsDb objects: gene.description and
+      tx.tx_support_level.
+    o New TxSupportLevelFilter object.
 
 
 CHANGES IN VERSION 1.99.13
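
Note on the "Remove RSQLite warnings" entry above: in this commit the metadata
update statements switch from dbGetQuery to dbExecute, which is the DBI call
meant for statements that return no result set. A minimal sketch, assuming the
DBI and RSQLite packages are installed; the in-memory table and the file name
are made up, and the bound parameter is my own choice rather than what the
package does (it builds the statement with paste0).

```r
library(DBI)

## In-memory stand-in for the metadata table of an EnsDb SQLite file.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "metadata",
             data.frame(name = c("source_file", "Organism"),
                        value = c("", "homo_sapiens"),
                        stringsAsFactors = FALSE))

## Made-up file name following <Organism>.<genome version>.<Ensembl version>.gtf
gtf_filename <- "Homo_sapiens.GRCh37.75.gtf"

## dbExecute runs the statement and returns the number of affected rows.
n_changed <- dbExecute(
    con, "update metadata set value = ? where name = 'source_file';",
    params = list(gtf_filename))

## dbGetQuery remains the right call for statements that return rows.
source_file <- dbGetQuery(
    con, "select value from metadata where name = 'source_file';")[1, 1]
dbDisconnect(con)
```
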
diff --git a/inst/doc/MySQL-backend.R b/inst/doc/MySQL-backend.R
index 9003464..2d952c2 100644
--- a/inst/doc/MySQL-backend.R
+++ b/inst/doc/MySQL-backend.R
@@ -6,10 +6,11 @@
 #  ## Call the useMySQL method providing the required credentials to create
#  ## databases and insert data on the MySQL server
 #  edb_mysql <- useMySQL(EnsDb.Hsapiens.v75, host = "localhost", user = "userwrite",
-#  		      pass = "userpass")
+#                        pass = "userpass")
 #  
 #  ## Use this EnsDb object
 #  genes(edb_mysql)
+#  
 
 ## ----eval = FALSE----------------------------------------------------------
 #  library(ensembldb)
@@ -17,14 +18,15 @@
 #  
 #  ## Connect to the MySQL database to list the databases.
 #  dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-#  		   pass = "readonly")
+#                     pass = "readonly")
 #  
 #  ## List the available databases
 #  listEnsDbs(dbcon)
 #  
 #  ## Connect to one of the databases and use that one.
 #  dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-#  		   pass = "readonly", dbname = "ensdb_hsapiens_v75")
+#                     pass = "readonly", dbname = "ensdb_hsapiens_v75")
 #  edb <- EnsDb(dbcon)
 #  edb
+#  
 
diff --git a/inst/doc/MySQL-backend.Rmd b/inst/doc/MySQL-backend.Rmd
index de512cb..84db670 100644
--- a/inst/doc/MySQL-backend.Rmd
+++ b/inst/doc/MySQL-backend.Rmd
@@ -3,7 +3,7 @@ title: "Using a MySQL server backend"
 author: "Johannes Rainer"
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Using a MySQL server backend}
@@ -33,7 +33,7 @@ this would require access to a MySQL server.
 Installation of `EnsDb` databases in a MySQL server is straightforward - given
 that the user has write access to the server:
 
-```{r eval = FALSE}
+```{r  eval = FALSE }
 library(ensembldb)
 ## Load the EnsDb package that should be installed on the MySQL server
 library(EnsDb.Hsapiens.v75)
@@ -41,10 +41,11 @@ library(EnsDb.Hsapiens.v75)
 ## Call the useMySQL method providing the required credentials to create
 ## databases and insert data on the MySQL server
 edb_mysql <- useMySQL(EnsDb.Hsapiens.v75, host = "localhost", user = "userwrite",
-		      pass = "userpass")
+                      pass = "userpass")
 
 ## Use this EnsDb object
 genes(edb_mysql)
+ 
 ```
 
 To use an `EnsDb` in a MySQL server without the need to install the corresponding
@@ -52,21 +53,22 @@ R-package, the connection to the database can be passed to the `EnsDb` construct
 function. With the resulting `EnsDb` object annotations can be retrieved from the
 MySQL database.
 
-```{r eval = FALSE}
+```{r  eval = FALSE }
 library(ensembldb)
 library(RMySQL)
 
 ## Connect to the MySQL database to list the databases.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-		   pass = "readonly")
+                   pass = "readonly")
 
 ## List the available databases
 listEnsDbs(dbcon)
 
 ## Connect to one of the databases and use that one.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-		   pass = "readonly", dbname = "ensdb_hsapiens_v75")
+                   pass = "readonly", dbname = "ensdb_hsapiens_v75")
 edb <- EnsDb(dbcon)
 edb
+ 
 ```
 
diff --git a/inst/doc/MySQL-backend.html b/inst/doc/MySQL-backend.html
index 8850745..cebc91e 100644
--- a/inst/doc/MySQL-backend.html
+++ b/inst/doc/MySQL-backend.html
@@ -11,6 +11,7 @@
 
 <meta name="author" content="Johannes Rainer" />
 
+<meta name="date" content="2017-10-30" />
 
 <title>Using a MySQL server backend</title>
 
@@ -68,7 +69,7 @@ h6 {
 }
 </style>
 
-<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0A%7D%0Abody%2C%20td%20%7B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B [...]
+<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B%0A%7D%0Adiv%23TOC%20%3E%20 [...]
 
 </head>
 
@@ -209,9 +210,9 @@ div.tocify {
 
 <h1 class="title toc-ignore">Using a MySQL server backend</h1>
 <p class="author-name">Johannes Rainer</p>
-<h4 class="date"><em>4 August 2017</em></h4>
+<h4 class="date"><em>30 October 2017</em></h4>
 <h4 class="package">Package</h4>
-<p>ensembldb 2.0.4</p>
+<p>ensembldb 2.2.0</p>
 
 </div>
 
@@ -231,7 +232,7 @@ library(EnsDb.Hsapiens.v75)
 ## Call the useMySQL method providing the required credentials to create
 ## databases and insert data on the MySQL server
 edb_mysql <- useMySQL(EnsDb.Hsapiens.v75, host = "localhost", user = "userwrite",
-              pass = "userpass")
+                      pass = "userpass")
 
 ## Use this EnsDb object
 genes(edb_mysql)</code></pre>
@@ -241,14 +242,14 @@ library(RMySQL)
 
 ## Connect to the MySQL database to list the databases.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-           pass = "readonly")
+                   pass = "readonly")
 
 ## List the available databases
 listEnsDbs(dbcon)
 
 ## Connect to one of the databases and use that one.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-           pass = "readonly", dbname = "ensdb_hsapiens_v75")
+                   pass = "readonly", dbname = "ensdb_hsapiens_v75")
 edb <- EnsDb(dbcon)
 edb</code></pre>
 </div>
diff --git a/inst/doc/ensembldb.R b/inst/doc/ensembldb.R
index 76f79c2..c8cb262 100644
--- a/inst/doc/ensembldb.R
+++ b/inst/doc/ensembldb.R
@@ -8,15 +8,18 @@ edb
 
 ## For what organism was the database generated?
 organism(edb)
+ 
 
 ## ----no-network, echo = FALSE, results = "hide"----------------------------
 ## Disable code chunks that require network connection - conditionally
 ## disable this on Windows only. This is to avoid TIMEOUT errors on the
 ## Bioconductor Windows build machine (issue #47).
 use_network <- FALSE
+ 
 
 ## ----filters---------------------------------------------------------------
 supportedFilters(edb)
+ 
 
 ## ----transcripts-----------------------------------------------------------
 Tx <- transcripts(edb, filter = list(GenenameFilter("BCL2L11")))
@@ -28,10 +31,29 @@ head(start(Tx))
 
 ## or extract the biotype with
 head(Tx$tx_biotype)
+ 
 
 ## ----transcripts-filter-expression-----------------------------------------
 ## Use a filter expression to perform the filtering.
 transcripts(edb, filter = ~ genename == "ZBTB16")
+ 
+
+## ----transcripts-filter----------------------------------------------------
+library(magrittr)
+
+filter(edb, ~ symbol == "BCL2" & tx_biotype != "protein_coding") %>% transcripts
+ 
+
+## ----filter-Y--------------------------------------------------------------
+edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+## All subsequent queries on that EnsDb will only return features encoded on
+## chromosome Y
+genes(edb_y)
+
+## Get all lincRNAs on chromosome Y
+genes(edb_y, filter = ~ gene_biotype == "lincRNA")
+ 
 
 ## ----list-columns----------------------------------------------------------
 ## list all database tables along with their columns
@@ -39,23 +61,26 @@ listTables(edb)
 
 ## list columns from a specific table
 listColumns(edb, "tx")
+ 
 
 ## ----transcripts-example2--------------------------------------------------
 Tx <- transcripts(edb,
-		  columns = c(listColumns(edb , "tx"), "gene_name"),
-		  filter = TxBiotypeFilter("nonsense_mediated_decay"),
-		  return.type = "DataFrame")
+                  columns = c(listColumns(edb , "tx"), "gene_name"),
+                  filter = TxBiotypeFilter("nonsense_mediated_decay"),
+                  return.type = "DataFrame")
 nrow(Tx)
 Tx
+ 
 
 ## ----cdsBy-----------------------------------------------------------------
 yCds <- cdsBy(edb, filter = SeqNameFilter("Y"))
 yCds
+ 
 
 ## ----genes-GRangesFilter---------------------------------------------------
 ## Define the filter
 grf <- GRangesFilter(GRanges("11", ranges = IRanges(114000000, 114000050),
-			     strand = "+"), type = "any")
+                             strand = "+"), type = "any")
 
 ## Query genes:
 gn <- genes(edb, filter = grf)
@@ -63,6 +88,7 @@ gn
 
 ## Next we retrieve all transcripts for that gene so that we can plot them.
 txs <- transcripts(edb, filter = GenenameFilter(gn$gene_name))
+ 
 
 ## ----tx-for-zbtb16, message=FALSE, fig.align='center', fig.width=7.5, fig.height=5----
 plot(3, 3, pch = NA, xlim = c(start(gn), end(gn)), ylim = c(0, length(txs)),
@@ -73,12 +99,14 @@ rect(xleft = start(grf), xright = end(grf), ybottom = 0, ytop = length(txs),
 for(i in 1:length(txs)) {
     current <- txs[i]
     rect(xleft = start(current), xright = end(current), ybottom = i-0.975,
-	 ytop = i-0.125, border = "grey")
+         ytop = i-0.125, border = "grey")
     text(start(current), y = i-0.5, pos = 4, cex = 0.75, labels = current$tx_id)
 }
+ 
 
 ## ----transcripts-GRangesFilter---------------------------------------------
 transcripts(edb, filter = grf)
+ 
 
 ## ----biotypes--------------------------------------------------------------
 ## Get all gene biotypes from the database. The GeneBiotypeFilter
@@ -87,6 +115,7 @@ listGenebiotypes(edb)
 
 ## Get all transcript biotypes from the database.
 listTxbiotypes(edb)
+ 
 
 ## ----genes-BCL2------------------------------------------------------------
 ## We're going to fetch all genes whose names start with BCL. To this end
@@ -98,46 +127,53 @@ BCLs <- genes(edb,
 	      return.type = "DataFrame")
 nrow(BCLs)
 BCLs
+ 
 
 ## ----example-AnnotationFilterList------------------------------------------
 ## determine the average length of snRNA, snoRNA and rRNA genes encoded on
 ## chromosomes X and Y.
 mean(lengthOf(edb, of = "tx", filter = AnnotationFilterList(
-				  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
-				  SeqNameFilter(c("X", "Y")))))
+                                  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
+                                  SeqNameFilter(c("X", "Y")))))
 
 ## determine the average length of protein coding genes encoded on the same
 ## chromosomes.
 mean(lengthOf(edb, of = "tx", filter = ~ gene_biotype == "protein_coding" &
-				  seq_name %in% c("X", "Y")))
+                                  seq_name %in% c("X", "Y")))
+ 
 
 ## ----example-first-two-exons-----------------------------------------------
 ## Extract all exons 1 and (if present) 2 for all genes encoded on the
 ## Y chromosome
 exons(edb, columns = c("tx_id", "exon_idx"),
       filter = list(SeqNameFilter("Y"),
-		    ExonRankFilter(3, condition = "<")))
+                    ExonRankFilter(3, condition = "<")))
+ 
 
 ## ----transcriptsBy-X-Y-----------------------------------------------------
 TxByGns <- transcriptsBy(edb, by = "gene", filter = SeqNameFilter(c("X", "Y")))
 TxByGns
+ 
 
 ## ----exonsBy-RNAseq, message = FALSE, eval = FALSE-------------------------
 #  ## will just get exons for all genes on chromosomes 1 to 22, X and Y.
 #  ## Note: want to get rid of the "LRG" genes!!!
 #  EnsGenes <- exonsBy(edb, by = "gene", filter = AnnotationFilterList(
-#  					  SeqNameFilter(c(1:22, "X", "Y")),
-#  					  GeneIdFilter("ENSG", "startsWith")))
+#                                            SeqNameFilter(c(1:22, "X", "Y")),
+#                                            GeneIdFilter("ENSG", "startsWith")))
+#  
 
 ## ----toSAF-RNAseq, message = FALSE, eval=FALSE-----------------------------
 #  ## Transforming the GRangesList into a data.frame in SAF format
 #  EnsGenes.SAF <- toSAF(EnsGenes)
+#  
 
 ## ----disjointExons, message = FALSE, eval=FALSE----------------------------
 #  ## Create a GRanges of non-overlapping exon parts.
 #  DJE <- disjointExons(edb, filter = AnnotationFilterList(
 #  			      SeqNameFilter(c(1:22, "X", "Y")),
 #  			      GeneIdFilter("ENSG%", "startsWith")))
+#  
 
 ## ----transcript-sequence-AnnotationHub, message = FALSE, eval = FALSE------
 #  library(EnsDb.Hsapiens.v75)
@@ -157,6 +193,7 @@ TxByGns
 #  ## Get the gene sequences, i.e. the sequence including the sequence of
 #  ## all of the gene's exons and introns.
 #  geneSeqs <- getSeq(Dna, genes)
+#  
 
 ## ----transcript-sequence-extractTranscriptSeqs, message = FALSE, eval = FALSE----
 #  ## get all exons of all transcripts encoded on chromosome Y
@@ -174,6 +211,7 @@ TxByGns
 #  ## of all transcripts on the Y chromosome.
 #  cdsY <- cdsBy(edb, filter = SeqNameFilter("Y"))
 #  extractTranscriptSeqs(Dna, cdsY)
+#  
 
 ## ----seqlevelsStyle, message = FALSE---------------------------------------
 ## Change the seqlevels style from Ensembl (default) to UCSC:
@@ -183,6 +221,7 @@ seqlevelsStyle(edb) <- "UCSC"
 genesY <- genes(edb, filter = ~ seq_name == "chrY")
 ## The seqlevels of the returned GRanges are also in UCSC style
 seqlevels(genesY)
+ 
 
 ## ----seqlevelsStyle-2, message = FALSE-------------------------------------
 seqlevelsStyle(edb) <- "UCSC"
@@ -200,6 +239,7 @@ seqlevels(edb)[1:30]
 
 ## Resetting the option.
 options(ensembldb.seqnameNotFound = "ORIGINAL")
+ 
 
 ## ----extractTranscriptSeqs-BSGenome, warning = FALSE, message = FALSE------
 library(BSgenome.Hsapiens.UCSC.hg19)
@@ -219,11 +259,13 @@ yTxSeqs
 ## Extract just the CDS
 Test <- cdsBy(edb, "tx", filter = SeqNameFilter("chrY"))
 yTxCds <- extractTranscriptSeqs(bsg, cdsBy(edb, "tx",
-					   filter = SeqNameFilter("chrY")))
+                                           filter = SeqNameFilter("chrY")))
 yTxCds
+ 
 
 ## ----seqlevelsStyle-restore------------------------------------------------
 seqlevelsStyle(edb) <- "Ensembl"
+ 
 
 ## ----gviz-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=2.3----
 ## Loading the Gviz library
@@ -234,7 +276,7 @@ edb <- EnsDb.Hsapiens.v75
 ## Retrieving a Gviz compatible GRanges object with all genes
 ## encoded on chromosome Y.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "Y",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 
@@ -245,42 +287,47 @@ options(ucscChromosomeNames = FALSE)
 plotTracks(list(gat, GeneRegionTrack(gr)))
 
 options(ucscChromosomeNames = TRUE)
+ 
 
 ## ----message=FALSE---------------------------------------------------------
 seqlevelsStyle(edb) <- "UCSC"
 ## Retrieving the GRanges objects with seqnames corresponding to UCSC chromosome names.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 seqnames(gr)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 plotTracks(list(gat, GeneRegionTrack(gr)))
+ 
 
 ## ----gviz-separate-tracks, message=FALSE, warning=FALSE, fig.align='center', fig.width=7.5, fig.height=2.25----
 protCod <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				     start = 20400000, end = 21400000,
-				     filter = GeneBiotypeFilter("protein_coding"))
+                                     start = 20400000, end = 21400000,
+                                     filter = GeneBiotypeFilter("protein_coding"))
 lincs <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				   start = 20400000, end = 21400000,
-				   filter = GeneBiotypeFilter("lincRNA"))
+                                   start = 20400000, end = 21400000,
+                                   filter = GeneBiotypeFilter("lincRNA"))
 
 plotTracks(list(gat, GeneRegionTrack(protCod, name = "protein coding"),
-		GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
+                GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
 
 ## At last we change the seqlevels style again to Ensembl
 seqlevelsStyle(edb) <- "Ensembl"
+ 
 
 ## ----pplot-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4----
 library(ggbio)
 
 ## Create a plot for all transcripts of the gene SKA2
 autoplot(edb, ~ genename == "SKA2")
+ 
 
 ## ----pplot-plot-2, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4----
 ## Get the chromosomal region in which the gene is encoded
 ska2 <- genes(edb, filter = ~ genename == "SKA2")
 strand(ska2) <- "*"
 autoplot(edb, GRangesFilter(ska2), names.expr = "gene_name")
+ 
 
 ## ----AnnotationDbi, message = FALSE----------------------------------------
 library(EnsDb.Hsapiens.v75)
@@ -305,6 +352,7 @@ length(gids)
 ## Get all gene names for genes encoded on chromosome Y.
 gnames <- keys(edb, keytype = "GENENAME", filter = SeqNameFilter("Y"))
 head(gnames)
+ 
 
 ## ----select, message = FALSE, warning=FALSE--------------------------------
 ## Use the /standard/ way to fetch data.
@@ -313,8 +361,9 @@ select(edb, keys = c("BCL2", "BCL2L11"), keytype = "GENENAME",
 
 ## Use the filtering system of ensembldb
 select(edb, keys = ~ genename %in% c("BCL2", "BCL2L11") &
-		tx_biotype == "protein_coding",
+                tx_biotype == "protein_coding",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))
+ 
 
 ## ----mapIds, message = FALSE-----------------------------------------------
 ## Use the default method, which just returns the first value for multi mappings.
@@ -326,8 +375,9 @@ mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID", keytype = "GENENAME",
 
 ## And, just like before, we can use filters to map only to protein coding transcripts.
 mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
-			TxBiotypeFilter("protein_coding")), column = "TXID",
+                        TxBiotypeFilter("protein_coding")), column = "TXID",
        multiVals = "list")
+ 
 
 ## ----AnnotationHub-query, message = FALSE, eval = use_network--------------
 #  library(AnnotationHub)
@@ -336,17 +386,20 @@ mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
 #  
 #  ## Query for all available EnsDb databases
 #  query(ah, "EnsDb")
+#  
 
 ## ----AnnotationHub-query-2, message = FALSE, eval = use_network------------
 #  ahDb <- query(ah, pattern = c("Xiphophorus Maculatus", "EnsDb", 87))
 #  ## What have we got
 #  ahDb
+#  
 
 ## ----AnnotationHub-fetch, message = FALSE, eval = FALSE--------------------
 #  ahEdb <- ahDb[[1]]
 #  
#  ## retrieve all genes
 #  gns <- genes(ahEdb)
+#  
 
 ## ----edb-from-ensembl, message = FALSE, eval = FALSE-----------------------
 #  library(ensembldb)
@@ -363,8 +416,9 @@ mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
 #  
 #  ## and finally we can generate the package
 #  makeEnsembldbPackage(ensdb = DBFile, version = "0.99.12",
-#  		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-#  		     author = "J Rainer")
+#                       maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+#                       author = "J Rainer")
+#  
 
 ## ----gtf-gff-edb, message = FALSE, eval = FALSE----------------------------
 #  ## Load the AnnotationHub data.
@@ -393,6 +447,7 @@ mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
 #  
 #  ## Alternatively, look up and retrieve the toplevel DNA sequence manually.
 #  Dna <- ah[["AH22042"]]
+#  
 
 ## ----EnsDb-from-Y-GRanges, message = FALSE, eval = use_network-------------
 #  ## Generate a sqlite database from a GRanges object specifying
@@ -407,6 +462,7 @@ mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
 #  ## Load the database
 #  edb <- EnsDb(DB)
 #  edb
+#  
 
 ## ----EnsDb-from-GTF, message = FALSE, eval = FALSE-------------------------
 #  library(ensembldb)
@@ -423,6 +479,7 @@ mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
 #  ## alternatively, build the annotation package
 #  ## and finally we can generate the package
 #  makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
-#  		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-#  		     author = "J Rainer")
+#                       maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+#                       author = "J Rainer")
+#  
 
diff --git a/inst/doc/ensembldb.Rmd b/inst/doc/ensembldb.Rmd
index 7bbf10c..b636e21 100644
--- a/inst/doc/ensembldb.Rmd
+++ b/inst/doc/ensembldb.Rmd
@@ -4,13 +4,13 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Generating and using Ensembl based annotation packages}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
-  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz}
+  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz,magrittr}
 ---
 
 
@@ -25,7 +25,7 @@ database, the `ensembldb` package provides also a filter framework allowing to
 retrieve annotations for specific entries like genes encoded on a chromosome
 region or transcript models of lincRNA genes. From version 1.7 on, `EnsDb`
 databases created by the `ensembldb` package contain also protein annotation data
-(see Section [11](#org35014ed) for the database layout and an overview of
+(see Section [11](#org5bd9a97) for the database layout and an overview of
 available attributes/columns). For more information on the use of the protein
 annotations refer to the *proteins* vignette.
 
@@ -42,7 +42,7 @@ In the example below we load an Ensembl based annotation package for Homo
 sapiens, Ensembl version 75. The `EnsDb` object providing access to the underlying
 SQLite database is bound to the variable name `EnsDb.Hsapiens.v75`.
 
-```{r load-libs, warning=FALSE, message=FALSE}
+```{r  load-libs, warning=FALSE, message=FALSE }
 library(EnsDb.Hsapiens.v75)
 
 ## Making a "short cut"
@@ -52,13 +52,15 @@ edb
 
 ## For what organism was the database generated?
 organism(edb)
+ 
 ```
 
-```{r no-network, echo = FALSE, results = "hide"}
+```{r  no-network, echo = FALSE, results = "hide" }
 ## Disable code chunks that require network connection - conditionally
 ## disable this on Windows only. This is to avoid TIMEOUT errors on the
 ## Bioconductor Windows build machine (issue #47).
 use_network <- FALSE
+ 
 ```
 
 
@@ -68,13 +70,19 @@ One of the strengths of the `ensembldb` package and the related `EnsDb` database
 its implementation of a filter framework that enables efficient extraction of
 data sub-sets from the databases. The `ensembldb` package supports most of the
 filters defined in the `AnnotationFilter` Bioconductor package and defines some
-additional filters specific to the data stored in `EnsDb` databases. The
-`supportedFilters` method can be used to get an overview over all supported filter
-classes, each of them (except the `GRangesFilter`) working on a single
+additional filters specific to the data stored in `EnsDb` databases. Filters can
+be passed directly to all methods extracting data from an `EnsDb` (such as `genes`,
+`transcripts` or `exons`). Alternatively, the `addFilter` or `filter` functions
+can be used to add a filter directly to an `EnsDb`; that filter is then applied
+to all queries on that object.
+
+The `supportedFilters` method can be used to get an overview of all supported
+filter classes, each of them (except the `GRangesFilter`) working on a single
 column/field in the database.
 
-```{r filters}
+```{r  filters }
 supportedFilters(edb)
+ 
 ```
 
 These filters can be divided into 3 main filter types:
@@ -141,6 +149,11 @@ filters are also available:
 These can however only be used on `EnsDb` databases that provide protein
 annotations, i.e. for which a call to `hasProteinData` returns `TRUE`.
 
+`EnsDb` databases for more recent Ensembl versions (starting from Ensembl 87)
+also provide evidence levels for individual transcripts in the `tx_support_level`
+database column. Such databases also support a `TxSupportLevelFilter` filter to
+filter on this column.
+
 A simple use case for the filter framework would be to get all transcripts for
 the gene *BCL2L11*. To this end we specify a `GenenameFilter` with the value
 *BCL2L11*. As a result we get a `GRanges` object with `start`, `end`, `strand` and `seqname`
@@ -150,7 +163,7 @@ columns. Alternatively, by setting `return.type` to "DataFrame", or "data.frame"
 the method would return a `DataFrame` or `data.frame` object instead of the default
 `GRanges`.
 
-```{r transcripts}
+```{r  transcripts }
 Tx <- transcripts(edb, filter = list(GenenameFilter("BCL2L11")))
 
 Tx
@@ -160,6 +173,7 @@ head(start(Tx))
 
 ## or extract the biotype with
 head(Tx$tx_biotype)
+ 
 ```
 
 The parameter `columns` of the extractor methods (such as `exons`, `genes` or
@@ -175,26 +189,60 @@ was used, the column `gene_name` is also returned). Setting
 specified by the `columns` parameter are retrieved.
 
 Instead of passing a filter *object* to the method it is also possible to provide
-a filter *expression* written as a `formula`.
+a filter *expression* written as a `formula`. The `formula` has to be written in the
+form `~ <field> <condition> <value>` with `<field>` being the field (database
+column) in the database, `<condition>` the condition for the filter object and
+`<value>` its value. Use the `supportedFilters` method to get the field names
+corresponding to each filter class.
 
-```{r transcripts-filter-expression}
+```{r  transcripts-filter-expression }
 ## Use a filter expression to perform the filtering.
 transcripts(edb, filter = ~ genename == "ZBTB16")
+ 
 ```
 
 Filter expressions have to be written as a formula (i.e. starting with a `~`) in
 the form *column name* followed by the logical condition.
 
+Alternatively, `EnsDb` objects can be filtered directly using the `filter`
+function. In the example below we use the `filter` function to filter the `EnsDb`
+object and pass that filtered database to the `transcripts` method using the `%>%`
+operator from the `magrittr` package.
+
+```{r  transcripts-filter }
+library(magrittr)
+
+filter(edb, ~ symbol == "BCL2" & tx_biotype != "protein_coding") %>% transcripts
+ 
+```
+
+Adding a filter to an `EnsDb` enables this filter (globally) on all subsequent
+queries on that object. We could thus filter an `EnsDb` to (virtually) contain
+only features encoded on chromosome Y.
+
+```{r  filter-Y }
+edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+## All subsequent filters on that EnsDb will only work on features encoded on
+## chromosome Y
+genes(edb_y)
+
+## Get all lincRNAs on chromosome Y
+genes(edb_y, filter = ~ gene_biotype == "lincRNA")
+ 
+```
+
 To get an overview of database tables and available columns the function
 `listTables` can be used. The method `listColumns` on the other hand lists columns
 for the specified database table.
 
-```{r list-columns}
+```{r  list-columns }
 ## list all database tables along with their columns
 listTables(edb)
 
 ## list columns from a specific table
 listColumns(edb, "tx")
+ 
 ```
 
 Thus, we could retrieve all transcripts of the biotype *nonsense\_mediated\_decay*
@@ -204,22 +252,24 @@ the name of the gene for each transcript. Note that we are changing here the
 `return.type` to `DataFrame`, so the method will return a `DataFrame` with the
 results instead of the default `GRanges`.
 
-```{r transcripts-example2}
+```{r  transcripts-example2 }
 Tx <- transcripts(edb,
-		  columns = c(listColumns(edb , "tx"), "gene_name"),
-		  filter = TxBiotypeFilter("nonsense_mediated_decay"),
-		  return.type = "DataFrame")
+                  columns = c(listColumns(edb , "tx"), "gene_name"),
+                  filter = TxBiotypeFilter("nonsense_mediated_decay"),
+                  return.type = "DataFrame")
 nrow(Tx)
 Tx
+ 
 ```
 
 For protein coding transcripts, we can also specifically extract their coding
 region. In the example below we extract the CDS for all transcripts encoded on
 chromosome Y.
 
-```{r cdsBy}
+```{r  cdsBy }
 yCds <- cdsBy(edb, filter = SeqNameFilter("Y"))
 yCds
+ 
 ```
 
 Using a `GRangesFilter` we can retrieve all features from the database that are
@@ -228,10 +278,10 @@ below we query all genes that are partially overlapping with a small region on
 chromosome 11. The filter restricts to all genes for which either an exon or an
 intron is partially overlapping with the region.
 
-```{r genes-GRangesFilter}
+```{r  genes-GRangesFilter }
 ## Define the filter
 grf <- GRangesFilter(GRanges("11", ranges = IRanges(114000000, 114000050),
-			     strand = "+"), type = "any")
+                             strand = "+"), type = "any")
 
 ## Query genes:
 gn <- genes(edb, filter = grf)
@@ -239,9 +289,10 @@ gn
 
 ## Next we retrieve all transcripts for that gene so that we can plot them.
 txs <- transcripts(edb, filter = GenenameFilter(gn$gene_name))
+ 
 ```
 
-```{r tx-for-zbtb16, message=FALSE, fig.align='center', fig.width=7.5, fig.height=5}
+```{r  tx-for-zbtb16, message=FALSE, fig.align='center', fig.width=7.5, fig.height=5 }
 plot(3, 3, pch = NA, xlim = c(start(gn), end(gn)), ylim = c(0, length(txs)),
      yaxt = "n", ylab = "")
 ## Highlight the GRangesFilter region
@@ -250,9 +301,10 @@ rect(xleft = start(grf), xright = end(grf), ybottom = 0, ytop = length(txs),
 for(i in 1:length(txs)) {
     current <- txs[i]
     rect(xleft = start(current), xright = end(current), ybottom = i-0.975,
-	 ytop = i-0.125, border = "grey")
+         ytop = i-0.125, border = "grey")
     text(start(current), y = i-0.5, pos = 4, cex = 0.75, labels = current$tx_id)
 }
+ 
 ```
 
 As we can see, 4 transcripts of the gene ZBTB16 are also overlapping the
@@ -260,8 +312,9 @@ region. Below we fetch these 4 transcripts. Note, that a call to `exons` will
 not return any features from the database, as no exon is overlapping with the
 region.
 
-```{r transcripts-GRangesFilter}
+```{r  transcripts-GRangesFilter }
 transcripts(edb, filter = grf)
+ 
 ```
 
 The `GRangesFilter` supports also `GRanges` defining multiple regions and a
@@ -275,20 +328,21 @@ to further fine-tune the query.
 The functions `listGenebiotypes` and `listTxbiotypes` can be used to get an overview
 of allowed/available gene and transcript biotypes.
 
-```{r biotypes}
+```{r  biotypes }
 ## Get all gene biotypes from the database. The GeneBiotypeFilter
 ## allows to filter on these values.
 listGenebiotypes(edb)
 
 ## Get all transcript biotypes from the database.
 listTxbiotypes(edb)
+ 
 ```
 
 Data can be fetched in an analogous way using the `exons` and `genes`
 methods. In the example below we retrieve `gene_name`, `entrezid` and the
 `gene_biotype` of all genes in the database whose names start with "BCL2".
 
-```{r genes-BCL2}
+```{r  genes-BCL2 }
 ## We're going to fetch all genes whose names start with BCL. To this end
 ## we define a GenenameFilter with partial matching, i.e. condition "like"
 ## and a % for any character/string.
@@ -298,6 +352,7 @@ BCLs <- genes(edb,
 	      return.type = "DataFrame")
 nrow(BCLs)
 BCLs
+ 
 ```
 
 Sometimes it might be useful to know the length of genes or transcripts
@@ -308,17 +363,18 @@ these chromosomes. For the first query we combine two `AnnotationFilter` objects
 using an `AnnotationFilterList` object, in the second we define the query using a
 filter expression.
 
-```{r example-AnnotationFilterList}
+```{r  example-AnnotationFilterList }
 ## determine the average length of transcripts of snRNA, snoRNA and rRNA genes
 ## encoded on chromosomes X and Y.
 mean(lengthOf(edb, of = "tx", filter = AnnotationFilterList(
-				  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
-				  SeqNameFilter(c("X", "Y")))))
+                                  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
+                                  SeqNameFilter(c("X", "Y")))))
 
 ## determine the average length of transcripts of protein coding genes encoded
 ## on the same chromosomes.
 mean(lengthOf(edb, of = "tx", filter = ~ gene_biotype == "protein_coding" &
-				  seq_name %in% c("X", "Y")))
+                                  seq_name %in% c("X", "Y")))
+ 
 ```
 
 Not unexpectedly, transcripts of protein coding genes are longer than those of
@@ -327,12 +383,13 @@ snRNA, snoRNA or rRNA genes.
 Finally, we extract the first two exons of each transcript model from the
 database.
 
-```{r example-first-two-exons}
+```{r  example-first-two-exons }
 ## Extract all exons 1 and (if present) 2 for all genes encoded on the
 ## Y chromosome
 exons(edb, columns = c("tx_id", "exon_idx"),
       filter = list(SeqNameFilter("Y"),
-		    ExonRankFilter(3, condition = "<")))
+                    ExonRankFilter(3, condition = "<")))
+ 
 ```
 
 
@@ -352,9 +409,10 @@ CDS.
 A simple use case is to retrieve all genes encoded on chromosomes X and Y from
 the database.
 
-```{r transcriptsBy-X-Y}
+```{r  transcriptsBy-X-Y }
 TxByGns <- transcriptsBy(edb, by = "gene", filter = SeqNameFilter(c("X", "Y")))
 TxByGns
+ 
 ```
 
 Since Ensembl contains also definitions of genes that are on chromosome variants
@@ -367,12 +425,13 @@ restrict to Ensembl genes only, as also *LRG* (Locus Reference Genomic)
 genes<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup> are defined in the database, which are partially redundant with
 Ensembl genes.
 
-```{r exonsBy-RNAseq, message = FALSE, eval = FALSE}
+```{r  exonsBy-RNAseq, message = FALSE, eval = FALSE }
 ## will just get exons for all genes on chromosomes 1 to 22, X and Y.
 ## Note: want to get rid of the "LRG" genes!!!
 EnsGenes <- exonsBy(edb, by = "gene", filter = AnnotationFilterList(
-					  SeqNameFilter(c(1:22, "X", "Y")),
-					  GeneIdFilter("ENSG", "startsWith")))
+                                          SeqNameFilter(c(1:22, "X", "Y")),
+                                          GeneIdFilter("ENSG", "startsWith")))
+ 
 ```
 
 The code above returns a `GRangesList` that can be used directly as an input for
@@ -382,9 +441,10 @@ Alternatively, the above `GRangesList` can be transformed to a `data.frame` in
 *SAF* format that can be used as an input to the `featureCounts` function of the
 `Rsubread` package <sup><a id="fnr.4" class="footref" href="#fn.4">4</a></sup>.
 
-```{r toSAF-RNAseq, message = FALSE, eval=FALSE}
+```{r  toSAF-RNAseq, message = FALSE, eval=FALSE }
 ## Transforming the GRangesList into a data.frame in SAF format
 EnsGenes.SAF <- toSAF(EnsGenes)
+ 
 ```
 
 Note that the ID by which the `GRangesList` is split is used in the SAF
@@ -396,11 +456,12 @@ In addition, the `disjointExons` function (similar to the one defined in
 `GenomicFeatures`) can be used to generate a `GRanges` of non-overlapping exon
 parts which can be used in the `DEXSeq` package.
 
-```{r disjointExons, message = FALSE, eval=FALSE}
+```{r  disjointExons, message = FALSE, eval=FALSE }
 ## Create a GRanges of non-overlapping exon parts.
 DJE <- disjointExons(edb, filter = AnnotationFilterList(
 			      SeqNameFilter(c(1:22, "X", "Y")),
 			      GeneIdFilter("ENSG", "startsWith")))
+ 
 ```
 
 
@@ -425,7 +486,7 @@ the package, subset to genes encoded on sequences available in the `FaFile` and
 extract all of their sequences. Note: these sequences represent the sequence
 between the chromosomal start and end coordinates of the gene.
 
-```{r transcript-sequence-AnnotationHub, message = FALSE, eval = FALSE}
+```{r  transcript-sequence-AnnotationHub, message = FALSE, eval = FALSE }
 library(EnsDb.Hsapiens.v75)
 library(Rsamtools)
 edb <- EnsDb.Hsapiens.v75
@@ -443,13 +504,14 @@ genes <- genes[seqnames(genes) %in% seqnames(seqinfo(Dna))]
 ## Get the gene sequences, i.e. the sequence including the sequence of
 ## all of the gene's exons and introns.
 geneSeqs <- getSeq(Dna, genes)
+ 
 ```
 
 To retrieve the (exonic) sequence of transcripts (i.e. without introns) we can
 use the `extractTranscriptSeqs` method defined in the `GenomicFeatures` package
 directly on the `EnsDb` object, optionally using a filter to restrict the query.
 
-```{r transcript-sequence-extractTranscriptSeqs, message = FALSE, eval = FALSE}
+```{r  transcript-sequence-extractTranscriptSeqs, message = FALSE, eval = FALSE }
 ## get all exons of all transcripts encoded on chromosome Y
 yTx <- exonsBy(edb, filter = SeqNameFilter("Y"))
 
@@ -465,6 +527,7 @@ yTx <- extractTranscriptSeqs(Dna, edb, filter = SeqNameFilter("Y"))
 ## of all transcripts on the Y chromosome.
 cdsY <- cdsBy(edb, filter = SeqNameFilter("Y"))
 extractTranscriptSeqs(Dna, cdsY)
+ 
 ```
 
 Note: in the next section we describe how transcript sequences can be retrieved
@@ -485,7 +548,7 @@ UCSC, NCBI and Ensembl chromosome names for the *main* chromosomes).
 
 In the example below we change the seqnames style to UCSC.
 
-```{r seqlevelsStyle, message = FALSE}
+```{r  seqlevelsStyle, message = FALSE }
 ## Change the seqlevels style from Ensembl (default) to UCSC:
 seqlevelsStyle(edb) <- "UCSC"
 
@@ -493,6 +556,7 @@ seqlevelsStyle(edb) <- "UCSC"
 genesY <- genes(edb, filter = ~ seq_name == "chrY")
 ## The seqlevels of the returned GRanges are also in UCSC style
 seqlevels(genesY)
+ 
 ```
 
 Note that in most instances no mapping is available for sequences not
@@ -504,7 +568,7 @@ ones from Ensembl) are returned. With `ensembldb.seqnameNotFound` "MISSING" each
 time a seqname can not be found an error is thrown. For all other cases
 (e.g. `ensembldb.seqnameNotFound = NA`) the value of the option is returned.
 
-```{r seqlevelsStyle-2, message = FALSE}
+```{r  seqlevelsStyle-2, message = FALSE }
 seqlevelsStyle(edb) <- "UCSC"
 
 ## Getting the default option:
@@ -520,6 +584,7 @@ seqlevels(edb)[1:30]
 
 ## Resetting the option.
 options(ensembldb.seqnameNotFound = "ORIGINAL")
+ 
 ```
 
 Next we retrieve transcript sequences from genes encoded on chromosome Y using
@@ -528,7 +593,7 @@ the `BSGenome` package for the human genome from UCSC. The specified version
 while we changed the style of the seqnames to UCSC we did not change the naming
 of the genome release.
 
-```{r extractTranscriptSeqs-BSGenome, warning = FALSE, message = FALSE}
+```{r  extractTranscriptSeqs-BSGenome, warning = FALSE, message = FALSE }
 library(BSgenome.Hsapiens.UCSC.hg19)
 bsg <- BSgenome.Hsapiens.UCSC.hg19
 
@@ -546,14 +611,16 @@ yTxSeqs
 ## Extract just the CDS
 Test <- cdsBy(edb, "tx", filter = SeqNameFilter("chrY"))
 yTxCds <- extractTranscriptSeqs(bsg, cdsBy(edb, "tx",
-					   filter = SeqNameFilter("chrY")))
+                                           filter = SeqNameFilter("chrY")))
 yTxCds
+ 
 ```
 
 Finally, we change the seqname style back to the default value `"Ensembl"`.
 
-```{r seqlevelsStyle-restore}
+```{r  seqlevelsStyle-restore }
 seqlevelsStyle(edb) <- "Ensembl"
+ 
 ```
 
 
@@ -584,7 +651,7 @@ not necessary if we just want to retrieve gene models from an `EnsDb` object, as
 the `ensembldb` package internally checks the `ucscChromosomeNames` option and,
 depending on that, maps Ensembl chromosome names to UCSC chromosome names.
 
-```{r gviz-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=2.3}
+```{r  gviz-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=2.3 }
 ## Loading the Gviz library
 library(Gviz)
 library(EnsDb.Hsapiens.v75)
@@ -593,7 +660,7 @@ edb <- EnsDb.Hsapiens.v75
 ## Retrieving a Gviz compatible GRanges object with all genes
 ## encoded on chromosome Y.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "Y",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 
@@ -604,6 +671,7 @@ options(ucscChromosomeNames = FALSE)
 plotTracks(list(gat, GeneRegionTrack(gr)))
 
 options(ucscChromosomeNames = TRUE)
+ 
 ```
 
 Above we had to change the option `ucscChromosomeNames` to `FALSE` in order to
@@ -612,55 +680,59 @@ change the `seqnamesStyle` of the `EnsDb` object to `UCSC`. Note that we have to
 use now also chromosome names in the *UCSC style* in the `SeqNameFilter`
 (i.e. "chrY" instead of `Y`).
 
-```{r message=FALSE}
+```{r  message=FALSE }
 seqlevelsStyle(edb) <- "UCSC"
 ## Retrieving the GRanges objects with seqnames corresponding to UCSC chromosome names.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 seqnames(gr)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 plotTracks(list(gat, GeneRegionTrack(gr)))
+ 
 ```
 
 We can also use the filters from the `ensembldb` package to further refine what
 transcripts are fetched, like in the example below, in which we create two
 different gene region tracks, one for protein coding genes and one for lincRNAs.
 
-```{r gviz-separate-tracks, message=FALSE, warning=FALSE, fig.align='center', fig.width=7.5, fig.height=2.25}
+```{r  gviz-separate-tracks, message=FALSE, warning=FALSE, fig.align='center', fig.width=7.5, fig.height=2.25 }
 protCod <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				     start = 20400000, end = 21400000,
-				     filter = GeneBiotypeFilter("protein_coding"))
+                                     start = 20400000, end = 21400000,
+                                     filter = GeneBiotypeFilter("protein_coding"))
 lincs <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				   start = 20400000, end = 21400000,
-				   filter = GeneBiotypeFilter("lincRNA"))
+                                   start = 20400000, end = 21400000,
+                                   filter = GeneBiotypeFilter("lincRNA"))
 
 plotTracks(list(gat, GeneRegionTrack(protCod, name = "protein coding"),
-		GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
+                GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
 
 ## Finally, we change the seqlevels style back to Ensembl
 seqlevelsStyle(edb) <- "Ensembl"
+ 
 ```
 
 Alternatively, we can also use `ggbio` for plotting. To `autoplot` we can directly
 pass the `EnsDb` object along with optional filters (or as in the example below a
 filter expression as a `formula`).
 
-```{r pplot-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4}
+```{r  pplot-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4 }
 library(ggbio)
 
 ## Create a plot for all transcripts of the gene SKA2
 autoplot(edb, ~ genename == "SKA2")
+ 
 ```
 
 To plot the genomic region and plot genes from both strands we can use a
 `GRangesFilter`.
 
-```{r pplot-plot-2, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4}
+```{r  pplot-plot-2, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4 }
 ## Get the chromosomal region in which the gene is encoded
 ska2 <- genes(edb, filter = ~ genename == "SKA2")
 strand(ska2) <- "*"
 autoplot(edb, GRangesFilter(ska2), names.expr = "gene_name")
+ 
 ```
 
 
@@ -676,7 +748,7 @@ In the example below we first evaluate all the available columns and keytypes in
 the database and then extract the gene names for all genes encoded on chromosome
 X.
 
-```{r AnnotationDbi, message = FALSE}
+```{r  AnnotationDbi, message = FALSE }
 library(EnsDb.Hsapiens.v75)
 edb <- EnsDb.Hsapiens.v75
 
@@ -699,6 +771,7 @@ length(gids)
 ## Get all gene names for genes encoded on chromosome Y.
 gnames <- keys(edb, keytype = "GENENAME", filter = SeqNameFilter("Y"))
 head(gnames)
+ 
 ```
 
 In the next example we retrieve specific information from the database using the
@@ -707,22 +780,23 @@ In the next example we retrieve specific information from the database using the
 we employ the filtering system to perform a more fine-grained query to fetch
 only the protein coding transcripts for these genes.
 
-```{r select, message = FALSE, warning=FALSE}
+```{r  select, message = FALSE, warning=FALSE }
 ## Use the /standard/ way to fetch data.
 select(edb, keys = c("BCL2", "BCL2L11"), keytype = "GENENAME",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))
 
 ## Use the filtering system of ensembldb
 select(edb, keys = ~ genename %in% c("BCL2", "BCL2L11") &
-		tx_biotype == "protein_coding",
+                tx_biotype == "protein_coding",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))
+ 
 ```
 
 Finally, we use the `mapIds` method to establish a mapping between ids and
 values. In the example below we fetch transcript ids for the two genes from the
 example above.
 
-```{r mapIds, message = FALSE}
+```{r  mapIds, message = FALSE }
 ## Use the default method, which just returns the first value for multi mappings.
 mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID", keytype = "GENENAME")
 
@@ -732,8 +806,9 @@ mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID", keytype = "GENENAME",
 
 ## And, just like before, we can use filters to map only to protein coding transcripts.
 mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
-			TxBiotypeFilter("protein_coding")), column = "TXID",
+                        TxBiotypeFilter("protein_coding")), column = "TXID",
        multiVals = "list")
+ 
 ```
 
 Note that, if the filters are used, the ordering of the result does no longer
@@ -790,30 +865,33 @@ annotations from Ensembl version 86.
 Since Bioconductor version 3.5 `EnsDb` databases can also be retrieved directly
 from `AnnotationHub`.
 
-```{r AnnotationHub-query, message = FALSE, eval = use_network}
+```{r  AnnotationHub-query, message = FALSE, eval = use_network }
 library(AnnotationHub)
 ## Load the annotation resource.
 ah <- AnnotationHub()
 
 ## Query for all available EnsDb databases
 query(ah, "EnsDb")
+ 
 ```
 
 We can simply fetch one of the databases.
 
-```{r AnnotationHub-query-2, message = FALSE, eval = use_network}
+```{r  AnnotationHub-query-2, message = FALSE, eval = use_network }
 ahDb <- query(ah, pattern = c("Xiphophorus Maculatus", "EnsDb", 87))
 ## What have we got
 ahDb
+ 
 ```
 
 Fetch the `EnsDb` and use it.
 
-```{r AnnotationHub-fetch, message = FALSE, eval = FALSE}
+```{r  AnnotationHub-fetch, message = FALSE, eval = FALSE }
 ahEdb <- ahDb[[1]]
 
 ## retrieve all genes
 gns <- genes(ahEdb)
+ 
 ```
 
 We could even make an annotation package from this `EnsDb` object using the
@@ -835,7 +913,7 @@ the Ensembl core databases. The `makeEnsembldbPackage` function is then used to
 create an annotation package from this `EnsDb` containing all human genes for
 Ensembl version 75.
 
-```{r edb-from-ensembl, message = FALSE, eval = FALSE}
+```{r  edb-from-ensembl, message = FALSE, eval = FALSE }
 library(ensembldb)
 
 ## get all human gene/transcript/exon annotations from Ensembl (75)
@@ -850,8 +928,9 @@ DBFile <- makeEnsemblSQLiteFromTables()
 
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DBFile, version = "0.99.12",
-		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-		     author = "J Rainer")
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")
+ 
 ```
 
 The generated package can then be built using `R CMD build EnsDb.Hsapiens.v75`
@@ -887,7 +966,7 @@ then use the `getGenomeFaFile` method on the `EnsDb` to directly look up and
 retrieve the correct or best matching `FaFile` with the genomic DNA sequence. At
 last we retrieve the sequences of all exons using the `getSeq` method.
 
-```{r gtf-gff-edb, message = FALSE, eval = FALSE}
+```{r  gtf-gff-edb, message = FALSE, eval = FALSE }
 ## Load the AnnotationHub data.
 library(AnnotationHub)
 ah <- AnnotationHub()
@@ -914,13 +993,14 @@ exonSeq <- getSeq(Dna, exons)
 
 ## Alternatively, look up and retrieve the toplevel DNA sequence manually.
 Dna <- ah[["AH22042"]]
+ 
 ```
 
 In the example below we load a `GRanges` containing gene definitions for genes
 encoded on chromosome Y and generate an `EnsDb` SQLite database from that
 information.
 
-```{r EnsDb-from-Y-GRanges, message = FALSE, eval = use_network}
+```{r  EnsDb-from-Y-GRanges, message = FALSE, eval = use_network }
 ## Generate a sqlite database from a GRanges object specifying
 ## genes encoded on chromosome Y
 load(system.file("YGRanges.RData", package = "ensembldb"))
@@ -933,6 +1013,7 @@ DB <- ensDbFromGRanges(Y, path = tempdir(), version = 75,
 ## Load the database
 edb <- EnsDb(DB)
 edb
+ 
 ```
 
 Alternatively we can build the annotation database using the `ensDbFromGtf`
@@ -947,7 +1028,7 @@ length information automatically from Ensembl.
 
 Below we create the annotation from a gtf file that we fetch directly from Ensembl.
 
-```{r EnsDb-from-GTF, message = FALSE, eval = FALSE}
+```{r  EnsDb-from-GTF, message = FALSE, eval = FALSE }
 library(ensembldb)
 
 ## the GTF file can be downloaded from
@@ -962,27 +1043,22 @@ EDB <- EnsDb(DB)
 ## alternatively, build the annotation package
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
-		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-		     author = "J Rainer")
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")
+ 
 ```
 
 
-# Database layout<a id="org35014ed"></a>
+# Database layout<a id="org5bd9a97"></a>
 
 The database consists of the following tables and attributes (the layout is also
-shown in Figure [159](#org6a42233)). Note that the protein-specific annotations
+shown in Figure [165](#orgfd622d5)). Note that the protein-specific annotations
 might not be available in all `EnsDb` databases (e.g. those created with
 `ensembldb` version < 1.7 or created from GTF or GFF files).
 
 -   **gene**: all gene specific annotations.
     -   `gene_id`: the Ensembl ID of the gene.
     -   `gene_name`: the name (symbol) of the gene.
-    -   `entrezid`: the NCBI Entrezgene ID(s) of the gene. Note that this can be a
-        `;` separated list of IDs for genes that are mapped to more than one
-        Entrezgene.
     -   `gene_biotype`: the biotype of the gene.
     -   `gene_seq_start`: the start coordinate of the gene on the sequence (usually
         a chromosome).
@@ -990,6 +1066,7 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
     -   `seq_name`: the name of the sequence (usually the chromosome name).
     -   `seq_strand`: the strand on which the gene is encoded.
     -   `seq_coord_system`: the coordinate system of the sequence.
+    -   `description`: the description of the gene.
 
 -   **entrezgene**: mapping of Ensembl genes to NCBI Entrezgene identifiers. Note that
     this mapping can be a one-to-many mapping.
@@ -1000,6 +1077,7 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
     is available in this database column, all methods to retrieve data from the
     database support also this column. The returned values are however the ID of
     the transcripts.
+    
     -   `tx_id`: the Ensembl transcript ID.
     -   `tx_biotype`: the biotype of the transcript.
     -   `tx_seq_start`: the start coordinate of the transcript.
@@ -1008,6 +1086,10 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
         transcript (NULL for non-coding transcripts).
     -   `tx_cds_seq_end`: the end coordinate of the coding region of the transcript.
     -   `gene_id`: the gene to which the transcript belongs.
+    
+    `EnsDb` databases for more recent Ensembl releases have also a column
+    `tx_support_level` providing the evidence level for a transcript (1 high
+    evidence, 5 low evidence, NA no evidence calculated).
 
 -   **exon**: all exon related annotation.
     -   `exon_id`: the Ensembl exon ID.
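
Note on the `tx_support_level` additions in the vignette diff above: the new text
describes the `TxSupportLevelFilter` but gives no usage example. The following is
a minimal sketch of how it might be used, assuming the `EnsDb.Hsapiens.v75`
package used throughout the vignette is installed. Since the vignette states that
support levels are only available from Ensembl 87 on, the sketch first checks for
the column; the guard and the message text are illustrative assumptions and not
part of this commit.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v75)

edb <- EnsDb.Hsapiens.v75

## Check whether the database provides transcript support levels at all.
## According to the vignette the column exists only for Ensembl >= 87, so
## it is expected to be absent from the version 75 database.
if ("tx_support_level" %in% listColumns(edb, "tx")) {
    ## Restrict to transcripts with the highest evidence level (1).
    tx <- transcripts(edb, filter = TxSupportLevelFilter(1))
    print(tx)
} else {
    message("No tx_support_level column: TxSupportLevelFilter not usable here.")
}
```

With an `EnsDb` built for a more recent Ensembl release (for example one fetched
from `AnnotationHub`), the first branch would be expected to run instead.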
diff --git a/inst/doc/ensembldb.html b/inst/doc/ensembldb.html
index 8952374..ae933ae 100644
--- a/inst/doc/ensembldb.html
+++ b/inst/doc/ensembldb.html
@@ -11,6 +11,7 @@
 
 <meta name="author" content="Johannes Rainer" />
 
+<meta name="date" content="2017-10-30" />
 
 <title>Generating and using Ensembl based annotation packages</title>
 
@@ -68,7 +69,7 @@ h6 {
 }
 </style>
 
-<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0A%7D%0Abody%2C%20td%20%7B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B [...]
+<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B%0A%7D%0Adiv%23TOC%20%3E%20 [...]
 
 </head>
 
@@ -209,16 +210,16 @@ div.tocify {
 
 <h1 class="title toc-ignore">Generating and using Ensembl based annotation packages</h1>
 <p class="author-name">Johannes Rainer</p>
-<h4 class="date"><em>4 August 2017</em></h4>
+<h4 class="date"><em>30 October 2017</em></h4>
 <h4 class="package">Package</h4>
-<p>ensembldb 2.0.4</p>
+<p>ensembldb 2.2.0</p>
 
 </div>
 
 
 <div id="introduction" class="section level1">
 <h1><span class="header-section-number">1</span> Introduction</h1>
-<p>The <code>ensembldb</code> package provides functions to create and use transcript centric annotation databases/packages. The annotation for the databases are directly fetched from Ensembl <sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup> using their Perl API. The functionality and data is similar to that of the <code>TxDb</code> packages from the <code>GenomicFeatures</code> package, but, in addition to retrieve all gene/transcript models and annotations from the database,  [...]
+<p>The <code>ensembldb</code> package provides functions to create and use transcript centric annotation databases/packages. The annotation for the databases are directly fetched from Ensembl <sup><a id="fnr.1" class="footref" href="#fn.1">1</a></sup> using their Perl API. The functionality and data is similar to that of the <code>TxDb</code> packages from the <code>GenomicFeatures</code> package, but, in addition to retrieve all gene/transcript models and annotations from the database,  [...]
 <p>Another main goal of this package is to generate <em>versioned</em> annotation packages, i.e. annotation packages that are build for a specific Ensembl release, and are also named according to that (e.g. <code>EnsDb.Hsapiens.v75</code> for human gene definitions of the Ensembl code database version 75). This ensures reproducibility, as it allows to load annotations from a specific Ensembl release also if newer versions of annotation packages/releases are available. It also allows to l [...]
 <p>In the example below we load an Ensembl based annotation package for Homo sapiens, Ensembl version 75. The <code>EnsDb</code> object providing access to the underlying SQLite database is bound to the variable name <code>EnsDb.Hsapiens.v75</code>.</p>
 <pre class="r"><code>library(EnsDb.Hsapiens.v75)
@@ -233,13 +234,14 @@ edb</code></pre>
 ## |Type of Gene ID: Ensembl Gene ID
 ## |Supporting package: ensembldb
 ## |Db created by: ensembldb package from Bioconductor
-## |script_version: 0.2.3
-## |Creation time: Tue Nov 15 23:35:19 2016
+## |script_version: 0.3.0
+## |Creation time: Thu May 18 09:15:45 2017
 ## |ensembl_version: 75
 ## |ensembl_host: localhost
 ## |Organism: homo_sapiens
+## |taxonomy_id: 9606
 ## |genome_build: GRCh37
-## |DBSCHEMAVERSION: 1.0
+## |DBSCHEMAVERSION: 2.0
 ## | No. of genes: 64102.
 ## | No. of transcripts: 215647.
 ## |Protein data available.</code></pre>
@@ -249,20 +251,34 @@ organism(edb)</code></pre>
 </div>
 <div id="using-ensembldb-annotation-packages-to-retrieve-specific-annotations" class="section level1">
 <h1><span class="header-section-number">2</span> Using <code>ensembldb</code> annotation packages to retrieve specific annotations</h1>
-<p>One of the strengths of the <code>ensembldb</code> package and the related <code>EnsDb</code> databases is its implementation of a filter framework that enables to efficiently extract data sub-sets from the databases. The <code>ensembldb</code> package supports most of the filters defined in the <code>AnnotationFilter</code> Bioconductor package and defines some additional filters specific to the data stored in <code>EnsDb</code> databases. The <code>supportedFilters</code> method can [...]
+<p>One of the strengths of the <code>ensembldb</code> package and the related <code>EnsDb</code> databases is its implementation of a filter framework that enables to efficiently extract data sub-sets from the databases. The <code>ensembldb</code> package supports most of the filters defined in the <code>AnnotationFilter</code> Bioconductor package and defines some additional filters specific to the data stored in <code>EnsDb</code> databases. Filters can be passed directly to all method [...]
+<p>The <code>supportedFilters</code> method returns an overview of all supported filter classes. Each of them (except the <code>GRangesFilter</code>) works on a single column/field in the database.</p>
 <pre class="r"><code>supportedFilters(edb)</code></pre>
-<pre><code>##  [1] "EntrezFilter"             "ExonEndFilter"           
-##  [3] "ExonIdFilter"             "ExonRankFilter"          
-##  [5] "ExonStartFilter"          "GRangesFilter"           
-##  [7] "GeneBiotypeFilter"        "GeneEndFilter"           
-##  [9] "GeneIdFilter"             "GeneStartFilter"         
-## [11] "GenenameFilter"           "ProtDomIdFilter"         
-## [13] "ProteinIdFilter"          "SeqNameFilter"           
-## [15] "SeqStrandFilter"          "SymbolFilter"            
-## [17] "TxBiotypeFilter"          "TxEndFilter"             
-## [19] "TxIdFilter"               "TxNameFilter"            
-## [21] "TxStartFilter"            "UniprotDbFilter"         
-## [23] "UniprotFilter"            "UniprotMappingTypeFilter"</code></pre>
+<pre><code>##                      filter                field
+## 1              EntrezFilter               entrez
+## 2             ExonEndFilter             exon_end
+## 3              ExonIdFilter              exon_id
+## 4            ExonRankFilter            exon_rank
+## 5           ExonStartFilter           exon_start
+## 6             GRangesFilter                 <NA>
+## 7         GeneBiotypeFilter         gene_biotype
+## 8             GeneEndFilter             gene_end
+## 9              GeneIdFilter              gene_id
+## 10          GeneStartFilter           gene_start
+## 11           GenenameFilter             genename
+## 12          ProtDomIdFilter          prot_dom_id
+## 13          ProteinIdFilter           protein_id
+## 14            SeqNameFilter             seq_name
+## 15          SeqStrandFilter           seq_strand
+## 16             SymbolFilter               symbol
+## 17          TxBiotypeFilter           tx_biotype
+## 18              TxEndFilter               tx_end
+## 19               TxIdFilter                tx_id
+## 20             TxNameFilter              tx_name
+## 21            TxStartFilter             tx_start
+## 22          UniprotDbFilter           uniprot_db
+## 23            UniprotFilter              uniprot
+## 24 UniprotMappingTypeFilter uniprot_mapping_type</code></pre>
 <p>These filters can be divided into 3 main filter types:</p>
 <ul>
 <li><code>IntegerFilter</code>: filter classes extending this basic object can take a single numeric value as input and support the conditions <code>==</code>, <code>!=</code>, <code>></code>, <code><</code>, <code>>=</code> and <code><=</code>. All filters that work on chromosomal coordinates, such as the <code>GeneEndFilter</code>, extend <code>IntegerFilter</code>.</li>
@@ -300,6 +316,7 @@ organism(edb)</code></pre>
 <li><code>UniprotMappingTypeFilter</code>: filter by the mapping type of Ensembl protein IDs to Uniprot IDs.</li>
 </ul>
 <p>These can however only be used on <code>EnsDb</code> databases that provide protein annotations, i.e. for which a call to <code>hasProteinData</code> returns <code>TRUE</code>.</p>
+<p><code>EnsDb</code> databases for more recent Ensembl versions (starting from Ensembl 87) also provide evidence levels for individual transcripts in the <code>tx_support_level</code> database column. Such databases also support a <code>TxSupportLevelFilter</code> filter that uses this column for filtering.</p>
 <p>A simple use case for the filter framework would be to get all transcripts for the gene <em>BCL2L11</em>. To this end we specify a <code>GenenameFilter</code> with the value <em>BCL2L11</em>. As a result we get a <code>GRanges</code> object with <code>start</code>, <code>end</code>, <code>strand</code> and <code>seqname</code> being the start coordinate, end coordinate, chromosome name and strand for the respective transcripts. All additional annotations are available as metadata colu [...]
 <pre class="r"><code>Tx <- transcripts(edb, filter = list(GenenameFilter("BCL2L11")))
 
@@ -354,7 +371,7 @@ head(Tx$tx_biotype)</code></pre>
 <pre><code>## [1] "protein_coding" "protein_coding" "protein_coding" "protein_coding"
 ## [5] "protein_coding" "protein_coding"</code></pre>
 <p>The parameter <code>columns</code> of the extractor methods (such as <code>exons</code>, <code>genes</code> or <code>transcripts)</code> allows to specify which database attributes (columns) should be retrieved. The <code>exons</code> method returns by default all exon-related columns, the <code>transcripts</code> all columns from the transcript database table and the <code>genes</code> all from the gene table. Note however that in the example above we got also a column <code>gene_nam [...]
-<p>Instead of passing a filter <em>object</em> to the method it is also possible to provide a filter <em>expression</em> written as a <code>formula</code>.</p>
+<p>Instead of passing a filter <em>object</em> to the method it is also possible to provide a filter <em>expression</em> written as a <code>formula</code>. The <code>formula</code> has to be written in the form <code>~ <field> <condition> <value></code> with <code><field></code> being the field (database column) in the database, <code><condition></code> the condition for the filter object and <code><value></code> its value. Use the <code>supportedFilte [...]
 <pre class="r"><code>## Use a filter expression to perform the filtering.
 transcripts(edb, filter = ~ genename == "ZBTB16")</code></pre>
 <pre><code>## GRanges object with 9 ranges and 7 metadata columns:
@@ -394,14 +411,125 @@ transcripts(edb, filter = ~ genename == "ZBTB16")</code></pre>
 ##   -------
 ##   seqinfo: 1 sequence from GRCh37 genome</code></pre>
 <p>Filter expressions have to be written as a formula (i.e. starting with a <code>~</code>) in the form <em>column name</em> followed by the logical condition and the value.</p>
+<p>Alternatively, <code>EnsDb</code> objects can be filtered directly using the <code>filter</code> function. In the example below we use the <code>filter</code> function to filter the <code>EnsDb</code> object and pass the filtered database to the <code>transcripts</code> method using the <code>%>%</code> pipe operator from the <code>magrittr</code> package.</p>
+<pre class="r"><code>library(magrittr)</code></pre>
+<pre><code>## 
+## Attaching package: 'magrittr'</code></pre>
+<pre><code>## The following object is masked from 'package:AnnotationFilter':
+## 
+##     not</code></pre>
+<pre class="r"><code>filter(edb, ~ symbol == "BCL2" & tx_biotype != "protein_coding") %>% transcripts</code></pre>
+<pre><code>## GRanges object with 1 range and 6 metadata columns:
+##                   seqnames               ranges strand |           tx_id
+##                      <Rle>            <IRanges>  <Rle> |     <character>
+##   ENST00000590515       18 [60795445, 60829102]      - | ENST00000590515
+##                             tx_biotype tx_cds_seq_start tx_cds_seq_end
+##                            <character>        <integer>      <integer>
+##   ENST00000590515 processed_transcript             <NA>           <NA>
+##                           gene_id         tx_name
+##                       <character>     <character>
+##   ENST00000590515 ENSG00000171791 ENST00000590515
+##   -------
+##   seqinfo: 1 sequence from GRCh37 genome</code></pre>
+<p>Adding a filter to an <code>EnsDb</code> enables this filter (globally) on all subsequent queries on that object. We could thus filter an <code>EnsDb</code> to (virtually) contain only features encoded on chromosome Y.</p>
+<pre class="r"><code>edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+## All subsequent filters on that EnsDb will only work on features encoded on
+## chromosome Y
+genes(edb_y)</code></pre>
+<pre><code>## GRanges object with 495 ranges and 6 metadata columns:
+##                   seqnames               ranges strand |         gene_id
+##                      <Rle>            <IRanges>  <Rle> |     <character>
+##   ENSG00000251841        Y   [2652790, 2652894]      + | ENSG00000251841
+##   ENSG00000184895        Y   [2654896, 2655740]      - | ENSG00000184895
+##   ENSG00000237659        Y   [2657868, 2658369]      + | ENSG00000237659
+##   ENSG00000232195        Y   [2696023, 2696259]      + | ENSG00000232195
+##   ENSG00000129824        Y   [2709527, 2800041]      + | ENSG00000129824
+##               ...      ...                  ...    ... .             ...
+##   ENSG00000224240        Y [28695572, 28695890]      + | ENSG00000224240
+##   ENSG00000227629        Y [28732789, 28737748]      - | ENSG00000227629
+##   ENSG00000237917        Y [28740998, 28780799]      - | ENSG00000237917
+##   ENSG00000231514        Y [28772667, 28773306]      - | ENSG00000231514
+##   ENSG00000235857        Y [59001391, 59001635]      + | ENSG00000235857
+##                     gene_name   gene_biotype seq_coord_system      symbol
+##                   <character>    <character>      <character> <character>
+##   ENSG00000251841  RNU6-1334P          snRNA       chromosome  RNU6-1334P
+##   ENSG00000184895         SRY protein_coding       chromosome         SRY
+##   ENSG00000237659  RNASEH2CP1     pseudogene       chromosome  RNASEH2CP1
+##   ENSG00000232195    TOMM22P2     pseudogene       chromosome    TOMM22P2
+##   ENSG00000129824      RPS4Y1 protein_coding       chromosome      RPS4Y1
+##               ...         ...            ...              ...         ...
+##   ENSG00000224240     CYCSP49     pseudogene       chromosome     CYCSP49
+##   ENSG00000227629  SLC25A15P1     pseudogene       chromosome  SLC25A15P1
+##   ENSG00000237917     PARP4P1     pseudogene       chromosome     PARP4P1
+##   ENSG00000231514     FAM58CP     pseudogene       chromosome     FAM58CP
+##   ENSG00000235857     CTBP2P1     pseudogene       chromosome     CTBP2P1
+##                   entrezid
+##                     <list>
+##   ENSG00000251841       NA
+##   ENSG00000184895     6736
+##   ENSG00000237659       NA
+##   ENSG00000232195       NA
+##   ENSG00000129824     6192
+##               ...      ...
+##   ENSG00000224240       NA
+##   ENSG00000227629       NA
+##   ENSG00000237917       NA
+##   ENSG00000231514       NA
+##   ENSG00000235857       NA
+##   -------
+##   seqinfo: 1 sequence from GRCh37 genome</code></pre>
+<pre class="r"><code>## Get all lincRNAs on chromosome Y
+genes(edb_y, filter = ~ gene_biotype == "lincRNA")</code></pre>
+<pre><code>## GRanges object with 48 ranges and 6 metadata columns:
+##                   seqnames               ranges strand |         gene_id
+##                      <Rle>            <IRanges>  <Rle> |     <character>
+##   ENSG00000231535        Y   [2870953, 2970313]      + | ENSG00000231535
+##   ENSG00000229308        Y   [3904538, 3968361]      + | ENSG00000229308
+##   ENSG00000237069        Y   [6110487, 6111651]      - | ENSG00000237069
+##   ENSG00000229643        Y   [6225260, 6229454]      - | ENSG00000229643
+##   ENSG00000129816        Y   [6258472, 6279605]      + | ENSG00000129816
+##               ...      ...                  ...    ... .             ...
+##   ENSG00000228296        Y [27209230, 27246039]      - | ENSG00000228296
+##   ENSG00000223641        Y [27329790, 27330920]      - | ENSG00000223641
+##   ENSG00000228786        Y [27524447, 27540866]      - | ENSG00000228786
+##   ENSG00000240450        Y [27629055, 27632852]      + | ENSG00000240450
+##   ENSG00000231141        Y [27874637, 27879535]      + | ENSG00000231141
+##                      gene_name gene_biotype seq_coord_system       symbol
+##                    <character>  <character>      <character>  <character>
+##   ENSG00000231535    LINC00278      lincRNA       chromosome    LINC00278
+##   ENSG00000229308   AC010084.1      lincRNA       chromosome   AC010084.1
+##   ENSG00000237069      TTTY23B      lincRNA       chromosome      TTTY23B
+##   ENSG00000229643    LINC00280      lincRNA       chromosome    LINC00280
+##   ENSG00000129816       TTTY1B      lincRNA       chromosome       TTTY1B
+##               ...          ...          ...              ...          ...
+##   ENSG00000228296       TTTY4C      lincRNA       chromosome       TTTY4C
+##   ENSG00000223641      TTTY17C      lincRNA       chromosome      TTTY17C
+##   ENSG00000228786 LINC00266-4P      lincRNA       chromosome LINC00266-4P
+##   ENSG00000240450     CSPG4P1Y      lincRNA       chromosome     CSPG4P1Y
+##   ENSG00000231141        TTTY3      lincRNA       chromosome        TTTY3
+##                               entrezid
+##                                 <list>
+##   ENSG00000231535            100873962
+##   ENSG00000229308                   NA
+##   ENSG00000237069     100101121,252955
+##   ENSG00000229643                   NA
+##   ENSG00000129816      100101116,50858
+##               ...                  ...
+##   ENSG00000228296 474150,474149,114761
+##   ENSG00000223641 474152,474151,252949
+##   ENSG00000228786                   NA
+##   ENSG00000240450               114758
+##   ENSG00000231141        474148,114760
+##   -------
+##   seqinfo: 1 sequence from GRCh37 genome</code></pre>
 <p>To get an overview of database tables and available columns the function <code>listTables</code> can be used. The method <code>listColumns</code> on the other hand lists columns for the specified database table.</p>
 <pre class="r"><code>## list all database tables along with their columns
 listTables(edb)</code></pre>
 <pre><code>## $gene
-##  [1] "gene_id"          "gene_name"        "entrezid"        
-##  [4] "gene_biotype"     "gene_seq_start"   "gene_seq_end"    
-##  [7] "seq_name"         "seq_strand"       "seq_coord_system"
-## [10] "symbol"          
+## [1] "gene_id"          "gene_name"        "gene_biotype"    
+## [4] "gene_seq_start"   "gene_seq_end"     "seq_name"        
+## [7] "seq_strand"       "seq_coord_system" "symbol"          
 ## 
 ## $tx
 ## [1] "tx_id"            "tx_biotype"       "tx_seq_start"    
@@ -428,6 +556,9 @@ listTables(edb)</code></pre>
 ## [1] "protein_id"            "protein_domain_id"     "protein_domain_source"
 ## [4] "interpro_accession"    "prot_dom_start"        "prot_dom_end"         
 ## 
+## $entrezgene
+## [1] "gene_id"  "entrezid"
+## 
 ## $metadata
 ## [1] "name"  "value"</code></pre>
 <pre class="r"><code>## list columns from a specific table
@@ -437,9 +568,9 @@ listColumns(edb, "tx")</code></pre>
 ## [7] "gene_id"          "tx_name"</code></pre>
 <p>Thus, we could retrieve all transcripts of the biotype <em>nonsense_mediated_decay</em> (which, according to the definitions by Ensembl, are transcribed but most likely not translated into a protein, being degraded after transcription instead) along with the name of the gene for each transcript. Note that here we change the <code>return.type</code> to <code>DataFrame</code>, so the method will return a <code>DataFrame</code> with the results instead of the default <code>GRanges</code>.</p>
 <pre class="r"><code>Tx <- transcripts(edb,
-          columns = c(listColumns(edb , "tx"), "gene_name"),
-          filter = TxBiotypeFilter("nonsense_mediated_decay"),
-          return.type = "DataFrame")
+                  columns = c(listColumns(edb , "tx"), "gene_name"),
+                  filter = TxBiotypeFilter("nonsense_mediated_decay"),
+                  return.type = "DataFrame")
 nrow(Tx)</code></pre>
 <pre><code>## [1] 13812</code></pre>
 <pre class="r"><code>Tx</code></pre>
@@ -547,7 +678,7 @@ yCds</code></pre>
 <p>Using a <code>GRangesFilter</code> we can retrieve all features from the database that are either within or overlapping the specified genomic region. In the example below we query all genes that are partially overlapping with a small region on chromosome 11. The filter restricts to all genes for which either an exon or an intron is partially overlapping with the region.</p>
 <pre class="r"><code>## Define the filter
 grf <- GRangesFilter(GRanges("11", ranges = IRanges(114000000, 114000050),
-                 strand = "+"), type = "any")
+                             strand = "+"), type = "any")
 
 ## Query genes:
 gn <- genes(edb, filter = grf)
@@ -556,12 +687,12 @@ gn</code></pre>
 ##                   seqnames                 ranges strand |         gene_id
 ##                      <Rle>              <IRanges>  <Rle> |     <character>
 ##   ENSG00000109906       11 [113930315, 114121398]      + | ENSG00000109906
-##                     gene_name    entrezid   gene_biotype seq_coord_system
-##                   <character> <character>    <character>      <character>
-##   ENSG00000109906      ZBTB16        7704 protein_coding       chromosome
-##                        symbol
-##                   <character>
-##   ENSG00000109906      ZBTB16
+##                     gene_name   gene_biotype seq_coord_system      symbol
+##                   <character>    <character>      <character> <character>
+##   ENSG00000109906      ZBTB16 protein_coding       chromosome      ZBTB16
+##                   entrezid
+##                     <list>
+##   ENSG00000109906     7704
 ##   -------
 ##   seqinfo: 1 sequence from GRCh37 genome</code></pre>
 <pre class="r"><code>## Next we retrieve all transcripts for that gene so that we can plot them.
@@ -574,10 +705,10 @@ rect(xleft = start(grf), xright = end(grf), ybottom = 0, ytop = length(txs),
 for(i in 1:length(txs)) {
     current <- txs[i]
     rect(xleft = start(current), xright = end(current), ybottom = i-0.975,
-     ytop = i-0.125, border = "grey")
+         ytop = i-0.125, border = "grey")
     text(start(current), y = i-0.5, pos = 4, cex = 0.75, labels = current$tx_id)
 }</code></pre>
-<p><img src=" [...]
+<p><img src=" [...]
 <p>As we can see, 4 transcripts of the gene ZBTB16 are also overlapping the region. Below we fetch these 4 transcripts. Note, that a call to <code>exons</code> will not return any features from the database, as no exon is overlapping with the region.</p>
 <pre class="r"><code>transcripts(edb, filter = grf)</code></pre>
 <pre><code>## GRanges object with 4 ranges and 6 metadata columns:
@@ -675,30 +806,30 @@ nrow(BCLs)</code></pre>
 <pre><code>## [1] 25</code></pre>
 <pre class="r"><code>BCLs</code></pre>
 <pre><code>## DataFrame with 25 rows and 4 columns
-##       gene_name    entrezid   gene_biotype         gene_id
-##     <character> <character>    <character>     <character>
-## 1         BCL10        8915 protein_coding ENSG00000142867
-## 2        BCL11A       53335 protein_coding ENSG00000119866
-## 3        BCL11B       64919 protein_coding ENSG00000127152
-## 4          BCL2         596 protein_coding ENSG00000171791
-## 5        BCL2A1         597 protein_coding ENSG00000140379
-## ...         ...         ...            ...             ...
-## 21        BCL7C        9274 protein_coding ENSG00000099385
-## 22         BCL9         607 protein_coding ENSG00000116128
-## 23         BCL9         607 protein_coding ENSG00000266095
-## 24        BCL9L      283149 protein_coding ENSG00000186174
-## 25       BCLAF1        9774 protein_coding ENSG00000029363</code></pre>
+##       gene_name entrezid   gene_biotype         gene_id
+##     <character>   <list>    <character>     <character>
+## 1         BCL10     8915 protein_coding ENSG00000142867
+## 2        BCL11A    53335 protein_coding ENSG00000119866
+## 3        BCL11B    64919 protein_coding ENSG00000127152
+## 4          BCL2      596 protein_coding ENSG00000171791
+## 5        BCL2A1      597 protein_coding ENSG00000140379
+## ...         ...      ...            ...             ...
+## 21        BCL7C     9274 protein_coding ENSG00000099385
+## 22         BCL9      607 protein_coding ENSG00000116128
+## 23         BCL9      607 protein_coding ENSG00000266095
+## 24        BCL9L   283149 protein_coding ENSG00000186174
+## 25       BCLAF1     9774 protein_coding ENSG00000029363</code></pre>
 <p>Sometimes it might be useful to know the length of genes or transcripts (i.e. the total sum of nucleotides covered by their exons). Below we calculate the mean length of transcripts from protein coding genes on chromosomes X and Y as well as the average length of snoRNA, snRNA and rRNA transcripts encoded on these chromosomes. For the first query we combine two <code>AnnotationFilter</code> objects using an <code>AnnotationFilterList</code> object, in the second we define the query us [...]
 <pre class="r"><code>## determine the average length of snRNA, snoRNA and rRNA genes encoded on
 ## chromosomes X and Y.
 mean(lengthOf(edb, of = "tx", filter = AnnotationFilterList(
-                  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
-                  SeqNameFilter(c("X", "Y")))))</code></pre>
+                                  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
+                                  SeqNameFilter(c("X", "Y")))))</code></pre>
 <pre><code>## [1] 116.3046</code></pre>
 <pre class="r"><code>## determine the average length of protein coding genes encoded on the same
 ## chromosomes.
 mean(lengthOf(edb, of = "tx", filter = ~ gene_biotype == "protein_coding" &
-                  seq_name %in% c("X", "Y")))</code></pre>
+                                  seq_name %in% c("X", "Y")))</code></pre>
 <pre><code>## [1] 1920</code></pre>
 <p>Not unexpectedly, transcripts of protein coding genes are longer than those of snRNA, snoRNA or rRNA genes.</p>
 <p>Finally, we extract the first two exons of each transcript model from the database.</p>
@@ -706,7 +837,7 @@ mean(lengthOf(edb, of = "tx", filter = ~ gene_biotype == "protein
 ## Y chromosome
 exons(edb, columns = c("tx_id", "exon_idx"),
       filter = list(SeqNameFilter("Y"),
-            ExonRankFilter(3, condition = "<")))</code></pre>
+                    ExonRankFilter(3, condition = "<")))</code></pre>
 <pre><code>## GRanges object with 1287 ranges and 3 metadata columns:
 ##                   seqnames               ranges strand |           tx_id
 ##                      <Rle>            <IRanges>  <Rle> |     <character>
@@ -807,8 +938,8 @@ TxByGns</code></pre>
 <pre class="r"><code>## will just get exons for all genes on chromosomes 1 to 22, X and Y.
 ## Note: want to get rid of the "LRG" genes!!!
 EnsGenes <- exonsBy(edb, by = "gene", filter = AnnotationFilterList(
-                      SeqNameFilter(c(1:22, "X", "Y")),
-                      GeneIdFilter("ENSG", "startsWith")))</code></pre>
+                                          SeqNameFilter(c(1:22, "X", "Y")),
+                                          GeneIdFilter("ENSG", "startsWith")))</code></pre>
 <p>The code above returns a <code>GRangesList</code> that can be used directly as an input for the <code>summarizeOverlaps</code> function from the <code>GenomicAlignments</code> package <sup><a id="fnr.3" class="footref" href="#fn.3">3</a></sup>.</p>
 <p>Alternatively, the above <code>GRangesList</code> can be transformed to a <code>data.frame</code> in <em>SAF</em> format that can be used as an input to the <code>featureCounts</code> function of the <code>Rsubread</code> package <sup><a id="fnr.4" class="footref" href="#fn.4">4</a></sup>.</p>
 <pre class="r"><code>## Transforming the GRangesList into a data.frame in SAF format
@@ -933,7 +1064,7 @@ yTxSeqs</code></pre>
 <pre class="r"><code>## Extract just the CDS
 Test <- cdsBy(edb, "tx", filter = SeqNameFilter("chrY"))
 yTxCds <- extractTranscriptSeqs(bsg, cdsBy(edb, "tx",
-                       filter = SeqNameFilter("chrY")))
+                                           filter = SeqNameFilter("chrY")))
 yTxCds</code></pre>
 <pre><code>##   A DNAStringSet instance of length 160
 ##       width seq                                          names               
@@ -968,7 +1099,7 @@ edb <- EnsDb.Hsapiens.v75
 ## Retrieving a Gviz compatible GRanges object with all genes
 ## encoded on chromosome Y.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "Y",
-                start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 
@@ -977,13 +1108,13 @@ gat <- GenomeAxisTrack()
 options(ucscChromosomeNames = FALSE)
 
 plotTracks(list(gat, GeneRegionTrack(gr)))</code></pre>
-<p><img src=" [...]
+<p><img src=" [...]
 <pre class="r"><code>options(ucscChromosomeNames = TRUE)</code></pre>
 <p>Above we had to change the option <code>ucscChromosomeNames</code> to <code>FALSE</code> in order to use it with non-UCSC chromosome names. Alternatively, we could also change the <code>seqlevelsStyle</code> of the <code>EnsDb</code> object to <code>UCSC</code>. Note that we now also have to use chromosome names in the <em>UCSC style</em> in the <code>SeqNameFilter</code> (i.e. “chrY” instead of <code>Y</code>).</p>
 <pre class="r"><code>seqlevelsStyle(edb) <- "UCSC"
 ## Retrieving the GRanges objects with seqnames corresponding to UCSC chromosome names.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-                start = 20400000, end = 21400000)</code></pre>
+                                start = 20400000, end = 21400000)</code></pre>
 <pre><code>## Warning in .formatSeqnameByStyleForQuery(x, sn, ifNotFound): Seqnames:
 ## Y could not be mapped to the seqlevels style of the database (Ensembl)!
 ## Returning the orginal seqnames for these.</code></pre>
@@ -995,18 +1126,18 @@ gr <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
 <pre class="r"><code>## Define a genome axis track
 gat <- GenomeAxisTrack()
 plotTracks(list(gat, GeneRegionTrack(gr)))</code></pre>
-<p><img src=" [...]
+<p><img src=" [...]
 <p>We can also use the filters from the <code>ensembldb</code> package to further refine what transcripts are fetched, like in the example below, in which we create two different gene region tracks, one for protein coding genes and one for lincRNAs.</p>
 <pre class="r"><code>protCod <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-                     start = 20400000, end = 21400000,
-                     filter = GeneBiotypeFilter("protein_coding"))
+                                     start = 20400000, end = 21400000,
+                                     filter = GeneBiotypeFilter("protein_coding"))
 lincs <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-                   start = 20400000, end = 21400000,
-                   filter = GeneBiotypeFilter("lincRNA"))
+                                   start = 20400000, end = 21400000,
+                                   filter = GeneBiotypeFilter("lincRNA"))
 
 plotTracks(list(gat, GeneRegionTrack(protCod, name = "protein coding"),
-        GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")</code></pre>
-<p><img src=" [...]
+                GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")</code></pre>
+<p><img src=" [...]
 <pre class="r"><code>## Finally, we change the seqlevels style back to Ensembl
 seqlevelsStyle(edb) <- "Ensembl"</code></pre>
 <p>Alternatively, we can also use <code>ggbio</code> for plotting. For <code>ggplot</code> we can directly pass the <code>EnsDb</code> object along with optional filters (or as in the example below a filter expression as a <code>formula</code>).</p>
@@ -1014,13 +1145,13 @@ seqlevelsStyle(edb) <- "Ensembl"</code></pre>
 
 ## Create a plot for all transcripts of the gene SKA2
 autoplot(edb, ~ genename == "SKA2")</code></pre>
-<p><img src=" [...]
+<p><img src=" [...]
 <p>To plot the genomic region and plot genes from both strands we can use a <code>GRangesFilter</code>.</p>
 <pre class="r"><code>## Get the chromosomal region in which the gene is encoded
 ska2 <- genes(edb, filter = ~ genename == "SKA2")
 strand(ska2) <- "*"
 autoplot(edb, GRangesFilter(ska2), names.expr = "gene_name")</code></pre>
-<p><img src=" [...]
+<p><img src=" [...]
 </div>
 <div id="using-ensdb-objects-in-the-annotationdbi-framework" class="section level1">
 <h1><span class="header-section-number">8</span> Using <code>EnsDb</code> objects in the <code>AnnotationDbi</code> framework</h1>
@@ -1048,8 +1179,8 @@ columns(edb)</code></pre>
 ## method.
 listColumns(edb)</code></pre>
 <pre><code>##  [1] "seq_name"              "seq_length"            "is_circular"          
-##  [4] "exon_id"               "exon_seq_start"        "exon_seq_end"         
-##  [7] "gene_id"               "gene_name"             "entrezid"             
+##  [4] "gene_id"               "entrezid"              "exon_id"              
+##  [7] "exon_seq_start"        "exon_seq_end"          "gene_name"            
 ## [10] "gene_biotype"          "gene_seq_start"        "gene_seq_end"         
 ## [13] "seq_strand"            "seq_coord_system"      "symbol"               
 ## [16] "name"                  "value"                 "tx_id"                
@@ -1102,7 +1233,7 @@ select(edb, keys = c("BCL2", "BCL2L11"), keytype = "GEN
 ## 22 ENSG00000153094  BCL2L11 ENST00000337565          protein_coding</code></pre>
 <pre class="r"><code>## Use the filtering system of ensembldb
 select(edb, keys = ~ genename %in% c("BCL2", "BCL2L11") &
-        tx_biotype == "protein_coding",
+                tx_biotype == "protein_coding",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))</code></pre>
 <pre><code>##             GENEID GENENAME            TXID      TXBIOTYPE
 ## 1  ENSG00000171791     BCL2 ENST00000398117 protein_coding
@@ -1138,7 +1269,7 @@ mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID
 ## [17] "ENST00000337565"</code></pre>
 <pre class="r"><code>## And, just like before, we can use filters to map only to protein coding transcripts.
 mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
-            TxBiotypeFilter("protein_coding")), column = "TXID",
+                        TxBiotypeFilter("protein_coding")), column = "TXID",
        multiVals = "list")</code></pre>
 <pre><code>## Warning in .mapIds(x = x, keys = keys, column = column, keytype = keytype, :
 ## Got 2 filter objects. Will use the keys of the first for the mapping!</code></pre>
@@ -1207,8 +1338,8 @@ DBFile <- makeEnsemblSQLiteFromTables()
 
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DBFile, version = "0.99.12",
-             maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-             author = "J Rainer")</code></pre>
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")</code></pre>
 <p>The generated package can then be built using <code>R CMD build EnsDb.Hsapiens.v75</code> and installed with <code>R CMD INSTALL EnsDb.Hsapiens.v75*</code>. Note that we could directly generate an <code>EnsDb</code> instance by loading the database file, i.e. by calling <code>edb <- EnsDb(DBFile)</code> and work with that annotation object.</p>
 <p>To fetch and build annotation packages for plant genomes (e.g. arabidopsis thaliana), the <em>Ensembl genomes</em> should be specified as a host, i.e. setting <code>host</code> to “mysql-eg-publicsql.ebi.ac.uk”, <code>port</code> to <code>4157</code> and <code>species</code> to e.g. “arabidopsis thaliana”.</p>
 </div>
@@ -1271,33 +1402,33 @@ EDB <- EnsDb(DB)
 ## alternatively, build the annotation package
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
-             maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-             author = "J Rainer")</code></pre>
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")</code></pre>
 </div>
 </div>
 </div>
 <div id="database-layout" class="section level1">
-<h1><span class="header-section-number">11</span> Database layout<a id="org35014ed"></a></h1>
-<p>The database consists of the following tables and attributes (the layout is also shown in Figure <a href="#org6a42233">159</a>). Note that the protein-specific annotations might not be available in all <code>EnsDB</code> databases (e.g. such ones created with <code>ensembldb</code> version < 1.7 or created from GTF or GFF files).</p>
+<h1><span class="header-section-number">11</span> Database layout<a id="org5bd9a97"></a></h1>
+<p>The database consists of the following tables and attributes (the layout is also shown in Figure <a href="#orgfd622d5">165</a>). Note that the protein-specific annotations might not be available in all <code>EnsDB</code> databases (e.g. such ones created with <code>ensembldb</code> version < 1.7 or created from GTF or GFF files).</p>
 <ul>
 <li><strong>gene</strong>: all gene specific annotations.
 <ul>
 <li><code>gene_id</code>: the Ensembl ID of the gene.</li>
-<li><code>gene_name</code>: the name (symbol) of the gene. <<<<<<< variant A</li>
-<li><code>entrezid</code>: the NCBI Entrezgene ID(s) of the gene. Note that this can be a <code>;</code> separated list of IDs for genes that are mapped to more than one Entrezgene. >>>>>>> variant B ======= end</li>
+<li><code>gene_name</code>: the name (symbol) of the gene.</li>
 <li><code>gene_biotype</code>: the biotype of the gene.</li>
 <li><code>gene_seq_start</code>: the start coordinate of the gene on the sequence (usually a chromosome).</li>
 <li><code>gene_seq_end</code>: the end coordinate of the gene on the sequence.</li>
 <li><code>seq_name</code>: the name of the sequence (usually the chromosome name).</li>
 <li><code>seq_strand</code>: the strand on which the gene is encoded.</li>
 <li><code>seq_coord_system</code>: the coordinate system of the sequence.</li>
+<li><code>description</code>: the description of the gene.</li>
 </ul></li>
 <li><strong>entrezgene</strong>: mapping of Ensembl genes to NCBI Entrezgene identifiers. Note that this mapping can be a one-to-many mapping.
 <ul>
 <li><code>gene_id</code>: the Ensembl gene ID.</li>
 <li><code>entrezid</code>: the NCBI Entrezgene ID.</li>
 </ul></li>
-<li><strong>tx</strong>: all transcript related annotations. Note that while no <code>tx_name</code> column is available in this database column, all methods to retrieve data from the database support also this column. The returned values are however the ID of the transcripts.
+<li><p><strong>tx</strong>: all transcript related annotations. Note that while no <code>tx_name</code> column is available in this database column, all methods to retrieve data from the database support also this column. The returned values are however the ID of the transcripts.</p>
 <ul>
 <li><code>tx_id</code>: the Ensembl transcript ID.</li>
 <li><code>tx_biotype</code>: the biotype of the transcript.</li>
@@ -1306,7 +1437,8 @@ makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
 <li><code>tx_cds_seq_start</code>: the start coordinate of the coding region of the transcript (NULL for non-coding transcripts).</li>
 <li><code>tx_cds_seq_end</code>: the end coordinate of the coding region of the transcript.</li>
 <li><code>gene_id</code>: the gene to which the transcript belongs.</li>
-</ul></li>
+</ul>
+<p><code>EnsDb</code> databases for more recent Ensembl releases have also a column <code>tx_support_level</code> providing the evidence level for a transcript (1 high evidence, 5 low evidence, NA no evidence calculated).</p></li>
 <li><strong>exon</strong>: all exon related annotation.
 <ul>
 <li><code>exon_id</code>: the Ensembl exon ID.</li>
@@ -1360,7 +1492,7 @@ makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
 </ul>
 <p>The database layout: as already described above, protein related annotations (green) might not be available in each <code>EnsDb</code> database.</p>
 <div class="figure">
-<img src=" [...]
+<img src=" [...]
 <p class="caption">img</p>
 </div>
 </div>
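Note on the layout described in the hunks above: the gene-to-Entrezgene mapping now lives in a dedicated `entrezgene` table and can be one-to-many. The base R sketch below shows what that implies when the two tables are joined. The data frames are toy stand-ins with invented placeholder IDs, not output of `ensembldb`, and the object names `gene_entrez` and `entrez_by_gene` are only illustrative.

```r
## Toy stand-ins for the `gene` and `entrezgene` tables. The IDs are invented
## placeholders, not real Ensembl or NCBI identifiers.
gene <- data.frame(
    gene_id = c("G1", "G2", "G3"),
    gene_name = c("geneA", "geneB", "geneC"),
    stringsAsFactors = FALSE)
entrezgene <- data.frame(
    gene_id = c("G1", "G1", "G2"),
    entrezid = c(101L, 102L, 201L),
    stringsAsFactors = FALSE)

## Left join: one row per gene/Entrezgene pair; genes without a mapping
## are kept and get NA in the entrezid column.
gene_entrez <- merge(gene, entrezgene, by = "gene_id", all.x = TRUE)

## Collapse again to one element per gene, each holding all of its IDs.
entrez_by_gene <- split(gene_entrez$entrezid, gene_entrez$gene_id)
```

A gene with two Entrezgene IDs contributes two rows to the joined table, so any per-gene result built on such a join has to be collapsed (here with `split`) if exactly one entry per gene is expected.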
diff --git a/inst/doc/proteins.R b/inst/doc/proteins.R
index ef19851..9455111 100644
--- a/inst/doc/proteins.R
+++ b/inst/doc/proteins.R
@@ -2,6 +2,7 @@
 ## Globally switch off execution of code chunks
 evalMe <- FALSE
 haveProt <- FALSE
+ 
 
 ## ----loadlib, message = FALSE, eval = evalMe-------------------------------
 #  library(ensembldb)
@@ -9,29 +10,35 @@ haveProt <- FALSE
 #  edb <- EnsDb.Hsapiens.v75
 #  ## Evaluate whether we have protein annotation available
 #  hasProteinData(edb)
+#  
 
 ## ----listCols, message = FALSE, eval = evalMe------------------------------
 #  listTables(edb)
+#  
 
 ## ----haveprot, echo = FALSE, results = "hide", eval = evalMe---------------
 #  ## Use this to conditionally disable eval on following chunks
 #  haveProt <- hasProteinData(edb) & evalMe
+#  
 
 ## ----a_transcripts, eval = haveProt----------------------------------------
 #  ## Get also protein information for ZBTB16 transcripts
 #  txs <- transcripts(edb, filter = GenenameFilter("ZBTB16"),
-#  		   columns = c("protein_id", "uniprot_id", "tx_biotype"))
+#                     columns = c("protein_id", "uniprot_id", "tx_biotype"))
 #  txs
+#  
 
 ## ----a_transcripts_coding_noncoding, eval = haveProt-----------------------
 #  ## Subset to transcripts with tx_biotype other than protein_coding.
 #  txs[txs$tx_biotype != "protein_coding", c("uniprot_id", "tx_biotype",
-#  					  "protein_id")]
+#                                            "protein_id")]
+#  
 
 ## ----a_transcripts_coding, eval = haveProt---------------------------------
 #  ## List the protein IDs and uniprot IDs for the coding transcripts
 #  mcols(txs[txs$tx_biotype == "protein_coding",
-#  	  c("tx_id", "protein_id", "uniprot_id")])
+#            c("tx_id", "protein_id", "uniprot_id")])
+#  
 
 ## ----a_transcripts_coding_up, eval = haveProt------------------------------
 #  ## List all uniprot mapping types in the database.
@@ -42,8 +49,9 @@ haveProt <- FALSE
 #  ## on "DIRECT" mapping methods.
 #  txs <- transcripts(edb, filter = list(GenenameFilter("ZBTB16"),
 #  				      UniprotMappingTypeFilter("DIRECT")),
-#  		   columns = c("protein_id", "uniprot_id", "uniprot_db"))
+#                     columns = c("protein_id", "uniprot_id", "uniprot_db"))
 #  mcols(txs)
+#  
 
 ## ----a_genes_protdomid_filter, eval = haveProt-----------------------------
 #  ## Get all genes that encode a transcript encoding for a protein that contains
@@ -52,6 +60,7 @@ haveProt <- FALSE
 #  length(gns)
 #  
 #  sort(gns$gene_name)
+#  
 
 ## ----a_2_annotationdbi, message = FALSE, eval = haveProt-------------------
 #  ## Show all columns that are provided by the database
@@ -59,31 +68,37 @@ haveProt <- FALSE
 #  
 #  ## Show all key types/filters that are supported
 #  keytypes(edb)
+#  
 
 ## ----a_2_select, message = FALSE, eval = haveProt--------------------------
 #  select(edb, keys = "ZBTB16", keytype = "GENENAME",
 #         columns = "UNIPROTID")
+#  
 
 ## ----a_2_select_nmd, message = FALSE, eval = haveProt----------------------
 #  ## Call select, this time providing a GenenameFilter.
 #  select(edb, keys = GenenameFilter("ZBTB16"),
 #         columns = c("TXBIOTYPE", "UNIPROTID", "PROTEINID"))
+#  
 
 ## ----b_proteins, message = FALSE, eval = haveProt--------------------------
 #  ## Get all proteins and return them as an AAStringSet
 #  prts <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-#  		 return.type = "AAStringSet")
+#                   return.type = "AAStringSet")
 #  prts
+#  
 
 ## ----b_proteins_mcols, message = FALSE, eval = haveProt--------------------
 #  mcols(prts)
+#  
 
 ## ----b_proteins_prot_doms, message = FALSE, eval = haveProt----------------
 #  ## Get also protein domain annotations in addition to the protein annotations.
 #  pd <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-#  	       columns = c("tx_id", listColumns(edb, "protein_domain")),
-#  	       return.type = "AAStringSet")
+#                 columns = c("tx_id", listColumns(edb, "protein_domain")),
+#                 return.type = "AAStringSet")
 #  pd
+#  
 
 ## ----b_proteins_prot_doms_2, message = FALSE, eval = haveProt--------------
 #  ## The number of protein domains per protein:
@@ -91,4 +106,5 @@ haveProt <- FALSE
 #  
 #  ## The mcols
 #  mcols(pd)
+#  
 
diff --git a/inst/doc/proteins.Rmd b/inst/doc/proteins.Rmd
index 7bf98ab..7d8ee47 100644
--- a/inst/doc/proteins.Rmd
+++ b/inst/doc/proteins.Rmd
@@ -4,7 +4,7 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Querying protein features}
@@ -46,34 +46,38 @@ databases created through the Ensembl Perl API contain protein annotation, while
 databases created using `ensDbFromAH`, `ensDbFromGff`, `ensDbFromGRanges` and
 `ensDbFromGtf` don't.
 
-```{r doeval, echo = FALSE, results = "hide"}
+```{r  doeval, echo = FALSE, results = "hide" }
 ## Globally switch off execution of code chunks
 evalMe <- FALSE
 haveProt <- FALSE
+ 
 ```
 
-```{r loadlib, message = FALSE, eval = evalMe}
+```{r  loadlib, message = FALSE, eval = evalMe }
 library(ensembldb)
 library(EnsDb.Hsapiens.v75)
 edb <- EnsDb.Hsapiens.v75
 ## Evaluate whether we have protein annotation available
 hasProteinData(edb)
+ 
 ```
 
 If protein annotation is available, the additional tables and columns are also
 listed by the `listTables` and `listColumns` methods.
 
-```{r listCols, message = FALSE, eval = evalMe}
+```{r  listCols, message = FALSE, eval = evalMe }
 listTables(edb)
+ 
 ```
 
 In the following sections we show examples how to 1) fetch protein annotations
 as additional columns to gene/transcript annotations, 2) fetch protein
 annotation data and 3) map proteins to the genome.
 
-```{r haveprot, echo = FALSE, results = "hide", eval = evalMe}
+```{r  haveprot, echo = FALSE, results = "hide", eval = evalMe }
 ## Use this to conditionally disable eval on following chunks
 haveProt <- hasProteinData(edb) & evalMe
+ 
 ```
 
 
@@ -83,11 +87,12 @@ Protein annotations for (protein coding) transcripts can be retrieved by simply
 adding the desired annotation columns to the `columns` parameter of the e.g. `genes`
 or `transcripts` methods.
 
-```{r a_transcripts, eval = haveProt}
+```{r  a_transcripts, eval = haveProt }
 ## Get also protein information for ZBTB16 transcripts
 txs <- transcripts(edb, filter = GenenameFilter("ZBTB16"),
-		   columns = c("protein_id", "uniprot_id", "tx_biotype"))
+                   columns = c("protein_id", "uniprot_id", "tx_biotype"))
 txs
+ 
 ```
 
 The gene ZBTB16 has protein coding and non-coding transcripts, thus, we get the
@@ -95,10 +100,11 @@ protein ID for the coding- and `NA` for the non-coding transcripts. Note also th
 we have a transcript targeted for nonsense mediated mRNA-decay with a protein ID
 associated with it, but no Uniprot ID.
 
-```{r a_transcripts_coding_noncoding, eval = haveProt}
+```{r  a_transcripts_coding_noncoding, eval = haveProt }
 ## Subset to transcripts with tx_biotype other than protein_coding.
 txs[txs$tx_biotype != "protein_coding", c("uniprot_id", "tx_biotype",
-					  "protein_id")]
+                                          "protein_id")]
+ 
 ```
 
 While the mapping from a protein coding transcript to a Ensembl protein ID
@@ -109,10 +115,11 @@ each Uniprot ID can be mapped to more than one `protein_id` (and hence
 fetching Uniprot related additional columns or even protein ID features, as in
 such cases a redundant list of transcripts is returned.
 
-```{r a_transcripts_coding, eval = haveProt}
+```{r  a_transcripts_coding, eval = haveProt }
 ## List the protein IDs and uniprot IDs for the coding transcripts
 mcols(txs[txs$tx_biotype == "protein_coding",
-	  c("tx_id", "protein_id", "uniprot_id")])
+          c("tx_id", "protein_id", "uniprot_id")])
+ 
 ```
 
 Some of the n:m mappings for Uniprot IDs can be resolved by restricting either
@@ -122,7 +129,7 @@ certain type of mapping method. The corresponding filters are the
 `uniprot_mapping_type` columns of the `uniprot` database table). In the example
 below we restrict the result to Uniprot IDs with the mapping type *DIRECT*.
 
-```{r a_transcripts_coding_up, eval = haveProt}
+```{r  a_transcripts_coding_up, eval = haveProt }
 ## List all uniprot mapping types in the database.
 listUniprotMappingTypes(edb)
 
@@ -131,8 +138,9 @@ listUniprotMappingTypes(edb)
 ## on "DIRECT" mapping methods.
 txs <- transcripts(edb, filter = list(GenenameFilter("ZBTB16"),
 				      UniprotMappingTypeFilter("DIRECT")),
-		   columns = c("protein_id", "uniprot_id", "uniprot_db"))
+                   columns = c("protein_id", "uniprot_id", "uniprot_db"))
 mcols(txs)
+ 
 ```
 
 For this example the use of the `UniprotMappingTypeFilter` resolved the multiple
@@ -151,13 +159,14 @@ protein data to filter the results. In the example below we fetch for example
 all genes from the database that have a certain protein domain in the protein
 encoded by any of its transcripts.
 
-```{r a_genes_protdomid_filter, eval = haveProt}
+```{r  a_genes_protdomid_filter, eval = haveProt }
 ## Get all genes that encode a transcript encoding for a protein that contains
 ## a certain protein domain.
 gns <- genes(edb, filter = ProtDomIdFilter("PS50097"))
 length(gns)
 
 sort(gns$gene_name)
+ 
 ```
 
 So, in total we got 152 genes with that protein domain. In addition to the
@@ -172,19 +181,21 @@ The `select`, `keys` and `mapIds` methods from the `AnnotationDbi` package can a
 used to query `EnsDb` objects for protein annotations. Supported columns and
 key types are returned by the `columns` and `keytypes` methods.
 
-```{r a_2_annotationdbi, message = FALSE, eval = haveProt}
+```{r  a_2_annotationdbi, message = FALSE, eval = haveProt }
 ## Show all columns that are provided by the database
 columns(edb)
 
 ## Show all key types/filters that are supported
 keytypes(edb)
+ 
 ```
 
 Below we fetch all Uniprot IDs annotated to the gene *ZBTB16*.
 
-```{r a_2_select, message = FALSE, eval = haveProt}
+```{r  a_2_select, message = FALSE, eval = haveProt }
 select(edb, keys = "ZBTB16", keytype = "GENENAME",
        columns = "UNIPROTID")
+ 
 ```
 
 This returns us all Uniprot IDs of all proteins encoded by the gene's
@@ -193,10 +204,11 @@ annotated to a protein, does not have an Uniprot ID assigned (thus `NA` is
 returned by the above call). As we see below, this transcript is targeted for
 nonsense mediated mRNA decay.
 
-```{r a_2_select_nmd, message = FALSE, eval = haveProt}
+```{r  a_2_select_nmd, message = FALSE, eval = haveProt }
 ## Call select, this time providing a GenenameFilter.
 select(edb, keys = GenenameFilter("ZBTB16"),
        columns = c("TXBIOTYPE", "UNIPROTID", "PROTEINID"))
+ 
 ```
 
 Note also that we passed this time a `GenenameFilter` with the `keys` parameter.
@@ -213,11 +225,12 @@ protein annotations becomes available.
 
 In the code chunk below we fetch all protein annotations for the gene *ZBTB16*.
 
-```{r b_proteins, message = FALSE, eval = haveProt}
+```{r  b_proteins, message = FALSE, eval = haveProt }
 ## Get all proteins and return them as an AAStringSet
 prts <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-		 return.type = "AAStringSet")
+                 return.type = "AAStringSet")
 prts
+ 
 ```
 
 Besides the amino acid sequence, the `prts` contains also additional annotations
@@ -225,8 +238,9 @@ that can be accessed with the `mcols` method (metadata columns). All additional
 columns provided with the parameter `columns` are also added to the `mcols`
 `DataFrame`.
 
-```{r b_proteins_mcols, message = FALSE, eval = haveProt}
+```{r  b_proteins_mcols, message = FALSE, eval = haveProt }
 mcols(prts)
+ 
 ```
 
 Note that the `proteins` method will retrieve only gene/transcript annotations of
@@ -237,24 +251,26 @@ previous section are not fetched.
 Querying in addition Uniprot identifiers or protein domain data will result at
 present in a redundant list of proteins as shown in the code block below.
 
-```{r b_proteins_prot_doms, message = FALSE, eval = haveProt}
+```{r  b_proteins_prot_doms, message = FALSE, eval = haveProt }
 ## Get also protein domain annotations in addition to the protein annotations.
 pd <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-	       columns = c("tx_id", listColumns(edb, "protein_domain")),
-	       return.type = "AAStringSet")
+               columns = c("tx_id", listColumns(edb, "protein_domain")),
+               return.type = "AAStringSet")
 pd
+ 
 ```
 
 The result contains one row/element for each protein domain in each of the
 proteins. The number of protein domains per protein and the `mcols` are shown
 below.
 
-```{r b_proteins_prot_doms_2, message = FALSE, eval = haveProt}
+```{r  b_proteins_prot_doms_2, message = FALSE, eval = haveProt }
 ## The number of protein domains per protein:
 table(names(pd))
 
 ## The mcols
 mcols(pd)
+ 
 ```
 
 As we can see each protein can have several protein domains with the start and
diff --git a/inst/doc/proteins.html b/inst/doc/proteins.html
index 885cfc2..1240539 100644
--- a/inst/doc/proteins.html
+++ b/inst/doc/proteins.html
@@ -11,6 +11,7 @@
 
 <meta name="author" content="Johannes Rainer" />
 
+<meta name="date" content="2017-10-30" />
 
 <title>Querying protein features</title>
 
@@ -68,7 +69,7 @@ h6 {
 }
 </style>
 
-<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0A%7D%0Abody%2C%20td%20%7B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B [...]
+<link href="data:text/css;charset=utf-8,body%20%7B%0Amargin%3A%200px%20auto%3B%0Amax%2Dwidth%3A%201134px%3B%0Afont%2Dfamily%3A%20sans%2Dserif%3B%0Afont%2Dsize%3A%2010pt%3B%0A%7D%0A%0Adiv%23TOC%20ul%20%7B%0Apadding%3A%200px%200px%200px%2045px%3B%0Alist%2Dstyle%3A%20none%3B%0Abackground%2Dimage%3A%20none%3B%0Abackground%2Drepeat%3A%20none%3B%0Abackground%2Dposition%3A%200%3B%0Afont%2Dsize%3A%2010pt%3B%0Afont%2Dfamily%3A%20Helvetica%2C%20Arial%2C%20sans%2Dserif%3B%0A%7D%0Adiv%23TOC%20%3E%20 [...]
 
 </head>
 
@@ -209,9 +210,9 @@ div.tocify {
 
 <h1 class="title toc-ignore">Querying protein features</h1>
 <p class="author-name">Johannes Rainer</p>
-<h4 class="date"><em>4 August 2017</em></h4>
+<h4 class="date"><em>30 October 2017</em></h4>
 <h4 class="package">Package</h4>
-<p>ensembldb 2.0.4</p>
+<p>ensembldb 2.2.0</p>
 
 </div>
 
@@ -243,16 +244,16 @@ hasProteinData(edb)</code></pre>
 <p>Protein annotations for (protein coding) transcripts can be retrieved by simply adding the desired annotation columns to the <code>columns</code> parameter of the e.g. <code>genes</code> or <code>transcripts</code> methods.</p>
 <pre class="r"><code>## Get also protein information for ZBTB16 transcripts
 txs <- transcripts(edb, filter = GenenameFilter("ZBTB16"),
-           columns = c("protein_id", "uniprot_id", "tx_biotype"))
+                   columns = c("protein_id", "uniprot_id", "tx_biotype"))
 txs</code></pre>
 <p>The gene ZBTB16 has protein coding and non-coding transcripts, thus, we get the protein ID for the coding- and <code>NA</code> for the non-coding transcripts. Note also that we have a transcript targeted for nonsense mediated mRNA-decay with a protein ID associated with it, but no Uniprot ID.</p>
 <pre class="r"><code>## Subset to transcripts with tx_biotype other than protein_coding.
 txs[txs$tx_biotype != "protein_coding", c("uniprot_id", "tx_biotype",
-                      "protein_id")]</code></pre>
+                                          "protein_id")]</code></pre>
 <p>While the mapping from a protein coding transcript to a Ensembl protein ID (column <code>protein_id</code>) is 1:1, the mapping between <code>protein_id</code> and <code>uniprot_id</code> can be n:m, i.e. each Ensembl protein ID can be mapped to 1 or more Uniprot IDs and each Uniprot ID can be mapped to more than one <code>protein_id</code> (and hence <code>tx_id</code>). This should be kept in mind if querying transcripts from the database fetching Uniprot related additional columns  [...]
 <pre class="r"><code>## List the protein IDs and uniprot IDs for the coding transcripts
 mcols(txs[txs$tx_biotype == "protein_coding",
-      c("tx_id", "protein_id", "uniprot_id")])</code></pre>
+          c("tx_id", "protein_id", "uniprot_id")])</code></pre>
 <p>Some of the n:m mappings for Uniprot IDs can be resolved by restricting either to entries from one Uniprot database (<em>SPTREMBL</em> or <em>SWISSPROT</em>) or to mappings of a certain type of mapping method. The corresponding filters are the <code>UniprotDbFilter</code> and the <code>UniprotMappingTypeFilter</code> (using the <code>uniprot_db</code> and <code>uniprot_mapping_type</code> columns of the <code>uniprot</code> database table). In the example below we restrict the result  [...]
 <pre class="r"><code>## List all uniprot mapping types in the database.
 listUniprotMappingTypes(edb)
@@ -262,7 +263,7 @@ listUniprotMappingTypes(edb)
 ## on "DIRECT" mapping methods.
 txs <- transcripts(edb, filter = list(GenenameFilter("ZBTB16"),
                       UniprotMappingTypeFilter("DIRECT")),
-           columns = c("protein_id", "uniprot_id", "uniprot_db"))
+                   columns = c("protein_id", "uniprot_id", "uniprot_db"))
 mcols(txs)</code></pre>
 <p>For this example the use of the <code>UniprotMappingTypeFilter</code> resolved the multiple mapping of Uniprot IDs to Ensembl protein IDs, but the Uniprot ID <em>Q05516</em> is still assigned to the two Ensembl protein IDs <em>ENSP00000338157</em> and <em>ENSP00000376721</em>.</p>
 <p>All protein annotations can also be added as <em>metadata columns</em> to the results of the <code>genes</code>, <code>exons</code>, <code>exonsBy</code>, <code>transcriptsBy</code>, <code>cdsBy</code>, <code>fiveUTRsByTranscript</code> and <code>threeUTRsByTranscript</code> methods by specifying the desired column names with the <code>columns</code> parameter. For non coding transcripts <code>NA</code> will be reported in the protein annotation columns.</p>
@@ -298,7 +299,7 @@ select(edb, keys = GenenameFilter("ZBTB16"),
 <p>In the code chunk below we fetch all protein annotations for the gene <em>ZBTB16</em>.</p>
 <pre class="r"><code>## Get all proteins and return them as an AAStringSet
 prts <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-         return.type = "AAStringSet")
+                 return.type = "AAStringSet")
 prts</code></pre>
 <p>Besides the amino acid sequence, the <code>prts</code> contains also additional annotations that can be accessed with the <code>mcols</code> method (metadata columns). All additional columns provided with the parameter <code>columns</code> are also added to the <code>mcols</code> <code>DataFrame</code>.</p>
 <pre class="r"><code>mcols(prts)</code></pre>
@@ -306,8 +307,8 @@ prts</code></pre>
 <p>Querying in addition Uniprot identifiers or protein domain data will result at present in a redundant list of proteins as shown in the code block below.</p>
 <pre class="r"><code>## Get also protein domain annotations in addition to the protein annotations.
 pd <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-           columns = c("tx_id", listColumns(edb, "protein_domain")),
-           return.type = "AAStringSet")
+               columns = c("tx_id", listColumns(edb, "protein_domain")),
+               return.type = "AAStringSet")
 pd</code></pre>
 <p>The result contains one row/element for each protein domain in each of the proteins. The number of protein domains per protein and the <code>mcols</code> are shown below.</p>
 <pre class="r"><code>## The number of protein domains per protein:
diff --git a/inst/perl/get_gene_transcript_exon_tables.pl b/inst/perl/get_gene_transcript_exon_tables.pl
index bc07669..5cedf2d 100644
--- a/inst/perl/get_gene_transcript_exon_tables.pl
+++ b/inst/perl/get_gene_transcript_exon_tables.pl
@@ -1,5 +1,9 @@
 #!/usr/bin/perl
 #####################################
+## version 0.3.1: * Add ens_counts.txt with the total counts of genes, tx, exons
+##                  and proteins for validation that all entries were added to
+##                  the database.
+##                * Extract gene descriptions and tx support level.
 ## version 0.3.0: * Change database layout by adding a dedicated entrezgene
 ##                  table.
 ## version 0.2.4: * Extract taxonomy ID and add that to  metadata table.
@@ -24,7 +28,8 @@ use Bio::EnsEMBL::ApiVersion;
 use Bio::EnsEMBL::Registry;
 ## unification function for arrays
 use List::MoreUtils qw/ uniq /;
-my $script_version = "0.3.0";
+my $script_version = "0.3.1";
+my $min_tsl_version = 87;   ## The minimal required Ensembl version providing support for the tsl method.
 
 ## connecting to the ENSEMBL data base
 use Bio::EnsEMBL::Registry;
@@ -95,6 +100,7 @@ my $api_version="".software_version()."";
 if($ensembl_version ne $api_version){
     die "The submitted Ensembl version (".$ensembl_version.") does not match the version of the Ensembl API (".$api_version."). Please configure the environment variable ENS to point to the correct API.";
 }
+my $ensembl_version_num = $ensembl_version + 0;
 
 print "Connecting to ".$host." at port ".$port."\n";
 
@@ -120,10 +126,10 @@ print $infostring;
 
 ## preparing output files:
 open(GENE , ">ens_gene.txt");
-print GENE "gene_id\tgene_name\tgene_biotype\tgene_seq_start\tgene_seq_end\tseq_name\tseq_strand\tseq_coord_system\n";
+print GENE "gene_id\tgene_name\tgene_biotype\tgene_seq_start\tgene_seq_end\tseq_name\tseq_strand\tseq_coord_system\tdescription\n";
 
 open(TRANSCRIPT , ">ens_tx.txt");
-print TRANSCRIPT "tx_id\ttx_biotype\ttx_seq_start\ttx_seq_end\ttx_cds_seq_start\ttx_cds_seq_end\tgene_id\n";
+print TRANSCRIPT "tx_id\ttx_biotype\ttx_seq_start\ttx_seq_end\ttx_cds_seq_start\ttx_cds_seq_end\tgene_id\ttx_support_level\n";
 
 open(EXON , ">ens_exon.txt");
 print EXON "exon_id\texon_seq_start\texon_seq_end\n";
@@ -149,14 +155,22 @@ print PROTDOM "protein_id\tprotein_domain_id\tprotein_domain_source\tinterpro_ac
 open(CHR , ">ens_chromosome.txt");
 print CHR "seq_name\tseq_length\tis_circular\n";
 
+open(COUNTS, ">ens_counts.txt");
+print COUNTS "gene\ttx\texon\tprotein\n";
+
 ##OK now running the stuff:
 print "Start fetching data\n";
 my %done_chromosomes=();
 my %done_exons=();  ## to keep track of which exons have already been saved.
 my $counta = 0;
+my $count_gene = 0;
+my $count_tx = 0;
+my $count_exon = 0;
+my $count_protein = 0;
 @gene_ids = @{$gene_adaptor->list_stable_ids()};
 foreach my $gene_id (@gene_ids){
   $counta++;
+  $count_gene++;
   if(($counta % 2000) == 0){
     print "processed $counta genes\n";
   }
@@ -207,6 +221,10 @@ foreach my $gene_id (@gene_ids){
     my $gene_biotype = $gene->biotype;
     my $gene_seq_start = $gene->start;
     my $gene_seq_end = $gene->end;
+    my $description = $gene->description;
+    if(!defined($description)){
+      $description = "NULL";
+    }
     ## get entrezgene ID, if any...
     my $all_entries = $gene->get_all_DBLinks("EntrezGene");
     foreach my $dbe (@{$all_entries}){
@@ -221,12 +239,13 @@ foreach my $gene_id (@gene_ids){
     # if($hash_size > 0){
     #   $entrezid = join(";", keys %entrezgene_hash);
     # }
-    print GENE "$gene_id\t$gene_external_name\t$gene_biotype\t$gene_seq_start\t$gene_seq_end\t$chrom\t$strand\t$coord_system\n";
+    print GENE "$gene_id\t$gene_external_name\t$gene_biotype\t$gene_seq_start\t$gene_seq_end\t$chrom\t$strand\t$coord_system\t$description\n";
 
     ## process transcript(s)
     my @transcripts = @{ $gene->get_all_Transcripts };
     ## ok looping through the transcripts
     foreach my $transcript (@transcripts){
+      $count_tx++;
       if($do_transform==1){
 	## just to be sure that we have the transcript in chromosomal coordinates.
 	## $transcript = $transcript->transform("chromosome");
@@ -248,13 +267,25 @@ foreach my $gene_id (@gene_ids){
       my $tx_biotype = $transcript->biotype;
       my $tx_seq_start = $transcript->start;
       my $tx_seq_end = $transcript->end;
+      my $tx_tsl = "NULL";
+      if ($ensembl_version_num >= $min_tsl_version) {
+	$tx_tsl = $transcript->tsl;
+	if (!defined($tx_tsl)) {
+	  $tx_tsl = "NULL";
+	}
+      }
+      my $tx_description = $transcript->description;
+      # if (!defined($tx_description)) {
+      # 	$tx_description = "NULL";
+      # }
       ## write info.
-      print TRANSCRIPT "$tx_id\t$tx_biotype\t$tx_seq_start\t$tx_seq_end\t$tx_cds_start\t$tx_cds_end\t$gene_id\n";
+      print TRANSCRIPT "$tx_id\t$tx_biotype\t$tx_seq_start\t$tx_seq_end\t$tx_cds_start\t$tx_cds_end\t$gene_id\t$tx_tsl\n";
 ##      print G2T "$gene_id\t$tx_id\n";
 
       ## Process proteins/translations (if possible)
       my $transl = $transcript->translation();
       if (defined($transl)) {
+	$count_protein++;
 	my $transl_id = $transl->stable_id();
 	my $prot_seq = $transl->seq();
 	## Check if we could get UNIPROT ID(s):
@@ -299,6 +330,7 @@ foreach my $gene_id (@gene_ids){
 	  ## don't do anything.
 	}else{
 	  $done_exons{ $exon_id } = 1;
+	  $count_exon++;
 	  print EXON "$exon_id\t$exon_start\t$exon_end\n";
 	}
 	## saving the exon id to this file that provides the n:m mappint; also saving
@@ -326,7 +358,9 @@ print INFO "ensembl_host\t$host\n";
 print INFO "Organism\t$species_ens\n";
 print INFO "taxonomy_id\t$taxonomy_id\n";
 print INFO "genome_build\t$coord_system_version\n";
-print INFO "DBSCHEMAVERSION\t2.0\n";
+print INFO "DBSCHEMAVERSION\t2.1\n";
+
+print COUNTS "$count_gene\t$count_tx\t$count_exon\t$count_protein\n";
 
 close(INFO);
 
@@ -340,3 +374,4 @@ close(CHR);
 close(PROTEIN);
 close(PROTDOM);
 close(UNIPROT);
+close(COUNTS);
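
Note on the new ens_counts.txt file: the script writes a header line (gene, tx, exon, protein) followed by a single tab-separated row of totals. A minimal base R sketch of reading such a file back and comparing it with counts obtained elsewhere might look like the following. The helper name check_counts and its named-vector input are hypothetical and not part of ensembldb, and the numbers are made up so that the sketch runs stand-alone.

```r
## Hypothetical helper (not part of ensembldb): compare the totals in a
## counts file written by the Perl script against counts observed elsewhere.
check_counts <- function(counts_file, observed) {
    ## The file has one header line and one tab-separated row of totals.
    expected <- read.delim(counts_file, header = TRUE, sep = "\t")
    cols <- c("gene", "tx", "exon", "protein")
    stopifnot(all(cols %in% colnames(expected)), nrow(expected) == 1)
    ## Named logical vector: TRUE where the observed count matches.
    vapply(cols, function(z) expected[[z]][1] == observed[[z]], logical(1))
}

## Stand-alone demonstration with made-up numbers.
tmp <- tempfile(fileext = ".txt")
writeLines(c("gene\ttx\texon\tprotein", "10\t25\t120\t18"), tmp)
res <- check_counts(tmp, c(gene = 10, tx = 25, exon = 119, protein = 18))
print(res)
```

A mismatch in any column (here the exon count) would suggest that a table was truncated or only partially loaded.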
diff --git a/inst/scripts/checkEnsDbs.R b/inst/scripts/checkEnsDbs.R
index 4b7fda8..7f32917 100644
--- a/inst/scripts/checkEnsDbs.R
+++ b/inst/scripts/checkEnsDbs.R
@@ -15,8 +15,10 @@ checkEnsDbs <- function(x) {
         message("\nChecking EnsDb: ", basename(edbs[i]))
         edb <- EnsDb(edbs[i])
         ensembldb:::validateEnsDb(edb)
+        ensembldb:::checkValidEnsDb(edb)
         ## Now check also some query calls:
         gns <- genes(edb)
+        message(" version: ", ensembldb:::dbSchemaVersion(edb))
         message(" OK")
     }
 }
diff --git a/man/EnsDb-class.Rd b/man/EnsDb-class.Rd
index 6ccce43..4004641 100644
--- a/man/EnsDb-class.Rd
+++ b/man/EnsDb-class.Rd
@@ -259,9 +259,13 @@
 \seealso{
   \code{\link{EnsDb}},
   \code{\link{makeEnsembldbPackage}},
-      \code{\link{exonsBy}}, \code{\link{genes}},
-      \code{\link{transcripts}},
-      \code{\link{makeEnsemblSQLiteFromTables}}
+  \code{\link{exonsBy}}, \code{\link{genes}},
+  \code{\link{transcripts}},
+  \code{\link{makeEnsemblSQLiteFromTables}}
+  
+  \code{\link{addFilter}} for globally adding filters to an \code{EnsDb}
+  object.
+
 }
 \examples{
 
diff --git a/man/EnsDb-exonsBy.Rd b/man/EnsDb-exonsBy.Rd
index 2e4c888..5dee114 100644
--- a/man/EnsDb-exonsBy.Rd
+++ b/man/EnsDb-exonsBy.Rd
@@ -28,7 +28,9 @@
   Retrieve gene/transcript/exons annotations stored in an Ensembl based
   database package generated with the \code{\link{makeEnsembldbPackage}}
   function. Parameter \code{filter} enables to define filters to
-  retrieve only specific data.
+  retrieve only specific data. Alternatively, a global filter might be
+  added to the \code{EnsDb} object using the \code{\link{addFilter}}
+  method.
 }
 \usage{
 
@@ -40,7 +42,7 @@
         columns = listColumns(x, "exon"), filter =
         AnnotationFilterList(), use.names = FALSE)
 
-\S4method{exonsByOverlaps}{EnsDb}(x, ranges, maxgap = 0L, minoverlap = 1L,
+\S4method{exonsByOverlaps}{EnsDb}(x, ranges, maxgap = -1L, minoverlap = 0L,
         type = c("any", "start", "end"), columns = listColumns(x, "exon"),
         filter = AnnotationFilterList())
 
@@ -51,8 +53,8 @@
 \S4method{transcriptsBy}{EnsDb}(x, by = c("gene", "exon"),
         columns = listColumns(x, "tx"), filter = AnnotationFilterList())
 
-\S4method{transcriptsByOverlaps}{EnsDb}(x, ranges, maxgap = 0L,
-        minoverlap = 1L, type = c("any", "start", "end"),
+\S4method{transcriptsByOverlaps}{EnsDb}(x, ranges, maxgap = -1L,
+        minoverlap = 0L, type = c("any", "start", "end"),
         columns = listColumns(x, "tx"), filter = AnnotationFilterList())
 
 \S4method{promoters}{EnsDb}(x, upstream = 2000, downstream = 200, ...)
@@ -135,7 +137,9 @@
     \code{\link[AnnotationFilter]{AnnotationFilterList}} object
     combining several such objects or a \code{formula} representing a
     filter expression (see examples below or
-    \code{\link[AnnotationFilter]{AnnotationFilter}} for more details).
+    \code{\link[AnnotationFilter]{AnnotationFilter}} for more
+    details). Use the \code{\link{supportedFilters}} method to get an
+    overview of supported filter classes and related fields.
   }
 
   \item{includeTranscripts}{
@@ -442,6 +446,9 @@
   \code{\link{supportedFilters}} to get an overview of supported filters.
   \code{\link{makeEnsembldbPackage}},
   \code{\link{listColumns}}, \code{\link{lengthOf}}
+
+  \code{\link{addFilter}} for globally adding filters to an \code{EnsDb}
+  object.
 }
 \examples{
 
diff --git a/man/EnsDb.Rd b/man/EnsDb.Rd
index 3f0b303..d624bbd 100644
--- a/man/EnsDb.Rd
+++ b/man/EnsDb.Rd
@@ -15,17 +15,18 @@ A \code{\linkS4class{EnsDb}} object.
 }
 \description{
 The \code{EnsDb} constructor function connects to the database
-specified with argument \code{x} and returns a corresponding
-\code{\linkS4class{EnsDb}} object.
+    specified with argument \code{x} and returns a corresponding
+    \code{\linkS4class{EnsDb}} object.
 }
 \details{
 By providing the connection to a MySQL database, it is possible
-to use MySQL as the database backend and queries will be performed on that
-database. Note however that this requires the package \code{RMySQL} to be
-installed. In addition, the user needs to have access to a MySQL server
-providing already an EnsDb database, or must have write privileges on a
-MySQL server, in which case the \code{\link{useMySQL}} method can be used
-to insert the annotations from an EnsDB package into a MySQL database.
+    to use MySQL as the database backend and queries will be performed on
+    that database. Note however that this requires the package \code{RMySQL}
+    to be installed. In addition, the user needs to have access to a MySQL
+    server providing already an EnsDb database, or must have write
+    privileges on a MySQL server, in which case the \code{\link{useMySQL}}
+    method can be used to insert the annotations from an EnsDb package into
+    a MySQL database.
 }
 \examples{
 ## "Standard" way to create an EnsDb object:
diff --git a/man/Filter-classes.Rd b/man/Filter-classes.Rd
index 51f4a80..ecbfb8b 100644
--- a/man/Filter-classes.Rd
+++ b/man/Filter-classes.Rd
@@ -11,6 +11,8 @@
 \alias{UniprotDbFilter}
 \alias{UniprotMappingTypeFilter-class}
 \alias{UniprotMappingTypeFilter}
+\alias{TxSupportLevelFilter-class}
+\alias{TxSupportLevelFilter}
 \alias{supportedFilters,EnsDb-method}
 \alias{seqnames,GRangesFilter-method}
 \alias{seqlevels,GRangesFilter-method}
@@ -24,6 +26,8 @@ UniprotDbFilter(value, condition = "==")
 
 UniprotMappingTypeFilter(value, condition = "==")
 
+TxSupportLevelFilter(value, condition = "==")
+
 \S4method{supportedFilters}{EnsDb}(object, ...)
 
 \S4method{seqnames}{GRangesFilter}(x)
@@ -57,16 +61,25 @@ For \code{UniprotDbFilter}: A \code{UniprotDbFilter} object.
 For \code{UniprotMappingTypeFilter}: A
 \code{UniprotMappingTypeFilter} object.
 
-For \code{supportedFilters}: the names of the supported filter
-    classes.
+For \code{TxSupportLevelFilter}: A
+\code{TxSupportLevelFilter} object.
+
+For \code{supportedFilters}: a \code{data.frame} with the names and
+    the corresponding field of the supported filter classes.
 }
 \description{
 \code{ensembldb} supports most of the filters from the
     \code{\link{AnnotationFilter}} package to retrieve specific content from
-    \code{\linkS4class{EnsDb}} databases.
+    \code{\linkS4class{EnsDb}} databases. These filters can be passed to
+    the methods such as \code{\link{genes}} with the \code{filter} parameter
+    or can be added as a \emph{global} filter to an \code{EnsDb} object
+    (see \code{\link{addFilter}} for more details). Use
+    \code{\link{supportedFilters}} to list all filters supported by an
+    \code{EnsDb} object.
 
-\code{supportedFilters} returns the names of all supported
-    filters for the \code{EnsDb} object.
+\code{supportedFilters} returns a \code{data.frame} with the
+    names of all filters and the corresponding field supported by the
+    \code{EnsDb} object.
 
 \code{seqnames}: accessor for the sequence names of the
 \code{GRanges} object within a \code{GRangesFilter}
@@ -205,6 +218,14 @@ For \code{supportedFilters}: the names of the supported filter
 In addition, the following filters are defined by \code{ensembldb}:
 \describe{
 
+\item{TxSupportLevelFilter}{
+    allows filtering results by transcript support level.
+    Support levels for transcripts are defined by Ensembl based on the
+    available evidence for a transcript, with 1 being the highest evidence
+    grade and 5 the lowest. This filter is only supported on
+    \code{EnsDb} databases with a db schema version of 2.1 or higher.
+}
+
 \item{UniprotDbFilter}{
     allows to filter results based on the specified Uniprot database name(s).
 }
@@ -344,6 +365,9 @@ if (hasProteinData(edb)) {
 
     \code{\link{genes}}, \code{\link{transcripts}}, \code{\link{exons}},
     \code{\link{listGenebiotypes}}, \code{\link{listTxbiotypes}}.
+
+    \code{\link{addFilter}} for globally adding filters to an \code{EnsDb}
+    object.
 }
 \author{
 Johannes Rainer
diff --git a/man/ProteinFunctionality.Rd b/man/ProteinFunctionality.Rd
index 03a7ed8..73aa2d6 100644
--- a/man/ProteinFunctionality.Rd
+++ b/man/ProteinFunctionality.Rd
@@ -30,9 +30,9 @@ be extracted from the database. Can be any column(s) listed by the
 
 \item{filter}{For \code{proteins}: A filter object extending
 \code{AnnotationFilter} or a list of such objects to select
-specific entries from the database. See \code{\link{Filter-classes}} for a
-documentation of available filters and use \code{\link{supportedFilters}} to
-get the full list of supported filters.}
+specific entries from the database. See \code{\link{Filter-classes}} for
+a documentation of available filters and use
+\code{\link{supportedFilters}} to get the full list of supported filters.}
 
 \item{order.by}{For \code{proteins}: a character vector specifying the
 column(s) by which the result should be ordered.}
@@ -47,41 +47,41 @@ the type of the returned object. Can be either \code{"DataFrame"},
 }
 \value{
 The \code{listProteinColumns} function returns a character vector
-with the column names containing protein annotations or throws an error
-if no such annotations are available.
+    with the column names containing protein annotations or throws an error
+    if no such annotations are available.
 
 The \code{proteins} method returns protein related annotations from
-an \code{\linkS4class{EnsDb}} object with its \code{return.type} argument
-allowing to define the type of the returned object. Note that if
-\code{return.type = "AAStringSet"} additional annotation columns are stored
-in a \code{DataFrame} that can be accessed with the \code{mcols} method on
-the returned object.
+    an \code{\linkS4class{EnsDb}} object with its \code{return.type} argument
+    allowing to define the type of the returned object. Note that if
+    \code{return.type = "AAStringSet"} additional annotation columns are
+    stored in a \code{DataFrame} that can be accessed with the \code{mcols}
+    method on the returned object.
 }
 \description{
 The \code{listProteinColumns} function allows to conveniently
-extract all database columns containing protein annotations from
-an \code{\linkS4class{EnsDb}} database.
+    extract all database columns containing protein annotations from
+    an \code{\linkS4class{EnsDb}} database.
 
 This help page provides information about most of the
-functionality related to protein annotations in \code{ensembldb}.
+    functionality related to protein annotations in \code{ensembldb}.
 
-The \code{proteins} method retrieves protein related annotations from
-an \code{\linkS4class{EnsDb}} database.
+    The \code{proteins} method retrieves protein related annotations from
+    an \code{\linkS4class{EnsDb}} database.
 
 The \code{listUniprotDbs} method lists all Uniprot database
-names in the \code{EnsDb}.
+    names in the \code{EnsDb}.
 
 The \code{listUniprotMappingTypes} method lists all methods
-that were used for the mapping of Uniprot IDs to Ensembl protein IDs.
+    that were used for the mapping of Uniprot IDs to Ensembl protein IDs.
 }
 \details{
 The \code{proteins} method performs the query starting from the
-\code{protein} tables and can hence return all annotations from the database
-that are related to proteins and transcripts encoding these proteins from
-the database. Since \code{proteins} does thus only query annotations for
-protein coding transcripts, the \code{\link{genes}} or
-\code{\link{transcripts}} methods have to be used to retrieve annotations
-for non-coding transcripts.
+    \code{protein} tables and can hence return all annotations from the
+    database that are related to proteins and transcripts encoding these
+    proteins from the database. Since \code{proteins} does thus only query
+    annotations for protein coding transcripts, the \code{\link{genes}} or
+    \code{\link{transcripts}} methods have to be used to retrieve annotations
+    for non-coding transcripts.
 }
 \examples{
 
diff --git a/man/convertFilter.Rd b/man/convertFilter.Rd
new file mode 100644
index 0000000..0fd007e
--- /dev/null
+++ b/man/convertFilter.Rd
@@ -0,0 +1,64 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/Methods-Filter.R
+\docType{methods}
+\name{convertFilter,AnnotationFilter,EnsDb-method}
+\alias{convertFilter,AnnotationFilter,EnsDb-method}
+\alias{convertFilter,AnnotationFilterList,EnsDb-method}
+\alias{convertFilter,AnnotationFilterList,EnsDb-method}
+\title{Convert an AnnotationFilter to a SQL WHERE condition for EnsDb}
+\usage{
+\S4method{convertFilter}{AnnotationFilter,EnsDb}(object, db,
+  with.tables = character())
+
+\S4method{convertFilter}{AnnotationFilterList,EnsDb}(object, db,
+  with.tables = character())
+}
+\arguments{
+\item{object}{\code{AnnotationFilter} or \code{AnnotationFilterList} objects (or
+objects extending these classes).}
+
+\item{db}{\code{EnsDb} object.}
+
+\item{with.tables}{optional \code{character} vector specifying the names of the
+database tables that are being queried.}
+}
+\value{
+A \code{character(1)} with the SQL where condition.
+}
+\description{
+\code{convertFilter} converts an \code{AnnotationFilter::AnnotationFilter}
+or \code{AnnotationFilter::AnnotationFilterList} to an SQL where condition
+for an \code{EnsDb} database.
+}
+\note{
+This function \emph{might} be used in direct SQL queries on the SQLite
+database underlying an \code{EnsDb}, but it is intended mainly to illustrate
+the use of \code{AnnotationFilter} objects in combination with SQL databases.
+This method is used internally to create the SQL calls to the database.
+}
+\examples{
+
+library(EnsDb.Hsapiens.v75)
+edb <- EnsDb.Hsapiens.v75
+
+## Define a filter
+flt <- AnnotationFilter(~ genename == "BCL2")
+
+## Use the method from the AnnotationFilter package:
+convertFilter(flt)
+
+## Create a combination of filters
+flt_list <- AnnotationFilter(~ genename \%in\% c("BCL2", "BCL2L11") &
+    tx_biotype == "protein_coding")
+flt_list
+
+convertFilter(flt_list)
+
+## Use the filters in the context of an EnsDb database:
+convertFilter(flt, edb)
+
+convertFilter(flt_list, edb)
+}
+\author{
+Johannes Rainer
+}
diff --git a/man/global-filters.Rd b/man/global-filters.Rd
new file mode 100644
index 0000000..cef2f62
--- /dev/null
+++ b/man/global-filters.Rd
@@ -0,0 +1,94 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/Methods.R, R/functions-EnsDb.R
+\docType{methods}
+\name{addFilter,EnsDb-method}
+\alias{addFilter,EnsDb-method}
+\alias{addFilter}
+\alias{dropFilter,EnsDb-method}
+\alias{dropFilter}
+\alias{activeFilter,EnsDb-method}
+\alias{activeFilter}
+\alias{filter}
+\title{Globally add filters to an EnsDb database}
+\usage{
+\S4method{addFilter}{EnsDb}(x, filter = AnnotationFilterList())
+
+\S4method{dropFilter}{EnsDb}(x)
+
+\S4method{activeFilter}{EnsDb}(x)
+
+filter(x, filter = AnnotationFilterList())
+}
+\arguments{
+\item{x}{The \code{\linkS4class{EnsDb}} object to which the filter should be
+added.}
+
+\item{filter}{The filter as an
+\code{\link[AnnotationFilter]{AnnotationFilter}},
+\code{\link[AnnotationFilter]{AnnotationFilterList}} or filter
+expression. See}
+}
+\value{
+\code{addFilter} and \code{filter} return an \code{EnsDb} object
+    with the specified filter added.
+
+    \code{activeFilter} returns an
+    \code{\link[AnnotationFilter]{AnnotationFilterList}} object being the
+    active global filter or \code{NA} if no filter was added.
+
+    \code{dropFilter} returns an \code{EnsDb} object with any global
+    filters removed.
+}
+\description{
+These methods allow setting, deleting or showing globally defined
+    filters on an \code{\linkS4class{EnsDb}} object.
+
+    \code{addFilter}: adds an annotation filter to the \code{EnsDb} object.
+
+\code{dropFilter} deletes all globally set filters from the
+    \code{EnsDb} object.
+
+\code{activeFilter} returns the globally set filter from an
+    \code{EnsDb} object.
+
+\code{filter} filters an \code{EnsDb} object. \code{filter} is
+    an alias for the \code{addFilter} function.
+}
+\details{
+Adding a filter to an \code{EnsDb} object causes this filter to be
+    permanently active. The filter will be used for all queries to the
+    database and is added to all additional filters passed to the methods
+    such as \code{\link{genes}}.
+}
+\examples{
+library(EnsDb.Hsapiens.v75)
+edb <- EnsDb.Hsapiens.v75
+
+## Add a global SeqNameFilter to the database such that all subsequent
+## queries will be applied on the filtered database.
+edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+## Note: using the filter function is equivalent to a call to addFilter.
+
+## Each call returns now only features encoded on chromosome Y
+gns <- genes(edb_y)
+
+seqlevels(gns)
+
+## Get all lincRNA gene transcripts on chromosome Y
+transcripts(edb_y, filter = ~ gene_biotype == "lincRNA")
+
+## Get the currently active global filter:
+activeFilter(edb_y)
+
+## Delete this filter again.
+edb_y <- dropFilter(edb_y)
+
+activeFilter(edb_y)
+}
+\seealso{
+\code{\link{Filter-classes}} for a list of all supported filters.
+}
+\author{
+Johannes Rainer
+}
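
Note on global filters: the details section above states that a global filter is used for all queries and is added to any filter passed to methods such as genes. A rough base R sketch of what that means at the level of SQL WHERE conditions is given below. The function combine_conditions is hypothetical and does not reflect the internal implementation of ensembldb; the condition strings and the parenthesised "and" format are borrowed from the convertFilter test expectations in this commit.

```r
## Hypothetical illustration, not ensembldb code: join SQL conditions
## with "and", the way a global filter narrows every later query.
combine_conditions <- function(global, local = character()) {
    conds <- c(global, local)
    ## A single condition needs no parentheses.
    if (length(conds) == 1)
        return(conds)
    paste0("(", paste(conds, collapse = " and "), ")")
}

global <- "tx.tx_biotype = 'protein_coding'"
only_global <- combine_conditions(global)
with_local <- combine_conditions(global, "gene.gene_name = 'BCL2'")
print(only_global)
print(with_local)
```

Since the conditions are joined with "and", a per-call filter can only narrow the result set further; it cannot widen it beyond what the global filter permits.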
diff --git a/man/hasProteinData-EnsDb-method.Rd b/man/hasProteinData-EnsDb-method.Rd
index b728dc7..2a01c98 100644
--- a/man/hasProteinData-EnsDb-method.Rd
+++ b/man/hasProteinData-EnsDb-method.Rd
@@ -13,11 +13,11 @@
 }
 \value{
 A logical of length one, \code{TRUE} if protein annotations are
-available and \code{FALSE} otherwise.
+    available and \code{FALSE} otherwise.
 }
 \description{
 Determines whether the \code{\linkS4class{EnsDb}}
-provides protein annotation data.
+    provides protein annotation data.
 }
 \examples{
 library(EnsDb.Hsapiens.v75)
diff --git a/man/listEnsDbs.Rd b/man/listEnsDbs.Rd
index f2bfaab..5652dc3 100644
--- a/man/listEnsDbs.Rd
+++ b/man/listEnsDbs.Rd
@@ -22,20 +22,20 @@ running.}
 }
 \value{
 A \code{data.frame} listing the database names, organism name
-and Ensembl version of the EnsDb databases found on the server.
+    and Ensembl version of the EnsDb databases found on the server.
 }
 \description{
 The \code{listEnsDbs} function lists EnsDb databases in a
-MySQL server.
+    MySQL server.
 }
 \details{
 The use of this function requires that the \code{RMySQL} package
-is installed and that the user has either access to a MySQL server with
-already installed EnsDb databases, or write access to a MySQL server in
-which case EnsDb databases could be added with the \code{\link{useMySQL}}
-method. EnsDb databases follow the same naming conventions than the EnsDb
-packages, with the exception that the name is all lower case and that
-\code{"."} is replaced by \code{"_"}.
+    is installed and that the user has either access to a MySQL server with
+    already installed EnsDb databases, or write access to a MySQL server in
+    which case EnsDb databases could be added with the \code{\link{useMySQL}}
+    method. EnsDb databases follow the same naming conventions as the EnsDb
+    packages, with the exception that the name is all lower case and that
+    \code{"."} is replaced by \code{"_"}.
 }
 \examples{
 \dontrun{
diff --git a/man/useMySQL-EnsDb-method.Rd b/man/useMySQL-EnsDb-method.Rd
index 977dbe0..7e30225 100644
--- a/man/useMySQL-EnsDb-method.Rd
+++ b/man/useMySQL-EnsDb-method.Rd
@@ -22,23 +22,23 @@ server runs.}
 }
 \value{
 A \code{\linkS4class{EnsDb}} object providing access to the
-data stored in the MySQL backend.
+     data stored in the MySQL backend.
 }
 \description{
 Change the SQL backend from \emph{SQLite} to \emph{MySQL}.
-When first called on an \code{\linkS4class{EnsDb}} object, the function
-tries to create and save all of the data into a MySQL database. All
-subsequent calls will connect to the already existing MySQL database.
+    When first called on an \code{\linkS4class{EnsDb}} object, the function
+    tries to create and save all of the data into a MySQL database. All
+    subsequent calls will connect to the already existing MySQL database.
 }
 \details{
 This functionality requires that the \code{RMySQL} package is
-installed and that the user has (write) access to a running MySQL server.
-If the corresponding database does already exist users without write access
-can use this functionality.
+    installed and that the user has (write) access to a running MySQL server.
+    If the corresponding database does already exist users without write
+    access can use this functionality.
 }
 \note{
 At present the function does not evaluate whether the versions
-between the SQLite and MySQL database differ.
+    between the SQLite and MySQL database differ.
 }
 \examples{
 ## Load the EnsDb database (SQLite backend).
diff --git a/tests/testthat/test_Classes.R b/tests/testthat/test_Classes.R
index 9c76303..5b22491 100644
--- a/tests/testthat/test_Classes.R
+++ b/tests/testthat/test_Classes.R
@@ -83,3 +83,9 @@ test_that("GRangesFilter works for EnsDb", {
     expect_equal(ensembldb:::ensDbQuery(grf2, edb), exp)
 })
 
+test_that("TxSupportLevelFilter works for EnsDb", {
+    fl <- TxSupportLevelFilter(3)
+    expect_true(is(fl, "TxSupportLevelFilter"))
+    fl <- AnnotationFilter(~ tx_support_level == 3)
+    expect_true(is(fl, "TxSupportLevelFilter"))
+})
diff --git a/tests/testthat/test_Methods-Filter.R b/tests/testthat/test_Methods-Filter.R
index 5ffb12a..2e49258 100644
--- a/tests/testthat/test_Methods-Filter.R
+++ b/tests/testthat/test_Methods-Filter.R
@@ -494,6 +494,7 @@ test_that("UniprotMappingTypeFilter works", {
     expect_equal(value(pf), "ABC")
     expect_equal(field(pf), "uniprot_mapping_type")
     expect_equal(ensembldb:::ensDbQuery(pf), "uniprot_mapping_type = 'ABC'")
+    expect_equal(unname(ensembldb:::ensDbColumn(pf)), "uniprot_mapping_type")
     if (hasProteinData(edb)) {
         expect_equal(ensembldb:::ensDbColumn(pf, edb),
                      "uniprot.uniprot_mapping_type")
@@ -513,3 +514,42 @@ test_that("UniprotMappingTypeFilter works", {
     expect_equal(ensembldb:::ensDbQuery(pf), "uniprot_mapping_type in ('A','B')")
     expect_error(UniprotMappingTypeFilter("B", condition = ">"))
 })
+
+test_that("TxSupportLevelFilter works", {
+    fl <- TxSupportLevelFilter(3)
+    expect_equal(value(fl), 3)
+    expect_equal(ensembldb:::ensDbQuery(fl), "tx_support_level = 3")
+    expect_equal(unname(ensembldb:::ensDbColumn(fl)), "tx_support_level")
+    supportsTsl <- any(supportedFilters(edb)$filter == "TxSupportLevelFilter")
+    if (supportsTsl) {
+        expect_equal(unname(ensembldb:::ensDbColumn(fl, edb)),
+                     "tx.tx_support_level")
+        expect_equal(unname(ensembldb:::ensDbColumn(fl, edb, with.tables = "tx")),
+                     "tx.tx_support_level")
+        expect_equal(unname(ensembldb:::ensDbQuery(fl, edb)),
+                     "tx.tx_support_level = 3")
+        expect_equal(unname(ensembldb:::ensDbQuery(fl, edb, with.tables = "tx")),
+                     "tx.tx_support_level = 3")
+        expect_error(ensembldb:::ensDbQuery(fl, edb, with.tables = "gene"))
+    } else {
+        expect_error(ensembldb:::ensDbColumn(fl, edb))
+        expect_error(ensembldb:::ensDbColumn(fl, edb, with.tables = "tx"))
+        expect_error(ensembldb:::ensDbQuery(fl, edb))
+        expect_error(ensembldb:::ensDbQuery(fl, edb, with.tables = "tx"))
+    }
+})
+
+test_that("convertFilter works", {
+    flt <- AnnotationFilter(~ genename == "BCL2")
+    expect_equal(convertFilter(flt), "genename == 'BCL2'")
+    expect_equal(convertFilter(flt, edb), "gene.gene_name = 'BCL2'")
+    
+    flt_list <- AnnotationFilter(~ genename %in% c("BCL2", "BCL2L11") &
+                                     tx_biotype == "protein_coding")
+    expect_equal(
+        convertFilter(flt_list),
+        "genename %in% c('BCL2', 'BCL2L11') & tx_biotype == 'protein_coding'")
+    expect_equal(
+        convertFilter(flt_list, edb),
+        "(gene.gene_name in ('BCL2','BCL2L11') and tx.tx_biotype = 'protein_coding')")
+})
diff --git a/tests/testthat/test_Methods.R b/tests/testthat/test_Methods.R
index d87f50f..36d62b9 100644
--- a/tests/testthat/test_Methods.R
+++ b/tests/testthat/test_Methods.R
@@ -539,14 +539,14 @@ test_that("filter columns are correctly added in methods", {
 test_that("supportedFilters works", {
     res <- ensembldb:::.supportedFilters(edb)
     if (!hasProteinData(edb))
-        expect_equal(length(res), 19)
+        expect_equal(nrow(res), 19)
     else 
-        expect_equal(length(res), 24)
+        expect_equal(nrow(res), 24)
     res <- supportedFilters(edb)
     if (!hasProteinData(edb))
-        expect_equal(length(res), 19)
+        expect_equal(nrow(res), 19)
     else 
-        expect_equal(length(res), 24)
+        expect_equal(nrow(res), 24)
 })
 
 ## Here we check if we fetch what we expect from the database.
@@ -891,3 +891,34 @@ test_that("exonsByOverlaps works", {
     Test3 <- exonsByOverlaps(edb, gr2, filter=SeqStrandFilter("-"))
     expect_equal(names(Test), names(Test3))
 })
+
+test_that("methods work with global filter", {
+    g_f <- SeqNameFilter(18)
+    edb_2 <- ensembldb:::.addFilter(edb, g_f)
+
+    ## gns
+    gns <- genes(edb_2)
+    expect_equal(gns, genes(edb, filter = g_f))
+    edb_2 <- ensembldb:::.addFilter(edb_2, TxBiotypeFilter("protein_coding"))
+
+    gns_2 <- genes(edb_2)
+    expect_true(all(seqlevels(gns_2) == "18"))
+    expect_true(length(gns) > length(gns_2))
+
+    ## Combine with additional filter:
+    gns_2 <- genes(edb_2, filter = SeqStrandFilter("+"))
+    expect_true(all(as.character(strand(gns_2)) == "+"))
+    gns <- genes(edb_2, filter = GenenameFilter("ZBTB16"))
+    expect_true(length(gns) == 0)
+    gns <- genes(edb_2, filter = GenenameFilter("BCL2"))
+    expect_true(all(gns$symbol == "BCL2"))
+
+    ## transcripts
+    txs <- transcripts(edb_2, filter = ~ genename == "BCL2")
+    expect_true(all(txs$genename == "BCL2"))
+    expect_true(all(txs$tx_biotype == "protein_coding"))
+
+    ## exonsBy
+    exs <- exonsBy(edb_2, "gene")
+})
+
diff --git a/tests/testthat/test_functions-EnsDb.R b/tests/testthat/test_functions-EnsDb.R
new file mode 100644
index 0000000..ffbf20f
--- /dev/null
+++ b/tests/testthat/test_functions-EnsDb.R
@@ -0,0 +1,59 @@
+test_that(".addFilter .dropFilter and .activeFilter work", {
+    gf <- GenenameFilter("BCL2")
+    ## .addFilter and .activeFilter
+    edb_2 <- ensembldb:::.addFilter(edb, filter = gf)
+    expect_equal(AnnotationFilterList(gf),
+                 ensembldb:::getProperty(edb_2, "FILTER"))
+    expect_equal(AnnotationFilterList(gf),
+                 ensembldb:::.activeFilter(edb_2))
+    edb_2 <- ensembldb:::.addFilter(edb_2, filter = gf)
+    expect_equal(AnnotationFilterList(AnnotationFilterList(gf),
+                                      AnnotationFilterList(gf)),
+                 ensembldb:::getProperty(edb_2, "FILTER"))
+    edb_2 <- ensembldb:::.addFilter(edb, filter = ~ tx_id == 3 &
+                                             tx_biotype == "protein_coding")
+    flts <- ensembldb:::.activeFilter(edb_2)
+    expect_equal(flts, ensembldb:::getProperty(edb_2, "FILTER"))
+    expect_equal(flts, AnnotationFilter(~ tx_id == 3 &
+                                            tx_biotype == "protein_coding"))
+    
+    ## Errors
+    expect_error(ensembldb:::.addFilter(edb, "blabla"))
+    expect_error(ensembldb:::filter(edb, "blabla"))
+    expect_error(ensembldb:::.addFilter(edb))
+    
+    ## .dropFilter
+    edb_2 <- ensembldb:::.dropFilter(edb_2)
+    expect_equal(ensembldb:::.activeFilter(edb_2), NA)
+
+    ## Same but with the methods.
+    gf <- GenenameFilter("BCL2")
+    ## addFilter, filter and activeFilter
+    edb_2 <- addFilter(edb, filter = gf)
+    expect_equal(AnnotationFilterList(gf),
+                 ensembldb:::getProperty(edb_2, "FILTER"))
+    edb_2 <- filter(edb, filter = gf)
+    expect_equal(AnnotationFilterList(gf),
+                 ensembldb:::getProperty(edb_2, "FILTER"))
+    expect_equal(AnnotationFilterList(gf),
+                 activeFilter(edb_2))
+    edb_2 <- addFilter(edb_2, filter = gf)
+    expect_equal(AnnotationFilterList(AnnotationFilterList(gf),
+                                      AnnotationFilterList(gf)),
+                 ensembldb:::getProperty(edb_2, "FILTER"))
+    edb_2 <- addFilter(edb, filter = ~ tx_id == 3 &
+                                tx_biotype == "protein_coding")
+    flts <- activeFilter(edb_2)
+    expect_equal(flts, ensembldb:::getProperty(edb_2, "FILTER"))
+    expect_equal(flts, AnnotationFilter(~ tx_id == 3 &
+                                            tx_biotype == "protein_coding"))
+    
+    ## Errors
+    expect_error(addFilter(edb, "blabla"))
+    expect_error(addFilter(edb))
+    
+    ## dropFilter
+    edb_2 <- dropFilter(edb_2)
+    expect_equal(activeFilter(edb_2), NA)
+})
+
diff --git a/tests/testthat/test_functions-Filter.R b/tests/testthat/test_functions-Filter.R
index 0e446f3..495b4e4 100644
--- a/tests/testthat/test_functions-Filter.R
+++ b/tests/testthat/test_functions-Filter.R
@@ -18,6 +18,10 @@ test_that(".fieldInEnsDb works", {
                  "uniprot_mapping_type")
     expect_equal(unname(ensembldb:::.fieldInEnsDb("prot_dom_id")),
                  "protein_domain_id")
+    expect_equal(unname(ensembldb:::.fieldInEnsDb("description")),
+                 "description")
+    expect_equal(unname(ensembldb:::.fieldInEnsDb("tx_support_level")),
+                 "tx_support_level")
     expect_error(ensembldb:::.fieldInEnsDb("aaa"))
 })
 
@@ -46,6 +50,8 @@ test_that(".conditionForEnsDb works", {
     expect_equal(ensembldb:::.conditionForEnsDb(fl), "<")
     fl <- GeneStartFilter(4, condition = "<=")
     expect_equal(ensembldb:::.conditionForEnsDb(fl), "<=")
+    fl <- TxSupportLevelFilter(4, condition = "<=")
+    expect_equal(ensembldb:::.conditionForEnsDb(fl), "<=")
 })
 
 test_that(".valueForEnsDb works", {
@@ -70,6 +76,8 @@ test_that(".queryForEnsDb works", {
     ## Tests for numeric filters
     fl <- GeneStartFilter(5, condition = "<=")
     expect_equal(ensembldb:::.queryForEnsDb(fl), "gene_seq_start <= 5")
+    fl <- TxSupportLevelFilter(5, condition = "<=")
+    expect_equal(ensembldb:::.queryForEnsDb(fl), "tx_support_level <= 5")
 })
 
 test_that(".queryForEnsDbWithTables works", {
@@ -94,6 +102,9 @@ test_that(".queryForEnsDbWithTables works", {
                  "gene.gene_id = 'b'")
     expect_equal(ensembldb:::.queryForEnsDbWithTables(fl, edb, c("tx", "gene")),
                  "tx.gene_id = 'b'")
+    fl <- GeneIdFilter("b", condition = "contains")
+    expect_equal(ensembldb:::.queryForEnsDbWithTables(fl, edb, c("tx", "gene")),
+                 "tx.gene_id like '%b%'")
     ## Entrez
     if (as.numeric(ensembldb:::dbSchemaVersion(edb)) > 1) {
         fl <- EntrezFilter("g")
@@ -115,6 +126,11 @@ test_that(".queryForEnsDbWithTables works", {
     expect_equal(ensembldb:::.queryForEnsDbWithTables(fl, edb),
                  "tx.tx_seq_start = 123")
     expect_error(ensembldb:::.queryForEnsDbWithTables(fl, edb, "gene"))
+    if (any(listColumns(edb) == "tx_support_level")) {
+        fl <- TxSupportLevelFilter(3)
+        expect_equal(ensembldb:::.queryForEnsDbWithTables(fl, edb),
+                     "tx.tx_support_level = 3")
+    }
 })
 
 test_that(".processFilterParam works", {
@@ -224,3 +240,39 @@ test_that(".AnnottionFilterClassNames works", {
                  c("GeneStartFilter", "SeqStrandFilter", "GenenameFilter",
                    "SeqNameFilter", "SymbolFilter"))
 })
+
+test_that(".anyIs works", {
+    sf <- SymbolFilter("d")
+    expect_true(ensembldb:::.anyIs(sf, "SymbolFilter"))
+    expect_false(ensembldb:::.anyIs(GenenameFilter(3), "SymbolFilter"))
+
+    flts <- AnnotationFilterList(sf, TxIdFilter("b"))
+    expect_true(any(ensembldb:::.anyIs(flts, "SymbolFilter")))
+    expect_false(any(ensembldb:::.anyIs(flts, "BLa")))
+
+    ## Additional nesting.
+    flts <- AnnotationFilterList(flts, GenenameFilter("2"))
+    expect_true(any(ensembldb:::.anyIs(flts, "SymbolFilter")))
+    expect_true(any(ensembldb:::.anyIs(flts, "GenenameFilter")))
+    expect_true(any(ensembldb:::.anyIs(flts, "TxIdFilter")))    
+})
+
+test_that(".fieldToClass works", {
+    expect_equal(ensembldb:::.fieldToClass("gene_id"), "GeneIdFilter")
+})
+
+test_that(".filterFields works", {
+    res <- ensembldb:::.filterFields(edb)
+    if (hasProteinData(edb)) {
+        expect_true(any(res == "uniprot"))
+    }
+    expect_true(!any(res == "g_ranges"))
+})
+
+test_that(".supportedFilters works", {
+    res <- ensembldb:::.supportedFilters(edb)
+    expect_true(class(res) == "data.frame")
+    expect_equal(res[res$filter == "GRangesFilter", "field"], as.character(NA))
+    expect_equal(res[res$filter == "ExonIdFilter", "field"], "exon_id")
+    expect_equal(res[res$filter == "GenenameFilter", "field"], "genename")
+})
diff --git a/tests/testthat/test_functions-utils.R b/tests/testthat/test_functions-utils.R
index 848c805..439a808 100644
--- a/tests/testthat/test_functions-utils.R
+++ b/tests/testthat/test_functions-utils.R
@@ -18,6 +18,28 @@ test_that("addFilterColumns works for AnnotationFilterList", {
     expect_equal(res, c("gene_biotype", "seq_name", "gene_name", "symbol"))
 })
 
+test_that("functions work for encapsulated AnnotationFilterLists", {
+    fl <- AnnotationFilterList(GenenameFilter("a"),
+                               AnnotationFilterList(TxIdFilter("b")))
+    res <- ensembldb:::.processFilterParam(fl, edb)
+    expect_equal(res, fl)
+    res <- ensembldb:::setFeatureInGRangesFilter(fl, "gene")
+    expect_equal(res, fl)
+    res <- ensembldb:::addFilterColumns("z", fl, edb)
+    expect_equal(res, c("z", "gene_name", "tx_id"))
+    res <- ensembldb:::getWhat(edb, filter = fl)
+    ## Check if content is the same
+    flts1 <- ~ genename == "BCL2" & tx_biotype == "protein_coding"
+    res1 <- transcripts(edb, filter = flts1)
+    flts2 <- AnnotationFilterList(
+        AnnotationFilterList(GenenameFilter("BCL2"),
+                             AnnotationFilterList(
+                                 TxBiotypeFilter("protein_coding"))
+                             ))
+    res2 <- transcripts(edb, filter = flts2)
+    expect_equal(res1, res2)
+})
+
 ## Here we want to test if we get always also the filter columns back.
 test_that("multiFilterReturnCols works also with symbolic filters", {
     cols <- ensembldb:::addFilterColumns(edb, cols = c("exon_id"),
diff --git a/vignettes/MySQL-backend.Rmd b/vignettes/MySQL-backend.Rmd
index de512cb..84db670 100644
--- a/vignettes/MySQL-backend.Rmd
+++ b/vignettes/MySQL-backend.Rmd
@@ -3,7 +3,7 @@ title: "Using a MySQL server backend"
 author: "Johannes Rainer"
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Using a MySQL server backend}
@@ -33,7 +33,7 @@ this would require access to a MySQL server.
 Installation of `EnsDb` databases in a MySQL server is straight forward - given
 that the user has write access to the server:
 
-```{r eval = FALSE}
+```{r  eval = FALSE }
 library(ensembldb)
 ## Load the EnsDb package that should be installed on the MySQL server
 library(EnsDb.Hsapiens.v75)
@@ -41,10 +41,11 @@ library(EnsDb.Hsapiens.v75)
 ## Call the useMySQL method providing the required credentials to create
 ## databases and inserting data on the MySQL server
 edb_mysql <- useMySQL(EnsDb.Hsapiens.v75, host = "localhost", user = "userwrite",
-		      pass = "userpass")
+                      pass = "userpass")
 
 ## Use this EnsDb object
 genes(edb_mysql)
+ 
 ```
 
 To use an `EnsDb` in a MySQL server without the need to install the corresponding
@@ -52,21 +53,22 @@ R-package, the connection to the database can be passed to the `EnsDb` construct
 function. With the resulting `EnsDb` object annotations can be retrieved from the
 MySQL database.
 
-```{r eval = FALSE}
+```{r  eval = FALSE }
 library(ensembldb)
 library(RMySQL)
 
 ## Connect to the MySQL database to list the databases.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-		   pass = "readonly")
+                   pass = "readonly")
 
 ## List the available databases
 listEnsDbs(dbcon)
 
 ## Connect to one of the databases and use that one.
 dbcon <- dbConnect(MySQL(), host = "localhost", user = "readonly",
-		   pass = "readonly", dbname = "ensdb_hsapiens_v75")
+                   pass = "readonly", dbname = "ensdb_hsapiens_v75")
 edb <- EnsDb(dbcon)
 edb
+ 
 ```
 
diff --git a/vignettes/MySQL-backend.org b/vignettes/MySQL-backend.org
index 9b0d164..d81991d 100644
--- a/vignettes/MySQL-backend.org
+++ b/vignettes/MySQL-backend.org
@@ -11,7 +11,7 @@ title: "Using a MySQL server backend"
 author: "Johannes Rainer"
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Using a MySQL server backend}
@@ -19,6 +19,7 @@ vignette: >
   %\VignetteEncoding{UTF-8}
   %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle}
 ---
+
 #+END_EXPORT
 
 ** Introduction
@@ -52,6 +53,7 @@ that the user has write access to the server:
 
   ## Use this EnsDb object
   genes(edb_mysql)
+
 #+END_SRC
 
 To use an =EnsDb= in a MySQL server without the need to install the corresponding
@@ -75,5 +77,6 @@ MySQL database.
                      pass = "readonly", dbname = "ensdb_hsapiens_v75")
   edb <- EnsDb(dbcon)
   edb
+
 #+END_SRC
 
diff --git a/vignettes/ensembldb.Rmd b/vignettes/ensembldb.Rmd
index 7bbf10c..b636e21 100644
--- a/vignettes/ensembldb.Rmd
+++ b/vignettes/ensembldb.Rmd
@@ -4,13 +4,13 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Generating an using Ensembl based annotation packages}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
-  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz}
+  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz,magrittr}
 ---
 
 
@@ -25,7 +25,7 @@ database, the `ensembldb` package provides also a filter framework allowing to
 retrieve annotations for specific entries like genes encoded on a chromosome
 region or transcript models of lincRNA genes. From version 1.7 on, `EnsDb`
 databases created by the `ensembldb` package contain also protein annotation data
-(see Section [11](#org35014ed) for the database layout and an overview of
+(see Section [11](#org5bd9a97) for the database layout and an overview of
 available attributes/columns). For more information on the use of the protein
 annotations refer to the *proteins* vignette.
 
@@ -42,7 +42,7 @@ In the example below we load an Ensembl based annotation package for Homo
 sapiens, Ensembl version 75. The `EnsDb` object providing access to the underlying
 SQLite database is bound to the variable name `EnsDb.Hsapiens.v75`.
 
-```{r load-libs, warning=FALSE, message=FALSE}
+```{r  load-libs, warning=FALSE, message=FALSE }
 library(EnsDb.Hsapiens.v75)
 
 ## Making a "short cut"
@@ -52,13 +52,15 @@ edb
 
 ## For what organism was the database generated?
 organism(edb)
+ 
 ```
 
-```{r no-network, echo = FALSE, results = "hide"}
+```{r  no-network, echo = FALSE, results = "hide" }
 ## Disable code chunks that require network connection - conditionally
 ## disable this on Windows only. This is to avoid TIMEOUT errors on the
 ## Bioconductor Windows build maching (issue #47).
 use_network <- FALSE
+ 
 ```
 
 
@@ -68,13 +70,19 @@ One of the strengths of the `ensembldb` package and the related `EnsDb` database
 its implementation of a filter framework that enables to efficiently extract
 data sub-sets from the databases. The `ensembldb` package supports most of the
 filters defined in the `AnnotationFilter` Bioconductor package and defines some
-additional filters specific to the data stored in `EnsDb` databases. The
-`supportedFilters` method can be used to get an overview over all supported filter
-classes, each of them (except the `GRangesFilter`) working on a single
+additional filters specific to the data stored in `EnsDb` databases. Filters can
+be passed directly to all methods extracting data from an `EnsDb` (such as `genes`,
+`transcripts` or `exons`). Alternatively, the `addFilter` or `filter` functions
+can add a filter directly to an `EnsDb`; this filter is then used in all
+queries on that object.
+
+The `supportedFilters` method can be used to get an overview of all supported
+filter classes, each of them (except the `GRangesFilter`) working on a single
 column/field in the database.
 
-```{r filters}
+```{r  filters }
 supportedFilters(edb)
+ 
 ```
 
 These filters can be divided into 3 main filter types:
@@ -141,6 +149,11 @@ filters are also available:
 These can however only be used on `EnsDb` databases that provide protein
 annotations, i.e. for which a call to `hasProteinData` returns `TRUE`.
 
+`EnsDb` databases for more recent Ensembl versions (starting from Ensembl 87)
+also provide evidence levels for individual transcripts in the `tx_support_level`
+database column. Such databases also support a `TxSupportLevelFilter` filter to
+use this column for filtering.
+
 A simple use case for the filter framework would be to get all transcripts for
 the gene *BCL2L11*. To this end we specify a `GenenameFilter` with the value
 *BCL2L11*. As a result we get a `GRanges` object with `start`, `end`, `strand` and `seqname`
@@ -150,7 +163,7 @@ columns. Alternatively, by setting `return.type` to "DataFrame", or "data.frame"
 the method would return a `DataFrame` or `data.frame` object instead of the default
 `GRanges`.
 
-```{r transcripts}
+```{r  transcripts }
 Tx <- transcripts(edb, filter = list(GenenameFilter("BCL2L11")))
 
 Tx
@@ -160,6 +173,7 @@ head(start(Tx))
 
 ## or extract the biotype with
 head(Tx$tx_biotype)
+ 
 ```
 
 The parameter `columns` of the extractor methods (such as `exons`, `genes` or
@@ -175,26 +189,60 @@ was used, the column `gene_name` is also returned). Setting
 specified by the `columns` parameter are retrieved.
 
 Instead of passing a filter *object* to the method it is also possible to provide
-a filter *expression* written as a `formula`.
+a filter *expression* written as a `formula`. The `formula` has to be written in the
+form `~ <field> <condition> <value>` with `<field>` being the field (database
+column) in the database, `<condition>` the condition for the filter object and
+`<value>` its value. Use the `supportedFilters` method to get the field names
+corresponding to each filter class.
 
-```{r transcripts-filter-expression}
+```{r  transcripts-filter-expression }
 ## Use a filter expression to perform the filtering.
 transcripts(edb, filter = ~ genename == "ZBTB16")
+ 
 ```
 
 Filter expression have to be written as a formula (i.e. starting with a `~`) in
 the form *column name* followed by the logical condition.
 
+Alternatively, `EnsDb` objects can be filtered directly using the `filter`
+function. In the example below we use the `filter` function to filter the `EnsDb`
+object and pass that filtered database to the `transcripts` method using the `%>%`
+operator from the `magrittr` package.
+
+```{r  transcripts-filter }
+library(magrittr)
+
+filter(edb, ~ symbol == "BCL2" & tx_biotype != "protein_coding") %>% transcripts
+ 
+```
+
+Adding a filter to an `EnsDb` enables this filter (globally) on all subsequent
+queries on that object. We could thus filter an `EnsDb` to (virtually) contain
+only features encoded on chromosome Y.
+
+```{r  filter-Y }
+edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+## All subsequent queries on that EnsDb will only return features encoded on
+## chromosome Y
+genes(edb_y)
+
+## Get all lincRNAs on chromosome Y
+genes(edb_y, filter = ~ gene_biotype == "lincRNA")
+ 
+```
+
 To get an overview of database tables and available columns the function
 `listTables` can be used. The method `listColumns` on the other hand lists columns
 for the specified database table.
 
-```{r list-columns}
+```{r  list-columns }
 ## list all database tables along with their columns
 listTables(edb)
 
 ## list columns from a specific table
 listColumns(edb, "tx")
+ 
 ```
 
 Thus, we could retrieve all transcripts of the biotype *nonsense\_mediated\_decay*
@@ -204,22 +252,24 @@ the name of the gene for each transcript. Note that we are changing here the
 `return.type` to `DataFrame`, so the method will return a `DataFrame` with the
 results instead of the default `GRanges`.
 
-```{r transcripts-example2}
+```{r  transcripts-example2 }
 Tx <- transcripts(edb,
-		  columns = c(listColumns(edb , "tx"), "gene_name"),
-		  filter = TxBiotypeFilter("nonsense_mediated_decay"),
-		  return.type = "DataFrame")
+                  columns = c(listColumns(edb , "tx"), "gene_name"),
+                  filter = TxBiotypeFilter("nonsense_mediated_decay"),
+                  return.type = "DataFrame")
 nrow(Tx)
 Tx
+ 
 ```
 
 For protein coding transcripts, we can also specifically extract their coding
 region. In the example below we extract the CDS for all transcripts encoded on
 chromosome Y.
 
-```{r cdsBy}
+```{r  cdsBy }
 yCds <- cdsBy(edb, filter = SeqNameFilter("Y"))
 yCds
+ 
 ```
 
 Using a `GRangesFilter` we can retrieve all features from the database that are
@@ -228,10 +278,10 @@ below we query all genes that are partially overlapping with a small region on
 chromosome 11. The filter restricts to all genes for which either an exon or an
 intron is partially overlapping with the region.
 
-```{r genes-GRangesFilter}
+```{r  genes-GRangesFilter }
 ## Define the filter
 grf <- GRangesFilter(GRanges("11", ranges = IRanges(114000000, 114000050),
-			     strand = "+"), type = "any")
+                             strand = "+"), type = "any")
 
 ## Query genes:
 gn <- genes(edb, filter = grf)
@@ -239,9 +289,10 @@ gn
 
 ## Next we retrieve all transcripts for that gene so that we can plot them.
 txs <- transcripts(edb, filter = GenenameFilter(gn$gene_name))
+ 
 ```
 
-```{r tx-for-zbtb16, message=FALSE, fig.align='center', fig.width=7.5, fig.height=5}
+```{r  tx-for-zbtb16, message=FALSE, fig.align='center', fig.width=7.5, fig.height=5 }
 plot(3, 3, pch = NA, xlim = c(start(gn), end(gn)), ylim = c(0, length(txs)),
      yaxt = "n", ylab = "")
 ## Highlight the GRangesFilter region
@@ -250,9 +301,10 @@ rect(xleft = start(grf), xright = end(grf), ybottom = 0, ytop = length(txs),
 for(i in 1:length(txs)) {
     current <- txs[i]
     rect(xleft = start(current), xright = end(current), ybottom = i-0.975,
-	 ytop = i-0.125, border = "grey")
+         ytop = i-0.125, border = "grey")
     text(start(current), y = i-0.5, pos = 4, cex = 0.75, labels = current$tx_id)
 }
+ 
 ```
 
 As we can see, 4 transcripts of the gene ZBTB16 are also overlapping the
@@ -260,8 +312,9 @@ region. Below we fetch these 4 transcripts. Note, that a call to `exons` will
 not return any features from the database, as no exon is overlapping with the
 region.
 
-```{r transcripts-GRangesFilter}
+```{r  transcripts-GRangesFilter }
 transcripts(edb, filter = grf)
+ 
 ```
 
 The `GRangesFilter` supports also `GRanges` defining multiple regions and a
@@ -275,20 +328,21 @@ to further fine-tune the query.
 The functions `listGenebiotypes` and `listTxbiotypes` can be used to get an overview
 of allowed/available gene and transcript biotype
 
-```{r biotypes}
+```{r  biotypes }
 ## Get all gene biotypes from the database. The GeneBiotypeFilter
 ## allows to filter on these values.
 listGenebiotypes(edb)
 
 ## Get all transcript biotypes from the database.
 listTxbiotypes(edb)
+ 
 ```
 
 Data can be fetched in an analogous way using the `exons` and `genes`
 methods. In the example below we retrieve `gene_name`, `entrezid` and the
 `gene_biotype` of all genes in the database which names start with "BCL2".
 
-```{r genes-BCL2}
+```{r  genes-BCL2 }
 ## We're going to fetch all genes which names start with BCL. To this end
 ## we define a GenenameFilter with partial matching, i.e. condition "like"
 ## and a % for any character/string.
@@ -298,6 +352,7 @@ BCLs <- genes(edb,
 	      return.type = "DataFrame")
 nrow(BCLs)
 BCLs
+ 
 ```
 
 Sometimes it might be useful to know the length of genes or transcripts
@@ -308,17 +363,18 @@ these chromosomes. For the first query we combine two `AnnotationFilter` objects
 using an `AnnotationFilterList` object, in the second we define the query using a
 filter expression.
 
-```{r example-AnnotationFilterList}
+```{r  example-AnnotationFilterList }
 ## determine the average length of snRNA, snoRNA and rRNA genes encoded on
 ## chromosomes X and Y.
 mean(lengthOf(edb, of = "tx", filter = AnnotationFilterList(
-				  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
-				  SeqNameFilter(c("X", "Y")))))
+                                  GeneBiotypeFilter(c("snRNA", "snoRNA", "rRNA")),
+                                  SeqNameFilter(c("X", "Y")))))
 
 ## determine the average length of protein coding genes encoded on the same
 ## chromosomes.
 mean(lengthOf(edb, of = "tx", filter = ~ gene_biotype == "protein_coding" &
-				  seq_name %in% c("X", "Y")))
+                                  seq_name %in% c("X", "Y")))
+ 
 ```
 
 Not unexpectedly, transcripts of protein coding genes are longer than those of
@@ -327,12 +383,13 @@ snRNA, snoRNA or rRNA genes.
 At last we extract the first two exons of each transcript model from the
 database.
 
-```{r example-first-two-exons}
+```{r  example-first-two-exons }
 ## Extract all exons 1 and (if present) 2 for all genes encoded on the
 ## Y chromosome
 exons(edb, columns = c("tx_id", "exon_idx"),
       filter = list(SeqNameFilter("Y"),
-		    ExonRankFilter(3, condition = "<")))
+                    ExonRankFilter(3, condition = "<")))
+ 
 ```
 
 
@@ -352,9 +409,10 @@ CDS.
 A simple use case is to retrieve all genes encoded on chromosomes X and Y from
 the database.
 
-```{r transcriptsBy-X-Y}
+```{r  transcriptsBy-X-Y }
 TxByGns <- transcriptsBy(edb, by = "gene", filter = SeqNameFilter(c("X", "Y")))
 TxByGns
+ 
 ```
 
 Since Ensembl contains also definitions of genes that are on chromosome variants
@@ -367,12 +425,13 @@ restrict to Ensembl genes only, as also *LRG* (Locus Reference Genomic)
 genes<sup><a id="fnr.2" class="footref" href="#fn.2">2</a></sup> are defined in the database, which are partially redundant with
 Ensembl genes.
 
-```{r exonsBy-RNAseq, message = FALSE, eval = FALSE}
+```{r  exonsBy-RNAseq, message = FALSE, eval = FALSE }
 ## will just get exons for all genes on chromosomes 1 to 22, X and Y.
 ## Note: want to get rid of the "LRG" genes!!!
 EnsGenes <- exonsBy(edb, by = "gene", filter = AnnotationFilterList(
-					  SeqNameFilter(c(1:22, "X", "Y")),
-					  GeneIdFilter("ENSG", "startsWith")))
+                                          SeqNameFilter(c(1:22, "X", "Y")),
+                                          GeneIdFilter("ENSG", "startsWith")))
+ 
 ```
 
 The code above returns a `GRangesList` that can be used directly as an input for
@@ -382,9 +441,10 @@ Alternatively, the above `GRangesList` can be transformed to a `data.frame` in
 *SAF* format that can be used as an input to the `featureCounts` function of the
 `Rsubread` package <sup><a id="fnr.4" class="footref" href="#fn.4">4</a></sup>.
 
-```{r toSAF-RNAseq, message = FALSE, eval=FALSE}
+```{r  toSAF-RNAseq, message = FALSE, eval=FALSE }
 ## Transforming the GRangesList into a data.frame in SAF format
 EnsGenes.SAF <- toSAF(EnsGenes)
+ 
 ```
 
 Note that the ID by which the `GRangesList` is split is used in the SAF
@@ -396,11 +456,12 @@ In addition, the `disjointExons` function (similar to the one defined in
 `GenomicFeatures`) can be used to generate a `GRanges` of non-overlapping exon
 parts which can be used in the `DEXSeq` package.
 
-```{r disjointExons, message = FALSE, eval=FALSE}
+```{r  disjointExons, message = FALSE, eval=FALSE }
 ## Create a GRanges of non-overlapping exon parts.
 DJE <- disjointExons(edb, filter = AnnotationFilterList(
 			      SeqNameFilter(c(1:22, "X", "Y")),
 			      GeneIdFilter("ENSG%", "startsWith")))
+ 
 ```
 
 
@@ -425,7 +486,7 @@ the package, subset to genes encoded on sequences available in the `FaFile` and
 extract all of their sequences. Note: these sequences represent the sequence
 between the chromosomal start and end coordinates of the gene.
 
-```{r transcript-sequence-AnnotationHub, message = FALSE, eval = FALSE}
+```{r  transcript-sequence-AnnotationHub, message = FALSE, eval = FALSE }
 library(EnsDb.Hsapiens.v75)
 library(Rsamtools)
 edb <- EnsDb.Hsapiens.v75
@@ -443,13 +504,14 @@ genes <- genes[seqnames(genes) %in% seqnames(seqinfo(Dna))]
 ## Get the gene sequences, i.e. the sequence including the sequence of
 ## all of the gene's exons and introns.
 geneSeqs <- getSeq(Dna, genes)
+ 
 ```
 
 To retrieve the (exonic) sequence of transcripts (i.e. without introns) we can
 use directly the `extractTranscriptSeqs` method defined in the `GenomicFeatures` on
 the `EnsDb` object, eventually using a filter to restrict the query.
 
-```{r transcript-sequence-extractTranscriptSeqs, message = FALSE, eval = FALSE}
+```{r  transcript-sequence-extractTranscriptSeqs, message = FALSE, eval = FALSE }
 ## get all exons of all transcripts encoded on chromosome Y
 yTx <- exonsBy(edb, filter = SeqNameFilter("Y"))
 
@@ -465,6 +527,7 @@ yTx <- extractTranscriptSeqs(Dna, edb, filter = SeqNameFilter("Y"))
 ## of all transcripts on the Y chromosome.
 cdsY <- cdsBy(edb, filter = SeqNameFilter("Y"))
 extractTranscriptSeqs(Dna, cdsY)
+ 
 ```
 
 Note: in the next section we describe how transcript sequences can be retrieved
@@ -485,7 +548,7 @@ UCSC, NCBI and Ensembl chromosome names for the *main* chromosomes).
 
 In the example below we change the seqnames style to UCSC.
 
-```{r seqlevelsStyle, message = FALSE}
+```{r  seqlevelsStyle, message = FALSE }
 ## Change the seqlevels style form Ensembl (default) to UCSC:
 seqlevelsStyle(edb) <- "UCSC"
 
@@ -493,6 +556,7 @@ seqlevelsStyle(edb) <- "UCSC"
 genesY <- genes(edb, filter = ~ seq_name == "chrY")
 ## The seqlevels of the returned GRanges are also in UCSC style
 seqlevels(genesY)
+ 
 ```
 
 Note that in most instances no mapping is available for sequences not
@@ -504,7 +568,7 @@ ones from Ensembl) are returned. With `ensembldb.seqnameNotFound` "MISSING" each
 time a seqname can not be found an error is thrown. For all other cases
 (e.g. `ensembldb.seqnameNotFound = NA`) the value of the option is returned.
 
-```{r seqlevelsStyle-2, message = FALSE}
+```{r  seqlevelsStyle-2, message = FALSE }
 seqlevelsStyle(edb) <- "UCSC"
 
 ## Getting the default option:
@@ -520,6 +584,7 @@ seqlevels(edb)[1:30]
 
 ## Resetting the option.
 options(ensembldb.seqnameNotFound = "ORIGINAL")
+ 
 ```
 
 Next we retrieve transcript sequences from genes encoded on chromosome Y using
@@ -528,7 +593,7 @@ the `BSGenome` package for the human genome from UCSC. The specified version
 while we changed the style of the seqnames to UCSC we did not change the naming
 of the genome release.
 
-```{r extractTranscriptSeqs-BSGenome, warning = FALSE, message = FALSE}
+```{r  extractTranscriptSeqs-BSGenome, warning = FALSE, message = FALSE }
 library(BSgenome.Hsapiens.UCSC.hg19)
 bsg <- BSgenome.Hsapiens.UCSC.hg19
 
@@ -546,14 +611,16 @@ yTxSeqs
 ## Extract just the CDS
 Test <- cdsBy(edb, "tx", filter = SeqNameFilter("chrY"))
 yTxCds <- extractTranscriptSeqs(bsg, cdsBy(edb, "tx",
-					   filter = SeqNameFilter("chrY")))
+                                           filter = SeqNameFilter("chrY")))
 yTxCds
+ 
 ```
 
 At last changing the seqname style to the default value `"Ensembl"`.
 
-```{r seqlevelsStyle-restore}
+```{r  seqlevelsStyle-restore }
 seqlevelsStyle(edb) <- "Ensembl"
+ 
 ```
 
 
@@ -584,7 +651,7 @@ not necessary if we just want to retrieve gene models from an `EnsDb` object, as
 the `ensembldb` package internally checks the `ucscChromosomeNames` option and,
 depending on that, maps Ensembl chromosome names to UCSC chromosome names.
 
-```{r gviz-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=2.3}
+```{r  gviz-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=2.3 }
 ## Loading the Gviz library
 library(Gviz)
 library(EnsDb.Hsapiens.v75)
@@ -593,7 +660,7 @@ edb <- EnsDb.Hsapiens.v75
 ## Retrieving a Gviz compatible GRanges object with all genes
 ## encoded on chromosome Y.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "Y",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 
@@ -604,6 +671,7 @@ options(ucscChromosomeNames = FALSE)
 plotTracks(list(gat, GeneRegionTrack(gr)))
 
 options(ucscChromosomeNames = TRUE)
+ 
 ```
 
 Above we had to change the option `ucscChromosomeNames` to `FALSE` in order to
@@ -612,55 +680,59 @@ change the `seqnamesStyle` of the `EnsDb` object to `UCSC`. Note that we have to
 use now also chromosome names in the *UCSC style* in the `SeqNameFilter`
 (i.e. "chrY" instead of `Y`).
 
-```{r message=FALSE}
+```{r  message=FALSE }
 seqlevelsStyle(edb) <- "UCSC"
 ## Retrieving the GRanges objects with seqnames corresponding to UCSC chromosome names.
 gr <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				start = 20400000, end = 21400000)
+                                start = 20400000, end = 21400000)
 seqnames(gr)
 ## Define a genome axis track
 gat <- GenomeAxisTrack()
 plotTracks(list(gat, GeneRegionTrack(gr)))
+ 
 ```
 
 We can also use the filters from the `ensembldb` package to further refine what
 transcripts are fetched, like in the example below, in which we create two
 different gene region tracks, one for protein coding genes and one for lincRNAs.
 
-```{r gviz-separate-tracks, message=FALSE, warning=FALSE, fig.align='center', fig.width=7.5, fig.height=2.25}
+```{r  gviz-separate-tracks, message=FALSE, warning=FALSE, fig.align='center', fig.width=7.5, fig.height=2.25 }
 protCod <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				     start = 20400000, end = 21400000,
-				     filter = GeneBiotypeFilter("protein_coding"))
+                                     start = 20400000, end = 21400000,
+                                     filter = GeneBiotypeFilter("protein_coding"))
 lincs <- getGeneRegionTrackForGviz(edb, chromosome = "chrY",
-				   start = 20400000, end = 21400000,
-				   filter = GeneBiotypeFilter("lincRNA"))
+                                   start = 20400000, end = 21400000,
+                                   filter = GeneBiotypeFilter("lincRNA"))
 
 plotTracks(list(gat, GeneRegionTrack(protCod, name = "protein coding"),
-		GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
+                GeneRegionTrack(lincs, name = "lincRNAs")), transcriptAnnotation = "symbol")
 
 ## At last we change the seqlevels style again to Ensembl
 seqlevelsStyle <- "Ensembl"
+ 
 ```
 
 Alternatively, we can also use `ggbio` for plotting. For `ggplot` we can directly
 pass the `EnsDb` object along with optional filters (or as in the example below a
 filter expression as a `formula`).
 
-```{r pplot-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4}
+```{r  pplot-plot, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4 }
 library(ggbio)
 
 ## Create a plot for all transcripts of the gene SKA2
 autoplot(edb, ~ genename == "SKA2")
+ 
 ```
 
 To plot the genomic region and plot genes from both strands we can use a
 `GRangesFilter`.
 
-```{r pplot-plot-2, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4}
+```{r  pplot-plot-2, message=FALSE, fig.align='center', fig.width=7.5, fig.height=4 }
 ## Get the chromosomal region in which the gene is encoded
 ska2 <- genes(edb, filter = ~ genename == "SKA2")
 strand(ska2) <- "*"
 autoplot(edb, GRangesFilter(ska2), names.expr = "gene_name")
+ 
 ```
 
 
@@ -676,7 +748,7 @@ In the example below we first evaluate all the available columns and keytypes in
 the database and extract then the gene names for all genes encoded on chromosome
 X.
 
-```{r AnnotationDbi, message = FALSE}
+```{r  AnnotationDbi, message = FALSE }
 library(EnsDb.Hsapiens.v75)
 edb <- EnsDb.Hsapiens.v75
 
@@ -699,6 +771,7 @@ length(gids)
 ## Get all gene names for genes encoded on chromosome Y.
 gnames <- keys(edb, keytype = "GENENAME", filter = SeqNameFilter("Y"))
 head(gnames)
+ 
 ```
 
 In the next example we retrieve specific information from the database using the
@@ -707,22 +780,23 @@ In the next example we retrieve specific information from the database using the
 we employ the filtering system to perform a more fine-grained query to fetch
 only the protein coding transcripts for these genes.
 
-```{r select, message = FALSE, warning=FALSE}
+```{r  select, message = FALSE, warning=FALSE }
 ## Use the /standard/ way to fetch data.
 select(edb, keys = c("BCL2", "BCL2L11"), keytype = "GENENAME",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))
 
 ## Use the filtering system of ensembldb
 select(edb, keys = ~ genename %in% c("BCL2", "BCL2L11") &
-		tx_biotype == "protein_coding",
+                tx_biotype == "protein_coding",
        columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE"))
+ 
 ```
 
 Finally, we use the `mapIds` method to establish a mapping between ids and
 values. In the example below we fetch transcript ids for the two genes from the
 example above.
 
-```{r mapIds, message = FALSE}
+```{r  mapIds, message = FALSE }
 ## Use the default method, which just returns the first value for multi mappings.
 mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID", keytype = "GENENAME")
 
@@ -732,8 +806,9 @@ mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID", keytype = "GENENAME",
 
 ## And, just like before, we can use filters to map only to protein coding transcripts.
 mapIds(edb, keys = list(GenenameFilter(c("BCL2", "BCL2L11")),
-			TxBiotypeFilter("protein_coding")), column = "TXID",
+                        TxBiotypeFilter("protein_coding")), column = "TXID",
        multiVals = "list")
+ 
 ```
 
 Note that, if the filters are used, the ordering of the result does no longer
@@ -790,30 +865,33 @@ annotations from Ensembl version 86.
 Since Bioconductor version 3.5 `EnsDb` databases can also be retrieved directly
 from `AnnotationHub`.
 
-```{r AnnotationHub-query, message = FALSE, eval = use_network}
+```{r  AnnotationHub-query, message = FALSE, eval = use_network }
 library(AnnotationHub)
 ## Load the annotation resource.
 ah <- AnnotationHub()
 
 ## Query for all available EnsDb databases
 query(ah, "EnsDb")
+ 
 ```
 
 We can simply fetch one of the databases.
 
-```{r AnnotationHub-query-2, message = FALSE, eval = use_network}
+```{r  AnnotationHub-query-2, message = FALSE, eval = use_network }
 ahDb <- query(ah, pattern = c("Xiphophorus Maculatus", "EnsDb", 87))
 ## What have we got
 ahDb
+ 
 ```
 
 Fetch the `EnsDb` and use it.
 
-```{r AnnotationHub-fetch, message = FALSE, eval = FALSE}
+```{r  AnnotationHub-fetch, message = FALSE, eval = FALSE }
 ahEdb <- ahDb[[1]]
 
 ## retrieve all genes
 gns <- genes(ahEdb)
+ 
 ```
 
 We could even make an annotation package from this `EnsDb` object using the
@@ -835,7 +913,7 @@ the Ensembl core databases. The `makeEnsembldbPackage` function is then used to
 create an annotation package from this `EnsDb` containing all human genes for
 Ensembl version 75.
 
-```{r edb-from-ensembl, message = FALSE, eval = FALSE}
+```{r  edb-from-ensembl, message = FALSE, eval = FALSE }
 library(ensembldb)
 
 ## get all human gene/transcript/exon annotations from Ensembl (75)
@@ -850,8 +928,9 @@ DBFile <- makeEnsemblSQLiteFromTables()
 
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DBFile, version = "0.99.12",
-		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-		     author = "J Rainer")
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")
+ 
 ```
 
 The generated package can then be built using `R CMD build EnsDb.Hsapiens.v75`
@@ -887,7 +966,7 @@ then use the `getGenomeFaFile` method on the `EnsDb` to directly look up and
 retrieve the correct or best matching `FaFile` with the genomic DNA sequence. At
 last we retrieve the sequences of all exons using the `getSeq` method.
 
-```{r gtf-gff-edb, message = FALSE, eval = FALSE}
+```{r  gtf-gff-edb, message = FALSE, eval = FALSE }
 ## Load the AnnotationHub data.
 library(AnnotationHub)
 ah <- AnnotationHub()
@@ -914,13 +993,14 @@ exonSeq <- getSeq(Dna, exons)
 
 ## Alternatively, look up and retrieve the toplevel DNA sequence manually.
 Dna <- ah[["AH22042"]]
+ 
 ```
 
 In the example below we load a `GRanges` containing gene definitions for genes
 encoded on chromosome Y and generate an `EnsDb` SQLite database from that
 information.
 
-```{r EnsDb-from-Y-GRanges, message = FALSE, eval = use_network}
+```{r  EnsDb-from-Y-GRanges, message = FALSE, eval = use_network }
 ## Generate a sqlite database from a GRanges object specifying
 ## genes encoded on chromosome Y
 load(system.file("YGRanges.RData", package = "ensembldb"))
@@ -933,6 +1013,7 @@ DB <- ensDbFromGRanges(Y, path = tempdir(), version = 75,
 ## Load the database
 edb <- EnsDb(DB)
 edb
+ 
 ```
 
 Alternatively we can build the annotation database using the `ensDbFromGtf`
@@ -947,7 +1028,7 @@ length information automatically from Ensembl.
 
 Below we create the annotation from a gtf file that we fetch directly from Ensembl.
 
-```{r EnsDb-from-GTF, message = FALSE, eval = FALSE}
+```{r  EnsDb-from-GTF, message = FALSE, eval = FALSE }
 library(ensembldb)
 
 ## the GTF file can be downloaded from
@@ -962,27 +1043,22 @@ EDB <- EnsDb(DB)
 ## alternatively, build the annotation package
 ## and finally we can generate the package
 makeEnsembldbPackage(ensdb = DB, version = "0.99.12",
-		     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
-		     author = "J Rainer")
+                     maintainer = "Johannes Rainer <johannes.rainer at eurac.edu>",
+                     author = "J Rainer")
+ 
 ```
 
 
-# Database layout<a id="org35014ed"></a>
+# Database layout<a id="org5bd9a97"></a>
 
 The database consists of the following tables and attributes (the layout is also
-shown in Figure [159](#org6a42233)). Note that the protein-specific annotations
+shown in Figure [165](#orgfd622d5)). Note that the protein-specific annotations
 might not be available in all `EnsDB` databases (e.g. such ones created with
 `ensembldb` version < 1.7 or created from GTF or GFF files).
 
 -   **gene**: all gene specific annotations.
     -   `gene_id`: the Ensembl ID of the gene.
     -   `gene_name`: the name (symbol) of the gene.
-<<<<<<< variant A
-    -   `entrezid`: the NCBI Entrezgene ID(s) of the gene. Note that this can be a
-        `;` separated list of IDs for genes that are mapped to more than one
-        Entrezgene.
->>>>>>> variant B
-======= end
     -   `gene_biotype`: the biotype of the gene.
     -   `gene_seq_start`: the start coordinate of the gene on the sequence (usually
         a chromosome).
@@ -990,6 +1066,7 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
     -   `seq_name`: the name of the sequence (usually the chromosome name).
     -   `seq_strand`: the strand on which the gene is encoded.
     -   `seq_coord_system`: the coordinate system of the sequence.
+    -   `description`: the description of the gene.
 
 -   **entrezgene**: mapping of Ensembl genes to NCBI Entrezgene identifiers. Note that
     this mapping can be a one-to-many mapping.
@@ -1000,6 +1077,7 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
     is available in this database column, all methods to retrieve data from the
     database support also this column. The returned values are however the ID of
     the transcripts.
+    
     -   `tx_id`: the Ensembl transcript ID.
     -   `tx_biotype`: the biotype of the transcript.
     -   `tx_seq_start`: the start coordinate of the transcript.
@@ -1008,6 +1086,10 @@ might not be available in all `EnsDB` databases (e.g. such ones created with
         transcript (NULL for non-coding transcripts).
     -   `tx_cds_seq_end`: the end coordinate of the coding region of the transcript.
     -   `gene_id`: the gene to which the transcript belongs.
+    
+    `EnsDb` databases for more recent Ensembl releases have also a column
+    `tx_support_level` providing the evidence level for a transcript (1 high
+    evidence, 5 low evidence, NA no evidence calculated).
 
 -   **exon**: all exon related annotation.
     -   `exon_id`: the Ensembl exon ID.
diff --git a/vignettes/ensembldb.org b/vignettes/ensembldb.org
index b44a95e..8260bba 100644
--- a/vignettes/ensembldb.org
+++ b/vignettes/ensembldb.org
@@ -18,14 +18,15 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Generating an using Ensembl based annotation packages}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
-  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz}
+  %\VignetteDepends{ensembldb,EnsDb.Hsapiens.v75,BiocStyle,AnnotationHub,ggbio,Gviz,magrittr}
 ---
+
 #+END_EXPORT
 
 
@@ -108,9 +109,14 @@ One of the strengths of the =ensembldb= package and the related =EnsDb= database
 its implementation of a filter framework that enables to efficiently extract
 data sub-sets from the databases. The =ensembldb= package supports most of the
 filters defined in the =AnnotationFilter= Bioconductor package and defines some
-additional filters specific to the data stored in =EnsDb= databases. The
-=supportedFilters= method can be used to get an overview over all supported filter
-classes, each of them (except the =GRangesFilter=) working on a single
+additional filters specific to the data stored in =EnsDb= databases. Filters can
+be passed directly to all methods extracting data from an =EnsDb= (such as =genes=,
+=transcripts= or =exons=). Alternatively it is possible with the =addFilter= or =filter=
+functions to add a filter directly to an =EnsDb= which will then be used in all
+queries on that object.
+
+The =supportedFilters= method can be used to get an overview over all supported
+filter classes, each of them (except the =GRangesFilter=) working on a single
 column/field in the database.
 
 #+NAME: filters
@@ -182,6 +188,11 @@ filters are also available:
 These can however only be used on =EnsDb= databases that provide protein
 annotations, i.e. for which a call to =hasProteinData= returns =TRUE=.
 
+=EnsDb= databases for more recent Ensembl versions (starting from Ensembl 87)
+provide also evidence levels for individual transcripts in the =tx_support_level=
+database column. Such databases support also a =TxSupportLevelFilter= filter to
+use this column for filtering.
+
 A simple use case for the filter framework would be to get all transcripts for
 the gene /BCL2L11/. To this end we specify a =GenenameFilter= with the value
 /BCL2L11/. As a result we get a =GRanges= object with =start=, =end=, =strand= and =seqname=
@@ -218,7 +229,11 @@ was used, the column =gene_name= is also returned). Setting
 specified by the =columns= parameter are retrieved.
 
 Instead of passing a filter /object/ to the method it is also possible to provide
-a filter /expression/ written as a =formula=.
+a filter /expression/ written as a =formula=. The =formula= has to be written in the
+form =~ <field> <condition> <value>= with =<field>= being the field (database
+column) in the database, =<condition>= the condition for the filter object and
+=<value>= its value. Use the =supportedFilters= method to get the field names
+corresponding to each filter class.
 
 #+NAME: transcripts-filter-expression
 #+BEGIN_SRC R
@@ -230,6 +245,37 @@ a filter /expression/ written as a =formula=.
 Filter expressions have to be written as a formula (i.e. starting with a =~=) in
 the form /column name/ followed by the logical condition.
 
+Alternatively, =EnsDb= objects can be filtered directly using the =filter=
+function. In the example below we use the =filter= function to filter the =EnsDb=
+object and pass that filtered database to the =transcripts= method using the =%>%=
+from the =magrittr= package.
+
+#+NAME: transcripts-filter
+#+BEGIN_SRC R
+  library(magrittr)
+
+  filter(edb, ~ symbol == "BCL2" & tx_biotype != "protein_coding") %>% transcripts
+
+#+END_SRC
+
+Adding a filter to an =EnsDb= enables this filter (globally) on all subsequent
+queries on that object. We could thus filter an =EnsDb= to (virtually) contain
+only features encoded on chromosome Y.
+
+#+NAME: filter-Y
+#+BEGIN_SRC R
+  edb_y <- addFilter(edb, SeqNameFilter("Y"))
+
+  ## All subsequent filters on that EnsDb will only work on features encoded on
+  ## chromosome Y
+  genes(edb_y)
+
+  ## Get all lincRNAs on chromosome Y
+  genes(edb_y, filter = ~ gene_biotype == "lincRNA")
+
+#+END_SRC
+
+
 To get an overview of database tables and available columns the function
 =listTables= can be used. The method =listColumns= on the other hand lists columns
 for the specified database table.
@@ -1092,6 +1138,7 @@ might not be available in all =EnsDB= databases (e.g. such ones created with
   - =seq_name=: the name of the sequence (usually the chromosome name).
   - =seq_strand=: the strand on which the gene is encoded.
   - =seq_coord_system=: the coordinate system of the sequence.
+  - =description=: the description of the gene.
 
 + *entrezgene*: mapping of Ensembl genes to NCBI Entrezgene identifiers. Note that
   this mapping can be a one-to-many mapping.
@@ -1110,6 +1157,9 @@ might not be available in all =EnsDB= databases (e.g. such ones created with
     transcript (NULL for non-coding transcripts).
   - =tx_cds_seq_end=: the end coordinate of the coding region of the transcript.
   - =gene_id=: the gene to which the transcript belongs.
+  =EnsDb= databases for more recent Ensembl releases have also a column
+  =tx_support_level= providing the evidence level for a transcript (1 high
+  evidence, 5 low evidence, NA no evidence calculated).
 
 + *exon*: all exon related annotation.
   - =exon_id=: the Ensembl exon ID.
@@ -1586,8 +1636,10 @@ Now, their filters are created /dynamically/, the first part of the name being t
 attribute (field) name followed by /Filter/. How could I use these? Problem comes
 since my attributes are not unique, i.e. present in one table only.
 
-** TODO Implement a different type of filtering
+** DONE Implement a different type of filtering
+   CLOSED: [2017-06-16 Fri 09:27]
 
+   - State "DONE"       from "TODO"       [2017-06-16 Fri 09:27]
 Implement a filtering that does allow calls like
 
 #+BEGIN_EXAMPLE
@@ -1605,7 +1657,8 @@ The idea would be to add filter(s) as =AnnotationFilterList= object(s) to the
 even the =properties=, =getProperty=, =dropProperty= and =setProperty= methods (check
 /Methods.R/.
 
-
+Now, how should this function be called? =filter= would be intuitive, but is
+already taken. What about BioGenerics =Filter=?
 
 ** DONE Interpret R logical conditions
    CLOSED: [2017-03-22 Wed 06:58]
@@ -1940,8 +1993,10 @@ Base on =ensembldb=:
 + [X] =ggbio=:
 + [ ] =Pbase=:
 + [ ] =wiggleplotr=:
-** TODO entrezid in separate database table
+** DONE entrezid in separate database table
+   CLOSED: [2017-06-16 Fri 09:27]
 
+   - State "DONE"       from "TODO"       [2017-06-16 Fri 09:27]
 + [X] Perl script to save =entrezid= into a separate table =entrezgene=.
 + [X] Import script to create the additional table and indices (=gene_id= and
   =entrezid=).
diff --git a/vignettes/images/dblayout.png b/vignettes/images/dblayout.png
index 382df90..3ac4bee 100644
Binary files a/vignettes/images/dblayout.png and b/vignettes/images/dblayout.png differ
diff --git a/vignettes/proteins.Rmd b/vignettes/proteins.Rmd
index 7bf98ab..7d8ee47 100644
--- a/vignettes/proteins.Rmd
+++ b/vignettes/proteins.Rmd
@@ -4,7 +4,7 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Querying protein features}
@@ -46,34 +46,38 @@ databases created through the Ensembl Perl API contain protein annotation, while
 databases created using `ensDbFromAH`, `ensDbFromGff`, `ensDbFromGRanges` and
 `ensDbFromGtf` don't.
 
-```{r doeval, echo = FALSE, results = "hide"}
+```{r  doeval, echo = FALSE, results = "hide" }
 ## Globally switch off execution of code chunks
 evalMe <- FALSE
 haveProt <- FALSE
+ 
 ```
 
-```{r loadlib, message = FALSE, eval = evalMe}
+```{r  loadlib, message = FALSE, eval = evalMe }
 library(ensembldb)
 library(EnsDb.Hsapiens.v75)
 edb <- EnsDb.Hsapiens.v75
 ## Evaluate whether we have protein annotation available
 hasProteinData(edb)
+ 
 ```
 
 If protein annotation is available, the additional tables and columns are also
 listed by the `listTables` and `listColumns` methods.
 
-```{r listCols, message = FALSE, eval = evalMe}
+```{r  listCols, message = FALSE, eval = evalMe }
 listTables(edb)
+ 
 ```
 
 In the following sections we show examples how to 1) fetch protein annotations
 as additional columns to gene/transcript annotations, 2) fetch protein
 annotation data and 3) map proteins to the genome.
 
-```{r haveprot, echo = FALSE, results = "hide", eval = evalMe}
+```{r  haveprot, echo = FALSE, results = "hide", eval = evalMe }
 ## Use this to conditionally disable eval on following chunks
 haveProt <- hasProteinData(edb) & evalMe
+ 
 ```
 
 
@@ -83,11 +87,12 @@ Protein annotations for (protein coding) transcripts can be retrieved by simply
 adding the desired annotation columns to the `columns` parameter of the e.g. `genes`
 or `transcripts` methods.
 
-```{r a_transcripts, eval = haveProt}
+```{r  a_transcripts, eval = haveProt }
 ## Get also protein information for ZBTB16 transcripts
 txs <- transcripts(edb, filter = GenenameFilter("ZBTB16"),
-		   columns = c("protein_id", "uniprot_id", "tx_biotype"))
+                   columns = c("protein_id", "uniprot_id", "tx_biotype"))
 txs
+ 
 ```
 
 The gene ZBTB16 has protein coding and non-coding transcripts, thus, we get the
@@ -95,10 +100,11 @@ protein ID for the coding- and `NA` for the non-coding transcripts. Note also th
 we have a transcript targeted for nonsense mediated mRNA-decay with a protein ID
 associated with it, but no Uniprot ID.
 
-```{r a_transcripts_coding_noncoding, eval = haveProt}
+```{r  a_transcripts_coding_noncoding, eval = haveProt }
 ## Subset to transcripts with tx_biotype other than protein_coding.
 txs[txs$tx_biotype != "protein_coding", c("uniprot_id", "tx_biotype",
-					  "protein_id")]
+                                          "protein_id")]
+ 
 ```
 
 While the mapping from a protein coding transcript to a Ensembl protein ID
@@ -109,10 +115,11 @@ each Uniprot ID can be mapped to more than one `protein_id` (and hence
 fetching Uniprot related additional columns or even protein ID features, as in
 such cases a redundant list of transcripts is returned.
 
-```{r a_transcripts_coding, eval = haveProt}
+```{r  a_transcripts_coding, eval = haveProt }
 ## List the protein IDs and uniprot IDs for the coding transcripts
 mcols(txs[txs$tx_biotype == "protein_coding",
-	  c("tx_id", "protein_id", "uniprot_id")])
+          c("tx_id", "protein_id", "uniprot_id")])
+ 
 ```
 
 Some of the n:m mappings for Uniprot IDs can be resolved by restricting either
@@ -122,7 +129,7 @@ certain type of mapping method. The corresponding filters are the
 `uniprot_mapping_type` columns of the `uniprot` database table). In the example
 below we restrict the result to Uniprot IDs with the mapping type *DIRECT*.
 
-```{r a_transcripts_coding_up, eval = haveProt}
+```{r  a_transcripts_coding_up, eval = haveProt }
 ## List all uniprot mapping types in the database.
 listUniprotMappingTypes(edb)
 
@@ -131,8 +138,9 @@ listUniprotMappingTypes(edb)
 ## on "DIRECT" mapping methods.
 txs <- transcripts(edb, filter = list(GenenameFilter("ZBTB16"),
 				      UniprotMappingTypeFilter("DIRECT")),
-		   columns = c("protein_id", "uniprot_id", "uniprot_db"))
+                   columns = c("protein_id", "uniprot_id", "uniprot_db"))
 mcols(txs)
+ 
 ```
 
 For this example the use of the `UniprotMappingTypeFilter` resolved the multiple
@@ -151,13 +159,14 @@ protein data to filter the results. In the example below we fetch for example
 all genes from the database that have a certain protein domain in the protein
 encoded by any of its transcripts.
 
-```{r a_genes_protdomid_filter, eval = haveProt}
+```{r  a_genes_protdomid_filter, eval = haveProt }
 ## Get all genes that encode a transcript encoding for a protein that contains
 ## a certain protein domain.
 gns <- genes(edb, filter = ProtDomIdFilter("PS50097"))
 length(gns)
 
 sort(gns$gene_name)
+ 
 ```
 
 So, in total we got 152 genes with that protein domain. In addition to the
@@ -172,19 +181,21 @@ The `select`, `keys` and `mapIds` methods from the `AnnotationDbi` package can a
 used to query `EnsDb` objects for protein annotations. Supported columns and
 key types are returned by the `columns` and `keytypes` methods.
 
-```{r a_2_annotationdbi, message = FALSE, eval = haveProt}
+```{r  a_2_annotationdbi, message = FALSE, eval = haveProt }
 ## Show all columns that are provided by the database
 columns(edb)
 
 ## Show all key types/filters that are supported
 keytypes(edb)
+ 
 ```
 
 Below we fetch all Uniprot IDs annotated to the gene *ZBTB16*.
 
-```{r a_2_select, message = FALSE, eval = haveProt}
+```{r  a_2_select, message = FALSE, eval = haveProt }
 select(edb, keys = "ZBTB16", keytype = "GENENAME",
        columns = "UNIPROTID")
+ 
 ```
 
 This returns us all Uniprot IDs of all proteins encoded by the gene's
@@ -193,10 +204,11 @@ annotated to a protein, does not have an Uniprot ID assigned (thus `NA` is
 returned by the above call). As we see below, this transcript is targeted for
 nonsense mediated mRNA decay.
 
-```{r a_2_select_nmd, message = FALSE, eval = haveProt}
+```{r  a_2_select_nmd, message = FALSE, eval = haveProt }
 ## Call select, this time providing a GenenameFilter.
 select(edb, keys = GenenameFilter("ZBTB16"),
        columns = c("TXBIOTYPE", "UNIPROTID", "PROTEINID"))
+ 
 ```
 
 Note also that we passed this time a `GenenameFilter` with the `keys` parameter.
@@ -213,11 +225,12 @@ protein annotations becomes available.
 
 In the code chunk below we fetch all protein annotations for the gene *ZBTB16*.
 
-```{r b_proteins, message = FALSE, eval = haveProt}
+```{r  b_proteins, message = FALSE, eval = haveProt }
 ## Get all proteins and return them as an AAStringSet
 prts <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-		 return.type = "AAStringSet")
+                 return.type = "AAStringSet")
 prts
+ 
 ```
 
 Besides the amino acid sequence, the `prts` contains also additional annotations
@@ -225,8 +238,9 @@ that can be accessed with the `mcols` method (metadata columns). All additional
 columns provided with the parameter `columns` are also added to the `mcols`
 `DataFrame`.
 
-```{r b_proteins_mcols, message = FALSE, eval = haveProt}
+```{r  b_proteins_mcols, message = FALSE, eval = haveProt }
 mcols(prts)
+ 
 ```
 
 Note that the `proteins` method will retrieve only gene/transcript annotations of
@@ -237,24 +251,26 @@ previous section are not fetched.
 Querying in addition Uniprot identifiers or protein domain data will result at
 present in a redundant list of proteins as shown in the code block below.
 
-```{r b_proteins_prot_doms, message = FALSE, eval = haveProt}
+```{r  b_proteins_prot_doms, message = FALSE, eval = haveProt }
 ## Get also protein domain annotations in addition to the protein annotations.
 pd <- proteins(edb, filter = GenenameFilter("ZBTB16"),
-	       columns = c("tx_id", listColumns(edb, "protein_domain")),
-	       return.type = "AAStringSet")
+               columns = c("tx_id", listColumns(edb, "protein_domain")),
+               return.type = "AAStringSet")
 pd
+ 
 ```
 
 The result contains one row/element for each protein domain in each of the
 proteins. The number of protein domains per protein and the `mcols` are shown
 below.
 
-```{r b_proteins_prot_doms_2, message = FALSE, eval = haveProt}
+```{r  b_proteins_prot_doms_2, message = FALSE, eval = haveProt }
 ## The number of protein domains per protein:
 table(names(pd))
 
 ## The mcols
 mcols(pd)
+ 
 ```
 
 As we can see each protein can have several protein domains with the start and
diff --git a/vignettes/proteins.org b/vignettes/proteins.org
index bfb94b2..e42010d 100644
--- a/vignettes/proteins.org
+++ b/vignettes/proteins.org
@@ -12,7 +12,7 @@ author: "Johannes Rainer"
 graphics: yes
 package: ensembldb
 output:
-  BiocStyle::html_document2:
+  BiocStyle::html_document:
     toc_float: true
 vignette: >
   %\VignetteIndexEntry{Querying protein features}

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-bioc-ensembldb.git


