[med-svn] [r-cran-bold] 08/12: New upstream version 0.4.0
Andreas Tille
tille at debian.org
Sun Oct 1 20:38:06 UTC 2017
This is an automated email from the git hooks/post-receive script.
tille pushed a commit to branch master
in repository r-cran-bold.
commit 415c7746316ad6aeb3c4b321feb2faa41af90367
Author: Andreas Tille <tille at debian.org>
Date: Sun Oct 1 22:30:01 2017 +0200
New upstream version 0.4.0
---
DESCRIPTION | 25 ++
LICENSE | 2 +
MD5 | 39 +++
NAMESPACE | 40 +++
NEWS.md | 118 +++++++
R/bold-package.R | 44 +++
R/bold_filter.R | 44 +++
R/bold_identify.R | 93 ++++++
R/bold_identify_parents.R | 95 ++++++
R/bold_seq.R | 71 +++++
R/bold_seqspec.R | 87 +++++
R/bold_specimens.R | 57 ++++
R/bold_tax_id.R | 65 ++++
R/bold_tax_name.R | 53 ++++
R/bold_trace.R | 90 ++++++
R/zzz.R | 81 +++++
README.md | 171 ++++++++++
build/vignette.rds | Bin 0 -> 202 bytes
data/sequences.RData | Bin 0 -> 661 bytes
debian/README.test | 9 -
debian/changelog | 23 --
debian/compat | 1 -
debian/control | 35 --
debian/copyright | 32 --
debian/docs | 3 -
debian/rules | 8 -
debian/source/format | 1 -
debian/tests/control | 3 -
debian/tests/run-unit-test | 12 -
debian/watch | 3 -
inst/doc/bold_vignette.Rmd | 439 +++++++++++++++++++++++++
inst/doc/bold_vignette.html | 599 +++++++++++++++++++++++++++++++++++
man/bold-package.Rd | 38 +++
man/bold_filter.Rd | 41 +++
man/bold_identify.Rd | 73 +++++
man/bold_identify_parents.Rd | 57 ++++
man/bold_seq.Rd | 79 +++++
man/bold_seqspec.Rd | 87 +++++
man/bold_specimens.Rd | 69 ++++
man/bold_tax_id.Rd | 66 ++++
man/bold_tax_name.Rd | 56 ++++
man/bold_trace.Rd | 82 +++++
man/sequences.Rd | 16 +
tests/test-all.R | 2 +
tests/testthat/test-bold_identify.R | 36 +++
tests/testthat/test-bold_seq.R | 30 ++
tests/testthat/test-bold_seqspec.R | 32 ++
tests/testthat/test-bold_specimens.R | 35 ++
tests/testthat/test-bold_tax_id.R | 74 +++++
tests/testthat/test-bold_tax_name.R | 33 ++
vignettes/bold_vignette.Rmd | 439 +++++++++++++++++++++++++
51 files changed, 3558 insertions(+), 130 deletions(-)
diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..2d4c2d7
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,25 @@
+Package: bold
+Title: Interface to Bold Systems 'API'
+Description: A programmatic interface to the Web Service methods provided by
+ Bold Systems for genetic 'barcode' data. Functions include methods for
+ searching by sequences by taxonomic names, ids, collectors, and
+ institutions; as well as a function for searching for specimens, and
+ downloading trace files.
+Version: 0.4.0
+License: MIT + file LICENSE
+Authors@R: c(person("Scott", "Chamberlain", role = c("aut", "cre"),
+ email = "myrmecocystus at gmail.com"))
+URL: https://github.com/ropensci/bold
+BugReports: https://github.com/ropensci/bold/issues
+VignetteBuilder: knitr
+LazyData: yes
+Imports: methods, utils, stats, xml2, httr, stringr, assertthat,
+ jsonlite, reshape, plyr, data.table, tibble
+Suggests: sangerseqR, knitr, testthat, covr
+RoxygenNote: 5.0.1
+NeedsCompilation: no
+Packaged: 2017-01-06 21:09:43 UTC; sacmac
+Author: Scott Chamberlain [aut, cre]
+Maintainer: Scott Chamberlain <myrmecocystus at gmail.com>
+Repository: CRAN
+Date/Publication: 2017-01-06 23:15:12
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..37ee2c7
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,2 @@
+YEAR: 2017
+COPYRIGHT HOLDER: Scott Chamberlain
diff --git a/MD5 b/MD5
new file mode 100644
index 0000000..9eaf8b4
--- /dev/null
+++ b/MD5
@@ -0,0 +1,39 @@
+4048fee7c0975af6a9c0d7a337260afc *DESCRIPTION
+c5af52351472a750055a760a8924ce71 *LICENSE
+5165860a3b255395be13d642b179a371 *NAMESPACE
+2fa5c8bdf95858dd1837fae28ea17d55 *NEWS.md
+28f38a59fa3ed4f2876866e27009eea5 *R/bold-package.R
+d8c604b3cb5646d3eed200f6d682ac03 *R/bold_filter.R
+02ac86657dc05dbd92006534c96a4d49 *R/bold_identify.R
+760c8ca1347f2518652c26b4cb57a36a *R/bold_identify_parents.R
+6015c06eb7b9ab66487923d0a4710254 *R/bold_seq.R
+b06dba4827970b90e0040b96c709bd83 *R/bold_seqspec.R
+cf752c9a8deacc48dda939aa98ab2ac5 *R/bold_specimens.R
+9f5c8fab046120ccaf628b8637226e85 *R/bold_tax_id.R
+0d27cbf2100323c91fb0ce8c90475ac2 *R/bold_tax_name.R
+327a997d91052526d920f535af9925a7 *R/bold_trace.R
+1ba52b69b466fd914392a4a4944e49fe *R/zzz.R
+0361d41c9dd17420ea54bed2d78cd389 *README.md
+b550cac6a4409fc34211d69d6034e5f5 *build/vignette.rds
+bb64a460c31e2e6821ac53870b09c38e *data/sequences.RData
+419068dc2a58bb51f979137163fd9c80 *inst/doc/bold_vignette.Rmd
+da3109bbcd8a2ab41006e4858c774b38 *inst/doc/bold_vignette.html
+1b8a47f0bd42f789894f4262599a7c7e *man/bold-package.Rd
+2790dc0795eddbc5d639e2b8ec68fc9f *man/bold_filter.Rd
+4b9c6b3164d102f5cff0ffc15aa59a0d *man/bold_identify.Rd
+bfe5d520c74325e0d8e18fbfc9e87f26 *man/bold_identify_parents.Rd
+c7d7ef5d73a4f044b92076e4763ec2f3 *man/bold_seq.Rd
+0e416141c79e26d1bd1842fab32b57fb *man/bold_seqspec.Rd
+0c6f9dfaead4086a18453263dcb9cfea *man/bold_specimens.Rd
+aab27d7c0140b13495e2c220aa35c2f2 *man/bold_tax_id.Rd
+9af831a8dc74cb4849943a42b2d1410a *man/bold_tax_name.Rd
+ee7927818e12a5788a6f82cb4c97d414 *man/bold_trace.Rd
+b7a239e528f11e895a4cd2930c3b9d48 *man/sequences.Rd
+d9066883a8fecb16e80ceeef8323edac *tests/test-all.R
+74c1f2703d973538f8468f485e7e65e1 *tests/testthat/test-bold_identify.R
+f164f45ab5a15f3c43d114a3e675d652 *tests/testthat/test-bold_seq.R
+2fd3dff88e57a86026c346f8a59fbc2c *tests/testthat/test-bold_seqspec.R
+9515b48980e0f7653954527b7630b7a1 *tests/testthat/test-bold_specimens.R
+670498c2dfc92d737a1fd81be132f464 *tests/testthat/test-bold_tax_id.R
+2a174b1e7e11a116070defb4ac5fb4f4 *tests/testthat/test-bold_tax_name.R
+419068dc2a58bb51f979137163fd9c80 *vignettes/bold_vignette.Rmd
diff --git a/NAMESPACE b/NAMESPACE
new file mode 100644
index 0000000..1d84cf5
--- /dev/null
+++ b/NAMESPACE
@@ -0,0 +1,40 @@
+# Generated by roxygen2: do not edit by hand
+
+S3method(bold_identify_parents,data.frame)
+S3method(bold_identify_parents,default)
+S3method(bold_identify_parents,list)
+S3method(print,boldtrace)
+export(bold_filter)
+export(bold_identify)
+export(bold_identify_parents)
+export(bold_seq)
+export(bold_seqspec)
+export(bold_specimens)
+export(bold_tax_id)
+export(bold_tax_name)
+export(bold_trace)
+export(read_trace)
+importFrom(assertthat,assert_that)
+importFrom(httr,GET)
+importFrom(httr,build_url)
+importFrom(httr,content)
+importFrom(httr,parse_url)
+importFrom(httr,progress)
+importFrom(httr,stop_for_status)
+importFrom(httr,write_disk)
+importFrom(jsonlite,fromJSON)
+importFrom(methods,is)
+importFrom(plyr,rbind.fill)
+importFrom(reshape,sort_df)
+importFrom(stats,setNames)
+importFrom(stringr,str_replace)
+importFrom(stringr,str_replace_all)
+importFrom(stringr,str_split)
+importFrom(utils,read.delim)
+importFrom(utils,untar)
+importFrom(xml2,as_list)
+importFrom(xml2,read_xml)
+importFrom(xml2,xml_find_all)
+importFrom(xml2,xml_find_first)
+importFrom(xml2,xml_name)
+importFrom(xml2,xml_text)
diff --git a/NEWS.md b/NEWS.md
new file mode 100644
index 0000000..0d332f2
--- /dev/null
+++ b/NEWS.md
@@ -0,0 +1,118 @@
+bold 0.4.0
+==========
+
+### NEW FEATURES
+
+* New function `bold_identify_parents()` to add taxonomic information
+to the output of `bold_identify()`. We take the taxon names from `bold_identify`
+output, and use `bold_tax_name` to get the taxonomic ID, passing it to
+`bold_tax_id` to get the parent names, then attach those to the input data.
+There are two options given what you put for the `wide` parameter. If `TRUE`
+you get data.frames of the same dimensions with parent rank name and ID
+as new columns (for each name going up the hierarchy) - while if `FALSE`
+you get a long data.frame. thanks @dougwyu for inspiring this (#36)
+
+### MINOR IMPROVEMENTS
+
+* replace `xml2::xml_find_one` with `xml2::xml_find_first` (#33)
+* Fix description of `db` options in `bold_identify` man file -
+COX1 and COX1_SPECIES were switched (#37) thanks for pointing that out
+@dougwyu
+
+### BUG FIXES
+
+* Fix to `bold_tax_id` for when some elements returned from the BOLD
+API were empty/`NULL` (#32) thanks @fmichonneau !!
+
+
+bold 0.3.5
+==========
+
+### MINOR IMPROVEMENTS
+
+* Added more tests to the test suite (#28)
+
+### BUG FIXES
+
+* Fixed a bug in an internal data parser (#27)
+
+bold 0.3.4
+==========
+
+### NEW FEATURES
+
+* Added a code of conduct
+
+### MINOR IMPROVEMENTS
+
+* Switched to `xml2` from `XML` as the XML parser for this package (#26)
+* Fixes to `bold_trace()` to create dir and tar file when it doesn't
+already exist
+
+### BUG FIXES
+
+* Fixed odd problem where sometimes resulting data from HTTP request
+was garbled on `content(x, "text")`, so now using `rawToChar(content(x))`,
+which works (#24)
+
+
+bold 0.3.0
+==========
+
+### MINOR IMPROVEMENTS
+
+* Explicitly import non-base R functions (#22)
+* Better package level manual file
+
+
+bold 0.2.6
+==========
+
+### MINOR IMPROVEMENTS
+
+* `sangerseqR` package now in Suggests for reading trace files, and is only used in `bold_trace()`
+function.
+* General code tidying, reduction of code duplication.
+* `bold_trace()` gains two new parameters: `overwrite` to choose whether to overwrite an existing
+file of the same name or not, `progress` to show a progress bar for downloading or not.
+* `bold_trace()` gains a print method to show a tidy summary of the trace file downloaded.
+
+### BUG FIXES
+
+* Fixed similar bugs in `bold_tax_name()` (#17) and `bold_tax_id()` (#18) in which species that were missing from the BOLD database returned empty arrays but 200 status codes. Those are now parsed as failed attempts. Also fixes a problem in taxize's `bold_search()`, which uses these two functions.
+
+bold 0.2.0
+==========
+
+### NEW FEATURES
+
+* Package gains two new functions for working with the BOLD taxonomy APIs: `bold_tax_name()` and `bold_tax_id()`, which search for taxonomic data from BOLD using either names or BOLD identifiers, respectively. (#11)
+* Two new packages in Imports: `jsonlite` and `reshape`.
+
+### MINOR IMPROVEMENTS
+
+* Added new taxonomy API functions to the vignette (#14)
+* Added reference URLs to all function doc files to allow easy reference for the appropriate API docs.
+* `callopts` parameter changed to `...` throughout the package, so that passing on options to `httr::GET` is done via named parameters, e.g., `config=verbose()`. (#13)
+* Added examples of doing curl debugging throughout man pages.
+
+
+bold 0.1.2
+==========
+
+### MINOR IMPROVEMENTS
+
+* Improved the vignette (#8)
+* Added small function to print helpful message when user inputs no parameters or zero length parameter values.
+
+### BUG FIXES
+
+* Fixed some broken tests with the new `httr` (v0.4) (#9), and added a few more tests (#7)
+
+
+bold 0.1.0
+==========
+
+### NEW FEATURES
+
+* released to CRAN
diff --git a/R/bold-package.R b/R/bold-package.R
new file mode 100644
index 0000000..c59b1c6
--- /dev/null
+++ b/R/bold-package.R
@@ -0,0 +1,44 @@
+#' bold: A programmatic interface to the Barcode of Life data.
+#'
+#' @section About:
+#'
+#' This package gives you access to data from BOLD System \url{http://www.boldsystems.org/}
+#' via their API.
+#'
+#' @section Functions:
+#'
+#' \itemize{
+#' \item \code{\link{bold_specimens}} - Search for specimen data.
+#' \item \code{\link{bold_seq}} - Search for and retrieve sequences.
+#' \item \code{\link{bold_seqspec}} - Get sequence and specimen data together.
+#' \item \code{\link{bold_trace}} - Get trace files - saves to disk.
+#' \item \code{\link{read_trace}} - Read trace files into R.
+#' \item \code{\link{bold_tax_name}} - Get taxonomic names via input names.
+#' \item \code{\link{bold_tax_id}} - Get taxonomic names via BOLD identifiers.
+#' \item \code{\link{bold_identify}} - Search for match given a COI sequence.
+#' }
+#'
+#' Interestingly, they provide xml and tsv format data for the specimen data, while
+#' they provide fasta data format for the sequence data. So for the specimen data
+#' you can get back raw XML, or a data frame parsed from the tsv data, while for
+#' sequence data you get back a list (b/c sequences are quite long and would make
+#' a data frame unwieldy).
+#'
+#' @importFrom methods is
+#' @importFrom stats setNames
+#' @importFrom utils read.delim untar
+#' @importFrom xml2 read_xml xml_find_all xml_find_first xml_text xml_name as_list
+#' @docType package
+#' @name bold-package
+#' @aliases bold
+NULL
+
+#' List of 3 nucleotide sequences to use in examples for the
+#' \code{\link{bold_identify}} function
+#'
+#'
+#' @details Each sequence is a character string, of lengths 410, 600, and 696.
+#' @name sequences
+#' @docType data
+#' @keywords data
+NULL
diff --git a/R/bold_filter.R b/R/bold_filter.R
new file mode 100644
index 0000000..79bf744
--- /dev/null
+++ b/R/bold_filter.R
@@ -0,0 +1,44 @@
+#' Filter BOLD specimen + sequence data.
+#'
+#' @export
+#' @param x (data.frame) a data.frame, as returned from
+#' \code{\link{bold_seqspec}}. Note that some combinations of parameters
+#' in \code{\link{bold_seqspec}} don't return a data.frame. Stops with
+#' error message if this is not a data.frame. Required.
+#' @param by (character) the column by which to group. For example,
+#' if you want the longest sequence for each unique species name, then
+#' pass \strong{species_name}. If the column doesn't exist, error
+#' with message saying so. Required.
+#' @param how (character) one of "max" or "min", which get used as
+#' \code{which.max} or \code{which.min} to get the longest or shortest
+#' sequence, respectively. Note that we remove gap/alignment characters
+#' (\code{-})
+#' @return a tibble/data.frame
+#' @examples \dontrun{
+#' res <- bold_seqspec(taxon='Osmia')
+#' maxx <- bold_filter(res, by = "species_name")
+#' minn <- bold_filter(res, by = "species_name", how = "min")
+#'
+#' vapply(maxx$nucleotides, nchar, 1, USE.NAMES = FALSE)
+#' vapply(minn$nucleotides, nchar, 1, USE.NAMES = FALSE)
+#' }
+bold_filter <- function(x, by, how = "max") {
+ if (!inherits(x, "data.frame")) stop("'x' must be a data.frame",
+ call. = FALSE)
+ if (!how %in% c("min", "max")) stop("'how' must be one of 'min' or 'max'",
+ call. = FALSE)
+ if (!by %in% names(x)) stop(sprintf("'%s' is not a valid column in 'x'", by),
+ call. = FALSE)
+ xsp <- split(x, x[[by]])
+ tibble::as_data_frame(setrbind(lapply(xsp, function(z) {
+ lgts <- vapply(z$nucleotides, function(w) nchar(gsub("-", "", w)), 1,
+ USE.NAMES = FALSE)
+ z[match.fun(paste0("which.", how))(lgts), ]
+ })))
+}
+
+setrbind <- function(x) {
+ (xxx <- data.table::setDF(
+ data.table::rbindlist(x, fill = TRUE, use.names = TRUE))
+ )
+}
diff --git a/R/bold_identify.R b/R/bold_identify.R
new file mode 100644
index 0000000..ead2c98
--- /dev/null
+++ b/R/bold_identify.R
@@ -0,0 +1,93 @@
+#' Search for matches to sequences against the BOLD COI database.
+#'
+#' @export
+#'
+#' @param sequences (character) One or more nucleotide sequences to match
+#' against the database. Required.
+#' @param db (character) The database to match against, one of COX1,
+#' COX1_SPECIES, COX1_SPECIES_PUBLIC, or COX1_L640bp. See Details for
+#' more information.
+#' @param response (logical) If \code{TRUE}, return the response object
+#' from the Curl call, useful for debugging, and getting detailed info on
+#' the API call.
+#' @param ... Further args passed on to \code{\link[httr]{GET}}, main purpose
+#' being curl debugging
+#'
+#' @section db parameter options:
+#' \itemize{
+#' \item COX1 Every COI barcode record on BOLD with a minimum sequence
+#' length of 500bp (warning: unvalidated library and includes records without
+#' species level identification). This includes many species represented by
+#' only one or two specimens as well as all species with interim taxonomy. This
+#' search only returns a list of the nearest matches and does not provide a
+#' probability of placement to a taxon.
+#' \item COX1_SPECIES Every COI barcode record with a species level
+#' identification and a minimum sequence length of 500bp. This includes
+#' many species represented by only one or two specimens as well as all
+#' species with interim taxonomy.
+#' \item COX1_SPECIES_PUBLIC All published COI records from BOLD and GenBank
+#' with a minimum sequence length of 500bp. This library is a collection of
+#' records from the published projects section of BOLD.
+#' \item COX1_L640bp Subset of the Species library with a minimum sequence
+#' length of 640bp and containing both public and private records. This library
+#' is intended for short sequence identification as it provides maximum overlap
+#' with short reads from the barcode region of COI.
+#' }
+#'
+#' @section Named outputs:
+#' To maintain names on the output list of data make sure to pass in a
+#' named list to the \code{sequences} parameter. You can, for example,
+#' take a list of sequences, and use \code{\link{setNames}} to set names.
+#'
+#' @return A list of data.frames, one per input sequence, with details for each specimen matched.
+#' @references
+#' \url{http://www.boldsystems.org/index.php/resources/api?type=idengine}
+#' @seealso \code{\link{bold_identify_parents}}
+#' @examples \dontrun{
+#' seq <- sequences$seq1
+#' res <- bold_identify(sequences=seq)
+#' head(res[[1]])
+#' head(bold_identify(sequences=seq, db='COX1_SPECIES')[[1]])
+#' }
+
+bold_identify <- function(sequences, db = 'COX1', response=FALSE, ...) {
+ url <- 'http://boldsystems.org/index.php/Ids_xml'
+
+ foo <- function(a, b){
+ args <- bc(list(sequence = a, db = b))
+ out <- GET(url, query = args, ...)
+ stop_for_status(out)
+ assert_that(out$headers$`content-type` == 'text/xml')
+ if (response) {
+ out
+ } else {
+ tt <- content(out, "text", encoding = "UTF-8")
+ xml <- xml2::read_xml(tt)
+ nodes <- xml2::xml_find_all(xml, "//match")
+ toget <- c("ID","sequencedescription","database",
+ "citation","taxonomicidentification","similarity")
+ outlist <- lapply(nodes, function(x){
+ tmp2 <- vapply(toget, function(y) {
+ tmp <- xml2::xml_find_first(x, y)
+ setNames(xml2::xml_text(tmp), xml2::xml_name(tmp))
+ }, "")
+ spectmp <- xml2::as_list(xml2::xml_find_first(x, "specimen"))
+ spectmp <- unnest(spectmp)
+ names(spectmp) <- c('specimen_url','specimen_country',
+ 'specimen_lat','specimen_lon')
+ spectmp[sapply(spectmp, is.null)] <- NA
+ data.frame(c(tmp2, spectmp), stringsAsFactors = FALSE)
+ })
+ do.call(rbind.fill, outlist)
+ }
+ }
+ lapply(sequences, foo, b = db)
+}
+
+unnest <- function(x){
+ if (is.null(names(x))) {
+ list(unname(unlist(x)))
+ } else {
+ do.call(c, lapply(x, unnest))
+ }
+}
diff --git a/R/bold_identify_parents.R b/R/bold_identify_parents.R
new file mode 100644
index 0000000..7e1e817
--- /dev/null
+++ b/R/bold_identify_parents.R
@@ -0,0 +1,95 @@
+#' Add taxonomic parent names to a data.frame
+#'
+#' @export
+#' @param x (data.frame/list) a list of data.frames, the output from a call to
+#' \code{\link{bold_identify}}, or a single data.frame from that output.
+#' Required.
+#' @param wide (logical) output in long or wide format. See Details.
+#' Default: \code{FALSE}
+#'
+#' @details This function gets unique set of taxonomic names from the input
+#' data.frame, then queries \code{\link{bold_tax_name}} to get the
+#' taxonomic ID, passing it to \code{\link{bold_tax_id}} to get the parent
+#' names, then attaches those to the input data.
+#'
+#' @section wide vs long format:
+#' When \code{wide = FALSE} you get many rows for each record. Essentially,
+#' we \code{cbind} the taxonomic classification onto the one row from the
+#' result of \code{\link{bold_identify}}, giving as many rows as there are
+#' taxa in the taxonomic classification.
+#'
+#' When \code{wide = TRUE} you get one row for each record - thus the
+#' dimensions of the input data stay the same. For this option, we take just
+#' the rows for taxonomic ID and name for each taxon in the taxonomic
+#' classification, and name the columns by the taxon rank, so you get
+#' \code{phylum} and \code{phylum_id}, and so on.
+#'
+#' @return a list of the same length as the input
+#'
+#' @examples \dontrun{
+#' df <- bold_identify(sequences = sequences$seq2)
+#'
+#' # long format
+#' out <- bold_identify_parents(df)
+#' str(out)
+#' head(out[[1]])
+#'
+#' # wide format
+#' out <- bold_identify_parents(df, wide = TRUE)
+#' str(out)
+#' head(out[[1]])
+#' }
+bold_identify_parents <- function(x, wide = FALSE) {
+ UseMethod("bold_identify_parents")
+}
+
+#' @export
+bold_identify_parents.default <- function(x, wide = FALSE) {
+ stop("no 'bold_identify_parents' method for ", class(x), call. = FALSE)
+}
+
+#' @export
+bold_identify_parents.data.frame <- function(x, wide = FALSE) {
+ bold_identify_parents(list(x), wide)
+}
+
+#' @export
+bold_identify_parents.list <- function(x, wide = FALSE) {
+ # get unique set of names
+ uniqnms <-
+ unique(unname(unlist(lapply(x, function(z) z$taxonomicidentification))))
+ if (is.null(uniqnms)) {
+ stop("no fields 'taxonomicidentification' found in input", call. = FALSE)
+ }
+
+ # get parent names via bold_tax_name and bold_tax_id
+ out <- stats::setNames(lapply(uniqnms, function(w) {
+ tmp <- bold_tax_name(w)
+ if (!is.null(tmp$taxid)) {
+ tmp2 <- bold_tax_id(tmp$taxid, includeTree = TRUE)
+ tmp2$input <- NULL
+ return(tmp2)
+ } else {
+ NULL
+ }
+ }), uniqnms)
+
+ # apply parent names to input data
+ lapply(x, function(z) {
+ if (wide) {
+ # replace each data.frame with a wide version with just
+ # taxid and taxon name (with col names with rank name)
+ out <- lapply(out, function(h) do.call("cbind", (apply(h, 1, function(x) {
+ tmp <- as.list(x[c('taxid', 'taxon')])
+ tmp$taxid <- as.numeric(tmp$taxid)
+ data.frame(stats::setNames(tmp, paste0(x['tax_rank'], c('_id', ''))),
+ stringsAsFactors = FALSE)
+ }))))
+ }
+ zsplit <- split(z, z$ID)
+ setrbind(lapply(zsplit, function(w) {
+ suppressWarnings(cbind(w, out[names(out) %in%
+ w$taxonomicidentification][[1]]))
+ }))
+ })
+}
diff --git a/R/bold_seq.R b/R/bold_seq.R
new file mode 100644
index 0000000..147e5bb
--- /dev/null
+++ b/R/bold_seq.R
@@ -0,0 +1,71 @@
+#' Search BOLD for sequences.
+#'
+#' Get sequences for a taxonomic name, id, bin, container, institution,
+#' researcher, geographic place, or gene.
+#'
+#' @importFrom stringr str_replace_all str_replace str_split
+#' @export
+#' @template args
+#' @template otherargs
+#' @references
+#' \url{http://www.boldsystems.org/index.php/resources/api#sequenceParameters}
+#'
+#' @param marker (character) Returns all records containing matching
+#' marker codes.
+#'
+#' @return A list with each element of length 4 with slots for id, name,
+#' gene, and sequence.
+#'
+#' @examples \dontrun{
+#' res <- bold_seq(taxon='Coelioxys')
+#' bold_seq(taxon='Aglae')
+#' bold_seq(taxon=c('Coelioxys','Osmia'))
+#' bold_seq(ids='ACRJP618-11')
+#' bold_seq(ids=c('ACRJP618-11','ACRJP619-11'))
+#' bold_seq(bin='BOLD:AAA5125')
+#' bold_seq(container='ACRJP')
+#' bold_seq(researchers='Thibaud Decaens')
+#' bold_seq(geo='Ireland')
+#' bold_seq(geo=c('Ireland','Denmark'))
+#'
+#' # Return the httr response object for detailed Curl call response details
+#' res <- bold_seq(taxon='Coelioxys', response=TRUE)
+#' res$url
+#' res$status_code
+#' res$headers
+#'
+#' ## curl debugging
+#' ### You can do many things, including get verbose output on the curl
+#' ### call, and set a timeout
+#' library("httr")
+#' bold_seq(taxon='Coelioxys', config=verbose())[1:2]
+#' # bold_seqspec(taxon='Coelioxys', config=timeout(0.1))
+#' }
+
+bold_seq <- function(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL,
+ response=FALSE, ...) {
+
+ args <- bc(
+ list(
+ taxon = pipeornull(taxon), geo = pipeornull(geo),
+ ids = pipeornull(ids), bin = pipeornull(bin),
+ container = pipeornull(container),
+ institutions = pipeornull(institutions),
+ researchers = pipeornull(researchers), marker = pipeornull(marker)
+ )
+ )
+ check_args_given_nonempty(
+ args,
+ c('taxon','ids','bin','container','institutions','researchers',
+ 'geo','marker')
+ )
+ out <- b_GET(paste0(bbase(), 'API_Public/sequence'), args, ...)
+ if (response) {
+ out
+ } else {
+ tt <- rawToChar(content(out, encoding = "UTF-8"))
+ res <- strsplit(tt, ">")[[1]][-1]
+ lapply(res, split_fasta)
+ }
+}
diff --git a/R/bold_seqspec.R b/R/bold_seqspec.R
new file mode 100644
index 0000000..0b1bf32
--- /dev/null
+++ b/R/bold_seqspec.R
@@ -0,0 +1,87 @@
+#' Get BOLD specimen + sequence data.
+#'
+#' @export
+#' @template args
+#' @template otherargs
+#' @references \url{http://www.boldsystems.org/index.php/resources/api#combined}
+#'
+#' @param marker (character) Returns all records containing matching marker
+#' codes.
+#' @param format (character) One of xml or tsv (default). tsv format gives
+#' back a data.frame object. xml gives back parsed xml.
+#' @param sepfasta (logical) If \code{TRUE}, the fasta data is separated into
+#' a list with names matching the processid's from the data frame.
+#' Default: \code{FALSE}
+#'
+#' @return Either a data.frame, parsed xml, a httr response object, or a list
+#' with length two (a data.frame w/o nucleotide data, and a list with
+#' nucleotide data)
+#'
+#' @examples \dontrun{
+#' bold_seqspec(taxon='Osmia')
+#' bold_seqspec(taxon='Osmia', format='xml')
+#' bold_seqspec(taxon='Osmia', response=TRUE)
+#' res <- bold_seqspec(taxon='Osmia', sepfasta=TRUE)
+#' res$fasta[1:2]
+#' res$fasta['GBAH0293-06']
+#'
+#' # records that match a marker name
+#' res <- bold_seqspec(taxon="Melanogrammus aeglefinus", marker="COI-5P")
+#'
+#' # records that match a geographic locality
+#' res <- bold_seqspec(taxon="Melanogrammus aeglefinus", geo="Canada")
+#'
+#' # return only the longest sequence for each
+#'
+#' ## curl debugging
+#' ### You can do many things, including get verbose output on the curl call,
+#' ### and set a timeout
+#' library("httr")
+#' head(bold_seqspec(taxon='Osmia', config=verbose()))
+#' ## timeout
+#' # head(bold_seqspec(taxon='Osmia', config=timeout(1)))
+#' ## progress
+#' # x <- bold_seqspec(taxon='Osmia', config=progress())
+#' }
+
+bold_seqspec <- function(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL,
+ response=FALSE, format = 'tsv', sepfasta=FALSE, ...) {
+
+ format <- match.arg(format, choices = c('xml','tsv'))
+ args <- bc(list(taxon = pipeornull(taxon), geo = pipeornull(geo),
+ ids = pipeornull(ids), bin = pipeornull(bin),
+ container = pipeornull(container),
+ institutions = pipeornull(institutions),
+ researchers = pipeornull(researchers),
+ marker = pipeornull(marker), combined_download = format))
+ check_args_given_nonempty(args, c('taxon', 'ids', 'bin', 'container',
+ 'institutions', 'researchers',
+ 'geo', 'marker'))
+ out <- b_GET(paste0(bbase(), 'API_Public/combined'), args, ...)
+ if (response) {
+ out
+ } else {
+ tt <- paste0(rawToChar(content(out, encoding = "UTF-8"), multiple = TRUE),
+ collapse = "")
+ if (tt == "") return(NA)
+ temp <- switch(
+ format,
+ xml = xml2::read_xml(tt),
+ tsv = read.delim(text = tt, header = TRUE, sep = "\t",
+ stringsAsFactors = FALSE)
+ )
+ if (!sepfasta) {
+ temp
+ } else {
+ if (format == "tsv") {
+ fasta <- as.list(temp$nucleotides)
+ names(fasta) <- temp$processid
+ df <- temp[ , !names(temp) %in% "nucleotides" ]
+ list(data = df, fasta = fasta)
+ } else {
+ temp
+ }
+ }
+ }
+}
diff --git a/R/bold_specimens.R b/R/bold_specimens.R
new file mode 100644
index 0000000..221da9f
--- /dev/null
+++ b/R/bold_specimens.R
@@ -0,0 +1,57 @@
+#' Search BOLD for specimens.
+#'
+#' @export
+#' @template args
+#' @template otherargs
+#' @references
+#' \url{http://www.boldsystems.org/index.php/resources/api#specimenParameters}
+#'
+#' @param format (character) One of xml or tsv (default). tsv format gives
+#' back a data.frame object. xml gives back parsed xml.
+#'
+#' @examples \dontrun{
+#' bold_specimens(taxon='Osmia')
+#' bold_specimens(taxon='Osmia', format='xml')
+#' # bold_specimens(taxon='Osmia', response=TRUE)
+#' res <- bold_specimens(taxon='Osmia', format='xml', response=TRUE)
+#' res$url
+#' res$status_code
+#' res$headers
+#'
+#' # More than 1 can be given for all search parameters
+#' bold_specimens(taxon=c('Coelioxys','Osmia'))
+#'
+#' ## curl debugging
+#' ### These examples below take a long time, so you can set a timeout so that
+#' ### it stops by X sec
+#' library("httr")
+#' head(bold_specimens(taxon='Osmia', config=verbose()))
+#' # head(bold_specimens(geo='Costa Rica', config=timeout(6)))
+#' # head(bold_specimens(taxon="Formicidae", geo="Canada", config=timeout(6)))
+#' }
+
+bold_specimens <- function(taxon = NULL, ids = NULL, bin = NULL,
+ container = NULL, institutions = NULL, researchers = NULL, geo = NULL,
+ response=FALSE, format = 'tsv', ...) {
+
+ format <- match.arg(format, choices = c('xml','tsv'))
+ args <- bc(list(taxon=pipeornull(taxon), geo=pipeornull(geo),
+ ids=pipeornull(ids), bin=pipeornull(bin),
+ container=pipeornull(container),
+ institutions=pipeornull(institutions),
+ researchers=pipeornull(researchers),
+ specimen_download=format))
+ check_args_given_nonempty(args, c('taxon','ids','bin','container',
+ 'institutions','researchers','geo'))
+ out <- b_GET(paste0(bbase(), 'API_Public/specimen'), args, ...)
+ if (response) {
+ out
+ } else {
+ tt <- rawToChar(content(out, encoding = "UTF-8"))
+ switch(format,
+ xml = xml2::read_xml(tt),
+ tsv = read.delim(text = tt, header = TRUE, sep = "\t",
+ stringsAsFactors = FALSE)
+ )
+ }
+}
diff --git a/R/bold_tax_id.R b/R/bold_tax_id.R
new file mode 100644
index 0000000..85599d2
--- /dev/null
+++ b/R/bold_tax_id.R
@@ -0,0 +1,65 @@
+#' Search BOLD for taxonomy data by BOLD ID.
+#'
+#' @export
+#' @param id (integer) One or more BOLD taxonomic identifiers. required.
+#' @param dataTypes (character) Specifies the datatypes that will be
+#' returned. 'all' returns all data. 'basic' returns basic taxon information.
+#' 'images' returns specimen images.
+#' @param includeTree (logical) If TRUE (default: FALSE), returns a list
+#' containing information for parent taxa as well as the specified taxon.
+#' @template otherargs
+#' @references
+#' \url{http://boldsystems.org/index.php/resources/api?type=taxonomy}
+#' @seealso \code{bold_tax_name}
+#' @examples \dontrun{
+#' bold_tax_id(id=88899)
+#' bold_tax_id(id=88899, includeTree=TRUE)
+#' bold_tax_id(id=88899, includeTree=TRUE, dataTypes = "stats")
+#' bold_tax_id(id=c(88899,125295))
+#'
+#' ## dataTypes parameter
+#' bold_tax_id(id=88899, dataTypes = "basic")
+#' bold_tax_id(id=88899, dataTypes = "stats")
+#' bold_tax_id(id=88899, dataTypes = "images")
+#' bold_tax_id(id=88899, dataTypes = "geo")
+#' bold_tax_id(id=88899, dataTypes = "sequencinglabs")
+#' bold_tax_id(id=88899, dataTypes = "depository")
+#' bold_tax_id(id=88899, dataTypes = "thirdparty")
+#' bold_tax_id(id=88899, dataTypes = "all")
+#' bold_tax_id(id=c(88899,125295), dataTypes = "geo")
+#' bold_tax_id(id=c(88899,125295), dataTypes = "images")
+#'
+#' ## Passing in NA
+#' bold_tax_id(id = NA)
+#' bold_tax_id(id = c(88899,125295,NA))
+#'
+#' ## get httr response object only
+#' bold_tax_id(id=88899, response=TRUE)
+#' bold_tax_id(id=c(88899,125295), response=TRUE)
+#'
+#' ## curl debugging
+#' library('httr')
+#' bold_tax_id(id=88899, config=verbose())
+#' }
+
+bold_tax_id <- function(id, dataTypes='basic', includeTree=FALSE,
+ response=FALSE, ...) {
+
+ tmp <- lapply(id, function(x)
+ get_response(args = bc(list(taxId = x, dataTypes = dataTypes,
+ includeTree = if (includeTree) TRUE else NULL)),
+ url = paste0(bbase(), "API_Tax/TaxonData"), ...)
+ )
+ if (response) {
+ tmp
+ } else {
+ res <- do.call(rbind.fill, Map(process_response, x = tmp, y = id,
+ z = includeTree, w = dataTypes))
+ if (NCOL(res) == 1) {
+ res$noresults <- NA
+ return(res)
+ } else {
+ res
+ }
+ }
+}
diff --git a/R/bold_tax_name.R b/R/bold_tax_name.R
new file mode 100644
index 0000000..5bf6b11
--- /dev/null
+++ b/R/bold_tax_name.R
@@ -0,0 +1,53 @@
+#' Search BOLD for taxonomy data by taxonomic name.
+#'
+#' @importFrom httr GET stop_for_status content parse_url build_url
+#' progress write_disk
+#' @importFrom assertthat assert_that
+#' @importFrom jsonlite fromJSON
+#' @importFrom reshape sort_df
+#' @importFrom plyr rbind.fill
+#' @export
+#' @param name (character) One or more scientific names. Required.
+#' @param fuzzy (logical) Whether to use fuzzy search or not (default: FALSE).
+#' @template otherargs
+#' @references
+#' \url{http://boldsystems.org/index.php/resources/api?type=taxonomy}
+#' @details The \code{dataTypes} parameter is not supported in this function.
+#' If you want to use that parameter, get an ID from this function and pass
+#' it into \code{bold_tax_id}, and then use the \code{dataTypes} parameter.
+#' @seealso \code{\link{bold_tax_id}}
+#' @examples \dontrun{
+#' bold_tax_name(name='Diplura')
+#' bold_tax_name(name='Osmia')
+#' bold_tax_name(name=c('Diplura','Osmia'))
+#' bold_tax_name(name=c("Apis","Puma concolor","Pinus concolor"))
+#' bold_tax_name(name='Diplur', fuzzy=TRUE)
+#' bold_tax_name(name='Osm', fuzzy=TRUE)
+#'
+#' ## get httr response object only
+#' bold_tax_name(name='Diplura', response=TRUE)
+#' bold_tax_name(name=c('Diplura','Osmia'), response=TRUE)
+#'
+#' ## Names with no data in BOLD database
+#' bold_tax_name("Nasiaeshna pentacantha")
+#' bold_tax_name(name = "Cordulegaster erronea")
+#' bold_tax_name(name = "Cordulegaster erronea", response=TRUE)
+#'
+#' ## curl debugging
+#' library('httr')
+#' bold_tax_name(name='Diplura', config=verbose())
+#' }
+
+bold_tax_name <- function(name, fuzzy = FALSE, response = FALSE, ...) {
+
+ tmp <- lapply(name, function(x)
+ get_response(bc(list(taxName = x, fuzzy = if (fuzzy) 'true' else NULL)),
+ url = paste0(bbase(), "API_Tax/TaxonSearch"), ...)
+ )
+ if (response) {
+ tmp
+ } else {
+ do.call(rbind.fill,
+ Map(process_response, x = tmp, y = name, z = FALSE, w = ""))
+ }
+}
diff --git a/R/bold_trace.R b/R/bold_trace.R
new file mode 100644
index 0000000..3e288e6
--- /dev/null
+++ b/R/bold_trace.R
@@ -0,0 +1,90 @@
+#' Get BOLD trace files
+#'
+#' @export
+#' @template args
+#' @references \url{http://www.boldsystems.org/index.php/resources/api#trace}
+#'
+#' @param marker (character) Returns all records containing matching
+#' marker codes.
+#' @param dest (character) A directory to write the files to
+#' @param overwrite (logical) Overwrite existing directory and file?
+#' @param progress (logical) Print progress or not. Uses
+#' \code{\link[httr]{progress}}.
+#' @param ... Further args passed on to \code{\link[httr]{GET}}.
+#' @param x Object to print or read.
+#'
+#' @examples \dontrun{
+#' # Use a specific destination directory
+#' bold_trace(taxon='Bombus', geo='Alaska', dest="~/mytarfiles")
+#'
+#' # Another example
+#' # bold_trace(ids='ACRJP618-11', dest="~/mytarfiles")
+#' # bold_trace(ids=c('ACRJP618-11','ACRJP619-11'), dest="~/mytarfiles")
+#'
+#' # read file in
+#' x <- bold_trace(ids=c('ACRJP618-11','ACRJP619-11'), dest="~/mytarfiles")
+#' (res <- read_trace(x$ab1[2]))
+#'
+#' # The progress output is pretty verbose, so progress=FALSE is a nice touch,
+#' # but not by default
+#' # Beware, this one takes a while
+#' # x <- bold_trace(taxon='Osmia', progress=FALSE)
+#'
+#' if (requireNamespace("sangerseqR", quietly = TRUE)) {
+#' library("sangerseqR")
+#' primarySeq(res)
+#' secondarySeq(res)
+#' head(traceMatrix(res))
+#' }
+#' }
+
+bold_trace <- function(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL, dest=NULL,
+ overwrite = TRUE, progress = TRUE, ...) {
+
+ if (!requireNamespace("sangerseqR", quietly = TRUE)) {
+ stop("Please install sangerseqR", call. = FALSE)
+ }
+
+ args <- bc(list(taxon=pipeornull(taxon), geo=pipeornull(geo),
+ ids=pipeornull(ids), bin=pipeornull(bin), container=pipeornull(container),
+ institutions=pipeornull(institutions), researchers=pipeornull(researchers),
+ marker=pipeornull(marker)))
+ url <- make_url(paste0(bbase(), 'API_Public/trace'), args)
+ if (is.null(dest)) {
+ destfile <- paste0(getwd(), "/bold_trace_files.tar")
+ destdir <- paste0(getwd(), "/bold_trace_files")
+ } else {
+ destdir <- path.expand(dest)
+ destfile <- paste0(destdir, "/bold_trace_files.tar")
+ }
+ dir.create(destdir, showWarnings = FALSE, recursive = TRUE)
+ if (!file.exists(destfile)) file.create(destfile, showWarnings = FALSE)
+ res <- GET(url, write_disk(path = destfile, overwrite = overwrite),
+ if(progress) progress(), ...)
+ untar(destfile, exdir = destdir)
+ files <- list.files(destdir, full.names = TRUE)
+ ab1 <- list.files(destdir, pattern = "\\.ab1$", full.names = TRUE)
+ structure(list(destfile = destfile, destdir = destdir, ab1 = ab1,
+ args = args), class = "boldtrace")
+}
+
+#' @export
+print.boldtrace <- function(x, ...){
+ cat("\n<bold trace files>", "\n\n")
+ ff <- x$ab1[1:min(10, length(x$ab1))]
+ if (length(ff) < length(x$ab1)) ff <- c(ff, "...")
+ cat(ff, sep = "\n")
+}
+
+#' @export
+#' @rdname bold_trace
+read_trace <- function(x){
+ if (is(x, "boldtrace")) {
+ if (length(x$ab1) > 1) stop("Number of paths > 1, just pass one in",
+ call. = FALSE)
+ sangerseqR::readsangerseq(x$ab1)
+ } else {
+ sangerseqR::readsangerseq(x)
+ }
+}
diff --git a/R/zzz.R b/R/zzz.R
new file mode 100644
index 0000000..4d539be
--- /dev/null
+++ b/R/zzz.R
@@ -0,0 +1,81 @@
+bbase <- function() 'http://www.boldsystems.org/index.php/'
+
+bc <- function(x) Filter(Negate(is.null), x)
+
+split_fasta <- function(x){
+ temp <- paste(">", x, sep = "")
+ seq <- str_replace_all(str_split(str_replace(temp[[1]], "\n", "<<<"),
+ "<<<")[[1]][[2]], "\n", "")
+ stuff <- str_split(x, "\\|")[[1]][c(1:3)]
+ list(id = stuff[1], name = stuff[2], gene = stuff[1], sequence = seq)
+}
+
+pipeornull <- function(x){
+ if (!is.null(x)) {
+ paste0(x, collapse = "|")
+ } else {
+ NULL
+ }
+}
+
+make_url <- function(url, args){
+ tmp <- parse_url(url)
+ tmp$query <- args
+ build_url(tmp)
+}
+
+check_args_given_nonempty <- function(arguments, x){
+ paramnames <- x
+ matchez <- any(paramnames %in% names(arguments))
+ if (!matchez) {
+ stop(sprintf("You must provide a non-empty value to at least one of\n %s",
+ paste0(paramnames, collapse = "\n ")))
+ } else {
+ arguments_noformat <- arguments[ !names(arguments) %in% 'combined_download' ]
+ argslengths <- vapply(arguments_noformat, nchar, numeric(1),
+ USE.NAMES = FALSE)
+ if (any(argslengths == 0)) {
+ stop(sprintf("You must provide a non-empty value to at least one of\n %s",
+ paste0(paramnames, collapse = "\n ")))
+ }
+ }
+}
+
+process_response <- function(x, y, z, w){
+ tt <- rawToChar(content(x, "raw", encoding = "UTF-8"))
+ out <- if (x$status_code > 202) "stop" else jsonlite::fromJSON(tt)
+ if ( length(out) == 0 || identical(out[[1]], list()) || out == "stop" ) {
+ data.frame(input = y, stringsAsFactors = FALSE)
+ } else {
+ if (w %in% c("stats",'images','geo','sequencinglabs','depository')) out <- out[[1]]
+ trynames <- tryCatch(as.numeric(names(out)), warning = function(w) w)
+ if (!is(trynames, "simpleWarning")) names(out) <- NULL
+ if (any(vapply(out, function(x) is.list(x) && length(x) > 0, logical(1)))) {
+ out <- lapply(out, function(x) Filter(length, x))
+ } else {
+ out <- Filter(length, out)
+ }
+ if (!is.null(names(out))) {
+ df <- data.frame(out, stringsAsFactors = FALSE)
+ } else {
+ df <- do.call(rbind.fill, lapply(out, data.frame, stringsAsFactors = FALSE))
+ }
+ row.names(df) <- NULL
+ if ("parentid" %in% names(df)) df <- sort_df(df, "parentid")
+ row.names(df) <- NULL
+ data.frame(input = y, df, stringsAsFactors = FALSE)
+ }
+}
+
+get_response <- function(args, url, ...){
+ res <- GET(url, query = args, ...)
+ assert_that(res$headers$`content-type` == 'text/html; charset=utf-8')
+ res
+}
+
+b_GET <- function(url, args, ...){
+ out <- GET(url, query = args, ...)
+ stop_for_status(out)
+ assert_that(out$headers$`content-type` == 'application/x-download')
+ out
+}
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..e0b2e7e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,171 @@
+bold
+====
+
+
+
+[Travis CI build](https://travis-ci.org/ropensci/bold)
+[AppVeyor build](https://ci.appveyor.com/project/sckott/bold/branch/master)
+[CRAN downloads](https://github.com/metacran/cranlogs.app)
+[Code coverage](https://codecov.io/github/ropensci/bold?branch=master)
+[CRAN version](https://cran.r-project.org/package=bold)
+
+`bold` accesses BOLD barcode data.
+
+[Documentation for the BOLD API](http://www.boldsystems.org/index.php/resources/api).
+
+
+## Installation
+
+Stable CRAN version
+
+
+```r
+install.packages("bold")
+```
+
+Development version from GitHub
+
+Install `sangerseqR` first
+
+
+```r
+source("http://bioconductor.org/biocLite.R")
+biocLite("sangerseqR")
+```
+
+Then `bold`
+
+
+```r
+devtools::install_github("ropensci/bold")
+```
+
+
+```r
+library("bold")
+```
+
+
+## Search for sequence data only
+
+Default is to get a list back
+
+
+```r
+bold_seq(taxon='Coelioxys')[[1]]
+#> $id
+#> [1] "BBHYL407-10"
+#>
+#> $name
+#> [1] "Coelioxys porterae"
+#>
+#> $gene
+#> [1] "BBHYL407-10"
+#>
+#> $sequence
+#> [1] "TATAATATATATAATTTTTGCAATATGATCAGGTATAATTGGATCTTCTTTAAGAATAATTATCCGAATAGAATTAAGAATTCCAGGATCATGAATTAGTAATGATCAAATTTATAATTCTTTCATTACAGCACATGCATTCCTAATAATTTTTTTTTTAGTTATACCTTTTTTAATTGGAGGATTTGGTAATTGATTAACCCCACTAATATTAGGAGCTCCTGATATAGCTTTCCCTCGTATAAATAATATTAGATTTTGATTATTACCCCCTGCTCTATTAATATTATTATCAAGAAATTTAATTAATCCAAGACCTGGAACAGGATGAACTGTATACCCCCCTTTATCTTCTTATACTTACCACCCTTCTCCATCTGTAGATTTAGCAATTTTTTCTTTACATTTATCAGGAATTTCTTCAATTATTGGATCAATAAATTTTATTGTAACAATTTTAATAATAAAAAATTATTCAATAAAT [...]
+```
+
+You can optionally get back the `httr` response object
+
+
+```r
+res <- bold_seq(taxon='Coelioxys', response=TRUE)
+res$headers
+#> $date
+#> [1] "Fri, 06 Jan 2017 18:27:39 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=fasta.fas"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+```
+
+## Search for specimen data only
+
+By default you download `tsv` format data, which is given back to you as a `data.frame`
+
+
+```r
+res <- bold_specimens(taxon='Osmia')
+head(res[,1:8])
+#> processid sampleid recordID catalognum fieldnum
+#> 1 ASGCB255-13 BIOUG07489-F04 3955532 BIOUG07489-F04
+#> 2 FBAPB679-09 BC ZSM HYM 02154 1289040 BC ZSM HYM 02154 BC ZSM HYM 02154
+#> 3 FBAPB751-09 BC ZSM HYM 02226 1289112 BC ZSM HYM 02226 BC ZSM HYM 02226
+#> 4 FBAPC359-10 BC ZSM HYM 05964 1709625 BC ZSM HYM 05964 BC ZSM HYM 05964
+#> 5 FBAPC368-10 BC ZSM HYM 05973 1709634 BC ZSM HYM 05973 BC ZSM HYM 05973
+#> 6 FBAPC540-11 BC ZSM HYM 07000 2021833 BC ZSM HYM 07000 BC ZSM HYM 07000
+#> institution_storing bin_uri phylum_taxID
+#> 1 Biodiversity Institute of Ontario BOLD:ABZ2181 20
+#> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1788 20
+#> 3 SNSB, Zoologische Staatssammlung Muenchen 20
+#> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1999 20
+#> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20
+#> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAY5201 20
+```
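+
+Some records come back with an empty `bin_uri` (row 3 above). A minimal sketch of dropping those rows and counting records per BIN, assuming a `data.frame` with the column names shown above (the `spec` object and all of its values are invented placeholders, not BOLD data):
+
+
+```r
+# Hypothetical stand-in for a few columns of the bold_specimens() result
+spec <- data.frame(
+  processid = c("EXAMPLE001-00", "EXAMPLE002-00", "EXAMPLE003-00"),
+  bin_uri = c("BOLD:XXX0001", "", "BOLD:XXX0001"),
+  stringsAsFactors = FALSE
+)
+
+# Keep only rows that have a BIN assigned
+with_bin <- spec[nzchar(spec$bin_uri), ]
+
+# Number of records per BIN
+bin_counts <- table(with_bin$bin_uri)
+bin_counts
+```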
+
+## Search for specimen plus sequence data
+
+By default you download `tsv` format data, which is given back to you as a `data.frame`
+
+
+```r
+res <- bold_seqspec(taxon='Osmia', sepfasta=TRUE)
+res$fasta[1:2]
+#> $`ASGCB255-13`
+#> [1] "-------------------------------GGAATAATTGGTTCTGCTATAAGTATTATTATTCGAATAGAATTAAGAATTCCTGGATCATTCATTTCTAATGATCAAACTTATAATTCTTTAGTAACAGCTCATGCTTTTTTAATAATTTTTTTTCTTGTAATACCATTTTTAATTGGTGGATTTGGAAATTGATTAATTCCATTAATATTAGGAATCCCAGATATAGCATTTCCTCGAATAAATAATATTAGATTTTGACTTTTACCCCCATCCTTAATAATTTTACTTTTAAGAAATTTCTTAAATCCAAGTCCAGGAACAGGTTGAACTGTATATCCCCCCCTTTCTTCTTATTTATTTCATTCTTCCCCTTCTGTTGATTTAGCTATTTTTTCTCTTCATATTTCTGGTTTATCTTCCATCATAGGTTCTTTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCATTAAAA [...]
+#>
+#> $`FBAPB679-09`
+#> [1] "----------------------------TCTGGAATAATTGGGTCAGCAATAAGAATTATTATTCGAATAGAATTAAGTATTCCAGGATCATGAATTTCTAATGATCAAACATATAATTCTTTAGTAACTGCACATGCTTTTTTAATAATTTTTTTTCTTGTTATACCATTTTTAATTGGAGGATTTGGTAATTGATTAGTTCCATTAATATTAGGAATTCCAGATATAGCTTTTCCTCGAATAAATAATATTAGATTTTGACTTTTACCTCCATCTTTAACATTATTACTTCTAAGAAATTTTCTAAATCCAAGTCCCGGAACAGGATGAACTATTTATCCTCCATTATCTTCAAATTTATTTCATACATCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTCTATCTTCTATTATAGGTTCATTAAACTTTATTGTTACTATTATTATAATAAAAAATATTTCTTTAAAA [...]
+```
+
+Or you can index to a specific sequence like
+
+
+```r
+res$fasta['GBAH0293-06']
+#> $`GBAH0293-06`
+#> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATT [...]
+```
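+
+The sequences above are aligned, so they carry `-` gap characters. A minimal sketch of stripping the gaps, assuming a named list shaped like `res$fasta` (the `fasta` object and its short sequences are made up for illustration):
+
+
+```r
+# Hypothetical stand-in for res$fasta: a named list of aligned sequences
+fasta <- list(
+  `EXAMPLE001-00` = "----ACGT--ACGT",
+  `EXAMPLE002-00` = "ACGTACGT"
+)
+
+# Remove every alignment gap character, keeping the list names
+ungapped <- lapply(fasta, function(s) gsub("-", "", s, fixed = TRUE))
+
+# Sequence lengths after gap removal
+vapply(ungapped, nchar, integer(1))
+```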
+
+## Get trace files
+
+This function downloads files to your machine - it does not load them into your R session - but prints out where the files are for your information.
+
+
+```r
+x <- bold_trace(ids = 'ACRJP618-11', progress = FALSE)
+read_trace(x$ab1)
+#> Number of datapoints: 8877
+#> Number of basecalls: 685
+#>
+#> Primary Basecalls: NNNNNNNNNNNNNNNNNNGNNNTTGAGCAGGNATAGTAGGANCTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATT [...]
+#>
+#> Secondary Basecalls:
+```
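+
+Because the files stay on disk, you can locate them again later without re-downloading. A sketch using only base R, assuming the traces were extracted into a directory of your choosing (`trace_dir` and the file names below are placeholders created just for this example):
+
+
+```r
+# Placeholder directory standing in for the `dest` passed to bold_trace()
+trace_dir <- file.path(tempdir(), "bold_trace_files")
+dir.create(trace_dir, showWarnings = FALSE, recursive = TRUE)
+
+# Empty dummy files so the example runs without a download
+file.create(file.path(trace_dir, c("a_F.ab1", "b_R.ab1", "notes.txt")))
+
+# Anchor the pattern so only names ending in ".ab1" match
+ab1 <- list.files(trace_dir, pattern = "\\.ab1$", full.names = TRUE)
+basename(ab1)
+```
+
+A single path from `ab1` can then be passed to `read_trace()`, as in the example above.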
+
+## Meta
+
+* Please [report any issues or bugs](https://github.com/ropensci/bold/issues).
+* License: MIT
+* Get citation information for `bold` in R by running `citation(package = 'bold')`
+* Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
+
+[rOpenSci](https://ropensci.org)
diff --git a/build/vignette.rds b/build/vignette.rds
new file mode 100644
index 0000000..96fc424
Binary files /dev/null and b/build/vignette.rds differ
diff --git a/data/sequences.RData b/data/sequences.RData
new file mode 100644
index 0000000..09bb85e
Binary files /dev/null and b/data/sequences.RData differ
diff --git a/debian/README.test b/debian/README.test
deleted file mode 100644
index 4fe93f7..0000000
--- a/debian/README.test
+++ /dev/null
@@ -1,9 +0,0 @@
-Notes on how this package can be tested.
-────────────────────────────────────────
-
-This package can be tested by running the provided test:
-
-cd tests
-LC_ALL=C R --no-save < test-all.R
-
-in order to confirm its integrity.
diff --git a/debian/changelog b/debian/changelog
deleted file mode 100644
index a43373c..0000000
--- a/debian/changelog
+++ /dev/null
@@ -1,23 +0,0 @@
-r-cran-bold (0.4.0-1) unstable; urgency=medium
-
- * New upstream version
- * debhelper 10
- * d/watch: version=4
- * New Build-Depends: r-cran-data.table, r-cran-tibble
-
- -- Andreas Tille <tille at debian.org> Sun, 08 Jan 2017 08:30:33 +0100
-
-r-cran-bold (0.3.5-1) unstable; urgency=medium
-
- * New upstream version
- * Convert to dh-r
- * Canonical homepage for CRAN
- * New Build-Depends: r-cran-xml2
-
- -- Andreas Tille <tille at debian.org> Tue, 08 Nov 2016 11:38:13 +0100
-
-r-cran-bold (0.3.0-1) unstable; urgency=low
-
- * Initial release (Closes: #819226)
-
- -- Andreas Tille <tille at debian.org> Fri, 25 Mar 2016 07:37:04 +0100
diff --git a/debian/compat b/debian/compat
deleted file mode 100644
index f599e28..0000000
--- a/debian/compat
+++ /dev/null
@@ -1 +0,0 @@
-10
diff --git a/debian/control b/debian/control
deleted file mode 100644
index 95cb96c..0000000
--- a/debian/control
+++ /dev/null
@@ -1,35 +0,0 @@
-Source: r-cran-bold
-Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
-Uploaders: Andreas Tille <tille at debian.org>
-Section: gnu-r
-Priority: optional
-Build-Depends: debhelper (>= 10),
- dh-r,
- r-base-dev,
- r-cran-xml,
- r-cran-stringr,
- r-cran-assertthat,
- r-cran-jsonlite,
- r-cran-reshape,
- r-cran-plyr,
- r-cran-httr,
- r-cran-xml2,
- r-cran-data.table,
- r-cran-tibble
-Standards-Version: 3.9.8
-Vcs-Browser: https://anonscm.debian.org/viewvc/debian-med/trunk/packages/R/r-cran-bold/trunk/
-Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/R/r-cran-bold/trunk/
-Homepage: https://cran.r-project.org/package=bold
-
-Package: r-cran-bold
-Architecture: all
-Depends: ${misc:Depends},
- ${R:Depends}
-Recommends: ${R:Recommends}
-Suggests: ${R:Suggests}
-Description: GNU R interface to Bold Systems for genetic barcode data
- A programmatic interface to the Web Service methods provided by Bold
- Systems for genetic barcode data. Functions include methods for
- searching by sequences by taxonomic names, ids, collectors, and
- institutions; as well as a function for searching for specimens, and
- downloading trace files.
diff --git a/debian/copyright b/debian/copyright
deleted file mode 100644
index cde28c3..0000000
--- a/debian/copyright
+++ /dev/null
@@ -1,32 +0,0 @@
-Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
-Upstream-Contact: Scott Chamberlain <myrmecocystus at gmail.com>
-Upstream-Name: bold
-Source: https://cran.r-project.org/package=bold
-
-Files: *
-Copyright: 2013-2016 Scott Chamberlain <myrmecocystus at gmail.com>
-License: MIT
-
-Files: debian/*
-Copyright: 2016 Andreas Tille <tille at debian.org>
-License: MIT
-
-License: MIT
- Permission is hereby granted, free of charge, to any person obtaining a
- copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
- .
- The above copyright notice and this permission notice shall be included
- in all copies or substantial portions of the Software.
- .
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
- CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
- TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
- SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/debian/docs b/debian/docs
deleted file mode 100644
index 960011c..0000000
--- a/debian/docs
+++ /dev/null
@@ -1,3 +0,0 @@
-tests
-debian/README.test
-debian/tests/run-unit-test
diff --git a/debian/rules b/debian/rules
deleted file mode 100755
index ae86733..0000000
--- a/debian/rules
+++ /dev/null
@@ -1,8 +0,0 @@
-#!/usr/bin/make -f
-
-%:
- dh $@ --buildsystem R
-
-override_dh_install:
- dh_install
- find debian -name LICENSE -delete
diff --git a/debian/source/format b/debian/source/format
deleted file mode 100644
index 163aaf8..0000000
--- a/debian/source/format
+++ /dev/null
@@ -1 +0,0 @@
-3.0 (quilt)
diff --git a/debian/tests/control b/debian/tests/control
deleted file mode 100644
index b044b0c..0000000
--- a/debian/tests/control
+++ /dev/null
@@ -1,3 +0,0 @@
-Tests: run-unit-test
-Depends: @, r-cran-testthat
-Restrictions: allow-stderr
diff --git a/debian/tests/run-unit-test b/debian/tests/run-unit-test
deleted file mode 100644
index 90e00dd..0000000
--- a/debian/tests/run-unit-test
+++ /dev/null
@@ -1,12 +0,0 @@
-#!/bin/sh -e
-
-oname=bold
-pkg=r-cran-`echo $oname | tr [A-Z] [a-z]`
-
-if [ "$ADTTMP" = "" ] ; then
- ADTTMP=`mktemp -d /tmp/${pkg}-test.XXXXXX`
-fi
-cd $ADTTMP
-cp -a /usr/share/doc/${pkg}/tests/* $ADTTMP
-LC_ALL=C R --no-save < test-all.R
-rm -fr $ADTTMP/*
diff --git a/debian/watch b/debian/watch
deleted file mode 100644
index 90b858e..0000000
--- a/debian/watch
+++ /dev/null
@@ -1,3 +0,0 @@
-version=4
-http://cran.r-project.org/src/contrib/bold_([-0-9\.]*).tar.gz
-
diff --git a/inst/doc/bold_vignette.Rmd b/inst/doc/bold_vignette.Rmd
new file mode 100644
index 0000000..5d5d2f9
--- /dev/null
+++ b/inst/doc/bold_vignette.Rmd
@@ -0,0 +1,439 @@
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{bold vignette}
+%\VignetteEncoding{UTF-8}
+-->
+
+
+
+`bold` is an R package to connect to [BOLD Systems](http://www.boldsystems.org/) via their API. Functions in `bold` let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.
+
+### bold info
+
++ [BOLD home page](http://boldsystems.org/)
++ [BOLD API docs](http://boldsystems.org/index.php/resources/api)
+
+### Using bold
+
+**Install**
+
+Install `bold` from CRAN
+
+
+
+```r
+install.packages("bold")
+```
+
+Or install the development version from GitHub
+
+
+```r
+devtools::install_github("ropensci/bold")
+```
+
+Load the package
+
+
+```r
+library("bold")
+```
+
+
+### Search for taxonomic names via names
+
+`bold_tax_name` looks up taxa in BOLD by taxonomic name.
+
+
+```r
+bold_tax_name(name = 'Diplura')
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> taxonrep
+#> 1 Diplura
+#> 2 <NA>
+```
+
+
+```r
+bold_tax_name(name = c('Diplura', 'Osmia'))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> 3 Osmia 4940 Osmia genus Animals 4962 Megachilinae
+#> taxonrep
+#> 1 Diplura
+#> 2 <NA>
+#> 3 Osmia
+```
+
+
+### Search for taxonomic names via BOLD identifiers
+
+`bold_tax_id` looks up taxa in BOLD by their BOLD taxonomic identifiers.
+
+
+```r
+bold_tax_id(id = 88899)
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+```
+
+
+```r
+bold_tax_id(id = c(88899, 125295))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+#> 2 125295 125295 Helianthus genus Plants 100962 Asteraceae
+```
+
+
+### Search for sequence data only
+
+The BOLD sequence API gives back sequence data, with a bit of metadata.
+
+The default is to get a list back
+
+
+```r
+bold_seq(taxon = 'Coelioxys')[1:2]
+#> [[1]]
+#> [[1]]$id
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$name
+#> [1] "Coelioxys conica"
+#>
+#> [[1]]$gene
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------ACCTCTTTAAGAATAATTATTCGTATAGAAATAAGAATTCCAGGATCTTGAATTAATAATGATCAAATTTATAACTCCTTTATTACAGCACATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTCTTATTGGAGGATTTGGAAATTGATTAGTACCTTTAATATTAGGATCACCAGATATAGCTTTCCCACGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTATTAAGTAATTTAATAAATCCCAGACCAGGAACAGGCTGAACAGTTTATCCTCCTTTATCTTTATACACATACCACCCTTCTCCCTCAGTTGATTTAGCAATTTTTTCACTACATCTATCAGGAATCTCTTCTATTATTGGATCTATAAATTTTATTGTTACAATTTTAATAATAAAAAACTTTTCAATAAATTATAATCAAATACCATTATTCC [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$name
+#> [1] "Coelioxys afra"
+#>
+#> [[2]]$gene
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ACGAATAAATAATGTAAGATTTTGACTATTACCTCCCTCAATTTTCTTATTATTATCAAGAACCCTAATTAACCCAAGAGCTGGTACTGGATGAACTGTATATCCTCCTTTATCCTTATATACATTTCATGCCTCACCTTCCGTTGATTTAGCAATTTTTTCACTTCATTTATCAGGAATTTCATCAATTATTGGATCAATAAATTTTATTGTTACAATCTTAATAATAAAAAATTTTTCTTTAAAT [...]
+```
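+
+Each element of the list has the same four fields, so the result can be flattened into a `data.frame`. A minimal sketch, assuming a list shaped like the output above (the `seqs` object and its values are invented placeholders, not BOLD records):
+
+
+```r
+# Hypothetical stand-in for the list returned by bold_seq()
+seqs <- list(
+  list(id = "EXAMPLE001-00", name = "Genus one",
+       gene = "EXAMPLE001-00", sequence = "--ACGT"),
+  list(id = "EXAMPLE002-00", name = "Genus two",
+       gene = "EXAMPLE002-00", sequence = "ACGTAC")
+)
+
+# One row per record; keep strings as character, not factor
+df <- do.call(rbind, lapply(seqs, function(z) {
+  data.frame(z, stringsAsFactors = FALSE)
+}))
+
+# Sequence length, counting gap characters
+df$length <- nchar(df$sequence)
+df[, c("id", "name", "length")]
+```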
+
+You can optionally get back the `httr` response object
+
+
+```r
+res <- bold_seq(taxon = 'Coelioxys', response = TRUE)
+res$headers
+#> $date
+#> [1] "Tue, 15 Sep 2015 20:02:31 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=fasta.fas"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+```
+
+You can do geographic searches
+
+
+```r
+bold_seq(geo = "USA")
+#> [[1]]
+#> [[1]]$id
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$name
+#> [1] "Macrobdella decora"
+#>
+#> [[1]]$gene
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTATTTTAGGTT [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$name
+#> [1] "Haemopis terrestris"
+#>
+#> [[2]]$gene
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTATTTTAGGCT [...]
+#>
+#>
+#> [[3]]
+#> [[3]]$id
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$name
+#> [1] "Steinernema carpocapsae"
+#>
+#> [[3]]$gene
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$sequence
+#> [1] "---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGCTATTAATT [...]
+#>
+#>
+#> [[4]]
+#> [[4]]$id
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[4]]$gene
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACT [...]
+#>
+#>
+#> [[5]]
+#> [[5]]$id
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[5]]$gene
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACT [...]
+```
+
+And you can search by researcher name
+
+
+```r
+bold_seq(researchers = 'Thibaud Decaens')[[1]]
+#> $id
+#> [1] "BGABA657-14"
+#>
+#> $name
+#> [1] "Coleoptera"
+#>
+#> $gene
+#> [1] "BGABA657-14"
+#>
+#> $sequence
+#> [1] "ACACTCTATTTCATTTTCGGAGCTTGATCAGGAATAGTAGGAACTTCTTTAAGAATACTAATTCGATCTGAATTGGGAAACCCCGGCTCATTGATTGGGGATGATCAAATTTATAATGTTATTGTAACAGCCCATGCATTCATTATAATTTTTTTTATAGTAATACCGATCATAATAGGAGGTTTTGGAAATTGATTAGTCCCGCTAATATTAGGTGCCCCAGATATAGCATTTCCTCGAATAAATAATATAAGATTTTGACTTCTTCCGCCTTCATTAACTTTACTTATTATAAGAAGAATTGTAGAAAACGGGGCGGGAACAGGATGAACAGTTTACCCACCCCTCTCTTCTAACATTGCTCATAGAGGAGCCTCTGTAGATCTTGCAATTTTTAGATTACATTTAGCCGGTGTATCATCAATTTTAGGTGCAGTTAATTTTATTACAACTATTATTAATATACGACCTAAAGGAATAACAT [...]
+```
+
+by record IDs (process IDs in this example)
+
+
+```r
+bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11'))
+#> [[1]]
+#> [[1]]$id
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[1]]$gene
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$sequence
+#> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAAT [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[2]]$gene
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$sequence
+#> [1] "AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAAT [...]
+```
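+
+Several values given to one parameter appear to be sent as a single `|`-separated string. A sketch of that collapse, mirroring the package's internal `pipeornull()` helper in `R/zzz.R`; that `bold_seq` applies it to `ids` is an assumption here, since this commit only shows `bold_trace` calling it (`query_value` is a name made up for the example):
+
+
+```r
+ids <- c('ACRJP618-11', 'ACRJP619-11')
+
+# Same collapse that pipeornull() performs on a non-NULL input
+query_value <- paste0(ids, collapse = "|")
+query_value
+#> [1] "ACRJP618-11|ACRJP619-11"
+```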
+
+by container (containers include project codes and dataset codes)
+
+
+```r
+bold_seq(container = 'ACRJP')[[1]]
+#> $id
+#> [1] "ACRJP003-09"
+#>
+#> $name
+#> [1] "Lepidoptera"
+#>
+#> $gene
+#> [1] "ACRJP003-09"
+#>
+#> $sequence
+#> [1] "AACATTATATTTTATTTTTGGGATCTGATCTGGAATAGTAGGGACATCTTTAAGTATACTAATTCGAATAGAACTAGGAAATCCTGGATGTTTAATTGGGGATGATCAAATTTATAATACTATTGTTACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCCATTATAATTGGAGGTTTTGGCAATTGACTTGTACCATTAATATTAGGAGCCCCTGATATAGCATTTCCCCGAATAAATAATATAAGATTTTGACTTCTTCCCCCCTCATTAATTTTATTAATTTCAAGAAGAATTGTTGAAAATGGAGCAGGAACAGGATGAACAGTCTATCCTCCATTATCTTCTAATATTGCGCATAGAGGATCCTCTGTTGATTTAGCTATTTTCTCACTTCATTTAGCAGGAATTTCTTCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAATAAATAATTTACTT [...]
+```
+
+by bin (a bin is a _Barcode Index Number_)
+
+
+```r
+bold_seq(bin = 'BOLD:AAA5125')[[1]]
+#> $id
+#> [1] "BLPAB406-06"
+#>
+#> $name
+#> [1] "Eacles ormondei"
+#>
+#> $gene
+#> [1] "BLPAB406-06"
+#>
+#> $sequence
+#> [1] "AACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAGTAGGAACTTCTTTAAGATTACTAATTCGAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACTCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATAATATAATA [...]
+```
+
+There are more ways to query; see the docs at `?bold_seq`.
+
+
+### Search for specimen data only
+
+The BOLD specimen API doesn't give back sequences, only specimen data. By default you download `tsv` format data, which is given back to you as a `data.frame`
+
+
+```r
+res <- bold_specimens(taxon = 'Osmia')
+head(res[,1:8])
+#> processid sampleid recordID catalognum fieldnum
+#> 1 ASGCB261-13 BIOUG07489-F10 3955538 BIOUG07489-F10
+#> 2 BCHYM1499-13 BC ZSM HYM 19359 4005348 BC ZSM HYM 19359 BC ZSM HYM 19359
+#> 3 BCHYM412-13 BC ZSM HYM 18272 3896353 BC ZSM HYM 18272 BC ZSM HYM 18272
+#> 4 BCHYM413-13 BC ZSM HYM 18273 3896354 BC ZSM HYM 18273 BC ZSM HYM 18273
+#> 5 FBAPB706-09 BC ZSM HYM 02181 1289067 BC ZSM HYM 02181 BC ZSM HYM 02181
+#> 6 FBAPB730-09 BC ZSM HYM 02205 1289091 BC ZSM HYM 02205 BC ZSM HYM 02205
+#> institution_storing bin_uri phylum_taxID
+#> 1 Biodiversity Institute of Ontario BOLD:AAB8874 20
+#> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAD6282 20
+#> 3 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAE4126 20
+#> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20
+```
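+
+Since the result is an ordinary `data.frame`, the usual base R tools apply. As a rough sketch, the snippet below counts records per BIN and keeps the rows in the most common one; `res` here is a small stand-in rebuilt from the first four rows printed above, not a live API result, and `bin_counts` and `top_bin` are illustrative names.
+
+```r
+# Stand-in for the bold_specimens() result, copied from the rows shown above
+res <- data.frame(
+  processid = c("ASGCB261-13", "BCHYM1499-13", "BCHYM412-13", "BCHYM413-13"),
+  bin_uri = c("BOLD:AAB8874", "BOLD:AAD6282", "BOLD:AAP2416", "BOLD:AAP2416"),
+  stringsAsFactors = FALSE
+)
+# Number of records in each BIN
+bin_counts <- table(res$bin_uri)
+# Rows belonging to the BIN with the most records
+top_bin <- res[res$bin_uri == names(which.max(bin_counts)), ]
+top_bin
+```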
+
+You can optionally get back the data in `XML` format
+
+
+```r
+bold_specimens(taxon = 'Osmia', format = 'xml')
+```
+
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<bold_records xsi:noNamespaceSchemaLocation="http://www.boldsystems.org/schemas/BOLDPublic_record.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+ <record>
+ <record_id>1470124</record_id>
+ <processid>BOM1525-10</processid>
+ <bin_uri>BOLD:AAN3337</bin_uri>
+ <specimen_identifiers>
+ <sampleid>DHB 1011</sampleid>
+ <catalognum>DHB 1011</catalognum>
+ <fieldnum>DHB1011</fieldnum>
+ <institution_storing>Marjorie Barrick Museum</institution_storing>
+ </specimen_identifiers>
+ <taxonomy>
+```
+
+You can choose to get the `httr` response object back if you'd rather work with the raw data returned from the BOLD API.
+
+
+```r
+res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE)
+res$url
+#> [1] "http://www.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&specimen_download=xml"
+res$status_code
+#> [1] 200
+res$headers
+#> $date
+#> [1] "Mon, 28 Mar 2016 20:39:18 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=bold_data.xml"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+```
+
+### Search for specimen plus sequence data
+
+The combined specimen/sequence API returns both specimen and sequence data. Like the specimen API, it downloads `tsv` format data by default and returns it as a `data.frame`. Here we set `sepfasta = TRUE` so that the sequences are removed from the `data.frame` and returned separately as a list, which keeps the `data.frame` manageable.
+
+
+```r
+res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE)
+res$fasta[1:2]
+#> $`ASGCB261-13`
+#> [1] "AATTTTATATATAATTTTTGCTATATGATCAGGAATAATTGGTTCAGCAATAAGAATTATTATTCGAATAGAATTAAGAATTCCTGGTTCATGAATTTCAAATGATCAAACTTATAATTCTTTAGTTACTGCTCATGCTTTTTTAATAATTTTTTTCTTAGTTATACCATTCTTAATTGGGGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCATTTCCACGAATAAATAATATTAGATTTTGACTTTTACCTCCTTCTTTAATACTTTTATTATTAAGAAATTTTATAAATCCTAGTCCAGGAACTGGATGAACTGTTTATCCACCTTTATCTTCTCATTTATTTCATTCTTCTCCTTCAGTTGATATAGCTATTTTTTCTTTACATATTTCTGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAA [...]
+#>
+#> $`BCHYM1499-13`
+#> [1] "AATTCTTTACATAATTTTTGCTTTATGATCTGGAATAATTGGGTCAGCAATAAGAATTATTATTCGAATAGAATTAAGTATCCCAGGTTCATGAATTACTAATGATCAAATTTATAATTCTTTAGTAACTGCACATGCTTTTTTAATAATTTTTTTTCTTGTGATACCATTTTTAATTGGAGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCTTTCCCACGAATAAACAATATTAGATTTTGATTATTACCGCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCCCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAA [...]
+```
+
+Or you can index a specific sequence by its process ID
+
+
+```r
+res$fasta['GBAH0293-06']
+#> $`GBAH0293-06`
+#> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATT [...]
+```
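+
+The leading dashes in a sequence like the one above are gap/alignment characters. If you want the number of actual bases, one option is to strip them first. This is a minimal base R sketch on an invented named list standing in for `res$fasta`; the IDs and sequences are made up, and `ungapped` and `lens` are illustrative names.
+
+```r
+# Invented stand-in for res$fasta: a named list keyed by process ID
+fasta <- list(`EX001-00` = "-----TTAATGTTAGG", `EX002-00` = "AATTTTATAT")
+# Drop gap characters, then measure what is left
+ungapped <- lapply(fasta, function(s) gsub("-", "", s, fixed = TRUE))
+lens <- vapply(ungapped, nchar, integer(1))
+lens
+```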
+
+### Get trace files
+
+This function downloads files to your machine - it does not load them into your R session - but prints out where the files are for your information.
+
+
+```r
+bold_trace(taxon = 'Osmia', quiet = TRUE)
+```
diff --git a/inst/doc/bold_vignette.html b/inst/doc/bold_vignette.html
new file mode 100644
index 0000000..97a2373
--- /dev/null
+++ b/inst/doc/bold_vignette.html
@@ -0,0 +1,599 @@
+<!DOCTYPE html>
+<html>
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
+
+<title>bold info</title>
+
+<script type="text/javascript">
+window.onload = function() {
+ var imgs = document.getElementsByTagName('img'), i, img;
+ for (i = 0; i < imgs.length; i++) {
+ img = imgs[i];
+ // center an image if it is the only element of its parent
+ if (img.parentElement.childElementCount === 1)
+ img.parentElement.style.textAlign = 'center';
+ }
+};
+</script>
+
+<!-- Styles for R syntax highlighter -->
+<style type="text/css">
+ pre .operator,
+ pre .paren {
+ color: rgb(104, 118, 135)
+ }
+
+ pre .literal {
+ color: #990073
+ }
+
+ pre .number {
+ color: #099;
+ }
+
+ pre .comment {
+ color: #998;
+ font-style: italic
+ }
+
+ pre .keyword {
+ color: #900;
+ font-weight: bold
+ }
+
+ pre .identifier {
+ color: rgb(0, 0, 0);
+ }
+
+ pre .string {
+ color: #d14;
+ }
+</style>
+
+<!-- R syntax highlighter -->
+<script type="text/javascript">
+var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/</gm,"<")}function f(r,q,p){return RegExp(q,"m"+(r.cI?"i":"")+(p?"g":""))}function b(r){for(var p=0;p<r.childNodes.length;p++){var q=r.childNodes[p];if(q.nodeName=="CODE"){return q}if(!(q.nodeType==3&&q.nodeValue.match(/\s+/))){break}}}function h(t,s){var p="";for(var r=0;r<t.childNodes.length;r++){if(t.childNodes[r].nodeType==3){var q=t.childNodes[r].nodeValue;if(s){q=q.replace(/\n/g,"")}p+=q}else{if(t.chi [...]
+hljs.initHighlightingOnLoad();
+</script>
+
+
+
+<style type="text/css">
+body, td {
+ font-family: sans-serif;
+ background-color: white;
+ font-size: 13px;
+}
+
+body {
+ max-width: 800px;
+ margin: auto;
+ padding: 1em;
+ line-height: 20px;
+}
+
+tt, code, pre {
+ font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
+}
+
+h1 {
+ font-size:2.2em;
+}
+
+h2 {
+ font-size:1.8em;
+}
+
+h3 {
+ font-size:1.4em;
+}
+
+h4 {
+ font-size:1.0em;
+}
+
+h5 {
+ font-size:0.9em;
+}
+
+h6 {
+ font-size:0.8em;
+}
+
+a:visited {
+ color: rgb(50%, 0%, 50%);
+}
+
+pre, img {
+ max-width: 100%;
+}
+pre {
+ overflow-x: auto;
+}
+pre code {
+ display: block; padding: 0.5em;
+}
+
+code {
+ font-size: 92%;
+ border: 1px solid #ccc;
+}
+
+code[class] {
+ background-color: #F8F8F8;
+}
+
+table, td, th {
+ border: none;
+}
+
+blockquote {
+ color:#666666;
+ margin:0;
+ padding-left: 1em;
+ border-left: 0.5em #EEE solid;
+}
+
+hr {
+ height: 0px;
+ border-bottom: none;
+ border-top-width: thin;
+ border-top-style: dotted;
+ border-top-color: #999999;
+}
+
+@media print {
+ * {
+ background: transparent !important;
+ color: black !important;
+ filter:none !important;
+ -ms-filter: none !important;
+ }
+
+ body {
+ font-size:12pt;
+ max-width:100%;
+ }
+
+ a, a:visited {
+ text-decoration: underline;
+ }
+
+ hr {
+ visibility: hidden;
+ page-break-before: always;
+ }
+
+ pre, blockquote {
+ padding-right: 1em;
+ page-break-inside: avoid;
+ }
+
+ tr, img {
+ page-break-inside: avoid;
+ }
+
+ img {
+ max-width: 100% !important;
+ }
+
+ @page :left {
+ margin: 15mm 20mm 15mm 10mm;
+ }
+
+ @page :right {
+ margin: 15mm 10mm 15mm 20mm;
+ }
+
+ p, h2, h3 {
+ orphans: 3; widows: 3;
+ }
+
+ h2, h3 {
+ page-break-after: avoid;
+ }
+}
+</style>
+
+
+
+</head>
+
+<body>
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{bold vignette}
+%\VignetteEncoding{UTF-8}
+-->
+
+<p><code>bold</code> is an R package to connect to <a href="http://www.boldsystems.org/">BOLD Systems</a> via their API. Functions in <code>bold</code> let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.</p>
+
+<h3>bold info</h3>
+
+<ul>
+<li><a href="http://boldsystems.org/">BOLD home page</a></li>
+<li><a href="http://boldsystems.org/index.php/resources/api">BOLD API docs</a></li>
+</ul>
+
+<h3>Using bold</h3>
+
+<p><strong>Install</strong></p>
+
+<p>Install <code>bold</code> from CRAN</p>
+
+<pre><code class="r">install.packages("bold")
+</code></pre>
+
+<p>Or install the development version from GitHub</p>
+
+<pre><code class="r">devtools::install_github("ropensci/bold")
+</code></pre>
+
+<p>Load the package</p>
+
+<pre><code class="r">library("bold")
+</code></pre>
+
+<h3>Search for taxonomic names via names</h3>
+
+<p><code>bold_tax_name</code> searches for taxonomic names by name.</p>
+
+<pre><code class="r">bold_tax_name(name = 'Diplura')
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> taxonrep
+#> 1 Diplura
+#> 2 &lt;NA&gt;
+</code></pre>
+
+<pre><code class="r">bold_tax_name(name = c('Diplura', 'Osmia'))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> 3 Osmia 4940 Osmia genus Animals 4962 Megachilinae
+#> taxonrep
+#> 1 Diplura
+#> 2 &lt;NA&gt;
+#> 3 Osmia
+</code></pre>
+
+<h3>Search for taxonomic names via BOLD identifiers</h3>
+
+<p><code>bold_tax_id</code> searches for names with BOLD identifiers.</p>
+
+<pre><code class="r">bold_tax_id(id = 88899)
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+</code></pre>
+
+<pre><code class="r">bold_tax_id(id = c(88899, 125295))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+#> 2 125295 125295 Helianthus genus Plants 100962 Asteraceae
+</code></pre>
+
+<h3>Search for sequence data only</h3>
+
+<p>The BOLD sequence API gives back sequence data, with a bit of metadata.</p>
+
+<p>The default is to get a list back</p>
+
+<pre><code class="r">bold_seq(taxon = 'Coelioxys')[1:2]
+#> [[1]]
+#> [[1]]$id
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$name
+#> [1] "Coelioxys conica"
+#>
+#> [[1]]$gene
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------ACCTCTTTAAGAATAATTATTCGTATAGAAATAAGAATTCCAGGATCTTGAATTAATAATGATCAAATTTATAACTCCTTTATTACAGCACATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTCTTATTGGAGGATTTGGAAATTGATTAGTACCTTTAATATTAGGATCACCAGATATAGCTTTCCCACGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTATTAAGTAATTTAATAAATCCCAGACCAGGAACAGGCTGAACAGTTTATCCTCCTTTATCTTTATACACATACCACCCTTCTCCCTCAGTTGATTTAGCAATTTTTTCACTACATCTATCAGGAATCTCTTCTATTATTGGATCTATAAATTTTATTGTTACAATTTTAATAATAAAAAACTTTTCAATAAATTATAATCAAATACC [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$name
+#> [1] "Coelioxys afra"
+#>
+#> [[2]]$gene
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ACGAATAAATAATGTAAGATTTTGACTATTACCTCCCTCAATTTTCTTATTATTATCAAGAACCCTAATTAACCCAAGAGCTGGTACTGGATGAACTGTATATCCTCCTTTATCCTTATATACATTTCATGCCTCACCTTCCGTTGATTTAGCAATTTTTTCACTTCATTTATCAGGAATTTCATCAATTATTGGATCAATAAATTTTATTGTTACAATCTTAATAATAAAAAATTTTT [...]
+</code></pre>
+
+<p>You can optionally get back the <code>httr</code> response object</p>
+
+<pre><code class="r">res <- bold_seq(taxon = 'Coelioxys', response = TRUE)
+res$headers
+#> $date
+#> [1] "Tue, 15 Sep 2015 20:02:31 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=fasta.fas"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+</code></pre>
+
+<p>You can do geographic searches</p>
+
+<pre><code class="r">bold_seq(geo = "USA")
+#> [[1]]
+#> [[1]]$id
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$name
+#> [1] "Macrobdella decora"
+#>
+#> [[1]]$gene
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTAT [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$name
+#> [1] "Haemopis terrestris"
+#>
+#> [[2]]$gene
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTAT [...]
+#>
+#>
+#> [[3]]
+#> [[3]]$id
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$name
+#> [1] "Steinernema carpocapsae"
+#>
+#> [[3]]$gene
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$sequence
+#> [1] "---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGC [...]
+#>
+#>
+#> [[4]]
+#> [[4]]$id
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[4]]$gene
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAG [...]
+#>
+#>
+#> [[5]]
+#> [[5]]$id
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[5]]$gene
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAG [...]
+</code></pre>
+
+<p>And you can search by researcher name</p>
+
+<pre><code class="r">bold_seq(researchers = 'Thibaud Decaens')[[1]]
+#> $id
+#> [1] "BGABA657-14"
+#>
+#> $name
+#> [1] "Coleoptera"
+#>
+#> $gene
+#> [1] "BGABA657-14"
+#>
+#> $sequence
+#> [1] "ACACTCTATTTCATTTTCGGAGCTTGATCAGGAATAGTAGGAACTTCTTTAAGAATACTAATTCGATCTGAATTGGGAAACCCCGGCTCATTGATTGGGGATGATCAAATTTATAATGTTATTGTAACAGCCCATGCATTCATTATAATTTTTTTTATAGTAATACCGATCATAATAGGAGGTTTTGGAAATTGATTAGTCCCGCTAATATTAGGTGCCCCAGATATAGCATTTCCTCGAATAAATAATATAAGATTTTGACTTCTTCCGCCTTCATTAACTTTACTTATTATAAGAAGAATTGTAGAAAACGGGGCGGGAACAGGATGAACAGTTTACCCACCCCTCTCTTCTAACATTGCTCATAGAGGAGCCTCTGTAGATCTTGCAATTTTTAGATTACATTTAGCCGGTGTATCATCAATTTTAGGTGCAGTTAATTTTATTACAACTATTATTAATATACGACCTAAAGG [...]
+</code></pre>
+
+<p>by IDs (sample, process, museum, or field IDs)</p>
+
+<pre><code class="r">bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11'))
+#> [[1]]
+#> [[1]]$id
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[1]]$gene
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$sequence
+#> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACA [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[2]]$gene
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$sequence
+#> [1] "AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACA [...]
+</code></pre>
+
+<p>by container (containers include project codes and dataset codes)</p>
+
+<pre><code class="r">bold_seq(container = 'ACRJP')[[1]]
+#> $id
+#> [1] "ACRJP003-09"
+#>
+#> $name
+#> [1] "Lepidoptera"
+#>
+#> $gene
+#> [1] "ACRJP003-09"
+#>
+#> $sequence
+#> [1] "AACATTATATTTTATTTTTGGGATCTGATCTGGAATAGTAGGGACATCTTTAAGTATACTAATTCGAATAGAACTAGGAAATCCTGGATGTTTAATTGGGGATGATCAAATTTATAATACTATTGTTACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCCATTATAATTGGAGGTTTTGGCAATTGACTTGTACCATTAATATTAGGAGCCCCTGATATAGCATTTCCCCGAATAAATAATATAAGATTTTGACTTCTTCCCCCCTCATTAATTTTATTAATTTCAAGAAGAATTGTTGAAAATGGAGCAGGAACAGGATGAACAGTCTATCCTCCATTATCTTCTAATATTGCGCATAGAGGATCCTCTGTTGATTTAGCTATTTTCTCACTTCATTTAGCAGGAATTTCTTCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAATAAATA [...]
+</code></pre>
+
+<p>by bin (a bin is a <em>Barcode Index Number</em>)</p>
+
+<pre><code class="r">bold_seq(bin = 'BOLD:AAA5125')[[1]]
+#> $id
+#> [1] "BLPAB406-06"
+#>
+#> $name
+#> [1] "Eacles ormondei"
+#>
+#> $gene
+#> [1] "BLPAB406-06"
+#>
+#> $sequence
+#> [1] "AACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAGTAGGAACTTCTTTAAGATTACTAATTCGAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACTCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATA [...]
+</code></pre>
+
+<p>There are more ways to query; see <code>?bold_seq</code> for the full list of parameters.</p>
+
+<h3>Search for specimen data only</h3>
+
+<p>The BOLD specimen API returns only specimen data, not sequences. By default the data is downloaded in <code>tsv</code> format and returned to you as a <code>data.frame</code>.</p>
+
+<pre><code class="r">res <- bold_specimens(taxon = 'Osmia')
+head(res[,1:8])
+#> processid sampleid recordID catalognum fieldnum
+#> 1 ASGCB261-13 BIOUG07489-F10 3955538 BIOUG07489-F10
+#> 2 BCHYM1499-13 BC ZSM HYM 19359 4005348 BC ZSM HYM 19359 BC ZSM HYM 19359
+#> 3 BCHYM412-13 BC ZSM HYM 18272 3896353 BC ZSM HYM 18272 BC ZSM HYM 18272
+#> 4 BCHYM413-13 BC ZSM HYM 18273 3896354 BC ZSM HYM 18273 BC ZSM HYM 18273
+#> 5 FBAPB706-09 BC ZSM HYM 02181 1289067 BC ZSM HYM 02181 BC ZSM HYM 02181
+#> 6 FBAPB730-09 BC ZSM HYM 02205 1289091 BC ZSM HYM 02205 BC ZSM HYM 02205
+#> institution_storing bin_uri phylum_taxID
+#> 1 Biodiversity Institute of Ontario BOLD:AAB8874 20
+#> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAD6282 20
+#> 3 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAE4126 20
+#> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20
+</code></pre>
+
+<p>You can optionally get back the data in <code>XML</code> format</p>
+
+<pre><code class="r">bold_specimens(taxon = 'Osmia', format = 'xml')
+</code></pre>
+
+<pre><code class="r">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
+&lt;bold_records xsi:noNamespaceSchemaLocation="http://www.boldsystems.org/schemas/BOLDPublic_record.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"&gt;
+ &lt;record&gt;
+ &lt;record_id&gt;1470124&lt;/record_id&gt;
+ &lt;processid&gt;BOM1525-10&lt;/processid&gt;
+ &lt;bin_uri&gt;BOLD:AAN3337&lt;/bin_uri&gt;
+ &lt;specimen_identifiers&gt;
+ &lt;sampleid&gt;DHB 1011&lt;/sampleid&gt;
+ &lt;catalognum&gt;DHB 1011&lt;/catalognum&gt;
+ &lt;fieldnum&gt;DHB1011&lt;/fieldnum&gt;
+ &lt;institution_storing&gt;Marjorie Barrick Museum&lt;/institution_storing&gt;
+ &lt;/specimen_identifiers&gt;
+ &lt;taxonomy&gt;
+</code></pre>
+
+<p>You can choose to get the <code>httr</code> response object back if you'd rather work with the raw data returned from the BOLD API.</p>
+
+<pre><code class="r">res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE)
+res$url
+#> [1] "http://www.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&specimen_download=xml"
+res$status_code
+#> [1] 200
+res$headers
+#> $date
+#> [1] "Mon, 28 Mar 2016 20:39:18 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=bold_data.xml"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+</code></pre>
+
+<h3>Search for specimen plus sequence data</h3>
+
+<p>The combined specimen/sequence API returns both specimen and sequence data. Like the specimen API, it downloads <code>tsv</code> format data by default and returns it as a <code>data.frame</code>. Here we set <code>sepfasta = TRUE</code> so that the sequences are removed from the <code>data.frame</code> and returned separately as a list, which keeps the <code>data.frame</code> manageable.</p>
+
+<pre><code class="r">res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE)
+res$fasta[1:2]
+#> $`ASGCB261-13`
+#> [1] "AATTTTATATATAATTTTTGCTATATGATCAGGAATAATTGGTTCAGCAATAAGAATTATTATTCGAATAGAATTAAGAATTCCTGGTTCATGAATTTCAAATGATCAAACTTATAATTCTTTAGTTACTGCTCATGCTTTTTTAATAATTTTTTTCTTAGTTATACCATTCTTAATTGGGGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCATTTCCACGAATAAATAATATTAGATTTTGACTTTTACCTCCTTCTTTAATACTTTTATTATTAAGAAATTTTATAAATCCTAGTCCAGGAACTGGATGAACTGTTTATCCACCTTTATCTTCTCATTTATTTCATTCTTCTCCTTCAGTTGATATAGCTATTTTTTCTTTACATATTTCTGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTT [...]
+#>
+#> $`BCHYM1499-13`
+#> [1] "AATTCTTTACATAATTTTTGCTTTATGATCTGGAATAATTGGGTCAGCAATAAGAATTATTATTCGAATAGAATTAAGTATCCCAGGTTCATGAATTACTAATGATCAAATTTATAATTCTTTAGTAACTGCACATGCTTTTTTAATAATTTTTTTTCTTGTGATACCATTTTTAATTGGAGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCTTTCCCACGAATAAACAATATTAGATTTTGATTATTACCGCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCCCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTT [...]
+</code></pre>
+
+<p>Or you can index a specific sequence by its process ID</p>
+
+<pre><code class="r">res$fasta['GBAH0293-06']
+#> $`GBAH0293-06`
+#> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTC [...]
+</code></pre>
+
+<h3>Get trace files</h3>
+
+<p>This function downloads files to your machine - it does not load them into your R session - but prints out where the files are for your information.</p>
+
+<pre><code class="r">bold_trace(taxon = 'Osmia', quiet = TRUE)
+</code></pre>
+
+</body>
+
+</html>
diff --git a/man/bold-package.Rd b/man/bold-package.Rd
new file mode 100644
index 0000000..cddeb46
--- /dev/null
+++ b/man/bold-package.Rd
@@ -0,0 +1,38 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold-package.R
+\docType{package}
+\name{bold-package}
+\alias{bold}
+\alias{bold-package}
+\title{bold: A programmatic interface to the Barcode of Life data.}
+\description{
+bold: A programmatic interface to the Barcode of Life data.
+}
+\section{About}{
+
+
+This package gives you access to data from BOLD Systems \url{http://www.boldsystems.org/}
+via their API.
+}
+
+\section{Functions}{
+
+
+\itemize{
+ \item \code{\link{bold_specimens}} - Search for specimen data.
+ \item \code{\link{bold_seq}} - Search for and retrieve sequences.
+ \item \code{\link{bold_seqspec}} - Get sequence and specimen data together.
+ \item \code{\link{bold_trace}} - Get trace files - saves to disk.
+ \item \code{\link{read_trace}} - Read trace files into R.
+ \item \code{\link{bold_tax_name}} - Get taxonomic names via input names.
+ \item \code{\link{bold_tax_id}} - Get taxonomic names via BOLD identifiers.
+ \item \code{\link{bold_identify}} - Search for match given a COI sequence.
+}
+
+BOLD provides the specimen data in xml and tsv formats, and the sequence
+data in fasta format. So for the specimen data you can get back raw XML,
+or a data frame parsed from the tsv data, while for sequence data you get
+back a list (because sequences are quite long and would make a data frame
+unwieldy).
+}
+
diff --git a/man/bold_filter.Rd b/man/bold_filter.Rd
new file mode 100644
index 0000000..eb4b8c4
--- /dev/null
+++ b/man/bold_filter.Rd
@@ -0,0 +1,41 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_filter.R
+\name{bold_filter}
+\alias{bold_filter}
+\title{Filter BOLD specimen + sequence data.}
+\usage{
+bold_filter(x, by, how = "max")
+}
+\arguments{
+\item{x}{(data.frame) a data.frame, as returned from
+\code{\link{bold_seqspec}}. Note that some combinations of parameters
+in \code{\link{bold_seqspec}} don't return a data.frame. Stops with
+error message if this is not a data.frame. Required.}
+
+\item{by}{(character) the column by which to group. For example,
+if you want the longest sequence for each unique species name, then
+pass \strong{species_name}. If the column doesn't exist, error
+with message saying so. Required.}
+
+\item{how}{(character) one of "max" or "min", which get used as
+\code{which.max} or \code{which.min} to get the longest or shortest
+sequence, respectively. Note that we remove gap/alignment characters
+(\code{-}) before measuring sequence length.}
+}
+\value{
+a tibble/data.frame
+}
+\description{
+Filter BOLD specimen + sequence data.
+}
+\examples{
+\dontrun{
+res <- bold_seqspec(taxon='Osmia')
+maxx <- bold_filter(res, by = "species_name")
+minn <- bold_filter(res, by = "species_name", how = "min")
+
+vapply(maxx$nucleotides, nchar, 1, USE.NAMES = FALSE)
+vapply(minn$nucleotides, nchar, 1, USE.NAMES = FALSE)
+}
+}
+
diff --git a/man/bold_identify.Rd b/man/bold_identify.Rd
new file mode 100644
index 0000000..f9c1922
--- /dev/null
+++ b/man/bold_identify.Rd
@@ -0,0 +1,73 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_identify.R
+\name{bold_identify}
+\alias{bold_identify}
+\title{Search for matches to sequences against the BOLD COI database.}
+\usage{
+bold_identify(sequences, db = "COX1", response = FALSE, ...)
+}
+\arguments{
+\item{sequences}{(character) One or more sequences to match against the
+database. Required.}
+
+\item{db}{(character) The database to match against, one of COX1,
+COX1_SPECIES, COX1_SPECIES_PUBLIC, or COX1_L640bp. See the db options
+section below for more information.}
+
+\item{response}{(logical) Note that response is the object that returns
+from the Curl call, useful for debugging, and getting detailed info on
+the API call.}
+
+\item{...}{Further args passed on to \code{\link[httr]{GET}}, main purpose
+being curl debugging}
+}
+\value{
+A data.frame with details for each specimen matched.
+}
+\description{
+Search for matches to sequences against the BOLD COI database.
+}
+\section{db parameter options}{
+
+\itemize{
+ \item COX1 Every COI barcode record on BOLD with a minimum sequence
+ length of 500bp (warning: unvalidated library and includes records without
+ species level identification). This includes many species represented by
+ only one or two specimens as well as all species with interim taxonomy. This
+ search only returns a list of the nearest matches and does not provide a
+ probability of placement to a taxon.
+ \item COX1_SPECIES Every COI barcode record with a species level
+ identification and a minimum sequence length of 500bp. This includes
+ many species represented by only one or two specimens as well as all
+ species with interim taxonomy.
+ \item COX1_SPECIES_PUBLIC All published COI records from BOLD and GenBank
+ with a minimum sequence length of 500bp. This library is a collection of
+ records from the published projects section of BOLD.
+ \item COX1_L640bp Subset of the Species library with a minimum sequence
+ length of 640bp and containing both public and private records. This library
+ is intended for short sequence identification as it provides maximum overlap
+ with short reads from the barcode region of COI.
+}
+}
+
+\section{Named outputs}{
+
+To maintain names on the output list of data, make sure to pass in a
+named list to the \code{sequences} parameter. You can, for example,
+take a list of sequences and use \code{\link{setNames}} to set names.
+}
+\examples{
+\dontrun{
+seq <- sequences$seq1
+res <- bold_identify(sequences=seq)
+head(res[[1]])
+head(bold_identify(sequences=seq, db='COX1_SPECIES')[[1]])
+}
+}
+\references{
+\url{http://www.boldsystems.org/index.php/resources/api?type=idengine}
+}
+\seealso{
+\code{\link{bold_identify_parents}}
+}
+
diff --git a/man/bold_identify_parents.Rd b/man/bold_identify_parents.Rd
new file mode 100644
index 0000000..b1b3a38
--- /dev/null
+++ b/man/bold_identify_parents.Rd
@@ -0,0 +1,57 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_identify_parents.R
+\name{bold_identify_parents}
+\alias{bold_identify_parents}
+\title{Add taxonomic parent names to a data.frame}
+\usage{
+bold_identify_parents(x, wide = FALSE)
+}
+\arguments{
+\item{x}{(data.frame/list) list of data.frames - the output from a call to
+\code{\link{bold_identify}} - or a single data.frame from that output.
+Required.}
+
+\item{wide}{(logical) output in long or wide format. See the section on
+wide vs long format. Default: \code{FALSE}}
+}
+\value{
+a list of the same length as the input
+}
+\description{
+Add taxonomic parent names to a data.frame
+}
+\details{
+This function gets the unique set of taxonomic names from the input
+data.frame, then queries \code{\link{bold_tax_name}} to get the
+taxonomic ID, passing it to \code{\link{bold_tax_id}} to get the parent
+names, then attaches those to the input data.
+}
+\section{wide vs long format}{
+
+When \code{wide = FALSE} you get many rows for each record. Essentially,
+we \code{cbind} the taxonomic classification onto the one row from the
+result of \code{\link{bold_identify}}, giving as many rows as there are
+taxa in the taxonomic classification.
+
+When \code{wide = TRUE} you get one row for each record - thus the
+dimensions of the input data stay the same. For this option, we take just
+the rows for taxonomic ID and name for each taxon in the taxonomic
+classification, and name the columns by the taxon rank, so you get
+\code{phylum} and \code{phylum_id}, and so on.
+}
+\examples{
+\dontrun{
+df <- bold_identify(sequences = sequences$seq2)
+
+# long format
+out <- bold_identify_parents(df)
+str(out)
+head(out$seq1)
+
+# wide format
+out <- bold_identify_parents(df, wide = TRUE)
+str(out)
+head(out$seq1)
+}
+}
+
diff --git a/man/bold_seq.Rd b/man/bold_seq.Rd
new file mode 100644
index 0000000..a7c3fa1
--- /dev/null
+++ b/man/bold_seq.Rd
@@ -0,0 +1,79 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_seq.R
+\name{bold_seq}
+\alias{bold_seq}
+\title{Search BOLD for sequences.}
+\usage{
+bold_seq(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL,
+ response = FALSE, ...)
+}
+\arguments{
+\item{taxon}{(character) Returns all records containing matching taxa. Taxa include the ranks of
+phylum, class, order, family, subfamily, genus, and species.}
+
+\item{ids}{(character) Returns all records containing matching IDs. IDs include Sample IDs,
+Process IDs, Museum IDs and Field IDs.}
+
+\item{bin}{(character) Returns all records contained in matching BINs. A BIN is defined by a
+Barcode Index Number URI.}
+
+\item{container}{(character) Returns all records contained in matching projects or datasets.
+Containers include project codes and dataset codes.}
+
+\item{institutions}{(character) Returns all records stored in matching institutions. Institutions
+are the Specimen Storing Site.}
+
+\item{researchers}{(character) Returns all records containing matching researcher names.
+Researchers include collectors and specimen identifiers.}
+
+\item{geo}{(character) Returns all records collected in matching geographic sites. Geographic
+sites include countries and provinces/states.}
+
+\item{marker}{(character) Returns all records containing matching
+marker codes.}
+
+\item{response}{(logical) If \code{TRUE}, return the httr response object from the Curl call,
+useful for debugging and for getting detailed info on the API call. Default: \code{FALSE}}
+
+\item{...}{Further args passed on to httr::GET, main purpose being curl debugging}
+}
+\value{
+A list with each element of length 4 with slots for id, name,
+gene, and sequence.
+}
+\description{
+Get sequences for a taxonomic name, id, bin, container, institution,
+researcher, geographic place, or gene.
+}
+\examples{
+\dontrun{
+res <- bold_seq(taxon='Coelioxys')
+bold_seq(taxon='Aglae')
+bold_seq(taxon=c('Coelioxys','Osmia'))
+bold_seq(ids='ACRJP618-11')
+bold_seq(ids=c('ACRJP618-11','ACRJP619-11'))
+bold_seq(bin='BOLD:AAA5125')
+bold_seq(container='ACRJP')
+bold_seq(researchers='Thibaud Decaens')
+bold_seq(geo='Ireland')
+bold_seq(geo=c('Ireland','Denmark'))
+
+# Return the httr response object for detailed Curl call response details
+res <- bold_seq(taxon='Coelioxys', response=TRUE)
+res$url
+res$status_code
+res$headers
+
+## curl debugging
+### You can do many things, including get verbose output on the curl
+### call, and set a timeout
+library("httr")
+bold_seq(taxon='Coelioxys', config=verbose())[1:2]
+# bold_seq(taxon='Coelioxys', config=timeout(0.1))
+}
+}
+\references{
+\url{http://www.boldsystems.org/index.php/resources/api#sequenceParameters}
+}
+
diff --git a/man/bold_seqspec.Rd b/man/bold_seqspec.Rd
new file mode 100644
index 0000000..0b6657d
--- /dev/null
+++ b/man/bold_seqspec.Rd
@@ -0,0 +1,87 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_seqspec.R
+\name{bold_seqspec}
+\alias{bold_seqspec}
+\title{Get BOLD specimen + sequence data.}
+\usage{
+bold_seqspec(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL,
+ response = FALSE, format = "tsv", sepfasta = FALSE, ...)
+}
+\arguments{
+\item{taxon}{(character) Returns all records containing matching taxa. Taxa include the ranks of
+phylum, class, order, family, subfamily, genus, and species.}
+
+\item{ids}{(character) Returns all records containing matching IDs. IDs include Sample IDs,
+Process IDs, Museum IDs and Field IDs.}
+
+\item{bin}{(character) Returns all records contained in matching BINs. A BIN is defined by a
+Barcode Index Number URI.}
+
+\item{container}{(character) Returns all records contained in matching projects or datasets.
+Containers include project codes and dataset codes.}
+
+\item{institutions}{(character) Returns all records stored in matching institutions. Institutions
+are the Specimen Storing Site.}
+
+\item{researchers}{(character) Returns all records containing matching researcher names.
+Researchers include collectors and specimen identifiers.}
+
+\item{geo}{(character) Returns all records collected in matching geographic sites. Geographic
+sites include countries and provinces/states.}
+
+\item{marker}{(character) Returns all records containing matching marker
+codes.}
+
+\item{response}{(logical) If \code{TRUE}, return the httr response object from the Curl call,
+useful for debugging and for getting detailed info on the API call. Default: \code{FALSE}}
+
+\item{format}{(character) One of xml or tsv (default). tsv format gives
+back a data.frame object. xml gives back parsed xml.}
+
+\item{sepfasta}{(logical) If \code{TRUE}, the fasta data is separated into
+a list with names matching the processid's from the data frame.
+Default: \code{FALSE}}
+
+\item{...}{Further args passed on to httr::GET, main purpose being curl debugging}
+}
+\value{
+Either a data.frame, parsed xml, a httr response object, or a list
+with length two (a data.frame w/o nucleotide data, and a list with
+nucleotide data)
+}
+\description{
+Get BOLD specimen + sequence data.
+}
+\examples{
+\dontrun{
+bold_seqspec(taxon='Osmia')
+bold_seqspec(taxon='Osmia', format='xml')
+bold_seqspec(taxon='Osmia', response=TRUE)
+res <- bold_seqspec(taxon='Osmia', sepfasta=TRUE)
+res$fasta[1:2]
+res$fasta['GBAH0293-06']
+
+# records that match a marker name
+res <- bold_seqspec(taxon="Melanogrammus aeglefinus", marker="COI-5P")
+
+# records that match a geographic locality
+res <- bold_seqspec(taxon="Melanogrammus aeglefinus", geo="Canada")
+
+# return only the longest sequence for each
+
+## curl debugging
+### You can do many things, including get verbose output on the curl call,
+### and set a timeout
+library("httr")
+head(bold_seqspec(taxon='Osmia', config=verbose()))
+## timeout
+# head(bold_seqspec(taxon='Osmia', config=timeout(1)))
+## progress
+# x <- bold_seqspec(taxon='Osmia', config=progress())
+}
+}
+\references{
+\url{http://www.boldsystems.org/index.php/resources/api#combined}
+}
+
diff --git a/man/bold_specimens.Rd b/man/bold_specimens.Rd
new file mode 100644
index 0000000..d7a73e1
--- /dev/null
+++ b/man/bold_specimens.Rd
@@ -0,0 +1,69 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_specimens.R
+\name{bold_specimens}
+\alias{bold_specimens}
+\title{Search BOLD for specimens.}
+\usage{
+bold_specimens(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, response = FALSE,
+ format = "tsv", ...)
+}
+\arguments{
+\item{taxon}{(character) Returns all records containing matching taxa. Taxa include the ranks of
+phylum, class, order, family, subfamily, genus, and species.}
+
+\item{ids}{(character) Returns all records containing matching IDs. IDs include Sample IDs,
+Process IDs, Museum IDs and Field IDs.}
+
+\item{bin}{(character) Returns all records contained in matching BINs. A BIN is defined by a
+Barcode Index Number URI.}
+
+\item{container}{(character) Returns all records contained in matching projects or datasets.
+Containers include project codes and dataset codes.}
+
+\item{institutions}{(character) Returns all records stored in matching institutions. Institutions
+are the Specimen Storing Site.}
+
+\item{researchers}{(character) Returns all records containing matching researcher names.
+Researchers include collectors and specimen identifiers.}
+
+\item{geo}{(character) Returns all records collected in matching geographic sites. Geographic
+sites include countries and provinces/states.}
+
+\item{response}{(logical) If \code{TRUE}, return the httr response object from the Curl call,
+useful for debugging and for getting detailed info on the API call. Default: \code{FALSE}}
+
+\item{format}{(character) One of xml or tsv (default). tsv format gives
+back a data.frame object. xml gives back parsed xml.}
+
+\item{...}{Further args passed on to httr::GET, main purpose being curl debugging}
+}
+\description{
+Search BOLD for specimens.
+}
+\examples{
+\dontrun{
+bold_specimens(taxon='Osmia')
+bold_specimens(taxon='Osmia', format='xml')
+# bold_specimens(taxon='Osmia', response=TRUE)
+res <- bold_specimens(taxon='Osmia', format='xml', response=TRUE)
+res$url
+res$status_code
+res$headers
+
+# More than 1 can be given for all search parameters
+bold_specimens(taxon=c('Coelioxys','Osmia'))
+
+## curl debugging
+### These examples below take a long time, so you can set a timeout so that
+### it stops by X sec
+library("httr")
+head(bold_specimens(taxon='Osmia', config=verbose()))
+# head(bold_specimens(geo='Costa Rica', config=timeout(6)))
+# head(bold_specimens(taxon="Formicidae", geo="Canada", config=timeout(6)))
+}
+}
+\references{
+\url{http://www.boldsystems.org/index.php/resources/api#specimenParameters}
+}
+
diff --git a/man/bold_tax_id.Rd b/man/bold_tax_id.Rd
new file mode 100644
index 0000000..98010f8
--- /dev/null
+++ b/man/bold_tax_id.Rd
@@ -0,0 +1,66 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_tax_id.R
+\name{bold_tax_id}
+\alias{bold_tax_id}
+\title{Search BOLD for taxonomy data by BOLD ID.}
+\usage{
+bold_tax_id(id, dataTypes = "basic", includeTree = FALSE,
+ response = FALSE, ...)
+}
+\arguments{
+\item{id}{(integer) One or more BOLD taxonomic identifiers. required.}
+
+\item{dataTypes}{(character) Specifies the datatypes that will be
+returned. 'all' returns all data. 'basic' (default) returns basic taxon information.
+'images' returns specimen images. See the examples for further values, such as 'stats' and 'geo'.}
+
+\item{includeTree}{(logical) If TRUE (default: FALSE), returns a list
+containing information for parent taxa as well as the specified taxon.}
+
+\item{response}{(logical) If \code{TRUE}, return the httr response object from the Curl call,
+useful for debugging and for getting detailed info on the API call. Default: \code{FALSE}}
+
+\item{...}{Further args passed on to httr::GET, main purpose being curl debugging}
+}
+\description{
+Search BOLD for taxonomy data by BOLD ID.
+}
+\examples{
+\dontrun{
+bold_tax_id(id=88899)
+bold_tax_id(id=88899, includeTree=TRUE)
+bold_tax_id(id=88899, includeTree=TRUE, dataTypes = "stats")
+bold_tax_id(id=c(88899,125295))
+
+## dataTypes parameter
+bold_tax_id(id=88899, dataTypes = "basic")
+bold_tax_id(id=88899, dataTypes = "stats")
+bold_tax_id(id=88899, dataTypes = "images")
+bold_tax_id(id=88899, dataTypes = "geo")
+bold_tax_id(id=88899, dataTypes = "sequencinglabs")
+bold_tax_id(id=88899, dataTypes = "depository")
+bold_tax_id(id=88899, dataTypes = "thirdparty")
+bold_tax_id(id=88899, dataTypes = "all")
+bold_tax_id(id=c(88899,125295), dataTypes = "geo")
+bold_tax_id(id=c(88899,125295), dataTypes = "images")
+
+## Passing in NA
+bold_tax_id(id = NA)
+bold_tax_id(id = c(88899,125295,NA))
+
+## get httr response object only
+bold_tax_id(id=88899, response=TRUE)
+bold_tax_id(id=c(88899,125295), response=TRUE)
+
+## curl debugging
+library('httr')
+bold_tax_id(id=88899, config=verbose())
+}
+}
+\references{
+\url{http://boldsystems.org/index.php/resources/api?type=taxonomy}
+}
+\seealso{
+\code{\link{bold_tax_name}}
+}
+
diff --git a/man/bold_tax_name.Rd b/man/bold_tax_name.Rd
new file mode 100644
index 0000000..a5831bb
--- /dev/null
+++ b/man/bold_tax_name.Rd
@@ -0,0 +1,56 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_tax_name.R
+\name{bold_tax_name}
+\alias{bold_tax_name}
+\title{Search BOLD for taxonomy data by taxonomic name.}
+\usage{
+bold_tax_name(name, fuzzy = FALSE, response = FALSE, ...)
+}
+\arguments{
+\item{name}{(character) One or more scientific names. required.}
+
+\item{fuzzy}{(logical) Whether to use fuzzy search or not (default: FALSE).}
+
+\item{response}{(logical) If \code{TRUE}, return the httr response object from the Curl call,
+useful for debugging and for getting detailed info on the API call. Default: \code{FALSE}}
+
+\item{...}{Further args passed on to httr::GET, main purpose being curl debugging}
+}
+\description{
+Search BOLD for taxonomy data by taxonomic name.
+}
+\details{
+The \code{dataTypes} parameter is not supported in this function.
+If you want to use that parameter, get an ID from this function and pass
+it into \code{bold_tax_id}, and then use the \code{dataTypes} parameter.
+}
+\examples{
+\dontrun{
+bold_tax_name(name='Diplura')
+bold_tax_name(name='Osmia')
+bold_tax_name(name=c('Diplura','Osmia'))
+bold_tax_name(name=c("Apis","Puma concolor","Pinus concolor"))
+bold_tax_name(name='Diplur', fuzzy=TRUE)
+bold_tax_name(name='Osm', fuzzy=TRUE)
+
+## get httr response object only
+bold_tax_name(name='Diplura', response=TRUE)
+bold_tax_name(name=c('Diplura','Osmia'), response=TRUE)
+
+## Names with no data in BOLD database
+bold_tax_name("Nasiaeshna pentacantha")
+bold_tax_name(name = "Cordulegaster erronea")
+bold_tax_name(name = "Cordulegaster erronea", response=TRUE)
+
+## curl debugging
+library('httr')
+bold_tax_name(name='Diplura', config=verbose())
+}
+}
+\references{
+\url{http://boldsystems.org/index.php/resources/api?type=taxonomy}
+}
+\seealso{
+\code{\link{bold_tax_id}}
+}
+
diff --git a/man/bold_trace.Rd b/man/bold_trace.Rd
new file mode 100644
index 0000000..7e948b8
--- /dev/null
+++ b/man/bold_trace.Rd
@@ -0,0 +1,82 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold_trace.R
+\name{bold_trace}
+\alias{bold_trace}
+\alias{read_trace}
+\title{Get BOLD trace files}
+\usage{
+bold_trace(taxon = NULL, ids = NULL, bin = NULL, container = NULL,
+ institutions = NULL, researchers = NULL, geo = NULL, marker = NULL,
+ dest = NULL, overwrite = TRUE, progress = TRUE, ...)
+
+read_trace(x)
+}
+\arguments{
+\item{taxon}{(character) Returns all records containing matching taxa. Taxa include the ranks of
+phylum, class, order, family, subfamily, genus, and species.}
+
+\item{ids}{(character) Returns all records containing matching IDs. IDs include Sample IDs,
+Process IDs, Museum IDs and Field IDs.}
+
+\item{bin}{(character) Returns all records contained in matching BINs. A BIN is defined by a
+Barcode Index Number URI.}
+
+\item{container}{(character) Returns all records contained in matching projects or datasets.
+Containers include project codes and dataset codes.}
+
+\item{institutions}{(character) Returns all records stored in matching institutions. Institutions
+are the Specimen Storing Site.}
+
+\item{researchers}{(character) Returns all records containing matching researcher names.
+Researchers include collectors and specimen identifiers.}
+
+\item{geo}{(character) Returns all records collected in matching geographic sites. Geographic
+sites include countries and provinces/states.}
+
+\item{marker}{(character) Returns all records containing matching
+marker codes.}
+
+\item{dest}{(character) A directory to write the files to}
+
+\item{overwrite}{(logical) Overwrite existing directory and file?}
+
+\item{progress}{(logical) Print progress or not. Uses
+\code{\link[httr]{progress}}.}
+
+\item{...}{Further args passed on to \code{\link[httr]{GET}}.}
+
+\item{x}{Object to print or read.}
+}
+\description{
+Get BOLD trace files
+}
+\examples{
+\dontrun{
+# Use a specific destination directory
+bold_trace(taxon='Bombus', geo='Alaska', dest="~/mytarfiles")
+
+# Another example
+# bold_trace(ids='ACRJP618-11', dest="~/mytarfiles")
+# bold_trace(ids=c('ACRJP618-11','ACRJP619-11'), dest="~/mytarfiles")
+
+# read file in
+x <- bold_trace(ids=c('ACRJP618-11','ACRJP619-11'), dest="~/mytarfiles")
+(res <- read_trace(x$ab1[2]))
+
+# The progress output is pretty verbose, so progress=FALSE is a nice touch,
+# but it is not the default
+# Beware, this one takes a while
+# x <- bold_trace(taxon='Osmia', progress=FALSE)
+
+if (requireNamespace("sangerseqR", quietly = TRUE)) {
+ library("sangerseqR")
+ primarySeq(res)
+ secondarySeq(res)
+ head(traceMatrix(res))
+}
+}
+}
+\references{
+\url{http://www.boldsystems.org/index.php/resources/api#trace}
+}
+
diff --git a/man/sequences.Rd b/man/sequences.Rd
new file mode 100644
index 0000000..18399b8
--- /dev/null
+++ b/man/sequences.Rd
@@ -0,0 +1,16 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/bold-package.R
+\docType{data}
+\name{sequences}
+\alias{sequences}
+\title{List of 3 nucleotide sequences to use in examples for the
+\code{\link{bold_identify}} function}
+\description{
+List of 3 nucleotide sequences to use in examples for the
+\code{\link{bold_identify}} function
+}
+\details{
+Each sequence is a character string, of lengths 410, 600, and 696.
+}
+\keyword{data}
+
diff --git a/tests/test-all.R b/tests/test-all.R
new file mode 100644
index 0000000..0d33db5
--- /dev/null
+++ b/tests/test-all.R
@@ -0,0 +1,2 @@
+library(testthat)
+test_check('bold')
diff --git a/tests/testthat/test-bold_identify.R b/tests/testthat/test-bold_identify.R
new file mode 100644
index 0000000..2385d1e
--- /dev/null
+++ b/tests/testthat/test-bold_identify.R
@@ -0,0 +1,36 @@
+context("bold_identify")
+
+seq <- sequences$seq1
+
+test_that("bold_identify works as expected", {
+ skip_on_cran()
+
+ aa <- bold_identify(seq)
+ expect_is(aa, 'list')
+ expect_is(aa[[1]], 'data.frame')
+ expect_is(aa[[1]]$ID, 'character')
+})
+
+test_that("bold_identify db param works as expected", {
+ skip_on_cran()
+
+ aa <- bold_identify(seq, db = 'COX1_SPECIES')
+ expect_is(aa, 'list')
+ expect_is(aa[[1]], 'data.frame')
+ expect_is(aa[[1]]$ID, 'character')
+})
+
+test_that("bold_identify response param works as expected", {
+ skip_on_cran()
+
+ aa <- bold_identify(seq, response = TRUE)
+ expect_is(aa, "list")
+ expect_is(aa[[1]], "response")
+ expect_equal(aa[[1]]$status_code, 200)
+})
+
+test_that("bold_identify fails well", {
+ skip_on_cran()
+
+ expect_error(bold_identify(), "argument \"sequences\" is missing, with no default")
+})
diff --git a/tests/testthat/test-bold_seq.R b/tests/testthat/test-bold_seq.R
new file mode 100644
index 0000000..5752b2f
--- /dev/null
+++ b/tests/testthat/test-bold_seq.R
@@ -0,0 +1,30 @@
+# tests for bold_seq fxn in bold
+context("bold_seq")
+
+test_that("bold_seq returns the correct dimensions/classes", {
+ skip_on_cran()
+
+ a <- bold_seq(taxon='Coelioxys')
+ b <- bold_seq(bin='BOLD:AAA5125')
+ c <- bold_seq(taxon='Coelioxys', response=TRUE)
+
+ expect_equal(c$status_code, 200)
+ expect_equal(c$headers$`content-type`, "application/x-download")
+
+ expect_is(a, "list")
+ expect_is(b, "list")
+
+ expect_is(a[[1]], "list")
+ expect_is(a[[1]]$id, "character")
+ expect_is(a[[1]]$sequence, "character")
+
+ expect_is(c, "response")
+ expect_is(c$headers, "insensitive")
+})
+
+test_that("bold_seq returns correct error when parameters empty or not given", {
+ skip_on_cran()
+
+ expect_error(bold_seq(taxon=''), "must provide a non-empty value")
+ expect_error(bold_seq(), "must provide a non-empty value")
+})
diff --git a/tests/testthat/test-bold_seqspec.R b/tests/testthat/test-bold_seqspec.R
new file mode 100644
index 0000000..4c49fea
--- /dev/null
+++ b/tests/testthat/test-bold_seqspec.R
@@ -0,0 +1,32 @@
+# tests for bold_seqspec fxn in bold
+context("bold_seqspec")
+
+test_that("bold_seqspec returns the correct dimensions or values", {
+ skip_on_cran()
+
+ a <- bold_seqspec(taxon='Osmia')
+ b <- bold_seqspec(taxon='Osmia', response=TRUE)
+ c <- bold_seqspec(taxon='Osmia', sepfasta=TRUE)
+
+ expect_equal(b$status_code, 200)
+ expect_equal(b$headers$`content-type`, "application/x-download")
+
+ expect_is(a, "data.frame")
+ expect_is(b, "response")
+ expect_is(c, "list")
+ expect_is(c$data, "data.frame")
+ expect_is(c$fasta, "list")
+ expect_is(c$fasta[[1]], "character")
+
+ expect_is(a$recordID, "integer")
+ expect_is(a$directions, "character")
+
+ expect_is(b$headers, "insensitive")
+})
+
+test_that("bold_seqspec returns correct error when parameters empty or not given", {
+ skip_on_cran()
+
+ expect_error(bold_seqspec(taxon=''), "must provide a non-empty value")
+ expect_error(bold_seqspec(), "must provide a non-empty value")
+})
diff --git a/tests/testthat/test-bold_specimens.R b/tests/testthat/test-bold_specimens.R
new file mode 100644
index 0000000..374f4c2
--- /dev/null
+++ b/tests/testthat/test-bold_specimens.R
@@ -0,0 +1,35 @@
+# tests for bold_specimens fxn in bold
+context("bold_specimens")
+
+library("httr")
+
+test_that("bold_specimens returns the correct dimensions or values", {
+ skip_on_cran()
+
+ a <- bold_specimens(taxon='Osmia')
+ b <- bold_specimens(taxon='Osmia', format='xml', response=TRUE)
+
+ expect_equal(b$status_code, 200)
+ expect_equal(b$headers$`content-type`, "application/x-download")
+
+ expect_is(a, "data.frame")
+ expect_is(b, "response")
+
+ expect_is(a$recordID, "integer")
+ expect_is(a$directions, "character")
+
+ expect_is(b$headers, "insensitive")
+})
+
+test_that("Throws error on call that takes forever when a timeout is set in config", {
+ skip_on_cran()
+
+ expect_error(bold_specimens(geo='Costa Rica', config=timeout(2)), "Timeout was reached")
+})
+
+test_that("bold_specimens returns correct error when parameters empty or not given", {
+ skip_on_cran()
+
+ expect_error(bold_specimens(taxon=''), "must provide a non-empty value")
+ expect_error(bold_specimens(), "must provide a non-empty value")
+})
diff --git a/tests/testthat/test-bold_tax_id.R b/tests/testthat/test-bold_tax_id.R
new file mode 100644
index 0000000..649a9a5
--- /dev/null
+++ b/tests/testthat/test-bold_tax_id.R
@@ -0,0 +1,74 @@
+context("bold_tax_id")
+
+test_that("bold_tax_id returns the correct classes", {
+ skip_on_cran()
+
+ aa <- bold_tax_id(88899)
+ bb <- bold_tax_id(125295)
+
+ expect_is(aa, "data.frame")
+ expect_is(bb, "data.frame")
+
+ expect_is(aa$input, "numeric")
+ expect_is(aa$taxid, "integer")
+ expect_is(aa$tax_rank, "character")
+})
+
+test_that("bold_tax_id works with multiple ids passed in", {
+ skip_on_cran()
+
+ aa <- bold_tax_id(c(88899,125295))
+
+ expect_is(aa, "data.frame")
+ expect_equal(NROW(aa), 2)
+})
+
+test_that("bold_tax_id dataTypes param works as expected", {
+ skip_on_cran()
+
+ aa <- bold_tax_id(88899, dataTypes = "basic")
+ bb <- bold_tax_id(88899, dataTypes = "stats")
+ dd <- bold_tax_id(88899, dataTypes = "geo")
+ ee <- bold_tax_id(88899, dataTypes = "sequencinglabs")
+ ff <- bold_tax_id(321215, dataTypes = "stats") # no public marker sequences
+ gg <- bold_tax_id(321215, dataTypes = "basic,stats") # no public marker sequences
+
+ expect_is(aa, "data.frame")
+ expect_is(bb, "data.frame")
+ expect_is(dd, "data.frame")
+ expect_is(ee, "data.frame")
+ expect_is(ff, "data.frame")
+ expect_is(gg, "data.frame")
+
+ expect_equal(NROW(aa), 1)
+ expect_equal(NROW(bb), 1)
+ expect_equal(NROW(dd), 1)
+ expect_equal(NROW(ee), 1)
+ expect_equal(NROW(ff), 1)
+ expect_equal(NROW(gg), 1)
+
+ expect_named(dd, c('input','Brazil','Mexico','Panama','Guatemala','Peru','Bolivia','Ecuador'))
+
+ expect_gt(NCOL(bb), NCOL(aa))
+ expect_gt(NCOL(ee), NCOL(aa))
+ expect_gt(NCOL(bb), NCOL(ee))
+ expect_gt(NCOL(ff), NCOL(aa))
+ expect_gt(NCOL(gg), NCOL(ff))
+})
+
+test_that("includeTree param works as expected", {
+ skip_on_cran()
+
+ aa <- bold_tax_id(id=88899, includeTree=FALSE)
+ bb <- bold_tax_id(id=88899, includeTree=TRUE)
+
+ expect_is(aa, "data.frame")
+ expect_is(bb, "data.frame")
+ expect_gt(NROW(bb), NROW(aa))
+})
+
+test_that("bold_tax_id fails well", {
+ skip_on_cran()
+
+ expect_error(bold_tax_id(), "argument \"id\" is missing, with no default")
+})
diff --git a/tests/testthat/test-bold_tax_name.R b/tests/testthat/test-bold_tax_name.R
new file mode 100644
index 0000000..672d7c9
--- /dev/null
+++ b/tests/testthat/test-bold_tax_name.R
@@ -0,0 +1,33 @@
+context("bold_tax_name")
+
+test_that("bold_tax_name returns the correct classes", {
+ skip_on_cran()
+
+ a <- bold_tax_name(name='Diplura')
+ b <- bold_tax_name(name=c('Diplura','Osmia'))
+ cc <- bold_tax_name(name=c("Apis","Puma concolor","Pinus concolor"))
+
+ expect_is(a, "data.frame")
+ expect_is(b, "data.frame")
+ expect_is(cc, "data.frame")
+
+ expect_is(a$input, "character")
+ expect_is(a$taxid, "integer")
+})
+
+test_that("bold_tax_name fails well", {
+ skip_on_cran()
+
+ expect_error(bold_tax_name(), "argument \"name\" is missing, with no default")
+})
+
+test_that("fuzzy works", {
+ skip_on_cran()
+
+ aa <- bold_tax_name(name='Diplur', fuzzy=TRUE)
+ aa_not <- bold_tax_name(name='Diplur', fuzzy=FALSE)
+
+ expect_is(aa, "data.frame")
+ expect_is(aa$input, "character")
+ expect_gt(NROW(aa), NROW(aa_not))
+})
diff --git a/vignettes/bold_vignette.Rmd b/vignettes/bold_vignette.Rmd
new file mode 100644
index 0000000..5d5d2f9
--- /dev/null
+++ b/vignettes/bold_vignette.Rmd
@@ -0,0 +1,439 @@
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{bold vignette}
+%\VignetteEncoding{UTF-8}
+-->
+
+
+
+`bold` is an R package to connect to [BOLD Systems](http://www.boldsystems.org/) via their API. Functions in `bold` let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.
+
+### bold info
+
++ [BOLD home page](http://boldsystems.org/)
++ [BOLD API docs](http://boldsystems.org/index.php/resources/api)
+
+### Using bold
+
+**Install**
+
+Install `bold` from CRAN
+
+
+
+```r
+install.packages("bold")
+```
+
+Or install the development version from GitHub
+
+
+```r
+devtools::install_github("ropensci/bold")
+```
+
+Load the package
+
+
+```r
+library("bold")
+```
+
+
+### Search for taxonomic names via names
+
+`bold_tax_name` searches for taxonomic names by name.
+
+
+```r
+bold_tax_name(name = 'Diplura')
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> taxonrep
+#> 1 Diplura
+#> 2 <NA>
+```
+
+
+```r
+bold_tax_name(name = c('Diplura', 'Osmia'))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 Diplura 591238 Diplura order Animals 82 Insecta
+#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
+#> 3 Osmia 4940 Osmia genus Animals 4962 Megachilinae
+#> taxonrep
+#> 1 Diplura
+#> 2 <NA>
+#> 3 Osmia
+```
+
+
+### Search for taxonomic names via BOLD identifiers
+
+`bold_tax_id` searches for taxonomic names by BOLD identifier.
+
+
+```r
+bold_tax_id(id = 88899)
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+```
+
+
+```r
+bold_tax_id(id = c(88899, 125295))
+#> input taxid taxon tax_rank tax_division parentid parentname
+#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
+#> 2 125295 125295 Helianthus genus Plants 100962 Asteraceae
+```
+
+
+### Search for sequence data only
+
+The BOLD sequence API gives back sequence data, with a bit of metadata.
+
+The default is to get a list back
+
+
+```r
+bold_seq(taxon = 'Coelioxys')[1:2]
+#> [[1]]
+#> [[1]]$id
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$name
+#> [1] "Coelioxys conica"
+#>
+#> [[1]]$gene
+#> [1] "FBAPB491-09"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------ACCTCTTTAAGAATAATTATTCGTATAGAAATAAGAATTCCAGGATCTTGAATTAATAATGATCAAATTTATAACTCCTTTATTACAGCACATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTCTTATTGGAGGATTTGGAAATTGATTAGTACCTTTAATATTAGGATCACCAGATATAGCTTTCCCACGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTATTAAGTAATTTAATAAATCCCAGACCAGGAACAGGCTGAACAGTTTATCCTCCTTTATCTTTATACACATACCACCCTTCTCCCTCAGTTGATTTAGCAATTTTTTCACTACATCTATCAGGAATCTCTTCTATTATTGGATCTATAAATTTTATTGTTACAATTTTAATAATAAAAAACTTTTCAATAAATTATAATCAAATACCATTATTCC [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$name
+#> [1] "Coelioxys afra"
+#>
+#> [[2]]$gene
+#> [1] "FBAPC351-10"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ACGAATAAATAATGTAAGATTTTGACTATTACCTCCCTCAATTTTCTTATTATTATCAAGAACCCTAATTAACCCAAGAGCTGGTACTGGATGAACTGTATATCCTCCTTTATCCTTATATACATTTCATGCCTCACCTTCCGTTGATTTAGCAATTTTTTCACTTCATTTATCAGGAATTTCATCAATTATTGGATCAATAAATTTTATTGTTACAATCTTAATAATAAAAAATTTTTCTTTAAAT [...]
+```
+
+You can optionally get back the `httr` response object
+
+
+```r
+res <- bold_seq(taxon = 'Coelioxys', response = TRUE)
+res$headers
+#> $date
+#> [1] "Tue, 15 Sep 2015 20:02:31 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=fasta.fas"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+```
+
+You can do geographic searches
+
+
+```r
+bold_seq(geo = "USA")
+#> [[1]]
+#> [[1]]$id
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$name
+#> [1] "Macrobdella decora"
+#>
+#> [[1]]$gene
+#> [1] "GBAN1777-08"
+#>
+#> [[1]]$sequence
+#> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTATTTTAGGTT [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$name
+#> [1] "Haemopis terrestris"
+#>
+#> [[2]]$gene
+#> [1] "GBAN1780-08"
+#>
+#> [[2]]$sequence
+#> [1] "---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTATTTTAGGCT [...]
+#>
+#>
+#> [[3]]
+#> [[3]]$id
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$name
+#> [1] "Steinernema carpocapsae"
+#>
+#> [[3]]$gene
+#> [1] "GBNM0293-06"
+#>
+#> [[3]]$sequence
+#> [1] "---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGCTATTAATT [...]
+#>
+#>
+#> [[4]]
+#> [[4]]$id
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[4]]$gene
+#> [1] "NEONV108-11"
+#>
+#> [[4]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACT [...]
+#>
+#>
+#> [[5]]
+#> [[5]]$id
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$name
+#> [1] "Aedes thelcter"
+#>
+#> [[5]]$gene
+#> [1] "NEONV109-11"
+#>
+#> [[5]]$sequence
+#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACT [...]
+```
+
+And you can search by researcher name
+
+
+```r
+bold_seq(researchers = 'Thibaud Decaens')[[1]]
+#> $id
+#> [1] "BGABA657-14"
+#>
+#> $name
+#> [1] "Coleoptera"
+#>
+#> $gene
+#> [1] "BGABA657-14"
+#>
+#> $sequence
+#> [1] "ACACTCTATTTCATTTTCGGAGCTTGATCAGGAATAGTAGGAACTTCTTTAAGAATACTAATTCGATCTGAATTGGGAAACCCCGGCTCATTGATTGGGGATGATCAAATTTATAATGTTATTGTAACAGCCCATGCATTCATTATAATTTTTTTTATAGTAATACCGATCATAATAGGAGGTTTTGGAAATTGATTAGTCCCGCTAATATTAGGTGCCCCAGATATAGCATTTCCTCGAATAAATAATATAAGATTTTGACTTCTTCCGCCTTCATTAACTTTACTTATTATAAGAAGAATTGTAGAAAACGGGGCGGGAACAGGATGAACAGTTTACCCACCCCTCTCTTCTAACATTGCTCATAGAGGAGCCTCTGTAGATCTTGCAATTTTTAGATTACATTTAGCCGGTGTATCATCAATTTTAGGTGCAGTTAATTTTATTACAACTATTATTAATATACGACCTAAAGGAATAACAT [...]
+```
+
+by record IDs (the IDs below are BOLD process IDs)
+
+
+```r
+bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11'))
+#> [[1]]
+#> [[1]]$id
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[1]]$gene
+#> [1] "ACRJP618-11"
+#>
+#> [[1]]$sequence
+#> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAAT [...]
+#>
+#>
+#> [[2]]
+#> [[2]]$id
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$name
+#> [1] "Lepidoptera"
+#>
+#> [[2]]$gene
+#> [1] "ACRJP619-11"
+#>
+#> [[2]]$sequence
+#> [1] "AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAAT [...]
+```
+
+by container (containers include project codes and dataset codes)
+
+
+```r
+bold_seq(container = 'ACRJP')[[1]]
+#> $id
+#> [1] "ACRJP003-09"
+#>
+#> $name
+#> [1] "Lepidoptera"
+#>
+#> $gene
+#> [1] "ACRJP003-09"
+#>
+#> $sequence
+#> [1] "AACATTATATTTTATTTTTGGGATCTGATCTGGAATAGTAGGGACATCTTTAAGTATACTAATTCGAATAGAACTAGGAAATCCTGGATGTTTAATTGGGGATGATCAAATTTATAATACTATTGTTACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCCATTATAATTGGAGGTTTTGGCAATTGACTTGTACCATTAATATTAGGAGCCCCTGATATAGCATTTCCCCGAATAAATAATATAAGATTTTGACTTCTTCCCCCCTCATTAATTTTATTAATTTCAAGAAGAATTGTTGAAAATGGAGCAGGAACAGGATGAACAGTCTATCCTCCATTATCTTCTAATATTGCGCATAGAGGATCCTCTGTTGATTTAGCTATTTTCTCACTTCATTTAGCAGGAATTTCTTCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAATAAATAATTTACTT [...]
+```
+
+by bin (a bin is a _Barcode Index Number_)
+
+
+```r
+bold_seq(bin = 'BOLD:AAA5125')[[1]]
+#> $id
+#> [1] "BLPAB406-06"
+#>
+#> $name
+#> [1] "Eacles ormondei"
+#>
+#> $gene
+#> [1] "BLPAB406-06"
+#>
+#> $sequence
+#> [1] "AACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAGTAGGAACTTCTTTAAGATTACTAATTCGAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACTCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATAATATAATA [...]
+```
+
+There are more ways to query; see the docs with `?bold_seq`.
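+
+As a rough sketch of what you might do with these results, the list returned by `bold_seq()` can be flattened into FASTA text. The records below are made-up stand-ins shaped like the output above, and `to_fasta()` is a hypothetical helper, not part of the package.
+
+```r
+# Mock records shaped like the bold_seq() output above (all values are made up).
+recs <- list(
+  list(id = "X1-01", name = "Aedes thelcter", gene = "X1-01", sequence = "---AACTTT--ATAC"),
+  list(id = "X2-01", name = "Osmia", gene = "X2-01", sequence = "AATTTTATAT")
+)
+
+# Hypothetical helper: one FASTA entry per record, i.e. a ">" header line
+# followed by the sequence on the next line.
+to_fasta <- function(x) {
+  vapply(x, function(r) {
+    paste0(">", r$id, "|", r$name, "\n", r$sequence)
+  }, character(1))
+}
+
+fasta <- to_fasta(recs)
+
+# Write the entries to a temporary file.
+writeLines(fasta, file.path(tempdir(), "bold_seqs.fasta"))
+```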
+
+
+### Search for specimen data only
+
+The BOLD specimen API returns only specimen data, not sequences. By default the data is downloaded in `tsv` format and returned to you as a `data.frame`.
+
+
+```r
+res <- bold_specimens(taxon = 'Osmia')
+head(res[,1:8])
+#> processid sampleid recordID catalognum fieldnum
+#> 1 ASGCB261-13 BIOUG07489-F10 3955538 BIOUG07489-F10
+#> 2 BCHYM1499-13 BC ZSM HYM 19359 4005348 BC ZSM HYM 19359 BC ZSM HYM 19359
+#> 3 BCHYM412-13 BC ZSM HYM 18272 3896353 BC ZSM HYM 18272 BC ZSM HYM 18272
+#> 4 BCHYM413-13 BC ZSM HYM 18273 3896354 BC ZSM HYM 18273 BC ZSM HYM 18273
+#> 5 FBAPB706-09 BC ZSM HYM 02181 1289067 BC ZSM HYM 02181 BC ZSM HYM 02181
+#> 6 FBAPB730-09 BC ZSM HYM 02205 1289091 BC ZSM HYM 02205 BC ZSM HYM 02205
+#> institution_storing bin_uri phylum_taxID
+#> 1 Biodiversity Institute of Ontario BOLD:AAB8874 20
+#> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAD6282 20
+#> 3 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
+#> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAE4126 20
+#> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20
+```
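+
+Because the result is an ordinary `data.frame`, base R is enough to summarise it. The sketch below counts records per BIN on a small made-up `data.frame` that reuses two of the column names shown above. It assumes a missing BIN appears as an empty string, which may not match the real data.
+
+```r
+# Mock data.frame with two of the columns shown above (all values are made up).
+res <- data.frame(
+  processid = c("P1", "P2", "P3", "P4"),
+  bin_uri = c("BOLD:AAP2416", "BOLD:AAP2416", "BOLD:AAE4126", ""),
+  stringsAsFactors = FALSE
+)
+
+# Assumption: records without a BIN carry an empty string; drop them.
+with_bin <- res[nzchar(res$bin_uri), ]
+
+# Count records per BIN, largest first.
+bin_counts <- sort(table(with_bin$bin_uri), decreasing = TRUE)
+bin_counts
+```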
+
+You can optionally get back the data in `XML` format
+
+
+```r
+bold_specimens(taxon = 'Osmia', format = 'xml')
+```
+
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<bold_records xsi:noNamespaceSchemaLocation="http://www.boldsystems.org/schemas/BOLDPublic_record.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+ <record>
+ <record_id>1470124</record_id>
+ <processid>BOM1525-10</processid>
+ <bin_uri>BOLD:AAN3337</bin_uri>
+ <specimen_identifiers>
+ <sampleid>DHB 1011</sampleid>
+ <catalognum>DHB 1011</catalognum>
+ <fieldnum>DHB1011</fieldnum>
+ <institution_storing>Marjorie Barrick Museum</institution_storing>
+ </specimen_identifiers>
+ <taxonomy>
+```
+
+You can choose to get the `httr` response object back if you'd rather work with the raw data returned from the BOLD API.
+
+
+```r
+res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE)
+res$url
+#> [1] "http://www.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&specimen_download=xml"
+res$status_code
+#> [1] 200
+res$headers
+#> $date
+#> [1] "Mon, 28 Mar 2016 20:39:18 GMT"
+#>
+#> $server
+#> [1] "Apache/2.2.15 (Red Hat)"
+#>
+#> $`x-powered-by`
+#> [1] "PHP/5.3.15"
+#>
+#> $`content-disposition`
+#> [1] "attachment; filename=bold_data.xml"
+#>
+#> $connection
+#> [1] "close"
+#>
+#> $`transfer-encoding`
+#> [1] "chunked"
+#>
+#> $`content-type`
+#> [1] "application/x-download"
+#>
+#> attr(,"class")
+#> [1] "insensitive" "list"
+```
+
+### Search for specimen plus sequence data
+
+The combined specimen/sequence API returns both specimen and sequence data. Like the specimen API, it returns `tsv` format data by default, which is given back to you as a `data.frame`. Here, we set `sepfasta=TRUE` so that the sequences are taken out of the `data.frame` and returned separately as a list, which keeps the `data.frame` manageable.
+
+
+```r
+res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE)
+res$fasta[1:2]
+#> $`ASGCB261-13`
+#> [1] "AATTTTATATATAATTTTTGCTATATGATCAGGAATAATTGGTTCAGCAATAAGAATTATTATTCGAATAGAATTAAGAATTCCTGGTTCATGAATTTCAAATGATCAAACTTATAATTCTTTAGTTACTGCTCATGCTTTTTTAATAATTTTTTTCTTAGTTATACCATTCTTAATTGGGGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCATTTCCACGAATAAATAATATTAGATTTTGACTTTTACCTCCTTCTTTAATACTTTTATTATTAAGAAATTTTATAAATCCTAGTCCAGGAACTGGATGAACTGTTTATCCACCTTTATCTTCTCATTTATTTCATTCTTCTCCTTCAGTTGATATAGCTATTTTTTCTTTACATATTTCTGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAA [...]
+#>
+#> $`BCHYM1499-13`
+#> [1] "AATTCTTTACATAATTTTTGCTTTATGATCTGGAATAATTGGGTCAGCAATAAGAATTATTATTCGAATAGAATTAAGTATCCCAGGTTCATGAATTACTAATGATCAAATTTATAATTCTTTAGTAACTGCACATGCTTTTTTAATAATTTTTTTTCTTGTGATACCATTTTTAATTGGAGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATTCCAGATATAGCTTTCCCACGAATAAACAATATTAGATTTTGATTATTACCGCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCCCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAA [...]
+```
+
+Or you can index a specific sequence by its process ID:
+
+
+```r
+res$fasta['GBAH0293-06']
+#> $`GBAH0293-06`
+#> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATT [...]
+```
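+
+The leading `-` characters in sequences like the one above are alignment gaps. A minimal sketch of removing them and counting the remaining bases follows, using a made-up named list shaped like `res$fasta`. The names `fasta`, `ungapped` and `lens` are illustrative only.
+
+```r
+# Mock named list shaped like res$fasta (sequences are made up).
+fasta <- list(`A-01` = "------TTAATG--TTAGGG", `B-02` = "AATTCTTTAC")
+
+# Remove the gap character "-" from every sequence.
+ungapped <- lapply(fasta, function(s) gsub("-", "", s, fixed = TRUE))
+
+# Number of bases left per sequence, as a named integer vector.
+lens <- vapply(ungapped, nchar, integer(1))
+lens
+```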
+
+### Get trace files
+
+`bold_trace()` downloads trace files to your machine and prints where they were saved; it does not load them into your R session.
+
+
+```r
+bold_trace(taxon = 'Osmia', quiet = TRUE)
+```
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-cran-bold.git