[med-svn] [r-cran-qqman] 02/06: Imported Upstream version 0.1.2
Dylan Aïssi
bob.dybian-guest at moszumanska.debian.org
Thu Oct 27 20:51:00 UTC 2016
This is an automated email from the git hooks/post-receive script.
bob.dybian-guest pushed a commit to branch master
in repository r-cran-qqman.
commit d3986601178b6c3c88566fe0e288cee857d6cbc7
Author: Dylan Aïssi <bob.dybian at gmail.com>
Date: Thu Oct 27 22:10:26 2016 +0200
Imported Upstream version 0.1.2
---
DESCRIPTION | 15 ++
MD5 | 21 +++
NAMESPACE | 4 +
R/gwasResults-data.R | 5 +
R/manhattan.R | 192 ++++++++++++++++++++++++
R/package.R | 10 ++
R/qq.R | 53 +++++++
R/snpsOfInterest-data.R | 4 +
R/zzz.R | 8 +
README.md | 76 ++++++++++
build/vignette.rds | Bin 0 -> 211 bytes
data/gwasResults.RData | Bin 0 -> 133708 bytes
data/snpsOfInterest.RData | Bin 0 -> 284 bytes
inst/doc/qqman.R | 76 ++++++++++
inst/doc/qqman.Rmd | 133 +++++++++++++++++
inst/doc/qqman.html | 363 ++++++++++++++++++++++++++++++++++++++++++++++
man/gwasResults.Rd | 9 ++
man/manhattan.Rd | 60 ++++++++
man/qq.Rd | 25 ++++
man/qqman.Rd | 15 ++
man/snpsOfInterest.Rd | 9 ++
vignettes/qqman.Rmd | 133 +++++++++++++++++
22 files changed, 1211 insertions(+)
diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..dbcde6a
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,15 @@
+Package: qqman
+Title: Q-Q and manhattan plots for GWAS data
+Version: 0.1.2
+Author: Stephen Turner <vustephen at gmail.com>
+Maintainer: Stephen Turner <vustephen at gmail.com>
+Description: Q-Q and manhattan plots for GWAS data
+Depends: R (>= 3.0.0)
+Suggests: knitr
+License: GPL-3
+LazyData: true
+VignetteBuilder: knitr
+Packaged: 2014-09-25 16:54:14 UTC; sdt5z
+NeedsCompilation: no
+Repository: CRAN
+Date/Publication: 2014-09-25 22:29:57
diff --git a/MD5 b/MD5
new file mode 100644
index 0000000..4fb8e9e
--- /dev/null
+++ b/MD5
@@ -0,0 +1,21 @@
+878f0a46bac21b0a8714378777493948 *DESCRIPTION
+e8a3b67a1cfc5de2c87e2c05509ccb86 *NAMESPACE
+4703c59c0422edc311c09f34a79425c2 *R/gwasResults-data.R
+506bb6bd378ce87b44d1b452c47dac6a *R/manhattan.R
+6f2bc34e71e66aef115d0a9dff9c2c78 *R/package.R
+32fa6e2f67e18b0f33654b21de1624a6 *R/qq.R
+1d2a7fe82e8a5143d495b8f2f7800eca *R/snpsOfInterest-data.R
+4e7b578263eee2e20de8b161e4c08af4 *R/zzz.R
+7d28ad0c1131dbad19b657df61645cda *README.md
+86002107ab6d65a20ca6f0bbc7db8d34 *build/vignette.rds
+ff43bc38623980d5b64e423362075c10 *data/gwasResults.RData
+a10481df0b1192bc3f599e14414e0a55 *data/snpsOfInterest.RData
+9cb0830cb6ce29dbd98ebbb76d770cab *inst/doc/qqman.R
+47c06f8e3c96a96fca34f31fad80fd6d *inst/doc/qqman.Rmd
+c608b972e60a7ce5b88f47c7ae9055cb *inst/doc/qqman.html
+d1bb3943d85241abb665c6235c55825a *man/gwasResults.Rd
+516c1f05ddcecd781b1b1e01e16ad68d *man/manhattan.Rd
+7b64ebaab0b22df9b3c146614083689b *man/qq.Rd
+e7729862ac4cb3540ee65a21600538a4 *man/qqman.Rd
+efcb6525c2360e9fbf5acfa005b5cdff *man/snpsOfInterest.Rd
+47c06f8e3c96a96fca34f31fad80fd6d *vignettes/qqman.Rmd
diff --git a/NAMESPACE b/NAMESPACE
new file mode 100644
index 0000000..5d9f9a2
--- /dev/null
+++ b/NAMESPACE
@@ -0,0 +1,4 @@
+# Generated by roxygen2 (4.0.1): do not edit by hand
+
+export(manhattan)
+export(qq)
diff --git a/R/gwasResults-data.R b/R/gwasResults-data.R
new file mode 100644
index 0000000..b4648df
--- /dev/null
+++ b/R/gwasResults-data.R
@@ -0,0 +1,5 @@
+#' @name gwasResults
+#' @title Simulated GWAS results
+#' @description Simulated GWAS results as obtained from \code{plink --assoc}.
+#' @docType data
+NULL
\ No newline at end of file
diff --git a/R/manhattan.R b/R/manhattan.R
new file mode 100644
index 0000000..555e1c7
--- /dev/null
+++ b/R/manhattan.R
@@ -0,0 +1,192 @@
+#' Creates a manhattan plot
+#'
+#' Creates a manhattan plot from PLINK assoc output (or any data frame with
+#' chromosome, position, and p-value).
+#'
+#' @param x A data.frame with columns "BP," "CHR," "P," and optionally, "SNP."
+#' @param chr A string denoting the column name for the chromosome. Defaults to
+#' PLINK's "CHR." Said column must be numeric. If you have X, Y, or MT
+#' chromosomes, be sure to renumber these 23, 24, 25, etc.
+#' @param bp A string denoting the column name for the chromosomal position.
+#' Defaults to PLINK's "BP." Said column must be numeric.
+#' @param p A string denoting the column name for the p-value. Defaults to
+#' PLINK's "P." Said column must be numeric.
+#' @param snp A string denoting the column name for the SNP name (rs number).
+#' Defaults to PLINK's "SNP." Said column should be a character.
+#' @param col A character vector indicating which colors to alternate.
+#' @param chrlabs A character vector equal to the number of chromosomes
+#' specifying the chromosome labels (e.g., \code{c(1:22, "X", "Y", "MT")}).
+#' @param suggestiveline Where to draw a "suggestive" line. Default
+#' -log10(1e-5). Set to FALSE to disable.
+#' @param genomewideline Where to draw a "genome-wide sigificant" line. Default
+#' -log10(5e-8). Set to FALSE to disable.
+#' @param highlight A character vector of SNPs in your dataset to highlight.
+#' These SNPs should all be in your dataset.
+#' @param logp If TRUE, the -log10 of the p-value is plotted. It isn't very
+#' useful to plot raw p-values, but plotting the raw value could be useful for
+#' other genome-wide plots, for example, peak heights, bayes factors, test
+#' statistics, other "scores," etc.
+#' @param ... Arguments passed on to other plot/points functions
+#'
+#' @return A manhattan plot.
+#'
+#' @keywords visualization manhattan
+#'
+#' @examples
+#' manhattan(gwasResults)
+#'
+#' @export
+
+manhattan <- function(x, chr="CHR", bp="BP", p="P", snp="SNP",
+ col=c("gray10", "gray60"), chrlabs=NULL,
+ suggestiveline=-log10(1e-5), genomewideline=-log10(5e-8),
+ highlight=NULL, logp=TRUE, ...) {
+
+ # Not sure why, but package check will warn without this.
+ CHR=BP=P=index=NULL
+
+ # Check for sensible dataset
+ ## Make sure you have chr, bp and p columns.
+ if (!(chr %in% names(x))) stop(paste("Column", chr, "not found!"))
+ if (!(bp %in% names(x))) stop(paste("Column", bp, "not found!"))
+ if (!(p %in% names(x))) stop(paste("Column", p, "not found!"))
+ ## warn if you don't have a snp column
+ if (!(snp %in% names(x))) warning(paste("No SNP column found. OK unless you're trying to highlight."))
+ ## make sure chr, bp, and p columns are numeric.
+ if (!is.numeric(x[[chr]])) stop(paste(chr, "column should be numeric. Do you have 'X', 'Y', 'MT', etc? If so change to numbers and try again."))
+ if (!is.numeric(x[[bp]])) stop(paste(bp, "column should be numeric."))
+ if (!is.numeric(x[[p]])) stop(paste(p, "column should be numeric."))
+
+ # Create a new data.frame with columns called CHR, BP, and P.
+ d=data.frame(CHR=x[[chr]], BP=x[[bp]], P=x[[p]])
+
+ # If the input data frame has a SNP column, add it to the new data frame you're creating.
+ if (!is.null(x[[snp]])) d=transform(d, SNP=x[[snp]])
+
+ # Set positions, ticks, and labels for plotting
+ ## Sort and keep only values where is numeric.
+ #d <- subset(d[order(d$CHR, d$BP), ], (P>0 & P<=1 & is.numeric(P)))
+ d <- subset(d, (is.numeric(CHR) & is.numeric(BP) & is.numeric(P)))
+ d <- d[order(d$CHR, d$BP), ]
+ #d$logp <- ifelse(logp, yes=-log10(d$P), no=d$P)
+ if (logp) {
+ d$logp <- -log10(d$P)
+ } else {
+ d$logp <- d$P
+ }
+ d$pos=NA
+
+
+ # Fixes the bug where one chromosome is missing by adding a sequential index column.
+ d$index=NA
+ ind = 0
+ for (i in unique(d$CHR)){
+ ind = ind + 1
+ d[d$CHR==i,]$index = ind
+ }
+
+ # This section sets up positions and ticks. Ticks should be placed in the
+ # middle of a chromosome. The a new pos column is added that keeps a running
+ # sum of the positions of each successive chromsome. For example:
+ # chr bp pos
+ # 1 1 1
+ # 1 2 2
+ # 2 1 3
+ # 2 2 4
+ # 3 1 5
+ nchr = length(unique(d$CHR))
+ if (nchr==1) { ## For a single chromosome
+ options(scipen=999)
+ d$pos=d$BP/1e6
+ ticks=floor(length(d$pos))/2+1
+ xlabel = paste('Chromosome',unique(d$CHR),'position(Mb)')
+ labs = ticks
+ } else { ## For multiple chromosomes
+ lastbase=0
+ ticks=NULL
+ for (i in unique(d$index)) {
+ if (i==1) {
+ d[d$index==i, ]$pos=d[d$index==i, ]$BP
+ } else {
+ lastbase=lastbase+tail(subset(d,index==i-1)$BP, 1)
+ d[d$index==i, ]$pos=d[d$index==i, ]$BP+lastbase
+ }
+ # Old way: assumes SNPs evenly distributed
+ # ticks=c(ticks, d[d$index==i, ]$pos[floor(length(d[d$index==i, ]$pos)/2)+1])
+ # New way: doesn't make that assumption
+ ticks = c(ticks, (min(d[d$CHR == i,]$pos) + max(d[d$CHR == i,]$pos))/2 + 1)
+ }
+ xlabel = 'Chromosome'
+ #labs = append(unique(d$CHR),'') ## I forgot what this was here for... if seems to work, remove.
+ labs <- unique(d$CHR)
+ }
+
+ # Initialize plot
+ xmax = ceiling(max(d$pos) * 1.03)
+ xmin = floor(max(d$pos) * -0.03)
+
+ # The old way to initialize the plot
+ # plot(NULL, xaxt='n', bty='n', xaxs='i', yaxs='i', xlim=c(xmin,xmax), ylim=c(ymin,ymax),
+ # xlab=xlabel, ylab=expression(-log[10](italic(p))), las=1, pch=20, ...)
+
+
+ # The new way to initialize the plot.
+ ## See http://stackoverflow.com/q/23922130/654296
+ ## First, define your default arguments
+ def_args <- list(xaxt='n', bty='n', xaxs='i', yaxs='i', las=1, pch=20,
+ xlim=c(xmin,xmax), ylim=c(0,ceiling(max(d$logp))),
+ xlab=xlabel, ylab=expression(-log[10](italic(p))))
+ ## Next, get a list of ... arguments
+ #dotargs <- as.list(match.call())[-1L]
+ dotargs <- list(...)
+ ## And call the plot function passing NA, your ... arguments, and the default
+ ## arguments that were not defined in the ... arguments.
+ do.call("plot", c(NA, dotargs, def_args[!names(def_args) %in% names(dotargs)]))
+
+ # If manually specifying chromosome labels, ensure a character vector and number of labels matches number chrs.
+ if (!is.null(chrlabs)) {
+ if (is.character(chrlabs)) {
+ if (length(chrlabs)==length(labs)) {
+ labs <- chrlabs
+ } else {
+ warning("You're trying to specify chromosome labels but the number of labels != number of chromosomes.")
+ }
+ } else {
+ warning("If you're trying to specify chromosome labels, chrlabs must be a character vector")
+ }
+ }
+
+ # Add an axis.
+ if (nchr==1) { #If single chromosome, ticks and labels automatic.
+ axis(1, ...)
+ } else { # if multiple chrs, use the ticks and labels you created above.
+ axis(1, at=ticks, labels=labs, ...)
+ }
+
+ # Create a vector of alternatiting colors
+ col=rep(col, max(d$CHR))
+
+ # Add points to the plot
+ if (nchr==1) {
+ with(d, points(pos, logp, pch=20, col=col[1], ...))
+ } else {
+ # if multiple chromosomes, need to alternate colors and increase the color index (icol) each chr.
+ icol=1
+ for (i in unique(d$index)) {
+ with(d[d$index==unique(d$index)[i], ], points(pos, logp, col=col[icol], pch=20, ...))
+ icol=icol+1
+ }
+ }
+
+ # Add suggestive and genomewide lines
+ if (suggestiveline) abline(h=suggestiveline, col="blue")
+ if (genomewideline) abline(h=genomewideline, col="red")
+
+ # Highlight snps from a character vector
+ if (!is.null(highlight)) {
+ if (any(!(highlight %in% d$SNP))) warning("You're trying to highlight SNPs that don't exist in your results.")
+ d.highlight=d[which(d$SNP %in% highlight), ]
+ with(d.highlight, points(pos, logp, col="green3", pch=20, ...))
+ }
+
+}
diff --git a/R/package.R b/R/package.R
new file mode 100644
index 0000000..ea0e5c8
--- /dev/null
+++ b/R/package.R
@@ -0,0 +1,10 @@
+#' Create Q-Q and manhattan plots for GWAS data.
+#'
+#' A package for creating Q-Q and manhattan plots for GWAS data. See the package
+#' vignette for details: \cr\cr
+#' \code{vignette("qqman")}
+#'
+#' @name qqman
+#' @docType package
+#' @author Stephen Turner <\url{http://stephenturner.us}>
+NULL
\ No newline at end of file
diff --git a/R/qq.R b/R/qq.R
new file mode 100644
index 0000000..2b2d31f
--- /dev/null
+++ b/R/qq.R
@@ -0,0 +1,53 @@
+#' Creates a Q-Q plot
+#'
+#' Creates a quantile-quantile plot from p-values from a GWAS study.
+#'
+#' @param pvector A numeric vector of p-values.
+#' @param ... Other arguments passed to \code{plot()}
+#'
+#' @return A Q-Q plot.
+#'
+#' @keywords visualization qq qqplot
+#'
+#' @examples
+#' qq(gwasResults$P)
+#'
+#' @export
+
+qq = function(pvector, ...) {
+
+ # Check for sensible input
+ if (!is.numeric(pvector)) stop("Input must be numeric.")
+
+ # limit to not missing, not nan, not null, not infinite, between 0 and 1
+ pvector <- pvector[!is.na(pvector) & !is.nan(pvector) & !is.null(pvector) & is.finite(pvector) & pvector<1 & pvector>0]
+
+ # Observed and expected
+ o = -log10(sort(pvector,decreasing=FALSE))
+ e = -log10( ppoints(length(pvector) ))
+
+
+# # The old way
+# plot(e, o, pch=20,
+# xlab=expression(Expected~~-log[10](italic(p))),
+# ylab=expression(Observed~~-log[10](italic(p))),
+# ...)
+
+ # The new way to initialize the plot.
+ ## See http://stackoverflow.com/q/23922130/654296
+ ## First, define your default arguments
+ def_args <- list(pch=20, xlim=c(0, max(e)), ylim=c(0, max(o)),
+ xlab=expression(Expected~~-log[10](italic(p))),
+ ylab=expression(Observed~~-log[10](italic(p)))
+ )
+ ## Next, get a list of ... arguments
+ #dotargs <- as.list(match.call())[-1L]
+ dotargs <- list(...)
+ ## And call the plot function passing NA, your ... arguments, and the default
+ ## arguments that were not defined in the ... arguments.
+ tryCatch(do.call("plot", c(list(x=e, y=o), def_args[!names(def_args) %in% names(dotargs)], dotargs)), warn=stop)
+
+ # Add diagonal
+ abline(0,1,col="red")
+
+}
diff --git a/R/snpsOfInterest-data.R b/R/snpsOfInterest-data.R
new file mode 100644
index 0000000..20647a3
--- /dev/null
+++ b/R/snpsOfInterest-data.R
@@ -0,0 +1,4 @@
+#' @name snpsOfInterest
+#' @title snpsOfInterest
+#' @docType data
+NULL
\ No newline at end of file
diff --git a/R/zzz.R b/R/zzz.R
new file mode 100644
index 0000000..37d09ca
--- /dev/null
+++ b/R/zzz.R
@@ -0,0 +1,8 @@
+.onAttach <- function(...){
+ packageStartupMessage()
+ packageStartupMessage("For example usage please run: vignette('qqman')")
+ packageStartupMessage()
+ packageStartupMessage("Citation appreciated but not required:")
+ packageStartupMessage("Turner, S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. biorXiv DOI: 10.1101/005165 (2014).")
+ packageStartupMessage()
+}
\ No newline at end of file
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..b32c2bf
--- /dev/null
+++ b/README.md
@@ -0,0 +1,76 @@
+# qqman: An R package for creating Q-Q and manhattan plots from GWAS results.
+
+![qqman.gif](assets/qqman.gif)
+
+## Citation
+
+If you'd like to cite qqman (appreciated but not required), please cite the pre-print below:
+
+Turner, S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. [*biorXiv* DOI: 10.1101/005165](http://biorxiv.org/content/early/2014/05/14/005165).
+
+## Installation
+
+Install the stable release from CRAN:
+
+```coffee
+install.packages("qqman")
+```
+
+Or install directly from github using devtools
+
+```coffee
+library(devtools)
+install_github("stephenturner/qqman")
+```
+
+Or install the most recent development release with devtools (note, there be dragons here):
+
+```coffee
+library(devtools)
+install_github("stephenturner/qqman", ref="dev")
+```
+
+Load the package each time you use it:
+
+```coffee
+library(qqman)
+```
+
+## Usage
+
+See the [online package vignette](http://cran.r-project.org/web/packages/qqman/vignettes/qqman.html) for more examples:
+
+```coffee
+vignette("manhattan")
+```
+
+Take a look at the built-in data:
+
+```coffee
+head(gwasResults)
+```
+
+Basic manhattan plot using built-in data:
+
+```coffee
+manhattan(gwasResults)
+```
+
+Basic Q-Q plot using built-in data:
+
+```coffee
+qq(gwasResults$P)
+```
+
+Get help:
+
+```coffee
+?manhattan
+?qq
+```
+
+## Notes
+
+* This release is substantially simplified for the sake of maintainability and creating an R package. The old code that allows confidence intervals on the Q-Q plot and allows more flexible annotation and highlighting is still available at the [version 0.0.0 tag](https://github.com/stephenturner/qqman/tree/v0.0.0).
+* Special thanks to Dan Capurso and Tim Knutsen for useful contributions and bugfixes.
+* Thanks to all the blog commenters for pointing out bugs and other issues.
diff --git a/build/vignette.rds b/build/vignette.rds
new file mode 100644
index 0000000..58afde2
Binary files /dev/null and b/build/vignette.rds differ
diff --git a/data/gwasResults.RData b/data/gwasResults.RData
new file mode 100644
index 0000000..5709f84
Binary files /dev/null and b/data/gwasResults.RData differ
diff --git a/data/snpsOfInterest.RData b/data/snpsOfInterest.RData
new file mode 100644
index 0000000..6843ab5
Binary files /dev/null and b/data/snpsOfInterest.RData differ
diff --git a/inst/doc/qqman.R b/inst/doc/qqman.R
new file mode 100644
index 0000000..162b9b7
--- /dev/null
+++ b/inst/doc/qqman.R
@@ -0,0 +1,76 @@
+## ----, include=FALSE-----------------------------------------------------
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+
+## ----generatedata, eval=FALSE, echo=FALSE--------------------------------
+# # This code used to generate the test data. Runs slow, but does the job.
+# chrstats <- data.frame(chr=1:22, nsnps=1500)
+# chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+# chrstats
+#
+# d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)),
+# CHR=rep(NA, sum(chrstats$nsnps)),
+# BP=rep(NA, sum(chrstats$nsnps)),
+# P=rep(NA, sum(chrstats$nsnps)))
+# snpi <- 1
+# set.seed(42)
+# for (i in chrstats$chr) {
+# for (j in 1:chrstats[i, 2]) {
+# d[snpi, ]$SNP=paste0("rs", snpi)
+# d[snpi, ]$CHR=i
+# d[snpi, ]$BP=j
+# d[snpi, ]$P=runif(1)
+# snpi <- snpi+1
+# }
+# }
+#
+# divisor <- c(seq(2,50,2), seq(50,2,-2))
+# divisor <- divisor^4
+# length(divisor)
+# d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+# snpsOfInterest <- paste0("rs", 3001:3100)
+# qq(d$P)
+# manhattan(d, highlight=snpsOfInterest)
+# gwasResults <- d
+# save(gwasResults, file="data/gwasResults.RData")
+
+## ------------------------------------------------------------------------
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+
+## ------------------------------------------------------------------------
+as.data.frame(table(gwasResults$CHR))
+
+## ------------------------------------------------------------------------
+manhattan(gwasResults)
+
+## ------------------------------------------------------------------------
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+
+## ------------------------------------------------------------------------
+manhattan(subset(gwasResults, CHR==1))
+
+## ------------------------------------------------------------------------
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+
+## ------------------------------------------------------------------------
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+
+## ------------------------------------------------------------------------
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+
+## ------------------------------------------------------------------------
+qq(gwasResults$P)
+
+## ------------------------------------------------------------------------
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+ xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+
diff --git a/inst/doc/qqman.Rmd b/inst/doc/qqman.Rmd
new file mode 100644
index 0000000..c5fe6f8
--- /dev/null
+++ b/inst/doc/qqman.Rmd
@@ -0,0 +1,133 @@
+---
+title: "Intro to the qqman package"
+author: "Stephen D. Turner"
+date: '`r Sys.Date()`'
+output:
+ html_document:
+ toc: yes
+---
+
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+```{r, include=FALSE}
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+```
+
+# Intro to the **qqman** package
+
+```{r generatedata, eval=FALSE, echo=FALSE}
+# This code used to generate the test data. Runs slow, but does the job.
+chrstats <- data.frame(chr=1:22, nsnps=1500)
+chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+chrstats
+
+d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)),
+ CHR=rep(NA, sum(chrstats$nsnps)),
+ BP=rep(NA, sum(chrstats$nsnps)),
+ P=rep(NA, sum(chrstats$nsnps)))
+snpi <- 1
+set.seed(42)
+for (i in chrstats$chr) {
+ for (j in 1:chrstats[i, 2]) {
+ d[snpi, ]$SNP=paste0("rs", snpi)
+ d[snpi, ]$CHR=i
+ d[snpi, ]$BP=j
+ d[snpi, ]$P=runif(1)
+ snpi <- snpi+1
+ }
+}
+
+divisor <- c(seq(2,50,2), seq(50,2,-2))
+divisor <- divisor^4
+length(divisor)
+d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+snpsOfInterest <- paste0("rs", 3001:3100)
+qq(d$P)
+manhattan(d, highlight=snpsOfInterest)
+gwasResults <- d
+save(gwasResults, file="data/gwasResults.RData")
+```
+
+The **qqman** package includes functions for creating manhattan plots and q-q plots from GWAS results. The `gwasResults` data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:
+
+```{r}
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+```
+
+How many SNPs on each chromosome?
+
+```{r}
+as.data.frame(table(gwasResults$CHR))
+```
+
+## Creating manhattan plots
+
+Now, let's make a basic manhattan plot.
+
+```{r}
+manhattan(gwasResults)
+```
+
+We can also pass in other graphical parameters. Let's add a title (`main=`), increase the y-axis limit (`ylim=`), reduce the point size to 60% (`cex=`), and reduce the font size of the axis labels to 90% (`cex.axis=`). While we're at it, let's change the colors (`col=`), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:
+
+```{r}
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+```
+
+Now, let's look at a single chromosome:
+
+```{r}
+manhattan(subset(gwasResults, CHR==1))
+```
+
+Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called `snpsOfInterest`. You'll get a warning if you try to highlight SNPs that don't exist.
+
+```{r}
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+```
+
+We can combine highlighting and limiting to a single chromosome, and use the `xlim` graphical parameter to zoom in on a region of interest (between position 200-500):
+
+```{r}
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+```
+
+Finally, the `manhattan` function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the `p=` argument the name of the column we want to plot instead of the default "P" column. In this example, let's create a test statistic ("zscore"), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive lines because these are only meaningful if you're plotting -lo [...]
+
+```{r}
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+```
+
+A few notes on creating manhattan plots:
+
+* Run `str(gwasResults)`. Notice that the `gwasResults` data.frame has SNP, chromosome, position, and p-value columns named `SNP`, `CHR`, `BP`, and `P`. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the `chr=`, `bp=`, `p=`, and `snp=` arguments. See `help(manhattan)` for details.
+* The chromosome column must be numeric. If you have "X," "Y," or "MT" chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., `fix(manhattan)`) to change the line designating the axis tick labels (`labs <- unique(d$CHR)`) to set this to whatever you'd like it to be.
+* If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for `col="blue"`, `col="red"`, or `col="green3"` to modify the suggestive line, genomewide line, and highlight colors, respectively.
+
+## Creating Q-Q plots
+
+Creating Q-Q plots is straightforward - simply supply a vector of p-values to the `qq()` function.
+
+```{r}
+qq(gwasResults$P)
+```
+
+We can otionally supply many other graphical parameters.
+
+```{r}
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+ xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+```
\ No newline at end of file
diff --git a/inst/doc/qqman.html b/inst/doc/qqman.html
new file mode 100644
index 0000000..43dbc04
--- /dev/null
+++ b/inst/doc/qqman.html
@@ -0,0 +1,363 @@
+<!DOCTYPE html>
+<html>
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
+
+<title>Intro to the <strong>qqman</strong> package</title>
+
+<!-- Styles for R syntax highlighter -->
+<style type="text/css">
+ pre .operator,
+ pre .paren {
+ color: rgb(104, 118, 135)
+ }
+
+ pre .literal {
+ color: #990073
+ }
+
+ pre .number {
+ color: #099;
+ }
+
+ pre .comment {
+ color: #998;
+ font-style: italic
+ }
+
+ pre .keyword {
+ color: #900;
+ font-weight: bold
+ }
+
+ pre .identifier {
+ color: rgb(0, 0, 0);
+ }
+
+ pre .string {
+ color: #d14;
+ }
+</style>
+
+<!-- R syntax highlighter -->
+<script type="text/javascript">
+var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/</gm,"<")}function f(r,q,p){return RegExp(q,"m"+(r.cI?"i":"")+(p?"g":""))}function b(r){for(var p=0;p<r.childNodes.length;p++){var q=r.childNodes[p];if(q.nodeName=="CODE"){return q}if(!(q.nodeType==3&&q.nodeValue.match(/\s+/))){break}}}function h(t,s){var p="";for(var r=0;r<t.childNodes.length;r++){if(t.childNodes[r].nodeType==3){var q=t.childNodes[r].nodeValue;if(s){q=q.replace(/\n/g,"")}p+=q}else{if(t.chi [...]
+hljs.initHighlightingOnLoad();
+</script>
+
+<!-- MathJax scripts -->
+<script type="text/javascript" src="https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+</script>
+
+
+<style type="text/css">
+body, td {
+ font-family: sans-serif;
+ background-color: white;
+ font-size: 13px;
+}
+
+body {
+ max-width: 800px;
+ margin: auto;
+ padding: 1em;
+ line-height: 20px;
+}
+
+tt, code, pre {
+ font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
+}
+
+h1 {
+ font-size:2.2em;
+}
+
+h2 {
+ font-size:1.8em;
+}
+
+h3 {
+ font-size:1.4em;
+}
+
+h4 {
+ font-size:1.0em;
+}
+
+h5 {
+ font-size:0.9em;
+}
+
+h6 {
+ font-size:0.8em;
+}
+
+a:visited {
+ color: rgb(50%, 0%, 50%);
+}
+
+pre, img {
+ max-width: 100%;
+}
+
+pre code {
+ display: block; padding: 0.5em;
+}
+
+code {
+ font-size: 92%;
+ border: 1px solid #ccc;
+}
+
+code[class] {
+ background-color: #F8F8F8;
+}
+
+table, td, th {
+ border: none;
+}
+
+blockquote {
+ color:#666666;
+ margin:0;
+ padding-left: 1em;
+ border-left: 0.5em #EEE solid;
+}
+
+hr {
+ height: 0px;
+ border-bottom: none;
+ border-top-width: thin;
+ border-top-style: dotted;
+ border-top-color: #999999;
+}
+
+ at media print {
+ * {
+ background: transparent !important;
+ color: black !important;
+ filter:none !important;
+ -ms-filter: none !important;
+ }
+
+ body {
+ font-size:12pt;
+ max-width:100%;
+ }
+
+ a, a:visited {
+ text-decoration: underline;
+ }
+
+ hr {
+ visibility: hidden;
+ page-break-before: always;
+ }
+
+ pre, blockquote {
+ padding-right: 1em;
+ page-break-inside: avoid;
+ }
+
+ tr, img {
+ page-break-inside: avoid;
+ }
+
+ img {
+ max-width: 100% !important;
+ }
+
+ @page :left {
+ margin: 15mm 20mm 15mm 10mm;
+ }
+
+ @page :right {
+ margin: 15mm 10mm 15mm 20mm;
+ }
+
+ p, h2, h3 {
+ orphans: 3; widows: 3;
+ }
+
+ h2, h3 {
+ page-break-after: avoid;
+ }
+}
+</style>
+
+
+
+</head>
+
+<body>
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+<h1>Intro to the <strong>qqman</strong> package</h1>
+
+<p>The <strong>qqman</strong> package includes functions for creating manhattan plots and q-q plots from GWAS results. The <code>gwasResults</code> data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:</p>
+
+<pre><code class="r">str(gwasResults)
+</code></pre>
+
+<pre><code>'data.frame': 16470 obs. of 5 variables:
+ $ SNP : chr "rs1" "rs2" "rs3" "rs4" ...
+ $ CHR : int 1 1 1 1 1 1 1 1 1 1 ...
+ $ BP : int 1 2 3 4 5 6 7 8 9 10 ...
+ $ P : num 0.915 0.937 0.286 0.83 0.642 ...
+ $ zscore: num 0.107 0.0789 1.0666 0.2141 0.4653 ...
+</code></pre>
+
+<pre><code class="r">head(gwasResults)
+</code></pre>
+
+<pre><code> SNP CHR BP P zscore
+1 rs1 1 1 0.9148 0.10698
+2 rs2 1 2 0.9371 0.07895
+3 rs3 1 3 0.2861 1.06663
+4 rs4 1 4 0.8304 0.21413
+5 rs5 1 5 0.6417 0.46526
+6 rs6 1 6 0.5191 0.64474
+</code></pre>
+
+<pre><code class="r">tail(gwasResults)
+</code></pre>
+
+<pre><code> SNP CHR BP P zscore
+16465 rs16465 22 530 0.5644 0.5764
+16466 rs16466 22 531 0.1383 1.4822
+16467 rs16467 22 532 0.3937 0.8529
+16468 rs16468 22 533 0.1779 1.3473
+16469 rs16469 22 534 0.2393 1.1767
+16470 rs16470 22 535 0.2630 1.1192
+</code></pre>
+
+<p>How many SNPs on each chromosome?</p>
+
+<pre><code class="r">as.data.frame(table(gwasResults$CHR))
+</code></pre>
+
+<pre><code> Var1 Freq
+1 1 1500
+2 2 1191
+3 3 1040
+4 4 945
+5 5 877
+6 6 825
+7 7 784
+8 8 750
+9 9 721
+10 10 696
+11 11 674
+12 12 655
+13 13 638
+14 14 622
+15 15 608
+16 16 595
+17 17 583
+18 18 572
+19 19 562
+20 20 553
+21 21 544
+22 22 535
+</code></pre>
+
+<h2>Creating manhattan plots</h2>
+
+<p>Now, let's make a basic manhattan plot. </p>
+
+<pre><code class="r">manhattan(gwasResults)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can also pass in other graphical parameters. Let's add a title (<code>main=</code>), increase the y-axis limit (<code>ylim=</code>), reduce the point size to 60% (<code>cex=</code>), and reduce the font size of the axis labels to 90% (<code>cex.axis=</code>). While we're at it, let's change the colors (<code>col=</code>), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:</p>
+
+<pre><code class="r">manhattan(gwasResults, main = "Manhattan Plot", ylim = c(0, 10), cex = 0.6,
+ cex.axis = 0.9, col = c("blue4", "orange3"), suggestiveline = F, genomewideline = F,
+ chrlabs = c(1:20, "P", "Q"))
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Now, let's look at a single chromosome:</p>
+
+<pre><code class="r">manhattan(subset(gwasResults, CHR == 1))
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called <code>snpsOfInterest</code>. You'll get a warning if you try to highlight SNPs that don't exist.</p>
+
+<pre><code class="r">str(snpsOfInterest)
+</code></pre>
+
+<pre><code> chr [1:100] "rs3001" "rs3002" "rs3003" "rs3004" "rs3005" ...
+</code></pre>
+
+<pre><code class="r">manhattan(gwasResults, highlight = snpsOfInterest)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can combine highlighting and limiting to a single chromosome, and use the <code>xlim</code> graphical parameter to zoom in on a region of interest (between position 200-500):</p>
+
+<pre><code class="r">manhattan(subset(gwasResults, CHR == 3), highlight = snpsOfInterest, xlim = c(200,
+ 500), main = "Chr 3")
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Finally, the <code>manhattan</code> function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the <code>p=</code> argument the name of the column we want to plot instead of the default “P” column. In this example, let's create a test statistic (“zscore”), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive l [...]
+
+<pre><code class="r"># Add test statistics
+gwasResults <- transform(gwasResults, zscore = qnorm(P/2, lower.tail = FALSE))
+head(gwasResults)
+</code></pre>
+
+<pre><code> SNP CHR BP P zscore
+1 rs1 1 1 0.9148 0.10698
+2 rs2 1 2 0.9371 0.07895
+3 rs3 1 3 0.2861 1.06663
+4 rs4 1 4 0.8304 0.21413
+5 rs5 1 5 0.6417 0.46526
+6 rs6 1 6 0.5191 0.64474
+</code></pre>
+
+<pre><code class="r"># Make the new plot
+manhattan(gwasResults, p = "zscore", logp = FALSE, ylab = "Z-score", genomewideline = FALSE,
+ suggestiveline = FALSE, main = "Manhattan plot of Z-scores")
+</code></pre>
+
+<p><img src=" [...]
+
+<p>A few notes on creating manhattan plots:</p>
+
+<ul>
+<li>Run <code>str(gwasResults)</code>. Notice that the <code>gwasResults</code> data.frame has SNP, chromosome, position, and p-value columns named <code>SNP</code>, <code>CHR</code>, <code>BP</code>, and <code>P</code>. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the <code>chr=</code>, <code>bp=</code>, <code>p=</code>, and <code>snp=</code> arguments. See <code>help(manhattan)</code> for details.</li>
+<li>The chromosome column must be numeric. If you have “X,” “Y,” or “MT” chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., <code>fix(manhattan)</code>) to change the line designating the axis tick labels (<code>labs <- unique(d$CHR)</code>) to set this to whatever you'd like it to be.</li>
+<li>If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for <code>col="blue"</code>, <code>col="red"</code>, or <code>col="green3"</code> to modify the suggestive line, genomewide line, and highlight colors, respectively.</li>
+</ul>
+
+<h2>Creating Q-Q plots</h2>
+
+<p>Creating Q-Q plots is straightforward - simply supply a vector of p-values to the <code>qq()</code> function. </p>
+
+<pre><code class="r">qq(gwasResults$P)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can otionally supply many other graphical parameters.</p>
+
+<pre><code class="r">qq(gwasResults$P, main = "Q-Q plot of GWAS p-values", xlim = c(0, 7), ylim = c(0,
+ 12), pch = 18, col = "blue4", cex = 1.5, las = 1)
+</code></pre>
+
+<p><img src=" [...]
+
+</body>
+
+</html>
diff --git a/man/gwasResults.Rd b/man/gwasResults.Rd
new file mode 100644
index 0000000..ee31dd0
--- /dev/null
+++ b/man/gwasResults.Rd
@@ -0,0 +1,9 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{data}
+\name{gwasResults}
+\alias{gwasResults}
+\title{Simulated GWAS results}
+\description{
+Simulated GWAS results as obtained from \code{plink --assoc}.
+}
+
diff --git a/man/manhattan.Rd b/man/manhattan.Rd
new file mode 100644
index 0000000..df48072
--- /dev/null
+++ b/man/manhattan.Rd
@@ -0,0 +1,60 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\name{manhattan}
+\alias{manhattan}
+\title{Creates a manhattan plot}
+\usage{
+manhattan(x, chr = "CHR", bp = "BP", p = "P", snp = "SNP",
+ col = c("gray10", "gray60"), chrlabs = NULL,
+ suggestiveline = -log10(1e-05), genomewideline = -log10(5e-08),
+ highlight = NULL, logp = TRUE, ...)
+}
+\arguments{
+\item{x}{A data.frame with columns "BP," "CHR," "P," and optionally, "SNP."}
+
+\item{chr}{A string denoting the column name for the chromosome. Defaults to
+PLINK's "CHR." Said column must be numeric. If you have X, Y, or MT
+chromosomes, be sure to renumber these 23, 24, 25, etc.}
+
+\item{bp}{A string denoting the column name for the chromosomal position.
+Defaults to PLINK's "BP." Said column must be numeric.}
+
+\item{p}{A string denoting the column name for the p-value. Defaults to
+PLINK's "P." Said column must be numeric.}
+
+\item{snp}{A string denoting the column name for the SNP name (rs number).
+Defaults to PLINK's "SNP." Said column should be a character.}
+
+\item{col}{A character vector indicating which colors to alternate.}
+
+\item{chrlabs}{A character vector equal to the number of chromosomes
+specifying the chromosome labels (e.g., \code{c(1:22, "X", "Y", "MT")}).}
+
+\item{suggestiveline}{Where to draw a "suggestive" line. Default
+-log10(1e-5). Set to FALSE to disable.}
+
+\item{genomewideline}{Where to draw a "genome-wide sigificant" line. Default
+-log10(5e-8). Set to FALSE to disable.}
+
+\item{highlight}{A character vector of SNPs in your dataset to highlight.
+These SNPs should all be in your dataset.}
+
+\item{logp}{If TRUE, the -log10 of the p-value is plotted. It isn't very
+useful to plot raw p-values, but plotting the raw value could be useful for
+other genome-wide plots, for example, peak heights, bayes factors, test
+statistics, other "scores," etc.}
+
+\item{...}{Arguments passed on to other plot/points functions}
+}
+\value{
+A manhattan plot.
+}
+\description{
+Creates a manhattan plot from PLINK assoc output (or any data frame with
+chromosome, position, and p-value).
+}
+\examples{
+manhattan(gwasResults)
+}
+\keyword{manhattan}
+\keyword{visualization}
+
diff --git a/man/qq.Rd b/man/qq.Rd
new file mode 100644
index 0000000..f97730e
--- /dev/null
+++ b/man/qq.Rd
@@ -0,0 +1,25 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\name{qq}
+\alias{qq}
+\title{Creates a Q-Q plot}
+\usage{
+qq(pvector, ...)
+}
+\arguments{
+\item{pvector}{A numeric vector of p-values.}
+
+\item{...}{Other arguments passed to \code{plot()}}
+}
+\value{
+A Q-Q plot.
+}
+\description{
+Creates a quantile-quantile plot from p-values from a GWAS study.
+}
+\examples{
+qq(gwasResults$P)
+}
+\keyword{qq}
+\keyword{qqplot}
+\keyword{visualization}
+
diff --git a/man/qqman.Rd b/man/qqman.Rd
new file mode 100644
index 0000000..30ce912
--- /dev/null
+++ b/man/qqman.Rd
@@ -0,0 +1,15 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{package}
+\name{qqman}
+\alias{qqman}
+\alias{qqman-package}
+\title{Create Q-Q and manhattan plots for GWAS data.}
+\description{
+A package for creating Q-Q and manhattan plots for GWAS data. See the package
+vignette for details: \cr\cr
+\code{vignette("qqman")}
+}
+\author{
+Stephen Turner <\url{http://stephenturner.us}>
+}
+
diff --git a/man/snpsOfInterest.Rd b/man/snpsOfInterest.Rd
new file mode 100644
index 0000000..411b532
--- /dev/null
+++ b/man/snpsOfInterest.Rd
@@ -0,0 +1,9 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{data}
+\name{snpsOfInterest}
+\alias{snpsOfInterest}
+\title{snpsOfInterest}
+\description{
+snpsOfInterest
+}
+
diff --git a/vignettes/qqman.Rmd b/vignettes/qqman.Rmd
new file mode 100644
index 0000000..c5fe6f8
--- /dev/null
+++ b/vignettes/qqman.Rmd
@@ -0,0 +1,133 @@
+---
+title: "Intro to the qqman package"
+author: "Stephen D. Turner"
+date: '`r Sys.Date()`'
+output:
+ html_document:
+ toc: yes
+---
+
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+```{r, include=FALSE}
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+```
+
+# Intro to the **qqman** package
+
+```{r generatedata, eval=FALSE, echo=FALSE}
+# This code used to generate the test data. Runs slow, but does the job.
+chrstats <- data.frame(chr=1:22, nsnps=1500)
+chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+chrstats
+
+d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)),
+ CHR=rep(NA, sum(chrstats$nsnps)),
+ BP=rep(NA, sum(chrstats$nsnps)),
+ P=rep(NA, sum(chrstats$nsnps)))
+snpi <- 1
+set.seed(42)
+for (i in chrstats$chr) {
+ for (j in 1:chrstats[i, 2]) {
+ d[snpi, ]$SNP=paste0("rs", snpi)
+ d[snpi, ]$CHR=i
+ d[snpi, ]$BP=j
+ d[snpi, ]$P=runif(1)
+ snpi <- snpi+1
+ }
+}
+
+divisor <- c(seq(2,50,2), seq(50,2,-2))
+divisor <- divisor^4
+length(divisor)
+d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+snpsOfInterest <- paste0("rs", 3001:3100)
+qq(d$P)
+manhattan(d, highlight=snpsOfInterest)
+gwasResults <- d
+save(gwasResults, file="data/gwasResults.RData")
+```
+
+The **qqman** package includes functions for creating manhattan plots and q-q plots from GWAS results. The `gwasResults` data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:
+
+```{r}
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+```
+
+How many SNPs on each chromosome?
+
+```{r}
+as.data.frame(table(gwasResults$CHR))
+```
+
+## Creating manhattan plots
+
+Now, let's make a basic manhattan plot.
+
+```{r}
+manhattan(gwasResults)
+```
+
+We can also pass in other graphical parameters. Let's add a title (`main=`), increase the y-axis limit (`ylim=`), reduce the point size to 60% (`cex=`), and reduce the font size of the axis labels to 90% (`cex.axis=`). While we're at it, let's change the colors (`col=`), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:
+
+```{r}
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+```
+
+Now, let's look at a single chromosome:
+
+```{r}
+manhattan(subset(gwasResults, CHR==1))
+```
+
+Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called `snpsOfInterest`. You'll get a warning if you try to highlight SNPs that don't exist.
+
+```{r}
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+```
+
+We can combine highlighting and limiting to a single chromosome, and use the `xlim` graphical parameter to zoom in on a region of interest (between position 200-500):
+
+```{r}
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+```
+
+Finally, the `manhattan` function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the `p=` argument the name of the column we want to plot instead of the default "P" column. In this example, let's create a test statistic ("zscore"), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive lines because these are only meaningful if you're plotting -lo [...]
+
+```{r}
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+```
+
+A few notes on creating manhattan plots:
+
+* Run `str(gwasResults)`. Notice that the `gwasResults` data.frame has SNP, chromosome, position, and p-value columns named `SNP`, `CHR`, `BP`, and `P`. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the `chr=`, `bp=`, `p=`, and `snp=` arguments. See `help(manhattan)` for details.
+* The chromosome column must be numeric. If you have "X," "Y," or "MT" chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., `fix(manhattan)`) to change the line designating the axis tick labels (`labs <- unique(d$CHR)`) to set this to whatever you'd like it to be.
+* If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for `col="blue"`, `col="red"`, or `col="green3"` to modify the suggestive line, genomewide line, and highlight colors, respectively.
+
+## Creating Q-Q plots
+
+Creating Q-Q plots is straightforward - simply supply a vector of p-values to the `qq()` function.
+
+```{r}
+qq(gwasResults$P)
+```
+
+We can otionally supply many other graphical parameters.
+
+```{r}
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+ xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+```
\ No newline at end of file
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-cran-qqman.git
More information about the debian-med-commit
mailing list