AMPtk is a series of scripts for processing NGS amplicon data using USEARCH and VSEARCH. It can be used to process any NGS amplicon data and includes databases set up for analysis of fungal ITS, fungal LSU, bacterial 16S, and insect COI amplicons. It can handle Ion Torrent, MiSeq, and 454 data. As of AMPtk v0.7.0, at least USEARCH v9.1.13 and VSEARCH v2.2.0 are required.
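The minimum versions quoted above can be checked with a small shell helper. This is only a sketch: `version_ge` is a hypothetical helper name, it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS), and the "installed" versions are placeholder values to replace with whatever your binaries report.

```shell
# Return success (0) if dotted version $1 is >= dotted version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    # After a version sort, the smaller of the two strings comes first;
    # $1 >= $2 exactly when that first string is $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder values: substitute the versions your binaries report.
usearch_installed="9.2.64"
vsearch_installed="2.2.0"

if version_ge "$usearch_installed" "9.1.13"; then
    echo "USEARCH version is sufficient"
fi
if version_ge "$vsearch_installed" "2.2.0"; then
    echo "VSEARCH version is sufficient"
fi
```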
Palmer JM, Jusino MA, Banik MT, Lindner DL. 2018. Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data. PeerJ 6:e4925; DOI 10.7717/peerj.4925. https://peerj.com/articles/4925/
There are several ways to install AMPtk. The easiest and recommended way is with Conda:

#setup your conda env with bioconda, type the following in order to setup channels
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda

#create amptk env (optional)
conda create -n amptk amptk

#UPDATE 2/9/2019: Conda solver seems to hang, if taking forever try this
conda create -n amptk
conda install -n amptk bioconductor-dada2 bioconductor-phyloseq biom-format \
    biopython matplotlib natsort numpy pandas pigz psutil python-edlib r-base r-dt \
    r-htmltools r-plotly seaborn vsearch
conda activate amptk
conda install amptk
You can install the Python portion of AMPtk with pip, but you will then need to install the external dependencies (such as USEARCH, VSEARCH, and DADA2) yourself, and the amptk stats script will need to install its R dependencies.
pip install amptk
You can also install manually by downloading a release, or build the latest unreleased version from GitHub:

#clone the repository
git clone https://github.com/nextgenusfs/amptk.git

#then install from inside the cloned directory, optionally add --prefix to control location
cd amptk
python setup.py install --prefix /User/Tools/amptk
Dependencies Requiring Manual Install
- AMPtk uses USEARCH9, which must be downloaded and installed manually from the developer. Obtain USEARCH v9.2.64 and softlink it into the PATH:

#make executable
sudo chmod +x /path/to/usearch9.2.64_i86osx32

#create softlink
sudo ln -s /path/to/usearch9.2.64_i86osx32 /usr/local/bin/usearch9
1b) (optional) One script also requires USEARCH10, so you can download usearch10 and put it into your PATH as follows:

#make executable
sudo chmod +x /path/to/usearch10.0.240_i86osx32

#create softlink
sudo ln -s /path/to/usearch10.0.240_i86osx32 /usr/local/bin/usearch10
- (optional) LULU post-clustering OTU table filtering via amptk lulu requires the R package LULU. Installing it requires devtools.

#install devtools if you don't have it already
install.packages('devtools')
library('devtools')
install_github("tobiasgf/lulu")

#not listed as dependency but on my system also requires dplyr
install.packages('dplyr')

#or perhaps all of tidyverse
install.packages('tidyverse')

#could also install tidyverse from conda
conda install r-tidyverse
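After creating the softlinks, it can help to confirm that the shell actually finds them. The following is a minimal sketch, where `report_missing` is a hypothetical helper name and `usearch9` and `usearch10` are the softlink names used in the commands above:

```shell
# Print each named executable that cannot be found on the PATH.
# Returns 1 if anything is missing, 0 otherwise.
report_missing() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two softlinks; the message only appears if one is not found.
report_missing usearch9 usearch10 || echo "check the softlinks before running AMPtk"
```

The same function can be reused for any other executable that AMPtk expects to find, such as vsearch.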
Dependencies installed via package managers
You only need to worry about these dependencies if you installed manually. Some are also needed if you used homebrew for installation (for example, homebrew doesn't install R packages).
- AMPtk requires VSEARCH. Note that the homebrew recipe installs it automatically; alternatively, you can install it with conda.

#install vsearch with homebrew
brew install vsearch

#or with bioconda
conda install -c bioconda vsearch
- Several Python modules are also required. They can be installed with pip or conda:

#install with pip
pip install -U biopython natsort pandas numpy matplotlib seaborn edlib biom-format psutil

#install with conda
conda install biopython natsort pandas numpy matplotlib seaborn python-edlib biom-format psutil
- (optional) The DADA2 denoising algorithm requires installation of R and DADA2. Instructions are provided in the DADA2 documentation; it can also be installed with conda:

#install with conda/bioconda
conda install r-base bioconductor-dada2
- (optional) To run some preliminary community ecology stats via amptk stats you will also need the R package Phyloseq. One way to install it is with conda:

#install with conda/bioconda
conda install r-base bioconductor-phyloseq
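To see which of the Python modules listed above are still missing, a short loop over their import names can be used. This is a sketch under a few assumptions: `missing_modules` is a hypothetical helper name, the interpreter defaults to `python3` (override it with the `PYTHON` variable), and the import names differ from some package names (biopython imports as `Bio`, biom-format as `biom`).

```shell
# Print the name of every Python module that fails to import.
# Set PYTHON to use a specific interpreter (defaults to python3).
missing_modules() {
    for mod in "$@"; do
        "${PYTHON:-python3}" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

# Import names for the modules installed with pip or conda above.
missing_modules Bio natsort pandas numpy matplotlib seaborn edlib biom psutil
```

An empty result means every module imported successfully. If the interpreter itself cannot be found, every module is reported as missing.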
Run from Docker
There is a base installation of AMPtk on Docker at nextgenusfs/amptk-base. Because usearch9 and usearch10 are required but must be personally licensed, here are the directions to get a working AMPtk docker image.
- Download the Dockerfile build file.
- Download usearch9.2.64 and usearch10.0.240 for Linux (32-bit version) from the developer.
- Build AMPtk docker image
docker build -t amptk -f Dockerfile .
- You can now launch the docker image like so (make sure files you need are in current directory)
docker run -it --rm -v $PWD:/work amptk /bin/bash
- AMPtk Overview - an overview of the steps in AMPtk.
- AMPtk Quick Start - walkthrough of test data.
- AMPtk Pre-Processing - details of the critical pre-processing steps.
- AMPtk Clustering - overview of clustering/denoising algorithms in AMPtk
- AMPtk OTU Table Filtering - OTU table filtering based on Mock community
- AMPtk Taxonomy - assigning taxonomy in AMPtk
- AMPtk all commands - all commands in AMPtk