Monday, October 17, 2011

Denoising pyrosequencing reads

Tools to denoise pyrosequencing reads.

1. Denoiser
- Software for rapidly denoising pyrosequencing amplicon reads by exploiting rank-abundance distributions.
- It is an alternative to the PyroNoise software suite.
- It is included in the new release of Qiime 1.3.

$ wget
$ tar zxvf Denoiser_0.91.tgz
$ cd Denoiser_0.91


Python: >= 2.5.1
PyCogent toolkit: >= 1.4
ghc: >= 6.8 (recommended for install)

Install pre-requisites
$ sudo apt-get install ghc

Install Denoiser

$ cd FlowgramAlignment
$ make
ghc --make -O2 FlowgramAli_4frame
[1 of 3] Compiling ADPCombinators   ( ADPCombinators.lhs, ADPCombinators.o )
[2 of 3] Compiling FlowgramUtils    ( FlowgramUtils.lhs, FlowgramUtils.o )
[3 of 3] Compiling Main             ( FlowgramAli_4frame.lhs, FlowgramAli_4frame.o )
Linking FlowgramAli_4frame ...
$ make install
cp FlowgramAli_4frame ../bin/

Provide the path to Denoiser in ~/.bashrc file
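
As a minimal sketch of that step, the helper below appends an export line to a shell startup file only if the identical line is not already there, so rerunning it does not pile up duplicates. The helper name add_line_once, the install location $HOME/Denoiser_0.91 and the choice of PATH (rather than, say, PYTHONPATH) are assumptions; check the Denoiser README for the variable it actually expects.

```shell
# add_line_once FILE LINE
# Append LINE to FILE unless an identical line is already there.
add_line_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example (hypothetical install location, not run automatically):
# add_line_once ~/.bashrc 'export PATH="$PATH:$HOME/Denoiser_0.91/bin"'
```

Run "source ~/.bashrc" afterwards so the current shell picks up the change.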


Define variable in Denoiser/

PROJECT_HOME = home + "/path/to/Denoiser_0.91/"

Follow the mini-tutorial in the README to denoise the sequences.
To get the fasta file, quality file and sff text file from the sff file:

$ sffinfo  454Reads.sff > 454Reads.sff.txt
$ sffinfo -s 454Reads.sff > 454Reads.fasta
$ sffinfo -q 454Reads.sff > 454Reads.qual
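
A quick sanity check after the extraction, sketched here with two hypothetical helper functions, is to confirm that the fasta and quality files contain the same number of records. It only counts header lines (those starting with ">") and does not validate the contents.

```shell
# count_records FILE
# Print the number of records, i.e. lines starting with ">".
count_records() {
    grep -c '^>' "$1"
}

# same_record_count FASTA QUAL
# Succeed only when both files hold the same number of records.
same_record_count() {
    [ "$(count_records "$1")" = "$(count_records "$2")" ]
}

# Example:
# same_record_count 454Reads.fasta 454Reads.qual && echo "counts match"
```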

Quality Filtering and barcode assignment

$ denoiser/ -f 454Reads.fasta -q 454Reads.qual -m barcode_to_sample_mapping.txt -w 50 -r -l 150 -L 350
-f fasta file
-q quality file
-m barcode mapping file
-w enable sliding window test for quality scores
-r remove unassigned reads (deprecated)
-l minimum sequence length
-L maximum sequence length
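
To preview how many reads would survive the -l/-L length bounds before running the filter, a rough count can be taken straight from the fasta file. This is only a sketch under the assumption that length means the number of characters in the raw read; the actual script may measure length differently (for example, with the barcode or primer handled separately), so treat the number as an estimate. The function name is hypothetical.

```shell
# count_in_length_range FASTA MIN MAX
# Print how many records have a sequence length within [MIN, MAX].
# Sequences wrapped over several lines are summed per record.
count_in_length_range() {
    awk -v min="$2" -v max="$3" '
        function tally() { if (seen && len >= min && len <= max) n++ }
        /^>/ { tally(); seen = 1; len = 0; next }
        { len += length($0) }
        END { tally(); print n + 0 }
    ' "$1"
}

# Example: count_in_length_range 454Reads.fasta 150 350
```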

Prefix clustering

$ -i 454Reads.sff.txt -f seqs.fna -o example_pp -s -v -p CATGCTGCCTCCCGTAGGAGT
-i SFF text file
-f quality filtered sequence file
-o output directory
-s squeeze, run-length encoding for prefix-filtering
-v verbose
-p the primer sequence
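
The idea behind the -s option can be illustrated with standard tools: collapse every run of identical bases to a single base, then test whether one collapsed read is a prefix of another. This is only an illustration of the concept using tr, not the Denoiser's own implementation, and both function names are hypothetical.

```shell
# squeeze_homopolymers SEQ
# Collapse each run of identical bases to a single base.
squeeze_homopolymers() {
    printf '%s\n' "$1" | tr -s 'ACGTN'
}

# is_prefix SHORT LONG
# Succeed when SHORT is a prefix of LONG.
is_prefix() {
    case "$2" in
        "$1"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: two reads that differ only in homopolymer length
# a=$(squeeze_homopolymers AAACCGT)
# b=$(squeeze_homopolymers ACGGGTTA)
# is_prefix "$a" "$b" && echo "prefix match after squeezing"
```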

Flowgram clustering and denoising

$ -i 454Reads.sff.txt -p example_pp -v -o example_denoised
-i SFF text file
-p output directory of prefix clustering
-v verbose
-o output directory

2. Pre-cluster
- It is part of the mothur package
- It is a pseudo-single linkage algorithm whose goal is to remove sequences that are likely due to pyrosequencing errors.

3. SeqNoise

Wednesday, October 12, 2011

Python :: Installation :: Without root access

Operating System: Ubuntu 64-bit

Download Python from

$ wget
$ bunzip2 Python-2.7.2.tar.bz2
$ tar xvf Python-2.7.2.tar
$ cd Python-2.7.2
$ ./configure --prefix=$HOME/bin/python
$ make
$ make install

Check if it works
$ $HOME/bin/python/bin/python

Add the path to your .bashrc file
$ echo 'export PATH=$PATH:$HOME/bin/python/bin' >> ~/.bashrc
$ source ~/.bashrc

Now we should be able to run python.
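
To confirm which interpreter the shell actually picks up, the sketch below prints the resolved path and its version. Note that a directory appended to the end of PATH is searched last, so a system-wide python found earlier on PATH would still win. The function name is hypothetical; the 2>&1 is there because Python 2 prints its version to stderr.

```shell
# which_python [NAME]
# Print the path of the first executable called NAME (default: python)
# found on PATH, followed by the version string it reports.
which_python() {
    py=$(command -v "${1:-python}") || return 1
    printf '%s\n' "$py"
    "$py" --version 2>&1
}

# Example: which_python
```

If the path printed is not the one under $HOME, put the new directory before $PATH instead of after it.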

Tuesday, October 4, 2011

Qiime-1.3.0 :: Installation

Download Qiime-1.3.0 from

Installation on Ubuntu 64-bit machine

Extract the contents in the tar ball
$ tar zxvf Qiime-1.3.0.tar.gz
$ cd Qiime-1.3.0
$ vim README

Dependencies (check and install)
1. Python >= 2.6
$ python --version
Python 2.7.1+

2. NumPy >= 1.3.0
$ tar zxvf numpy-1.6.1.tar.gz
$ cd numpy-1.6.1/
$ python install --user   # installs to your home directory

Or, with root access, install the packaged version instead
$ sudo apt-get install python-numpy python-numpy-doc

3. PyCogent >= 1.5.1.
Extract the contents
$ tar zxvf PyCogent-1.5.1.tgz
$ cd PyCogent-1.5.1/

With root access,
$ python build
$ sudo python install

Without root access,
$ python build_ext -if
- this compiles the extensions in place (the i option) forcibly (the f option, i.e. even if they’ve already been compiled).
- move the cogent directory to where you want it (or leave it in place)
- add this location to your python path using sys.path.insert(0, "/your/path/to/PyCogent") in each script, or by setting shell environment variables (e.g. $ export PYTHONPATH=/your/path/to/PyCogent:$PYTHONPATH)

4. Matplotlib
$ sudo apt-get install python-matplotlib 
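
The minimum versions listed above (Python >= 2.6, NumPy >= 1.3.0, PyCogent >= 1.5.1) can be checked with a small comparison helper. This is a sketch with a hypothetical function name; it compares dotted numeric fields only, so a trailing suffix such as the "+" in "2.7.1+" is ignored.

```shell
# version_ge HAVE WANT
# Succeed when dotted version HAVE is greater than or equal to WANT.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0   # missing fields count as 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example:
# have=$(python --version 2>&1 | awk '{print $2}')
# version_ge "$have" 2.6 && echo "Python is new enough"
```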

Install Qiime
$ python install --install-scripts=/home/username/qiime/bin/ --install-purelib=/home/username/qiime/lib/ --install-data=/home/username/qiime/lib/
- where username has to be replaced with the username of the current user.
- As of now, it runs into a problem:
running install_data
error: can't copy 'qiime/support_files/denoiser/bin/FlowgramAli_4frame': doesn't exist or not a regular file.

I looked this up on the Qiime Google Groups page. As of this post, the developers have been informed of this error and have said they will fix it, so it may not appear when you download. If it does, a small trick completes the installation: run the command below, then rerun the install command.
$ touch qiime/support_files/denoiser/bin/FlowgramAli_4frame

- If you do NOT intend to use the denoiser, this works fine.
- If you DO intend to use the denoiser, it will fail with the error: Calling /home/username/qiime/lib/qiime/support_files/denoiser/bin/FlowgramAli_4frame failed. Check permissions and that it is in fact an executable.
- I am not sure yet how to solve this.
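
The touch workaround only creates an empty placeholder so that install_data has a file to copy, which is consistent with the later failure: an empty file is not a runnable program. The check below, a sketch with a hypothetical function name, tells such a placeholder apart from a real binary. One untested possibility, stated here as an assumption rather than a confirmed fix, is to build FlowgramAli_4frame with ghc as in the Denoiser post and copy the result over the placeholder.

```shell
# is_usable_binary FILE
# Succeed only when FILE is a non-empty regular file with execute permission.
is_usable_binary() {
    [ -f "$1" ] && [ -s "$1" ] && [ -x "$1" ]
}

# Example:
# f=/home/username/qiime/lib/qiime/support_files/denoiser/bin/FlowgramAli_4frame
# is_usable_binary "$f" || echo "placeholder only, denoiser will not run"
```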

The build will succeed with the following message
Build finished. The HTML pages are in _build/html.
Local documentation built with Sphinx. Open to following path with a web browser:

If you have installed Qiime using the above steps, then you have to add its paths to your .bashrc file
$ echo "export PATH=/home/username/qiime/bin/:$PATH" >> ~/.bashrc
$ echo "export PYTHONPATH=/home/username/qiime/lib/:$PYTHONPATH" >> ~/.bashrc
$ source ~/.bashrc
- Don't forget to replace username with your own username.

Now copy the qiime config file to your home directory
$ cp qiime/support_files/qiime_config ~/.qiime_config

Since we created the binary files in another location, we have to provide the path to it in the configuration file. Look for the line that begins with qiime_scripts_dir
$ vim ~/.qiime_config
qiime_scripts_dir /home/username/qiime/bin
- Note: the white space between the two values is a tab; don't use spaces.
- Also, don't forget to replace username with your own username.
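
Since the tab is easy to get wrong in an editor, the value can also be written from the shell. The helper below is a sketch with a hypothetical name: it rewrites the line whose first field is the given key as key, tab, value, and appends the line if the key is missing. All other lines are left untouched.

```shell
# set_config_value FILE KEY VALUE
# Write "KEY<TAB>VALUE" into FILE, replacing an existing KEY line
# or appending a new one when KEY is absent.
set_config_value() {
    tmp=$(mktemp) || return 1
    awk -v k="$2" -v v="$3" '
        $1 == k { print k "\t" v; found = 1; next }
        { print }
        END { if (!found) print k "\t" v }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example:
# set_config_value ~/.qiime_config qiime_scripts_dir /home/username/qiime/bin
```

To verify the separator, "grep qiime_scripts_dir ~/.qiime_config | cat -A" shows a tab as ^I on GNU systems.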

Move to the tests folder and execute the tests script
$ cd tests
$ python