DEploid

Software for deconvoluting mixed genomes with unknown proportions.

Description

dEploid is designed for deconvoluting mixed genomes with unknown proportions. Traditional ‘phasing’ programs are limited to diploid organisms. Our method modifies Li and Stephens’ algorithm [Li2003] with Markov chain Monte Carlo (MCMC) approaches, and builds a generic framework that allows haplotype searches in a multiple infection setting.

dEploid was developed primarily as part of the Pf3k project, from which this documentation takes its examples. The Pf3k project is a global collaboration using the latest sequencing technologies to provide a high-resolution view of natural variation in the malaria parasite Plasmodium falciparum. Parasite DNA is extracted from patient blood samples, which often contain more than one parasite strain in unknown proportions. dEploid deconvolutes the mixed haplotypes and reports the mixture proportions for each sample.
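To make "mixed haplotypes with unknown proportions" concrete, here is a toy sketch of the forward direction of the problem: given known strain haplotypes and known proportions, it computes the allele frequency one would expect to see in the mixed sample. It assumes biallelic sites coded 0 (reference) and 1 (alternative) and a simple proportion-weighted sum; the function name `expected_alt_allele_frequency` and the data are made up for illustration. This is not dEploid's inference code; dEploid addresses the much harder inverse problem of recovering the haplotypes and proportions from sequencing data.

```python
# Toy illustration only: NOT part of dEploid and not its inference algorithm.
# Assumption: biallelic sites coded 0/1, and the expected alternative-allele
# frequency in the mixture is the proportion-weighted sum of the strains.

def expected_alt_allele_frequency(haplotypes, proportions):
    """Return the expected alternative-allele frequency at each site."""
    if len(haplotypes) != len(proportions):
        raise ValueError("need exactly one proportion per haplotype")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must cover the same sites")
    return [
        sum(w * h[site] for h, w in zip(haplotypes, proportions))
        for site in range(n_sites)
    ]


# Two hypothetical strains over four sites, mixed 75% / 25%.
strains = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
proportions = [0.75, 0.25]

mixed = expected_alt_allele_frequency(strains, proportions)
print(mixed)  # [0.25, 1.0, 0.75, 0.0]
```

Sites where the strains agree give a frequency of 0 or 1; sites where they differ give an intermediate value that reflects the proportions, which is the signal a deconvolution method can exploit.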

An illustration of mixed infection in malaria
[Li2003] Li, N. and M. Stephens (2003). Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165(4), 2213–2233.

Installation

$ pip install dEploid

Reporting Bugs

If you encounter any problem when using dEploid, please file a short bug report by using the issue tracker on GitHub or email joe.zhu (at) well.ox.ac.uk.

Please include the output of dEploid -v and the platform you are using dEploid on in the report. If the problem occurs while executing dEploid, please also include the command you are using and the random seed.
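As a convenience, the details requested above can be gathered with a short script. The following is a minimal sketch, assuming a `dEploid` executable is available on your `PATH`; the helper name `collect_bug_report_info` is hypothetical and not part of dEploid. If the executable is not found, the script says so rather than failing.

```python
import platform
import shutil
import subprocess


def collect_bug_report_info(executable="dEploid"):
    """Collect platform details and the output of `<executable> -v`."""
    info = {
        "platform": platform.platform(),
        "python": platform.python_version(),
    }
    path = shutil.which(executable)  # None if the program is not on PATH
    if path is None:
        info["version_output"] = f"{executable} not found on PATH"
        return info
    try:
        result = subprocess.run(
            [path, "-v"],
            capture_output=True,
            text=True,
            check=False,  # keep the output even on a non-zero exit status
            timeout=30,
        )
        info["version_output"] = (result.stdout + result.stderr).strip()
    except subprocess.TimeoutExpired:
        info["version_output"] = f"{executable} -v timed out"
    return info


report = collect_bug_report_info()
for key, value in report.items():
    print(f"{key}: {value}")
```

Paste the printed lines into the bug report, together with the exact command and the random seed if the problem occurred during a run.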

Thank you!

Citing DEploid

If you use dEploid with the flag -ibd, please cite the following paper:

Zhu, J. S., J. A. Hendry, J. Almagro-Garcia, R. D. Pearson, R. Amato, A. Miles, D. J. Weiss, T. C. D. Lucas, M. Nguyen, P. W. Gething, D. Kwiatkowski, G. McVean, and for the Pf3k Project. (2018) The origins and relatedness structure of mixed infections vary with local prevalence of P. falciparum malaria. bioRxiv, doi: https://doi.org/10.1101/387266.

Bibtex record::

@article {Zhu387266,
author = {Zhu, Sha Joe and Hendry, Jason A. and Almagro-Garcia, Jacob and Pearson, Richard D. and Amato, Roberto and Miles, Alistair and Weiss, Daniel J. and Lucas, Tim C.D. and Nguyen, Michele and Gething, Peter W. and Kwiatkowski, Dominic and McVean, Gil},
title = {The origins and relatedness structure of mixed infections vary with local prevalence of P. falciparum malaria},
year = {2018},
doi = {10.1101/387266},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2018/08/09/387266},
eprint = {https://www.biorxiv.org/content/early/2018/08/09/387266.full.pdf},
journal = {bioRxiv}
}

If you use dEploid in your work, please cite the program:

Zhu, J. S., J. Almagro-Garcia, and G. McVean. (2017) Deconvolution of multiple infections in Plasmodium falciparum from high throughput sequencing data. Bioinformatics btx530. doi: https://doi.org/10.1093/bioinformatics/btx530.

Bibtex record::

@article {Zhubtx530,
author = {Zhu, Sha Joe and Almagro-Garcia, Jacob and McVean, Gil},
title = {Deconvolution of multiple infections in {{\em Plasmodium falciparum}} from high throughput sequencing data},
year = {2017},
doi = {10.1093/bioinformatics/btx530},
URL = {https://doi.org/10.1093/bioinformatics/btx530},
journal = {Bioinformatics}
}

dEploid

mcmc module

vcf module