academic/hyphy: Added (Hypothesis Testing using Phylogenies).

Signed-off-by: David Spencer <idlemoor@slackbuilds.org>
Petar Petrov 2017-10-07 21:23:27 +01:00 committed by Willy Sudiarto Raharjo
parent 7f80fb2d08
commit cef175ce57
5 changed files with 216 additions and 0 deletions

53
academic/hyphy/README Normal file

@ -0,0 +1,53 @@
HyPhy: Hypothesis testing using Phylogenies
HyPhy is an open-source software package for the analysis of genetic
sequences (in particular the inference of natural selection) using
techniques in phylogenetics, molecular evolution, and machine learning.
It features a rich scripting language for limitless customization of
analyses. Additionally, HyPhy features support for parallel computing
environments (via message passing interface).
HyPhy was designed to allow the specification and fitting of a broad
class of continuous-time discrete-space Markov models of sequence
evolution. To implement these models, HyPhy provides its own scripting
language - HBL, or HyPhy Batch Language, which can be used to develop
custom analyses or modify existing ones. Importantly, it is not
necessary to learn (or even be aware of) HBL in order to use HyPhy, as
most common models and analyses have been implemented for user
convenience. Once a model is defined, it can be fitted to data (using a
fixed topology tree), its parameters can be constrained in user-defined
ways to test various hypotheses (e.g. is rate1 > rate2), and data can be
simulated from it. HyPhy primarily implements maximum likelihood methods,
but it can also be used to perform some forms of Bayesian inference
(e.g. FUBAR), fit Bayesian graphical models to data, and run genetic
algorithms to perform complex model selection.
Features
- Support for arbitrary sequence data, including nucleotide, amino-acid,
codon, binary, and count (microsatellite) data, including multiple
partitions mixing different data types.
- Complex models of rate variation, including site-to-site,
branch-to-branch, hidden Markov model (autocorrelated rates),
between/within partitions, and co-varion type models.
- Fast numerical fitting routines, supporting parallel and distributed
execution.
- A broad collection of pre-defined evolutionary models.
- The ability to specify flexible constraints on model parameters and
estimate confidence intervals on MLEs.
- Ancestral sequence reconstruction and sampling.
- Simulate data from any model that can be defined and fitted in the
language.
- Apply unique (for this domain) machine learning methods to discover
patterns in the data, e.g. genetic algorithms, stochastic context free
grammars, Bayesian graphical models.
- Script analyses completely in HBL including flow control, I/O,
parallelization, etc.
Registration
You are strongly advised to fill in the registration form found at:
https://veg.github.io/hyphy-site/register/
Citing
Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse (2005)
HyPhy: hypothesis testing using phylogenies.
Bioinformatics 21(5): 676-679

28
academic/hyphy/References Normal file

@ -0,0 +1,28 @@
HyPhy
Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse (2005)
HyPhy: hypothesis testing using phylogenies. Bioinformatics 21(5): 676-679
Datamonkey webserver
Wayne Delport, Art F. Poon, Simon D. W. Frost and Sergei L. Kosakovsky Pond.
Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 2010 July 29[Epub ahead of print; PMID: 20671151]
Sergei L. Kosakovsky Pond and Simon D. W. Frost (2005). Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 21(10): 2531-2533
Specific methods implemented in HyPhy
Selection detection (SLAC/FEL/REL) - Sergei L. Kosakovsky Pond and Simon D. W. Frost (2005) Not So Different After All: A Comparison of Methods for Detecting Amino Acid Sites Under Selection. Molecular Biology and Evolution 22(5): 1208-1222
Internal Fixed Effects Likelihood (IFEL) Sergei L Kosakovsky Pond, Simon DW Frost, Zehava Grossman, Michael B Gravenor, Douglas D Richman and Andrew J Leigh Brown (2006). Adaptation to different human populations by HIV-1 revealed by codon-based analyses. PLoS Computational Biology 2(6): e62
TOGGLE - Wayne Delport, Konrad Scheffler and Cathal Seoighe (2008). Frequent Toggling between Alternative Amino Acids Is Driven by Selection in HIV-1. PLoS Pathogens 4(12): e1000242.
Directional Evolution in Protein Sequences (DEPS) Sergei L Kosakovsky Pond, Art FY Poon, Andrew J Leigh Brown and Simon Frost (2008). A Maximum Likelihood Method for Detecting Directional Evolution in Protein Sequences and Its Application to Influenza A Virus. Molecular Biology and Evolution 25(9): 1809-1824
PARRIS - Konrad Scheffler, Darren P. Martin and Cathal Seoighe (2006). Robust inference of positive selection from recombining coding sequences. Bioinformatics 22(20): 2493-2499
GA-Branch - S.L. Kosakovsky Pond and S.D.W. Frost (2005). A Genetic Algorithm Approach to Detecting Lineage-specific Variation in Selection Pressure. Molecular Biology and Evolution 22(3): 478-485
Evolutionary Selection Distance (ESD) Sergei L Kosakovsky Pond, Konrad Scheffler, Michael B Gravenor, Art FY Poon and Simon DW Frost (2009). Evolutionary Fingerprinting of Genes. Molecular Biology and Evolution 27(3): 520-536
Spidermonkey/BGM - Art Poon, Fraser Lewis, Sergei Kosakovsky Pond and Simon Frost (2007). An evolutionary-network model reveals stratified interactions in the V3 loop of the HIV-1 envelope. PLoS Computational Biology 3(11): e23
Codon Model Selection (CMS) - Wayne Delport, Konrad Scheffler, Gordon Botha, Michael B Gravenor, Spencer V. Muse and Sergei L Kosakovsky Pond (2010) CodonTest: modeling amino-acid substitution preferences in coding sequences. PLoS Computational Biology 6(8): e1000885
Branch-site REL - Sergei L. Kosakovsky Pond, Ben Murrell, Mathieu Fourment, Simon D. W. Frost, Wayne Delport and Konrad Scheffler (2011). A random effects branch-site model for detecting episodic diversifying selection. Molecular Biology and Evolution (first published online June 13, 2011 doi:10.1093/molbev/msr125)
MEME - Murrell, B., Wertheim, J. O., Moola, S., Weighill, T., Scheffler, K., and Kosakovsky Pond, S. L. (2012). Detecting Individual Sites Subject to Episodic Diversifying Selection. PLoS Genet, 8(7), e1002764
SBP/GARD - Sergei L Kosakovsky Pond, David Posada, Michael B Gravenor, Christopher H Woelk and Simon DW Frost. Automated Phylogenetic Detection of Recombination Using a Genetic Algorithm. Molecular Biology and Evolution 23(10): 1891-1901
SCUEAL - Sergei L Kosakovsky Pond, David Posada, Eric Stawiski, Colombe Chappey, Art FY Poon, Gareth Hughes, Esther Fearnhill, Mike B Gravenor, Andrew J Leigh Brown and Simon DW Frost (2009). An Evolutionary Model-Based Algorithm for Accurate Phylogenetic Breakpoint Mapping and Subtype Prediction in HIV-1. PLoS Computational Biology 5(11): e1000581
Ancestral Sequence Reconstruction (ASR) (joint) - Tal Pupko, Itsik Pe'er, Ron Shamir and Dan Graur (2000). A Fast Algorithm for Joint Reconstruction of Ancestral Amino Acid Sequences. Molecular Biology and Evolution 17: 890-896
ASR (marginal) - Z Yang, S Kumar and M Nei (1995). A New Method of Inference of Ancestral Nucleotide and Amino Acid Sequences. Genetics 141: 1641-1650
ASR (sampled) - Rasmus Nielsen (2002) Mapping mutations on phylogenies. Systematic Biology 51(5): 729-739


@ -0,0 +1,106 @@
#!/bin/sh
# Slackware build script for hyphy
# Copyright 2017 Petar Petrov slackalaxy@gmail.com
# All rights reserved.
#
# Redistribution and use of this script, with or without modification, is
# permitted provided that the following conditions are met:
#
# 1. Redistributions of this script must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
PRGNAM=hyphy
VERSION=${VERSION:-2.3.4}
BUILD=${BUILD:-1}
TAG=${TAG:-_SBo}
if [ -z "$ARCH" ]; then
  case "$( uname -m )" in
    i?86) ARCH=i586 ;;
    arm*) ARCH=arm ;;
       *) ARCH=$( uname -m ) ;;
  esac
fi
CWD=$(pwd)
TMP=${TMP:-/tmp/SBo}
PKG=$TMP/package-$PRGNAM
OUTPUT=${OUTPUT:-/tmp}
if [ "$ARCH" = "i586" ]; then
  SLKCFLAGS="-O2 -march=i586 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "i686" ]; then
  SLKCFLAGS="-O2 -march=i686 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "x86_64" ]; then
  SLKCFLAGS="-O2 -fPIC"
  LIBDIRSUFFIX="64"
else
  SLKCFLAGS="-O2"
  LIBDIRSUFFIX=""
fi
set -e
rm -rf $PKG
mkdir -p $TMP $PKG $OUTPUT
cd $TMP
rm -rf $PRGNAM-$VERSION
tar xvf $CWD/$PRGNAM-$VERSION.tar.gz
cd $PRGNAM-$VERSION
chown -R root:root .
find -L . \
 \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
  -o -perm 511 \) -exec chmod 755 {} \; -o \
 \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
  -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;
# Fix the library path on 64-bit systems
sed -i "s:lib/hyphy:lib${LIBDIRSUFFIX}/hyphy:g" CMakeLists.txt
mkdir -p build
cd build
cmake \
  -DCMAKE_C_FLAGS:STRING="$SLKCFLAGS" \
  -DCMAKE_CXX_FLAGS:STRING="$SLKCFLAGS" \
  -DCMAKE_REQUIRED_FLAGS="$SLKCFLAGS" \
  -DDEFAULT_COMPILE_FLAGS="$SLKCFLAGS" \
  -DINSTALL_PREFIX=/usr \
  -DCMAKE_BUILD_TYPE=Release ..
# This builds a HyPhy executable (HYPHYMP) that uses pthreads for multiprocessing
make MP
make install DESTDIR=$PKG
cd ..
find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
  | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
# Include a few examples
mkdir -p $PKG/usr/share/$PRGNAM
cp -a Examples $PKG/usr/share/$PRGNAM
mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
cp -a help/*.pdf LICENSE README.md $PKG/usr/doc/$PRGNAM-$VERSION
cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
cat $CWD/References > $PKG/usr/doc/$PRGNAM-$VERSION/References
mkdir -p $PKG/install
cat $CWD/slack-desc > $PKG/install/slack-desc
cd $PKG
/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.${PKGTYPE:-tgz}

10
academic/hyphy/hyphy.info Normal file

@ -0,0 +1,10 @@
PRGNAM="hyphy"
VERSION="2.3.4"
HOMEPAGE="https://veg.github.io/hyphy-site/"
DOWNLOAD="https://github.com/veg/hyphy/archive/2.3.4/hyphy-2.3.4.tar.gz"
MD5SUM="1377f4973f40c7d336cb8d7c81a0bd34"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
REQUIRES=""
MAINTAINER="Petar Petrov"
EMAIL="slackalaxy@gmail.com"
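
The MD5SUM field above is meant to be compared against the downloaded source tarball before building. The following is a minimal sketch of that check, assuming GNU `md5sum` is available. The function name `verify_info_md5` is hypothetical and is not part of the SlackBuilds.org tooling; it relies only on the fact that a .info file consists of plain shell variable assignments.

```shell
#!/bin/sh
# Sketch only: verify_info_md5 is a hypothetical helper, not an SBo tool.
# Usage: verify_info_md5 <file.info> <tarball>
# Returns 0 when the tarball's MD5 matches the MD5SUM in the .info file.
verify_info_md5() {
  info_file=$1
  tarball=$2

  # The "." builtin searches PATH for names without a slash,
  # so make a bare filename explicitly relative.
  case "$info_file" in
    */*) ;;
    *) info_file=./$info_file ;;
  esac

  # Source the .info file in a subshell so its variables
  # (PRGNAM, VERSION, MD5SUM, ...) do not leak into the caller.
  (
    . "$info_file" || exit 2
    actual=$(md5sum "$tarball" | cut -d ' ' -f 1)
    [ "$actual" = "$MD5SUM" ]
  )
}
```

For this package the call might look like `verify_info_md5 ./hyphy.info hyphy-2.3.4.tar.gz`, run from the directory holding both files. Only the architecture-independent MD5SUM is checked; MD5SUM_x86_64 is empty here and is ignored by this sketch.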

19
academic/hyphy/slack-desc Normal file

@ -0,0 +1,19 @@
# HOW TO EDIT THIS FILE:
# The "handy ruler" below makes it easier to edit a package description.
# Line up the first '|' above the ':' following the base package name, and
# the '|' on the right side marks the last column you can put a character in.
# You must make exactly 11 lines for the formatting to be correct. It's also
# customary to leave one space after the ':' except on otherwise blank lines.
     |-----handy-ruler------------------------------------------------------|
hyphy: hyphy (Hypothesis Testing using Phylogenies)
hyphy:
hyphy: HyPhy is a software package for the analysis of genetic sequences,
hyphy: in particular the inference of natural selection, using techniques
hyphy: in phylogenetics, molecular evolution, and machine learning. It
hyphy: features a rich scripting language for limitless customization of
hyphy: analyses.
hyphy:
hyphy: Home: https://veg.github.io/hyphy-site/
hyphy:
hyphy:
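
The formatting rules stated in the slack-desc header (exactly 11 description lines, none extending past the right-hand '|' of the ruler) can be checked mechanically. The following is a rough sketch under those stated rules only. The function name `check_slack_desc` is hypothetical, and the sketch assumes the ruler line appears before the description lines and has no trailing whitespace.

```shell
#!/bin/sh
# Sketch only: check_slack_desc is a hypothetical helper, not an SBo tool.
# Usage: check_slack_desc <slack-desc file> <package name>
# Returns 0 when the file has exactly 11 "<name>:" lines and none of them
# is longer than the handy-ruler line.
check_slack_desc() {
  desc_file=$1
  prgnam=$2

  awk -v prefix="$prgnam:" '
    # The right-hand "|" of the ruler marks the last usable column,
    # so the length of the ruler line is the maximum line length.
    /handy-ruler/ { max_col = length($0) }

    # Description lines start with "<name>:" in column 1.
    index($0, prefix) == 1 {
      count++
      if (max_col && length($0) > max_col) too_long++
    }

    END { exit (count == 11 && too_long == 0) ? 0 : 1 }
  ' "$desc_file"
}
```

A call such as `check_slack_desc slack-desc hyphy` would then succeed for a correctly formatted file and fail otherwise. The check depends on the ruler keeping its leading spaces, since the line width is measured from column 1.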