mirror of
https://github.com/Ponce/slackbuilds
synced 2024-11-20 19:41:34 +01:00
academic/muscle5: Added (MUSCLE 5: Next-generation MUSCLE)
Signed-off-by: Willy Sudiarto Raharjo <willysr@slackbuilds.org>
parent
d5622d171b
commit
1f43c07f3c
6 changed files with 273 additions and 0 deletions
28
academic/muscle5/README
Normal file
@@ -0,0 +1,28 @@
MUSCLE 5: Next-generation MUSCLE

Muscle v5 is a major re-write of MUSCLE based on new algorithms.

* Highest accuracy, scalable to thousands of sequences:
Compared to previous versions, Muscle v5 is much more accurate, is often
faster, and scales to much larger datasets. At the time of writing (late
2021), Muscle v5 has the highest scores on multiple alignment benchmarks
including Balibase, Bralibase, Prefab and Balifam. It can align tens of
thousands of sequences with high accuracy on a low-cost commodity
computer (say, an 8-core Intel CPU with 32 GB RAM). On large datasets,
Muscle v5 is 20-30% more accurate than MAFFT and Clustal-Omega.

* Alignment ensembles:
Muscle v5 can generate ensembles of high-accuracy alternative
alignments. All replicates have equal average accuracy on benchmark
tests, including the MSA made with default parameters. By comparing
results of downstream analysis (trees, structure prediction...) on
different replicates, you can assess the effects of alignment errors on
your study.

* Manual:
https://drive5.com/muscle5/manual/

* Reference (included in the package)
R.C. Edgar (2021) "MUSCLE v5 enables improved estimates of phylogenetic
tree confidence by ensemble bootstrapping"
https://www.biorxiv.org/content/10.1101/2021.06.20.449169v1.full.pdf
5
academic/muscle5/References
Normal file
@@ -0,0 +1,5 @@
References

R.C. Edgar (2021) "MUSCLE v5 enables improved estimates of phylogenetic
tree confidence by ensemble bootstrapping"
https://www.biorxiv.org/content/10.1101/2021.06.20.449169v1.full.pdf
93
academic/muscle5/muscle5.1
Normal file
@@ -0,0 +1,93 @@
.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.48.5.
.TH MUSCLE "1" "January 2022" "muscle 5.1" "User Commands"
.SH NAME
muscle \- Multiple alignment program of protein sequences
.SH DESCRIPTION
MUSCLE is a multiple alignment program for protein sequences. MUSCLE
stands for multiple sequence comparison by log-expectation. In the
authors' tests, MUSCLE achieved the highest scores of all tested
programs on several alignment accuracy benchmarks, and is also one of
the fastest programs out there.
.SH USAGE
.SS "Align FASTA input, write aligned FASTA (AFA) output:"
.IP
muscle \fB\-align\fR input.fa \fB\-output\fR aln.afa
.PP
Align large input using Super5 algorithm if \fB\-align\fR is too expensive,
typically needed with more than a few hundred sequences:
.IP
muscle \fB\-super5\fR input.fa \fB\-output\fR aln.afa
.SS "Single replicate alignment:"
.IP
muscle \fB\-align\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
muscle \fB\-super5\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
.IP
PERM is guide tree permutation none, abc, acb, bca (default none).
SEED is perturbation seed 0, 1, 2... (default 0 = don't perturb).
.PP
Ensemble of replicate alignments, output in Ensemble FASTA (EFA) format,
EFA has one aligned FASTA for each replicate with header line "<PERM.SEED":
.IP
muscle \fB\-align\fR input.fa \fB\-stratified\fR \fB\-output\fR stratified_ensemble.efa
muscle \fB\-align\fR input.fa \fB\-diversified\fR \fB\-output\fR diversified_ensemble.efa
.HP
\fB\-replicates\fR N
.IP
Number of replicates, defaults 4, 100, 100 for stratified,
.IP
diversified, resampled. With \fB\-stratified\fR there is one
replicate per guide tree permutation, total is 4 x N.
.PP
Generate resampled ensemble from existing ensemble by sampling columns
with replacement:
.IP
muscle \fB\-resample\fR ensemble.efa \fB\-output\fR resampled.efa
.HP
\fB\-maxgapfract\fR F
.IP
Maximum fraction of gaps in a column (F=0..1, default 0.5).
.HP
\fB\-minconf\fR CC
.IP
Minimum column confidence (CC=0..1, default 0.5).
.PP
If ensemble output filename has @, then one FASTA file is generated
for each replicate where @ is replaced by perm.s, otherwise all replicates
are written to one EFA file.
.SS "Calculate dispersion of an ensemble:"
.IP
muscle \fB\-disperse\fR ensemble.efa
.SS "Extract replicate with highest total CC (diversified input recommended):"
.IP
muscle \fB\-maxcc\fR ensemble.efa \fB\-output\fR maxcc.afa
.SS "Extract aligned FASTA files from EFA file:"
.IP
muscle \fB\-efa_explode\fR ensemble.efa
.SS "Convert FASTA to EFA, input has one filename per line:"
.IP
muscle \fB\-fa2efa\fR filenames.txt \fB\-output\fR ensemble.efa
.PP
Update ensemble by adding two sequences of digits to each replicate, digits
are column confidence (CC) values, e.g. "73" means CC=0.73, "++" is CC=1.0:
.IP
muscle \fB\-addconfseqs\fR ensemble.efa \fB\-output\fR ensemble_cc.efa
.PP
Calculate letter confidence (LC) values, \fB\-ref\fR specifies the alignment to
compare against the ensemble (e.g. from \fB\-maxcc\fR), output is in aligned
FASTA format with LC values 0, 1 ... 9 instead of letters:
.IP
muscle \fB\-letterconf\fR ensemble.efa \fB\-ref\fR aln.afa \fB\-output\fR letterconf.afa
.HP
\fB\-html\fR aln.html
.IP
Alignment colored by LC in HTML format.
.HP
\fB\-jalview\fR aln.features
.IP
Jalview feature file with LC values and colors.
.SS "More documentation at:"
.IP
https://drive5.com/muscle
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and
can be used for any other usage of the program.
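The man page advises switching from -align to -super5 once the input has "more than a few hundred sequences". A minimal sketch of that decision follows. The helper name pick_muscle_mode and the default cutoff of 500 are assumptions made for illustration and are not part of MUSCLE or of this package.

```shell
#!/bin/sh
# Hypothetical helper: choose -align or -super5 from the number of sequences.
# THRESHOLD is an assumed stand-in for "a few hundred"; tune it for your data.
THRESHOLD=${THRESHOLD:-500}

pick_muscle_mode() {
  # Each FASTA record starts with a '>' header line, so count those.
  n=$(grep -c '^>' "$1" || true)
  if [ "$n" -gt "$THRESHOLD" ]; then
    echo super5
  else
    echo align
  fi
}

# Example call (not executed here):
#   muscle5 -"$(pick_muscle_mode input.fa)" input.fa -output aln.afa
```

Note that the SlackBuild in this commit installs the binary as /usr/bin/muscle5, while the man page text refers to the command as muscle.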
118
academic/muscle5/muscle5.SlackBuild
Normal file
@@ -0,0 +1,118 @@
#!/bin/bash

# Slackware build script for muscle5

# Copyright 2022 Petar Petrov slackalaxy@gmail.com
# All rights reserved.
#
# Redistribution and use of this script, with or without modification, is
# permitted provided that the following conditions are met:
#
# 1. Redistributions of this script must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

cd $(dirname $0) ; CWD=$(pwd)

PRGNAM=muscle5
VERSION=${VERSION:-5.1}
BUILD=${BUILD:-1}
TAG=${TAG:-_SBo}
PKGTYPE=${PKGTYPE:-tgz}

SRCNAM=muscle

if [ -z "$ARCH" ]; then
  case "$( uname -m )" in
    i?86) ARCH=i586 ;;
    arm*) ARCH=arm ;;
       *) ARCH=$( uname -m ) ;;
  esac
fi

# If the variable PRINT_PACKAGE_NAME is set, then this script will report what
# the name of the created package would be, and then exit. This information
# could be useful to other scripts.
if [ ! -z "${PRINT_PACKAGE_NAME}" ]; then
  echo "$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE"
  exit 0
fi

TMP=${TMP:-/tmp/SBo}
PKG=$TMP/package-$PRGNAM
OUTPUT=${OUTPUT:-/tmp}

if [ "$ARCH" = "i586" ]; then
  SLKCFLAGS="-O2 -march=i586 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "i686" ]; then
  SLKCFLAGS="-O2 -march=i686 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "x86_64" ]; then
  SLKCFLAGS="-O2 -fPIC"
  LIBDIRSUFFIX="64"
else
  SLKCFLAGS="-O2"
  LIBDIRSUFFIX=""
fi

set -e

rm -rf $PKG
mkdir -p $TMP $PKG $OUTPUT
cd $TMP
rm -rf $SRCNAM-$VERSION
tar xvf $CWD/$SRCNAM-$VERSION.tar.gz
cd $SRCNAM-$VERSION

chown -R root:root .
find -L . \
 \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
  -o -perm 511 \) -exec chmod 755 {} \; -o \
 \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
  -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;

cd src

# do not create static executable
sed -i "s:LDFLAGS += -static:#LDFLAGS += -static:" Makefile
make CFLAGS="$SLKCFLAGS" \
  CXXFLAGS="$SLKCFLAGS"

install -D -m755 Linux/$SRCNAM $PKG/usr/bin/$PRGNAM
cd ..

# Thanks to Debian for the man page
mkdir -p $PKG/usr/man/man1
cp $CWD/$PRGNAM.1 $PKG/usr/man/man1/$PRGNAM.1

# The Makefile strips the binary...
#find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
#  | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true

find $PKG/usr/man -type f -exec gzip -9 {} \;
for i in $( find $PKG/usr/man -type l ) ; do ln -s $( readlink $i ).gz $i.gz ; rm $i ; done

mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
cp -a \
  CONTRIBUTING.md LICENSE README.md \
  $PKG/usr/doc/$PRGNAM-$VERSION

cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
cat $CWD/References > $PKG/usr/doc/$PRGNAM-$VERSION/References

mkdir -p $PKG/install
cat $CWD/slack-desc > $PKG/install/slack-desc

cd $PKG
/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE
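The build script derives the package file name from PRGNAM, VERSION, ARCH, BUILD, TAG and PKGTYPE, and prints it when PRINT_PACKAGE_NAME is set. The sketch below isolates just that naming logic so it can be tried without building anything. The function name sbo_package_name is hypothetical, and the defaults are copied from the script above.

```shell
#!/bin/sh
# Hypothetical helper reproducing the SlackBuild's package-name logic.
# The body runs in a subshell, so the caller's variables are left untouched.
sbo_package_name() (
  PRGNAM=muscle5
  VERSION=${VERSION:-5.1}
  BUILD=${BUILD:-1}
  TAG=${TAG:-_SBo}
  PKGTYPE=${PKGTYPE:-tgz}
  # Same architecture detection as the SlackBuild: honor ARCH if given.
  if [ -z "$ARCH" ]; then
    case "$(uname -m)" in
      i?86) ARCH=i586 ;;
      arm*) ARCH=arm ;;
         *) ARCH=$(uname -m) ;;
    esac
  fi
  echo "$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE"
)
```

With the real script, the same name is reported by running it with PRINT_PACKAGE_NAME set to any non-empty value, for example `PRINT_PACKAGE_NAME=1 sh muscle5.SlackBuild`.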
10
academic/muscle5/muscle5.info
Normal file
@@ -0,0 +1,10 @@
PRGNAM="muscle5"
VERSION="5.1"
HOMEPAGE="https://github.com/rcedgar/muscle"
DOWNLOAD="https://github.com/rcedgar/muscle/archive/v5.1/muscle-5.1.tar.gz"
MD5SUM="99b5ef38a119994e7a8f0ea7a12b5987"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
REQUIRES=""
MAINTAINER="Petar Petrov"
EMAIL="slackalaxy@gmail.com"
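The .info file pairs the DOWNLOAD URL with an MD5SUM so the source tarball can be checked before building. A minimal sketch of such a check is shown here, assuming the coreutils md5sum tool is available. The helper name verify_md5 is hypothetical and is not part of the SlackBuilds tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file's MD5 digest matches.
#   $1 = path to the file, $2 = expected hex digest
verify_md5() {
  # md5sum prints "<digest>  <filename>"; keep only the digest field.
  actual=$(md5sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Example call with the values from muscle5.info (not executed here):
#   verify_md5 muscle-5.1.tar.gz 99b5ef38a119994e7a8f0ea7a12b5987 && echo OK
```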
19
academic/muscle5/slack-desc
Normal file
@@ -0,0 +1,19 @@
# HOW TO EDIT THIS FILE:
# The "handy ruler" below makes it easier to edit a package description.
# Line up the first '|' above the ':' following the base package name, and
# the '|' on the right side marks the last column you can put a character in.
# You must make exactly 11 lines for the formatting to be correct. It's also
# customary to leave one space after the ':' except on otherwise blank lines.

       |-----handy-ruler------------------------------------------------------|
muscle5: muscle5 (MUSCLE 5: Next-generation MUSCLE)
muscle5:
muscle5: Muscle v5 is a major re-write of MUSCLE based on new algorithms.
muscle5: Compared to previous versions, Muscle v5 is much more accurate,
muscle5: faster, and scales to much larger datasets.
muscle5:
muscle5: https://drive5.com/muscle5/
muscle5: https://drive5.com/muscle5/manual/
muscle5:
muscle5:
muscle5: