academic/samtools: Updated for version 1.10.

Signed-off-by: Robby Workman <rworkman@slackbuilds.org>
Rob van Nues 2020-01-02 19:49:02 -06:00 committed by Robby Workman
parent a570db958d
commit d04080be8c
5 changed files with 106 additions and 39 deletions

@ -1,11 +1,16 @@
SAM (Sequence Alignment/Map) format is a generic format for storing large
nucleotide sequence alignments. The original samtools package has been split
into three separate but tightly coordinated projects: htslib (C library for
handling high-throughput sequencing data), samtools (for handling SAM, BAM,
CRAM), and bcftools (for handling VCF and BCF).
Samtools is now distributed as an individual package. Installation is set up
so that the code uses an external HTSlib (also at SBo). Although deprecated
upstream, in the case that people need parts of samtools-legacy (e.g. header
files or libbam) these can be installed from this package by modifying the
samtools.SlackBuild. Note that the sam.h of htslib differs from sam.h coming
with samtools.
Prior to the introduction of HTSlib, SAMtools and BCFtools were distributed
in a single samtools-0.1.x package.
This old version remains available from SBo as samtools-legacy.
Samtools is now distributed as an individual package.
Installation is set up so that the code uses an external HTSlib (also at SBo).
Although deprecated upstream, in the case that people need parts of samtools-legacy
(e.g. header files or libbam) these can be installed from this package by modifying
the samtools.SlackBuild.
Note that the sam.h of htslib differs from sam.h coming with samtools.
in a single samtools-0.1.x package. This old version remains available from
SBo as samtools-legacy.

@ -0,0 +1,64 @@
References:
======================
File formats
The introduction of the SAM/BAM format and the samtools command line tool:
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, and 1000 Genome Project Data Processing Subgroup, The Sequence alignment/map (SAM) format and SAMtools, Bioinformatics (2009) 25(16) 2078-9 [19505943]
Extension of the SAM/BAM format to support de novo assemblies:
Cock PJA, Bonfield JK, Chevreux B, Li H, SAM/BAM format v1.5 extensions for de novo assemblies, bioRxiv (2015) 020024 [doi:10.1101/020024]
The introduction of the CRAM format:
Hsi-Yang Fritz M, Leinonen R, Cochrane G, and Birney E, Efficient storage of high throughput DNA sequencing data using reference-based compression, Genome Research (2011) 21(5) 734-740.
The introduction of the VCF format:
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R, 1000 Genomes Project Analysis Group, The variant call format and VCFtools, Bioinformatics (2011) 27(15) 2156-8
======================
Calling and analysis
The original mpileup calling algorithm plus mathematical notes (mpileup/bcftools call -c):
Li H, A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data, Bioinformatics (2011) 27(21) 2987-93.
Li H, Mathematical Notes on SAMtools Algorithms (2010)
Mathematical notes for the updated multiallelic calling model (mpileup/bcftools call -m):
Danecek P, Schiffels S, and Durbin R, Multiallelic calling model in bcftools (-m) (2014)
Hidden Markov model for detecting runs of homozygosity (bcftools roh):
Narasimhan V, Danecek P, Scally A, Xue Y, Tyler-Smith C, and Durbin R, BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data, Bioinformatics (2016) 32(11) 1749-51
Copy number variation/aneuploidy calling from microarray data (bcftools cnv/bcftools polysomy):
Danecek P, McCarthy SA, HipSci Consortium, and Durbin R, A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data, PLoS One (2016) 11(5) e0155014
Haplotype-aware calling of variant consequences (bcftools csq):
Danecek P, McCarthy SA, BCFtools/csq: Haplotype-aware variant consequences, Bioinformatics (2017) 33(13) 2037-39
======================
Other
Base alignment quality (BAQ) method to improve SNP calling around INDELs:
Li H, Improving SNP discovery by base alignment quality, Bioinformatics (2011) 27(8) 1157-8
Segregation based QC metric originally implemented in SGA:
Durbin R, Segregation based metric for variant call QC (2014)

@ -3,7 +3,7 @@
# Slackware build script for samtools
# Copyright 2013-2016 Petar Petrov slackalaxy@gmail.com
# Copyright 2017-2018 Rob van Nues # All rights reserved.
# Copyright 2017-2020 Rob van Nues # All rights reserved.
#
# Redistribution and use of this script, with or without modification, is
# permitted provided that the following conditions are met:
@ -23,7 +23,7 @@
# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
PRGNAM=samtools
VERSION=${VERSION:-1.9}
VERSION=${VERSION:-1.10}
BUILD=${BUILD:-1}
TAG=${TAG:-_SBo}
@ -96,6 +96,17 @@ CXXFLAGS="$SLKCFLAGS" \
make
make install DESTDIR=$PKG
mkdir -p $PKG/usr/share/$PRGNAM-$VERSION
cp -a misc/*.lua $PKG/usr/share/$PRGNAM-$VERSION
# include samtools-API if set above
if [ "$SAMLIB" = "yes" ] ; then
mkdir -p $PKG/usr/include/bam
mkdir -p $PKG/usr/lib${LIBDIRSUFFIX}
install -m644 libbam.a "$PKG/usr/lib${LIBDIRSUFFIX}"
install -m644 *.h "$PKG/usr/include/bam"
fi
find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
| cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
@ -106,24 +117,11 @@ mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
cp -a \
AUTHORS LICENSE README INSTALL NEWS examples \
$PKG/usr/doc/$PRGNAM-$VERSION
cp $CWD/README.references $PKG/usr/doc/$PRGNAM-$VERSION
cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
mkdir -p $PKG/usr/share/$PRGNAM-$VERSION
cp -a \
misc/*.lua \
$PKG/usr/share/$PRGNAM-$VERSION
mkdir -p $PKG/install
cat $CWD/slack-desc > $PKG/install/slack-desc
# include samtools-API if set above
if [ "$SAMLIB" = "yes" ] ; then
mkdir -p $PKG/usr/include/bam
mkdir -p $PKG/usr/lib${LIBDIRSUFFIX}
install -m644 libbam.a "$PKG/usr/lib${LIBDIRSUFFIX}"
install -m644 *.h "$PKG/usr/include/bam"
fi
cd $PKG
/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.${PKGTYPE:-tgz}

@ -1,8 +1,8 @@
PRGNAM="samtools"
VERSION="1.9"
VERSION="1.10"
HOMEPAGE="http://www.htslib.org"
DOWNLOAD="https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2"
MD5SUM="cca9a40d9b91b007af2ff905cb8b5924"
DOWNLOAD="https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2"
MD5SUM="506b0b9b2628e1f3bbedd77855b4c709"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
REQUIRES="htslib"
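
The following is a minimal sketch, not part of this commit, of how the MD5SUM field above could be checked against the downloaded tarball. It assumes GNU md5sum is available and that the tarball was saved next to samtools.info under its upstream file name. The helper name verify_md5 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SlackBuild): compare a file's MD5
# digest with an expected value such as the MD5SUM field of samtools.info.
verify_md5() {
  file=$1
  expected=$2
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(md5sum "$file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}

# Example use, assuming samtools.info is in the current directory.
# The .info file consists of plain shell assignments, so it can be sourced.
if [ -f samtools.info ]; then
  . ./samtools.info   # defines PRGNAM, VERSION, DOWNLOAD, MD5SUM, ...
  tarball="$PRGNAM-$VERSION.tar.bz2"
  if [ -f "$tarball" ] && verify_md5 "$tarball" "$MD5SUM"; then
    echo "MD5 OK: $tarball"
  else
    echo "MD5 mismatch or missing file: $tarball" >&2
  fi
fi
```

Sourcing the .info file avoids copying the checksum by hand, so the check stays correct when VERSION and MD5SUM are bumped together as in this commit.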

@ -8,12 +8,12 @@
|-----handy-ruler------------------------------------------------------|
samtools: samtools (Sequence Alignment/Map Tools)
samtools:
samtools: SAM (Sequence Alignment/Map) format is a generic format for
samtools: storing large nucleotide sequence alignments. The original samtools
samtools: package has been split into three separate but tightly coordinated
samtools: projects: htslib (C-library for handling high-throughput sequencing
samtools: data); samtools (for handling SAM, BAM, CRAM) and bcftools (for
samtools: handling VCF, BCF). Both samtools and bcftools are set up to use
samtools: system-wide installed htslib sources.
samtools: Home: http://www.htslib.org
samtools: SAM (Sequence Alignment/Map) format is a generic format for storing
samtools: large nucleotide sequence alignments. The original samtools package
samtools: has been split into three separate but tightly coordinated projects:
samtools: htslib (C-library for handling high-throughput sequencing data),
samtools: samtools (for handling SAM, BAM, CRAM), and bcftools (for handling
samtools: VCF and BCF).
samtools:
samtools: Homepage: http://www.htslib.org
samtools: