python/PyStemmer: Added (Snowball stemming algorithms).

Signed-off-by: Willy Sudiarto Raharjo <willysr@slackbuilds.org>
Nikos Giotis 2017-03-05 11:02:08 +07:00 committed by Willy Sudiarto Raharjo
parent 847c66a1c5
commit 66f974475f
4 changed files with 132 additions and 0 deletions


@@ -0,0 +1,85 @@
#!/bin/sh
# Slackware build script for PyStemmer
# Copyright 2017 Nikos Giotis <nikos.giotis@gmail.com>
# All rights reserved.
#
# Redistribution and use of this script, with or without modification, is
# permitted provided that the following conditions are met:
#
# 1. Redistributions of this script must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
PRGNAM=PyStemmer
VERSION=${VERSION:-1.3.0}
BUILD=${BUILD:-1}
TAG=${TAG:-_SBo}
if [ -z "$ARCH" ]; then
  case "$( uname -m )" in
    i?86) ARCH=i586 ;;
    arm*) ARCH=arm ;;
       *) ARCH=$( uname -m ) ;;
  esac
fi
CWD=$(pwd)
TMP=${TMP:-/tmp/SBo}
PKG=$TMP/package-$PRGNAM
OUTPUT=${OUTPUT:-/tmp}
if [ "$ARCH" = "i586" ]; then
  SLKCFLAGS="-O2 -march=i586 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "i686" ]; then
  SLKCFLAGS="-O2 -march=i686 -mtune=i686"
  LIBDIRSUFFIX=""
elif [ "$ARCH" = "x86_64" ]; then
  SLKCFLAGS="-O2 -fPIC"
  LIBDIRSUFFIX="64"
else
  SLKCFLAGS="-O2"
  LIBDIRSUFFIX=""
fi
set -e
rm -rf $PKG
mkdir -p $TMP $PKG $OUTPUT
cd $TMP
rm -rf $PRGNAM-$VERSION
tar xvf $CWD/$PRGNAM-$VERSION.tar.gz
cd $PRGNAM-$VERSION
chown -R root:root .
find -L . \
 \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
  -o -perm 511 \) -exec chmod 755 {} \; -o \
 \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
  -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;
python setup.py install --root=$PKG
find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
  | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
cp -a HACKING LICENSE $PKG/usr/doc/$PRGNAM-$VERSION
cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
mkdir -p $PKG/install
cat $CWD/slack-desc > $PKG/install/slack-desc
cd $PKG
/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.${PKGTYPE:-tgz}


@@ -0,0 +1,10 @@
PRGNAM="PyStemmer"
VERSION="1.3.0"
HOMEPAGE="http://snowball.tartarus.org/"
DOWNLOAD="https://pypi.python.org/packages/21/ee/19e0e4ec9398cc022617baa5f013fd415cce4887748245126aa6d4fac3c6/PyStemmer-1.3.0.tar.gz"
MD5SUM="46ee623eeeba5a7cc0d95cbfa7e18abd"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
REQUIRES=""
MAINTAINER="Nikos Giotis"
EMAIL="nikos.giotis@gmail.com"
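
The .info file above consists of plain shell variable assignments, so it can be sourced directly. As a minimal sketch (the helper name `verify_source` is hypothetical and not part of this commit, and it assumes `md5sum` from coreutils is available), a downloaded tarball could be checked against `MD5SUM` before running the SlackBuild:

```shell
#!/bin/sh
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Usage: verify_source FILE EXPECTED_MD5
# Returns 0 when the digests match, non-zero otherwise.
verify_source() {
  file=$1
  expected=$2
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(md5sum "$file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}

# Example (assumes PyStemmer.info and the tarball sit in the current directory):
#   . ./PyStemmer.info
#   verify_source "$(basename "$DOWNLOAD")" "$MD5SUM" && echo "checksum OK"
```

The function only reports success or failure through its exit status, so the caller decides whether a mismatch should abort the build.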

python/PyStemmer/README Normal file

@@ -0,0 +1,18 @@
Snowball stemming algorithms, for information retrieval
Stemming algorithms
PyStemmer provides access to efficient algorithms for calculating a "stemmed"
form of a word. This is a form with most of the common morphological endings
removed, hopefully representing a common linguistic base form. This is most
useful in building search engines and information retrieval software;
for example, a search with stemming enabled should be able to find a document
containing "cycling" given the query "cycles".
PyStemmer provides algorithms for several (mainly European) languages, by
wrapping the libstemmer library from the Snowball project in a Python module.
It also provides access to the classic Porter stemming algorithm for English:
although this has been superseded by an improved algorithm, the original
algorithm may be of interest to information retrieval researchers wishing
to reproduce results of earlier experiments.


@@ -0,0 +1,19 @@
# HOW TO EDIT THIS FILE:
# The "handy ruler" below makes it easier to edit a package description.
# Line up the first '|' above the ':' following the base package name, and
# the '|' on the right side marks the last column you can put a character in.
# You must make exactly 11 lines for the formatting to be correct. It's also
# customary to leave one space after the ':' except on otherwise blank lines.
         |-----handy-ruler------------------------------------------------------|
PyStemmer: PyStemmer (Snowball stemming algorithms, for information retrieval)
PyStemmer:
PyStemmer: PyStemmer provides access to efficient algorithms for calculating a
PyStemmer: "stemmed" form of a word. This is a form with most of the common
PyStemmer: morphological endings removed, hopefully representing a common
PyStemmer: linguistic base form. This is most useful in building search engines
PyStemmer: and information retrieval software; for example, a search with
PyStemmer: stemming enabled should be able to find a document containing
PyStemmer: "cycling" given the query "cycles".
PyStemmer:
PyStemmer: http://snowball.tartarus.org/
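
The "exactly 11 lines" rule stated in the slack-desc header comments can be checked mechanically. A rough sketch, assuming a hypothetical helper `count_desc_lines` (not part of this commit or of any SlackBuilds.org tooling) that counts the lines beginning with the package name followed by a colon:

```shell
#!/bin/sh
# Hypothetical helper: count description lines ("PRGNAM:" prefix) in a slack-desc file.
# Usage: count_desc_lines PRGNAM FILE
count_desc_lines() {
  # grep -c prints the number of matching lines; comment lines and the
  # handy ruler do not start with "PRGNAM:" and are therefore ignored.
  grep -c "^$1:" "$2"
}

# Example: a well-formed slack-desc has exactly 11 such lines.
#   [ "$(count_desc_lines PyStemmer slack-desc)" -eq 11 ] && echo "slack-desc OK"
```

This checks only the line count; it does not verify that each line fits within the width marked by the handy ruler.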