mirror of
https://github.com/Ponce/slackbuilds
synced 2024-11-06 08:26:50 +01:00
66f974475f
Signed-off-by: Willy Sudiarto Raharjo <willysr@slackbuilds.org>
Snowball stemming algorithms, for information retrieval

Stemming algorithms

PyStemmer provides access to efficient algorithms for calculating a "stemmed"
form of a word. This is a form with most of the common morphological endings
removed, ideally representing a common linguistic base form. This is most
useful in building search engines and information retrieval software;
for example, a search with stemming enabled should be able to find a document
containing "cycling" given the query "cycles".

PyStemmer provides algorithms for several (mainly European) languages, by
wrapping the libstemmer library from the Snowball project in a Python module.

It also provides access to the classic Porter stemming algorithm for English:
although this has been superseded by an improved algorithm, the original
algorithm may be of interest to information retrieval researchers wishing
to reproduce results of earlier experiments.
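
The "cycling"/"cycles" search scenario described above can be sketched
roughly as follows. This is only an illustration, not part of PyStemmer or
of this SlackBuild: the helper names (tokenize, build_index, search) and the
sample documents are made up for the example, and the fallback stemmer used
when the Stemmer module is not installed is a crude hypothetical stand-in,
not the Snowball algorithm.

```python
import re
from collections import defaultdict

try:
    # PyStemmer installs a module named "Stemmer".
    import Stemmer
    stem = Stemmer.Stemmer("english").stemWord
except ImportError:
    # Hypothetical stand-in so the sketch still runs without PyStemmer.
    # This crude suffix stripper is NOT the Snowball algorithm.
    def stem(word):
        for suffix in ("ing", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each stem to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[stem(word)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents matching every stemmed query term."""
    stems = [stem(word) for word in tokenize(query)]
    if not stems:
        return set()
    return set.intersection(*(index.get(s, set()) for s in stems))


# Illustrative sample documents (assumed for this sketch).
docs = {
    1: "Cycling is a popular way to commute.",
    2: "The library wraps libstemmer.",
}
index = build_index(docs)
print(search(index, "cycles"))
```

Because documents and queries pass through the same stem function, the
query "cycles" matches the document that only contains "cycling". To
experiment with the original Porter algorithm instead, the algorithm name
passed to Stemmer.Stemmer would presumably be "porter"; calling
Stemmer.algorithms() should list the names available in a given install.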