Pattern is a web mining module for the Python programming language.

It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider,
HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface,
syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity +
LSA metrics), clustering and classification (k-means, k-NN, SVM), and data
visualization (graph networks).

The module is bundled with 30+ examples and 350+ unit tests.

Pattern is written for Python 2.5+ (no support for Python 3 yet).

The source code is licensed under BSD and available from
http://www.clips.ua.ac.be/pages/pattern.