Tools for Natural Language Processing (repost)

http://www.nltk.org/
Open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.
LingPipe

LingPipe is a suite of Java libraries for the linguistic analysis of human language.

Text simplification - Wikipedia, the free encyclopedia
Text simplification is an operation used in natural language processing to modify, enhance, classify or otherwise process an existing corpus of human-readable text in such a way that the grammar and structure of the prose are greatly simplified, while the underlying meaning and information remain the same. Text simplification is an important area of research, because natural human languages ordinarily contain complex compound constructions that are not easily processed through automation.

 


 

CoPT, Corpus Processing Tools
CoPT, Corpus Processing Tools, is a set of Java classes intended to assist field linguists, NLP researchers, students and software developers in all corpus-related processing.

 


 

Jazzy - Java spell checker API
Jazzy is a Java spell checker based on the algorithms used by aspell.

 


 

JLinkGrammarParser
JLinkGrammarParser is a Java port of the CMU link grammar parser, a syntactic parser for English.

 


 

jSpellCorrect
jSpellCorrect is a simple statistical spelling corrector.
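To make "statistical spelling corrector" concrete, here is a minimal Python sketch of one common approach: generate every string one edit away from the input and pick the candidate that occurs most often in a corpus. This is an illustration only, not jSpellCorrect's actual code; the toy corpus and the names `edits1` and `correct` are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corrector would count words from a large text.
CORPUS = "the quick brown fox jumps over the lazy dog the dog barks"
WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, else the word itself."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    # Ties on frequency are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (WORD_COUNTS[w], w))
```

With this toy corpus, `correct("teh")` yields `"the"`, while a string with no known neighbour is returned unchanged.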

 


 

jTokeniser
jTokeniser is a Java library for tokenising strings into a list of tokens. A variety of possible tokenisers are available, including a very basic whitespace tokeniser, a more flexible StringTokeniser, a couple of regular expression tokenisers, and a tokeniser that utilises Java’s BreakIterator, which provides more complex, locale-dependent tokenisation. More recently, a tokeniser that breaks text into its constituent sentences has been added. All are very simple to use.
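The difference between the tokeniser styles listed above can be sketched in a few lines of Python. These functions are illustrative stand-ins, not jTokeniser's API; the function names and the regular expressions are assumptions chosen for the example, and the sentence splitter is deliberately naive (it would break on abbreviations such as "e.g.").

```python
import re


def whitespace_tokenise(text):
    """Split on runs of whitespace; punctuation stays attached to words."""
    return text.split()


def regex_tokenise(text, pattern=r"\w+|[^\w\s]"):
    """Return word runs and single punctuation marks as separate tokens."""
    return re.findall(pattern, text)


def sentence_tokenise(text):
    """Naively split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
```

For the input `"Hello, world!"` the whitespace tokeniser keeps `"Hello,"` as one token, whereas the regular expression tokeniser separates the comma and the exclamation mark.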

 


 

Linguistic Tree Constructor
LTC is a free program for building linguistic syntax trees from text.
It lets the user build the tree in a point-and-click fashion.
The program does no analysis on its own; the user is completely free to draw the tree however he or she wishes. However, the program makes sure that the tree is a tree and not some other kind of graph.
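What "makes sure that the tree is a tree" could mean in practice is sketched below in Python. This is not LTC's code; the function name `is_tree` and the (parent, child) edge representation are assumptions for illustration. The check accepts a graph only if it has exactly one root, every other node has exactly one parent, and every node is reachable from the root.

```python
def is_tree(nodes, edges):
    """Check that (parent, child) edges over nodes form a single rooted tree."""
    nodes = set(nodes)
    if not nodes:
        return False
    parent = {}
    children = {n: [] for n in nodes}
    for p, c in edges:
        # Reject edges to unknown nodes and children that already have a parent.
        if p not in nodes or c not in nodes or c in parent:
            return False
        parent[c] = p
        children[p].append(c)
    roots = [n for n in nodes if n not in parent]
    if len(roots) != 1:
        return False
    # Walk down from the root; a detached cycle or fragment is never reached.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children[node])
    return seen == nodes
```

A sentence node `S` with children `NP` and `VP` passes the check; adding an extra edge from `NP` to `VP` gives `VP` two parents and the check fails.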

 


 

MII Medical NLP Toolkit
This is a toolkit for medical natural language processing (NLP). The core engine is general enough to be used in a variety of text processing domains, though the toolkit includes specific support for medical reports and patient de-identification.

 


 

nlpFarm
The nlpFarm is a Natural Language Processing (NLP) resource where early research prototypes (Java) can evolve into robust and useful open source. Our farmstead collaborates under the OpenNLP initiative, in order to make NLP software publicly available.

 


 

OpenNLP
OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP components.

 


 

Open source natural language tools
Toolkit for implementing question answering systems and machine translation in both controlled languages and natural languages. Includes first order logic inference, parsing and semantic analysis, and APIs and standalone server software. Currently some t

 


 

The OpenNLP Grok Library
Grok is a library of natural language processing components, including support for parsing with categorial grammars and various preprocessing tasks such as part-of-speech tagging, sentence detection, and tokenization.

 


 

The OpenNLP Leo Project
Leo is a project to provide an architecture for defining XML specifications of grammars for different natural language parsing systems and tools for using that architecture to permit sharing of grammar resources across different systems.

 


 

The OpenNLP Maximum Entropy Package
Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part of speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Grok Library.
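A maximum entropy classifier scores each label by summing the weights of the features that are active for it, then normalises the exponentiated scores into probabilities. The Python sketch below shows only that prediction step; it is not the OpenNLP maxent API. The feature names, labels and weights are hypothetical values chosen for illustration, where a real model would learn the weights from training data.

```python
import math

# Hypothetical weights for (feature, label) pairs in a toy part-of-speech tagger.
WEIGHTS = {
    ("suffix=ing", "VERB"): 1.5,
    ("suffix=ing", "NOUN"): 0.2,
    ("prev=the", "NOUN"): 2.0,
    ("prev=the", "VERB"): -1.0,
}
LABELS = ["NOUN", "VERB"]


def predict_proba(features):
    """Return p(label | features) as exp(sum of active weights) / normaliser."""
    scores = {y: sum(WEIGHTS.get((f, y), 0.0) for f in features) for y in LABELS}
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}
```

With these weights, a word ending in "ing" that follows "the" is scored as more likely a noun than a verb, and an input with no active features gets a uniform distribution.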

 


 

Visuwords™ online graphical dictionary - download source code
Download the source code for Visuwords.

 


 

Balie
Extraction from Text with Machine Learning and Natural Language Techniques

 


 

FerFT: Spectral Analyzer
This software is a multi-purpose power spectral analyzer based on the successive Fourier transformation method. (® UTD) It was developed in Java (ver. 1.5) and runs on any OS with Java ver. 1.5 or later.
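For readers unfamiliar with power spectra, the Python sketch below computes one with a plain discrete Fourier transform. It is not the successive Fourier transformation method that FerFT uses and does not reflect the tool's code; the function name `power_spectrum` and the division by the sample count are assumptions made for the example.

```python
import cmath
import math


def power_spectrum(samples):
    """Return the power at each frequency bin from 0 up to half the sample count."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        # Direct O(n^2) DFT coefficient for bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum
```

For a pure sine wave that completes three cycles over the sampled window, the largest value in the returned list sits at bin 3.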

 


 

Julius Speech Recognition Engine
Julius Speech recognition engine

 


 

Modular Audio Recognition Framework
MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

 


 

VoxForge 0.0.1
Speech recognition support

 


 

OpenCCG: The OpenNLP CCG Library
OpenCCG, the OpenNLP CCG Library, is an open source natural language processing library written in Java, which provides parsing and realization services based on Mark Steedman’s Combinatory Categorial Grammar (CCG) formalism.

 


 

Joone
Joone (Java Object Oriented Neural Engine) is an artificial neural network Java framework. It is used to build and train neural networks with a powerful visual environment. It has a modular design and can be easily extended by writing new modules to implement new learning algorithms or architectures.

 

posted on 2009-06-23 09:14 by chzhcpu
