This example goes over how to set up CoreNLP from the GitHub repo. The GitHub code has newer features than the official release, but may be unstable. This example will take you through downloading, building, and running a simple command-line invocation of CoreNLP.
Prerequisites:
- Java 8 or newer
- Apache Ant
- For the example: Bash or a similar shell, and wget or curl
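Before starting, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of CoreNLP: `missing_tools` is a hypothetical helper name, and `git` is included on the assumption that you will clone the repository as described in the next step.

```shell
# Report any of the named commands that are not on PATH.
# Returns 0 when all are found, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return $rc
}

# Check the build tools, then accept either wget or curl for the download step.
missing_tools java ant git || echo "install the tools listed above before continuing"
missing_tools wget >/dev/null || missing_tools curl >/dev/null || echo "missing: wget or curl"
```

This only checks that the commands exist; it does not verify the Java version, which you can inspect with `java -version`.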
Clone the CoreNLP Git repository:
git clone git@github.com:stanfordnlp/CoreNLP.git
Enter the CoreNLP directory:
cd CoreNLP
Build the project into a self-contained jar file. The easiest way to do this is with:
ant jar
Download the latest models. Using wget:
wget http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar
Or using curl (what you get by default on macOS):
curl -O http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar
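The download step can also be wrapped in a small helper that uses whichever of wget or curl is installed. This is only a sketch: `fetch` is a hypothetical function name, and the `-L` flag is added to curl on the assumption that the server may redirect the request (wget follows redirects by default).

```shell
# Download a URL into the current directory with whichever tool is installed.
fetch() {
  if command -v wget >/dev/null 2>&1; then
    wget "$1"
  elif command -v curl >/dev/null 2>&1; then
    # -L follows HTTP redirects; -O keeps the remote file name.
    curl -L -O "$1"
  else
    echo "neither wget nor curl is installed" >&2
    return 1
  fi
}

# Usage (not run here, since it downloads a large file):
# fetch http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar
```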
Set up your classpath. If you’re using an IDE, you should set the classpath in your IDE.
export CLASSPATH="$CLASSPATH:javanlp-core.jar:stanford-corenlp-models-current.jar"; for file in `find lib -name "*.jar"`; do export CLASSPATH="$CLASSPATH:`realpath $file`"; done
If you’ll be using CoreNLP frequently, this is a useful line to have in your ~/.bashrc (or equivalent) file, replacing the directory /path/to/corenlp/ with the path to where you cloned CoreNLP (3 replacements):
export CLASSPATH="$CLASSPATH:/path/to/corenlp/javanlp-core.jar:/path/to/corenlp/stanford-corenlp-models-current.jar"; for file in `find /path/to/corenlp/lib -name "*.jar"`; do export CLASSPATH="$CLASSPATH:`realpath $file`"; done
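If the one-liner is hard to read or maintain, the same classpath can be assembled by a small function. The sketch below assumes the jar names used above (javanlp-core.jar and stanford-corenlp-models-current.jar); `corenlp_classpath` is a hypothetical helper name. It takes the CoreNLP directory as a single argument, so there is only one path to edit, and it does not need `realpath`, which is not installed on every system.

```shell
# Print a classpath for a CoreNLP checkout: the built jar, the models jar,
# and every jar found under lib/. The directory defaults to the current one.
corenlp_classpath() {
  # Resolve the directory to an absolute path without changing the caller's directory.
  dir=$(cd "${1:-.}" >/dev/null && pwd) || return 1
  classpath="$dir/javanlp-core.jar:$dir/stanford-corenlp-models-current.jar"
  # find prints absolute paths because $dir is absolute; tr joins them with ':'.
  jars=$(find "$dir/lib" -name '*.jar' 2>/dev/null | sort | tr '\n' ':')
  # Drop the trailing ':' left by tr, and append only if any jars were found.
  jars=${jars%:}
  [ -n "$jars" ] && classpath="$classpath:$jars"
  echo "$classpath"
}

# Usage, for example in ~/.bashrc:
# export CLASSPATH="$CLASSPATH:$(corenlp_classpath /path/to/corenlp)"
```

Unlike the backtick loop, this version tolerates spaces in directory names, though not newlines in file names.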
Try it out! For example, the following creates a simple text file to annotate and runs CoreNLP over it. The output will be saved to input.txt.json as a JSON file. Note that CoreNLP requires quite a bit of memory. You should give it at least 2GB (-mx2g) in most cases.
echo "the quick brown fox jumped over the lazy dog" > input.txt
java -mx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -outputFormat json -file input.txt
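If the java command reports that it could not find or load the main class, one likely cause is a classpath entry that points at a file that does not exist, for example because the jar was not built or the models were downloaded into another directory. The following sketch lists such entries; `missing_classpath_entries` is a hypothetical helper name, and it assumes plain file and directory entries (no `*` wildcards).

```shell
# Print every entry of a ':'-separated classpath that does not exist on disk.
# Returns 0 when every entry exists, 1 otherwise. Empty entries are skipped.
missing_classpath_entries() {
  missing=$(printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    [ -z "$entry" ] || [ -e "$entry" ] || echo "$entry"
  done)
  [ -z "$missing" ] && return 0
  echo "$missing"
  return 1
}

# Usage:
# missing_classpath_entries "$CLASSPATH"
```

Relative entries are checked against the current directory, so run it from the directory where you run java.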