Class GoogleStyleWordTokenizer

java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.rules.en.GoogleStyleWordTokenizer
All Implemented Interfaces:
Tokenizer

public class GoogleStyleWordTokenizer extends WordTokenizer
Tokenizes sentences into tokens the way Google does for its ngram index. Note: there seems to be no official documentation of how Google tokenizes text for that index, so this implementation is only an approximation.
Since:
3.2
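
Example (illustrative only): the following stand-alone sketch shows the general idea of ngram-style tokenization. It does not use or reproduce this class. The class name NgramStyleTokenizerSketch and its rules are assumptions made for illustration: hyphens and other punctuation become separate tokens, common contraction suffixes such as 'm or 's stay together as one token, and whitespace is dropped. The actual behavior of GoogleStyleWordTokenizer may differ on each of these points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, NOT the LanguageTool implementation.
public class NgramStyleTokenizerSketch {

  // Assumed rules, tried in this order at each position:
  //   1. an apostrophe followed by a common contraction suffix ('m, 're, 've, 'll, 's, 'd)
  //   2. a run of word characters
  //   3. any single character that is neither a word character nor whitespace
  private static final Pattern TOKEN =
      Pattern.compile("'(?:m|re|ve|ll|s|d)\\b|\\w+|[^\\w\\s]");

  // Returns the tokens of the given text in order; whitespace is skipped.
  public static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    Matcher matcher = TOKEN.matcher(text);
    while (matcher.find()) {
      tokens.add(matcher.group());
    }
    return tokens;
  }

  public static void main(String[] args) {
    // prints [I, 'm, a, well, -, known, tokenizer, .]
    System.out.println(tokenize("I'm a well-known tokenizer."));
  }
}
```

To use the real class instead, create an instance and call the tokenize(String) method declared by the Tokenizer interface, which returns the tokens as a List<String>.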