Uses of Interface
org.languagetool.tokenizers.Tokenizer
Packages that use Tokenizer
Uses of Tokenizer in org.languagetool
Fields in org.languagetool declared as Tokenizer
Methods in org.languagetool that return Tokenizer
  Language.createDefaultWordTokenizer(): Creates language specific word tokenizer.
  Language.getWordTokenizer(): Get this language's word tokenizer implementation.
Methods in org.languagetool with parameters of type Tokenizer
  void Language.setWordTokenizer(Tokenizer tokenizer): Set this language's word tokenizer implementation.
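The three Language methods listed for org.languagetool suggest a factory method plus a getter and setter. The following is only a sketch of how such a trio could fit together, not the LanguageTool implementation: LanguageSketch and its nested Tokenizer interface are hypothetical stand-ins, the single tokenize(String) method is an assumed shape for the interface, and lazy creation of the default is an assumption as well.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Language class; not the real implementation.
class LanguageSketch {

  // Assumed shape of the Tokenizer interface: a single tokenize method.
  interface Tokenizer {
    List<String> tokenize(String text);
  }

  private Tokenizer wordTokenizer; // stays null until first use or an explicit set

  // Factory hook that a concrete language could override.
  protected Tokenizer createDefaultWordTokenizer() {
    return text -> Arrays.asList(text.split(" "));
  }

  // Assumption: the default tokenizer is created lazily on first access.
  public Tokenizer getWordTokenizer() {
    if (wordTokenizer == null) {
      wordTokenizer = createDefaultWordTokenizer();
    }
    return wordTokenizer;
  }

  // Replaces whatever tokenizer is currently in use.
  public void setWordTokenizer(Tokenizer tokenizer) {
    this.wordTokenizer = tokenizer;
  }
}
```

With this layout a caller can swap the tokenizer at runtime, while a subclass only needs to override the factory method to change the default.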
Uses of Tokenizer in org.languagetool.language
Methods in org.languagetool.language that return Tokenizer
  Arabic.createDefaultWordTokenizer()
  Belarusian.createDefaultWordTokenizer(): Deprecated.
  Breton.createDefaultWordTokenizer()
  Catalan.createDefaultWordTokenizer()
  Chinese.createDefaultWordTokenizer()
  Dutch.createDefaultWordTokenizer()
  English.createDefaultWordTokenizer()
  Esperanto.createDefaultWordTokenizer()
  French.createDefaultWordTokenizer()
  Galician.createDefaultWordTokenizer()
  German.createDefaultWordTokenizer()
  Greek.createDefaultWordTokenizer()
  Irish.createDefaultWordTokenizer()
  Japanese.createDefaultWordTokenizer()
  Khmer.createDefaultWordTokenizer()
  LanguageBuilder.ExtendedLanguage.createDefaultWordTokenizer()
  Persian.createDefaultWordTokenizer()
  Polish.createDefaultWordTokenizer()
  Portuguese.createDefaultWordTokenizer()
  Romanian.createDefaultWordTokenizer()
  Russian.createDefaultWordTokenizer()
  Spanish.createDefaultWordTokenizer()
  Tagalog.createDefaultWordTokenizer(): Deprecated.
  Ukrainian.createDefaultWordTokenizer()
Uses of Tokenizer in org.languagetool.language.tokenizers
Classes in org.languagetool.language.tokenizers that implement Tokenizer
Uses of Tokenizer in org.languagetool.noop
Methods in org.languagetool.noop that return Tokenizer
Uses of Tokenizer in org.languagetool.rules.en
Classes in org.languagetool.rules.en that implement Tokenizer
  class: Tokenize sentences to tokens like Google does for its ngram index.
Uses of Tokenizer in org.languagetool.rules.ngrams
Methods in org.languagetool.rules.ngrams that return Tokenizer
  (package private) static Tokenizer LanguageModelUtils.getGoogleStyleWordTokenizer(Language language): Return a tokenizer that works more like Google does for its ngram index (which doesn't seem to be properly documented).
  protected Tokenizer NgramProbabilityRule.getGoogleStyleWordTokenizer()
Methods in org.languagetool.rules.ngrams with parameters of type Tokenizer
  (package private) static List<GoogleToken> GoogleToken.getGoogleTokens(String sentence, boolean addStartToken, Tokenizer wordTokenizer)
  (package private) static List<GoogleToken> GoogleToken.getGoogleTokens(AnalyzedSentence sentence, boolean addStartToken, Tokenizer wordTokenizer)
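The getGoogleTokens signatures take a sentence, an addStartToken flag, and a word tokenizer. As a rough illustration of what such a method might do, the sketch below returns plain strings instead of GoogleToken objects. GoogleTokenSketch, the nested Tokenizer interface, the "_START_" marker, and the skipping of whitespace tokens are all assumptions made for the example, not confirmed behavior of the listed methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in; the listed methods return List<GoogleToken>.
final class GoogleTokenSketch {

  // Assumed shape of the Tokenizer interface: a single tokenize method.
  interface Tokenizer {
    List<String> tokenize(String text);
  }

  // Hypothetical sentence-start marker; the real constant may differ.
  static final String START = "_START_";

  static List<String> googleTokens(String sentence, boolean addStartToken, Tokenizer wordTokenizer) {
    List<String> result = new ArrayList<>();
    if (addStartToken) {
      result.add(START);
    }
    for (String token : wordTokenizer.tokenize(sentence)) {
      // Assumption: whitespace-only tokens are not part of the ngram sequence.
      if (!token.trim().isEmpty()) {
        result.add(token);
      }
    }
    return result;
  }
}
```

Passing the tokenizer as a parameter keeps the token-building logic independent of any one language's word-splitting rules.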
Uses of Tokenizer in org.languagetool.tokenizers
Subinterfaces of Tokenizer in org.languagetool.tokenizers
  interface: Interface for components that take compound words and split them into their parts.
  interface: Tokenizes text into sentences.
Classes in org.languagetool.tokenizers that implement Tokenizer
  class
  class
  class: A very simple sentence tokenizer that splits on [.!?…] followed by whitespace or an uppercase letter.
  class: Class to tokenize sentences using rules from an SRX file.
  class: Tokenizes a sentence into words.
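The "very simple sentence tokenizer" description (split on [.!?…] followed by whitespace or an uppercase letter) can be approximated with a single regular expression. This is a minimal sketch under that description only: SimpleSentenceSplitSketch is a hypothetical name, and leaving the whitespace attached to the start of the following sentence is an assumption of this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the class listed in this package.
final class SimpleSentenceSplitSketch {

  // Zero-width boundary: directly after one of . ! ? or U+2026 (ellipsis),
  // and directly before a whitespace character or an uppercase letter.
  private static final Pattern BOUNDARY =
      Pattern.compile("(?<=[.!?\u2026])(?=\\s|\\p{Lu})");

  static List<String> split(String text) {
    // Nothing is consumed by the pattern, so every character of the input
    // ends up in exactly one of the returned sentences.
    return Arrays.asList(BOUNDARY.split(text));
  }
}
```

For the input "Hello world. How are you?Fine!" this yields three pieces: "Hello world.", " How are you?", and "Fine!". Abbreviations such as "e.g." would also trigger a split, which is the kind of case a rule-based SRX tokenizer is meant to handle better.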
Uses of Tokenizer in org.languagetool.tokenizers.be
Classes in org.languagetool.tokenizers.be that implement Tokenizer
  class: Specific to Belarusian: apostrophes (', ’, ʼ) are part of the word.
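Treating apostrophes as part of the word, as described for the Belarusian tokenizer, could look roughly like the sketch below. ApostropheWordTokenizerSketch is a hypothetical name; emitting every non-word character (including whitespace) as its own token is an assumption of this example rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the class listed in this package.
final class ApostropheWordTokenizerSketch {

  // A word is a run of letters, digits, or one of three apostrophes
  // (U+0027, U+2019, U+02BC). Any other character is a token by itself.
  private static final Pattern TOKEN =
      Pattern.compile("[\\p{L}\\p{N}'\u2019\u02BC]+|.", Pattern.DOTALL);

  static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    Matcher matcher = TOKEN.matcher(text);
    while (matcher.find()) {
      tokens.add(matcher.group());
    }
    return tokens;
  }
}
```

Because the apostrophes sit inside the word character class, a form such as аб'ява stays a single token instead of being split at the apostrophe.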
Uses of Tokenizer in org.languagetool.tokenizers.br
Classes in org.languagetool.tokenizers.br that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.ca
Classes in org.languagetool.tokenizers.ca that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.crh
Classes in org.languagetool.tokenizers.crh that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.de
Classes in org.languagetool.tokenizers.de that implement Tokenizer
  class: Split German nouns using the jWordSplitter library.
  class
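To illustrate the general idea of splitting a compound into its parts, here is a toy dictionary-based splitter. It does not use jWordSplitter and makes no claim about how that library works: CompoundSplitSketch, the longest-prefix-first strategy, and the tiny word set in the usage note are all assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy splitter; a swapped-in technique, not jWordSplitter.
final class CompoundSplitSketch {

  // Returns the lowercased parts if the whole word can be covered by
  // dictionary entries, otherwise the original word as a single element.
  static List<String> split(String word, Set<String> dictionary) {
    List<String> parts = splitOrNull(word.toLowerCase(Locale.ROOT), dictionary);
    return parts != null ? parts : Collections.singletonList(word);
  }

  // Tries the longest known prefix first and backtracks when the rest of
  // the word cannot be split completely.
  private static List<String> splitOrNull(String rest, Set<String> dictionary) {
    if (rest.isEmpty()) {
      return new ArrayList<>();
    }
    for (int end = rest.length(); end >= 1; end--) {
      String head = rest.substring(0, end);
      if (dictionary.contains(head)) {
        List<String> tail = splitOrNull(rest.substring(end), dictionary);
        if (tail != null) {
          tail.add(0, head);
          return tail;
        }
      }
    }
    return null;
  }
}
```

With a dictionary containing "arbeit", "geber", and "verband", the word "Arbeitgeberverband" splits into those three parts, while a word with no dictionary coverage is returned unchanged. Real German compounds also involve linking elements such as "s" or "n", which this sketch ignores.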
Uses of Tokenizer in org.languagetool.tokenizers.el
Classes in org.languagetool.tokenizers.el that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.en
Classes in org.languagetool.tokenizers.en that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.eo
Classes in org.languagetool.tokenizers.eo that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.es
Classes in org.languagetool.tokenizers.es that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.fr
Classes in org.languagetool.tokenizers.fr that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.gl
Classes in org.languagetool.tokenizers.gl that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.ja
Classes in org.languagetool.tokenizers.ja that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.km
Classes in org.languagetool.tokenizers.km that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.nl
Classes in org.languagetool.tokenizers.nl that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.pl
Classes in org.languagetool.tokenizers.pl that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.pt
Classes in org.languagetool.tokenizers.pt that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.ro
Classes in org.languagetool.tokenizers.ro that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.ru
Classes in org.languagetool.tokenizers.ru that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.uk
Classes in org.languagetool.tokenizers.uk that implement Tokenizer
Uses of Tokenizer in org.languagetool.tokenizers.zh
Classes in org.languagetool.tokenizers.zh that implement Tokenizer