Package org.languagetool.tagging.de
Class GermanTagger
java.lang.Object
org.languagetool.tagging.BaseTagger
org.languagetool.tagging.de.GermanTagger
- All Implemented Interfaces:
Tagger
- Direct Known Subclasses:
SwissGermanTagger
German part-of-speech tagger. Requires the data file
de/german.dict on the classpath.
The POS tagset is described in
tagset.txt.
-
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
(package private) static class
(package private) static class
(package private) static class
(package private) static class
(package private) static class
-
Field Summary
Fields
Modifier and Type / Field / Description
private static final Pattern
private static final Pattern
private static final Pattern
private static final Supplier<GermanTagger.ExpansionInfos>
private static final Pattern
private static final Pattern
private static final Pattern
static final GermanTagger
private static final Pattern
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String[]
private static final String
private static final String[]
private static final String
private static final String[]
private static final String
private final ManualTagger
Fields inherited from class org.languagetool.tagging.BaseTagger
locale, wordTagger
-
Constructor Summary
Constructors
GermanTagger()
-
Method Summary
Modifier and Type / Method / Description
private List<TaggedWord> addStem(List<TaggedWord> analyzedWordResults, String stem)
private static void fillAdjInfos(String word, String suffix, List<String> tagsForForm, Map<String, List<GermanTagger.AdjInfo>> adjInfos)
private List<AnalyzedToken> getAnalyzedTokens(List<TaggedWord> taggedWords, String word)
private List<AnalyzedToken> getAnalyzedTokens(List<TaggedWord> taggedWords, String word, List<String> compoundParts)
private List<AnalyzedToken> getImperativeForm(String word, List<String> sentenceTokens, int pos)
private AnalyzedToken getNoInfoToken(String word)
private List<AnalyzedToken> getSubstantivatedForms(String word, List<String> sentenceTokens)
private static GermanTagger.ExpansionInfos initExpansionInfos
(package private) boolean isWeiseException(String word)
lookup: Return only the first reading of the given word or null.
private boolean matchesUppercaseAdjective(String unknownUppercaseToken)
(package private) String prefixedVerbLastPart(String word)
private String sanitizeWord(String word)
tag: Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag).
Methods inherited from class org.languagetool.tagging.BaseTagger
additionalTags, asAnalyzedToken, asAnalyzedTokenList, asAnalyzedTokenListForTaggedWords, createNullToken, createToken, getAnalyzedTokens, getDictionary, getDictionaryPath, getManualAdditionsFileNames, getManualRemovalsFileNames, getWordTagger, overwriteWithManualTagger
-
Field Details
-
allAdjGruTags
-
mitarbeitendenPattern
-
genderGapChars
-
afterAsterisk
-
innenPattern1
-
anythingDash
-
innenPattern2
-
DDD_ER_PATTERN
-
nounTagExpansionExceptions
-
prefixesSeparableVerbs
-
prefixesSeparableVerbsRegexp
-
prefixesNonSeparableVerbs
-
prefixesNonSeparableVerbsRegexp
-
prefixesVerbs
-
prefixesVerbsRegexp
-
partizip2contains1PluPra
-
partizip2contains1PluPrt
-
postagsPartizipEndingE
-
postagsPartizipEndingEm
-
postagsPartizipEndingEn
-
postagsPartizipEndingEr
-
postagsPartizipEndingEs
-
notAVerb
-
tagsForWeise
-
removalTagger
-
expansionInfos
-
INSTANCE
-
-
Constructor Details
-
GermanTagger
public GermanTagger()
-
-
Method Details
-
initExpansionInfos
-
toPA2
-
fillAdjInfos
-
addStem
-
sanitizeWord
-
lookup
Return only the first reading of the given word or null.
- Throws:
IOException
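The "first reading or null" contract can be illustrated with a minimal, self-contained sketch. It does not use any LanguageTool classes: FirstReadingSketch, its toy dictionary, and the tag strings are hypothetical stand-ins chosen for illustration, and the real method may differ in its return type and lookup logic.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of LanguageTool.
final class FirstReadingSketch {

  // Toy dictionary: word -> readings in dictionary order (tag strings are illustrative).
  private static final Map<String, List<String>> READINGS = Map.of(
      "Haus", List.of("SUB:NOM:SIN:NEU", "SUB:AKK:SIN:NEU"),
      "und", List.of("KON"));

  // Returns only the first reading of the word, or null if the word is unknown.
  static String firstReading(String word) {
    List<String> readings = READINGS.get(word);
    return (readings == null || readings.isEmpty()) ? null : readings.get(0);
  }

  public static void main(String[] args) {
    System.out.println(firstReading("Haus"));   // a known word: its first reading
    System.out.println(firstReading("Hauss"));  // an unknown word: null
  }
}
```

Callers of a method with this contract need a null check before using the result, since an unknown word yields null rather than an empty reading.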
-
tag
-
matchesUppercaseAdjective
-
tag
Description copied from interface: Tagger
Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag).
Note that this method takes exactly one sentence. Implementations may handle the first word of a sentence as a special case, since it is usually written with an uppercase letter.
- Specified by:
tag in interface Tagger
- Overrides:
tag in class BaseTagger
- Parameters:
sentenceTokens - the text as returned by a WordTokenizer
- Throws:
IOException
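The two points in the description, several tags per token and a possible special case for the sentence-initial word, can be sketched as follows. This is a self-contained toy under stated assumptions, not the GermanTagger implementation: SentenceTagSketch, its dictionary, the tag names, and the lowercase fallback rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of LanguageTool.
final class SentenceTagSketch {

  // Toy dictionary; "das" carries two tags to show that a token may get more than one.
  private static final Map<String, List<String>> DICT = Map.of(
      "das", List.of("ART", "PRO"),
      "Haus", List.of("SUB"));

  // Tags exactly one sentence: one result entry per input token, possibly empty.
  static List<List<String>> tag(List<String> sentenceTokens) {
    List<List<String>> result = new ArrayList<>();
    boolean firstWord = true;
    for (String token : sentenceTokens) {
      List<String> tags = DICT.getOrDefault(token, List.of());
      if (tags.isEmpty() && firstWord) {
        // Assumed special case: retry the sentence-initial word in lowercase.
        tags = DICT.getOrDefault(token.toLowerCase(Locale.GERMAN), List.of());
      }
      result.add(tags);
      if (!token.isBlank()) {
        firstWord = false; // whitespace tokens do not end the sentence-initial position
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(tag(List.of("Das", " ", "Haus")));
  }
}
```

In this sketch "Das" is found only through the lowercase retry because it is the first word; the same token later in the sentence would stay untagged.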
-
tag
public List<AnalyzedTokenReadings> tag(List<String> sentenceTokens, boolean ignoreCase) throws IOException
- Throws:
IOException
-
prefixedVerbLastPart
-
isWeiseException
-
getImperativeForm
-
getSubstantivatedForms
-
getNoInfoToken
-
getAnalyzedTokens
-
getAnalyzedTokens
private List<AnalyzedToken> getAnalyzedTokens(List<TaggedWord> taggedWords, String word, List<String> compoundParts)
-