Package org.languagetool.rules.de
Class ProhibitedCompoundRule
java.lang.Object
org.languagetool.rules.Rule
org.languagetool.rules.de.ProhibitedCompoundRule
Find compounds that might be morphologically correct but are still probably wrong, like 'Lehrzeile'.
- Since: 4.1
-
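The idea behind the rule can be sketched in a few lines. This is a minimal illustration, not LanguageTool's implementation: the `ProhibitedCompoundSketch` class, its `Pair` type, the `suggest` helper, the `counts` map (standing in for a language model) and the `factor` parameter are all hypothetical, and the lehr/leer pair is only assumed as an example of a confusion pair.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not LanguageTool's actual implementation.
final class ProhibitedCompoundSketch {

  /** A confusion pair of compound parts (both lowercase), e.g. "lehr" vs. "leer". */
  static final class Pair {
    final String part1;
    final String part2;

    Pair(String part1, String part2) {
      this.part1 = part1;
      this.part2 = part2;
    }
  }

  /**
   * Swaps one part of the pair for the other and suggests the variant if it is at
   * least "factor" times as frequent as the original word. The counts map stands
   * in for a language model; both it and the factor are assumptions of this sketch.
   */
  static Optional<String> suggest(String word, Pair pair, Map<String, Long> counts, long factor) {
    String lower = word.toLowerCase(Locale.GERMAN);
    String variant;
    if (lower.contains(pair.part1)) {
      variant = lower.replace(pair.part1, pair.part2);
    } else if (lower.contains(pair.part2)) {
      variant = lower.replace(pair.part2, pair.part1);
    } else {
      return Optional.empty(); // neither part occurs in the word
    }
    long wordCount = counts.getOrDefault(lower, 0L);
    long variantCount = counts.getOrDefault(variant, 0L);
    if (variantCount > 0 && variantCount >= wordCount * factor) {
      return Optional.of(restoreCase(word, variant));
    }
    return Optional.empty();
  }

  /** Re-capitalizes the variant if the original word started with an uppercase letter. */
  private static String restoreCase(String original, String variant) {
    if (!original.isEmpty() && !variant.isEmpty() && Character.isUpperCase(original.charAt(0))) {
      return Character.toUpperCase(variant.charAt(0)) + variant.substring(1);
    }
    return variant;
  }
}
```

With made-up counts in which "leerzeile" is far more frequent than "lehrzeile", `suggest("Lehrzeile", new Pair("lehr", "leer"), counts, 10)` would propose the more common spelling, while the reverse direction yields no suggestion.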
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
- static class
- (package private) static class
-
Field Summary
Fields
Modifier and Type, Field, Description
- protected com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<String> ahoCorasickDoubleArrayTrie
- private ProhibitedCompoundRule.Pair confusionPair
- private static final Pattern
- private final Language language
- private static LinguServices linguServices
- private final BaseLanguageModel lm
- private static final List<ProhibitedCompoundRule.Pair> lowercasePairs
- protected Map<String, List<ProhibitedCompoundRule.Pair>> pairMap
- private static final Map<String, List<ProhibitedCompoundRule.Pair>> prohibitedCompoundRulePairMap
- private static final com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<String> prohibitedCompoundRuleSearcher
- static final String RULE_ID: Deprecated. Each pair has its own id since LT 5.1.
-
Constructor Summary
Constructors
Constructor, Description
- ProhibitedCompoundRule(ResourceBundle messages, LanguageModel lm, UserConfig userConfig, Language language)
-
Method Summary
Modifier and Type, Method, Description
- private static void addAllCaseVariants(List<ProhibitedCompoundRule.Pair> candidatePairs, ProhibitedCompoundRule.Pair lcPair)
- protected static void addItemsFromConfusionSets(List<ProhibitedCompoundRule.Pair> pairs, String confusionSetsFile, boolean isUpperCase)
- private static void addUpperCaseVariants
- String getDescription(): A short description of the error this rule can detect, usually in the language of the text that is checked.
- String getId(): A string used to identify the rule in e.g. configuration files.
- private int getMatches(AnalyzedSentence sentence, List<RuleMatch> ruleMatches, AnalyzedTokenReadings readings, int partsStartPos, String wordPart, int toPosCorrection)
- (package private) int getThreshold()
- private static boolean isMisspelled(String word)
- RuleMatch[] match(AnalyzedSentence sentence): Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
- (package private) String removeHyphensAndAdaptCase
- void setConfusionPair(ProhibitedCompoundRule.Pair confusionPair): Ignores automatically loaded pairs and matches using only the given confusionPair; used for evaluation by ProhibitedCompoundRuleEvaluator.
- protected static com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<String> setupAhoCorasickSearch(List<ProhibitedCompoundRule.Pair> pairs, Map<String, List<ProhibitedCompoundRule.Pair>> pairMap)
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, addTags, addToneTags, cacheAntiPatterns, estimateContextForSureMatch, getAntiPatterns, getCategory, getCorrectExamples, getDistanceTokens, getErrorTriggeringExamples, getFullId, getIncorrectExamples, getLocQualityIssueType, getMinPrevMatches, getPriority, getRuleOptions, getSentenceWithImmunization, getSourceFile, getSubId, getTags, getToneTags, getUrl, hasTag, hasToneTag, isDefaultOff, isDefaultTempOff, isDictionaryBasedSpellingRule, isGoalSpecific, isIncludedInHiddenMatches, isOfficeDefaultOff, isOfficeDefaultOn, isPremium, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setDistanceTokens, setErrorTriggeringExamples, setExamplePair, setGoalSpecific, setIncludedInHiddenMatches, setIncorrectExamples, setLocQualityIssueType, setMinPrevMatches, setOfficeDefaultOff, setOfficeDefaultOn, setPremium, setPriority, setTags, setToneTags, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
Field Details
-
RULE_ID
Deprecated. Each pair has its own id since LT 5.1.
- Since: 4.3
-
lowercasePairs
-
HERRN_FRAU
-
ignoreWords
-
blacklistRegex
-
linguServices
-
cache
-
ahoCorasickDoubleArrayTrie
-
pairMap
-
prohibitedCompoundRuleSearcher
private static final com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<String> prohibitedCompoundRuleSearcher
-
prohibitedCompoundRulePairMap
-
lm
-
language
-
confusionPair
-
-
Constructor Details
-
ProhibitedCompoundRule
public ProhibitedCompoundRule(ResourceBundle messages, LanguageModel lm, UserConfig userConfig, Language language)
-
-
-
Method Details
-
addAllCaseVariants
private static void addAllCaseVariants(List<ProhibitedCompoundRule.Pair> candidatePairs, ProhibitedCompoundRule.Pair lcPair)
-
addUpperCaseVariants
-
addItemsFromConfusionSets
protected static void addItemsFromConfusionSets(List<ProhibitedCompoundRule.Pair> pairs, String confusionSetsFile, boolean isUpperCase)
-
setupAhoCorasickSearch
protected static com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<String> setupAhoCorasickSearch(List<ProhibitedCompoundRule.Pair> pairs, Map<String, List<ProhibitedCompoundRule.Pair>> pairMap)
-
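The signature of setupAhoCorasickSearch suggests that pairs are indexed by their parts in a `Map<String, List<Pair>>` and then searched for inside words. The sketch below illustrates that lookup idea only. It is an assumption about the intent, not the method's actual body: `PairLookupSketch`, `buildPairMap` and `findPairs` are hypothetical names, and a naive `String.contains` scan stands in for the Aho-Corasick trie from `com.hankcs.algorithm`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; a naive scan replaces the Aho-Corasick trie used by the real class.
final class PairLookupSketch {

  /** A confusion pair of compound parts (both lowercase). */
  static final class Pair {
    final String part1;
    final String part2;

    Pair(String part1, String part2) {
      this.part1 = part1;
      this.part2 = part2;
    }
  }

  /** Indexes every pair under both of its parts (an assumption of this sketch). */
  static Map<String, List<Pair>> buildPairMap(List<Pair> pairs) {
    Map<String, List<Pair>> pairMap = new HashMap<>();
    for (Pair pair : pairs) {
      pairMap.computeIfAbsent(pair.part1, k -> new ArrayList<>()).add(pair);
      pairMap.computeIfAbsent(pair.part2, k -> new ArrayList<>()).add(pair);
    }
    return pairMap;
  }

  /** Returns all pairs that have a part occurring somewhere in the given lowercase word. */
  static List<Pair> findPairs(String word, Map<String, List<Pair>> pairMap) {
    List<Pair> hits = new ArrayList<>();
    for (Map.Entry<String, List<Pair>> entry : pairMap.entrySet()) {
      if (word.contains(entry.getKey())) {
        hits.addAll(entry.getValue());
      }
    }
    return hits;
  }
}
```

The naive scan checks every key against every word, so its cost grows with the number of pairs. An Aho-Corasick automaton finds all keys in a single pass over the word, which is presumably why the class builds a trie up front.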
getId
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
-
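The character constraint described for getId can be checked with a small regular expression. This is a sketch only: `RuleIdCheck` and `isValidRuleId` are hypothetical names, and the requirement that an ID be non-empty is an assumption added here.

```java
import java.util.regex.Pattern;

// Hypothetical helper; LanguageTool itself is not known to ship such a check.
final class RuleIdCheck {

  // One or more characters, each of which is A-Z or an underscore.
  private static final Pattern VALID_ID = Pattern.compile("[A-Z_]+");

  /** Returns true if the ID is non-null and consists only of A-Z and underscores. */
  static boolean isValidRuleId(String id) {
    return id != null && VALID_ID.matcher(id).matches();
  }
}
```

An ID such as `SOME_RULE_ID` passes this check, while lowercase letters or hyphens cause it to fail.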
getDescription
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
- Specified by: getDescription in class Rule
-
match
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not guaranteed: the sentence order in the text may differ from the order in which you get the sentences (for example, when LanguageTool is used as a LibreOffice/OpenOffice add-on). Implementations must therefore be stateless, so that a previous call to this method has no influence on later calls.
- Specified by: match in class Rule
- Parameters: sentence - a pre-analyzed sentence
- Returns: an array of RuleMatch objects
- Throws: IOException
-
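The statelessness requirement on match can be illustrated with a simplified matcher whose only field is immutable configuration and whose working data lives in local variables. The types here are hypothetical stand-ins: `StatelessMatchSketch` takes a plain `String` instead of an `AnalyzedSentence`, and `Match` replaces `RuleMatch`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; uses plain strings instead of LanguageTool's sentence and match types.
final class StatelessMatchSketch {

  /** Simplified stand-in for a rule match: a character span within the sentence. */
  static final class Match {
    final int fromPos;
    final int toPos;

    Match(int fromPos, int toPos) {
      this.fromPos = fromPos;
      this.toPos = toPos;
    }
  }

  // Immutable configuration, set once in the constructor and never modified.
  private final Set<String> prohibitedWords;

  StatelessMatchSketch(Set<String> prohibitedWords) {
    this.prohibitedWords = Set.copyOf(prohibitedWords);
  }

  /** Depends only on the argument and the immutable configuration, so call order is irrelevant. */
  List<Match> match(String sentence) {
    List<Match> matches = new ArrayList<>(); // local, so nothing leaks between calls
    int pos = 0;
    for (String token : sentence.split(" ")) {
      if (prohibitedWords.contains(token)) {
        matches.add(new Match(pos, pos + token.length()));
      }
      pos += token.length() + 1; // +1 for the single space separator
    }
    return matches;
  }
}
```

Because nothing is written to instance fields during a call, checking sentences in any order, or checking the same sentence twice, gives the same result.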
isMisspelled
-
getMatches
private int getMatches(AnalyzedSentence sentence, List<RuleMatch> ruleMatches, AnalyzedTokenReadings readings, int partsStartPos, String wordPart, int toPosCorrection)
-
getThreshold
int getThreshold()
-
setConfusionPair
Ignores automatically loaded pairs and matches using only the given confusionPair. Used for evaluation by ProhibitedCompoundRuleEvaluator.
- Parameters: confusionPair - pair to evaluate; parts are assumed to be lowercase. Pass null to reset.
-
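The override-and-reset behavior described for setConfusionPair can be sketched as follows. This is an illustration under assumptions, not the class's real internals: `ConfusionPairOverrideSketch` and the `activePairs` helper are hypothetical, and only the "non-null overrides, null resets" contract is taken from the description above.

```java
import java.util.List;

// Hypothetical sketch of the override/reset contract; not the actual implementation.
final class ConfusionPairOverrideSketch {

  /** A confusion pair of compound parts (assumed lowercase). */
  static final class Pair {
    final String part1;
    final String part2;

    Pair(String part1, String part2) {
      this.part1 = part1;
      this.part2 = part2;
    }
  }

  private final List<Pair> loadedPairs; // pairs loaded automatically, e.g. from a file
  private Pair confusionPair;           // null means: no override, use loadedPairs

  ConfusionPairOverrideSketch(List<Pair> loadedPairs) {
    this.loadedPairs = List.copyOf(loadedPairs);
  }

  /** Sets the single pair to evaluate, or resets to the loaded pairs when given null. */
  void setConfusionPair(Pair confusionPair) {
    this.confusionPair = confusionPair;
  }

  /** Hypothetical helper: the pairs that matching would currently consider. */
  List<Pair> activePairs() {
    return confusionPair != null ? List.of(confusionPair) : loadedPairs;
  }
}
```

An evaluator could set one pair, run its measurements, and then call `setConfusionPair(null)` to restore the default behavior.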
removeHyphensAndAdaptCase
-