Package org.languagetool.rules.spelling
Class SpellingCheckRule
java.lang.Object
org.languagetool.rules.Rule
org.languagetool.rules.spelling.SpellingCheckRule
- Direct Known Subclasses:
HunspellRule, MorfologikSpellerRule, SymSpellRule, VagueSpellChecker.NonThreadSafeSpellRule
An abstract rule for spellchecking rules.
-
Field Summary
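Fields (Modifier and Type, Field, Description):
- private List<DisambiguationPatternRule> antiPatterns
- private boolean considerIgnoreWords
- private boolean convertsCase
- protected static final String CUSTOM_SPELLING_FILE
- private static final String CUSTOM_SPELLING_PROHIBIT_FILE
- static final String GLOBAL_SPELLING_FILE
- static final float HIGH_CONFIDENCE: The confidence value for a suggestion with high confidence.
- protected int ignoreWordsWithLength
- protected final Language language
- protected LanguageModel languageModel
- static final String LANGUAGETOOL: The string LanguageTool.
- static final String LANGUAGETOOLER: The string LanguageTooler.
- static final int MAX_TOKEN_LENGTH
- private static final Pattern pHasNoLetter
- private static final Pattern pHasNoLetterLatin
- private static final String SPELLING_FILE
- private static final String SPELLING_FILE_VARIANT
- private static final String SPELLING_IGNORE_FILE
- private static final String SPELLING_PROHIBIT_FILE
- protected final CachingWordListLoader wordListLoader
- wordsToBeIgnored
- private String[] wordsToBeIgnoredDictionary
- private String[] wordsToBeIgnoredDictionaryIgnoreCase
- wordsToBeProhibited
-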
Constructor Summary
Constructors (Constructor, Description):
- SpellingCheckRule(ResourceBundle messages, Language language, UserConfig userConfig)
- SpellingCheckRule(ResourceBundle messages, Language language, UserConfig userConfig, List<Language> altLanguages)
- SpellingCheckRule(ResourceBundle messages, Language language, UserConfig userConfig, List<Language> altLanguages, LanguageModel languageModel)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void acceptPhrases(List<String> phrases): Accept (case-sensitively, unless at the start of a sentence) the given phrases even though they are not in the built-in dictionary.
- void addIgnoreTokens(List<String> tokens): Add the given words to the list of words to be ignored during spell check.
- protected void addIgnoreWords(String line)
- protected void addProhibitedWords(List<String> words)
- protected static void addSuggestionsToRuleMatch(String word, List<SuggestedReplacement> userCandidatesList, List<SuggestedReplacement> candidatesList, SuggestionsOrderer orderer, RuleMatch match)
- protected RuleMatch createWrongSplitMatch(AnalyzedSentence sentence, List<RuleMatch> ruleMatchesSoFar, int pos, String coveredWord, String suggestion1, String suggestion2, int prevPos)
- expandLine(String line): Expand suffixes in a line.
- protected static <T> List<T> filterDupes(List<T> words)
- protected List<SuggestedReplacement> filterNoSuggestWords
- protected List<SuggestedReplacement> filterSuggestions(List<SuggestedReplacement> suggestions): Remove prohibited words from suggestions.
- getAdditionalProhibitFileNames: Get the name of the prohibit file, which lists words not to be accepted, even when the spell checker would accept them.
- getAdditionalSpellingFileNames: Get the name of additional spelling file, which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- protected List<SuggestedReplacement> getAdditionalSuggestions(List<SuggestedReplacement> suggestions, String word): Get additional suggestions added after other suggestions (note the rule may choose to re-order the suggestions anyway).
- protected List<SuggestedReplacement> getAdditionalTopSuggestions(List<SuggestedReplacement> suggestions, String word): Get additional suggestions added before other suggestions (note the rule may choose to re-order the suggestions anyway).
- getAntiPatterns: Override this to avoid false alarms by ignoring these patterns. Note that your Rule.match(AnalyzedSentence) method needs to call Rule.getSentenceWithImmunization(org.languagetool.AnalyzedSentence) for this to be used, and you need to check AnalyzedTokenReadings.isImmunized().
- abstract String getDescription: A short description of the error this rule can detect, usually in the language of the text that is checked.
- abstract String getId(): A string used to identify the rule in e.g. configuration files.
- protected String getIgnoreFileName: Get the name of the ignore file, which lists words to be accepted, even when the spell checker would not accept them.
- getLanguageVariantSpellingFileName: Get the name of the spelling file for a language variant (e.g., en-US or de-AT), which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- protected List<SuggestedReplacement> getOnlySuggestions(String word): Get suggestions that will replace all other suggestions.
- protected String getProhibitFileName: Get the name of the prohibit file, which lists words not to be accepted, even when the spell checker would accept them.
- getSpellingFileName(): Get the name of the spelling file, which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- private static List<PatternToken> getTokensForSentenceStart(String[] parts)
- protected boolean ignorePotentiallyMisspelledWord(String): Like ignoreWord(String), but will only be called after the standard spell check has run and considered this word to be incorrect.
- protected boolean ignoreToken(AnalyzedTokenReadings[] tokens, int idx): Returns true iff the token at the given position should be ignored by the spell checker.
- protected boolean ignoreWord(String word): Returns true iff the word should be ignored by the spell checker.
- protected boolean ignoreWord(List<String> words, int idx): Returns true iff the word at the given position should be ignored by the spell checker.
- protected void init()
- boolean isDictionaryBasedSpellingRule(): Whether this is a spelling rule that uses a dictionary.
- protected static boolean isEMail
- protected boolean isIgnoredNoCase(String word)
- protected boolean isInIgnoredSet(String word)
- protected boolean isLatinScript()
- abstract boolean isMisspelled(String word)
- protected boolean isProhibited(String word): Whether the word is prohibited, i.e. whether it should be marked as a spelling error even if the spell checker would accept it.
- private boolean isProperNoun(String wordWithoutS)
- protected static boolean isUrl
- abstract RuleMatch[] match(AnalyzedSentence sentence): Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule.
- void setConsiderIgnoreWords(boolean considerIgnoreWords): Set whether the list of words to be explicitly ignored (set with addIgnoreTokens(List)) is considered at all.
- void setConvertsCase(boolean convertsCase): Used to determine whether the dictionary will use case conversions for spell checking.
- protected int startsWithIgnoredWord(String word, boolean caseSensitive): Checks whether a word starts with an ignored word.
- protected boolean tokenizeNewWords()
- private void updateIgnoredWordDictionary()
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, addTags, addToneTags, cacheAntiPatterns, estimateContextForSureMatch, getCategory, getCorrectExamples, getDistanceTokens, getErrorTriggeringExamples, getFullId, getIncorrectExamples, getLocQualityIssueType, getMinPrevMatches, getPriority, getRuleOptions, getSentenceWithImmunization, getSourceFile, getSubId, getTags, getToneTags, getUrl, hasTag, hasToneTag, isDefaultOff, isDefaultTempOff, isGoalSpecific, isIncludedInHiddenMatches, isOfficeDefaultOff, isOfficeDefaultOn, isPremium, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setDistanceTokens, setErrorTriggeringExamples, setExamplePair, setGoalSpecific, setIncludedInHiddenMatches, setIncorrectExamples, setLocQualityIssueType, setMinPrevMatches, setOfficeDefaultOff, setOfficeDefaultOn, setPremium, setPriority, setTags, setToneTags, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
Field Details
-
HIGH_CONFIDENCE
public static final float HIGH_CONFIDENCE
The confidence value for a suggestion with high confidence. Not 1.0, as even with a high confidence, we might still be wrong.
-
LANGUAGETOOL
The string LanguageTool.
- Since:
- 2.3
-
LANGUAGETOOLER
The string LanguageTooler.
- Since:
- 4.4
-
MAX_TOKEN_LENGTH
public static final int MAX_TOKEN_LENGTH
-
language
-
languageModel
For rules from Language.getRelevantLanguageModelCapableRules. Optional; when set, it allows e.g. better suggestions.
- Since:
- 4.5
-
wordListLoader
-
SPELLING_IGNORE_FILE
-
SPELLING_FILE
-
CUSTOM_SPELLING_FILE
-
GLOBAL_SPELLING_FILE
-
SPELLING_PROHIBIT_FILE
-
CUSTOM_SPELLING_PROHIBIT_FILE
-
SPELLING_FILE_VARIANT
-
wordsToBeProhibited
-
wordsToBeIgnoredDictionary
-
wordsToBeIgnoredDictionaryIgnoreCase
-
antiPatterns
-
considerIgnoreWords
private boolean considerIgnoreWords
-
convertsCase
private boolean convertsCase
-
wordsToBeIgnored
-
ignoreWordsWithLength
protected int ignoreWordsWithLength
-
pHasNoLetterLatin
-
pHasNoLetter
-
-
Constructor Details
-
SpellingCheckRule
-
SpellingCheckRule
public SpellingCheckRule(ResourceBundle messages, Language language, UserConfig userConfig, List<Language> altLanguages)
- Since:
- 4.4
-
SpellingCheckRule
public SpellingCheckRule(ResourceBundle messages, Language language, UserConfig userConfig, List<Language> altLanguages, @Nullable LanguageModel languageModel)
- Since:
- 4.5
-
-
Method Details
-
addSuggestionsToRuleMatch
protected static void addSuggestionsToRuleMatch(String word, List<SuggestedReplacement> userCandidatesList, List<SuggestedReplacement> candidatesList, @Nullable SuggestionsOrderer orderer, RuleMatch match)
- Parameters:
word - misspelled word that suggestions should be generated for
userCandidatesList - candidates from personal dictionary
candidatesList - candidates from default dictionary
orderer - model to rank suggestions / extract features, or null
match - rule match to add suggestions to
-
createWrongSplitMatch
-
getId
Description copied from class: Rule
A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
-
getDescription
Description copied from class: Rule
A short description of the error this rule can detect, usually in the language of the text that is checked.
- Specified by:
getDescription in class Rule
-
match
Description copied from class: Rule
Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not always guaranteed, i.e. the sentence order in the text may be different from the order in which you get the sentences (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example). In other words, implementations must be stateless, so that a previous call to this method has no influence on later calls.
- Specified by:
match in class Rule
- Parameters:
sentence - a pre-analyzed sentence
- Returns:
- an array of RuleMatch objects
- Throws:
IOException
-
isMisspelled
- Throws:
IOException
- Since:
- 4.8
-
isDictionaryBasedSpellingRule
public boolean isDictionaryBasedSpellingRule()
Description copied from class: Rule
Whether this is a spelling rule that uses a dictionary. Rules that return true here are basically rules that work like a simple hunspell-like spellchecker: they check words without considering the words' context.
- Overrides:
isDictionaryBasedSpellingRule in class Rule
-
addIgnoreTokens
Add the given words to the list of words to be ignored during spell check. You might want to use acceptPhrases(List) instead, as only that can also deal with phrases.
-
updateIgnoredWordDictionary
private void updateIgnoredWordDictionary()
-
setConsiderIgnoreWords
public void setConsiderIgnoreWords(boolean considerIgnoreWords)
Set whether the list of words to be explicitly ignored (set with addIgnoreTokens(List)) is considered at all.
-
getAdditionalTopSuggestions
protected List<SuggestedReplacement> getAdditionalTopSuggestions(List<SuggestedReplacement> suggestions, String word) throws IOException
Get additional suggestions added before other suggestions (note the rule may choose to re-order the suggestions anyway). Only add suggestions here that you know are spelled correctly; they will not be checked again before being shown to the user.
- Throws:
IOException
-
getOnlySuggestions
Get suggestions that will replace all other suggestions. Only add suggestions here that you know are spelled correctly; they will not be checked again before being shown to the user.
-
getAdditionalSuggestions
protected List<SuggestedReplacement> getAdditionalSuggestions(List<SuggestedReplacement> suggestions, String word)
Get additional suggestions added after other suggestions (note the rule may choose to re-order the suggestions anyway).
-
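The three suggestion hooks (getOnlySuggestions, getAdditionalTopSuggestions, getAdditionalSuggestions) differ only in where their results end up. The following is a minimal, self-contained sketch of that ordering. It is not LanguageTool code: the class name, the combine method, and the use of plain String instead of SuggestedReplacement are assumptions made for illustration, and a real rule may re-order the result afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LanguageTool.
class SuggestionOrderSketch {

  /**
   * Combines suggestion lists in the order described by the hooks:
   * "only" suggestions replace everything else; otherwise "top"
   * suggestions come first, then the speller's own, then "additional".
   */
  static List<String> combine(List<String> only, List<String> top,
                              List<String> fromSpeller, List<String> additional) {
    if (!only.isEmpty()) {
      return new ArrayList<>(only);
    }
    List<String> result = new ArrayList<>(top);
    result.addAll(fromSpeller);
    result.addAll(additional);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(combine(List.of(), List.of("LanguageTool"),
        List.of("language", "languages"), List.of("lingua")));
    System.out.println(combine(List.of("LanguageTool"), List.of("x"),
        List.of("y"), List.of("z")));
  }
}
```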
ignoreToken
Returns true iff the token at the given position should be ignored by the spell checker. Use ignorePotentiallyMisspelledWord(String) if the check you want to implement is slightly computationally expensive.
- Throws:
IOException
-
ignoreWord
Returns true iff the word should be ignored by the spell checker. If possible, use ignoreToken(AnalyzedTokenReadings[], int) instead.
- Throws:
IOException
-
isInIgnoredSet
-
isIgnoredNoCase
-
ignoreWord
Returns true iff the word at the given position should be ignored by the spell checker. If possible, use ignoreToken(AnalyzedTokenReadings[], int) instead.
- Throws:
IOException
- Since:
- 2.6
-
ignorePotentiallyMisspelledWord
Like ignoreWord(String), but will only be called after the standard spell check has run and considered this word to be incorrect. This way, tests run here can be a bit more computationally expensive.
- Throws:
IOException
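The split between ignoreWord(String) and ignorePotentiallyMisspelledWord(String) is a two-stage filter: the cheap check runs for every word, the costly one only for words the dictionary has already rejected. The sketch below models that flow in a self-contained way. The class, the toy dictionary, the acronym heuristic, and the counter are all assumptions for illustration and do not reflect how any LanguageTool subclass actually implements these hooks.

```java
import java.util.Set;

// Hypothetical model of the two-stage ignore logic, not LanguageTool code.
class TwoStageIgnoreSketch {

  static final Set<String> DICTIONARY = Set.of("hello", "world");

  // Counts how often the costly check actually runs.
  static int expensiveChecks = 0;

  // Cheap check, consulted for every word before the dictionary lookup.
  static boolean ignoreWord(String word) {
    return word.chars().allMatch(Character::isDigit);
  }

  // Costlier check, consulted only for words the dictionary rejected.
  static boolean ignorePotentiallyMisspelledWord(String word) {
    expensiveChecks++;
    return word.matches("[A-Z]{2,}s?"); // assumed heuristic: acronyms such as "APIs"
  }

  static boolean isMisspelled(String word) {
    return !DICTIONARY.contains(word);
  }

  // True if the word would be reported as a spelling error.
  static boolean reportsError(String word) {
    if (ignoreWord(word)) {
      return false;
    }
    if (!isMisspelled(word)) {
      return false;
    }
    return !ignorePotentiallyMisspelledWord(word);
  }

  public static void main(String[] args) {
    for (String w : new String[] {"hello", "2024", "APIs", "wrold"}) {
      System.out.println(w + " -> error: " + reportsError(w));
    }
    System.out.println("expensive checks run: " + expensiveChecks);
  }
}
```

Words that are in the dictionary or pass the cheap check never reach the costly check, which is the reason the documentation allows that method to be more expensive.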
-
setConvertsCase
public void setConvertsCase(boolean convertsCase)
Used to determine whether the dictionary will use case conversions for spell checking.
- Parameters:
convertsCase - if true, then conversions are used.
- Since:
- 2.5
-
isUrl
-
isEMail
-
filterDupes
-
init
- Throws:
IOException
-
getIgnoreFileName
Get the name of the ignore file, which lists words to be accepted, even when the spell checker would not accept them. Unlike with getSpellingFileName(), the words in this file will not be used for creating suggestions for misspelled words.
- Since:
- 2.7
-
getSpellingFileName
Get the name of the spelling file, which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 2.9, public since 3.5
-
getAdditionalSpellingFileNames
Get the names of additional spelling files, which list words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 4.8
-
getLanguageVariantSpellingFileName
Get the name of the spelling file for a language variant (e.g., en-US or de-AT), which lists words to be accepted and used for suggestions, even when the spell checker would not accept them.
- Since:
- 4.3
-
getProhibitFileName
Get the name of the prohibit file, which lists words not to be accepted, even when the spell checker would accept them.
- Since:
- 2.8
-
getAdditionalProhibitFileNames
Get the names of additional prohibit files, which list words not to be accepted, even when the spell checker would accept them.
- Since:
- 2.8
-
isProhibited
Whether the word is prohibited, i.e. whether it should be marked as a spelling error even if the spell checker would accept it. (This is useful to improve our spell checker without waiting for the upstream checker to be updated.)
- Since:
- 2.8
-
filterSuggestions
Remove prohibited words from suggestions.
- Since:
- 2.8
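To make the relationship between isProhibited and filterSuggestions concrete, here is a small self-contained sketch. It is an illustration under assumptions, not the library's implementation: the class name and the prohibited set are invented, and plain String stands in for SuggestedReplacement.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model, not LanguageTool code.
class ProhibitSketch {

  // Assumed content of a prohibit file: words to flag even if the speller accepts them.
  static final Set<String> PROHIBITED = Set.of("teh", "alot");

  static boolean isProhibited(String word) {
    return PROHIBITED.contains(word);
  }

  // Drops prohibited words so they are never offered as corrections.
  static List<String> filterSuggestions(List<String> suggestions) {
    return suggestions.stream()
        .filter(s -> !isProhibited(s))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(filterSuggestions(List.of("the", "teh", "then")));
  }
}
```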
-
filterNoSuggestWords
-
isProperNoun
-
addIgnoreWords
- Parameters:
line - the line as read from spelling.txt.
- Since:
- 2.9, signature modified in 3.9
-
addProhibitedWords
- Parameters:
words- list of words to be prohibited.- Since:
- 4.2
-
expandLine
Expand suffixes in a line. By default, the line is not expanded. Implementations might e.g. turn bicycle/S into [bicycle, bicycles].
- Since:
- 3.0
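As an illustration of what an expanding implementation might look like, the sketch below handles a single flag. It is a standalone toy: the class name is invented, and treating "/S" as "also accept the form with a trailing s" is an assumption taken only from the bicycle/S example, not from any actual affix rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a subclass overriding expandLine; not LanguageTool code.
class ExpandLineSketch {

  /** Expands "stem/S" to [stem, stems]; any line without a slash is returned unchanged. */
  static List<String> expandLine(String line) {
    List<String> forms = new ArrayList<>();
    int slash = line.indexOf('/');
    if (slash < 0) {
      forms.add(line); // mirrors the documented default: no expansion
      return forms;
    }
    String stem = line.substring(0, slash);
    String flags = line.substring(slash + 1);
    forms.add(stem);
    if (flags.contains("S")) {
      forms.add(stem + "s"); // assumed meaning of the S flag
    }
    return forms;
  }

  public static void main(String[] args) {
    System.out.println(expandLine("bicycle/S"));
    System.out.println(expandLine("bicycle"));
  }
}
```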
-
acceptPhrases
Accept (case-sensitively, unless at the start of a sentence) the given phrases even though they are not in the built-in dictionary. Use this to avoid false alarms on e.g. names and technical terms. Unlike addIgnoreTokens(List), this can deal with phrases. One way to call it:
rule.acceptPhrases(Arrays.asList("duodenal atresia"))
This way, checking would not create an error for "duodenal atresia", but it would still create an error for "duodenal" or "atresia" if they appear on their own.
- Since:
- 3.3
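The difference between accepting a phrase and ignoring its individual words can be shown with a toy model. Everything in the sketch below is an assumption for illustration (class name, toy dictionary, whitespace tokenization), and it deliberately leaves out the sentence-start case handling mentioned above. It does not use or reproduce LanguageTool's pattern-based implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of phrase acceptance; not LanguageTool code.
class AcceptPhrasesSketch {

  private final Set<String> dictionary = Set.of("the", "patient", "has");
  private final List<List<String>> accepted = new ArrayList<>();

  void acceptPhrases(List<String> phrases) {
    for (String phrase : phrases) {
      accepted.add(Arrays.asList(phrase.split(" ")));
    }
  }

  /** Returns the tokens that are neither in the dictionary nor covered by an accepted phrase. */
  List<String> misspelled(List<String> tokens) {
    boolean[] covered = new boolean[tokens.size()];
    for (List<String> phrase : accepted) {
      for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
        if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
          Arrays.fill(covered, i, i + phrase.size(), true);
        }
      }
    }
    List<String> errors = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      if (!covered[i] && !dictionary.contains(tokens.get(i))) {
        errors.add(tokens.get(i));
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    AcceptPhrasesSketch rule = new AcceptPhrasesSketch();
    rule.acceptPhrases(Arrays.asList("duodenal atresia"));
    System.out.println(rule.misspelled(List.of("the", "patient", "has", "duodenal", "atresia")));
    System.out.println(rule.misspelled(List.of("the", "duodenal", "patient")));
  }
}
```

In this model the full phrase is accepted while "duodenal" on its own is still reported, matching the behavior described for acceptPhrases.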
-
getTokensForSentenceStart
-
getAntiPatterns
Description copied from class: Rule
Override this to avoid false alarms by ignoring these patterns. Note that your Rule.match(AnalyzedSentence) method needs to call Rule.getSentenceWithImmunization(org.languagetool.AnalyzedSentence) for this to be used, and you need to check AnalyzedTokenReadings.isImmunized().
- Overrides:
getAntiPatterns in class Rule
-
startsWithIgnoredWord
Checks whether a word starts with an ignored word. Note that a minimum word length of 4 characters is expected. (This is for better performance. Moreover, such short words are most likely contained in the dictionary.)
- Parameters:
word - entire word
caseSensitive - determines whether the check is case-sensitive
- Returns:
- length of the ignored word (i.e., the return value is 0 if the word does not start with an ignored word). If there are several matches from the set of ignored words, the length of the longest matching word is returned.
- Since:
- 3.5
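The documented contract (longest matching ignored prefix, 0 if none) can be sketched as a simple linear scan. This is an illustration only: the class name and the ignored-word set are invented, applying the 4-character minimum to the input word is one possible reading of the note above, and the real method may well use a faster lookup structure.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the prefix check; not LanguageTool code.
class IgnoredPrefixSketch {

  // Assumed set of ignored words.
  static final Set<String> IGNORED = Set.of("Language", "LanguageTool", "Hun");

  /** Returns the length of the longest ignored word that the given word starts with, or 0. */
  static int startsWithIgnoredWord(String word, boolean caseSensitive) {
    if (word.length() < 4) {
      return 0; // assumption: shorter words are not examined at all
    }
    String needle = caseSensitive ? word : word.toLowerCase(Locale.ROOT);
    int longest = 0;
    for (String ignored : IGNORED) {
      String candidate = caseSensitive ? ignored : ignored.toLowerCase(Locale.ROOT);
      if (needle.startsWith(candidate) && candidate.length() > longest) {
        longest = candidate.length();
      }
    }
    return longest;
  }

  public static void main(String[] args) {
    System.out.println(startsWithIgnoredWord("LanguageToolers", true));  // 12
    System.out.println(startsWithIgnoredWord("languagetoolers", true));  // 0
    System.out.println(startsWithIgnoredWord("languagetoolers", false)); // 12
  }
}
```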
-
tokenizeNewWords
protected boolean tokenizeNewWords() -
isLatinScript
protected boolean isLatinScript()
-