Class LanguageIdentifier
java.lang.Object
org.languagetool.language.identifier.LanguageIdentifier
- Direct Known Subclasses:
DefaultLanguageIdentifier, SimpleLanguageIdentifier
-
Nested Class Summary
Nested Classes
LanguageIdentifier.ParsedLanguageLists
-
Field Summary
Fields
protected static final CommonWordsDetector COMMON_WORDS_LANG_IDENTIFIER
protected static final int CONSIDER_ONLY_PREFERRED_THRESHOLD
private static final Pattern MAIL_REGEX
protected int maxLength
private static final Pattern MENTION
private static final Pattern NBSP_INVIS_SEPARATOR
NON_LATIN_CHARS_LANGUAGES
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_EMAIL_SIGNATURE_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_MENTION_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_NON_BREAKING_SPACES_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_URL_FILTER
protected static final float SCORE_THRESHOLD
private static final Pattern SIGNATURE
protected static final UnicodeBasedDetector UNICODE_BASED_LANG_IDENTIFIER
private static final Pattern URL_REGEX
-
Constructor Summary
Constructors
LanguageIdentifier(int maxLength)
-
Method Summary
cleanAndShortenText(String text)
abstract Language detectLanguage(String cleanText)
abstract DetectedLanguage detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp)
abstract DetectedLanguage detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp, boolean limitOnPreferredLangs)
abstract List<DetectedLanguage> getDetectedLanguageScores(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp, boolean limitOnPreferredLangs, int count)
getHighestScoringResult(Map<String, Double> probs)
getOrderedScores(Map<String, Double> scores, int count)
protected LanguageIdentifier.ParsedLanguageLists prepareDetectLanguage
-
Field Details
-
URL_REGEX
-
MAIL_REGEX
-
SIGNATURE
-
MENTION
-
NBSP_INVIS_SEPARATOR
-
SCORE_THRESHOLD
protected static final float SCORE_THRESHOLD
-
CONSIDER_ONLY_PREFERRED_THRESHOLD
protected static final int CONSIDER_ONLY_PREFERRED_THRESHOLD
-
NON_LATIN_CHARS_LANGUAGES
-
REMOVE_EMAIL_SIGNATURE_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_EMAIL_SIGNATURE_FILTER
-
REMOVE_MENTION_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_MENTION_FILTER
-
REMOVE_NON_BREAKING_SPACES_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_NON_BREAKING_SPACES_FILTER
-
REMOVE_URL_FILTER
protected static final com.optimaize.langdetect.text.TextFilter REMOVE_URL_FILTER
-
UNICODE_BASED_LANG_IDENTIFIER
-
COMMON_WORDS_LANG_IDENTIFIER
-
maxLength
protected int maxLength
-
-
Constructor Details
-
LanguageIdentifier
public LanguageIdentifier(int maxLength)
-
-
Method Details
-
detectLanguage
@Nullable public abstract DetectedLanguage detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp)
- Parameters:
cleanText - a cleanText as returned by cleanAndShortenText(String)
noopLangsTmp - list of codes that are detected but will lead to the NoopLanguage that has no rules
- Returns:
- language or null if language could not be identified
- Since:
- 4.4 (new parameter noopLangs, changed return type to DetectedLanguage)
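The nullable return value and the role of the noop list can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in: `Result` plays the part of DetectedLanguage, `Detector` mirrors the three-argument detectLanguage signature, and the toy detector recognises only two marker words and ignores the preferred list. It is not how any real subclass decides on a language.

```java
import java.util.List;
import java.util.Locale;

/** Illustrative stand-in only; none of these types belong to LanguageTool. */
final class DetectCallSketch {

    /** Hypothetical result: a language code plus a flag for "detected, but has no rules". */
    static final class Result {
        final String code;
        final boolean noop;

        Result(String code, boolean noop) {
            this.code = code;
            this.noop = noop;
        }
    }

    /** Hypothetical mirror of the three-argument detectLanguage; may return null. */
    interface Detector {
        Result detect(String cleanText, List<String> noopLangs, List<String> preferredLangs);
    }

    /** Toy detector: looks for one marker word per language and ignores preferredLangs. */
    static final Detector TOY = (cleanText, noopLangs, preferredLangs) -> {
        String lower = cleanText.toLowerCase(Locale.ROOT);
        String code = lower.contains(" the ") ? "en" : lower.contains(" der ") ? "de" : null;
        if (code == null) {
            return null; // language could not be identified
        }
        // Codes on the noop list are still detected, but flagged as having no rules.
        return new Result(code, noopLangs.contains(code));
    };

    /** Caller side: always check for null before using the result. */
    static String describe(Detector detector, String cleanText,
                           List<String> noopLangs, List<String> preferredLangs) {
        Result result = detector.detect(cleanText, noopLangs, preferredLangs);
        if (result == null) {
            return "unknown";
        }
        return result.noop ? result.code + " (no rules)" : result.code;
    }
}
```

A caller would pass text that has already been through cleanAndShortenText(String) and treat a null result as "no language identified" rather than as an error.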
-
detectLanguage
-
getDetectedLanguageScores
-
detectLanguage
- Parameters:
cleanText - a cleanText as returned by cleanAndShortenText(String)
- Returns:
- language or null if language could not be identified
- Since:
- 4.4 (new parameter noopLangs, changed return type to DetectedLanguage)
-
cleanAndShortenText
- Since:
- 5.8
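As a rough illustration of what a clean-and-shorten step might look like, the sketch below truncates the input to a maximum length and then strips URLs, e-mail addresses, mentions and non-breaking or invisible separators. The class name, the regular expressions, the order of the steps and the final whitespace collapsing are all assumptions made for the example; the actual values of URL_REGEX, MAIL_REGEX, MENTION and NBSP_INVIS_SEPARATOR are not shown on this page.

```java
import java.util.regex.Pattern;

/** Illustrative stand-in only; not the LanguageTool implementation. */
final class TextCleaningSketch {

    // Assumed patterns, chosen for the example rather than copied from the library.
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    private static final Pattern MAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");
    private static final Pattern MENTION = Pattern.compile("(?<!\\S)@\\w+");
    private static final Pattern NBSP = Pattern.compile("[\\u00A0\\u2063]");
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    private final int maxLength;

    TextCleaningSketch(int maxLength) {
        this.maxLength = maxLength;
    }

    String cleanAndShorten(String text) {
        // Truncate first so the filters never scan more than maxLength characters.
        String result = text.length() > maxLength ? text.substring(0, maxLength) : text;
        // Replace each noisy span with a space so neighbouring words stay separated.
        result = URL.matcher(result).replaceAll(" ");
        result = MAIL.matcher(result).replaceAll(" ");
        result = MENTION.matcher(result).replaceAll(" ");
        result = NBSP.matcher(result).replaceAll(" ");
        // Collapse the gaps left behind by the removed spans.
        return WHITESPACE_RUN.matcher(result).replaceAll(" ").trim();
    }
}
```

The output of such a step is what the detectLanguage methods expect as their cleanText argument.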
-
prepareDetectLanguage
-
getHighestScoringResult
-
getOrderedScores
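Going only by their signatures, getHighestScoringResult(Map<String, Double> probs) and getOrderedScores(Map<String, Double> scores, int count) pick the best entry and the top entries of a score map. A minimal sketch of that idea follows; the class name, method names and return types are assumptions, since this page does not show what the real methods return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only; return types are assumed, not taken from LanguageTool. */
final class ScoreOrderingSketch {

    /** Returns the entry with the largest score, or null if the map is empty. */
    static Map.Entry<String, Double> highest(Map<String, Double> probs) {
        Map.Entry<String, Double> best = null;
        for (Map.Entry<String, Double> entry : probs.entrySet()) {
            if (best == null || entry.getValue() > best.getValue()) {
                best = entry;
            }
        }
        return best;
    }

    /** Returns at most count entries, sorted by score from highest to lowest. */
    static List<Map.Entry<String, Double>> topScores(Map<String, Double> scores, int count) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        int limit = Math.max(0, Math.min(count, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }
}
```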
-