Package org.languagetool.languagemodel
Class LuceneSingleIndexLanguageModel
java.lang.Object
org.languagetool.languagemodel.BaseLanguageModel
org.languagetool.languagemodel.LuceneSingleIndexLanguageModel
- All Implemented Interfaces:
AutoCloseable, LanguageModel
Information about ngram occurrences, taken from Lucene indexes (one index per ngram level).
This is not a real language model as it only returns information
about occurrence counts but has no probability calculation, especially
not for the case with 0 occurrences.
- Since:
- 3.2
-
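The distinction between occurrence counts and probabilities can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical and not LanguageTool code: the class `CountOnlySketch`, its `add` method, and the add-one smoothing in `addOneProbability` are assumptions chosen for illustration. It only shows why a raw count of 0 gives no usable probability unless some smoothing is applied on top.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a count-only ngram source (not LanguageTool code). */
final class CountOnlySketch {
  private final Map<String, Long> counts = new HashMap<>();
  private long totalTokens = 0;

  /** Registers an ngram (tokens separated by single spaces) with its occurrence count. */
  void add(String ngram, long count) {
    counts.merge(ngram, count, Long::sum);
    if (!ngram.contains(" ")) {
      totalTokens += count;  // only unigrams contribute to the total token count
    }
  }

  /** Raw occurrence count; unseen ngrams simply yield 0. */
  long getCount(String ngram) {
    return counts.getOrDefault(ngram, 0L);
  }

  long getTotalTokenCount() {
    return totalTokens;
  }

  /** Assumption: add-one smoothing, one possible way to handle the 0-occurrence case. */
  double addOneProbability(String token, long vocabularySize) {
    return (getCount(token) + 1.0) / (totalTokens + vocabularySize);
  }
}
```

With counts alone, an unseen token is indistinguishable from an impossible one; the smoothed estimate stays above zero, which is the part a count-only model leaves to its caller.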
Nested Class Summary
Nested Classes
Modifier and Type  Class  Description
protected static class
-
Field Summary
Fields inherited from interface org.languagetool.languagemodel.LanguageModel
GOOGLE_SENTENCE_END, GOOGLE_SENTENCE_START
-
Constructor Summary
Constructors
Constructor  Description
LuceneSingleIndexLanguageModel(int maxNgram)
LuceneSingleIndexLanguageModel(File topIndexDir)
-
Method Summary
Modifier and Type  Method  Description
static void clearCaches(): Only used internally.
void close()
protected void doValidateDirectory(File topIndexDir)
long getCount(...): Get the occurrence count for token.
long getCount(...): Get the occurrence count for the given token sequence.
getLuceneSearcher(int ngramSize)
long getTotalTokenCount()
String toString()
static void validateDirectory(File topIndexDir): Throws a RuntimeException if the given directory does not seem to be a valid ngram top directory with subdirectories 1grams etc.
Methods inherited from class org.languagetool.languagemodel.BaseLanguageModel
getPseudoProbability, getPseudoProbabilityStupidBackoff
-
Constructor Details
-
LuceneSingleIndexLanguageModel
- Parameters:
topIndexDir - a directory which contains at least a subdirectory called 3grams, which is a Lucene index with ngram occurrences as created by org.languagetool.dev.FrequencyIndexCreator.
-
LuceneSingleIndexLanguageModel
-
-
Method Details
-
validateDirectory
Throws a RuntimeException if the given directory does not seem to be a valid ngram top directory with subdirectories 1grams etc.
- Since:
- 3.0
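A directory check of this kind could look roughly like the following sketch. It is not the LanguageTool implementation: the class `NgramDirSketch` and both method names are hypothetical, and the only rule it applies is the one stated for the constructor (a `3grams` subdirectory must exist). It tests for directory existence only and does not verify that the subdirectory is a readable Lucene index.

```java
import java.io.File;

/** Hypothetical stand-alone layout check (not the LanguageTool implementation). */
final class NgramDirSketch {

  /** Returns true if topIndexDir is a directory that contains a "3grams" subdirectory. */
  static boolean looksLikeNgramTopDir(File topIndexDir) {
    return topIndexDir.isDirectory() && new File(topIndexDir, "3grams").isDirectory();
  }

  /** Throws a RuntimeException when the layout check above fails. */
  static void requireNgramTopDir(File topIndexDir) {
    if (!looksLikeNgramTopDir(topIndexDir)) {
      throw new RuntimeException("Not a valid ngram top directory: " + topIndexDir
          + " (expected subdirectories such as 1grams or 3grams)");
    }
  }
}
```

Failing early with an unchecked exception keeps a misconfigured path from surfacing later as an obscure index error on the first lookup.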
-
clearCaches
Only used internally.
- Since:
- 3.2
-
doValidateDirectory
-
getCount
Description copied from class: BaseLanguageModel
Get the occurrence count for the given token sequence.
- Specified by:
getCount in class BaseLanguageModel
-
getCount
Description copied from class: BaseLanguageModel
Get the occurrence count for token.
- Specified by:
getCount in class BaseLanguageModel
-
getTotalTokenCount
public long getTotalTokenCount()
- Specified by:
getTotalTokenCount in class BaseLanguageModel
-
getLuceneSearcher
-
close
public void close()
-
toString
-