Package org.languagetool.synthesis
Class ManualSynthesizer
java.lang.Object
org.languagetool.synthesis.ManualSynthesizer
A synthesizer that reads the inflected form and POS information from a plain (UTF-8) text file.
This makes it possible for the user to edit the text file to let the system know
about new words or missing readings in the synthesizer *.dict file.
File format: fullform baseform postags (tab-separated, one entry per line)
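As an illustration of the file format, the following sketch splits tab-separated lines into full form, base form, and POS tag. It is not the class's actual loader: the Entry holder, the parseLine helper, the sample words, and the choice to skip blank or malformed lines are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ManualFormatSketch {

  // Hypothetical holder for one line of the file; not part of LanguageTool.
  static final class Entry {
    final String fullForm;
    final String baseForm;
    final String posTag;

    Entry(String fullForm, String baseForm, String posTag) {
      this.fullForm = fullForm;
      this.baseForm = baseForm;
      this.posTag = posTag;
    }

    @Override
    public String toString() {
      return fullForm + " <- " + baseForm + " [" + posTag + "]";
    }
  }

  // Splits one "fullform<TAB>baseform<TAB>postags" line.
  // Returns null for blank or malformed lines (an assumption of this sketch).
  static Entry parseLine(String line) {
    if (line.trim().isEmpty()) {
      return null;
    }
    String[] parts = line.split("\t");
    if (parts.length != 3) {
      return null;
    }
    return new Entry(parts[0], parts[1], parts[2]);
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("mice\tmouse\tNNS", "", "no tabs here");
    List<Entry> entries = new ArrayList<>();
    for (String line : lines) {
      Entry entry = parseLine(line);
      if (entry != null) {
        entries.add(entry);
      }
    }
    System.out.println(entries); // prints [mice <- mouse [NNS]]
  }
}
```

Only the first sample line has exactly three tab-separated columns, so it is the only one kept.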
-
Field Summary
Fields
Modifier and Type    Field    Description
private final String[]    data
private static final String    DEFAULT_SEPARATOR
private static final int    ENTRY_SIZE
private final it.unimi.dsi.fastutil.ints.Int2IntMap    map    A map from lemma+POS hashes to encoded lemma+POS+word tuple offsets in data
private static final int    MAX_LENGTH
private static final int    MAX_OFFSET
private static final int    OFFSET_SHIFT
possibleTags
private static final String    SUFFIX_MARKER
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type    Method    Description
private static it.unimi.dsi.fastutil.objects.ObjectOpenHashSet<String>    collectTags(Map<TaggedWord, List<String>> mapping)
private static String    decodeForm(String lemma, String word)
private static String    encodeForm(String lemma, String word)
getPossibleTags    Retrieve all the possible POS values.
private static it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap<List<org.apache.commons.lang3.tuple.Triple<String, String, String>>>    groupByHash(Map<TaggedWord, List<String>> mapping)
private static int    hashCode
private static Map<TaggedWord, List<String>>    loadMapping(InputStream inputStream)
lookup    Look up a word's inflected form as specified by the lemma and POS tag.
-
Field Details
-
SUFFIX_MARKER
-
possibleTags
-
OFFSET_SHIFT
private static final int OFFSET_SHIFT
-
MAX_LENGTH
private static final int MAX_LENGTH
-
MAX_OFFSET
private static final int MAX_OFFSET
-
ENTRY_SIZE
private static final int ENTRY_SIZE
-
data
-
map
private final it.unimi.dsi.fastutil.ints.Int2IntMap map
A map from lemma+POS hashes to encoded lemma+POS+word tuple offsets in data
-
DEFAULT_SEPARATOR
-
-
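The field names OFFSET_SHIFT, MAX_LENGTH, and MAX_OFFSET suggest that each value in map packs an offset into data together with a length in a single int. The sketch below shows one plausible way to do such packing. The shift width of 8, the helper names pack, offsetOf, and lengthOf, and the bit layout itself are assumptions for illustration; none of them is taken from the class.

```java
class PackedOffsetSketch {

  // Assumed value for this sketch only; the page does not state the real constants.
  static final int OFFSET_SHIFT = 8;
  // Largest length that fits in the low OFFSET_SHIFT bits.
  static final int MAX_LENGTH = (1 << OFFSET_SHIFT) - 1;
  // Largest offset that fits in the remaining high bits of a 32-bit int.
  static final int MAX_OFFSET = (1 << (32 - OFFSET_SHIFT)) - 1;

  // Stores the offset in the high bits and the length in the low bits.
  static int pack(int offset, int length) {
    if (offset < 0 || offset > MAX_OFFSET) {
      throw new IllegalArgumentException("offset out of range: " + offset);
    }
    if (length < 0 || length > MAX_LENGTH) {
      throw new IllegalArgumentException("length out of range: " + length);
    }
    return (offset << OFFSET_SHIFT) | length;
  }

  // Unsigned shift so that offsets using the top bit are recovered correctly.
  static int offsetOf(int packed) {
    return packed >>> OFFSET_SHIFT;
  }

  static int lengthOf(int packed) {
    return packed & MAX_LENGTH;
  }

  public static void main(String[] args) {
    int packed = pack(1000, 3);
    System.out.println(offsetOf(packed) + " " + lengthOf(packed)); // prints 1000 3
  }
}
```

Packing both numbers into one int lets a primitive int-to-int map hold them without allocating an object per entry, at the cost of hard upper bounds on both values.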
Constructor Details
-
ManualSynthesizer
- Throws:
IOException
-
-
Method Details
-
groupByHash
-
encodeForm
-
decodeForm
-
collectTags
private static it.unimi.dsi.fastutil.objects.ObjectOpenHashSet<String> collectTags(Map<TaggedWord, List<String>> mapping)
-
hashCode
-
getPossibleTags
Retrieve all the possible POS values.
-
lookup
Look up a word's inflected form as specified by the lemma and POS tag.
Parameters:
lemma - the lemma to inflect.
posTag - the required POS tag.
Returns:
a list with all the inflected forms of the specified lemma having the specified POS tag. If no inflected form is found, the function returns null.
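Because lookup returns null rather than an empty list when nothing matches, callers need a null check. The sketch below mimics that contract with a plain HashMap. The class LookupSketch, its add method, its key format, and the sample entries are hypothetical and say nothing about how ManualSynthesizer stores its data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupSketch {

  // Maps a combined lemma/POS key to every inflected form registered for it.
  private final Map<String, List<String>> forms = new HashMap<>();

  void add(String fullForm, String lemma, String posTag) {
    forms.computeIfAbsent(key(lemma, posTag), k -> new ArrayList<>()).add(fullForm);
  }

  // Mirrors the documented contract: null, not an empty list, when nothing matches.
  List<String> lookup(String lemma, String posTag) {
    return forms.get(key(lemma, posTag));
  }

  private static String key(String lemma, String posTag) {
    return lemma + "\t" + posTag;
  }

  public static void main(String[] args) {
    LookupSketch sketch = new LookupSketch();
    sketch.add("mice", "mouse", "NNS");

    List<String> found = sketch.lookup("mouse", "NNS");
    List<String> missing = sketch.lookup("mouse", "VBZ");

    System.out.println(found); // prints [mice]
    // Guard against null before using the result.
    System.out.println(missing == null ? "no form" : missing.toString()); // prints no form
  }
}
```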
-
loadMapping
- Throws:
IOException
-