Package org.languagetool.synthesis
Class BaseSynthesizer
java.lang.Object
org.languagetool.synthesis.BaseSynthesizer
- All Implemented Interfaces:
Synthesizer
- Direct Known Subclasses:
ArabicSynthesizer, CatalanSynthesizer, CrimeanTatarSynthesizer, DutchSynthesizer, EnglishSynthesizer, FrenchSynthesizer, GalicianSynthesizer, GermanSynthesizer, GreekSynthesizer, IrishSynthesizer, ItalianSynthesizer, PolishSynthesizer, PortugueseSynthesizer, RomanianSynthesizer, RussianSynthesizer, SlovakSynthesizer, SpanishSynthesizer, SwedishSynthesizer, UkrainianSynthesizer
-
Field Summary
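Fields
private morfologik.stemming.Dictionary dictionary
protected Language language
private final ManualSynthesizer manualSynthesizer
private final Soros numberSpeller
possibleTags
private final ManualSynthesizer removalSynthesizer
private final ManualSynthesizer removalSynthesizer2
private final String resourceFileName
private final Soros romanNumberer
private final String sorosFileName
final String SPELLNUMBER_FEMININE_TAG
final String SPELLNUMBER_ROMAN_TAG
final String SPELLNUMBER_TAG
private final morfologik.stemming.IStemmer stemmer
private final String tagFileName
-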
Constructor Summary
Constructors
BaseSynthesizer(String resourceFileName, String tagFileName, String langShortCode)
BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode)
BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang) - Deprecated.
BaseSynthesizer(String resourceFileName, String tagFileName, Language lang) - Deprecated.
-
Method Summary
Methods
private Soros createNumberSpeller(String langcode)
private Soros createRomanNumberer
protected morfologik.stemming.IStemmer createStemmer() - Creates a new IStemmer based on the configured dictionary.
protected morfologik.stemming.Dictionary getDictionary - Returns the Dictionary used for this synthesizer.
getPosTagCorrection(String posTag) - Gets a corrected version of the POS tag used for synthesis.
getRomanNumber(String arabicNumeral)
getSpelledNumber(String arabicNumeral) - Spells out a number.
morfologik.stemming.IStemmer getStemmer()
getTargetPosTag(List<String> posTags, String targetPosTag) - Selects the desired POS tag to synthesize.
protected void initPossibleTags
protected boolean isException
loadTags()
lookup - Looks up the inflected forms of a lemma defined by a part-of-speech tag.
protected String[] removeExceptions(String[] words)
String[] synthesize(AnalyzedToken token, String posTag) - Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
String[] synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp) - Generates a form of the word with a given POS tag for a given lemma.
String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) - Synthesizes forms for the given lemma and for all POS tags satisfying the given predicate.
-
Field Details
-
SPELLNUMBER_TAG
-
SPELLNUMBER_FEMININE_TAG
-
SPELLNUMBER_ROMAN_TAG
-
possibleTags
-
tagFileName
-
resourceFileName
-
stemmer
private final morfologik.stemming.IStemmer stemmer -
manualSynthesizer
-
removalSynthesizer
-
removalSynthesizer2
-
sorosFileName
-
numberSpeller
-
romanNumberer
-
dictionary
private volatile morfologik.stemming.Dictionary dictionary -
language
-
-
Constructor Details
-
BaseSynthesizer
public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang)
Deprecated.
- Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.
-
BaseSynthesizer
public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode)
- Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.
langShortCode - The language short code used to find the data files.
-
BaseSynthesizer
public BaseSynthesizer(String resourceFileName, String tagFileName, Language lang)
Deprecated.
-
BaseSynthesizer
public BaseSynthesizer(String resourceFileName, String tagFileName, String langShortCode)
-
-
Method Details
-
getDictionary
Returns the Dictionary used for this synthesizer. The dictionary file can be defined in the constructor.
- Throws:
IOException - In case the dictionary cannot be loaded.
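The dictionary field is declared volatile and getDictionary() can throw IOException, which is consistent with loading the dictionary on first use. The page does not state the mechanism, so the following is only a minimal sketch of one common pattern (double-checked lazy initialization). The class LazyDictionaryHolder, its Loader interface, and the use of a String in place of a morfologik Dictionary are all hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch: lazily loads a resource once and publishes it safely.
final class LazyDictionaryHolder {

    // Hypothetical loader abstraction; a real synthesizer would read a dictionary file.
    interface Loader {
        String load(String resourceFileName) throws IOException;
    }

    private final String resourceFileName;
    private final Loader loader;
    // volatile so that a loaded value becomes visible to all threads
    private volatile String dictionary;

    LazyDictionaryHolder(String resourceFileName, Loader loader) {
        this.resourceFileName = resourceFileName;
        this.loader = loader;
    }

    String getDictionary() throws IOException {
        String dict = dictionary;            // one volatile read on the fast path
        if (dict == null) {
            synchronized (this) {
                dict = dictionary;           // re-check under the lock
                if (dict == null) {
                    dict = loader.load(resourceFileName);
                    dictionary = dict;
                }
            }
        }
        return dict;
    }
}
```

With this shape, a loading failure surfaces as an IOException from the first getDictionary() call, and later calls retry because nothing was stored.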
-
createStemmer
protected morfologik.stemming.IStemmer createStemmer()
Creates a new IStemmer based on the configured dictionary. The result must not be shared among threads.
- Since:
- 2.3
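Because the stemmer returned by createStemmer() must not be shared among threads, code that uses it from several threads needs one instance per thread. The sketch below shows one way to arrange that with ThreadLocal. ToyStemmer and PerThreadStemmer are hypothetical stand-ins, not morfologik or LanguageTool classes, and this is not necessarily how BaseSynthesizer manages its own stemmer.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a stemmer that keeps mutable state and is unsafe to share.
final class ToyStemmer {
    private final StringBuilder scratch = new StringBuilder();

    String lookup(String word) {
        scratch.setLength(0);
        scratch.append(word.toLowerCase());
        return scratch.toString();
    }
}

// Hands each thread its own stemmer, created lazily by the given factory.
final class PerThreadStemmer {
    private final ThreadLocal<ToyStemmer> local;

    PerThreadStemmer(Supplier<ToyStemmer> factory) {
        this.local = ThreadLocal.withInitial(factory);
    }

    ToyStemmer get() {
        return local.get();
    }
}
```

A caller would pass a factory such as a method reference to the code that creates the stemmer, then call get() wherever a stemmer is needed.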
-
createNumberSpeller
-
createRomanNumberer
-
lookup
Looks up the inflected forms of a lemma defined by a part-of-speech tag.
- Parameters:
lemma - the lemma to be inflected.
posTag - the desired part-of-speech tag.
-
synthesize
Get a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
- Specified by:
synthesize in interface Synthesizer
- Parameters:
token - AnalyzedToken to be inflected.
posTag - The desired part-of-speech tag.
- Returns:
- inflected words, or an empty array if no forms were found
- Throws:
IOException
-
synthesize
public String[] synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp) throws IOException
Description copied from interface: Synthesizer
Generates a form of the word with a given POS tag for a given lemma. POS tag can be specified using regular expressions.
- Specified by:
synthesize in interface Synthesizer
- Parameters:
token - the token to be used for synthesis
posTag - POS tag of the form to be generated
posTagRegExp - Specifies whether the posTag string is a regular expression.
- Throws:
IOException
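The effect of the posTagRegExp flag can be illustrated with a self-contained sketch. It assumes a toy in-memory table of forms and a list of known tags (standing in for the tags loaded from the tag file). ToySynthesizer and its data are hypothetical, and the real class looks forms up in a morfologik dictionary instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical toy model of exact versus regular-expression POS tag lookup.
final class ToySynthesizer {
    // key is "lemma|tag", value is the list of inflected forms
    private final Map<String, List<String>> forms = new LinkedHashMap<>();
    // every tag seen so far, in insertion order
    private final List<String> possibleTags = new ArrayList<>();

    void add(String lemma, String tag, String form) {
        forms.computeIfAbsent(lemma + "|" + tag, k -> new ArrayList<>()).add(form);
        if (!possibleTags.contains(tag)) {
            possibleTags.add(tag);
        }
    }

    String[] synthesize(String lemma, String posTag, boolean posTagRegExp) {
        List<String> result = new ArrayList<>();
        if (posTagRegExp) {
            // Treat posTag as a pattern and collect forms for every matching known tag.
            Pattern pattern = Pattern.compile(posTag);
            for (String tag : possibleTags) {
                if (pattern.matcher(tag).matches()) {
                    result.addAll(forms.getOrDefault(lemma + "|" + tag, Collections.emptyList()));
                }
            }
        } else {
            // Treat posTag as a literal tag.
            result.addAll(forms.getOrDefault(lemma + "|" + posTag, Collections.emptyList()));
        }
        return result.toArray(new String[0]);
    }
}
```

In this sketch a pattern such as "VB." selects several tags at once, while the same string passed with posTagRegExp set to false matches only a tag spelled exactly that way.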
-
synthesizeForPosTags
Synthesize forms for the given lemma and for all POS tags satisfying the given predicate.
- Throws:
IOException
- Since:
- 5.3
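Selecting tags with a Predicate<String> rather than a tag string or pattern can be sketched as follows. The class PredicateSynthesisSketch and its three-entry table are hypothetical toy data, and only the parameter shape (a lemma plus a tag predicate) is taken from the signature above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model: filter stored forms by lemma and by a predicate on the POS tag.
final class PredicateSynthesisSketch {
    // Each row is {lemma, POS tag, inflected form}; invented sample data.
    private static final String[][] ENTRIES = {
        {"house", "NN", "house"},
        {"house", "NNS", "houses"},
        {"house", "VBD", "housed"},
    };

    static String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) {
        List<String> out = new ArrayList<>();
        for (String[] entry : ENTRIES) {
            if (entry[0].equals(lemma) && acceptTag.test(entry[1])) {
                out.add(entry[2]);
            }
        }
        return out.toArray(new String[0]);
    }
}
```

A predicate lets the caller express conditions that are awkward as a regular expression, for example membership in a precomputed set of tags.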
-
getPosTagCorrection
Description copied from interface: Synthesizer
Gets a corrected version of the POS tag used for synthesis. Useful when the tagset defines special disjunctions that need to be converted into regexp disjunctions.
- Specified by:
getPosTagCorrection in interface Synthesizer
- Parameters:
posTag - original POS tag to correct
- Returns:
- converted POS tag
-
getStemmer
public morfologik.stemming.IStemmer getStemmer()
- Returns:
- the stemmer interface to be used.
- Since:
- 2.5
-
initPossibleTags
- Throws:
IOException
-
loadTags
- Throws:
IOException
-
getSpelledNumber
Description copied from interface: Synthesizer
Spells out a number.
- Specified by:
getSpelledNumber in interface Synthesizer
- Parameters:
arabicNumeral - the number in Arabic numerals
- Returns:
- String of the spelled out number
-
getRomanNumber
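This page gives no description for getRomanNumber. Judging only from its name and its arabicNumeral parameter, it presumably turns a number written in Arabic numerals into a Roman numeral. The sketch below shows that conversion in isolation under that assumption. RomanNumberSketch is a hypothetical class, the 1 to 3999 range is a choice made for the sketch, and the real method may behave differently (the class holds a Soros-based romanNumberer field for this purpose).

```java
// Hypothetical stand-alone conversion from an Arabic numeral string to a Roman numeral.
final class RomanNumberSketch {
    private static final int[] VALUES =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    static String toRoman(String arabicNumeral) {
        int n = Integer.parseInt(arabicNumeral.trim());
        if (n < 1 || n > 3999) {
            throw new IllegalArgumentException("out of range: " + n);
        }
        StringBuilder sb = new StringBuilder();
        // Greedily subtract the largest value that still fits.
        for (int i = 0; i < VALUES.length; i++) {
            while (n >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                n -= VALUES[i];
            }
        }
        return sb.toString();
    }
}
```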
-
isException
-
removeExceptions
-
getTargetPosTag
Description copied from interface: Synthesizer
Selects the desired POS tag to synthesize.
- Specified by:
getTargetPosTag in interface Synthesizer
-
BaseSynthesizer(String, String, String, String)