Package org.languagetool
Class AnalyzedSentence
java.lang.Object
org.languagetool.AnalyzedSentence
A sentence that has been tokenized and analyzed.
-
Field Summary
Fields
Modifier and Type  Field
lemmaOffsets
private final AnalyzedTokenReadings[] nonBlankPreDisambigTokens
private final AnalyzedTokenReadings[] nonBlankTokens
private final AnalyzedTokenReadings[] preDisambigTokens
private String text
tokenOffsets
private final AnalyzedTokenReadings[] tokens
private final int[] whPositions
Constructor Summary
Constructors
Modifier  Constructor  Description
AnalyzedSentence(AnalyzedTokenReadings[] tokens)
    Creates an AnalyzedSentence from the given AnalyzedTokenReadings.
private AnalyzedSentence(AnalyzedTokenReadings[] tokens, int[] mapping, AnalyzedTokenReadings[] nonBlankTokens, AnalyzedTokenReadings[] nonBlankPreDisambigTokens)
AnalyzedSentence(AnalyzedTokenReadings[] tokens, AnalyzedTokenReadings[] preDisambigTokens)
Method Summary
Modifier and Type  Method  Description
private String calcText()
copy(AnalyzedSentence sentence)
    The method copies AnalyzedSentence and returns the copy.
boolean equals
getAnnotations
    Get disambiguator actions log.
int getCorrectedTextLength()
    Text length taking position fixes (for removed soft hyphens etc.) into account, so this is _not_ always equal to the length of getText().
getLemmaOffsets(String token)
getLemmaSet
    Get the lowercase lemmas of this sentence in a set.
private List<AnalyzedTokenReadings> getNonBlankReadings(AnalyzedTokenReadings[] tokens, int whCounter, int nonWhCounter, int[] mapping)
int getNonWhitespaceTokenCount()
    Get the length of the array returned by getTokensWithoutWhitespace() without additional allocations.
int getOriginalPosition(int nonWhPosition)
    Get a position of a non-whitespace token in the original sentence with whitespace.
getPreDisambigTokens
getPreDisambigTokensWithoutWhitespace
getText()
    Return the original text.
getTokenOffsets(String token)
getTokens
    Returns the AnalyzedTokenReadings of the analyzed text.
getTokenSet
    Get the lowercase tokens of this sentence in a set.
getTokensWithoutWhitespace
    Returns the AnalyzedTokenReadings of the analyzed text, with whitespace tokens removed but with the artificial SENT_START token included.
int hashCode()
indexLemmas(AnalyzedTokenReadings[] tokens)
indexTokens(AnalyzedTokenReadings[] tokens)
makeUnmodifiable(Map<String, List<Integer>> result)
toShortString(String readingDelimiter)
    Return string representation without chunk information.
toString()
toString
    Return string representation with chunk information.
private String toString
(package private) String toTextString()
    Return string representation without any analysis information, just the original text.
-
Field Details
-
tokens
-
preDisambigTokens
-
nonBlankTokens
-
nonBlankPreDisambigTokens
-
whPositions
private final int[] whPositions
tokenOffsets
-
lemmaOffsets
-
text
-
-
Constructor Details
-
AnalyzedSentence
Creates an AnalyzedSentence from the given AnalyzedTokenReadings. Whitespace is also a token.
AnalyzedSentence
-
AnalyzedSentence
private AnalyzedSentence(AnalyzedTokenReadings[] tokens, int[] mapping, AnalyzedTokenReadings[] nonBlankTokens, AnalyzedTokenReadings[] nonBlankPreDisambigTokens)
-
-
Method Details
-
getNonBlankReadings
@NotNull private List<AnalyzedTokenReadings> getNonBlankReadings(AnalyzedTokenReadings[] tokens, int whCounter, int nonWhCounter, int[] mapping)
indexTokens
-
indexLemmas
-
makeUnmodifiable
-
copy
The method copies AnalyzedSentence and returns the copy. Useful for performing local immunization (for example).
- Parameters:
sentence - AnalyzedSentence to be copied
- Returns:
- a new object which is a copy
- Since:
- 2.5
-
getTokens
Returns the AnalyzedTokenReadings of the analyzed text. Whitespace is also a token.
getPreDisambigTokens
- Since:
- 4.5
-
getTokensWithoutWhitespace
Returns the AnalyzedTokenReadings of the analyzed text, with whitespace tokens removed but with the artificial SENT_START token included.
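The filtering described above can be pictured with a minimal sketch. It uses plain strings as stand-ins for AnalyzedTokenReadings; the class name, the `withoutWhitespace` helper, and the use of the literal string "SENT_START" as a marker are all assumptions made for illustration, not LanguageTool code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: none of these names come from LanguageTool.
class NonBlankTokensSketch {

  // Hypothetical marker standing in for the artificial sentence-start token.
  static final String SENT_START = "SENT_START";

  // Keeps the sentence-start marker and every token that is not pure whitespace.
  static List<String> withoutWhitespace(String[] tokens) {
    List<String> result = new ArrayList<>();
    for (String token : tokens) {
      boolean isSentenceStart = SENT_START.equals(token);
      boolean isWhitespace = token.trim().isEmpty();
      if (isSentenceStart || !isWhitespace) {
        result.add(token);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String[] tokens = {SENT_START, "This", " ", "is", " ", "fine", "."};
    // prints [SENT_START, This, is, fine, .]
    System.out.println(withoutWhitespace(tokens));
  }
}
```

In this sketch the size of the returned list plays the role that getNonWhitespaceTokenCount() plays for the real array.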
getNonWhitespaceTokenCount
@Internal public int getNonWhitespaceTokenCount()
Get the length of the array returned by getTokensWithoutWhitespace() without additional allocations.
getPreDisambigTokensWithoutWhitespace
- Since:
- 4.5
-
getOriginalPosition
public int getOriginalPosition(int nonWhPosition)
Get a position of a non-whitespace token in the original sentence with whitespace.
- Parameters:
nonWhPosition - position of a non-whitespace token
- Returns:
- position in the original sentence.
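One way to picture this lookup is an int array that maps each non-whitespace index to its index in the full token array. The sketch below assumes that "position" means a token index and that the token at index 0 is the sentence-start token, which is always kept; `buildMapping` and `originalPosition` are hypothetical helpers, not the class's actual implementation.

```java
import java.util.Arrays;

// Stand-in sketch: plain strings replace AnalyzedTokenReadings.
class OriginalPositionSketch {

  // Builds mapping[nonWhitespaceIndex] = index in the full token array.
  // Assumption: index 0 holds the sentence-start token and is always kept.
  static int[] buildMapping(String[] tokens) {
    int[] tmp = new int[tokens.length];
    int count = 0;
    for (int i = 0; i < tokens.length; i++) {
      if (i == 0 || !tokens[i].trim().isEmpty()) {
        tmp[count++] = i;
      }
    }
    return Arrays.copyOf(tmp, count);
  }

  // Looks up the original index of the token at the given non-whitespace position.
  static int originalPosition(int[] mapping, int nonWhPosition) {
    return mapping[nonWhPosition];
  }

  public static void main(String[] args) {
    String[] tokens = {"", "This", " ", "is", " ", "fine", "."};
    int[] mapping = buildMapping(tokens);
    // "is" is the third kept token (index 2) and sits at index 3 in the full array.
    System.out.println(originalPosition(mapping, 2)); // prints 3
  }
}
```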
-
toString
-
toShortString
Return string representation without chunk information.
- Since:
- 2.3
-
getText
Return the original text.
- Since:
- 2.7
-
calcText
-
getCorrectedTextLength
public int getCorrectedTextLength()
Text length taking position fixes (for removed soft hyphens etc.) into account, so this is _not_ always equal to the length of getText().
- Since:
- 5.1
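The sketch below only illustrates why two such lengths can differ: removing soft hyphens (U+00AD) from a string shortens it. The `stripSoftHyphens` helper is hypothetical, and this reference does not show how LanguageTool records its position fixes or which of the two lengths the method reports.

```java
// Stand-in sketch: shows the length difference caused by removing soft hyphens.
class SoftHyphenLengthSketch {

  static final char SOFT_HYPHEN = '\u00AD';

  // Hypothetical helper: returns the input with every soft hyphen removed.
  static String stripSoftHyphens(String s) {
    return s.replace(String.valueOf(SOFT_HYPHEN), "");
  }

  public static void main(String[] args) {
    String input = "co" + SOFT_HYPHEN + "operate";
    String stripped = stripSoftHyphens(input);
    System.out.println(input.length());    // prints 10
    System.out.println(stripped.length()); // prints 9
  }
}
```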
-
toTextString
String toTextString()
Return string representation without any analysis information, just the original text.
- Since:
- 2.6
-
toString
Return string representation with chunk information. -
toString
-
getAnnotations
Get disambiguator actions log. -
getTokenSet
Get the lowercase tokens of this sentence in a set. Used internally for performance optimization.
- Since:
- 2.4
-
getLemmaSet
Get the lowercase lemmas of this sentence in a set. Used internally for performance optimization.
- Since:
- 2.5
-
getTokenOffsets
- Returns:
- all offsets in getTokensWithoutWhitespace() where tokens with the given text occur (case-insensitive), or null if there are no such occurrences
- Since:
- 5.3
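A case-insensitive offset lookup of this kind can be sketched as a map from lowercased token text to the list of indexes where it occurs. The map shape is suggested by the makeUnmodifiable(Map<String, List<Integer>> result) signature, but the return type, the class name, and both helpers below are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Stand-in sketch: plain strings replace the non-whitespace token array.
class TokenOffsetsSketch {

  // Builds an index from lowercased token text to the offsets where it occurs.
  static Map<String, List<Integer>> indexTokens(String[] nonBlankTokens) {
    Map<String, List<Integer>> index = new HashMap<>();
    for (int i = 0; i < nonBlankTokens.length; i++) {
      String key = nonBlankTokens[i].toLowerCase(Locale.ROOT);
      index.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
    }
    return index;
  }

  // Returns the offsets for the given token text, or null if it never occurs.
  static List<Integer> tokenOffsets(Map<String, List<Integer>> index, String token) {
    return index.get(token.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    // Index 0 stands in for the sentence-start token.
    String[] tokens = {"", "The", "cat", "saw", "the", "dog"};
    Map<String, List<Integer>> index = indexTokens(tokens);
    System.out.println(tokenOffsets(index, "THE"));  // prints [1, 4]
    System.out.println(tokenOffsets(index, "bird")); // prints null
  }
}
```

Returning null rather than an empty list mirrors the documented contract, so callers of such a lookup need a null check before iterating.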
-
getLemmaOffsets
- Returns:
- all offsets in getTokensWithoutWhitespace() where tokens with the given lemma occur (case-insensitive), or null if there are no such occurrences
- Since:
- 5.3
-
equals
-
hashCode
public int hashCode()
-