Class PatternToken

java.lang.Object
org.languagetool.rules.patterns.PatternToken
All Implemented Interfaces:
Cloneable

public class PatternToken extends Object implements Cloneable
A part of a pattern; represents the 'token' element of grammar.xml.
  • Field Details

    • UNKNOWN_TAG

      public static final String UNKNOWN_TAG
      Matches only tokens without any POS tag.
    • EMPTY_ARRAY

      private static final PatternToken[] EMPTY_ARRAY
    • INFLECTED_MASK

      private static final int INFLECTED_MASK
    • NEGATION_MASK

      private static final int NEGATION_MASK
    • TEST_WHITESPACE_MASK

      private static final int TEST_WHITESPACE_MASK
    • WHITESPACE_BEFORE_MASK

      private static final int WHITESPACE_BEFORE_MASK
    • INSIDE_MARKER_MASK

      private static final int INSIDE_MARKER_MASK
    • EXCEPTION_VALID_NEXT_MASK

      private static final int EXCEPTION_VALID_NEXT_MASK
      True if scope=="next".
    • MAY_BE_OMITTED_MASK

      private static final int MAY_BE_OMITTED_MASK
    • TEST_STRING_MASK

      private static final int TEST_STRING_MASK
      This flag determines whether calling setStringElement(java.lang.String) makes sense. That method takes the most time, so it is best to reduce the number of calls to it.
    • UNIFICATION_NEUTRAL_MASK

      private static final int UNIFICATION_NEUTRAL_MASK
      Determines whether the element should be ignored when doing unification.
    • UNI_NEGATION_MASK

      private static final int UNI_NEGATION_MASK
    • LAST_UNIFIED_MASK

      private static final int LAST_UNIFIED_MASK
      Set to true on tokens that close the unification block.
    • flags

      private short flags
    • textMatcher

      private StringMatcher textMatcher
    • posToken

      private PatternToken.PosToken posToken
    • rareFields

      private PatternToken.RareFields rareFields
    • skip

      private byte skip
    • maxOccurrence

      private byte maxOccurrence
  • Constructor Details

    • PatternToken

      public PatternToken(String token, boolean caseSensitive, boolean regExp, boolean inflected)
      Creates an element that is used to match tokens in the text.
      Parameters:
      token - String to be matched
      caseSensitive - true if the check is case-sensitive
      regExp - true if the check uses regular expressions
      inflected - true if the check refers to base forms (lemmas). Note that token must itself be a base form for this to work.
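      How the three boolean parameters interact can be pictured with a small stand-alone model. This is only a sketch under assumptions, not LanguageTool's implementation: the class SimpleTokenMatcher and its matches(surfaceForm, lemma) method are hypothetical stand-ins for PatternToken and AnalyzedToken.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for PatternToken; it only models the constructor flags.
class SimpleTokenMatcher {
  private final String literal;
  private final Pattern regex;          // null unless regExp was requested
  private final boolean caseSensitive;
  private final boolean inflected;

  SimpleTokenMatcher(String token, boolean caseSensitive, boolean regExp, boolean inflected) {
    this.literal = token;
    this.caseSensitive = caseSensitive;
    this.inflected = inflected;
    int flags = caseSensitive ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    this.regex = regExp ? Pattern.compile(token, flags) : null;
  }

  // With inflected == true the lemma is compared, otherwise the surface form.
  boolean matches(String surfaceForm, String lemma) {
    String candidate = inflected ? lemma : surfaceForm;
    if (candidate == null) {
      return false;
    }
    if (regex != null) {
      return regex.matcher(candidate).matches();
    }
    return caseSensitive ? literal.equals(candidate) : literal.equalsIgnoreCase(candidate);
  }

  public static void main(String[] args) {
    SimpleTokenMatcher byLemma = new SimpleTokenMatcher("walk", false, false, true);
    System.out.println(byLemma.matches("Walked", "walk"));  // true: the lemma equals "walk"
    SimpleTokenMatcher exact = new SimpleTokenMatcher("walk", true, false, false);
    System.out.println(exact.matches("Walk", "walk"));      // false: the case differs
  }
}
```

      In this model an inflected pattern never looks at the surface form, which is why the pattern string itself has to be a base form.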
    • PatternToken

      PatternToken(boolean inflected, @NotNull StringMatcher textMatcher)
  • Method Details

    • clone

      public Object clone() throws CloneNotSupportedException
      Overrides:
      clone in class Object
      Throws:
      CloneNotSupportedException
    • isMatched

      public boolean isMatched(AnalyzedToken token)
      Checks whether the rule element matches the token given as a parameter.
      Parameters:
      token - AnalyzedToken to check matching against
      Returns:
      True if token matches, false otherwise.
    • hasFlag

      boolean hasFlag(int mask)
    • setFlag

      private void setFlag(int mask, boolean value)
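      The *_MASK constants together with the short flags field suggest that boolean properties are packed into single bits. A minimal sketch of that technique follows; the class name FlagBits and the mask values are assumptions, since this page does not show the real constant values.

```java
// Simplified model of boolean flags packed into one short field.
class FlagBits {
  static final int INFLECTED_MASK = 0x1;      // hypothetical value
  static final int NEGATION_MASK = 0x2;       // hypothetical value
  static final int INSIDE_MARKER_MASK = 0x10; // hypothetical value

  private short flags;

  boolean hasFlag(int mask) {
    return (flags & mask) != 0;
  }

  void setFlag(int mask, boolean value) {
    if (value) {
      flags |= mask;   // compound assignment narrows the int result back to short
    } else {
      flags &= ~mask;  // clear only the bits covered by the mask
    }
  }

  public static void main(String[] args) {
    FlagBits bits = new FlagBits();
    bits.setFlag(NEGATION_MASK, true);
    bits.setFlag(INSIDE_MARKER_MASK, true);
    bits.setFlag(NEGATION_MASK, false);
    System.out.println(bits.hasFlag(NEGATION_MASK));      // false
    System.out.println(bits.hasFlag(INSIDE_MARKER_MASK)); // true
  }
}
```

      Packing flags this way keeps each instance small, which matters when many pattern tokens are held in memory.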
    • isExceptionMatched

      public boolean isExceptionMatched(AnalyzedToken token)
      Checks whether an exception matches.
      Parameters:
      token - AnalyzedToken to check matching against
      Returns:
      True if any of the exceptions matches (logical disjunction).
    • isAndExceptionGroupMatched

      public boolean isAndExceptionGroupMatched(AnalyzedToken token)
      Enables testing multiple conditions specified by multiple element exceptions. Works as a logical AND operator.
      Parameters:
      token - the token checked for exceptions.
      Returns:
      true if all conditions are met, false otherwise.
    • isExceptionMatchedCompletely

      public boolean isExceptionMatchedCompletely(AnalyzedToken token)
      Checks exceptions both in the AND group and in the token itself. Introduced for clarity.
      Parameters:
      token - Token to match
      Returns:
      True if matched.
    • setAndGroupElement

      public void setAndGroupElement(PatternToken andToken)
    • hasAndGroup

      public boolean hasAndGroup()
      Checks if this element has an AND group associated with it.
      Returns:
      true if the element has a group of elements that all should match.
    • getAndGroup

      public List<PatternToken> getAndGroup()
      Returns the group of elements linked with AND operator.
    • setOrGroupElement

      public void setOrGroupElement(PatternToken orToken)
      Since:
      2.3
    • hasOrGroup

      public boolean hasOrGroup()
      Checks if this element has an OR group associated with it.
      Returns:
      true if the element has a group of elements of which at least one should match.
      Since:
      2.3
    • getOrGroup

      public List<PatternToken> getOrGroup()
      Returns the group of elements linked with OR operator.
      Since:
      2.3
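      The difference between AND groups and OR groups can be sketched with plain predicates. The helper class GroupMatching is hypothetical, and it assumes that the element's own check is combined with its group (all must hold for AND, any may hold for OR); the page itself does not spell out how the element's own check takes part.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper; each Predicate stands in for one PatternToken's own check.
class GroupMatching {
  // AND group: the element itself and every grouped element must accept the token.
  static boolean matchesWithAndGroup(Predicate<String> own, List<Predicate<String>> andGroup, String token) {
    return own.test(token) && andGroup.stream().allMatch(p -> p.test(token));
  }

  // OR group: it is enough that the element itself or one grouped element accepts the token.
  static boolean matchesWithOrGroup(Predicate<String> own, List<Predicate<String>> orGroup, String token) {
    return own.test(token) || orGroup.stream().anyMatch(p -> p.test(token));
  }

  public static void main(String[] args) {
    Predicate<String> isCat = s -> s.equals("cat");
    Predicate<String> isDog = s -> s.equals("dog");
    Predicate<String> isShort = s -> s.length() <= 3;
    System.out.println(matchesWithOrGroup(isCat, List.of(isDog), "dog"));    // true
    System.out.println(matchesWithAndGroup(isDog, List.of(isShort), "cat")); // false
  }
}
```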
    • isMatchedByScopeNextException

      public boolean isMatchedByScopeNextException(AnalyzedToken token)
      Checks whether a previously set exception matches (in case the exception had scope == "next").
      Parameters:
      token - AnalyzedToken to check matching against.
      Returns:
      True if any of the exceptions matches.
    • isMatchedByPreviousException

      public boolean isMatchedByPreviousException(AnalyzedToken token)
      Checks whether an exception for a previous token matches (in case the exception had scope == "previous").
      Parameters:
      token - AnalyzedToken to check matching against.
      Returns:
      True if any of the exceptions matches.
    • isMatchedByPreviousException

      public boolean isMatchedByPreviousException(AnalyzedTokenReadings prevToken)
      Checks whether an exception for a previous token matches all readings of a given token (in case the exception had scope == "previous").
      Parameters:
      prevToken - AnalyzedTokenReadings to check matching against.
      Returns:
      true if any of the exceptions matches.
    • isSentenceStart

      public boolean isSentenceStart()
      Checks if the token is a sentence start.
      Returns:
      True if the element starts the sentence and has not been set to have a negated POS token.
    • setPosToken

      public void setPosToken(PatternToken.PosToken posToken)
      Since:
      2.9
    • setChunkTag

      public void setChunkTag(ChunkTag chunkTag)
      Since:
      2.9
    • getString

      public String getString()
    • setStringElement

      public void setStringElement(String token)
    • setTextMatcher

      void setTextMatcher(@NotNull StringMatcher matcher)
    • normalizeTextPattern

      static String normalizeTextPattern(String token)
    • setStringPosException

      public void setStringPosException(String token, boolean regExp, boolean inflected, boolean negation, boolean scopeNext, boolean scopePrevious, String posToken, boolean posRegExp, boolean posNegation, Boolean caseSensitivity)
      Sets a string and/or pos exception for matching tokens.
      Parameters:
      token - The string in the exception.
      regExp - True if the string is specified as a regular expression.
      inflected - True if the string is a base form (lemma).
      negation - True if the exception is negated.
      scopeNext - True if the exception scope is next tokens.
      scopePrevious - True if the exception should match only a single previous token.
      posToken - The part of the speech tag in the exception.
      posRegExp - True if the POS is specified as a regular expression.
      posNegation - True if the POS exception is negated.
      caseSensitivity - if null, use this element's setting for case sensitivity, otherwise the specified value
      Since:
      2.9
    • addException

      void addException(boolean scopeNext, boolean scopePrevious, PatternToken exception)
    • initRareFields

      @NotNull private PatternToken.RareFields initRareFields()
    • isPosTokenMatched

      private boolean isPosTokenMatched(AnalyzedToken token)
      Tests if part of speech matches a given string. Special value UNKNOWN_TAG matches null POS tags.
      Parameters:
      token - Token to test.
      Returns:
      true if matches
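      The special handling of UNKNOWN_TAG can be illustrated as follows. This is a sketch only: PosMatchDemo and posMatches are hypothetical names, the constant's value "UNKNOWN" is an assumption (this page does not show it), and POS negation is left out.

```java
import java.util.regex.Pattern;

// Hypothetical model of a POS check where a special tag matches tokens without a POS tag.
class PosMatchDemo {
  static final String UNKNOWN_TAG = "UNKNOWN"; // assumed value

  static boolean posMatches(String patternPos, boolean posRegExp, String tokenPos) {
    if (tokenPos == null) {
      // Only the special tag accepts a token that carries no POS tag.
      return UNKNOWN_TAG.equals(patternPos);
    }
    return posRegExp ? Pattern.matches(patternPos, tokenPos) : patternPos.equals(tokenPos);
  }

  public static void main(String[] args) {
    System.out.println(posMatches(UNKNOWN_TAG, false, null)); // true
    System.out.println(posMatches("NN.*", true, "NNS"));      // true
    System.out.println(posMatches("NN", false, null));        // false
  }
}
```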
    • getTestToken

      private String getTestToken(AnalyzedToken token)
    • getSkipNext

      public int getSkipNext()
      Gets the exception scope length.
      Returns:
      scope length in tokens
    • getMinOccurrence

      public int getMinOccurrence()
      The minimum number of times the element needs to occur.
    • getMaxOccurrence

      public int getMaxOccurrence()
      The maximum number of times the element may occur.
    • setSkipNext

      public void setSkipNext(int i)
      Parameters:
      i - exception scope length.
    • setMinOccurrence

      public void setMinOccurrence(int i)
      The minimum number of times this element must occur.
      Parameters:
      i - currently only 0 and 1 are supported
    • setMaxOccurrence

      public void setMaxOccurrence(int i)
      The maximum number of times this element may occur.
      Parameters:
      i - a number >= 1 or -1 for unlimited occurrences
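      The occurrence bounds can be pictured with a small greedy counter. This is a simplified model, not the matcher LanguageTool uses: OccurrenceDemo and consume are hypothetical, and a real matcher may also try shorter alternatives instead of always consuming as many tokens as possible.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of min/max occurrence; max == -1 means unlimited.
class OccurrenceDemo {
  // Returns how many consecutive tokens starting at 'start' are consumed,
  // or -1 if fewer than 'min' tokens match.
  static int consume(List<String> tokens, int start, Predicate<String> element, int min, int max) {
    int count = 0;
    while (start + count < tokens.size()
        && (max == -1 || count < max)
        && element.test(tokens.get(start + count))) {
      count++;
    }
    return count >= min ? count : -1;
  }

  public static void main(String[] args) {
    List<String> tokens = List.of("very", "very", "good");
    Predicate<String> isVery = s -> s.equals("very");
    System.out.println(consume(tokens, 0, isVery, 1, -1)); // 2
    System.out.println(consume(tokens, 0, isVery, 1, 1));  // 1
    System.out.println(consume(tokens, 2, isVery, 0, 1));  // 0: the element is optional
    System.out.println(consume(tokens, 2, isVery, 1, 1));  // -1: a required element is missing
  }
}
```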
    • hasPreviousException

      public boolean hasPreviousException()
      Checks if the element has an exception for a previous token.
      Returns:
      True if the element has a previous token matching exception.
    • hasNextException

      public boolean hasNextException()
      Checks if the element has an exception for the next scope (only used for testing).
      Returns:
      True if the element has exception for the next scope.
    • setNegation

      public void setNegation(boolean negation)
      Negates the matching so that non-matching elements match and vice versa.
    • getNegation

      public boolean getNegation()
      Since:
      0.9.3
    • isReferenceElement

      public boolean isReferenceElement()
      Returns:
      true when this element refers to another token.
    • setMatch

      public void setMatch(Match match)
      Sets the reference to another token.
      Parameters:
      match - Formatting object for the token reference.
    • getMatch

      public Match getMatch()
    • compile

      public PatternToken compile(AnalyzedTokenReadings token, Synthesizer synth) throws IOException
      Prepares this PatternToken for matching by formatting its string token and POS (if the element is supposed to refer to some other token).
      Parameters:
      token - the token specified as AnalyzedTokenReadings
      synth - the language synthesizer (Synthesizer)
      Throws:
      IOException
    • doCompile

      private void doCompile(AnalyzedTokenReadings token, Synthesizer synth) throws IOException
      Throws:
      IOException
    • setPhraseName

      public void setPhraseName(String id)
      Sets the phrase the element is in.
      Parameters:
      id - ID of the phrase.
    • isPartOfPhrase

      public boolean isPartOfPhrase()
      Checks if the element is in any phrase.
      Returns:
      True if the element is contained in a phrase.
    • isCaseSensitive

      public boolean isCaseSensitive()
      Whether the element matches case sensitively.
      Since:
      2.3
    • isRegularExpression

      public boolean isRegularExpression()
      Tests whether the element matches a regular expression.
      Since:
      0.9.6
    • isPOStagRegularExpression

      public boolean isPOStagRegularExpression()
      Tests whether the POS matches a regular expression.
      Since:
      1.3.0
    • getPOStag

      @Nullable public String getPOStag()
      Returns:
      the POS of the Element or null
      Since:
      0.9.6
    • getChunkTag

      @Nullable public ChunkTag getChunkTag()
      Returns:
      the chunk tag of the Element or null
      Since:
      2.3
    • getPOSNegation

      public boolean getPOSNegation()
      Returns:
      true if the POS is negated.
    • isInflected

      public boolean isInflected()
      Returns:
      true if the token matches all inflected forms
    • getPhraseName

      @Nullable public String getPhraseName()
      Gets the phrase the element is in.
      Returns:
      The name of the phrase.
    • isUnified

      public boolean isUnified()
    • setUnification

      public void setUnification(Map<String,List<String>> uniFeatures)
    • getUniFeatures

      @Nullable public Map<String,List<String>> getUniFeatures()
      Get unification features and types.
      Returns:
      A map from features to a list of types or null
      Since:
      1.0.1
    • setUniNegation

      public void setUniNegation()
    • isUniNegated

      public boolean isUniNegated()
    • isLastInUnification

      public boolean isLastInUnification()
    • setLastInUnification

      public void setLastInUnification()
    • isUnificationNeutral

      public boolean isUnificationNeutral()
      Determines whether the element should be silently ignored during unification, and simply added.
      Returns:
      True when the element is not included in unifying.
      Since:
      2.5
    • setUnificationNeutral

      public void setUnificationNeutral()
      Sets the element as ignored during unification.
      Since:
      2.5
    • setWhitespaceBefore

      public void setWhitespaceBefore(boolean isWhite)
    • isInsideMarker

      public boolean isInsideMarker()
    • setInsideMarker

      public void setInsideMarker(boolean isInsideMarker)
    • setExceptionSpaceBefore

      public void setExceptionSpaceBefore(boolean isWhite)
      Sets the attribute on the exception that makes matching depend on whether there was a space before the token matching the exception. The same procedure is used for exceptions that are valid for previous or current tokens.
      Parameters:
      isWhite - If true, a space before the exception is required.
    • isWhitespaceBefore

      public boolean isWhitespaceBefore(AnalyzedToken token)
    • getExceptionList

      @NotNull public List<PatternToken> getExceptionList()
      Returns:
      A List of Exceptions. Used for testing.
      Since:
      1.0.0
    • hasCurrentOrNextExceptions

      @Internal public boolean hasCurrentOrNextExceptions()
    • hasExceptionList

      public boolean hasExceptionList()
    • calcFormHints

      @Nullable Set<String> calcFormHints()
      Returns:
      all possible forms that this token pattern can accept, or null if such set is unknown/unbounded. This is used internally for performance optimizations.
    • calcLemmaHints

      @Nullable Set<String> calcLemmaHints()
      Returns:
      all possible lemmas that this token pattern can accept, or null if such set is unknown/unbounded. This is used internally for performance optimizations.
    • calcStringHints

      private Set<String> calcStringHints(boolean inflected)
    • calcOwnPossibleStringValues

      @Nullable private Set<String> calcOwnPossibleStringValues()
    • hasStringThatMustMatch

      boolean hasStringThatMustMatch()
    • toString

      public String toString()
      Overrides:
      toString in class Object