Class SpanishWordTokenizer

java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.tokenizers.es.SpanishWordTokenizer
All Implemented Interfaces:
Tokenizer

public class SpanishWordTokenizer extends WordTokenizer
Tokenizes a sentence into words. Punctuation and whitespace characters each get their own token.
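The behaviour described above can be illustrated with a minimal standalone sketch. It does not use LanguageTool and is not this class's implementation. The class name TokenizeSketch and its TOKEN pattern are hypothetical and chosen only for illustration. The sketch shows the general idea that words become tokens while every punctuation or whitespace character becomes a token of its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizeSketch {
    // Assumption: a "word" is a run of letters or digits; any other
    // single character (punctuation, whitespace) is its own token.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|.", Pattern.DOTALL);

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        // Each successive match is either a whole word or one non-word character.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print each token in brackets so whitespace tokens stay visible.
        for (String token : tokenize("Hola, mundo.")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

The real class presumably treats some cases differently, as its DECIMAL_POINT, DECIMAL_COMMA, ORDINAL_POINT and SOFT_HYPHEN fields suggest, so the token boundaries it produces for numbers, ordinals and soft hyphens may not match this sketch.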
  • Field Details

    • wordCharacters

      private static final String wordCharacters
    • tokenizerPattern

      private static final Pattern tokenizerPattern
    • DECIMAL_POINT

      private static final Pattern DECIMAL_POINT
    • DECIMAL_COMMA

      private static final Pattern DECIMAL_COMMA
    • ORDINAL_POINT

      private static final Pattern ORDINAL_POINT
    • PATTERN_1

      private static final Pattern PATTERN_1
    • PATTERN_2

      private static final Pattern PATTERN_2
    • PATTERN_3

      private static final Pattern PATTERN_3
    • SOFT_HYPHEN

      private static final Pattern SOFT_HYPHEN
  • Constructor Details

    • SpanishWordTokenizer

      public SpanishWordTokenizer()
  • Method Details