public final class LowerCaseTokenizer extends LetterTokenizer
Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces.
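The note above can be illustrated with a minimal, JDK-only sketch. It is not the Lucene implementation: `LowerCaseLetterSplit` and `tokenize` are hypothetical names, and the sketch assumes that a token is a maximal run of characters accepted by `Character.isLetter`, each lower-cased as it is collected. Offsets, attributes, and buffer limits are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

public class LowerCaseLetterSplit {

    // Hypothetical helper, not part of Lucene: splits text at non-letters
    // and lower-cases each letter while it is being collected.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                // Assumption: "letter" means Character.isLetter(c).
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Space-separated European text splits into words.
        System.out.println(tokenize("Hello, World! It's 2012."));
        // Four Chinese characters (written as Unicode escapes) with no
        // separators between them: every character is a letter, so the
        // whole run stays one token.
        System.out.println(tokenize("\u6211\u7231\u5317\u4eac"));
    }
}
```

Under these assumptions the first call yields the tokens hello, world, it, s, while the second returns the entire phrase as a single token, which is the failure mode the note describes for text without word separators.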
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State

| Constructor | Description |
|---|---|
| LowerCaseTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory. |
| LowerCaseTokenizer(AttributeSource source, Reader in) | Construct a new LowerCaseTokenizer using a given AttributeSource. |
| LowerCaseTokenizer(Reader in) | Construct a new LowerCaseTokenizer. |

| Modifier and Type | Method | Description |
|---|---|---|
| protected char | normalize(char c) | Converts char to lower case using Character.toLowerCase(char). |
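
The role of `normalize(char c)` as an overridable per-character hook can be sketched as follows. This is an illustration only: `IdentityNormalizer`, `LowerCaseNormalizer`, and `apply` are hypothetical names, and the sketch assumes the base class's default `normalize` returns its argument unchanged. It models just the hook, not tokenization.

```java
public class NormalizeHookSketch {

    // Hypothetical stand-in for a base class that exposes a normalize hook.
    static class IdentityNormalizer {
        // Assumed default behavior: return the character unchanged.
        protected char normalize(char c) {
            return c;
        }

        // Passes every character of the input through the hook.
        String apply(String text) {
            StringBuilder out = new StringBuilder(text.length());
            for (int i = 0; i < text.length(); i++) {
                out.append(normalize(text.charAt(i)));
            }
            return out.toString();
        }
    }

    // Overrides the hook the way this page describes: Character.toLowerCase(char).
    static class LowerCaseNormalizer extends IdentityNormalizer {
        @Override
        protected char normalize(char c) {
            return Character.toLowerCase(c);
        }
    }

    public static void main(String[] args) {
        System.out.println(new IdentityNormalizer().apply("Lucene"));
        System.out.println(new LowerCaseNormalizer().apply("Lucene"));
    }
}
```

Because the conversion happens per character inside the same loop that reads the input, a design like this needs no second pass over the text to lower-case it.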
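
Methods inherited from class LetterTokenizer: isTokenChar
Methods inherited from class CharTokenizer: end, incrementToken, next, next, reset
Methods inherited from class Tokenizer: close, correctOffset
Methods inherited from class TokenStream: getOnlyUseNewAPI, reset, setOnlyUseNewAPI
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString

public LowerCaseTokenizer(Reader in)
Construct a new LowerCaseTokenizer.

public LowerCaseTokenizer(AttributeSource source, Reader in)
Construct a new LowerCaseTokenizer using a given AttributeSource.

public LowerCaseTokenizer(AttributeSource.AttributeFactory factory, Reader in)
Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory.

protected char normalize(char c)
Converts char to lower case using Character.toLowerCase(char).
Overrides: normalize in class CharTokenizer

Copyright © 2000-2012 Apache Software Foundation. All Rights Reserved.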