public abstract class BaseUIMATokenizer extends Tokenizer
Tokenizer which is able to analyze the given input with a UIMA AnalysisEngine.

Nested classes inherited from class AttributeSource: AttributeSource.State

| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.uima.analysis_engine.AnalysisEngine | ae |
| protected org.apache.uima.cas.CAS | cas |
| protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> | iterator |
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Modifier | Constructor and Description |
|---|---|
| protected | BaseUIMATokenizer(AttributeFactory factory, String descriptorPath, Map<String,Object> configurationParameters) |
| Modifier and Type | Method and Description |
|---|---|
| protected void | analyzeInput(): analyzes the tokenizer input using the given analysis engine. |
| protected abstract void | initializeIterator(): initializes the FSIterator which is used to build tokens at each incrementToken() method call. |
| void | reset(): this method is called by a consumer before it begins consumption using TokenStream.incrementToken(). |
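
The three methods in the table suggest a template-method lifecycle: reset() clears state, initializeIterator() runs the analysis and builds the iterator, and each token is then pulled from that iterator. The sketch below models that flow with plain Java stand-ins so it runs without Lucene or UIMA on the classpath. Everything in it is hypothetical (`LifecycleSketch`, `BaseTokenizer`, `WhitespaceTokenizer`, `nextToken`, the whitespace "engine"), and the lazy call to initializeIterator() on the first token request is an assumption about one plausible arrangement, not a statement of how BaseUIMATokenizer is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LifecycleSketch {
    // Stand-in for the base class: NOT the Lucene/UIMA API, only the same method shape.
    static abstract class BaseTokenizer {
        protected String input;               // plays the role of the tokenizer input
        protected List<String> analysis;      // plays the role of the filled CAS
        protected Iterator<String> iterator;  // plays the role of the FSIterator

        void setInput(String text) { this.input = text; }

        // Counterpart of analyzeInput(): run the "engine" and keep its result.
        protected void analyzeInput() {
            String trimmed = input.trim();
            analysis = trimmed.isEmpty()
                ? new ArrayList<>()
                : Arrays.asList(trimmed.split("\\s+"));
        }

        // Counterpart of initializeIterator(): subclasses choose what to iterate over.
        protected abstract void initializeIterator();

        // Counterpart of reset(): drop the old iterator so the next run starts clean.
        public void reset() { iterator = null; }

        // Returns the next token, or null when the stream is exhausted.
        public String nextToken() {
            if (iterator == null) {
                initializeIterator();  // assumption: built lazily after reset()
            }
            return iterator.hasNext() ? iterator.next() : null;
        }
    }

    // Hypothetical concrete subclass: analyzes first, then iterates over all results.
    static class WhitespaceTokenizer extends BaseTokenizer {
        @Override
        protected void initializeIterator() {
            analyzeInput();
            iterator = analysis.iterator();
        }
    }

    // Consumer workflow: hand over input, reset, then pull tokens until exhausted.
    static List<String> tokenize(BaseTokenizer t, String text) {
        t.setInput(text);
        t.reset();
        List<String> out = new ArrayList<>();
        for (String tok = t.nextToken(); tok != null; tok = t.nextToken()) {
            out.add(tok);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(new WhitespaceTokenizer(), "UIMA meets Lucene"));
        // prints [UIMA, meets, Lucene]
    }
}
```

In a real subclass the iterator would presumably come from the `cas` field and the token text from each AnnotationFS; those details are left out here because they depend on the concrete type system in use.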
Methods inherited from class Tokenizer: close, correctOffset, setReader
Methods inherited from class TokenStream: end, incrementToken
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> iterator
protected org.apache.uima.analysis_engine.AnalysisEngine ae
protected org.apache.uima.cas.CAS cas
protected BaseUIMATokenizer(AttributeFactory factory, String descriptorPath, Map<String,Object> configurationParameters)
protected void analyzeInput()
throws org.apache.uima.resource.ResourceInitializationException,
org.apache.uima.analysis_engine.AnalysisEngineProcessException,
IOException
cas will be filled with extracted metadata (UIMA annotations, feature structures)
Throws:
IOException - If there is a low-level I/O error.
org.apache.uima.resource.ResourceInitializationException
org.apache.uima.analysis_engine.AnalysisEngineProcessException

protected abstract void initializeIterator()
throws IOException
Throws:
IOException - If there is a low-level I/O error.

public void reset()
throws IOException
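Description copied from class: TokenStream
This method is called by a consumer before it begins consumption using TokenStream.incrementToken().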
Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh.
If you override this method, always call super.reset(), otherwise
some internal state will not be correctly reset (e.g., Tokenizer will
throw IllegalStateException on further usage).
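
The super.reset() requirement can be illustrated with a small stand-alone model. The sketch below is not Lucene code: `ResetContractSketch`, `BaseStream`, `GoodStream` and `BrokenStream` are hypothetical classes, and the pending/active hand-off is a simplified assumption about why skipping the base-class reset leads to an IllegalStateException.

```java
class ResetContractSketch {
    // Stand-in base class: new input stays "pending" until reset() activates it.
    static class BaseStream {
        private String pending;  // input handed over but not yet activated
        private String active;   // input visible to subclasses only after reset()

        void setInput(String text) {
            pending = text;
            active = null;
        }

        public void reset() {
            active = pending;
            pending = null;
        }

        protected String input() {
            if (active == null) {
                throw new IllegalStateException("base-class reset() was not called");
            }
            return active;
        }
    }

    // Overrides reset() correctly: base state first, then its own state.
    static class GoodStream extends BaseStream {
        private int position;

        @Override
        public void reset() {
            super.reset();
            position = 0;
        }

        char next() { return input().charAt(position++); }
    }

    // Overrides reset() but forgets super.reset(), so the input is never activated.
    static class BrokenStream extends BaseStream {
        private int position;

        @Override
        public void reset() {
            position = 0;
        }

        char next() { return input().charAt(position++); }
    }

    // Returns true if reading from a BrokenStream fails with IllegalStateException.
    static boolean brokenStreamFails() {
        BrokenStream s = new BrokenStream();
        s.setInput("ab");
        s.reset();
        try {
            s.next();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        GoodStream good = new GoodStream();
        good.setInput("ab");
        good.reset();
        System.out.println(good.next());          // prints a
        good.setInput("cd");                      // reuse the same instance
        good.reset();
        System.out.println(good.next());          // prints c
        System.out.println(brokenStreamFails());  // prints true
    }
}
```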
Overrides:
reset in class Tokenizer
Throws:
IOException

Copyright © 2000–2015 The Apache Software Foundation. All rights reserved.