dict.prepare
Class ParserBase

java.lang.Object
  extended by dict.prepare.ParserBase
Direct Known Subclasses:
DictParser, ThesParser, UDDLParser

public abstract class ParserBase
extends java.lang.Object

Base class for parsers of the various dictionary file formats.

Version:
$Revision: 21 $
Author:
Daniel Stoinski

Field Summary
private static int BUFLEN
           
protected  IDictParserHandler m_handler
          Handler for retrieving extracted index keywords.
private  int m_type
          The dictionary type.
 
Constructor Summary
ParserBase(IDictParserHandler aHandler, int aType)
          Initialises the parser for the given handler and type.
 
Method Summary
static ParserBase getInstance(IDictParserHandler aHandler, int aType)
          Returns an instance of the class handling the given type.
protected abstract  boolean processLine(java.lang.String s, int fileno, long from, long to)
          Extracts index keywords from a single dictionary or thesaurus line.
 void read(java.io.InputStream input, int fileno)
          Extracts index keywords from the given stream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BUFLEN

private static final int BUFLEN
See Also:
Constant Field Values

m_handler

protected IDictParserHandler m_handler
Handler for retrieving extracted index keywords.


m_type

private int m_type
The dictionary type. One of the constants defined in DictType.

Constructor Detail

ParserBase

public ParserBase(IDictParserHandler aHandler,
                  int aType)
Initialises the parser for the given handler and type. You must call read() in order to start parsing.

Parameters:
aHandler - the handler for extracted index keywords.
aType - the dictionary type.
Method Detail

getInstance

public static final ParserBase getInstance(IDictParserHandler aHandler,
                                           int aType)
Returns an instance of the class handling the given type.

Parameters:
aHandler - the handler for extracted translations, directly passed to the constructor of the particular class.
aType - the dictionary type, one of the constants DictType.DICTCC, DictType.UDDL, DictType.THES.
Returns:
an instance of DictParser (DICTCC), UDDLParser (UDDL) or ThesParser (THES), depending on aType; null for an unknown type.
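
The dispatch described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation: the constant values, the empty Handler interface and the *Stub classes are hypothetical stand-ins for DictType, IDictParserHandler and the three real subclasses, whose sources are not shown on this page.

```java
// Minimal sketch of a type-based factory; all names below are hypothetical stand-ins.
class ParserFactorySketch {
    // Assumed values; the real constants are defined in DictType.
    static final int DICTCC = 0;
    static final int UDDL = 1;
    static final int THES = 2;

    // Stand-in for IDictParserHandler (its methods are not documented here).
    interface Handler { }

    // Stand-in for ParserBase: stores the handler and the dictionary type.
    static abstract class Parser {
        final Handler handler;
        final int type;

        Parser(Handler handler, int type) {
            this.handler = handler;
            this.type = type;
        }
    }

    static final class DictParserStub extends Parser {
        DictParserStub(Handler h) { super(h, DICTCC); }
    }

    static final class UddlParserStub extends Parser {
        UddlParserStub(Handler h) { super(h, UDDL); }
    }

    static final class ThesParserStub extends Parser {
        ThesParserStub(Handler h) { super(h, THES); }
    }

    // Returns the parser matching the type, or null for an unknown type.
    static Parser getInstance(Handler handler, int type) {
        switch (type) {
            case DICTCC: return new DictParserStub(handler);
            case UDDL:   return new UddlParserStub(handler);
            case THES:   return new ThesParserStub(handler);
            default:     return null;
        }
    }
}
```

Because an unknown type yields null rather than an exception, callers should check the result before calling read().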

processLine

protected abstract boolean processLine(java.lang.String s,
                                       int fileno,
                                       long from,
                                       long to)
Extracts index keywords from a single dictionary or thesaurus line.

Parameters:
s - the line of text from which to extract the index keywords.
from - position of the beginning of the line in the dictionary or thesaurus file.
to - position of the end of the line in the dictionary or thesaurus file.
fileno - index of the dictionary file from which the line s has been extracted.
Returns:
true if processing should continue, false if it was interrupted.

read

public final void read(java.io.InputStream input,
                       int fileno)
                throws java.io.IOException
Extracts index keywords from the given stream. Reads the stream line by line and decodes each line into a string using the encoding suitable for the dictionary type. For each line it calls processLine() to extract the index keywords for the entry contained in that line. The beginning and end positions of each line are computed by counting the bytes read in this method. A line ends at CR (13), LF (10), or a CR LF pair.

Parameters:
input - the stream to read.
fileno - index of the dictionary file to read (the dictionary consists of multiple files)
Throws:
java.io.IOException - on read errors.
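
The byte-counting line splitter described for read() might look roughly like the sketch below. It is not the actual implementation. Assumptions: the LineSink interface is a hypothetical stand-in for the processLine() callback, the charset is passed in explicitly rather than derived from the dictionary type, `to` is taken as the offset just past the last content byte of the line (exclusive), and reading stops as soon as the callback returns false. The sketch reads one byte at a time for clarity; the real class presumably reads in blocks of BUFLEN bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Sketch: split a stream into lines while tracking byte offsets.
class LineOffsetReaderSketch {
    // Hypothetical callback with the same shape as processLine().
    interface LineSink {
        boolean processLine(String s, int fileno, long from, long to);
    }

    static void read(InputStream input, int fileno, Charset charset, LineSink sink)
            throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        long pos = 0;              // number of bytes consumed so far
        long from = 0;             // offset of the first byte of the current line
        boolean lastWasCr = false; // true right after a CR that ended a line
        int b;
        while ((b = input.read()) != -1) {
            if (b == '\n' && lastWasCr) {
                // LF completing a CR LF pair: the line was already emitted at the CR.
                lastWasCr = false;
                pos++;
                from = pos;
                continue;
            }
            lastWasCr = false;
            if (b == '\r' || b == '\n') {
                // End of line: decode the collected bytes and hand them over.
                String s = new String(line.toByteArray(), charset);
                if (!sink.processLine(s, fileno, from, pos)) {
                    return; // callback asked to stop
                }
                line.reset();
                lastWasCr = (b == '\r');
                pos++;
                from = pos;
            } else {
                line.write(b);
                pos++;
            }
        }
        // Emit a final line that has no terminator.
        if (line.size() > 0) {
            sink.processLine(new String(line.toByteArray(), charset), fileno, from, pos);
        }
    }
}
```

Offsets are counted in bytes rather than characters, so they stay valid for seeking in the original file even when the encoding uses more than one byte per character. For real files, wrapping the input in a java.io.BufferedInputStream avoids the cost of unbuffered single-byte reads.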