Dict.cc dictionary file parser
For a list of all members of this type, see DictParser Members.
Public Class DictParser
Public static (Shared in Visual Basic) members of this type are
safe for multithreaded operations. Instance members are not guaranteed to be
thread-safe.
Extracts keywords, and the locations of the explanations for these keywords, from a dictionary file obtained from dict.cc. The dictionary file is expected to have the following format:
- The dictionary file is a text file in the encoding returned by ParserBase.getEncoding(); for dict.cc this is currently cp1252.
- One translation is located in one line of this text file.
- Each line consists of two parts separated by INDEX_SEPARATOR (::). The left part contains text in the source language (for example, English in the English-German dictionary); the right part contains the translation in the target language.
- The class splits the left part into keywords. It uses SEPARATORS (spaces, tabs, semicolons, slashes, dots, commas etc.) to extract single keywords from the whole sentence. Text passages in the left part located between OPENING and CLOSING (usually text in brackets, or HTML sequences that run from an ampersand to a semicolon) are ignored.
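The splitting rules above can be sketched in Python as follows. This is an illustration only, not the DictParser implementation: the function name extract_keywords, the exact separator characters, and the list of opening/closing pairs are assumptions, since the actual values of SEPARATORS, OPENING and CLOSING are not listed here.

```python
import re

INDEX_SEPARATOR = "::"
# Assumed separator characters; the real SEPARATORS constant may differ.
SEPARATORS = " \t;/.,"
# Assumed OPENING/CLOSING pairs: brackets, plus "&" ... ";" for HTML sequences.
IGNORED_SPANS = [("(", ")"), ("[", "]"), ("{", "}"), ("<", ">"), ("&", ";")]


def extract_keywords(line):
    """Return the keywords found in the left (source-language) part of a line."""
    left, sep, _right = line.partition(INDEX_SEPARATOR)
    if not sep:
        # No index separator: not a translation line, so nothing to index.
        return []
    # Blank out every passage enclosed by an opening/closing pair.
    for opening, closing in IGNORED_SPANS:
        pattern = (re.escape(opening)
                   + "[^" + re.escape(closing) + "]*"
                   + re.escape(closing))
        left = re.sub(pattern, " ", left)
    # Split what remains on runs of separator characters, dropping empties.
    return [w for w in re.split("[" + re.escape(SEPARATORS) + "]+", left) if w]
```

For a line such as `to go (away) [coll.]; leave :: weggehen`, this sketch yields the keywords `to`, `go` and `leave`. Nested brackets are not handled, and a bare ampersand followed later by a semicolon would be treated as an ignored passage.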
For each extracted keyword, IDictParseHandler.addIndex() is called with the keyword and the start and end positions (from and to, in bytes) of the whole line in the dictionary file. These positions can then be used, for example by the class DictEntryRef, to read the whole line from the dictionary file and return it as a string.
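The byte-position bookkeeping can be sketched as below. The names parse, read_entry and add_index are hypothetical stand-ins for DictParser, DictEntryRef and IDictParseHandler.addIndex(); the sketch also assumes that the end position is exclusive and includes the line terminator, which is not stated here. Keyword extraction is reduced to a plain whitespace split to keep the example short.

```python
import io

ENCODING = "cp1252"      # the encoding named above for dict.cc files
INDEX_SEPARATOR = "::"


def parse(stream, add_index):
    """Call add_index(keyword, start, end) for each keyword of each line.

    stream must be opened in binary mode so that positions are byte offsets.
    """
    offset = 0
    for raw in stream:
        start, end = offset, offset + len(raw)
        offset = end
        text = raw.decode(ENCODING, errors="replace")
        left, sep, _right = text.partition(INDEX_SEPARATOR)
        if not sep:
            continue  # skip lines without the index separator
        # Simplified stand-in for the real keyword splitting.
        for keyword in left.split():
            add_index(keyword, start, end)


def read_entry(stream, start, end):
    """Read the line stored at bytes [start, end) and return it as a string."""
    stream.seek(start)
    raw = stream.read(end - start)
    return raw.decode(ENCODING, errors="replace").rstrip("\r\n")


# Usage with an in-memory file; a real file would be opened with mode "rb".
data = "hello world :: hallo Welt\ngrün :: green\n".encode(ENCODING)
stream = io.BytesIO(data)
index = []
parse(stream, lambda keyword, start, end: index.append((keyword, start, end)))
```

Because every keyword of a line shares the same pair of positions, an index built this way only needs to store two integers per keyword to recover the full entry later.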
Assembly: dict (in dict.exe)
DictParser Members | dict Namespace | DictImport | DictEntryRef