This directory contains all the Cyrillic-related files that accompany David J. Birnbaum, Mavis Cournane, and Peter Flynn's "Using the TEI Writing System Declaration (WSD)," Computers and the Humanities 00: 1-9, 1998. The Greek- and Hebrew-related files are available at http://imbolc.ucc.ie/~pflynn/wsd/.
The present system replaces SDATA character entities in the original source SGML file with the UCS and AFII numerical values taken from the WSD, rather than with the replacement text strings in the SDATA entity files associated with the source document. The replacements are numerical identifiers, rather than raw glyphs or characters, for two reasons: there is no readily available system-independent way to render UCS-2 characters or AFII glyphs, and numerical identifiers provide a system-independent way to verify that the WSD is being processed and used properly. An eventual production system would need to map these identifiers to actual UCS-2 characters or AFII glyphs in a way that causes them to be rendered properly.
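The substitution described above can be sketched in Python. This is only an illustration, not the Omnimark code in this directory: the entity names and code points are standard ISO Cyrillic entities and their UCS values, but the hard-coded table, the `[UCS:XXXX]` identifier format, and the function names are assumptions made for the example. A real system would take the values from the WSD.

```python
import re

# Hypothetical excerpt of an entity-to-UCS table; a real system would read
# these values from the WSD rather than hard-coding them.
UCS_VALUES = {
    "acy": "0430",  # CYRILLIC SMALL LETTER A
    "bcy": "0431",  # CYRILLIC SMALL LETTER BE
    "vcy": "0432",  # CYRILLIC SMALL LETTER VE
}

# Matches a general entity reference such as &acy;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def replace_entities(text, table=UCS_VALUES):
    """Replace each known entity reference with a numerical identifier."""
    def substitute(match):
        name = match.group(1)
        if name not in table:
            return match.group(0)  # leave unknown references untouched
        return "[UCS:" + table[name] + "]"  # assumed identifier format
    return ENTITY_REF.sub(substitute, text)


def render_identifiers(text):
    """Map [UCS:XXXX] identifiers to actual characters (the production step)."""
    return re.sub(
        r"\[UCS:([0-9A-Fa-f]{4})\]",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
```

Here `replace_entities` corresponds to what the present system does (emit verifiable identifiers), while `render_identifiers` stands in for the later mapping that a production system would need.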
The system uses four basic Omnimark scripts (*.xom) to generate four output files (*.out). Two of these scripts create new SDATA entity set files on disk, as temporary files (*.tmp). For those two, the source files are then rewritten as new SGML (*.sgml) files that use the new SDATA entity sets, after which a special null Omnimark script (null.xom) is used to parse the new source files against the temporary SDATA entity files. Omnimark errors are written to log (*.log) files; there should be none. In tabular form, beginning with the two memory-based scripts (*_memory.xom):
| File Type | Character (ucs) | Glyph (afii) |
|---|---|---|
| Script (*.xom) | chsl_ucs_memory.xom | chsl_afii_memory.xom |
| Output (*.out) | chsl_ucs_memory.out | chsl_afii_memory.out |
| Error Log (*.log) | chsl_ucs_memory.log | chsl_afii_memory.log |

Files for the two scripts that create SDATA entity set files on disk (*_disk.xom):

| File Type | Character (ucs) | Glyph (afii) |
|---|---|---|
| Script (*.xom) | chsl_ucs_disk.xom | chsl_afii_disk.xom |
| Temporary SDATA Entity File (*.tmp) | chsl_ucs_disk.tmp | chsl_afii_disk.tmp |
| Temporary SGML Source File (*.sgml) | chsl_ucs_disk.sgml | chsl_afii_disk.sgml |
| Temporary File Generation Script (*.xom) | use_ucs.xom | use_afii.xom |
| Reparsing Script (*.xom) | null.xom | null.xom |
| Output (*.out) | chsl_ucs_disk.out | chsl_afii_disk.out |
| Error Log (*.log) | chsl_ucs_disk.log | chsl_afii_disk.log |
Temporary files, which would normally be deleted at the end of the script, are retained here for examination.
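Since the log files listed in the tables should contain no errors, a run can be checked mechanically. The following Python sketch assumes that an error-free run leaves each log file empty (or absent); that assumption, and the function name, are not taken from the scripts themselves.

```python
from pathlib import Path

# Log file names as listed in the tables above.
LOG_NAMES = [
    "chsl_ucs_memory.log",
    "chsl_afii_memory.log",
    "chsl_ucs_disk.log",
    "chsl_afii_disk.log",
]


def logs_with_errors(directory, names=LOG_NAMES):
    """Return the names of log files that exist and contain any text."""
    flagged = []
    for name in names:
        path = Path(directory) / name
        # Assumption: a clean run produces an empty (or missing) log file.
        if path.is_file() and path.read_text(errors="replace").strip():
            flagged.append(name)
    return flagged
```

Calling `logs_with_errors(".")` in this directory would be expected to return an empty list if all four scripts ran cleanly.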
The scripts require the full TEI P3 distribution, including a modified SGML declaration (here sgmldecl.tei) and the files added after the original release to support formal public identifiers (FPIs). TEI support for FPIs requires additional modification to the TEI distribution, as described in my separate Notes on TEI Support for Formal Public Identifiers (FPIs).
The present scripts are hard-coded for specific filenames. A general production system would generate filenames dynamically from the input files (SGML source, SDATA entities, WSDs).
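As a rough illustration of dynamic filename generation, the following Python sketch derives the names used by one disk-based run from the name of the SGML source file. The naming pattern follows the files in this directory; the source filename `chsl.sgml`, the role labels, and the function name are hypothetical.

```python
from pathlib import Path

# Suffixes taken from the file table for the disk-based scripts.
DISK_SUFFIXES = {
    "entities": ".tmp",   # temporary SDATA entity file
    "source": ".sgml",    # temporary SGML source file
    "output": ".out",     # output file
    "log": ".log",        # error log
}


def derived_names(source, target):
    """Build the file names for one disk-based run from the source name.

    source: path of the SGML source file (hypothetical, e.g. "chsl.sgml")
    target: "ucs" for characters or "afii" for glyphs
    """
    if target not in ("ucs", "afii"):
        raise ValueError("target must be 'ucs' or 'afii'")
    stem = f"{Path(source).stem}_{target}_disk"
    return {role: stem + suffix for role, suffix in DISK_SUFFIXES.items()}
```

For example, `derived_names("chsl.sgml", "ucs")["entities"]` yields `chsl_ucs_disk.tmp`, matching the name that is currently hard-coded.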
Last modified 1998-12-02 by David J. Birnbaum (djb@clover.slavic.pitt.edu)