Notes on TEI Support for Formal Public Identifiers (FPIs)
(From my posting to the TEI-Tech distribution list, 1998-11-27)
The TEI FPI support introduced with the addition of teifpi2.dtd and
teifpi2.ent to the P3 distribution does not work properly with
Omnimark's omle v3.1 on my OS/2 system. I'm guessing that the problem
is not unique to OS/2, and perhaps not unique to this version of
Omnimark, but it is Omnimark-related, since it involves the structure
of Omnimark library files, which are used by Omnimark in place of the
more general SGML Open catalog files.
Here's the problem: A TEI file that parses correctly with nsgmls
fails to parse with omle, generating an error message about the
inability to resolve an entity declaration mapped to a system
identifier, public identifier, or both. The fix (steps #1, #2, and #4
below) causes nsgmls to fail to parse the file, revealing another
problem (step #3, below).
Here are the details of the solution:
1. The revised TEI P3 distribution that supports FPIs declares
several entities with both public and system identifiers in
teifpi2.dtd and teifpi2.ent. Edit these to comment out the system
identifiers. The behavior of entities declared with both public and
system identifiers is undefined in SGML (DeRose, SGML FAQ
Book, section 4.9), which means that using both together is
asking for trouble. In particular, Omnimark ignores the public
identifier if it finds a system identifier, and it looks for the
filename specified in that system identifier in its current working
directory. If you launch Omnimark from a directory that does not
contain the TEI distribution, it will fail to find the referenced
files. Removing the system identifiers from the entity declarations
tells Omnimark to consult its library to resolve the references.
2. The revised TEI distribution declares only a system identifier for
wdgis2.ent in teiwsd2.dtd. Remove this system identifier and insert a
public identifier in its place. This appears to be an oversight that
occurred when the TEI DTDs were modified to support FPIs in the first
place.
3. The preceding two steps create a problem for nsgmls, which was not
present as long as the system identifiers were declared explicitly.
Apparently through oversight, teifpi2.ent uses different FPIs for
several terminological database entities than teifpi.cat. If you edit
teifpi.cat to provide your own path to the TEI files and then
incorporate teifpi.cat into your own SGML Open catalog, it will not
contain some of the FPIs used in teifpi2.ent. Specifically, teifpi2.ent
uses the word "Data" in several places where teifpi.cat uses the word
"Databases," as in:
(from teifpi.cat)
PUBLIC "-//TEI P3//ELEMENTS Base Element Set
for Terminological Databases 1994-05//EN"
"&DTDPATH;teiterm2.dtd"
(from teifpi2.ent, with the system identifier commented out)
<!ENTITY % TEI.terminology.ent
PUBLIC '-//TEI P3//ENTITIES Element Classes for Terminological Data
1994-05//EN'
--'teiterm2.ent'-- >
Before the system identifiers were removed, nsgmls apparently fell back
on them and knew where to find the files, so the absence of a correct
catalog mapping for the FPI did not generate error messages. Stripping out the system
identifiers from the TEI DTD files, which is necessary to enable the
Omnimark library to function properly, forces nsgmls to rely on the
SGML Open catalog. To fix this problem, modify your catalog to support
the FPIs actually used in teifpi2.ent.
4. Omnimark library files cannot be generated from SGML Open catalog
files simply by adding the line "LIBRARY" at the head and deleting the
string "PUBLIC " at the start of each public identifier. The two
formats treat line breaks inside strings differently: SGML Open
catalog files treat EOLs as spaces, while Omnimark library files do
not. To convert the TEI SGML Open catalog to an Omnimark library, it
is therefore also necessary to join every public identifier that is
broken across multiple lines into a single line.
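The conversion described in step #4 can be sketched as a short script.
This is only a rough Python sketch, not a tested tool. It assumes that
the Omnimark library format is exactly what the description above
implies (a "LIBRARY" line followed by pairs of quoted public and system
identifiers), and that every catalog entry of interest is a PUBLIC entry
with both identifiers quoted. The function name catalog_to_library is
hypothetical. Other catalog entry types and comments are ignored.

```python
import re

# One PUBLIC entry: keyword, quoted public identifier (possibly spanning
# several lines), then a quoted system identifier.
PUBLIC_ENTRY = re.compile(
    r"""\bPUBLIC\s+(["'])(?P<pubid>.*?)\1\s+(["'])(?P<sysid>.*?)\3""",
    re.DOTALL,
)


def catalog_to_library(catalog_text):
    """Return library text built from the PUBLIC entries of a catalog.

    Assumption: the library format is "LIBRARY" on the first line, then
    one '"public id" "system id"' pair per line.
    """
    lines = ["LIBRARY"]
    for entry in PUBLIC_ENTRY.finditer(catalog_text):
        # Collapse EOLs and runs of spaces so the identifier fits on one line.
        pubid = " ".join(entry.group("pubid").split())
        lines.append('"{}" "{}"'.format(pubid, entry.group("sysid")))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The teifpi.cat entry quoted in step #3, broken across lines.
    sample = '''PUBLIC "-//TEI P3//ELEMENTS Base Element Set
for Terminological Databases 1994-05//EN"
"&DTDPATH;teiterm2.dtd"
'''
    print(catalog_to_library(sample), end="")
```

The system identifier is copied through unchanged, so a placeholder
such as &DTDPATH; still has to be replaced with a real path by hand.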
Cautionary Note: The modifications described above,
especially in steps #1 and #2, fix the problem on my system, which relies
exclusively on public identifiers. Users who rely on system
identifiers, and do not use public identifiers, will probably find
that these modifications introduce new problems into their
configurations.
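Because steps #1 and #2 leave nsgmls dependent on the catalog alone,
it may help to check for mismatches of the kind described in step #3
before parsing. The following is a rough Python sketch under stated
assumptions: it treats any quoted literal following the keyword PUBLIC
as a public identifier, collapses whitespace before comparing, and does
not skip SGML comments. The names public_ids and unmapped_fpis are
hypothetical.

```python
import re

# Any quoted literal that follows the keyword PUBLIC, in a DTD or a catalog.
PUBID = re.compile(r"""\bPUBLIC\s+(["'])(.*?)\1""", re.DOTALL)


def public_ids(text):
    """Collect public identifiers, with whitespace runs collapsed."""
    return {" ".join(m.group(2).split()) for m in PUBID.finditer(text)}


def unmapped_fpis(declarations_text, catalog_text):
    """List FPIs declared in the DTD text but absent from the catalog."""
    return sorted(public_ids(declarations_text) - public_ids(catalog_text))


if __name__ == "__main__":
    # The two fragments quoted in step #3.
    declarations = """<!ENTITY % TEI.terminology.ent
PUBLIC '-//TEI P3//ENTITIES Element Classes for Terminological Data
1994-05//EN'
--'teiterm2.ent'-- >"""
    catalog = '''PUBLIC "-//TEI P3//ELEMENTS Base Element Set
for Terminological Databases 1994-05//EN"
"&DTDPATH;teiterm2.dtd"
'''
    for fpi in unmapped_fpis(declarations, catalog):
        print("not in catalog:", fpi)
```

In practice the two strings would be read from teifpi2.ent (or another
TEI DTD file) and from your own catalog file.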
Last modified 1998-12-02 by David J. Birnbaum (djb@clover.slavic.pitt.edu)