Notes on TEI Support for Formal Public Identifiers (FPIs)

(From my posting to the TEI-Tech distribution list, 1998-11-27)

The TEI FPI support introduced with the addition of teifpi2.dtd and teifpi2.ent to the P3 distribution does not work properly with Omnimark's omle v3.1 on my OS/2 system. I'm guessing that the problem is not unique to OS/2, and perhaps not unique to this version of Omnimark, but it is Omnimark-related, since it involves the structure of Omnimark library files, which are used by Omnimark in place of the more general SGML Open catalog files.

Here's the problem: A TEI file that parses correctly with nsgmls fails to parse with omle, generating an error message about the inability to resolve an entity declaration mapped to a system identifier, public identifier, or both. The fix (steps #1, #2, and #4 below) in turn causes nsgmls to fail to parse the file, revealing another problem (step #3, below).

Here are the details of the solution:

  1. The revised TEI P3 distribution that supports FPIs declares several entities with both public and system identifiers in teifpi2.dtd and teifpi2.ent. Edit these to comment out the system identifiers. The behavior of entities declared with both public and system identifiers is undefined in SGML (DeRose, SGML FAQ Book, section 4.9), which means that using both together is asking for trouble. In particular, if Omnimark finds a system identifier, it ignores the public identifier and looks for the filename specified in the system identifier in its current working directory. If you launch Omnimark from a directory that does not contain the TEI distribution, it will fail to find the referenced files. Removing the system identifiers from the entity declarations tells Omnimark to consult its library to resolve the references.
  2. The revised TEI distribution declares only a system identifier for wdgis2.ent in teiwsd2.dtd. Remove this system identifier and insert a public identifier in its place. This appears to be an oversight that occurred when the TEI DTDs were modified to support FPIs in the first place.
  3. The preceding two steps create a problem for nsgmls that was not present as long as the system identifiers were declared explicitly. Apparently through oversight, the FPIs that teifpi2.ent uses for several terminological database entities differ from those in teifpi.cat. If you edit teifpi.cat to provide your own path to the TEI files and then incorporate teifpi.cat into your own SGML Open catalog, the catalog will not contain some of the FPIs used in teifpi2.ent. Specifically, teifpi2.ent uses the word "Data" in several places where teifpi.cat uses the word "Databases," as in:
    
    (from teifpi.cat)
    PUBLIC "-//TEI P3//ELEMENTS Base Element Set
            for Terminological Databases 1994-05//EN"
           "&DTDPATH;teiterm2.dtd"
    
    (from teifpi2.ent, with the system identifier commented out)
    <!ENTITY % TEI.terminology.ent
        PUBLIC '-//TEI P3//ENTITIES Element Classes for Terminological Data
    1994-05//EN'
               --'teiterm2.ent'-- >
    
    
    nsgmls apparently fell back on the system identifier, and knew where to look for it, so the original absence of a correct mapping for the FPI did not generate error messages. Stripping out the system identifiers from the TEI DTD files, which is necessary to enable the Omnimark library to function properly, forces nsgmls to rely on the SGML Open catalog. To fix this problem, modify your catalog to support the FPIs actually used in teifpi2.ent.
  4. Omnimark library files cannot be generated automatically from SGML Open catalog files merely by adding the line "LIBRARY" at the head and deleting the string "PUBLIC " at the start of each public identifier. SGML Open catalog files treat EOLs inside a string as spaces, while Omnimark library files do not, so a public identifier that is broken across lines in the catalog will not match in the library. To convert the TEI SGML Open catalog to an Omnimark library, it is therefore also necessary to combine all public identifiers that are broken across multiple lines into single lines.
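
The conversion in step 4 can be sketched in a few lines of Python. This is only a rough illustration, not a tested converter: it assumes that every catalog entry has the form PUBLIC "fpi" "path" with double-quoted strings, as in the teifpi.cat excerpt above, and it takes the library layout (a "LIBRARY" line followed by one identifier/path pair per line) from the description in step 4 rather than from the Omnimark documentation. The function name catalog_to_library is made up for this example.

```python
import re

# One catalog entry: the PUBLIC keyword, a double-quoted public identifier
# (possibly broken across lines), and a double-quoted system identifier.
# Assumption: entries use double quotes, as in the teifpi.cat excerpt.
ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def catalog_to_library(catalog_text):
    """Return library text built from the PUBLIC entries of a catalog."""
    lines = ["LIBRARY"]
    for fpi, path in ENTRY.findall(catalog_text):
        # Collapse EOLs and runs of spaces so each identifier sits on one line.
        fpi = " ".join(fpi.split())
        lines.append(f'"{fpi}" "{path}"')
    return "\n".join(lines) + "\n"
```

Anything in the catalog that is not a PUBLIC entry is dropped by this sketch, and a PUBLIC entry inside a catalog comment would be picked up by mistake, so the output should be checked by hand before use.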

Cautionary Note: The modifications described above, especially in #1 and #2, fix the problem on my system, which relies exclusively on public identifiers. Users who rely on system identifiers, and do not use public identifiers, will probably find that these modifications introduce new problems into their configurations.
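
The mismatch described in step 3 can also be found mechanically, before a parse fails. The following Python sketch lists the public identifiers that appear in a DTD or entity file but have no entry in a catalog. It is an illustration under assumptions: it looks only for the keyword PUBLIC followed by a quoted string, it compares identifiers after collapsing line breaks and repeated spaces, and the function names are invented for this example.

```python
import re

# A public identifier: the keyword PUBLIC followed by a literal in either
# double or single quotes (both styles occur in the excerpts in step 3).
PUBLIC_ID = re.compile(r"""PUBLIC\s+(?:"([^"]*)"|'([^']*)')""")


def normalize(fpi):
    """Collapse line breaks and runs of spaces into single spaces."""
    return " ".join(fpi.split())


def public_ids(text):
    """Return the set of normalized public identifiers found in text."""
    return {normalize(dq or sq) for dq, sq in PUBLIC_ID.findall(text)}


def missing_from_catalog(declaration_text, catalog_text):
    """List identifiers used in the declarations but absent from the catalog."""
    return sorted(public_ids(declaration_text) - public_ids(catalog_text))
```

Reading teifpi2.ent and the catalog into strings and passing them to missing_from_catalog should list the "Terminological Data" identifiers that need catalog entries. The sketch does not skip SGML comments, so a commented-out declaration would be reported too.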


Last modified 1998-12-02 by David J. Birnbaum (djb@clover.slavic.pitt.edu)