Report on TEI Workshop in Blagoevgrad

1st Conference for Computer-Supported Processing of Medieval Slavic Manuscripts

26th and 28th of July, 1995


David Birnbaum

Document No: TEI EX R 4

Note: This report is based on a posting to TEITEACH on 8 August 1995.

I began the conference with a keynote address that is old hat to all of you: we need multiple-use, platform-independent files, and we do not need to standardize hardware, operating systems, application software, or even fonts, but we do need to standardize document file formats. And we need to do so in a flexible way, so that different formats (which serve different scholarly purposes) would simultaneously be different, yet standardized. SGML, as a markup METAlanguage, provides a mechanism for this. Even on the character coding level, we do not need to standardize our character sets as long as we standardize our way of documenting them (through something like the Writing System Declaration mechanism). I considered it important to emphasize these issues at the very beginning of the conference, so that all discussions of individual projects and applications would be conducted against a backdrop not only of "what can it do?" but also of "couldn't it do all this and still be portable?"
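The point about documenting character sets rather than standardizing them can be illustrated with a minimal sketch. It is not the Writing System Declaration mechanism itself: the two project tables, their byte values, and the `transcode` helper below are all hypothetical, and the descriptive names are simply borrowed from Unicode character names. The sketch assumes each project documents what its private codes mean, so that text can be moved between the two encodings by way of the shared descriptions.

```python
from typing import Dict

# Hypothetical documentation tables: each project records what its own
# private byte codes mean, using shared descriptive names.
PROJECT_A: Dict[int, str] = {
    0x80: "CYRILLIC SMALL LETTER YAT",
    0x81: "CYRILLIC SMALL LETTER BIG YUS",
}
PROJECT_B: Dict[int, str] = {
    0xE0: "CYRILLIC SMALL LETTER BIG YUS",
    0xE1: "CYRILLIC SMALL LETTER YAT",
}


def transcode(data: bytes, source: Dict[int, str], target: Dict[int, str]) -> bytes:
    """Convert between two private encodings via their documented names."""
    # Invert the target table so a name can be looked up to find its code.
    name_to_code = {name: code for code, name in target.items()}
    out = bytearray()
    for byte in data:
        if byte in source:
            name = source[byte]
            if name not in name_to_code:
                # The two inventories differ: report it rather than guess.
                raise ValueError(f"target has no character documented as {name!r}")
            out.append(name_to_code[name])
        else:
            # Assumption: bytes absent from the table (e.g. ASCII) are shared.
            out.append(byte)
    return bytes(out)


converted = transcode(b"a\x80\x81", PROJECT_A, PROJECT_B)
```

Neither project has to change its local encoding; only the documentation has to follow a common form. A character present in one inventory but missing from the other is reported as an error instead of being silently lost.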

Many of the participants came to the conference already using TUSTEP, Collate, KLEIO, or other text processing software, usually with proprietary character encoding systems (which differ from one another not just in bit combinations, but also in basic inventory). Many believe that one hardware platform is better than any other, and that the way to standardize operations is for everyone to use the same fonts (often commercial, platform-specific creations). Most participants cannot afford to give up the productivity of working in their current environments, and need to be told that they can use SGML without giving up the programs mentioned above. I felt that the best opportunity for persuading them to consider SGML was to emphasize that it was not an alternative that would cut them off from their present systems, and that it would allow them to retain the latter while simultaneously providing people on different systems with access to their materials.

Most participants arrived with the assumption that you need to encode your text for a specific purpose. Our team emphasized (and anticipated) Jeffrey Triggs's recent observation ("Varieties of Electronic Experience," TEXT Technology, vol. 5, no. 3, Autumn, 1995, pp. 184-85) [1] that "... all processing should be post-processing. This can take any number of forms. It may be an XWindow text display system, such as Lector, a translation into nroff for display on ASCII terminals, or translation into troff, or TeX, or Postscript for printing, or a translation into HTML for distribution through the Web. It could also involve processing into word lists, filtering into a skeletal structure, or specialized tagging for grammar. The important thing is that none of these used forms is identical with the stored form." (Note: Information about TEXT Technology is available from Eric Johnson at JohnsonE@columbia.dsu.edu.)
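The principle that the stored form is distinct from every used form can be sketched as follows. This is only an illustration under stated assumptions: a small well-formed XML string stands in for an SGML document, the element names (`text`, `head`, `p`, `name`) and the sample content are invented, and the three functions are hypothetical post-processors rather than any of the tools named in the quotation.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# Stored form: descriptive markup only, with no commitment to any output.
# (XML is used here as a stand-in for SGML; the element names are invented.)
STORED = (
    "<text>"
    "<head>Sample title</head>"
    "<p>First line of <name>Sample</name> text.</p>"
    "<p>Second line.</p>"
    "</text>"
)


def to_plain(root):
    """Used form 1: plain text, one line per top-level element."""
    return "\n".join("".join(el.itertext()) for el in root)


def to_html(root):
    """Used form 2: HTML for distribution through the Web."""
    parts = []
    for el in root:
        content = escape("".join(el.itertext()))
        tag = "h1" if el.tag == "head" else "p"
        parts.append(f"<{tag}>{content}</{tag}>")
    return "\n".join(parts)


def word_list(root):
    """Used form 3: a sorted list of distinct lower-cased word forms."""
    return sorted(set(re.findall(r"\w+", to_plain(root).lower())))


root = ET.fromstring(STORED)
plain = to_plain(root)
html_page = to_html(root)
words = word_list(root)
```

All three results are derived from the same stored document, and none of them is written back into it, so adding a fourth output later requires a new post-processor but no re-encoding of the text.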

Almost all participants seemed very excited about a system that would allow them to create portable documents while retaining the ability to continue using their local platforms and applications. As a follow-up to the conference, I will be establishing an ftp archive for early Slavic electronic texts at ftp.pitt.edu in dept/slavic, and Ralph Cleminson (Central European University, Budapest) will be setting up a web page devoted to the computer processing of early Slavic manuscripts (http://www.ceu.hu/medstud/obsht.htm). Several participants have offered to make texts available for ftp distribution. We are accepting texts in absolutely any format, but we hope, of course, that with time SGML files will come to dominate.

We coordinated the conference largely by email. Winfried, Harry, and I met in Sofia Sunday afternoon (the TEI workshop was Wednesday) and continued our preparation on the bus to Blagoevgrad, and Nick met us in Blagoevgrad Monday evening. We spent almost five hours Tuesday night rehearsing our workshop by running through our individual segments (most of the assignments had been determined over email, with some fine tuning in person) and reviewing one another's slides. This rehearsal was absolutely crucial, not just because we learned from one another's comments, but also because it helped us ensure that our individual segments were organized to produce a cohesive whole. As my colleagues have already remarked, it would have been terribly difficult to maintain sufficient energy and interest had there been fewer than four of us.

Given that we had only five hours to work with, our goal was not to bring people to the level of being able to produce SGML documents, but simply to help them understand both what SGML can do and what an SGML document looks like. We all understand the relationship between data and processing, but people who are not familiar with informatics tend to think that programs are everything, and I was often asked "can your program do this / that / the other?" My constant response was that I don't have a program, I have a data architecture. I think this fundamental misunderstanding is terribly important and also terribly difficult to overcome, and that it needs to be addressed very explicitly, very simply, and repeatedly in the "what is SGML" part of the workshop. In no other part of our curriculum was there such a disparity between our assumptions and the assumptions of our audience.

The meeting room had two overhead projectors, one of which could be fitted with an LCD projector. None of us used the LCD projector for the TEI workshop (although Winfried did use it for a separate TUSTEP workshop), but we made a lot of use of the two-projector system to show, for example, a DTD fragment and a tagged text simultaneously. We noticed that our various slides showed different personal styles, ranging from some that were very bare (five- or six-word) outline lists of what was being discussed to some that were much more detailed. We did agree to exchange slides at the end of the workshop.

(Parenthetical confession time: none of us used SGML to prepare our slides. Some used word processors, some used PowerPoint, some used TeX. In justification, we felt that slide production was in many respects more a matter of typesetting [juggling font sizes and line breaks to ensure the best fit and the best presentation] than of content description. The same logical element was rendered differently in different places if that led to the best presentation.)

It was impossible to get the room dark enough to demonstrate the use of different SGML software for performing different tasks. We also had a lot of trouble getting the LCD projector to work with a Mac, a problem that was solved only because Nick happened to be carrying the necessary connector with him (which our host institution lacked, although they had assured us that their projector would work with both Macs and PCs). My reluctant personal impression is that I would not like to have to rely on the legibility of an LCD projector for a workshop, at least given today's projection technology. One compromise might be to make static transparent slides of screen shots of different SGML software performing different tasks; while this loses the effect of seeing it all happen before one's eyes, the results are far more reliable and far more legible.

The hands-on Author/Editor session was almost a disaster. The network at the American University in Bulgaria (our host institution in Blagoevgrad) is one of the best in Bulgaria, but it was terribly, terribly slow. We had loaded Author/Editor on the server and verified that it worked before the hands-on sessions, but when the time came for the actual session, we ran into problems with the system load that we had not anticipated. We fired up the fifteen machines and started Author/Editor an hour before the participants arrived, but when they did arrive, not a single copy of Author/Editor had started running, and all the monitors were still displaying the dreaded hourglass. None of us was feeling very optimistic.

We had originally planned for Nick and Anisava Miltenova (who is directing a TEI-conformant encoding project at the Institute of Literature of the Bulgarian Academy of Sciences in Sofia) to run the hands-on session from the front of the lab, with Harry, Winfried, and me wandering around and peering over people's shoulders as they got into trouble. But the nonfunctioning machines (plus the presence of some forty or fifty people in a lab not designed to hold that many) led very rapidly to chaos and confusion. Fortunately, a few machines began to come up shortly after everyone arrived, and once Author/Editor had begun running on a machine, it ran at an acceptable speed. Harry, Nick, and I had also brought notebooks loaded with Author/Editor and other SGML software (in addition to Author/Editor, I demonstrated PSGML and SP). People gathered in small groups around the notebooks and other machines (as they came alive), and the TEI instructors wandered around the lab working in these small groups (using texts that we had preloaded into the server). Fortunately, several members of Anisava's team from the Institute of Literature were there, and they already had Author/Editor experience, which enabled them to help with the instruction.

I'm not sure what can be done to avoid this type of situation in the future. We were aware of the importance of loading and testing the software in advance, and we had done so, but the problems we encountered were not evident during the test, and we had no idea that it would take Author/Editor over an hour to come up. We were very lucky that it came up at all, and that it ran at an acceptable speed once it had started. Perhaps the best advice is to be prepared with a contingency plan, as we were with our laptops and small group sessions. Once again, the fact that we had several TEI instructors present was essential to our success.


Professor David J. Birnbaum
Department of Slavic Languages
1417 Cathedral of Learning
University of Pittsburgh
Pittsburgh, PA 15260 USA
djbpitt+@pitt.edu
voice: 1-412-624-5712
fax: 1-412-624-9714
http://clover.slavic.pitt.edu/~djb/

Notes

[1] An on-line version of this paper is available at http://www.oed.com/Waterloo.html