Modified Library of Congress Cyrillic Encoding

David J. Birnbaum (djb@clover.slavic.pitt.edu)
Yulia Chugunova (yulia@clover.slavic.pitt.edu)

Copyright © 1996 by David J. Birnbaum and Yulia Chugunova.
All rights reserved.


The Modified Library of Congress Cyrillic Encoding Vector is optimized for ASCII-based interfaces, such as text input fields in HTML forms. The encoding is based on the "scholarly" transliteration system, modified to cope with the lack of support for diacritics in ASCII. All "scholarly" character codes that do not require diacritics are retained (e.g., c [not ts], x [not kh], j [not i], ja [not ia]). "Scholarly" character codes that require diacritics are replaced by their Library of Congress counterparts where possible (e.g., sh, zh, shch).

Two idiosyncrasies are eh for e oborotnoe and oh for jo. The latter is counter-intuitive, but is required to keep the single letter jo distinct from the two-letter sequence j + o: without it, the first letter of ohzh 'hedgehog' could not be told apart from the first two letters of jod 'iodine'. Stress is marked by an asterisk (*) following the stressed vowel. There is no provision for marking secondary stress.

Advantages:

Disadvantages:

Modified Library of Congress Cyrillic Encoding
Letter         Upper Case   Lower Case
a              A            a
b              B            b
v              V            v
g              G            g
d              D            d
e              E            e
jo             OH           oh
zh             ZH           zh
z              Z            z
i              I            i
j              J            j
k              K            k
l              L            l
m              M            m
n              N            n
o              O            o
p              P            p
r              R            r
s              S            s
t              T            t
u              U            u
f              F            f
x              X            x
c              C            c
ch             CH           ch
sh             SH           sh
shch           SHCH         shch
hard sign      "            "
y              Y            y
soft sign      '            '
e oborotnoe    EH           eh
ju             JU           ju
ja             JA           ja
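
The table above can be applied mechanically. What follows is only a minimal sketch in Python, not part of the encoding definition. It assumes Unicode Cyrillic as the output, a longest-match-first reading of multi-letter codes (so shch is always read as one letter, never as sh + ch), and, purely for illustration, a combining acute accent (U+0301) as the rendering of the stress asterisk. The names LOWER, UPPER, TABLE, STRESS and decode are hypothetical.

```python
# Lowercase codes from the table, mapped to Unicode Cyrillic (an assumption:
# the encoding itself does not prescribe an output character set).
LOWER = {
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "oh": "ё", "zh": "ж", "z": "з", "i": "и", "j": "й", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р",
    "s": "с", "t": "т", "u": "у", "f": "ф", "x": "х", "c": "ц",
    "ch": "ч", "sh": "ш", "shch": "щ", '"': "ъ", "y": "ы", "'": "ь",
    "eh": "э", "ju": "ю", "ja": "я",
}

# Uppercase codes are the lowercase codes upper-cased as a whole (ZH, SHCH).
# The hard and soft signs use the same code in both columns, so they are
# skipped here and always decode to the lowercase letter.
UPPER = {code.upper(): cyr.upper()
         for code, cyr in LOWER.items() if code not in ('"', "'")}

TABLE = {**LOWER, **UPPER}

# Assumption: show stress as a combining acute accent after the vowel.
STRESS = "\u0301"


def decode(text):
    """Convert Modified LC text to Cyrillic, trying the longest code first."""
    out = []
    i = 0
    while i < len(text):
        for size in (4, 3, 2, 1):
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += len(chunk)
                break
        else:
            # Not a code from the table: map the stress mark, pass the rest through.
            out.append(STRESS if text[i] == "*" else text[i])
            i += 1
    return "".join(out)
```

With this sketch, decode("ohzh") and decode("jod") give different first letters, which is the distinction that the oh code exists to preserve. Mixed-case codes such as Zh are not in the table and are therefore read letter by letter.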