Backstage Library Works

The Catalog Polyglot: Metadata and World Languages

Collections come in many languages and, sometimes, in non-Roman scripts. How does a catalog handle kanji, Cyrillic, Hangul, or Perso-Arabic script?

Which language goes where?

Let’s refresh on the basics for those non-catalogers in our audience.

  • The language of the cataloging agency will be the language used for the encoding of a record. In MARC terms, this is identified in the 040 $b. If you are an English-speaking cataloging agency, your records will be in English, coded as eng.
  • Text in languages other than the cataloging language is still transcribed as found. In MARC terms, fields that reflect the resource, such as the 245, 264, and related fields, will be in the language of the source material.
Blue represents the language of the cataloging agency – English, in this case – while fields that represent the material are in the source language: here, Italian.
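
To make the split concrete, here is a minimal sketch in Python. It does not use a real MARC library: the record is a hypothetical, simplified list of (tag, subfields) pairs, the agency code XYZ is a placeholder, and the helper names are invented for illustration.

```python
# Hypothetical, simplified record: each field is (tag, {subfield code: value}).
# This is not a complete MARC record, only enough to show which language goes where.
record = [
    # 040 = cataloging source; $b carries the language of cataloging ("XYZ" is a placeholder agency code).
    ("040", {"a": "XYZ", "b": "eng", "e": "rda", "c": "XYZ"}),
    # 245 and 264 are transcribed from the resource, so they stay in Italian.
    ("245", {"a": "Il nome della rosa /", "c": "Umberto Eco."}),
    ("264", {"a": "Milano :", "b": "Bompiani,", "c": "1980."}),
    # 336 is supplied by the cataloger, so it follows the language of cataloging (English).
    ("336", {"a": "text", "b": "txt", "2": "rdacontent"}),
]

# Tags named in the article as reflecting the resource itself.
TRANSCRIBED_TAGS = {"245", "264"}


def cataloging_language(rec):
    """Return the language code from 040 $b, or None if it is absent."""
    for tag, subfields in rec:
        if tag == "040":
            return subfields.get("b")
    return None


def transcribed_fields(rec):
    """Return the fields that are transcribed in the language of the source."""
    return [(tag, sf) for tag, sf in rec if tag in TRANSCRIBED_TAGS]


print(cataloging_language(record))
for tag, sf in transcribed_fields(record):
    print(tag, sf["a"])
```

The only point of the sketch is that one record carries two languages at once: the 040 $b declares the agency's language, while the transcribed fields follow the item in hand.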

The same principles apply, but become a little more complicated, when we start considering character sets other than our own. Let’s continue with our example of an English-speaking cataloging agency: we are used to recording materials using Western Latin character sets. Italian, Spanish, French, German, and many other languages share the same alphabet, which makes cataloging for these languages easier; even if we’re unable to supply subject analysis for materials outside of our linguistic capabilities, we can at least transcribe the principal identifying features.

Transliteration: How? What? Where? Why?

When we catalog works in other character sets, we typically want to provide a field in the source material’s script as well as a mirrored, transliterated field. We call the process of transliterating non-Latin scripts into the Latin character set “romanization.”
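
In MARC 21, this mirroring is commonly expressed with 880 (Alternate Graphic Representation) fields linked to their romanized counterparts through subfield $6. The sketch below is a simplified illustration, not a full MARC implementation: the record structure and the paired_fields helper are hypothetical, and the sample values are illustrative.

```python
# Hypothetical, simplified record: each field is (tag, {subfield code: value}).
record = [
    # Romanized title; $6 points at the linked 880 field, occurrence 01.
    ("245", {"6": "880-01", "a": "Voĭna i mir /", "c": "Lev Tolstoĭ."}),
    # Original-script title; $6 points back at the 245, occurrence 01.
    # The "/(N" suffix is the script identification code for Cyrillic.
    ("880", {"6": "245-01/(N", "a": "Война и мир /", "c": "Лев Толстой."}),
]


def paired_fields(rec):
    """Yield (romanized_field, original_script_field) pairs linked through $6."""
    originals = {}
    for tag, sf in rec:
        if tag == "880" and "6" in sf:
            # Keep only "tag-occurrence", dropping the script code after "/".
            link = sf["6"].split("/")[0]
            originals[link] = (tag, sf)

    for tag, sf in rec:
        if tag != "880" and "6" in sf:
            linked_tag, occurrence = sf["6"].split("/")[0].split("-")
            key = f"{tag}-{occurrence}"
            if linked_tag == "880" and key in originals:
                yield (tag, sf), originals[key]


for romanized, original in paired_fields(record):
    print(romanized[1]["a"], "<->", original[1]["a"])
```

A catalog that cannot display the original script can fall back on the romanized 245, while one that can may show both forms side by side.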

Why transliterate?

  1. Not all catalogs are able to support character sets other than Western Latin.
  2. Transliteration provides an access point for those who do not have language expertise in the given script, aiding both searching and maintenance of bibliographic records.
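
As a rough illustration of what a table-driven transliteration step does, here is a minimal Python sketch. The mapping is a deliberately tiny, illustrative subset covering only the letters in the sample title; the romanize function and the table name are hypothetical and do not reflect how any particular tool works. A real workflow would rely on a complete, published romanization table.

```python
# Illustrative subset only: a complete romanization table covers the whole
# alphabet and includes contextual rules that this sketch ignores.
CYRILLIC_TO_LATIN = {
    "а": "a", "в": "v", "и": "i", "й": "ĭ", "м": "m",
    "н": "n", "о": "o", "р": "r",
}


def romanize(text, table=CYRILLIC_TO_LATIN):
    """Replace each mapped letter with its Latin form, preserving capitalization."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in table:
            latin = table[lower]
            out.append(latin.capitalize() if ch.isupper() else latin)
        else:
            # Spaces, punctuation, and unmapped letters pass through unchanged.
            out.append(ch)
    return "".join(out)


print(romanize("Война и мир"))
```

Because the romanized form uses characters that any Latin-script keyboard and catalog can handle, it can be searched and maintained by staff who cannot read the original script.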

Why provide the original script, then?

  1. Some catalogs are able to support alternative character sets.
  2. Those with language expertise will derive more information from the original script than from the transliteration.

Where possible, providing both is important, but it is also complicated. Libraries acquire materials in many languages, all the better to fulfill the needs of their patrons. That doesn’t mean every library is going to have the language expertise to support those needs within a catalog.

The Library of Congress has made it an ongoing goal to support libraries in making the process as simple as possible, implementing automatic transliteration tools. Joel Hahn, too, is legendary for his macros, several of which take on a large part of the burden.

Some localities rely on cooperative cataloging to help bridge the gap of language expertise, as with Illinois libraries in the RAILS network. Cooperative cataloging programs are an excellent way to leverage the expertise within sister libraries, and Backstage partners with several institutions to take on the language groups that remain after the fact.  

World Language Cataloging at Backstage

Indic, Slavic, Austronesian – Backstage has endeavored for years to develop and maintain a network of qualified, experienced catalogers to cover the widest range of languages possible. Furthermore, we’ve always maintained that every project, and every library, is unique – over the last decade, our projects have come in many sizes. Sometimes, it’s a few titles through the year in languages that fall outside the expertise of the library’s cataloging team. Other times, it’s picking up the full breadth of multilingual acquisitions, as with the LC-COOP program.

A question we’re frequently asked when it comes to World Language cataloging is, “Is there a minimum batch size?” There is not, and we have many projects set up in such a way as to directly facilitate cataloging in a range of different languages as the titles are acquired through the year.

What languages can our team catalog?

  • English
  • African: Niger-Congo languages
  • Afro-Asiatic: Arabic, Hebrew, Amharic, Tigrinya
  • Altaic: Mongolian, Turkish, Ottoman Turkish, Turkic
  • Austronesian: Tagalog, Malaysian (Malay), Indonesian, Hawaiian
  • Baltic: Latvian, Lithuanian
  • Basque
  • Celtic: Welsh, Scots, Irish Gaelic, Breton
  • Chinese (Mandarin, Cantonese), Japanese, Korean
  • Dravidian: Tamil, Telugu, Malayalam
  • Finno-Ugric: Estonian, Finnish, Hungarian
  • Greek, Albanian, Armenian, Caucasian, Georgian
  • Indic: Bengali, Gujarati, Hindi, Marathi, Punjabi, Sanskrit, Urdu
  • Indo-Iranian: Nepalese, Oriya, Persian (Farsi)
  • Mayan languages
  • Romance: French, Italian, Portuguese, Romanian, Spanish (and Bilindex headings), Catalan, Galician, Provençal, Latin
  • Scandinavian: Danish, Swedish, Norwegian, Icelandic, Faroese
  • Sino-Tibetan: Burmese, Tibetan
  • Slavic: Belarusian, Bosnian, Bulgarian, Church Slavic, Croatian, Czech, Macedonian, Polish, Russian, Serbian, Serbo-Croatian, Slovak, Slovenian, Ukrainian
  • Tai-Kadai: Thai
  • Vietnamese
  • Western Germanic: Dutch, German
  • Yiddish, Ladino

If you’d like to know more, call us at 1.800.288.1265, visit us online, or send us an email.
