Mapping between ISO 639 Language Codes and the Languages Identified in the Ethnologue

The mapping information and links formerly available from this page have been superseded by the continuing development of the ISO 639 family of standards, in particular the publication of the ISO 639-3 standard. Please consult the website of the ISO 639-3 Registration Authority at http://www.sil.org/iso639-3/.
 
The information provided here gives some historical background on the original mappings of the ISO 639-2 code set to the code set of the Ethnologue, 14th edition.



Mappings

The following link displays a page showing all the two- and three-letter codes of the ISO 639 standard. Each code is formatted as a link that displays a full page describing how that ISO 639 code maps onto the languages identified in the Ethnologue:

The inverse mapping, namely from Ethnologue language to ISO 639 code, is shown at the top of every Ethnologue language page (show_language.asp) beneath the page title. For instance, the following link displays a page showing the complete set of SIL three-letter codes used in the Ethnologue. Each code is formatted as a link that displays the full page describing that language:

The mapping between the two code sets may also be downloaded as a data table that can be used in custom applications.

Background

ISO 639, Code for the representation of names of languages (Geneva, International Organization for Standardization, 1998), is the most widely known standard for language identification codes. Part 1 of the standard defines 160 two-letter codes for identifying individual languages. Part 2 of the standard defines three-letter codes for 381 languages (including all those covered in Part 1). In addition, it defines 55 "collective" codes, which are used to cover languages that do not have individual codes. With the inclusion of a catch-all code for "miscellaneous languages", Part 2 of the standard is designed to assign one of its 400+ codes to any language of the world. The authoritative online version of the standard may be found at:

The Ethnologue system of language identifiers, by contrast, assigns a unique three-letter code to each of the 7,000+ known living and recently extinct languages of the world. Because this set of language codes is comprehensive, it has become a de facto standard among many projects that need a unique code for every language. The complete code set used in the Ethnologue is available for download at:

The motivation for publishing this mapping between ISO 639 language codes and the languages identified in the Ethnologue is two-fold:

Methodology

In many cases determining the mapping from a given ISO 639 code to one or more languages identified in the Ethnologue is far from straightforward. In order to avoid an outcome in which each mapping was an arbitrary choice that could be debated endlessly, we articulated a set of twelve principles to govern the decision-making process and then tried to follow them consistently. Thus questions about individual mappings should be addressed by demonstrating that the principles were not applied consistently, or by proposing a change to one of the principles. These principles are set out in the following working paper:

A key resource consulted in the process of determining the mappings was the following online publication by the Library of Congress that documents their interpretation of the ISO 639 codes:

In cases where following our principles leads to a conclusion different from that of the MARC code list, the show_iso639.asp report on the specific code notes the discrepancy.

Almost two-thirds of the codes from ISO 639-2 map straightforwardly to one Ethnologue language. However, this leaves over 150 codes that are not straightforward. The following link displays a page that offers an analysis of the different ways in which an ISO 639-2 code may map onto Ethnologue languages:

Download

Terms of Use. The mappings are made available as downloadable tables. You are welcome to download the mappings as provided below and incorporate the supplied tables into your own database application on condition that you do so in accordance with our Terms of Use statement.

The tables are in tab-delimited format and have only two columns: the first column is the ISO 639 code and the second column is the SIL code for a corresponding Ethnologue language.
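As an illustration of how a table in this format might be read, the following is a minimal Python sketch. It assumes the table has no header row and that each line holds exactly one ISO 639 code and one SIL code separated by a tab; neither assumption is confirmed by this page. The function name load_mapping and the sample codes (q1, AAA, and so on) are hypothetical placeholders, not real mappings.

```python
import csv
import io
from collections import defaultdict


def load_mapping(lines):
    """Read tab-delimited (ISO 639 code, SIL code) rows into a dict of lists.

    An ISO 639 code that appears on several rows collects every SIL code
    listed for it, in file order.
    """
    mapping = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        iso_code, sil_code = row[0].strip(), row[1].strip()
        mapping[iso_code].append(sil_code)
    return dict(mapping)


# Placeholder data only: these codes are invented for the example.
SAMPLE = "q1\tAAA\nq2\tBBB\nq2\tCCC\n"

mapping = load_mapping(io.StringIO(SAMPLE))
print(mapping)
```

With a downloaded table, an open file object could be passed in place of the io.StringIO wrapper. Collecting the SIL codes into a list per ISO 639 code keeps one-to-many mappings intact rather than letting a later row overwrite an earlier one.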

The first two tables give the mapping from the two-letter codes of ISO 639-1. The first lists the mappings for all two-letter codes from ISO 639-1; the second lists only those two-letter codes that correspond to exactly one Ethnologue language. (ISO 639-1 codes that correspond to more than one Ethnologue language are listed more than once in the first table.)

Similarly, the next two tables give the mapping from the three-letter codes of ISO 639-2 (including all codes from both the B and T sets). The first lists the mappings for all three-letter codes from ISO 639-2, except for those that are out of scope for the Ethnologue and thus have no equivalent. The second lists only the three-letter codes that are for an individual language and that correspond to exactly one Ethnologue language. (ISO 639-2 codes that correspond to more than one Ethnologue language are listed more than once in the first table.)
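The relationship between each full table and its one-to-one counterpart can be sketched in Python as a filter on how often an ISO 639 code occurs. This is only an approximation under stated assumptions: the function name one_to_one and the sample codes are hypothetical, and for ISO 639-2 the published second table also excludes collective codes, which a simple count cannot detect.

```python
from collections import Counter


def one_to_one(pairs):
    """Keep only the (ISO 639 code, SIL code) pairs whose ISO code occurs once."""
    counts = Counter(iso_code for iso_code, _ in pairs)
    return [(iso_code, sil_code) for iso_code, sil_code in pairs
            if counts[iso_code] == 1]


# Placeholder data only: these codes are invented for the example.
pairs = [("q1", "AAA"), ("q2", "BBB"), ("q2", "CCC"), ("q3", "DDD")]

unique_pairs = one_to_one(pairs)
print(unique_pairs)
```

Here the code q2 is dropped because it is listed twice in the full table, which is how a code corresponding to more than one Ethnologue language appears.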

 


This web edition of the Ethnologue contains all the content of the print edition and may be cited as:
Grimes, Barbara F. (ed.), 2000. Ethnologue: Languages of the World, Fourteenth edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com/14.


Copyright © 2000 SIL International