Structure of the Code Tables


Three files make up the package of data tables that SIL International releases in support of the ISO 639-3 standard for language identifiers. They are tab-delimited files in which each line represents one row of a database table. The characters are encoded in the 8-bit standard known as ISO 8859-1 (which is a subset of the default Windows code page 1252). See Downloading the Code Tables for the latest version of the tables.

LanguageCodes: The complete list of three-letter language identifiers used in the current Ethnologue (along with name, primary country, and language status).

CountryCodes: The list of two-letter country codes that are used in the main language code table.

LanguageIndex: An index for finding languages by country and by all known names (including primary name, alternate names, and dialect names).
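As a rough illustration of the file format described above, the following Python sketch decodes ISO 8859-1 bytes and splits each line on tabs. The function name read_table, its column_count parameter, and the sample bytes are hypothetical and are not taken from the released files. Whether the files begin with a header row is not stated here, so the sketch assumes every line is a data row.

```python
import csv
import io


def read_table(raw_bytes, column_count):
    """Decode ISO 8859-1 bytes and return one tuple of fields per line.

    Assumes every line is a data row (no header) and that fields are
    separated by single tab characters with no quoting.
    """
    text = raw_bytes.decode("iso-8859-1")
    # newline="" leaves line endings untouched so the csv module handles
    # both "\n" and "\r\n" terminated files.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != column_count:
            raise ValueError(
                f"line {line_number}: expected {column_count} fields, "
                f"got {len(fields)}")
        rows.append(tuple(fields))
    return rows


# Made-up sample in the four-column layout of the main language table;
# byte 0xE9 is "é" in ISO 8859-1.
sample = b"xxa\tZZ\tL\tSampl\xe9 One\r\nxxb\tZZ\tX\tSample Two\r\n"
language_rows = read_table(sample, column_count=4)
```

To read a real file, the bytes could be obtained with open(path, "rb").read() and passed to the same function with the column count of the matching table.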

The following declarations provide the formal definitions for SQL data tables into which the tab-delimited files can be loaded:

CREATE TABLE LanguageCodes (
   LangID      char(3) NOT NULL,  -- Three-letter code
   CountryID   char(2) NOT NULL,  -- Main country where used
   LangStatus  char(1) NOT NULL,  -- L(iving), (e)X(tinct)
   Name    varchar(75) NOT NULL); -- Primary name in that country

CREATE TABLE CountryCodes (
   CountryID  char(2) NOT NULL,  -- Two-letter code from ISO 3166
   Name   varchar(75) NOT NULL,  -- Country name
   Area   varchar(10) NOT NULL); -- World area

CREATE TABLE LanguageIndex (
   LangID    char(3) NOT NULL,  -- Three-letter code for language
   CountryID char(2) NOT NULL,  -- Country where this name is used
   NameType  char(2) NOT NULL,  -- L(anguage), LA(lternate),
                                -- D(ialect), DA(lternate)
                                -- LP,DP (a pejorative alternate)
   Name  varchar(75) NOT NULL); -- The name
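
The declarations above can be tried out with a short Python sketch that creates the three tables in an in-memory SQLite database and looks a language up by any of its indexed names. The choice of SQLite is an assumption, since no particular database engine is named here. The helper names build_database and find_by_name, and all sample rows, are hypothetical.

```python
import sqlite3

# Same columns as the declarations above, with statement terminators
# so that SQLite can run them as one script.
SCHEMA = """
CREATE TABLE LanguageCodes (
   LangID      char(3) NOT NULL,
   CountryID   char(2) NOT NULL,
   LangStatus  char(1) NOT NULL,
   Name    varchar(75) NOT NULL);
CREATE TABLE CountryCodes (
   CountryID  char(2) NOT NULL,
   Name   varchar(75) NOT NULL,
   Area   varchar(10) NOT NULL);
CREATE TABLE LanguageIndex (
   LangID    char(3) NOT NULL,
   CountryID char(2) NOT NULL,
   NameType  char(2) NOT NULL,
   Name  varchar(75) NOT NULL);
"""


def build_database(language_rows, country_rows, index_rows):
    """Create the three tables in memory and insert the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO LanguageCodes VALUES (?, ?, ?, ?)",
                     language_rows)
    conn.executemany("INSERT INTO CountryCodes VALUES (?, ?, ?)",
                     country_rows)
    conn.executemany("INSERT INTO LanguageIndex VALUES (?, ?, ?, ?)",
                     index_rows)
    conn.commit()
    return conn


def find_by_name(conn, name):
    """Return (LangID, primary name, country name, NameType) per match."""
    query = """
        SELECT i.LangID, l.Name, c.Name, i.NameType
        FROM LanguageIndex AS i
        JOIN LanguageCodes AS l ON l.LangID = i.LangID
        JOIN CountryCodes  AS c ON c.CountryID = i.CountryID
        WHERE i.Name = ?
        ORDER BY i.LangID, i.CountryID
    """
    return conn.execute(query, (name,)).fetchall()


# Made-up rows for illustration only; not taken from the released tables.
conn = build_database(
    [("xxa", "ZZ", "L", "Sample One")],
    [("ZZ", "Sampleland", "Nowhere")],
    [("xxa", "ZZ", "L", "Sample One"),
     ("xxa", "ZZ", "LA", "Other Name")],
)
matches = find_by_name(conn, "Other Name")
```

The lookup goes through LanguageIndex because that table holds alternate and dialect names as well as primary names, while LanguageCodes holds only one primary name per language.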