Ethnologue Global Dataset


The Ethnologue Global Dataset makes it possible for researchers to replicate the statistical summaries that are published in Ethnologue and to use data from Ethnologue in their own analyses. Most of the information published in Ethnologue is in the form of textual comments that are not amenable to statistical analysis; these fields are specifically not included in the dataset. The dataset contains only data fields with simple values (such as booleans, numbers, and categories) that can be submitted to statistical analysis. The data tables are supplied in the ubiquitous tab-delimited format, which can be loaded into virtually any spreadsheet, database, or other data analysis tool.

The product is distributed as a zip archive containing four files:

  • A document describing the product in detail: its terms of use, its format, the exact contents of each data table, and information on sources of additional data.
  • A table of country data containing 12 columns of information about 236 countries.
  • A table of language data containing 22 columns of information about 7,469 languages. The data in this table pertain to the language in general or provide aggregated results over all the countries in which it is used.
  • A table of language-in-country data containing 22 columns of information about 10,995 instances of language-in-country. The data in this table pertain to the language in a particular country where it is used.

The full documentation may be downloaded in order to read the definitions of all the data columns that are included.
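Because the tables are tab-delimited, they can be read with standard tools. The following is a minimal sketch in Python, using only the standard library, of loading one table and counting rows per value of a column. The file name and column name in the usage comments are hypothetical placeholders, and the UTF-8 encoding and the absence of quoted fields are assumptions; the actual names and format details are given in the product documentation.

```python
import csv
from collections import Counter


def load_table(path):
    """Read one tab-delimited table into a list of dicts keyed by the header row."""
    # Assumption: the file is UTF-8 encoded and its first row holds column names.
    with open(path, newline="", encoding="utf-8") as handle:
        # Assumption: fields are not quoted, so quote characters are kept as plain text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def count_by(rows, column):
    """Count how many rows share each distinct value of the given column."""
    return Counter(row[column] for row in rows)


# Example usage (hypothetical file and column names, for illustration only):
# rows = load_table("language_in_country.tab")
# per_country = count_by(rows, "CountryCode")
# print(per_country.most_common(10))
```

A count such as this is an aggregate over the data rather than the raw data itself, which is the kind of result the terms of use below allow you to publish.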

Licensing and terms of use

The Ethnologue Global Dataset is licensed under the following terms of use:

  • The product is licensed for individual use. You may not share your copy with others, except temporarily with someone who is assisting you in the analysis you are performing.
  • You may freely publish and distribute visualizations you create from your analysis of these data and tables presenting results that aggregate over the data.
  • You should cite this product as the source in any work you produce that is based on analysis of this dataset or that includes visualizations of any of this information.
  • You may not redistribute the raw data in any form, including displaying it on a public web site, posting the files on an intranet site, or incorporating any of these data into a dataset that you are distributing. To inquire about uses such as these, contact us.

A discounted Academic License is offered to academic and humanitarian institutions that will use the data for non-commercial purposes. An application must be made for this license. The product is available for immediate download under the Commercial License to anyone who agrees to the terms listed above.
