Put our language data to work for you.

Interested in the raw data behind Ethnologue? We can give you unique perspectives on the global language landscape. Whether you’re engaged in research, product development, or demographic analysis, we’re here to equip you with the data your project needs to succeed.

Our expertise lies in in-depth coverage of languages, countries, speakers, literacy, population, official languages, and other critical sociolinguistic elements. By tapping into this wealth of information, you can address practical challenges, gather evidence to guide policy and decision-making, and ground innovation and investigation in empirical data.

Ethnologue is committed to helping you navigate the complexities of global networks, harness the power of language and cultural insights, and transform data into a competitive advantage. Contact us today to explore how we can support your organization’s journey to success.

Nonprofit and academic pricing available

Data resources available

Ethnologue Global Dataset

The Ethnologue Global Dataset contains the raw data behind our website, gathered by hundreds of linguists, experts, and field contributors from around the world. The dataset comprises three separate files:

Language Data: ISO codes, language families, statuses, country counts, centroid points, and more for 7,467 languages.

Country Data: Language counts, literacy rates, populations, diversity indices, and more for 242 countries.

Language-in-Country Data: 11,292 listings of data specific to a particular language within a particular country where it is used.

Files are in the standard tab-delimited format, which can be loaded into virtually any spreadsheet, database, or other data analysis tool. Note that this dataset does not include commentary; it contains basic data fields with simple values that can be submitted to statistical analysis.
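As a rough illustration of how a tab-delimited file like these might be read, here is a minimal Python sketch using only the standard library. The column names (code, name, status) and the two sample rows are invented placeholders, not the dataset's actual fields or values; consult the Global Dataset Documentation for the real field names.

```python
import csv
import io

# Placeholder content standing in for one tab-delimited file.
# Column names and rows are invented for illustration only.
SAMPLE_TSV = (
    "code\tname\tstatus\n"
    "demo-1\tExample Language A\tliving\n"
    "demo-2\tExample Language B\textinct\n"
)


def load_tab_delimited(handle):
    """Read a tab-delimited file object into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# An in-memory buffer is used here so the sketch runs without any file on disk.
rows = load_tab_delimited(io.StringIO(SAMPLE_TSV))

# Simple filtering on a basic field, the kind of value suited to statistical analysis.
living = [row["name"] for row in rows if row["status"] == "living"]
print(len(rows), living)
```

To read a file from disk instead, the same function could be given a handle from open(path, newline="", encoding="utf-8"), where the path and the UTF-8 encoding are assumptions to check against the files you receive.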

Global Dataset Documentation (PDF)

Language GIS Dataset

Our Language GIS Dataset is the most comprehensive, up-to-date geographic dataset of the locations of the world’s 7,100+ living languages. It contains centroid coordinates for each language, and over seventy-five percent are also represented by boundary polygons that display the traditional homeland of each indigenous language. Data is provided in file geodatabase format for GIS software (Esri shapefiles available upon request).

Language GIS Dataset Documentation coming soon.

Custom Datasets

Don’t need global data? Are you looking for additional information not included in either of these datasets? Let us know what you want to achieve, and our researchers will work with you to make it happen. We can provide subsets from either the Ethnologue Global Dataset or the Language GIS Dataset, or create other custom datasets including any data fields shown on our Language and Country profiles. You tell us what you need, and we’ll create the ideal solution.