script disparities and colonial histories


What are the implications of colonial history for language?  How can we evaluate the evidence of disparities between "colonized" and "colonizing" languages in the present day?


If we are going to talk about today, we need to take into consideration the domains in which language is used today.  The digital sphere is, by most evidence, the principal site of all human languages' futures: if a language cannot survive in a digital landscape (and particularly on the Internet), its chances of survival at all, over the next century, are drastically decreased.




the digital language gap

English is the mother tongue of only about 5% of the world's population; and yet, over 52% of online content is in English.  By comparison, about 17% of the world's population speaks Mandarin or another variant of Chinese as a first language; a mere 1.2% of web content is in Chinese.
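One way to make these figures comparable is to divide each language's share of web content by its share of first-language speakers. The short Python sketch below does this using only the percentages quoted above; the name `representation_ratio` and the ratio itself are illustrative assumptions, not a measure taken from the sources. A value above 1 suggests over-representation online, and a value below 1 suggests under-representation.

```python
# Percentages quoted in the text above (approximate figures).
shares = {
    "English": {"population": 5.0, "web_content": 52.0},
    "Chinese": {"population": 17.0, "web_content": 1.2},
}


def representation_ratio(population_pct, web_content_pct):
    """Share of web content divided by share of first-language speakers.

    Hypothetical helper for illustration only.
    """
    return web_content_pct / population_pct


ratios = {
    language: representation_ratio(s["population"], s["web_content"])
    for language, s in shares.items()
}

for language, ratio in ratios.items():
    print(f"{language}: {ratio:.2f}")
```

On these figures, English appears online at roughly ten times its share of native speakers, while Chinese appears at well under a tenth of its share.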


The disparity between real language use and digital language representation is enormous, and it often occurs along historically colonial lines.  Using data from a "Top 500 sites on the web" database, in combination with data on language user populations from Ethnologue, I have compiled the following data visualizations to make this discrepancy more tangible.  Toggle between the graphs by clicking on the title tabs, and keep in mind the colonial histories of each language represented.



case study: north africa

As a region relatively recently affected by the European colonial enterprise, North Africa can be considered as still in a process of decolonization.  In order to provide some focus to my research on language representation in postcolonial contexts, I decided to concentrate on three countries from this region—Morocco, Algeria, and Tunisia—whose colonial histories differ at certain points but share several important characteristics.  All three countries were colonized by France throughout the 19th (in the case of Algeria and Tunisia) and 20th (in the case of all three) centuries.  As a result of colonial language policies and the technological dominance of European languages—indeed, of any Latin-script-based language—throughout the mid-to-late 20th century, digital development in these three countries occurred mostly by and for European language-users.


In the sphere of language politics, the tables have largely turned in all three of these countries since their respective official decolonizations: all now have established Arabic as (at least one of) the official language(s) of state, and have also demoted French, the ex-colonizer's language, to mere "national" status rather than official administrative status.


Unfortunately, the lack of hardware and software development for the Arabic script (especially in comparison to that for the Latin script used by French and by English, the other most widely spoken European language in the region) has weakened the power of language policy to effect real change in the digital linguistic landscape of all three countries.


I used country-by-country site rankings to compile these particular graphs; to determine the default language of each site listed, I visited the site manually and recorded the language in which it automatically appeared.  Click here to view the full data sets, which also include (non-visualized) the languages in which each site is available in addition to its default language, in order to provide a slightly more complete picture of how the field is developing.
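The tallying step in this method can be sketched in a few lines of Python. The records below are hypothetical placeholders, not entries from the actual data sets, and the function name `default_language_shares` is an assumption made for illustration; the sketch only shows how manually recorded default languages might be turned into percentages for a graph.

```python
from collections import Counter

# Hypothetical placeholder records, NOT the author's data:
# each pair is (site, default language observed on a manual visit).
observations = [
    ("site-a.example", "French"),
    ("site-b.example", "Arabic"),
    ("site-c.example", "English"),
    ("site-d.example", "French"),
]


def default_language_shares(records):
    """Return each default language's share of the listed sites, in percent."""
    counts = Counter(language for _site, language in records)
    total = sum(counts.values())
    return {language: 100 * n / total for language, n in counts.items()}


language_shares = default_language_shares(observations)
for language, pct in sorted(language_shares.items()):
    print(f"{language}: {pct:.1f}%")
```

Running the same function once per country's list of top sites would give one set of percentages per graph.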


I. morocco

Like Tunisia and Algeria, Morocco does now have its own Arabic-script country domain name: .المغرب.  However, the domain is so far used by only one site, which is a test site for the developing domain.  For this reason, I have not included data on .المغرب.


In contrast to this lag behind Tunisia and Algeria in use of Arabic-script domains, Morocco actually has, in its most popular sites, the highest rate of Arabic language use and .ma domain-name use.


II. tunisia

Tunisia's .تونس domain name has become fairly active, now with over 300 sites registered within it.  Tunisia's top sites are primarily in English, and it is the only country in this grouping that also includes an Italian-language site in its top sites.


III. algeria

The Algerian domain name .الجزائر is currently more active than Morocco's .المغرب, but has only 48 registered sites as of now.  Algeria falls between Tunisia and Morocco in terms of English- and French-language dominance on the top sites list; overall, though, Algeria also has the fewest sites registered in its .dz country domain, at only 9,778 (compared to over 30,000 in Tunisia's .tn and over 60,000 in Morocco's .ma).



Site design by Madeleine Leddy, © 2018