
The SWiP project makes use of language, data and knowledge technologies to promote language equality among all of
South Africa
's official languages. The hegemonic status of English (and, to a lesser extent, Afrikaans) has made English the dominant language of learning and teaching, which downplays African epistemology; as a result, local African languages are commonly under-resourced. The acronym "SWiP" refers to the three main partners in a national collaboration between
SADiLaR
, the free encyclopedia Wikipedia and
PanSALB, who are working alongside local speech and language communities within academia to address language equality using digital technologies, especially Wikipedia.
Under
apartheid
, certain languages were marginalised, including
isiNdebele,
Siswati
,
Xitsonga
and
Tshivenda
.
To address the underrepresentation of
South Africa's indigenous languages, three organisations are collaborating to build better corpora for low-resource languages. These organisations are:
* South African Centre for Digital Language Resources (SADiLaR)
* Wikimedia South Africa (Wikipedia)
* Pan South African Language Board (PanSALB)
Wikipedia
is a common source of language data for
natural language processing
(NLP). Low-resource languages have limited corpora (speech data, annotated text and other forms of linguistic data) for large language models (LLMs) and other NLP systems to draw on. The SWiP project has introduced a variety of alternative approaches to collecting and compiling corpora of suitable text for low-resource languages, and has rolled these out on a national scale. These corpora can be used to create corpus-based dictionaries or to support semi-automatic translation.
This collaborative project is also intended to promote, preserve, and digitise
South Africa
's indigenous languages and cultural knowledge by enhancing their presence on digital platforms such as Wikipedia.
By partnering with cultural and linguistic organisations, the project was designed to close the digital gap and ensure that local languages and cultural narratives are preserved and shared online.
Outcomes
It is anticipated that the SWiP Project will:
* Enhance the digital presence of indigenous South African languages on platforms like Wikipedia
* Empower communities in digital content creation through training and capacity-building
* Digitise and disseminate cultural knowledge and heritage in native languages
* Foster sustainable collaboration among academic, cultural, and digital communities in South Africa
History
Phase 1 of the SWiP Project was launched on 20 September 2023 at UNISA, with His Royal Majesty Enock Makhosoke II Mabhena, the King of amaNdebele, in attendance.
The launch initiated the series of events listed below, and Phase 1 was successfully completed. Phase 2 of the project began in November 2024 and continues through 2025 at venues such as the
Nelson Mandela University
,
the University of Mpumalanga and the
University of Limpopo
.
Initiatives and events
IsiNdebele Wikipedia integration
An early success of the project was the integration of
isiNdebele into Wikipedia. Initially represented by only 11 articles in the Wikipedia Incubator, the language saw rapid growth to over 140 articles within a year (as of 29 May 2025, there are 180 articles), marking its transition to Wikipedia's main platform.
Community training and engagement
The project has conducted extensive training sessions, engaging over 300 participants from various
South African universities. Trainers introduced academics to Wikipedia; participants learned article-authoring skills (adding content, citations, and photographs) and practised translation using the Wikipedia translation tool.
These sessions led to the creation of hundreds of new articles, thousands of edits, and significant contributions of written content, references, and
multimedia
. The initiatives have fostered
digital literacy
and community engagement while significantly enhancing Wikipedia's indigenous language content.
Impact and achievements
Since its inception, the SWiP Project has:[citation needed]
* Expanded Digital Content – Hundreds of new Wikipedia articles have been created in indigenous languages.
* Preserved Cultural Narratives – The project has ensured that cultural stories, languages, and traditions are accessible to global audiences.
* Empowered Communities – Through training sessions and collaborative workshops, over 300 participants have become active digital content creators.
* Enhanced Visibility – The newly created content has collectively amassed millions of views, signifying a broad digital reach.
Dashboard from SWiP Phase 2 at the University of Mpumalanga
Dashboard from SWiP Phase 2 at the University of Limpopo
SWiP Resource Page
The SWiP Resource Page (metawiki:SWiP Resource Page) is accessible to anyone interested in learning how to edit Wikipedia.