History of Kamusi

The Kamusi Project arose from a student's frustrations in learning the Swahili language. Martin Benjamin was learning Swahili to prepare for his Anthropology PhD research. While on a Fulbright intensive language studies program in Tanzania in 1993, he was regularly stymied by the old Swahili dictionaries then available, which were confusing and incomplete. He read about a project that had used something called “the Internet” to parcel out the work of breaking a supposedly uncrackable cryptographic code, and thought a similar process could be applied to writing a new Swahili dictionary. He mentioned the idea to Ann Biersteker, his Swahili professor at Yale University, who encouraged him to write a proposal to the local branch of the Consortium for Language Teaching and Learning (CLTL). The Consortium approved the proposal in autumn 1994. In December of that year, in the same week that Netscape released the first “web browser,” the Kamusi Project was born.

The first step was for Benjamin to enter about three thousand terms into a spreadsheet, copied with permission from existing learners' glossaries. He then divided those terms into packs of 100 and put those files on a “gopher” server that people could access via a command line interface and dial-up modem. The intent was for volunteers to each expand one pack with new terms, and to keep subdividing the packs as contributions rolled in. That idea never really worked, however, because the process was too cumbersome and the number of Swahili enthusiasts using computers was too small. Instead, the project received copyright permission for a large out-of-print dictionary by Charles Rechenbach, was awarded a larger grant from the full CLTL, and concentrated on data entry and the development of a website (Yale's first in the social sciences or humanities) to distribute the results to the public.

In 1996, Dr. Biersteker was awarded funding for the project from the United States Department of Education's International Research and Studies program (IRS). This grant enabled the development of the “Edit Engine,” a tool that makes it possible for anyone to help edit dictionary entries. The Edit Engine went live in 2000, a year before Wikipedia began with a similar model (and with the important difference that all Kamusi changes must be approved by an editor before becoming public). At the same time, data became available through a searchable online database, rather than having to be downloaded as text or Excel files.

A second IRS grant in 2003 supported many additional features, such as a photo uploader for users to illustrate dictionary entries with appropriate images, a parser to return usable dictionary entries from conjugated verbs, and a grouping tool to organize entries according to priority and sense. By 2006, the Kamusi Project was being used about a million times a month by 60,000 unique visitors.

2007 marked a major transition for the project, which had run out of funding. Benjamin had left Yale, where the project was still housed, and moved to Lausanne, Switzerland, for family reasons. Several interesting potential partnerships were emerging around the idea of expanding the Kamusi model to other African languages. However, these projects for the international public would be better housed at an institution devoted specifically to the cause of language development. The project was therefore moved to the care of the non-profit World Language Documentation Centre, based in Wales, as an interim home while steps were taken to incorporate Kamusi independently. The online presence was established as kamusiproject.org, and then as kamusi.org when that name was donated by its original registrant.

The incorporation of Kamusi was completed in 2010. The organization is actually two legally independent non-profit entities: Kamusi Project USA for US-based activities and the Swiss-based Kamusi Project International for projects with the rest of the world. Our US status makes it possible for Americans, historically Kamusi's most generous supporters, to continue contributing to our work. At the same time, Swiss incorporation facilitates work with partners throughout Africa, due to Switzerland's special open relations with most of the world. The two organizations have independent boards and completely separate accounting. Dr. Benjamin now serves as Executive Director of both NGOs.

Between 2007 and 2013, the Kamusi Project embarked on several exciting new initiatives:

In 2013, Kamusi joined the Distributed Information Systems Laboratory (LSIR) at EPFL in Switzerland. LSIR has provided a home for numerous exciting technical developments. However, financial viability has continued to elude us. In 2015, our move to a much larger and more intricate data model caused our server to crash, and we did not have the financial wherewithal to get back online for more than a year. In 2016, we stepped away from our limping big-data machine, and began offering the public a restricted service that provides the most accurate vocabulary translations available anywhere for numerous language pairs, while we seek a funding path that will enable us to offer the many services that have been developed behind the scenes at EPFL.

The history of the Kamusi Project has been one of both innovation and struggle. Funding resources for "exotic" languages are few and far between, and the project has found that it is very difficult to make progress unless key partners can be remunerated for their time. Nonetheless, the Kamusi Project has pressed forward and is now in a technical and regulatory position to provide advanced services for a great many languages spoken around the world. Many new and innovative projects are now in the pipeline, with partners from countries on every continent. The next chapters of this history are poised to be written.

Here are some annual highlights:

1993: Project conceived as a way to use collective resources to create new tools for learning Swahili.
1994: First proposal submitted, November. First glossary (3,000 words) begun, December.
1995: Gopher site established, January. Website established, April (the first website in the social sciences or humanities at Yale). Wordlists incorporated from many remote contributors. 21,000-entry dictionary posted, September.
1996: Data entry to incorporate Rechenbach's Swahili-English Dictionary.
1997: Data editing.
1998: Programming work begins on Edit Engine. Swahili-Russian dictionary posted.
1999: 56,000-entry dictionary posted. Discussion Forum established. Africa Guide established.
2000: Revised dictionary posted, Edit Engine launched, April.
2001 - 2002: Project has no funding. Development work slows to a crawl, though Edit Engine submissions are regularly incorporated into the Kamusi lexicon.
2003: Renewed funding begins late July. Development work begins on Learning Guide.
2004: Move to faster, more secure server completed, March. Photo Upload feature introduced, May. Enabled search of plural forms, June. Begin formal collaboration with University of Dar es Salaam Department of Computer Science to establish a mirror server in Tanzania and incorporate computer terminology into the Kamusi lexicon, October. Launch complete site redesign, November. Introduce specialized vocabulary features, November. Continue work on Learning Center.
2005: Introduce the Grouping Tool to arrange dictionary entries. Add new data fields for terminology, dialect, taxonomy, derivation, related words, English definitions, and alternate spellings. Migrate to a more stable and flexible software platform. Improve search and display features. Add user conveniences, including more direct access to the Edit Engine.
2006: Funding runs out in January, project staff furloughed. Work continues with the help of private donations, including a generous grant from the Negaunee Foundation. The Kamusi Parser is introduced, allowing users to search and evaluate conjugated Swahili verbs directly within the search engine.
2007: Project is moved from Yale to the World Language Documentation Centre and development work continues with the support of private donations.
2008: National Endowment for the Humanities grant to Grambling University to begin work within Kamusi for expanding the model to multiple languages, with a focus on Kinyarwanda. This grant was subsequently transferred directly to Kamusi after we completed our incorporation as a US legal non-profit corporation.
2009: Incorporation of Kamusi Project USA as a 501(c)(3) non-profit organization registered in Delaware, and Kamusi Project International as a non-governmental organization with the equivalent status registered in Geneva.
2010: Development of KamusiTERMS participatory terminology system and production of localization terminologies in 12 African languages, with the African Network for Localization, IT46, Translate House, and the support of IDRC in Canada. New logo unveiled.
2011: Begin work with University of Ngozi in Burundi on Kirundi language, in association with Universidad Politécnica de Madrid, with students receiving stipends in exchange for working on Kirundi entries.
2012: Programming of multilingual platform with Telamenta in South Africa.
2013: Launch of multilingual pilot, with 100 parallel terms defined in 20 languages, demonstrates that the new multilingual system works and has the potential to scale for unlimited additional languages. However, with no funding for continued language work, linguistic development grinds to a halt. In September, Kamusi joins the Distributed Information Systems Laboratory (LSIR) at EPFL in Switzerland, with support for certain technical development. In November, Kamusi is recognized as a launch partner in the White House Big Data Initiative.
2014: Focus on technical development, including games and mobile apps for engaging the public in the production of linguistic data.
2015: Our Big Data Beta introduces 1.2 million new interlinked records in more than 20 languages, proving Kamusi's capacity to scale. Work is launched on Vietnamese. Server crash in September knocks the site offline to the public for about a year.
2016: Public access moved to kamusigold.org while resources sought to restore full services on the main Kamusi site. Introduction of DUCKS shows the way Kamusi will align data across hundreds or thousands of languages. Launch of Kamusi Here! puts the world's most advanced multilingual dictionary search in the hands of users worldwide.

Kamusi GOLD

These are the languages for which we have datasets that we are actively working toward putting online. Languages that are Active for you to search are marked with "A" in the list below.

Key

• A = Active language, aligned and searchable
• c = Data 🔢 elicited through the Comparative African Word List
• d = Data from independent sources that Kamusi participants align playing 🐥📊 DUCKS
• e = Data from the 🎮 games you can play on 😂🌎🤖 EmojiWorldBot
• P = Pending language, data in queue for alignment
• w = Data from 🔠🕸 WordNet teams

Software and Systems

We are actively creating new software for you to make use of and contribute to the 🎓 knowledge we are bringing together. Learn about software that is ready for you to download or in development, and the unique data systems we are putting in place for advanced language learning and technology:

Articles and Information

Kamusi has many elements. With these articles, you can read the details that interest you:

Videos and Slideshows

Some of what you need to know about Kamusi can best be understood visually. Our 📽 videos are not professional, but we hope you find them useful:

Partners

Our partners - past, present, and future - include:

Hack Kamusi

Here are some of the work elements on our task list that you can help do or fund:

Theory of Kamusi

Select a link below to learn about the principles that guide the project's unique approach to lexicography and public service.

Contact Us

We welcome your comments and questions, and will try to respond quickly. To get in touch, please visit our contact page. You must use a real email address if you want to get a real reply!

kamusigold.org/info/contact

Copyright

The Kamusi Project dictionaries and the Kamusi Project databases are intellectual property protected by international copyright law, ©2007 through ©2016, under the joint ownership of Kamusi Project International and Kamusi Project USA. Further explanation may be found on our © Copyright page.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Commentary

Discussion items about language, technology, and society, from the Kamusi editor and others. This box is growing. To help develop or fund the project, please contact us!

Our biggest struggle is keeping Kamusi online and keeping it free. We cannot charge money for our services because that would block access to the very people we most want to benefit, the students and speakers of languages around the world that are almost always excluded from information technology. So, we ask, request, beseech, beg you, to please support our work by donating as generously as you can to help build and maintain this unique public resource.

/info/donate

Frequently Asked Questions

Answers to general questions you might have about Kamusi services.

We are building this page around real questions from members of the Kamusi community. Send us a question that you think will help other visitors to the site, and frequently we will place the answer here.

Try it: Ask a "FAQ"!

Press Coverage

Kamusi in the news: Reports by journalists and bloggers about our work in newspapers, television, radio, and online.

Sponsor Search: Who Do You Know?

To keep Kamusi growing as a "free" knowledge resource for the world's languages, we need major contributions from philanthropists and organizations. Do you have any connections with a generous person, corporation, foundation, or family office that might wish to make a long term impact on educational outcomes and economic opportunity for speakers of excluded languages around the world? If you can help us reach out to a potential 💛😇 GOLD Angel, please contact us!