Words Words Words

What words belong in a dictionary? We want them all - every word, in every language. To reach anywhere close to that goal, we combine a number of strategies:
  1. Aligned open 🔢 data. Terms for about 50 languages have been matched to the English version of 🔠🕸 WordNet. Each English 💭 idea has a number, and each language in WordNet has matched its terms to the same number. We've assembled all those words in Kamusi Here!, but that is just a starting point because (a) most languages don't cover the full 🔠🕸 WordNet, (b) WordNet does not include nearly all concepts or terms in English, (c) every language has its own indigenous concepts that are not included in WordNet, and (d) 🔠🕸 WordNet only includes basic canonical forms, not the range of shapes that comprise a term. Additional data, such as images, that has been aligned to 🔠🕸 WordNet elsewhere can also be imported automatically.
  2. Non-aligned open 🔢 data. Many collections of words are 🆓 freely available, but are not already matched to enumerated concepts. If we have a 🔢 data point that something is equivalent to l-i-g-h-t, is that ⚖ (not heavy), 💡 (not dark), or 😆 (not serious)? We have designed 🐥📊 DUCKS so that our visitors can match data they recognize to the data we already have, which both lines up equivalent concepts and reveals concepts we don't yet have, whether in English or indigenous to other languages.
  3. Legacy dictionaries. Most dictionaries were not conceived as "🔢 data", but rather as little bricks of information about their terms. There is no consistency among old dictionaries, or even within one - for example, whether a comma separates synonyms or two different senses. Such dictionaries need to be converted into operable data, either from formats such as Word (if we are 🍀 lucky), or from scanned PDFs passed through OCR (which frequently fails with smudged 👴📃 old texts or unusual character sets). Moreover, copyrights © need to be respected, so we can only use very old dictionaries or those for which we can find the owners and negotiate permission, a process that can take months.
  4. Data 🔢 from 👪 people. Our list of concepts lets us know which 💭 ideas we do not have expressions for in any language. We display that missing data as gold boxes our visitors can fill in, and we can publish that information once it achieves consensus among a 👪🔊 speaker community. Over time, our crowd techniques aim to fill as many gaps as possible on the way to collecting every word in every language.
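To make the idea of numbered concepts concrete, here is a minimal sketch in Python. It is not Kamusi's actual software or data model: the concept numbers, language codes, sample terms, and function names are all made up for illustration. It shows how terms that share a concept number line up across languages, why a bare spelling like l-i-g-h-t needs a person to choose among several concepts, and how the gaps that would appear as gold boxes can be listed.

```python
from collections import defaultdict

# Hypothetical aligned rows: (concept number, language code, term).
# The numbers and terms are illustrative only, not real WordNet or Kamusi data.
ALIGNED = [
    ("c001", "eng", "light"),   # not heavy
    ("c001", "fra", "léger"),
    ("c002", "eng", "light"),   # not dark
    ("c002", "fra", "clair"),
    ("c003", "eng", "light"),   # not serious
]


def build_index(rows):
    """Group terms by concept number, then by language."""
    index = defaultdict(lambda: defaultdict(list))
    for concept_id, lang, term in rows:
        index[concept_id][lang].append(term)
    return index


def candidate_concepts(index, lang, term):
    """Return every concept number a spelling could belong to in one language."""
    return sorted(
        cid for cid, langs in index.items() if term in langs.get(lang, [])
    )


def missing_terms(index, lang):
    """Return concept numbers that have no term yet in the given language."""
    return sorted(cid for cid, langs in index.items() if lang not in langs)


INDEX = build_index(ALIGNED)

# A non-aligned data point "light" matches three concepts, so a human has to pick one.
print(candidate_concepts(INDEX, "eng", "light"))

# Concepts with no French term in this toy data are the gaps a speaker could fill in.
print(missing_terms(INDEX, "fra"))
```

In this toy data, the spelling "light" alone cannot tell us which concept is meant, which is the kind of decision a matching game hands to people, and the concept with no French term is the kind of gap a speaker community could fill.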
/info/words

Kamusi GOLD

These are the languages for which we have datasets that we are actively working toward putting online. Languages that are Active for you to search are marked with "A" in the list below.

Key

•A = Active language, aligned and searchable
•c = Data 🔢 elicited through the Comparative African Word List
•d = Data from independent sources that Kamusi participants align playing 🐥📊 DUCKS
•e = Data from the 🎮 games you can play on 😂🌎🤖 EmojiWorldBot
•P = Pending language, data in queue for alignment
•w = Data from 🔠🕸 WordNet teams

Software and Systems

We are actively creating new software for you to make use of and contribute to the 🎓 knowledge we are bringing together. Learn about software that is ready for you to download or in development, and the unique data systems we are putting in place for advanced language learning and technology:

Articles and Information

Kamusi has many elements. With these articles, you can read the details that interest you:

Videos and Slideshows

Some of what you need to know about Kamusi can best be understood visually. Our 📽 videos are not professional, but we hope you find them useful:

Partners

Our partners - past, present, and future - include:

Hack Kamusi

Here are some of the work elements on our task list that you can help do or fund:

Theory of Kamusi

Select a link below to learn about the principles that guide the project's unique approach to lexicography and public service.

Contact Us

We welcome your comments and questions, and will try to respond quickly. To get in touch, please visit our contact page. You must use a real email address if you want to get a real reply!

kamusigold.org/info/contact

© Copyright ©

The Kamusi Project dictionaries and the Kamusi Project databases are intellectual property protected by international copyright law, ©2007 through ©2016, under the joint ownership of Kamusi Project International and Kamusi Project USA. Further explanation may be found on our © Copyright page.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Commentary

Discussion items about language, technology, and society, from the Kamusi editor and others. This box is growing. To help develop or fund the project, please contact us!

Our biggest struggle is keeping Kamusi online and keeping it free. We cannot charge money for our services because that would block access to the very people we most want to benefit, the students and speakers of languages around the world that are almost always excluded from information technology. So we ask, request, beseech, and beg you to please support our work by donating as generously as you can to help build and maintain this unique public resource.

/info/donate

Frequently Asked Questions

Answers to general questions you might have about Kamusi services.

We are building this page around real questions from members of the Kamusi community. Send us a question that you think will help other visitors to the site, and we will often post the answer here.

Try it: Ask a "FAQ"!

Press Coverage

Kamusi in the news: Reports by journalists and bloggers about our work in newspapers, television, radio, and online.

Sponsor Search:
Who Do You Know?



To keep Kamusi growing as a "free" knowledge resource for the world's languages, we need major contributions from philanthropists and organizations. Do you have any connections with a generous person, corporation, foundation, or family office that might wish to make a long term impact on educational outcomes and economic opportunity for speakers of excluded languages around the world? If you can help us reach out to a potential 💛😇 GOLD Angel, please contact us!