Why Common Voice?
Common Voice is a publicly available voice dataset, powered by the voices of volunteer contributors around the world. People who want to build voice applications can use the dataset to train machine learning models.
At present, most voice datasets are owned by companies, which stifles innovation. Voice datasets also underrepresent non-English speakers, people of colour, disabled people, women and LGBTQIA+ people. This means that voice-enabled technology doesn’t work at all for many languages, and where it does work, it may not perform equally well for everyone. We want to change that by mobilising people everywhere to share their voice.
How does Common Voice work?
We’re crowdsourcing an open-source dataset of voices. Donate your voice, validate the accuracy of other people’s clips, and make the dataset better for everyone.
Language Request
Someone asks for a language to be added.
Website Localization
The website text is translated into that language.
Sentence Collection
Sentences are collected for people to read aloud.
New Language Launch
We launch the Common Voice site in this language.
Voice Contribution
People come and contribute their voices.
Voice Validation
Other people validate those voice clips.
Dataset Release
We release the dataset every 3 months.
What is a language on Common Voice?
There are lots of ways to think about language. For the purposes of speech recognition models, Common Voice suggests focussing on ‘mutual intelligibility’, or ‘can speakers of this language mostly understand one another if they try to?’
We want speech models to be better at understanding a diverse range of speakers. For this to happen, a voice dataset must represent lots of different people.
Some languages have enormous variation in grammar, vocabulary and pronunciation. For this reason, we are introducing ‘Variants’ in 2022. This gives communities a way to distinguish their languages within the larger dataset.
How do I add a language?
First, check if your language already exists. If it doesn’t, you can ask about adding your language. There are two stages: translating the site and collecting sentences.
Translating the site
Watch our guide on how to use Pontoon.
We use a Mozilla tool called Pontoon for translations. Pontoon has lots of languages, but if it doesn’t have yours, you can request that it be added. Then, to make the language available on the Common Voice project, request the new language on GitHub. See more on site translation.
Collecting sentences
Watch our guide on using the Sentence Collector.
You can add small numbers of sentences, or you can do bulk imports using GitHub. Remember that sentences need to be CC0 (or public domain), or you can write your own.
How does site localization work?
Translation of the Common Voice site happens on Pontoon.
Create an account if you don’t have one. Then choose your language (‘Team’) and the project, Common Voice. There will be files to translate. Click on one to see the English text and a box for your translation.
Translation is from English, but you can see Suggestions in other languages. Click the Profile icon, then the Settings link, and add any languages you speak. A list of translations called Locales will appear in the bottom right-hand corner. Translations show on the site after one day.
The site is ready to be launched when it reaches 75% completion.
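As a rough illustration, the 75% rule can be written as a simple check. This is a minimal sketch, not Pontoon's own logic: it assumes completion is measured as the share of translated strings, and the function and parameter names are hypothetical.

```python
LAUNCH_THRESHOLD_PERCENT = 75  # the completion level stated above


def is_ready_to_launch(translated_strings: int, total_strings: int) -> bool:
    """Return True once the translated share reaches the launch threshold.

    Assumption: completion is the share of translated strings.
    """
    if total_strings <= 0:
        raise ValueError("total_strings must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return translated_strings * 100 >= LAUNCH_THRESHOLD_PERCENT * total_strings
```

For example, `is_ready_to_launch(300, 400)` is true, while `is_ready_to_launch(299, 400)` is not.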
Watch our video explainer to help.
How do I add sentences?
You can add sentences on the Write page or review sentences on the Review page.
Sentences must be reviewed and accepted by two people to be included in Common Voice. You can create guidelines for your language here. Sentences must be in the public domain and shorter than 15 words. You can ask the owner of a text to make it CC0 using our waiver process, and send it to us at commonvoice@mozilla.com.
You can use the Sentence extractor to pull short sentences from Wikipedia.
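If you are preparing sentences in bulk, it can help to check them against the rules above before submitting. The sketch below is only an illustration with hypothetical names. It assumes words can be counted by splitting on whitespace, which will not suit languages written without spaces, and it does not check licensing.

```python
MAX_WORDS = 15          # sentences must be shorter than 15 words
REQUIRED_APPROVALS = 2  # two people must review and accept a sentence


def is_short_enough(sentence: str) -> bool:
    """Check the length rule. Assumption: words are separated by whitespace."""
    word_count = len(sentence.split())
    return 0 < word_count < MAX_WORDS


def is_accepted(sentence: str, approvals: int) -> bool:
    """A sentence is included once it is short enough and two people accept it."""
    return is_short_enough(sentence) and approvals >= REQUIRED_APPROVALS
```

A sentence of 14 words with two approvals passes. The same sentence with one approval, or a 15-word sentence, does not.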
How do I record a high quality voice clip?
Speak in your normal voice! The way you speak is welcome here - we want your accent as it is, and we want your usual volume, style and intonation.
Avoid too much background noise - it should be easy to hear you.
Read the sentence carefully - don’t miss, change or add words.
Make sure the platform is recording before you start speaking, and that it only stops once you’re finished.
How can we effectively grow a language on Common Voice?
Creating opportunities for a diversity of people to contribute to Common Voice ensures the dataset serves as many people as possible. We’ve created resources and templates that you can use!
Events
You can run events to help people contribute. It’s easier than you think. You could do it online with a videoconferencing tool, or in person if it’s safe. Check out our templates and resources for running events.
Social media
You could use social media platforms to get the message out. Share posts that explain why it matters, and get in touch with other people talking about issues like language rights, voice AI, or bias in tech. See more advice on running a social campaign, including content you can re-use.
Partnerships and networks
Find others who care. That could be universities, language schools, advocacy groups or data science communities. Reach out and explain clearly how they can help and why. See our template outreach emails.
Get creative! Your language community will be unique, and these are just a few ways to get started.
How do I know whether to approve a voice clip?
If you can hear and understand the speaker, it’s usually best to approve.
Do not reject clips where the speaker ‘has an accent’ that is different to your own - this is important for voice recognition to work better for everyone.
If you think the pronunciation makes it impossible to understand, or there’s a lot of background noise, or there are other people speaking too, then you should reject the clip. See more information in our accuracy criteria.
If a clip is rejected by 2 people, it is released in a different subset of the dataset.
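The two-rejection rule can be pictured as a small sorting step. This is a sketch under stated assumptions, not the platform's actual code: the clip names, the vote counts and the `split_clips` helper are all hypothetical, and it models only the rejection rule described above.

```python
REJECTIONS_NEEDED = 2  # a clip rejected by 2 people moves to a separate subset


def split_clips(reject_votes: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split clips into (rejected, remaining) by their number of reject votes.

    reject_votes maps a hypothetical clip name to how many people rejected it.
    """
    rejected = [name for name, votes in reject_votes.items()
                if votes >= REJECTIONS_NEEDED]
    remaining = [name for name, votes in reject_votes.items()
                 if votes < REJECTIONS_NEEDED]
    return rejected, remaining
```

Calling `split_clips({"clip_a": 0, "clip_b": 2, "clip_c": 1})` places only `clip_b` in the rejected list.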
How is Common Voice funded?
Common Voice is a project of the Mozilla Foundation, a US 501(c)(3) organisation. The project is currently funded entirely by philanthropic grants and donations from people around the world.
It costs a lot of money to continually host and release the datasets, improve the platform and run community programmes.
If you or your organisation would like to contribute back to the project, you can make a donation or reach out to our partnerships team at commonvoice@mozilla.com.
How do I access and use the dataset?
You can go to the datasets page, select the version and language(s) you want, and download it! The files have associated metadata, such as demographic information and validation data. You’ll need to provide an email address to download the dataset.
If you’re looking for tools to build ASR models, you can connect to other people in the community on Discourse.
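Once you have downloaded a dataset, a first step is often to look at who is represented in the metadata. The sketch below assumes the metadata is a tab-separated file with a header row and a demographic column such as `gender`. The column names and the sample rows are made up for illustration, so check the files in your download for the real layout.

```python
import csv
import io
from collections import Counter


def count_by_column(tsv_text: str, column: str) -> Counter:
    """Count rows per value of one metadata column in tab-separated text."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # sentences may contain quote characters
    )
    # Empty fields are grouped under "unspecified".
    return Counter(row[column] or "unspecified" for row in reader)


# Made-up sample rows; the column names are assumptions.
SAMPLE = (
    "path\tsentence\tgender\n"
    "clip_1.mp3\tHello there\tfemale\n"
    "clip_2.mp3\tGood morning\t\n"
    "clip_3.mp3\tSee you soon\tfemale\n"
)

counts = count_by_column(SAMPLE, "gender")
```

To run the same count on a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in.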
How are project decisions made?
Mozilla Common Voice is made possible by a diverse community of activists, linguists, data scientists, academics and software engineers from all over the world. The project is stewarded by the Mozilla Foundation.
Our governance is founded on the pillars of:
Privacy, security and transparency.
Community participation and decision making.
Value and recognition.
Mutual accountability.
Read more about how we're governed