AI shows progress in handling rare and underrepresented languages
Large language models are narrowing the linguistic gap
New versions of artificial intelligence (AI) models show significant progress in handling rare and low-resource languages, substantially narrowing the global “language gap.” That is the finding of a study from RWS published on TechRadar.
1. What the research shows
- Google Gemini Pro received a quality score above 4.5 out of 5 for its knowledge of Kinyarwanda – spoken by about 12 million people in Rwanda, Uganda and the Democratic Republic of Congo.
- The authors attribute this success to cross-lingual transfer: modern models rely not only on large datasets for a specific language but also on statistical patterns shared across all languages.
- Improvements in tokenizers—systems that split text into “tokens”—also contribute to more accurate handling of rare languages.
2. The “benchmark drift” effect
Experts found that a model's capabilities can change unexpectedly from one version to the next:
- The latest OpenAI GPT version falls behind older models on some content-generation tasks where its predecessor was more effective.
- Tokenizer efficiency can differ by a factor of up to 3.5 between generations, so results from earlier tests do not always apply to new versions.
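To make the tokenizer point concrete, here is a minimal sketch of one way such an efficiency gap could be measured: the average number of tokens per word for the same text under two tokenizers. The two chunk-based tokenizers and the sample phrase are hypothetical stand-ins chosen for illustration. They are not the tokenizers of any real model, and the study's own methodology is not described in the article.

```python
def make_chunk_tokenizer(chunk_size):
    """Build a toy tokenizer that cuts every word into pieces of chunk_size characters.

    This is a hypothetical stand-in for a real subword tokenizer.
    """
    def tokenize(text):
        tokens = []
        for word in text.split():
            tokens.extend(word[i:i + chunk_size] for i in range(0, len(word), chunk_size))
        return tokens
    return tokenize


def tokens_per_word(tokenize, text):
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    return len(tokenize(text)) / len(words)


# Assumption: an older generation fragments words into small pieces,
# while a newer generation keeps longer pieces intact.
old_tokenize = make_chunk_tokenizer(3)
new_tokenize = make_chunk_tokenizer(8)

sample = "murakoze cyane"  # arbitrary sample phrase used only for illustration

old_rate = tokens_per_word(old_tokenize, sample)
new_rate = tokens_per_word(new_tokenize, sample)
efficiency_gap = old_rate / new_rate

print(f"old: {old_rate} tokens/word, new: {new_rate} tokens/word, gap: {efficiency_gap}x")
```

If two generations of a model split the same sentence this differently, a test tuned to the older tokenizer measures something different on the newer one, which is one reason earlier benchmark results may not carry over.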
3. What is changing in developers’ priorities
- Previously, AI labs prioritized performance in English and a few other key languages.
- Modern models already handle those languages well, so attention is shifting toward a broader audience, and support for rare languages is becoming increasingly important.
- However, a 4.5/5 rating does not guarantee real proficiency in a language, and multilingual support is still not considered critical.
4. Bottom line
AI continues to break down barriers between cultures and languages. Although coverage of rare languages is not yet a requirement, the trend toward broader audience reach is already visible, and experts expect it to intensify in the coming years.