Supported Languages
Jiwar supports 41 languages out of the box through built-in corpora, and many more through user-supplied custom corpora.
Languages with Built-in Corpora
These languages have pre-loaded corpora and can be used immediately with Jiwar:
| N | Code | Language | Supported Writing Script |
|---|---|---|---|
| 1 | af | Afrikaans | Latin |
| 2 | ar | Arabic | Fully diacritized Arabic |
| 3 | bg | Bulgarian | Cyrillic |
| 4 | bs | Bosnian | Cyrillic, Latin |
| 5 | ca | Catalan | Latin |
| 6 | cs | Czech | Latin |
| 7 | de | German | Latin |
| 8 | el | Greek | Greek |
| 9 | en-gb | English (GB) | Latin |
| 10 | en-us | English (US) | Latin |
| 11 | eo | Esperanto | Latin |
| 12 | es | Spanish | Latin |
| 13 | et | Estonian | Latin |
| 14 | eu | Basque | Latin |
| 15 | fa | Persian | Perso-Arabic |
| 16 | fi | Finnish | Latin |
| 17 | fr | French | Latin |
| 18 | hr | Croatian | Latin |
| 19 | hu | Hungarian | Latin |
| 20 | hy | Armenian | Armenian |
| 21 | id | Indonesian | Latin |
| 22 | it | Italian | Latin |
| 23 | kk | Kazakh | Cyrillic |
| 24 | ko | Korean | Hangul |
| 25 | lt | Lithuanian | Latin |
| 26 | lv | Latvian | Latin |
| 27 | mk | Macedonian | Cyrillic |
| 28 | ms | Malay | Latin |
| 29 | nl | Dutch | Latin |
| 30 | no | Norwegian | Latin |
| 31 | pl | Polish | Latin |
| 32 | pt | Portuguese | Latin |
| 33 | ro | Romanian | Latin |
| 34 | ru | Russian | Cyrillic |
| 35 | sk | Slovak | Latin |
| 36 | sq | Albanian | Latin |
| 37 | sr | Serbian | Cyrillic |
| 38 | sv | Swedish | Latin |
| 39 | tr | Turkish | Latin |
| 40 | uk | Ukrainian | Cyrillic |
| 41 | ur | Urdu | Perso-Arabic |
Languages Requiring Custom Corpora
These languages are supported by Jiwar but require a custom corpus:
| Code | Language |
|---|---|
| am | Amharic |
| an | Aragonese |
| as | Assamese |
| az | Azerbaijani |
| ba | Bashkir |
| bn | Bengali |
| bpy | Bishnupriya Manipuri |
| chr | Cherokee |
| cmn | Mandarin Chinese |
| cv | Chuvash |
| en-029 | Caribbean English |
| en-gb-x-gbclan | Lancastrian English |
| en-gb-x-gbcwmd | West Midlands English |
| en-gb-x-rp | Received Pronunciation English |
| es-419 | Latin American Spanish |
| fa-latn | Persian (Latin script) |
| fr-be | Belgian French |
| fr-ch | Swiss French |
| ga | Irish Gaelic |
| gd | Scottish Gaelic |
| gn | Guarani |
| grc | Ancient Greek |
| gu | Gujarati |
| hak | Hakka Chinese |
| haw | Hawaiian |
| he | Hebrew |
| hi | Hindi |
| ht | Haitian Creole |
| hyw | Western Armenian |
| ia | Interlingua |
| io | Ido |
| is | Icelandic |
| ja | Japanese |
| jbo | Lojban |
| ka | Georgian |
| kl | Greenlandic |
| kn | Kannada |
| kok | Konkani |
| ku | Kurdish |
| ky | Kyrgyz |
| la | Latin |
| lb | Luxembourgish |
| lfn | Lingua Franca Nova |
| ltg | Latgalian |
| mi | Maori |
| ml | Malayalam |
| mr | Marathi |
| mt | Maltese |
| my | Burmese |
| nci | Classical Nahuatl |
| ne | Nepali |
| nb | Norwegian Bokmål |
| nog | Nogai |
| om | Oromo |
| or | Oriya |
| pa | Punjabi |
| pap | Papiamento |
| piqd | Klingon |
| pt-br | Brazilian Portuguese |
| qdb | Lang Belta |
| qu | Quechua |
| quc | K’iche’ |
| qya | Quenya |
| ru-lv | Latvian Russian |
| sd | Sindhi |
| shn | Shan |
| si | Sinhala |
| sjn | Sindarin |
| smj | Lule Sami |
| sw | Swahili |
| ta | Tamil |
| te | Telugu |
| th | Thai |
| tk | Turkmen |
| tl | Tagalog |
| tn | Setswana |
| tt | Tatar |
| ug | Uyghur |
| uz | Uzbek |
| vi-vn-x-central | Central Vietnamese |
| vi-vn-x-south | Southern Vietnamese |
| yue | Cantonese |
For instructions on creating and using custom corpora, please refer to the Creating and Using Custom Corpora page.
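
If you need to decide programmatically whether a language code can be used immediately or needs a custom corpus first, a small lookup is enough. The sketch below is illustrative only and is not part of Jiwar's API: `BUILTIN_CORPUS_CODES` and `has_builtin_corpus` are hypothetical names, and the set of codes is simply copied from the built-in corpora table on this page. It also assumes that codes are compared case-insensitively and that `_` can stand in for `-`, which Jiwar itself may not accept.

```python
# Hypothetical helper, not part of Jiwar: the codes are copied from the
# "Languages with Built-in Corpora" table on this page.
BUILTIN_CORPUS_CODES = frozenset({
    "af", "ar", "bg", "bs", "ca", "cs", "de", "el", "en-gb", "en-us",
    "eo", "es", "et", "eu", "fa", "fi", "fr", "hr", "hu", "hy",
    "id", "it", "kk", "ko", "lt", "lv", "mk", "ms", "nl", "no",
    "pl", "pt", "ro", "ru", "sk", "sq", "sr", "sv", "tr", "uk", "ur",
})


def has_builtin_corpus(code: str) -> bool:
    """Return True if `code` appears in the built-in corpora table."""
    # Assumption: normalise case and separators before the lookup.
    normalized = code.strip().lower().replace("_", "-")
    return normalized in BUILTIN_CORPUS_CODES


for code in ("ar", "EN_GB", "ja"):
    status = "built-in corpus" if has_builtin_corpus(code) else "no built-in corpus"
    print(f"{code}: {status}")
```

A code that is not in the built-in set is not automatically supported with a custom corpus; check it against the second table before building one.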