Supported Languages

Jiwar supports a wide range of languages: 41 ship with built-in corpora, and the rest work through custom corpora.

Languages with Built-in Corpora

These languages have pre-loaded corpora and can be used immediately with Jiwar:

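| No. | Code | Language | Supported Writing Script |
|-----|-------|--------------|--------------------------|
| 1 | af | Afrikaans | Latin |
| 2 | ar | Arabic | Fully diacritized Arabic |
| 3 | bg | Bulgarian | Cyrillic |
| 4 | bs | Bosnian | Cyrillic, Latin |
| 5 | ca | Catalan | Latin |
| 6 | cs | Czech | Latin |
| 7 | de | German | Latin |
| 8 | el | Greek | Greek |
| 9 | en-gb | English (GB) | Latin |
| 10 | en-us | English (US) | Latin |
| 11 | eo | Esperanto | Latin |
| 12 | es | Spanish | Latin |
| 13 | et | Estonian | Latin |
| 14 | eu | Basque | Latin |
| 15 | fa | Persian | Perso-Arabic |
| 16 | fi | Finnish | Latin |
| 17 | fr | French | Latin |
| 18 | hr | Croatian | Latin |
| 19 | hu | Hungarian | Latin |
| 20 | hy | Armenian | Armenian |
| 21 | id | Indonesian | Latin |
| 22 | it | Italian | Latin |
| 23 | kk | Kazakh | Cyrillic |
| 24 | ko | Korean | Hangul |
| 25 | lt | Lithuanian | Latin |
| 26 | lv | Latvian | Latin |
| 27 | mk | Macedonian | Cyrillic |
| 28 | ms | Malay | Latin |
| 29 | nl | Dutch | Latin |
| 30 | no | Norwegian | Latin |
| 31 | pl | Polish | Latin |
| 32 | pt | Portuguese | Latin |
| 33 | ro | Romanian | Latin |
| 34 | ru | Russian | Cyrillic |
| 35 | sk | Slovak | Latin |
| 36 | sq | Albanian | Latin |
| 37 | sr | Serbian | Cyrillic |
| 38 | sv | Swedish | Latin |
| 39 | tr | Turkish | Latin |
| 40 | uk | Ukrainian | Cyrillic |
| 41 | ur | Urdu | Perso-Arabic |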

Languages Requiring Custom Corpora

These languages are supported by Jiwar but require a custom corpus:

| Code | Language |
|------|----------|
| am | Amharic |
| an | Aragonese |
| as | Assamese |
| az | Azerbaijani |
| ba | Bashkir |
| bn | Bengali |
| bpy | Bishnupriya Manipuri |
| chr | Cherokee |
| cmn | Mandarin Chinese |
| cv | Chuvash |
| en-029 | Caribbean English |
| en-gb-x-gbclan | Lancastrian English |
| en-gb-x-gbcwmd | West Midlands English |
| en-gb-x-rp | Received Pronunciation English |
| es-419 | Latin American Spanish |
| fa-latn | Persian (Latin script) |
| fr-be | Belgian French |
| fr-ch | Swiss French |
| ga | Irish Gaelic |
| gd | Scottish Gaelic |
| gn | Guarani |
| grc | Ancient Greek |
| gu | Gujarati |
| hak | Hakka Chinese |
| haw | Hawaiian |
| he | Hebrew |
| hi | Hindi |
| ht | Haitian Creole |
| hyw | Western Armenian |
| ia | Interlingua |
| io | Ido |
| is | Icelandic |
| ja | Japanese |
| jbo | Lojban |
| ka | Georgian |
| kl | Greenlandic |
| kn | Kannada |
| kok | Konkani |
| ku | Kurdish |
| ky | Kyrgyz |
| la | Latin |
| lb | Luxembourgish |
| lfn | Lingua Franca Nova |
| ltg | Latgalian |
| mi | Maori |
| ml | Malayalam |
| mr | Marathi |
| mt | Maltese |
| my | Burmese |
| nci | Classical Nahuatl |
| ne | Nepali |
| nb | Norwegian Bokmål |
| nog | Nogai |
| om | Oromo |
| or | Oriya |
| pa | Punjabi |
| pap | Papiamento |
| piqd | Klingon |
| pt-br | Brazilian Portuguese |
| qdb | Lang Belta |
| qu | Quechua |
| quc | K’iche’ |
| qya | Quenya |
| ru-lv | Latvian Russian |
| sd | Sindhi |
| shn | Shan |
| si | Sinhala |
| sjn | Sindarin |
| smj | Lule Sami |
| sw | Swahili |
| ta | Tamil |
| te | Telugu |
| th | Thai |
| tk | Turkmen |
| tl | Tagalog |
| tn | Setswana |
| tt | Tatar |
| ug | Uyghur |
| uz | Uzbek |
| vi-vn-x-central | Central Vietnamese |
| vi-vn-x-south | Southern Vietnamese |
| yue | Cantonese |
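To decide programmatically whether a language code needs a custom corpus, the built-in list above can be captured in a small lookup. The following is a minimal sketch that does not use Jiwar's own API: the names `BUILT_IN_CODES` and `has_builtin_corpus` are hypothetical helpers introduced for illustration, and the codes are copied from the built-in corpora table on this page.

```python
# Hypothetical helper, not part of Jiwar: codes copied from the
# "Languages with Built-in Corpora" table on this page.
BUILT_IN_CODES = frozenset(
    """
    af ar bg bs ca cs de el en-gb en-us eo es et eu fa fi fr hr hu hy id it
    kk ko lt lv mk ms nl no pl pt ro ru sk sq sr sv tr uk ur
    """.split()
)


def has_builtin_corpus(code: str) -> bool:
    """Return True if `code` appears in the built-in corpora table.

    Matching is case-insensitive and ignores surrounding whitespace.
    A False result only means the code is absent from that table; it
    does not by itself confirm that the language is supported through
    a custom corpus.
    """
    return code.strip().lower() in BUILT_IN_CODES


if __name__ == "__main__":
    for code in ("ar", "en-gb", "he", "pt-br"):
        status = "built-in corpus" if has_builtin_corpus(code) else "custom corpus needed"
        print(f"{code}: {status}")
```

A check like this could run before any processing starts, so that a missing corpus is reported early rather than partway through a job.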

For instructions on creating and using custom corpora, please refer to the Creating and Using Custom Corpora page.