Off-the-Shelf AI Training Datasets

Swedish (Sweden) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: swe_SWE_PHON

Type: Text

Unit: 105,000 words

Language: Swedish

Country: Sweden

Sylheti (Bangladesh – India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: syl_BGD_PHON

Type: Text

Unit: 22,000 words

Language: Sylheti

Country: Bangladesh – India

Tagalog (Philippines) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: tgl_PHL_PHON

Type: Text

Unit: 34,000 words

Language: Tagalog

Country: Philippines

Tamil (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: tam_IND_PHON

Type: Text

Unit: 106,000 words

Language: Tamil

Country: India

Telugu (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: tel_IND_PHON

Type: Text

Unit: 51,000 words

Language: Telugu

Country: India

Thai (Thailand) Printed Text OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_THA_CN

Type: Image

Unit: 1,219 images

Language: Thai

Country: Thailand

Get Started with Off-the-Shelf AI Training Datasets