Swedish (Sweden) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: swe_SWE_PHON
Type: Text
Unit: 105,000 words
Language: Swedish
Country: Sweden

Sylheti (Bangladesh – India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: syl_BGD_PHON
Type: Text
Unit: 22,000 words
Language: Sylheti
Country: Bangladesh - India

Tagalog (Philippines) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: tgl_PHL_PHON
Type: Text
Unit: 34,000 words
Language: Tagalog
Country: Philippines

Tamil (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: tam_IND_PHON
Type: Text
Unit: 106,000 words
Language: Tamil
Country: India

Telugu (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: tel_IND_PHON
Type: Text
Unit: 51,000 words
Language: Telugu
Country: India

Thai (Thailand) printed text OCR

Common Use Cases: Document Processing, Document Search, Text Detection
Dataset ID: IMG_OCR_THA_CN
Type: Image
Unit: 1,219 images
Language: Thai
Country: Thailand

Get Started with Off-the-Shelf AI Training Datasets

Appen’s catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, covering a range of AI applications. The datasets are built to high standards of quality and accuracy to provide reliable training data for AI models.