Lithuanian (Lithuania) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: lit_LTU_PHON
Type: Text
Unit: 71,000 words
Language: Lithuanian
Country: Lithuania

Malayalam (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: mal_IND_PHON
Type: Text
Unit: 19,000 words
Language: Malayalam
Country: India

Malaysian (Malaysia) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: msa_MYS_PHON
Type: Text
Unit: 26,000 words
Language: Malaysian
Country: Malaysia

Mandarin (Simplified) (China) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: zho_CHN_PHON
Type: Text
Unit: 35,000 words
Language: Mandarin (Simplified)
Country: China

Mandarin (Traditional) (Taiwan) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: zho_TWN_PHON
Type: Text
Unit: 50,000 words
Language: Mandarin (Traditional)
Country: Taiwan

Marathi (India) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling
Dataset ID: mar_IND_PHON
Type: Text
Unit: 30,000 words
Language: Marathi
Country: India

Get Started with Off-the-Shelf AI Training Datasets

Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, providing comprehensive coverage for various AI applications. These datasets are crafted to the highest standards of quality and accuracy, ensuring reliable training data for AI models.