Off-the-Shelf AI Training Datasets

Arabic (MSA) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: arb_MSA_PHON

Type: Text

Unit: 40,000 words

Language: Arabic (Standard)

Country: N/A

Chinese and English Related Texts

Common Use Cases: LLM training

Dataset ID: GLWB_CN

Type: Text

Unit: 400,000

Language: English/Chinese

Country: N/A

Code Q&A Dataset

Common Use Cases: LLM training

Dataset ID: DM_CNRD

Type: Text

Unit: 12 million pairs

Language: English

Country: N/A

English Inverse Text Normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning

Dataset ID: ENG_ITN001

Type: Text

Unit: 4,454 test cases

Language: English

Country: N/A

French Inverse Text Normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning

Dataset ID: FRA_ITN001

Type: Text

Unit: 3,274 test cases

Language: French

Country: N/A

German Inverse Text Normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning

Dataset ID: DEU_ITN001

Type: Text

Unit: 8,001 test cases

Language: German

Country: N/A

Get Started with Off-the-Shelf AI Training Datasets

Get in touch