Off-the-Shelf AI Training Datasets

Cantonese (China) business dialogues

Common Use Cases: ASR, Conversational AI, Speech Analytics, Business Intelligence

Dataset ID: YYDH_ASR001_CN

Type: Audio

Unit: 98.35 hours

Language: Cantonese

Country: China

Chinese and English related texts

Common Use Cases: LLM training

Dataset ID: GLWB_CN

Type: Text

Unit: 400000

Language: English/Chinese

Country: N/A

Chinese command and control prompt response corpus

Common Use Cases: LLM training, Command and Control, TV Player, Device Control

Dataset ID: DSDH_corpus_CN

Type: Text

Unit: 20000 sentences

Language: Chinese

Country: China

Chinese instruction set sentence corpus

Common Use Cases: LLM training

Dataset ID: ZLJ_corpus_CN

Type: Text

Unit: 200000 sentences

Language: Chinese

Country: China

Chinese multidisciplinary test questions corpus

Common Use Cases: LLM training

Dataset ID: MTQ_CN

Type: Text

Unit: 319970 sentences

Language: Chinese

Country: China

Chinese news text summaries corpus

Common Use Cases: LLM training

Dataset ID: DMXWB_corpus_CN

Type: Text

Unit: 20000 summaries

Language: Chinese

Country: China
