Off-the-Shelf AI Training Datasets

Arabic (Morocco) conversational telephony translation

Common Use Cases: MT, Chatbot, Conversational AI

Dataset ID: ARY_MT001

Type: Text

Unit: 80,430 utterances

Language: Arabic

Country: Morocco

Arabic (UAE) printed text annotated OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_ARU002_CN

Type: Image

Unit: 20,000 images

Language: Arabic

Country: United Arab Emirates

Baby crying audio

Common Use Cases: Baby Monitor, Security & Other Consumer Applications

Dataset ID: CRY_ASR001_CN

Type: Audio

Unit: 70 hours

Language: N/A

Country: China

Business-to-business printed text document OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_B2B

Type: Image

Unit: 5,838 documents

Language: N/A

Country: N/A

Dutch (Netherlands & Belgium) scripted in-car

Common Use Cases: ASR, Virtual Assistant, In-Car HMI & Entertainment

Dataset ID: Dutch and Flemish SpeechDat-Car

Type: Audio

Unit: 27 hours

Language: Dutch

Country: Netherlands, Belgium

English (United States) Ultra High-Volume labeled speech

Common Use Cases: ASR, Conversational AI, Speech Analytics, Automatic Captioning, In-Car HMI & Entertainment, Virtual Assistant

Dataset ID: USE_UHV001

Type: Audio

Unit: 1,196 hours

Language: English

Country: United States

Get Started with Off-the-Shelf AI Training Datasets