Off-the-Shelf AI Training Datasets

Arabic (UAE) printed text annotated OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_ARU002_CN

Type: Image

Unit: 20,000 images

Language: Arabic

Country: United Arab Emirates

Business-to-business printed text document OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_B2B

Type: Image

Unit: 5,838 documents

Language: N/A

Country: N/A

Chinese command and control prompt response corpus

Common Use Cases: LLM Training, Command and Control, TV Player, Device Control

Dataset ID: DSDH_corpus_CN

Type: Text

Unit: 20,000 sentences

Language: Chinese

Country: China

Finnish (Finland) printed text OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_FIN_CN

Type: Image

Unit: 7,293 images

Language: Finnish

Country: Finland

GlobalPhone Multilingual Text & Speech Database

Common Use Cases: ASR, Language Identification, Multilingual Speech Synthesis, Virtual Assistant, Chatbot

Dataset ID: GLOBALPHONE

Type: Audio

Unit: 450 hours

Language: N/A

Country: Global coverage

Handwritten text document OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_Handwritten

Type: Image

Unit: 663 images

Language: N/A

Country: N/A
