Off-the-Shelf AI Training Datasets

English (United States) symbols (in development)

Common Use Cases: Image recognition, Object recognition, OCR

Dataset ID: IMG_SYMBOLS_US

Type: Image

Unit: 1500 images

Language: English

Country: United States

English (United States) Ultra High-Volume labeled speech

Common Use Cases: ASR, Conversational AI, Speech Analytics, Automatic Captioning, In Car HMI & Entertainment, Virtual Assistant

Dataset ID: USE_UHV001

Type: Audio

Unit: 1196 hours

Language: English

Country: United States

GlobalPhone Multilingual Text & Speech Database

Common Use Cases: ASR, Language Identification, Multilingual Speech Synthesis, Virtual Assistant, Chatbot

Dataset ID: GLOBALPHONE

Type: Audio

Unit: 450 hours

Language: N/A

Country: Global coverage

Object Image Collection (text descriptions in development)

Common Use Cases: Image label recognition training, Accessibility, LLM image generation

Dataset ID: IMG_TAG_CN

Type: Image

Unit: 2000 images

Language: N/A

Country: N/A

Pashto (Afghanistan) broadcast

Common Use Cases: ASR, Automatic Captioning, Keyword Spotting

Dataset ID: PAS_BRC001

Type: Audio

Unit: 51 hours

Language: Northern Pashto, Southern Pashto

Country: Afghanistan

Selfie image and video collection

Common Use Cases: Facial Recognition, Human Body Movement Recognition

Dataset ID: IMG_VID_SELFIE_US

Type: Image, Video

Unit: 2938 files (1403 images, 1535 videos)

Language: N/A

Country: United States
