Off-the-Shelf AI Training Datasets

English (United States) Harmful and harmless prompts and responses (in development)

Common Use Cases: LLM training, LLM red teaming, Chatbot

Dataset ID: eng_USA_LLM001

Type: Text

Unit: 300 prompts

Language: English

Country: United States

English (United States) Ultra High-Volume labeled speech

Common Use Cases: ASR, Conversational AI, Speech Analytics, Automatic Captioning, In-Car HMI & Entertainment, Virtual Assistant

Dataset ID: USE_UHV001

Type: Audio

Unit: 1196 hours

Language: English

Country: United States

Finnish (Finland) printed text OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_FIN_CN

Type: Image

Unit: 7293 images

Language: Finnish

Country: Finland

Handwritten text document OCR

Common Use Cases: Document Processing, Document Search, Text Detection

Dataset ID: IMG_OCR_Handwritten

Type: Image

Unit: 663 images

Language: N/A

Country: N/A

Location entrance human body movement videos

Common Use Cases: Security, Movement Detection, Human Body Movement Recognition

Dataset ID: HUMAN_BODY_VID002

Type: Video

Unit: 130 videos

Language: N/A

Country: United Kingdom, Philippines

Pashto (Afghanistan) broadcast

Common Use Cases: ASR, Automatic Captioning, Keyword Spotting

Dataset ID: PAS_BRC001

Type: Audio

Unit: 51 hours

Language: Northern Pashto, Southern Pashto

Country: Afghanistan
