Cantonese (China) business dialogues

Common Use Cases: ASR, Conversational AI, Speech Analytics, Business Intelligence
Dataset ID: YYDH_ASR001_CN
Type: Audio
Unit: 98.35 hours
Language: Cantonese
Country: China

East African facial images

Common Use Cases: Facial Recognition
Dataset ID: IMG_FACE_KEN_CN
Type: Image
Unit: 13500 images
Language: N/A
Country: Kenya

English (United States) Adversarial prompts for LLM red teaming **in development**

Common Use Cases: LLM training, LLM Red teaming
Dataset ID: eng_USA_LLM002
Type: Text
Unit: 500 prompts
Language: English
Country: United States

English (United States) Harmful and harmless prompts and responses **in development**

Common Use Cases: LLM training, LLM Red teaming, Chatbot
Dataset ID: eng_USA_LLM001
Type: Text
Unit: 300 prompts
Language: English
Country: United States

English Inverse text normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning
Dataset ID: ENG_ITN001
Type: Text
Unit: 4454 test cases
Language: English
Country: N/A

French Inverse text normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning
Dataset ID: FRA_ITN001
Type: Text
Unit: 3274 test cases
Language: French
Country: N/A

Get Started with Off-the-Shelf AI Training Datasets

Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, providing comprehensive coverage for various AI applications. These datasets are crafted to the highest standards of quality and accuracy, ensuring reliable training data for AI models.
