English (United States) Ultra High-Volume labeled speech

Common Use Cases: ASR, Conversational AI, Speech Analytics, Automatic Captioning, In Car HMI & Entertainment, Virtual Assistant
Dataset ID: USE_UHV001
Type: Audio
Unit: 1,196 hours
Language: English
Country: United States

English NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: ENG_NER001
Type: Text
Unit: 22,768 sentences
Language: English
Country: N/A

Farsi/Persian NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: FAR_NER001
Type: Text
Unit: 19,584 sentences
Language: Iranian Persian
Country: Iran

French (Belgium) scripted telephony

Common Use Cases: ASR, Virtual Assistant
Dataset ID: Belgian French SpeechDat(II) FDB-1000 (FIXED1BF)
Type: Audio
Unit: 76 hours
Language: French
Country: Belgium

French (Canada) scripted microphone

Common Use Cases: ASR, Virtual Assistant, Chatbot
Dataset ID: FRC_ASR002
Type: Audio
Unit: 46 hours
Language: French
Country: Canada

French (Canada) scripted telephony

Common Use Cases: ASR, Virtual Assistant
Dataset ID: FRC_ASR001
Type: Audio
Unit: 131 hours
Language: French
Country: Canada

Get Started with Off-the-Shelf AI Training Datasets

Appen’s catalog of off-the-shelf (OTS) datasets spans multiple data types and industries and covers a range of AI applications. The datasets are built to high standards of quality and accuracy to provide reliable training data for AI models.