Arabic NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: ARB_NER001
Type: Text
Unit: 20,774 sentences
Language: Arabic (Standard)
Country: N/A

Chinese command and control prompt response corpus

Common Use Cases: LLM training, Command and Control, TV Player, Device Control
Dataset ID: DSDH_corpus_CN
Type: Text
Unit: 20,000 sentences
Language: Chinese
Country: China

East African facial images

Common Use Cases: Facial Recognition
Dataset ID: IMG_FACE_KEN_CN
Type: Image
Unit: 13,500 images
Language: N/A
Country: Kenya

English NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: ENG_NER001
Type: Text
Unit: 22,768 sentences
Language: English
Country: N/A

Farsi/Persian NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: FAR_NER001
Type: Text
Unit: 19,584 sentences
Language: Iranian Persian
Country: Iran

Japanese NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: JPY_NER001
Type: Text
Unit: 20,629 sentences
Language: Japanese
Country: Japan

Get Started with Off-the-Shelf AI Training Datasets

Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries and covers a wide range of AI applications. The datasets are crafted to the highest standards of quality and accuracy to provide reliable training data for AI models.
