English (United States) symbols **in development**

Common Use Cases: Image recognition, Object recognition, OCR
Dataset ID: IMG_SYMBOLS_US
Type: Image
Unit: 1,500 images
Language: English
Country: United States

English Inverse text normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning
Dataset ID: ENG_ITN001
Type: Text
Unit: 4,454 test cases
Language: English
Country: N/A

English NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: ENG_NER001
Type: Text
Unit: 22,768 sentences
Language: English
Country: N/A

Farsi/Persian NER news text

Common Use Cases: NER, Content Classification, Search Engines
Dataset ID: FAR_NER001
Type: Text
Unit: 19,584 sentences
Language: Iranian Persian
Country: Iran

French Inverse text normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning
Dataset ID: FRA_ITN001
Type: Text
Unit: 3,274 test cases
Language: French
Country: N/A

Garments image and video collection **in development**

Common Use Cases: Image recognition, Object recognition, Retail, E-commerce
Dataset ID: IMG_VID_GARMENTS_US
Type: Video
Unit: 300 sessions
Language: N/A
Country: United States

Get Started with Off-the-Shelf AI Training Datasets

Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, providing comprehensive coverage for various AI applications. These datasets are crafted to the highest standards of quality and accuracy, ensuring reliable training data for AI models.
