Off-the-shelf (OTS) Datasets

Mandarin NER news text

Dataset ID: MAC_NER001
Dataset Name: Mandarin NER news text
Common Use Cases: NER, Content Classification, Search Engines
Language: Mandarin Chinese
Country: China
Language Code: cmn
Country Code: CHN
Product Type: Text
Detailed Product Type: News NER
Unit: 17,313 sentences
Recording Device: N/A
Recording Condition: N/A
Contributors: N/A
Utterances: 17,313
Unique Words: Available on request
Sample Rate (kHz): N/A
Channels: N/A
Data Format: text
Source: Appen Global
Additional Info:
  • News text corpora with entities tagged in XML format: Person, Title, Organization, Location, Geo-political entity, Facility, Religion, Nationality, Quantity
Year of Collection: 2009
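
As a rough illustration of working with XML-tagged entity text like that described under Additional Info, the sketch below pulls entity spans out of one tagged sentence using Python's standard library. The element names (`s`, `PERSON`, `LOC`, `ORG`) and the sample sentence are hypothetical; the actual tag names and file layout of MAC_NER001 are not given in this listing and should be checked against the delivered data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real tag names and layout of MAC_NER001 are not
# documented in this listing, so <s>, <PERSON>, <LOC>, <ORG> are assumptions.
SAMPLE = "<s><PERSON>王小明</PERSON>昨天在<LOC>北京</LOC>访问了<ORG>清华大学</ORG>。</s>"


def extract_entities(sentence_xml):
    """Return (entity_type, surface_text) pairs in document order."""
    root = ET.fromstring(sentence_xml)
    # iter() yields the root first, then descendants in document order;
    # the root is only the sentence wrapper, so it is skipped.
    return [
        (el.tag, "".join(el.itertext()))
        for el in root.iter()
        if el is not root
    ]


def plain_text(sentence_xml):
    """Strip the markup and return the untagged sentence."""
    return "".join(ET.fromstring(sentence_xml).itertext())


if __name__ == "__main__":
    for entity_type, text in extract_entities(SAMPLE):
        print(entity_type, text)
    print(plain_text(SAMPLE))
```

If the delivered files use attributes (for example a `type` attribute on a generic entity element) rather than one element per entity type, the list comprehension would read the attribute instead of `el.tag`.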

