Dataset ID: ARS_ASR001_CN
Dataset Name: Arabic (Saudi Arabia) scripted smartphone
Common Use Cases: ASR, Virtual Assistant, Chatbot
Language: Arabic
Country: Saudi Arabia
Language Code: ara
Country Code: SAU
Product Type: Audio
Detailed Product Type: Scripted Speech
Unit: 322 hours
Recording Device: Mobile phone
Recording Condition: Low background noise (home/office)
Contributors: 227
Utterances: 104,574
Unique Words: 156,282
Sample Rate (kHz): 16
Channels: 1
Data Format: wav
Source: Appen China
Additional Info:
- Dataset contains audio with corresponding text prompts
- Text prompts are not vowelised
- 300-1000 prompts per speaker covering general content including education, sports, entertainment, travel, culture and technology
Year of Collection: 2020
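The audio specifications listed above (16 kHz sample rate, 1 channel, wav format) can be spot-checked on delivered files. The following is a minimal sketch, not an Appen tool: it assumes the files are uncompressed PCM WAV, which the listing does not state, and the function names `check_wav` and `duration_seconds` are hypothetical. Bit depth is not given in the listing, so it is not checked.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # "Sample Rate (kHz): 16" in the listing
EXPECTED_CHANNELS = 1      # "Channels: 1" in the listing


def check_wav(path):
    """Return a list of mismatches between a WAV header and the listed specs.

    Assumes an uncompressed PCM WAV file (an assumption, not stated in the listing).
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    return problems


def duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

An empty list from `check_wav` means the header matches the listed sample rate and channel count. Summing `duration_seconds` over all files gives a total to compare against the listed 322 hours; with 104,574 utterances, that works out to roughly 11 seconds per utterance on average.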