Off-the-shelf (OTS) Datasets

Vietnamese (Vietnam) microphone

Dataset ID:
VIE_ASR001
Dataset Name:
Vietnamese (Vietnam) microphone
Common Use Cases:
ASR, Virtual Assistant, Chatbot
Language:
Vietnamese
Country:
Vietnam
Language Code:
vie
Country Code:
VNM
Product Type:
Audio
Detailed Product Type:
Scripted Speech
Unit:
19 hours
Recording Device:
Microphone
Recording Condition:
Mixed (quiet home/office, public, outdoor)
Contributors:
129
Utterances:
18,842
Unique Words:
Available on request
Sample Rate (kHz):
16
Channels:
1
Data Format:
wav
Source:
GlobalPhone
Additional Info:
  • Part of a multilingual corpus; tiered package pricing is available when purchasing multiple GlobalPhone languages or the full corpus
  • The dataset is fully transcribed; transcriptions are available in both the original script and Romanized form
  • Each speaker reads a set of phonetically rich sentences selected from national newspaper articles published on the web, chosen to cover a wide range of domains and a large vocabulary
  • Developed in collaboration with the Karlsruhe Institute of Technology (KIT)
Year of Collection:
2009
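
The audio specification listed above (16 kHz sample rate, 1 channel, wav format) can be spot-checked on delivered files. The following is a minimal sketch, not an official Appen or GlobalPhone tool: the function name `check_wav` and the constants are hypothetical, and it assumes the files are PCM-encoded WAVs that Python's standard-library `wave` module can read. Bit depth is not stated in the listing, so it is not checked.

```python
import wave

# Values taken from the listing: 16 kHz sample rate, 1 (mono) channel.
EXPECTED_SAMPLE_RATE_HZ = 16_000
EXPECTED_CHANNELS = 1


def check_wav(path):
    """Return a list of mismatches between a WAV header and the listed spec.

    An empty list means the header matches. `check_wav` is a hypothetical
    helper written for illustration only.
    """
    problems = []
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
    if rate != EXPECTED_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE_HZ} Hz"
        )
    if channels != EXPECTED_CHANNELS:
        problems.append(
            f"channel count is {channels}, expected {EXPECTED_CHANNELS}"
        )
    return problems
```

Calling `check_wav("some_utterance.wav")` on each file (the file name here is a placeholder) and collecting any non-empty results gives a quick report of files that would need resampling or downmixing before training.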

Get Started with Off-the-Shelf AI Training Datasets

Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, providing comprehensive coverage for various AI applications. These datasets are crafted to the highest standards of quality and accuracy, ensuring reliable training data for AI models.
