Wuhan dialect (China) Conversational Speech

Name: Wuhan dialect (China) Conversational Speech
SKU: d96409bf8942
Availability: InStock


Dataset ID:

WUHAN_ASR002_CN

Dataset Name:

Wuhan dialect (China) Conversational Speech

Common Use Cases:

ASR, Conversational AI, Speech Analytics

Language:

Wuhan dialect

Country:

China

Language Code:

cmn

Country Code:

CHN

Product Type

Audio

Detailed Product Type

Conversational Speech

Unit

58.6 hours

Recording Device

Mobile phone

Recording Condition

Low background noise

Contributors

180

Utterances

Unique Words

Sample Rate (kHz):

Channels

Data Format

wav

Source

Appen China

Additional Info:

Audio only; transcription in development for Q1 2025
Audio recordings cover 5 districts of Wuhan: Jiang'an, Jianghan, Qiaokou, Hanyang, and Wuchang.
Accents from the northeastern suburbs are not included, and no minors were recorded.
Each recording session contains 20 to 30 minutes of free dialogue among 2 to 5 people.
Sensitive data and personal information have been scrubbed.

Year of Collection

2020
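
The listing gives the total audio volume (58.6 hours) and the session length (20 to 30 minutes) but not the number of sessions. As a rough sketch, the session count can be bounded from those two figures. This assumes that the 58.6 hours is the summed duration of all sessions and that every session falls within the stated 20 to 30 minute range; neither is confirmed by the listing, and the function name is purely illustrative.

```python
import math


def session_count_range(total_hours, min_minutes, max_minutes):
    """Bound the number of sessions given total audio and per-session length.

    Assumption: total_hours is the sum of all session durations and each
    session lasts between min_minutes and max_minutes.
    """
    total_minutes = total_hours * 60
    # Fewest sessions: every session runs the maximum length.
    fewest = math.ceil(total_minutes / max_minutes)
    # Most sessions: every session runs the minimum length.
    most = math.floor(total_minutes / min_minutes)
    return fewest, most


fewest, most = session_count_range(58.6, 20, 30)
print(f"Estimated sessions: between {fewest} and {most}")
```

Under these assumptions the dataset would hold somewhere between 118 and 175 sessions; the actual count should be confirmed with the provider.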

