Off-the-Shelf AI Training Datasets

English (United States) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: eng_USA_PHON

Type: Text

Unit: 358,000 words

Language: English

Country: United States

English Inverse Text Normalisation

Common Use Cases: ASR, Language Modelling, Closed Captioning

Dataset ID: ENG_ITN001

Type: Text

Unit: 4,454 test cases

Language: English

Country: N/A

Finnish (Finland) Part of Speech Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: fin_FIN_POS

Type: Text

Unit: 10,000 words

Language: Finnish

Country: Finland

Finnish (Finland) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: fin_FIN_PHON

Type: Text

Unit: 86,000 words

Language: Finnish

Country: Finland

French (Algeria) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: fra_DZA_PHON

Type: Text

Unit: 4,000 words

Language: French

Country: Algeria

French (Canada) Pronunciation Dictionary

Common Use Cases: ASR, TTS, Language Modelling

Dataset ID: fra_CAN_PHON

Type: Text

Unit: 67,000 words

Language: French

Country: Canada
