MediaInterface Expands to France With Pre-Labeled Datasets

We were expanding to a new market. Although we had a fully localized software, we were lacking resources, so our clients could not optimally use it. Appen helped us out with French lexicon data.

– Ines Wendler, Product Manager, MediaInterface

The Company

For over 20 years, MediaInterface has delivered language technology solutions, primarily to healthcare-related institutions in Germany and other parts of Europe, including Austria and Switzerland. Their core product, SpeaKING, uses speech recognition artificial intelligence (AI) to support medical documentation, resulting in faster, higher-quality documentation workflows. The product has more than 75,000 users across 600 hospitals and 700 medical practices.

The Challenge

With many years of success in several European countries, MediaInterface was looking to expand to France next. There, they noticed a growing demand for user-friendly solutions to support the various workflows involved in medical documentation.

The problem with expanding to a new market, however, was that the 15-plus years of data that MediaInterface had collected didn’t apply to a different language. What they required was a French background lexicon with high-quality phonetic transcriptions, which would help them build out a comprehensive vocabulary base.

Within that lexicon, the biggest data gap was French person and place names, which are often referenced in patient health information. This data would be the most difficult to acquire: under the European General Data Protection Regulation, much of the health data MediaInterface could collect had to be anonymized, so person and place names weren’t included. MediaInterface had to look to an outside source to fill these significant gaps while meeting data regulations and requirements.

The Solution

MediaInterface discovered us in 2015 at INTERSPEECH, a speech processing conference, and stayed in contact as its needs evolved in 2019. Our pre-labeled datasets fit both the budget and the requirements of MediaInterface’s France expansion project, and a partnership was born. The company used our datasets to acquire approximately 21,000 French names and 14,000 place names, helping fill the most critical data gaps. MediaInterface continues to use this data to develop their background lexicon.

The Result

We had to buy critical data from Appen, and that data has been incorporated into our background lexicon. This helps us build out new vocabularies for our clients.

– Ines Wendler, Product Manager, MediaInterface

Our ability to support French vocabulary with our pre-labeled datasets helped MediaInterface develop the language-specific parts of their product, expand into an entirely new market, and highlight the possibilities for future markets. MediaInterface now offers high credibility to French clients because it fully covers the essential dictation and speech recognition needs of healthcare institutions. The background lexicon also gives French clients room to customize: using SmartLearning, a feature of MediaInterface’s speech recognition solution SpeaKING, together with the background lexicon, they can add their own texts to personalize existing vocabularies. This additional data feeds the underlying AI model and improves speech recognition results.

Our pre-labeled datasets help our clients achieve faster deployments with improved product accuracy. In the case of MediaInterface, our datasets equipped them with the tools to confidently broaden their customer base while improving quality and client experience.