Trusting the Speechace API

In a world where AI is everywhere, businesses and end-users should rightly be concerned about where to place their trust. Often, providers don't know what is behind the models serving their use cases, or where data they send to these models might end up.

We understand these concerns and believe trust should be earned. Speechace is a 100% B2B provider that has earned the trust of Fortune 500 companies and the world's top education publishers, institutes, and language learning providers. Speechace is designed and architected from the ground up for the safe, protected, ethical, and compliant use of your data.

Let us share why you should place your trust in Speechace.

We collect and use our own first-party data for training

When it comes to training our models, we directly source and collect our own data. First, this allows us to keep the promise that Speechace API customer data belongs to the customer and only the customer. Second, it gives us first-hand control over, and visibility into, the demographics, sources, scale, and nature of our training data. That visibility is fundamental to planning models and validating them against demographic or ethnic bias.

You own and control the retention of your data

It follows from the above that Speechace has no need to hang on to your data. Every API customer can choose their own retention policy, including zero-day retention. With zero-day retention, we set your audio to expire as soon as it has been scored. If you choose a longer retention period (e.g. 30 days), we retain the data only to serve support or analysis requests from you. We do not sell, share, or use your data for anything except serving you. All our data lifecycle events are audited, and we can unequivocally verify when your data has been erased.
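To make the retention choice concrete, here is a minimal sketch of how an expiry time could be derived from a retention period. It is an illustration only: the names `expiry_time` and `is_expired` and the `retention_days` parameter are hypothetical and are not part of the Speechace API, and the sketch says nothing about how Speechace implements expiry internally.

```python
from datetime import datetime, timedelta, timezone


def expiry_time(scored_at: datetime, retention_days: int) -> datetime:
    """Return the moment a recording becomes eligible for erasure.

    Hypothetical helper: retention_days=0 models zero-day retention,
    where the audio expires as soon as it has been scored.
    """
    if retention_days < 0:
        raise ValueError("retention_days must be zero or positive")
    return scored_at + timedelta(days=retention_days)


def is_expired(scored_at: datetime, retention_days: int, now: datetime) -> bool:
    """True once the retention period has fully elapsed."""
    return now >= expiry_time(scored_at, retention_days)


# Example: audio scored at midnight UTC on an arbitrary date.
scored = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired(scored, 0, scored))                        # zero-day retention
print(is_expired(scored, 30, scored + timedelta(days=29)))  # still inside 30 days
```

Under these assumptions, a zero-day policy makes the expiry time equal to the scoring time, while a 30-day policy keeps the data available for support or analysis requests until day 30.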

Our data processing is highly available and fully automated

Although Speechace acts as a Data Sub-processor, our service operates as a highly available cloud service running in 6 worldwide Cloud Regions, with at least 3 Availability Zones in each region. This enables us to commit to 99.95% Service Availability, to maintain no processing facilities outside the cloud, and to keep personnel from any direct involvement in data processing.
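For a sense of scale, the short calculation below converts a 99.95% availability figure into a downtime budget. The measurement window is an assumption (a 30-day month and a 365-day year), since the period over which availability is measured is not stated here, and the function name is illustrative.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Maximum downtime, in minutes, consistent with an availability percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Assumed measurement windows: a 30-day month and a 365-day year.
monthly = allowed_downtime_minutes(99.95, 30)
yearly = allowed_downtime_minutes(99.95, 365)

print(f"30-day month: {monthly:.1f} minutes")
print(f"365-day year: {yearly / 60:.2f} hours")
```

Under those assumptions, 99.95% corresponds to roughly 21.6 minutes of downtime in a 30-day month, or about 4.38 hours over a year.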

Our AI models are rigorously and independently validated

Speechace maintains a home-grown state-of-the-art human data labeling, validation, and evaluation system. Datasets are regularly sampled and blindly rated by multiple independent professional raters. Our platform is designed to eliminate rater bias, evaluate the raters themselves first, and achieve high confidence in the evaluation datasets which drive ultimate model decisions. We maintain the largest and most diverse known non-native speaker language proficiency evaluation datasets.

Speechace API Technical Report

With our close partners and customers, and under NDA, we share a detailed technical report with metrics and KPIs that show how each of our models performs, along with the characteristics of the datasets used to train them. Please reach out to us at [email protected] if you would like to gain access to our API Technical Report.
