Handling fluency response
Pronunciation results for the spoken word or sentence are interpreted the same way as in the Score Text/Pronunciation function. To interpret fluency quality, refer to the key elements below:
Overall Fluency Scores
These scores assist test creators in evaluating the overall fluency of spoken responses, offering insights into the quality of the test-taker's speech. Below is an example of how the fluency score is presented. For detailed interpretation, please refer to the overall score guide, which includes scales from systems such as IELTS, PTE, and Speechace.
Fluency Metrics
The API returns the following feedback metrics under the fluency node:
duration: Total length of speech in seconds.
articulation: Total length of articulation (speech minus pauses, hesitations, and non-speech events such as laughter). Excludes beginning silence on the very first segment and ending silence on the very last segment.
speech_rate: Speaking rate in syllables per second.
syllable_count: Count of syllables in this segment.
word_count: Count of words in this segment.
correct_syllable_count: Count of correctly spoken syllables in this segment.
correct_word_count: Count of correctly spoken words in this segment.
syllable_correct_per_minute: correct_syllable_count divided by duration in minutes.
word_correct_per_minute: correct_word_count divided by duration in minutes.
all_pause_count: Count of all pauses (filled and unfilled) that are longer than the minimum pause threshold.
all_pause_duration: Total duration of all pauses (filled and unfilled) in seconds.
all_pause_list[]: A list of all pauses, with begin/end markers for each in units of 10 milliseconds.
mean_length_run: Mean length of run, in syllables, between pauses.
max_length_run: Maximum length of run, in syllables, between pauses.
segment_metrics_list[]: A list of segments within the overall text/audio, with the fluency metrics for each segment.
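The two per-minute rates follow directly from the counts and the duration. The sketch below is a minimal illustration in Python, assuming the fluency node has already been parsed into a dictionary whose keys match the metric names listed above. The helper name per_minute and the sample values are hypothetical, and the exact nesting of the real response is not shown here.

```python
def per_minute(count: float, duration_seconds: float) -> float:
    """Return count divided by the duration expressed in minutes."""
    if duration_seconds <= 0:
        return 0.0  # avoid dividing by zero on empty audio
    return count / (duration_seconds / 60.0)


# Hypothetical metrics: the values are invented, the keys follow the list above.
fluency_metrics = {
    "duration": 30.0,               # seconds
    "correct_syllable_count": 90,
    "correct_word_count": 60,
}

syllable_correct_per_minute = per_minute(
    fluency_metrics["correct_syllable_count"], fluency_metrics["duration"]
)
word_correct_per_minute = per_minute(
    fluency_metrics["correct_word_count"], fluency_metrics["duration"]
)
print(syllable_correct_per_minute, word_correct_per_minute)
```

In practice the API already returns both rates, so a calculation like this is mainly useful as a sanity check or for recomputing rates over a custom span.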
The following are the most commonly used metrics to provide feedback to the user:
word_correct_per_minute: This measures the count of correctly spoken words per minute. You can color-code the test-taker's rate and compare it to the standard rate of 120 words per minute, which is widely considered the minimum fluent speaking rate.
all_pause_list[]: This is a list of all pauses, with each pause marked by begin and end times at a resolution of 10 milliseconds. Use the length and position of each entry to identify and display medium pauses (≥500 milliseconds) and long pauses (>1 second).
duration and articulation: Display the duration and articulation length to show how much time the user spent speaking compared to pausing or using fillers.
a. duration: The total length of the speech in seconds, including all pauses, fillers, and non-speech events.
b. articulation: The total length of actual articulation, calculated as the total speech duration minus pauses, hesitations, and non-speech events (such as laughter), and excluding any silence at the very beginning and very end of the speech.
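The three feedback ideas above can be sketched as small helpers. This is an illustration under stated assumptions, not the API's own logic: each all_pause_list[] entry is assumed to be a begin/end pair in units of 10 milliseconds, a medium pause is taken to mean 500 milliseconds up to and including 1 second, and the yellow band at 75% of the reference rate is an arbitrary choice. All function and constant names are hypothetical.

```python
FLUENT_WCPM = 120.0       # reference rate from the text above
MEDIUM_PAUSE_UNITS = 50   # 500 ms, assuming markers are in 10 ms units
LONG_PAUSE_UNITS = 100    # 1 second, in the same 10 ms units


def rate_color(word_correct_per_minute: float) -> str:
    """Map a rate to a color; only the 120 threshold comes from the text."""
    if word_correct_per_minute >= FLUENT_WCPM:
        return "green"
    if word_correct_per_minute >= 0.75 * FLUENT_WCPM:  # assumed band
        return "yellow"
    return "red"


def classify_pauses(all_pause_list):
    """Label each (begin, end) pause as 'medium' or 'long'; skip shorter ones."""
    labeled = []
    for begin, end in all_pause_list:
        length = end - begin  # still in 10 ms units
        if length > LONG_PAUSE_UNITS:
            labeled.append((begin, end, "long"))
        elif length >= MEDIUM_PAUSE_UNITS:
            labeled.append((begin, end, "medium"))
    return labeled


def speaking_share(duration: float, articulation: float) -> float:
    """Fraction of the total duration spent articulating."""
    if duration <= 0:
        return 0.0
    return articulation / duration


# Invented sample values for illustration only.
pauses = [(120, 150), (400, 460), (900, 1050)]
print(rate_color(95.0))
print(classify_pauses(pauses))
print(round(speaking_share(30.0, 24.0), 2))
```

Comparing pause lengths in integer 10 millisecond units rather than converting to seconds first avoids floating-point rounding at the 500 millisecond and 1 second boundaries.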