
Describe Image


Last updated 6 months ago

Use the Score Task/Task Achievement function to score describe image style questions.

To score describe image style questions, developers pass a description of the image along with the user's audio response. The function then automatically assesses how closely the spoken response matches the image description.
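As a rough sketch of what such a call might carry, the Python snippet below assembles the two inputs (image description and audio file) into a request description without sending anything. The endpoint URL is a placeholder, and the `task_type` value and the `task_context` field name are assumptions made for illustration; confirm the real names against the Score Task/Task Achievement reference before use.

```python
from dataclasses import dataclass
from pathlib import Path

# ASSUMPTION: placeholder endpoint, not the real Speechace URL.
ASSUMED_ENDPOINT = "https://api.example.com/scoring/task"


@dataclass
class DescribeImageRequest:
    """Everything needed to post one describe-image scoring request."""
    url: str
    params: dict      # query-string parameters
    data: dict        # form fields sent alongside the audio
    audio_path: Path  # the user's recorded response


def build_describe_image_request(api_key, image_description, audio_path):
    """Bundle the image description and audio path into a request object.

    The parameter and field names below are hypothetical stand-ins.
    """
    description = image_description.strip()
    if not description:
        raise ValueError("image_description must not be empty")
    return DescribeImageRequest(
        url=ASSUMED_ENDPOINT,
        params={"key": api_key, "task_type": "describe-image"},  # assumed names
        data={"task_context": description},                      # assumed name
        audio_path=Path(audio_path),
    )
```

The resulting object could then be handed to any HTTP client as a multipart form post with the audio file attached.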

The image description provided to the API is used as guidance, not for pattern matching. The task achievement AI looks for semantic similarity between the user's response and the image description, so a user can get a high score even when they speak in their own words and their answer does not exactly resemble the image description string.

The example below illustrates scoring a user's response to a prompt that asks the user to describe an image:

Image description: “The image shows a busy street in a city. There are people walking on the sidewalk, and some are waiting at a bus stop. In the background, I can see tall buildings, and there’s a bright blue sky. A few cars are driving by, and it looks like a sunny day.”

User's Response:

Where task is achieved:

"The image depicts a bustling city street filled with activity. People are walking briskly along the sidewalks, some chatting with friends while others are engrossed in their phones. At a nearby bus stop, a small crowd is waiting, checking schedules and exchanging quick conversations.

In the background, tall skyscrapers rise against a clear blue sky, their glass facades reflecting the sunlight. Cars and buses navigate the road, creating a dynamic scene of urban life. Street vendors are set up along the curb, offering snacks and drinks, adding to the vibrant atmosphere. Colorful banners and advertisements hang from buildings, contributing to the lively energy of the city. Overall, the image captures the essence of a busy urban environment, full of movement and life."

Where task is not achieved:

"Aliens have long captivated human imagination, inspiring countless stories, theories, and scientific inquiries. These extraterrestrial beings are often depicted as intelligent life forms from distant planets, sparking debates about the possibility of life beyond Earth. In popular culture, aliens are portrayed in various ways, from benevolent visitors seeking to share knowledge to malevolent invaders threatening humanity. The fascination with aliens extends to scientific exploration, with initiatives like the Search for Extraterrestrial Intelligence (SETI) actively seeking signals from other civilizations. As technology advances and our understanding of the universe deepens, the question remains: are we alone, or do other life forms exist in the vast cosmos?"


If the user gives a relevant response, like the first one above, developers can create a detailed report that shows the user a positive score.

If the user gives an irrelevant response, a zero score can be assigned based on the Score Task/Task Achievement result, and developers can create a detailed report to communicate that the task was not achieved. Note that the language scores can be non-zero even when the task achievement score is zero.
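The independence of language scores and task achievement can be sketched in a small report builder. The result shape used here (a `task_score` number plus a `language_scores` mapping) is hypothetical and only stands in for whatever the API actually returns, and treating any score above zero as "achieved" is an illustrative choice rather than documented behaviour.

```python
def build_report(result):
    """Summarize a (hypothetical) scoring result for display to the user."""
    task_score = result.get("task_score", 0)
    language_scores = result.get("language_scores", {})
    achieved = task_score > 0  # illustrative threshold, not from the API docs

    lines = [
        "Task achieved: " + ("yes" if achieved else "no"),
        f"Task achievement score: {task_score}",
    ]
    # List each language metric so the user still sees language feedback.
    for metric, score in sorted(language_scores.items()):
        lines.append(f"{metric}: {score}")

    # An off-topic answer can still be fluent, well-pronounced speech.
    if not achieved and any(score > 0 for score in language_scores.values()):
        lines.append(
            "Note: the language scores are non-zero, "
            "but the response did not address the task."
        )
    return {"achieved": achieved, "text": "\n".join(lines)}


# Made-up sample values for an off-topic but fluent response.
off_topic = {"task_score": 0, "language_scores": {"fluency": 80, "grammar": 75}}
report = build_report(off_topic)
print(report["text"])
```

Keeping the task verdict separate from the language metrics lets the report tell the user that the speech itself was fine while the content missed the prompt.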

Such functionality can be built with the Score Task/Task Achievement function of the Speechace API.

API
Score Task/Task Achievement