Markup Language
When scoring text, you can include markup in the input to explicitly define the phonetic mapping for one or more words, bypassing the Speechace lexicon. This feature is supported for both en-US and en-GB requests and can be used in scoring, multiple-choice, or Validate Text requests.
Markup Syntax
Each markup has the pattern: [l1|l2|...|ln]{s1|s2|...|sn}, where:
l1, l2, ..., ln are substrings of a word in the input text, and s1, s2, ..., sn are syllables of the word corresponding to l1, l2, ..., ln.
Each syllable, si, has this pattern: p1 p2 ... pk, where pi is a phoneme in ARPABET notation for the current dialect (i.e. en-us or en-gb).
If pi is a vowel phoneme, pi ends with 0, 1 or 2 to denote the stress level of the syllable. There should be at most one vowel per syllable.
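The syllable rules above can be checked locally before sending a request. The following is a minimal sketch, not part of the Speechace API: the function name check_syllable is hypothetical, and the check assumes that a phoneme is a run of letters optionally followed by a stress digit. It does not verify that each symbol belongs to the phoneme set of the selected dialect.

```python
import re

# Assumption: a phoneme is letters only, optionally followed by a stress digit (0, 1 or 2).
PHONEME_RE = re.compile(r"[A-Za-z]+[012]?")


def check_syllable(syllable):
    """Return a list of problems found in one syllable (empty list if none)."""
    problems = []
    phonemes = syllable.split()
    if not phonemes:
        problems.append("syllable is empty")
    for p in phonemes:
        if not PHONEME_RE.fullmatch(p):
            problems.append(f"malformed phoneme: {p!r}")
    # Per the rules above, vowels are the phonemes that carry a stress digit.
    vowels = [p for p in phonemes if p[-1] in "012"]
    if len(vowels) > 1:
        problems.append(f"more than one vowel: {vowels}")
    return problems


print(check_syllable("f r ae1 g"))    # no problems expected
print(check_syllable("b ay0 l ih1"))  # two vowels in one syllable
```

A check like this only catches formatting slips; the API remains the authority on whether a markup is accepted.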
You can now pass the parameter markup_language=arpa_mark to the API and include markup for words within the text as follows:
He [read]{r eh1 d} his [frag|ments]{f r ae1 g|m ah0 n t s} aloud.
In the text above, the words "read" and "fragments" are marked up to disambiguate them from heteronyms.
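To see how the markup sits inside ordinary text, here is a small sketch that pulls the marked-up words out of a sentence and recovers the plain text. The helper names extract_markups and strip_markup are hypothetical, and the regular expression is an assumption based only on the pattern described above. It handles the basic form without letter-mapping information.

```python
import re

# Assumption: a markup is "[...]" immediately followed by "{...}", with "|" as separator.
MARKUP_RE = re.compile(r"\[([^\]]*)\]\{([^}]*)\}")


def extract_markups(text):
    """Return (word, letter_parts, syllables) for each markup found in text."""
    found = []
    for m in MARKUP_RE.finditer(text):
        parts = m.group(1).split("|")
        syllables = [s.strip() for s in m.group(2).split("|")]
        if len(parts) != len(syllables):
            raise ValueError(f"part/syllable count mismatch in {m.group(0)!r}")
        found.append(("".join(parts), parts, syllables))
    return found


def strip_markup(text):
    """Replace each markup with the plain word it annotates."""
    return MARKUP_RE.sub(lambda m: m.group(1).replace("|", ""), text)


text = "He [read]{r eh1 d} his [frag|ments]{f r ae1 g|m ah0 n t s} aloud."
for word, parts, syllables in extract_markups(text):
    print(word, parts, syllables)
print(strip_markup(text))  # He read his fragments aloud.
```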
Expressing letters in markup
Optionally, markup can specify letter-to-phoneme mapping. This allows the Speechace API to return phoneme-to-letter mapping in the results for the marked-up words.
To add letter information in markup:
Prefix the first syllable: Add 2| before s1 (the first syllable) as follows: [l1|l2|...|ln]{2|s1|s2|...|sn}. This indicates that the markup contains phoneme-to-letter mapping information.
Mapping Syntax: Use zero or more pairs of angle brackets < and > to group substrings in li (letters) and si (phonemes) for establishing one-to-one correspondence when necessary.
li consists of a sequence of characters c1c2...cp
si consists of a sequence of phonemes p1 p2 ... pk
Simple Mapping: If there is a one-to-one mapping between letters and phonemes, no additional markup is needed.
Inference of Mapping: If there are no pairs of < and > in li and si, and if p (the number of characters) equals k (the number of phonemes), we infer:
c1 is mapped to p1
c2 is mapped to p2
...
cp is mapped to pk
Complex Cases: For cases where character-to-phoneme mapping is more complex than one-to-one:
Use < and > to group adjacent characters and/or phonemes, so that m characters can correspond to n phonemes (with m >= 0 and n >= 0, but not both zero).
Empty brackets <> stand for zero characters or zero phonemes.
This transforms li and si into sequences of the same length to facilitate one-to-one mapping.
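The grouping rules can be illustrated with a short sketch that turns each letter part and syllable into units of equal count and pairs them up. This is an illustration under stated assumptions, not the Speechace implementation: the function names letter_tokens, phoneme_tokens and align are hypothetical, and the parsing follows only the rules described above.

```python
import re

MARKUP_RE = re.compile(r"\[([^\]]*)\]\{([^}]*)\}")
# A phoneme unit is either a <...> group (possibly empty) or a single bare phoneme.
PHONEME_UNIT_RE = re.compile(r"<([^<>]*)>|([^\s<>]+)")


def letter_tokens(part):
    """Split a letter part into units: single characters or <...> groups."""
    tokens, i = [], 0
    while i < len(part):
        if part[i] == "<":
            end = part.index(">", i)  # raises ValueError if brackets are unbalanced
            tokens.append(part[i + 1:end])
            i = end + 1
        else:
            tokens.append(part[i])
            i += 1
    return tokens


def phoneme_tokens(syllable):
    """Split a syllable into units: one tuple of phonemes per unit."""
    tokens = []
    for m in PHONEME_UNIT_RE.finditer(syllable):
        if m.group(2) is not None:
            tokens.append((m.group(2),))
        else:
            tokens.append(tuple(m.group(1).split()))
    return tokens


def align(markup):
    """Pair letter units with phoneme units for a markup that starts with 2|."""
    m = MARKUP_RE.fullmatch(markup)
    if m is None:
        raise ValueError("expected exactly one markup")
    parts = m.group(1).split("|")
    syllables = [s.strip() for s in m.group(2).split("|")]
    if syllables[0] != "2":
        raise ValueError("markup does not start with the 2| prefix")
    syllables = syllables[1:]
    if len(parts) != len(syllables):
        raise ValueError("number of letter parts and syllables differ")
    result = []
    for part, syllable in zip(parts, syllables):
        letters, phones = letter_tokens(part), phoneme_tokens(syllable)
        if len(letters) != len(phones):
            raise ValueError(f"cannot align {part!r} with {syllable!r}")
        result.append(list(zip(letters, phones)))
    return result


print(align("[box]{2|b aa1 <k s>}"))
print(align("[M<>|<><r.>]{2|m <ih1 s>|t er0}"))
```

In the output, an empty string on the letter side or an empty tuple on the phoneme side corresponds to an empty <> in the markup.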
The notation can be quite complex, so the following examples illustrate each kind of mapping. You can copy and modify an example instead of constructing the notation from scratch.
Markup examples with letters
Example 1:
Markup: [car]{2|k aa1 r}
Explanation:
This example features one syllable without the need for < and >. Here, l1 is "car" and s1 is "k aa1 r". We can infer that:
c1 maps to p1, c2 maps to p2, c3 maps to p3
Thus, we have:
c1 = c, c2 = a, c3 = r
p1 = k, p2 = aa1, p3 = r
Example 2:
Markup: [bi|li<ng>|<>ual]{2|b ay0|l ih1 ng|g w ah0 l}
Explanation:
In this case, zero characters map to one phoneme: l3 is <>ual and s3 is g w ah0 l. The <> allows us to infer that the phoneme g is mapped to no character. In l2, the group <ng> also maps two characters to the single phoneme ng. Thus, for the third syllable:
c1 = (empty), c2 = u, c3 = a, c4 = l
p1 = g, p2 = w, p3 = ah0, p4 = l
Example 3:
Markup: [M<>|<><r.>]{2|m <ih1 s>|t er0}
Explanation:
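This example shows zero characters mapping to two phonemes. Here, l1 is M<> and s1 is m <ih1 s>. The <> and <ih1 s> allow us to infer that the phonemes ih1 s are mapped to no character. In the second syllable, the phoneme t is likewise mapped to no character, and the characters r. are mapped to er0. Thus, for the first syllable:
c1 = M, c2 = (empty)
p1 = m, p2 = ih1 s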
Example 4:
Markup: [ear]{2|iy1 <> r}
Explanation:
In this instance, we have one character mapping to zero phonemes. Here, l1 is "ear" and s1 is "iy1 <> r". The <> indicates that the character a is mapped to no phoneme. Thus:
c1 = e, c2 = a, c3 = r
p1 = iy1, p2 = (empty), p3 = r
Example 5:
Markup: [box]{2|b aa1 <k s>}
Explanation:
This example features one character mapping to two phonemes. Here, l1 is "box" and s1 is "b aa1 <k s>". The <k s> allows us to infer that the character x is mapped to the phonemes k s. Thus:
c1 = b, c2 = o, c3 = x
p1 = b, p2 = aa1, p3 = k s
Example 6:
Markup: [si<gh>]{2|s ay1 <>}
Explanation:
In this case, we have two characters mapping to zero phonemes. Here, l1 is si<gh> and s1 is s ay1 <>. The <gh> and <> allow us to infer that the characters gh are mapped to no phoneme. Thus:
c1 = s, c2 = i, c3 = gh
p1 = s, p2 = ay1, p3 = (empty)
Example 7:
Markup: [no<th>|i<ng>]{2|n ah1 th|ih0 ng}
Explanation:
This example features two characters mapping to one phoneme. Here, l1 is no<th> and s1 is n ah1 th. The <th> allows us to infer that the characters th are mapped to the phoneme th. Thus:
c1 = n, c2 = o, c3 = th
p1 = n, p2 = ah1, p3 = th
Example 8:
Markup: [qu<eue>]{2|k <y uw1> <>}
Explanation:
In this instance, we have three characters mapped to zero phonemes. Here, l1 is qu<eue> and s1 is k <y uw1> <>. The <eue> and <> allow us to infer that the characters eue are mapped to no phoneme, while <y uw1> maps the character u to the two phonemes y uw1. Thus:
c1 = q, c2 = u, c3 = eue
p1 = k, p2 = y uw1, p3 = (empty)
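As an alternative to editing an example by hand, the markup string can be generated from an explicit list of letter-to-phoneme pairs. The sketch below is a hypothetical helper (build_markup is not part of the Speechace API). It assumes the bracket conventions shown in the examples above: a unit is written bare when it is exactly one character or one phoneme, and wrapped in < and > otherwise.

```python
def build_markup(syllables):
    """Build a letter-mapped markup string.

    syllables: one list per syllable, each holding (letters, phonemes) pairs,
    where letters is a string (possibly empty) and phonemes is a list of
    phoneme strings (possibly empty).
    """
    letter_parts, phoneme_parts = [], []
    for units in syllables:
        # A single character is written as-is; anything else is bracketed.
        letters = "".join(l if len(l) == 1 else f"<{l}>" for l, _ in units)
        # A single phoneme is written as-is; zero or several are bracketed.
        phones = " ".join(
            p[0] if len(p) == 1 else "<" + " ".join(p) + ">" for _, p in units
        )
        letter_parts.append(letters)
        phoneme_parts.append(phones)
    return "[" + "|".join(letter_parts) + "]{2|" + "|".join(phoneme_parts) + "}"


# Rebuilds Example 5 and Example 8 from their mappings.
print(build_markup([[("b", ["b"]), ("o", ["aa1"]), ("x", ["k", "s"])]]))
print(build_markup([[("q", ["k"]), ("u", ["y", "uw1"]), ("eue", [])]]))
```

Writing the pairs out explicitly makes the intended mapping easy to review before the markup is sent.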