Parsing Language With Scanners

The Scanner provides an interface for quickly performing flexible natural language matching and extraction on a conversation.

The types of questions you might want answered about a conversation that can be addressed with a scan include:

  • Was a word or phrase (or a similar equivalent) spoken?
  • Did a caller follow a script or adhere to policies?
  • What were the names, phone numbers, and products mentioned in a call?
  • If one party on a call brings up a subject, does the other party respond appropriately?

You can provide one or several queries to the Scanner and it will return the following:

  • Matching excerpts from the conversation.
  • Any extracted words or phrases.

Scanner Query Language

The scanner query language is used to define a query for a scan on a conversation. Key components of the language are as follows:

  1. A specific phrase (a literal) is searched for in single quotes.
  2. To search for approximate matches, add one to three tilde operators to the left of a literal (quoted phrase). The appropriate degree of fuzziness depends on the problem and may require experimenting for a given use case. Generally, one tilde allows simple word substitutions that maintain meaning (synonyms), three tildes allow complete rephrasing, and two tildes fall somewhere in between.
  3. Multiple literals can be chained together with “and”, “or”, and “then” to look for whether a more complex rule was matched in a conversation. Parentheses can also be added.
  4. Curly braces are added to extract specific parts of a phrase like a name or a date.
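
The components above can be assembled programmatically. The following is a minimal sketch, not part of the Scanner API: `literal` and `combine` are hypothetical helper names that only build query strings in the syntax described here. Rejecting phrases that contain a single quote is an assumption, since this guide does not describe how quotes inside a literal are escaped.

```python
def literal(phrase, fuzziness=0):
    """Quote a phrase and prefix it with zero to three tildes."""
    if not 0 <= fuzziness <= 3:
        raise ValueError("fuzziness must be between 0 and 3")
    # Assumption: no escaping rule is documented, so quotes are rejected.
    if "'" in phrase:
        raise ValueError("phrase must not contain a single quote")
    return "~" * fuzziness + "'" + phrase + "'"


def combine(operator, *parts):
    """Join two or more sub-queries with 'and', 'or', or 'then'."""
    if operator not in ("and", "or", "then"):
        raise ValueError("operator must be 'and', 'or', or 'then'")
    if len(parts) < 2:
        raise ValueError("need at least two parts to combine")
    # Parentheses keep nested combinations unambiguous.
    return "(" + (" " + operator + " ").join(parts) + ")"


query = combine("or", literal("good morning"), literal("good evening", 1))
print(query)  # ('good morning' or ~'good evening')
```

Building queries this way keeps the tilde count and quoting consistent when many rules are generated from a list of phrases.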

To use the Scanner, first decide on the query you want to perform, then provide that query as an argument to the scanner processor, as shown in the example under Performing Scans.

The scanner performs fuzzy matching by sliding a query across the transcript of a conversation. At each position, words in the query are paired with words in the transcript, and the query is scored by its similarity to that segment of the conversation.

For example,

'i | wrote |     a | book | on | cooking'
'. | book  | about | food |  . |       .'

Produces a poor match, whereas,

'i | wrote | a | book |    on | cooking'
'. | .     | . | book | about |    food'

Matches well, because ‘cooking’ and ‘food’ have related meanings and ‘on’ and ‘about’ are the same part of speech. Adding tildes also makes the scanner more tolerant of word reordering.
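
To make the sliding comparison concrete, here is a toy illustration of the idea. It is not the Scanner's actual algorithm: the `RELATED` table, its scores, and the function names are invented for this example, and a real system would use a learned similarity measure rather than a hand-written lookup.

```python
# Hypothetical relatedness scores for word pairs (illustrative values only).
RELATED = {
    frozenset(("on", "about")): 0.5,
    frozenset(("cooking", "food")): 0.8,
}


def word_score(a, b):
    """Return 1.0 for identical words, a table score for related words, else 0."""
    if a == b:
        return 1.0
    return RELATED.get(frozenset((a, b)), 0.0)


def best_alignment(query, transcript):
    """Slide the query across the transcript and return (offset, score) of the best window."""
    q_words = query.split()
    t_words = transcript.split()
    best_offset, best_score = None, 0.0
    for offset in range(len(t_words) - len(q_words) + 1):
        window = t_words[offset:offset + len(q_words)]
        # Average the pairwise word scores for this alignment.
        score = sum(word_score(a, b) for a, b in zip(q_words, window)) / len(q_words)
        if best_offset is None or score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


offset, score = best_alignment("book about food", "i wrote a book on cooking")
print(offset, round(score, 2))
```

With these made-up scores, the alignment that starts at ‘book’ wins, while alignments shifted to the left pair unrelated words and score zero, mirroring the poor and good matches shown above.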

Extraction tokens (like ‘{number}’) often extract an indeterminate number of words.


To look for the exact occurrence of a phrase, simply wrap it in quotes:

'cable outage'

You can also look for a boolean combination of phrases:

'good morning' or 'good evening'
'thank you for calling' and 'how can i help you'

or for time ordering:

'problem' then 'minutes'

Sometimes if you search for a longer phrase, you may not get a match if there’s a rephrasing:

'what time is it'

would not catch the speech segment ‘what is the time’. To allow for soft matching you can add one to three tildes.

One tilde for simple synonym substitutions:

~'bad connection'

matches ‘poor connection’.

Two tildes for thematic substitutions and light rephrasing:

~~'made of gold'

matches ‘made of silver’.

Three tildes to allow for complete rephrasing:

~~~'thank you for helping'

matches ‘that helped thanks’.

You can use single word wildcards with the asterisk:

~~~'my * is broken'

This matches general phrasing where something is broken and someone complains.


Scanner also allows the user to extract relevant information and entities from natural language using curly braces.

~~~'my name is {name}'

Will extract a name from such a sentence. There are several special extractors that capture general concepts including:

  • name
  • phoneNumber
  • number
  • date
  • time
  • zipCode
  • greeting
  • title
  • polite

You can also label your extractions to make the results easier to manage:

~~~'received first letter {date:firstLetter}' then 'second letter {date:secondLetter}'
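
When queries grow long, it can help to check which extractions and labels they contain before submitting them. The sketch below is a client-side convenience only: `list_extractions` is a hypothetical helper, and the regular expression assumes that extractor names and labels consist of letters, digits, and underscores, which this guide does not state explicitly.

```python
import re

# Matches {extractor} or {extractor:label}; the allowed characters are an assumption.
EXTRACTION_PATTERN = re.compile(r"\{(\w+)(?::(\w+))?\}")


def list_extractions(query):
    """Return (extractor, label) pairs found in a query; label is None if absent."""
    return [(m.group(1), m.group(2)) for m in EXTRACTION_PATTERN.finditer(query)]


query = "~~~'received first letter {date:firstLetter}' then 'second letter {date:secondLetter}'"
print(list_extractions(query))
# [('date', 'firstLetter'), ('date', 'secondLetter')]
```

A quick check like this makes it easier to spot duplicated or missing labels in a complex query.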

Performing Scans

Let’s say we want to extract the agent name when they introduce themselves. We build a single simple scanner query with the form:

~~'thanks for calling i am {name:agentName}' or ~~'hello this is {name:agentName}'

The code for performing a query is similar to classification, and the processor takes the query as an argument. Keep in mind, one query may be a complex boolean expression with multiple extractions.

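import requests

conv_url = 'http://[my api server]/v0/conversations'
callback_url = 'http://[my callback server]/doneprocessing'

# One query can combine several literals and extractions.
query = "~~'thanks for calling i am {name:agentName}' or ~~'hello this is {name:agentName}'"

conversations = {}

# audio_ids maps file names to previously uploaded audio ids;
# account_id and auth_token are your API credentials.
for file_name, audio_id in audio_ids.items():
    payload = {
        'audio_id': audio_id,
        'name': file_name,
        'processors': ['scan:' + query],
        'event_callback': callback_url
    }
    # Assumes the request is a POST to conv_url made with the requests library.
    response = requests.post(conv_url, auth=(account_id, auth_token), json=payload).json()
    conversations[file_name] = response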