Retrieve results
Results at the file/call level
Request
When a processing request has completed successfully, you can proceed to fetch the results. The results are provided as a JSON object and are described in more detail on this page. The endpoint to call is:
```shell
curl --request GET \
     --url https://api.behavioralsignals.com/clients/your-client-id/processes/pid/results \
     --header 'X-Auth-Token: your-api-token' \
     --header 'accept: application/json'
```
The GET results method requires the client ID (long: {cid}) and the process ID (long: {pid}) to be passed as path parameters. On invocation, it returns the result of the processing in JSON format. If the specified cid or pid is not found, or the status of the job is not yet completed, a corresponding error response is returned.
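The same call can be sketched in Python using only the standard library. The helpers below (`results_url`, `fetch_results`) are illustrative, not part of an official SDK; the ids and token are placeholders:

```python
import json
import urllib.request

BASE_URL = "https://api.behavioralsignals.com"

def results_url(cid, pid):
    """Build the GET results endpoint URL for a client/process pair."""
    return f"{BASE_URL}/clients/{cid}/processes/{pid}/results"

def fetch_results(cid, pid, token):
    """Fetch the JSON results of a completed process.

    An HTTPError is raised if cid/pid is unknown or the job is not
    yet completed, mirroring the API's error responses.
    """
    req = urllib.request.Request(
        results_url(cid, pid),
        headers={"X-Auth-Token": token, "accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (placeholders -- substitute real ids and a real token):
# data = fetch_results("your-client-id", 12345, "your-api-token")
```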
Response schema
The response is a JSON object with the following structure:
```json
{
  "pid": 0,
  "cid": 0,
  "code": 0,
  "message": "string",
  "results": [
    {
      "id": "1",
      "startTime": "0.209",
      "endTime": "7.681",
      "task": "<task>",
      "prediction": [
        {
          "label": "<label>",
          "posterior": "0.754",
          "dominantInSegments": [
            0
          ]
        },
        ...
      ],
      "finalLabel": "<label>",
      "level": "utterance",
      "embedding": "[11.614513397216797, -15.228992462158203, -4.92175817489624, ...]"
    }
  ]
}
```
`results` is an array, where each element corresponds to a prediction for a specific task and utterance/segment. The available tasks are the following:
- `diarization`: Contains the speaker label of the utterance, e.g. SPEAKER_00, SPEAKER_01, ... If the `embeddings` query param is defined, the speaker embeddings are also returned.
- `asr`: Contains the verbal content of the utterance.
- `gender`: The sex of the speaker.
- `age`: The age estimation of the speaker.
- `language`: The detected language.
- `emotion`: The detected emotion. Class labels: `happy`, `angry`, `neutral`, `sad`.
- `strength`: The detected arousal of speech. Class labels: `weak`, `neutral`, `strong`.
- `positivity`: The sentiment of speech. Class labels: `negative`, `neutral`, `positive`.
- `speaking_rate`: How fast or slow the speaker talks. Class labels: `slow`, `normal`, `fast`.
- `hesitation`: Whether there are signs of hesitation in speech. Class labels: `no`, `yes`.
- `politeness`: The politeness based on the tone of speech. Class labels: `rude`, `normal`, `polite`.
- `features`: This task is only present when the `embeddings` query param is defined. It contains the behavioral embeddings of the speaker.
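Since each element of `results` carries one task for one utterance/segment, a common first step is to group the final labels per utterance. A minimal sketch over a trimmed sample of the array (the gender label shown is hypothetical; the docs do not list gender class labels):

```python
from collections import defaultdict

# Trimmed sample entries from the "results" array of a response.
results = [
    {"id": "1", "task": "emotion", "finalLabel": "sad", "level": "utterance"},
    {"id": "1", "task": "gender", "finalLabel": "female", "level": "utterance"},
    {"id": "2", "task": "emotion", "finalLabel": "neutral", "level": "utterance"},
]

# Map each utterance id to its {task: finalLabel} pairs.
labels_by_utterance = defaultdict(dict)
for r in results:
    if r["level"] == "utterance":
        labels_by_utterance[r["id"]][r["task"]] = r["finalLabel"]

print(labels_by_utterance["1"])  # {'emotion': 'sad', 'gender': 'female'}
```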
The `id` of each result is used to indicate the utterance/segment id. The `startTime` and `endTime` indicate the start/end of the utterance/segment.
Each result has a `prediction` array. This includes the values of each class for the specific task. For example, in the case of the `emotion` task, an example `prediction` object would be:
"prediction": [
{
"label": "sad",
"posterior": "0.7969",
"dominantInSegments": [
0, 1, 2
]
},
{
"label": "neutral",
"posterior": "0.1931",
"dominantInSegments": [4]
},
{
"label": "happy",
"posterior": "0.007"
},
{
"label": "angry",
"posterior": "0.0029"
}
]
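Note that posteriors arrive as strings, so picking the dominant class out of a `prediction` array needs a float conversion. A minimal sketch over the example above:

```python
prediction = [
    {"label": "sad", "posterior": "0.7969", "dominantInSegments": [0, 1, 2]},
    {"label": "neutral", "posterior": "0.1931", "dominantInSegments": [4]},
    {"label": "happy", "posterior": "0.007"},
    {"label": "angry", "posterior": "0.0029"},
]

# The class with the highest posterior should match the result's finalLabel.
dominant = max(prediction, key=lambda p: float(p["posterior"]))
print(dominant["label"])  # sad
```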
The `posterior` indicates the probability of this class label being present in the utterance/segment. In the case of utterances, `dominantInSegments` indicates the segments in which the label was dominant. In our example, in the first three segments of the utterance the speaker was sad. The `finalLabel` in the result object indicates the dominant class label.
The `level` field indicates whether this result corresponds to a segment or an utterance. An utterance contains 1-N segments and usually corresponds to a speaker turn or sentence. The segment is the smallest unit of speech, corresponding to 2 seconds.
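Because utterance- and segment-level results are mixed in the same array, it can be convenient to partition them on the `level` field. A sketch (the segment ids shown are illustrative, not a documented format):

```python
results = [
    {"id": "1", "task": "emotion", "finalLabel": "sad", "level": "utterance"},
    {"id": "s0", "task": "emotion", "finalLabel": "sad", "level": "segment"},
    {"id": "s1", "task": "emotion", "finalLabel": "neutral", "level": "segment"},
]

# Split results into the two granularities described above.
utterances = [r for r in results if r["level"] == "utterance"]
segments = [r for r in results if r["level"] == "segment"]
```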
The `embedding` field contains the speaker or behavioral embedding. This field is empty for all tasks except two:

- `diarization`: here the `embedding` field corresponds to the speaker embedding
- `features`: the `embedding` field corresponds to the behavioral features

This field is present only when the `embeddings` query param is present.
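As the schema above shows, `embedding` is delivered as a JSON-encoded string rather than a native array, so it needs a second parse. A minimal sketch:

```python
import json

# The embedding value as it appears in the response: a string, not a list.
raw = "[11.614513397216797, -15.228992462158203, -4.92175817489624]"

vector = json.loads(raw)  # decodes to a list of floats
print(len(vector))
```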