Identify Service: OCR Extract API

This API service extracts text from an image of an ID and can optionally compare the name you have in your database against the name on the ID. Below is a sample of the request body you POST to the API endpoint.

Request Body: JSON

{
  "service": "identify-ocr-extract",
  "payload": {
    "image_link": "https://link-to-your-id-image.jpeg",
    "webhook_name": "<YOUR WEBHOOK NAME IDENTIFIER>",
    "source_id": "<UNIQUE ID FOR REQUEST>",
    "name_compare_data": {
      "first_name": {
          "value": "John",
          "include": true
      },
      "middle_name": {
          "value": "Wayne",
          "include": true
      },
      "last_name": {
          "value": "Doe",
          "include": true
      }
    }
  }
}

Request Body Definition

service - The name of the service you want to use.
payload - The information the service needs in order to process the request.
payload.image_link - The link from which the image of the ID can be downloaded. For security reasons, it is advised that the link remain valid for only 30 minutes.
payload.webhook_name - The name of the webhook endpoint where you would like to receive the results. This is the name you assigned when you added the webhook.
payload.source_id - A unique identifier that you generate for the request. It is returned in the webhook data so that you can identify which request the data belongs to.
payload.name_compare_data - The configuration for the name comparison. A value of null means you do not want any name comparison.
payload.name_compare_data.first_name - The configuration for the first name when name comparison is used. value is the first name you have in your data, and include indicates whether this part of the name takes part in the comparison. If include is false, this part of the name is ignored when comparing against the name extracted from the ID. For example, if you do not have the middle name in your data, set include to false for middle_name so that the middle name on the ID is excluded from the comparison, which results in a better score. middle_name and last_name work the same way.
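
The field definitions above can be put together in a short Python sketch that assembles the request body. The helper names (name_part, build_request_body), the webhook name "my-webhook", and the use of a UUID for source_id are assumptions made for illustration; the endpoint URL and authentication details are not covered here, so the sketch stops at producing the JSON.

```python
import json
import uuid


def name_part(value, include=True):
    """One entry of name_compare_data: the stored value and whether to compare it."""
    return {"value": value, "include": include}


def build_request_body(image_link, webhook_name, first_name=None,
                       middle_name=None, last_name=None, compare_names=True):
    """Assemble the JSON body for the identify-ocr-extract service."""
    name_compare_data = None  # serialized as null: no name comparison
    if compare_names:
        # A name part you do not have is sent empty with include set to False,
        # so it is ignored during the comparison.
        name_compare_data = {
            "first_name": name_part(first_name or "", bool(first_name)),
            "middle_name": name_part(middle_name or "", bool(middle_name)),
            "last_name": name_part(last_name or "", bool(last_name)),
        }
    return {
        "service": "identify-ocr-extract",
        "payload": {
            "image_link": image_link,
            "webhook_name": webhook_name,
            # Assumption: a random UUID is one convenient unique request ID.
            "source_id": str(uuid.uuid4()),
            "name_compare_data": name_compare_data,
        },
    }


body = build_request_body(
    "https://link-to-your-id-image.jpeg", "my-webhook",
    first_name="John", last_name="Doe",
)
print(json.dumps(body, indent=2))
```

Store the generated source_id alongside your own record before sending the request, so the webhook data can be matched back to it later.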

Sample Response

If your request is successful, you will receive a response with a status code of 200. It contains a message naming the service you requested and the credits you have left after making the request.

{
  "message": "identify-ocr-extract request received",
  "remaining_credits": 92
}

If your request is not successful, the JSON response will contain a message describing the error. Possible errors are:

Invalid API token, with a status code of 401
Invalid service name, with a status code of 422
Not enough credits, with a status code of 422
Invalid payload, with a status code of 422
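
A minimal Python sketch of how a client might interpret these responses. The function name interpret_response is hypothetical, and the sketch assumes that error responses carry their description under the same "message" key as the success response, which you should confirm against a real error response.

```python
def interpret_response(status_code, body):
    """Turn an HTTP status code and decoded JSON body into (ok, detail)."""
    if status_code == 200:
        return True, {
            "message": body.get("message"),
            "remaining_credits": body.get("remaining_credits"),
        }
    if status_code == 401:
        return False, body.get("message", "Invalid API token")
    if status_code == 422:
        # 422 covers an invalid service name, not enough credits, and an
        # invalid payload; only the message tells them apart.
        return False, body.get("message", "Unprocessable request")
    return False, f"Unexpected status code {status_code}"


ok, detail = interpret_response(
    200, {"message": "identify-ocr-extract request received", "remaining_credits": 92}
)
print(ok, detail["remaining_credits"])
```

Because three different errors share status code 422, logging the returned message is the only way to tell them apart.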

How Name Comparison Works

When you use name comparison, the configuration in your payload determines how the names are compared. For example, suppose you send John, Wayne, and Doe as the first, middle, and last names, with the middle name set to not be included. If the name extracted from the ID is Johnny Wayne Doe, only John Doe is compared against Johnny Doe. It is best to test this with the different scenarios you expect in order to determine a score that is acceptable for you.
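
The effect of the include flags can be sketched in Python as follows. This is an illustration only: the service's actual scoring method is not documented here, so illustrative_score uses difflib.SequenceMatcher from the standard library as a stand-in and will not reproduce the service's scores. The function names are hypothetical, and the uppercasing mirrors what the sample webhook data shows for source_name.

```python
from difflib import SequenceMatcher

PARTS = ("first_name", "middle_name", "last_name")


def compared_names(name_compare_data, id_name_parts):
    """Build the two strings that get compared, skipping excluded parts."""
    included = [p for p in PARTS if name_compare_data[p]["include"]]
    source = [name_compare_data[p]["value"] for p in included]
    from_id = [id_name_parts[p] for p in included]
    # Drop empty parts so the joined names have no doubled spaces.
    source_name = " ".join(v for v in source if v).upper()
    id_name = " ".join(v for v in from_id if v).upper()
    return source_name, id_name


def illustrative_score(a, b):
    """Stand-in similarity on a 0 to 100 scale; NOT the service's algorithm."""
    return round(SequenceMatcher(None, a, b).ratio() * 100, 2)


config = {
    "first_name": {"value": "John", "include": True},
    "middle_name": {"value": "Wayne", "include": False},
    "last_name": {"value": "Doe", "include": True},
}
on_id = {"first_name": "Johnny", "middle_name": "Wayne", "last_name": "Doe"}

source_name, id_name = compared_names(config, on_id)
print(source_name, "vs", id_name)  # JOHN DOE vs JOHNNY DOE
print(illustrative_score(source_name, id_name))
```

Running your own test names through the real service remains the only reliable way to choose an acceptable score.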

Webhook Data Received: JSON

{
  "source_event": "identify-ocr-extract", 
  "payload": {
      "extracted_data": {
          "first_name": {
              "value": "SARAH", 
              "confidence": 96.61
          }, 
          "last_name": {
              "value": "MARTIN", 
              "confidence": 97.91
          }, 
          "middle_name": {
              "value": "", 
              "confidence": 99.18
          }, 
          "address": {
              "value": "", 
              "confidence": 99.17
          }, 
          "city": {
              "value": "", 
              "confidence": 99.17
          }, 
          "id_number": {
              "value": "P123456AA", 
              "confidence": 95.36
          }, 
          "expiration_date": {
              "value": "01-14-2033", 
              "confidence": 94.65
          }, 
          "date_of_birth": {
              "value": "08-01-1990", 
              "confidence": 95.86
          }, 
          "place_of_birth": {
              "value": "OTTAWA CAN", 
              "confidence": 96.74
          }, 
          "country": "Canada", 
          "id_type": "Passport"
      }, 
      "url": "<IMAGE LINK YOU PASSED IN THE INITIAL REQUEST>", 
      "source_id": "<YOUR UNIQUE REQUEST ID>", 
      "name_compare_data": {
          "first_name": {
              "value": "Sarah", 
              "include": true
          }, 
          "middle_name": {
              "value": "", 
              "include": false
          }, 
          "last_name": {
              "value": "Martin", 
              "include": true
          }
      }, 
      "name_compare_result": {
          "source_name": "SARAH MARTIN", 
          "id_name": "SARAH MARTIN", 
          "score": 100.0
      }
  }, 
  "error": null, 
  "completed_at": "2024-01-13T05:39:48.966984+00:00"
}

Webhook Data Definition

source_event - The name of the service
error - The error message if there was an error processing the request or null if processing was successful.
completed_at - The time the process was completed and sent to your webhook endpoint. It follows the ISO 8601 format.
payload.extracted_data - All the text details extracted from the ID. It will always be in this format. Each extracted field has the value found on the ID and a confidence level. The closer the confidence is to 100.00, the more likely it is that the extracted text is accurate for that field. If a specific field (e.g. id_number) was not found, its value is an empty string. A high confidence level with an empty value therefore most likely means that the ID has no ID number. A value with a very low confidence means the ML model spotted something that is probably the ID number but is not confident about it, likely because of image quality or because it is not yet familiar with that type of ID.
payload.name_compare_result - The results of the name comparison.
payload.source_id - The unique identifier that you sent with the initial request.
payload.name_compare_data - The name comparison configuration that you specified in the initial request.
payload.url - The link of the image you sent with the initial request.
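
As a rough Python sketch, a webhook receiver could reduce the data described above to the parts it acts on. The function name summarize_webhook and the 90.0 confidence threshold are assumptions for illustration, not recommendations from the service, and the shape of payload when error is not null is not documented here, so the sketch reads it defensively.

```python
from datetime import datetime


def summarize_webhook(event, min_confidence=90.0):
    """Reduce webhook data to values, low-confidence fields, and the name score."""
    payload = event.get("payload") or {}
    if event.get("error") is not None:
        return {"ok": False, "source_id": payload.get("source_id"),
                "error": event["error"]}

    fields, low_confidence = {}, []
    for label, item in payload["extracted_data"].items():
        if not isinstance(item, dict):
            fields[label] = item  # country and id_type are plain strings
            continue
        fields[label] = item["value"]
        # An empty value means the field was not found on the ID, so only
        # non-empty values are flagged for review.
        if item["value"] and item["confidence"] < min_confidence:
            low_confidence.append(label)

    result = payload.get("name_compare_result")
    return {
        "ok": True,
        "source_id": payload["source_id"],
        "fields": fields,
        "low_confidence": low_confidence,
        "name_score": result["score"] if result else None,
        # completed_at is ISO 8601, which fromisoformat parses directly.
        "completed_at": datetime.fromisoformat(event["completed_at"]),
    }
```

Fields listed in low_confidence are candidates for manual review or for asking the user to submit a clearer image.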

Cliqet
