Automation Document Processing API
Automation Document Processing provides intelligent capture through the Document Processing API. The API lets you extend your core enterprise content-management technology stack and speeds up the extraction and classification of data in your documents.
Prerequisites
- Make sure that Document Processing is deployed and initialized successfully by retrieving the overall deployment status.
    acacm=$(oc get cm -o name | grep aca-config)
    oc get $acacm -o jsonpath='{.data.ACA_INIT_STATUS}'
  This command should return True. If that is not the case, review your deployment.
- Create a project by using Document Processing Designer if you have not done so already.
- Make sure that the jq command-line JSON processor is installed. The jq tool is available from this page: https://stedolan.github.io/jq/.
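The tool prerequisites can be checked up front with a small helper. This is a minimal sketch, not part of the product: the function name missing_tools is hypothetical, and the list of tools to check (oc, jq, curl) is an assumption based on the commands used on this page.

```shell
# Report which of the given commands are not on PATH.
# Prints one missing name per line and returns non-zero if any are missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch (assumed tool list):
#   missing_tools oc jq curl || echo "Install the tools listed above first." >&2
```

The helper only confirms that the commands exist. It does not check their versions or whether oc is logged in to the cluster.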
Retrieve the required information
Retrieve the Document Processing back-end API URL information.
    echo "https://$(oc get route cpd -o jsonpath="{.spec.host}")/adp/aca"
You can verify the Document Processing build by running the following command:
    curl -k https://$(oc get route cpd -o jsonpath="{.spec.host}")/adp/aca/ping
The result should be similar to the following example:
    <h1>IBM Content Analyzer Ping Page</h1><p>Build: APD-Backend/master_21.0.3.0.1200 Thu Sep 30 09:57:46 PDT 2021
Document Processing API details
You can authenticate either through a Zen API Key or a Zen token.
- Authenticating with a Zen API Key
- A Zen API Key does not store any credentials or have an expiration time. It is also a good choice if you need to call the Document Processing API with a service account. To authenticate with a Zen API Key:
- Open the Cloud Pak Platform UI (Zen) home page, for example https://<adp_url>/zen/#/homepage.
- Generate an API Key by clicking .


- Encode the Zen API Key with the username by using Base64 as follows:
    <username>:<api key> => <Base64 encoded>
- Send it as the Authorization header with the prefix ZenApiKey. For example:
    Authorization: ZenApiKey <encoded value>

    curl --location --request POST "https://${Zen_host}/adp/aca/v1/projects/$project_name/analyzers" \
      --header 'Authorization: ZenApiKey Y2VhZG1pbjo4akJZeFVtY296NFBWd2hHaEMzeE5GYThVcDFkRlpWWWhXVFVabXI4' \
      --form 'file=@"/C:/Users/SampleFiles/APT003.pdf"' \
      --form 'responseType="\"json\""' \
      --form 'jsonOptions="\"HR\", \"DC\", \"KVP\", \"TH\", \"OCR\", \"SN\", \"MT\", \"CB\", \"ST\", \"DS\", \"AI\", \"CHAR\""'
© Copyright IBM Corporation 2022
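The encoding step for the Zen API Key can be sketched in shell as follows. The function name zen_api_key_header is hypothetical and not part of the product. The sketch assumes that a base64 command is available on the PATH.

```shell
# Build the Authorization header line for Zen API Key authentication.
# $1: username, $2: Zen API Key
zen_api_key_header() {
  # printf avoids the trailing newline that echo would add to the encoded text.
  # tr removes the line wraps that some base64 implementations insert.
  encoded=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: ZenApiKey %s\n' "$encoded"
}

# Usage sketch (username and api_key are assumed to be set by the caller):
#   curl ... --header "$(zen_api_key_header "$username" "$api_key")" ...
```

Because Base64 is an encoding and not encryption, the resulting header should be handled as carefully as the key itself.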
- Authenticating with a Zen token
- By using the jq JSON command-line processor and the curl tool, generate the Zen JSON web token (JWT). As the username value, pass the username that belongs to the appropriate group, such as captureadmins or projectadmins. For more information about roles, see Business Teams permissions.
    Zen_host=$(oc get route cpd -o jsonpath="{.spec.host}")
    username=<The username that belongs to the appropriate group such as `captureadmins` or `projectadmins`.>
    pwd=<password>
    Zen_JWT=$(curl -u "$username:$pwd" -Ssk -X GET "https://$Zen_host/v1/preauth/validateAuth" | jq -r '.accessToken')
- Setting the Document Processing project ID
-
- Open your project in Document Processing Designer and open the spbackend container log. Search for project_id to find the ID.
- Send the request with the Zen_JWT token:
    curl -k --location --request POST "https://${Zen_host}/adp/aca/v1/projects/$project_name/analyzers" \
      --header "Authorization: Bearer $Zen_JWT" \
      --form "file=@/tmp/TLG6TV.pdf" \
      --form "responseType=json" \
      --form "jsonOptions=ocr,dc,kvp,sn,hr,th,mt,ai,ds,char"
  Attention: The char option causes high memory usage in the postprocessing pod, which might result in the process running out of memory and stopping. If you need to use this option, monitor the postprocessing pods and increase their RAM accordingly.
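Before the Zen_JWT value is sent as a Bearer token, it can be worth confirming that a token was actually returned. The following is a minimal sketch with a hypothetical helper name, token_is_set. It relies on the fact that jq -r prints the literal string null when the selected field is absent, and it assumes that an empty value means the curl call failed.

```shell
# Succeeds only when the token looks usable: it is non-empty and is not the
# literal string "null", which jq -r prints when .accessToken is absent.
token_is_set() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Usage sketch (Zen_JWT is assumed to come from the token request):
#   if ! token_is_set "$Zen_JWT"; then
#     echo "No Zen token was returned; check the username and password." >&2
#   fi
```

This check does not validate the token contents or expiry. It only catches the case where the authentication call returned nothing useful.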
- Response properties
- The Document Processing API uses standard HTTP response codes to indicate whether a call is
successful or not.
- A 200 code indicates success.
- A 202 code indicates that the file is accepted for processing.
- A 4xx code indicates an error that might be caused by client input.
- A 5xx code indicates a server-related error.
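The response code categories listed above can be mapped to short labels in a script. This is an illustrative sketch only: the function name describe_status and the label strings are hypothetical, and the fallback branch for any other code is an assumption.

```shell
# Map an HTTP response code to the categories described in the list above.
describe_status() {
  case "$1" in
    200) echo "success" ;;
    202) echo "accepted for processing" ;;
    4??) echo "client error" ;;
    5??) echo "server error" ;;
    *)   echo "unexpected" ;;  # assumption: anything else is treated as unexpected
  esac
}

# Usage sketch: curl can print the response code with -w '%{http_code}'.
#   http_code=$(curl -sk -o response.json -w '%{http_code}' ...)
#   describe_status "$http_code"
```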
| Property | Data Type | Description |
|---|---|---|
| code | integer | The status associated with the response. |
| messageId | string | Document Processing message code. |
| message | string | The status associated with the response. |
| analyzerId | string | The ID associated with the API call. It is used in the other requests later. |
| fileNameIn | string | Name of the uploaded file. |
| type | array | The type of file that is generated. |
| errorId | string | Document Processing error code. |
| explanation | string | Explanation for the error. |
| action | string | The action that is needed to correct the error. |
Response examples
- Example of a successful response
-
{
  "status": {
    "code": 202,
    "messageId": "CIWCA50000",
    "message": "Success"
  },
  "result": [
    {
      "status": {
        "code": 202,
        "messageId": "CIWCA11106",
        "message": "Content Analyzer request was created"
      },
      "data": {
        "message": "json processing request was created successful",
        "fileNameIn": "Legal Invoice 15.pdf",
        "analyzerId": "ac3afc50-2c52-11ec-b296-c35cda005f89",
        "type": [
          "json"
        ]
      }
    }
  ]
}
- Example of an error response
-
{
  "error": "invalid_token",
  "error_description": "access token is missing or invalid."
}
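Because the analyzerId is needed in later requests, a script typically extracts it from the response body. The following sketch uses jq for that and prints the error description when the body has the error shape shown in the error example. The function name summarize_response is hypothetical, and the sketch assumes that the response body matches one of the two example shapes.

```shell
# Read a JSON response body on standard input.
# Prints the analyzerId for a success body, or the error description
# for a body that has a top-level "error" property.
summarize_response() {
  jq -r 'if .error then "error: \(.error_description)" else .result[0].data.analyzerId end'
}

# Usage sketch (response.json is an assumed file that holds the response body):
#   analyzer_id=$(summarize_response < response.json)
```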