Handling JSON Output from AI Query
Overview

In this tutorial, we will see how CloudFiles can be used with Salesforce Flow automations to extract data from documents in a structured JSON format.

Each time a file is uploaded into Salesforce or external storage, the following actions occur:

- The file is processed using CloudFiles Document AI.
- A natural language query is executed to extract information from the document in a JSON format tailored to your object and field structure.
- The extracted data is parsed and processed using Salesforce Apex.

To set up this automation, you first need to meet some prerequisites. After that, you can set up the query automations and updates needed for your use case. Check the sections below for more information.

Prerequisites

Before starting this tutorial, ensure that CloudFiles Document AI is properly installed and configured in your system. You also need to create some initial automations to process the files before you can query them. Check the two sections below carefully.

Installation & Configuration

Ensure the following before starting:

- CloudFiles Document AI Installation and Configuration are complete.
- You have an active subscription or trial of CloudFiles Document AI.
- The CloudFiles Event Mode setting is set to Custom Object.

Initial Automation Setup

You need to set up two flows to process the uploaded files. The first flow is triggered when the document is uploaded into Salesforce or external storage, and it sends the file to the AI for processing. This might take a few seconds. Once the AI has finished processing, the second flow is triggered with the processed document. You can then run queries on this processed document.

Flow 1: Send Document for Processing
- Triggered when a file is uploaded into Salesforce or external storage.
- Sends the file for processing, which may take a few seconds to a minute.

Flow 2: Document Processing Complete
- Triggered when the file has been processed by AI.
- Performs queries on the file to extract the data.

All information related to setting up these flows is given in the AI Flows Initial Setup Guide article. Please go through that article carefully to set up these flows. Once your flow setup is complete, you should have both flows in place.

Query Automation Setup

Now that the initial automation is set up, we can take the Document Processed flow and extend it to query the document and update the required fields. This section shows how to query the processed document, check the results and perform the necessary updates.

Querying the Document

To query the document, we use the Query Document flow action. This action takes a processed document ID and a text query as input. The processed document ID is available as an output of the Get Event Details action.

The inputs used are:

- Processed Document ID: {!Get_Triggering_Event_Details.documentProcessed.processedDocumentId}
- Query: the natural language query text (see the example below)

The AI can be guided to generate structured outputs tailored to specific requirements, which simplifies data processing. Provide a clear prompt that tells the AI what data to extract, the required format for each field, and the output structure (e.g., JSON). Specify how to handle missing values (e.g., use null) and include any key mappings needed for integration.

Example

Consider a driver's license or passport. We need to write a query that extracts various details from the document, such as the type, number, issue date, expiry date, and the holder's name. The sample query for this is:

    Identify the type of KYC document provided (e.g., passport, driver's license).
    Extract the following fields from the document:
    - Document type (e.g., 'passport', 'driver's license')
    - Document number
    - Full name of the document holder
    - Date of issue (in yyyy-mm-dd format)
    - Date of expiry (in yyyy-mm-dd format)
    Return only the structured JSON in this exact format:
    {
      "KYC_Document_Type__c": "",
      "KYC_Document_Number__c": "",
      "KYC_Document_Holder_Name__c": "",
      "KYC_Document_Issue__c": "",
      "KYC_Document_Expiry__c": ""
    }
    If any field is missing or unreadable, return its value as null.
    Only return the JSON object and nothing else.

The AI output, as tested on a sample driver's license, is as follows:

    {
      "KYC_Document_Type__c": "driver's license",
      "KYC_Document_Number__c": "64040738293",
      "KYC_Document_Holder_Name__c": "kowalski jan",
      "KYC_Document_Issue__c": "2019-03-06",
      "KYC_Document_Expiry__c": "2028-01-18"
    }

Process AI Output Using Apex Action

Once the extracted fields are available in JSON format, they can be processed in various ways within Salesforce. In this case, we use an Apex action to process the output: it parses the JSON, maps each data field to the appropriate object field, and updates the corresponding fields on the Contact record. The JSON keys must match the keys requested in the query exactly, because the map lookup is case-sensitive.

This is the Apex code used to process the JSON input:

    public with sharing class UpdateContactFromKYC {

        @InvocableMethod(label='Update Contact KYC Fields' description='Updates Contact fields based on KYC document JSON input')
        public static void updateContact(List<KYCInput> inputs) {
            List<Contact> contactsToUpdate = new List<Contact>();

            for (KYCInput input : inputs) {
                // Skip inputs that lack a Contact ID or a JSON payload.
                if (String.isBlank(input.contactId) || String.isBlank(input.kycJson)) {
                    continue;
                }
                try {
                    // Parse the AI output into a generic key/value map.
                    Map<String, Object> data = (Map<String, Object>) JSON.deserializeUntyped(input.kycJson);
                    Contact c = new Contact(Id = input.contactId);

                    // Copy each non-null value to the matching Contact field.
                    if (data.containsKey('KYC_Document_Type__c') && data.get('KYC_Document_Type__c') != null) {
                        c.KYC_Document_Type__c = String.valueOf(data.get('KYC_Document_Type__c'));
                    }
                    if (data.containsKey('KYC_Document_Number__c') && data.get('KYC_Document_Number__c') != null) {
                        c.KYC_Document_Number__c = String.valueOf(data.get('KYC_Document_Number__c'));
                    }
                    if (data.containsKey('KYC_Document_Holder_Name__c') && data.get('KYC_Document_Holder_Name__c') != null) {
                        c.KYC_Document_Holder_Name__c = String.valueOf(data.get('KYC_Document_Holder_Name__c'));
                    }
                    // Date fields expect the yyyy-MM-dd format requested in the query.
                    if (data.containsKey('KYC_Document_Issue__c') && data.get('KYC_Document_Issue__c') != null) {
                        c.KYC_Document_Issue__c = Date.valueOf(String.valueOf(data.get('KYC_Document_Issue__c')));
                    }
                    if (data.containsKey('KYC_Document_Expiry__c') && data.get('KYC_Document_Expiry__c') != null) {
                        c.KYC_Document_Expiry__c = Date.valueOf(String.valueOf(data.get('KYC_Document_Expiry__c')));
                    }
                    contactsToUpdate.add(c);
                } catch (Exception e) {
                    // Malformed JSON or an unparseable date skips this input only.
                    System.debug('Error parsing KYC JSON or updating Contact: ' + e.getMessage());
                    continue;
                }
            }

            // Single DML statement for all collected records.
            if (!contactsToUpdate.isEmpty()) {
                update contactsToUpdate;
            }
        }

        public class KYCInput {
            @InvocableVariable(label='Contact ID' required=true)
            public String contactId;

            @InvocableVariable(label='KYC JSON Output' required=true)
            public String kycJson;
        }
    }

See It in Action

Now that everything is set up, you can test your flows. When a KYC document is uploaded as a Salesforce File on the Contact record, the corresponding KYC fields are automatically extracted and updated on that record.

Note: Flow automation executions and document processing via AI may take a short time, so the field updates will not appear immediately. To view the updates, wait briefly and refresh the page.

Flow 2 Debug

Whenever a file is processed using the Process Document Using AI action, a Document Processed event object record is published. You can query these event object records to verify successful file processing and debug the flow.

Example SOQL query to check Document Processed events, sorted by the most recent:

    SELECT Id, Name, CreatedDate, cldfs__Data__c
    FROM cldfs__CloudFilesEvent__c
    WHERE cldfs__Type__c = 'Document Processed'
    ORDER BY CreatedDate DESC

You can check the context and file details in the Data field of the record. When a flow runs in debug mode and executes the Query Document action, you can check the results and modify the queries if required.
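
To inspect the event records without leaving the Developer Console, the SOQL check described above can also be run as anonymous Apex. The following is a minimal sketch, not an official CloudFiles snippet. It assumes the object and field API names are cldfs__CloudFilesEvent__c, cldfs__Type__c and cldfs__Data__c (the names from the SOQL example in this section, written with the usual namespace and custom-field underscores). The LIMIT of 5 is an arbitrary choice for illustration.

```apex
// Anonymous Apex sketch: list the most recent "Document Processed" events.
// Assumption: the API names below match the SOQL example in this tutorial;
// adjust them if your org exposes different names.
List<cldfs__CloudFilesEvent__c> events = [
    SELECT Id, Name, CreatedDate, cldfs__Data__c
    FROM cldfs__CloudFilesEvent__c
    WHERE cldfs__Type__c = 'Document Processed'
    ORDER BY CreatedDate DESC
    LIMIT 5
];

if (events.isEmpty()) {
    // Nothing published yet: processing may still be running, or Flow 1 did not fire.
    System.debug('No Document Processed events found yet.');
}

for (cldfs__CloudFilesEvent__c evt : events) {
    // Print when the event was created, then the raw Data field for inspection.
    System.debug(evt.Name + ' created ' + evt.CreatedDate.format());
    System.debug(evt.cldfs__Data__c);
}
```

If no events appear shortly after an upload, it is usually worth checking Flow 1 first, since Flow 2 only runs once a Document Processed event has been published.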