
Using AI for Simple Data Extraction from KYC Files

Tutorial Overview

This tutorial offers detailed instructions on configuring Salesforce Flow Automations using CloudFiles Document AI Flow Actions. Specifically, it covers:

  • Processing Salesforce Files (Content Documents) uploaded or attached to Salesforce Object records.
  • Using AI to extract information from them.
  • Automating the update of Salesforce Object fields based on the extracted data.

Scenario

In this scenario, your organization wants to automatically process Driving Licenses attached to Contact records. Each time a Driving License is uploaded or attached to a Contact record, the following actions occur:

  • Every uploaded file is processed using CloudFiles Document AI.
  • Essential information (Name, Driver’s License Number, Issue Date, Expiry Date) is extracted.
  • Corresponding fields on the Contact record are automatically updated.

Using CloudFiles Document AI Flow Actions with Salesforce Flow elements, you can build automated workflows for document processing and data extraction within Salesforce.

What to Expect

Upon completing this tutorial, you will:

  • Gain hands-on experience configuring Salesforce Flows with CloudFiles Document AI Flow Actions.
  • Understand how to set up file processing and data extraction automation.
  • Master automating Salesforce record updates based on AI-extracted data.

Pre-requisites

Ensure the following before starting:

  1. CloudFiles Document AI Installation and Configuration are complete.
  2. You have an Active Subscription or Trial of CloudFiles Document AI.
  3. The Salesforce fields to be updated with the extracted information exist.
  4. The CloudFiles Event Mode setting is set to Custom Object.


Flow Automation Setup

As described in the AI Flows - Initial Setup Guide, this kind of automation requires two Salesforce flows:

Flow 1

Triggered whenever a new Salesforce File is uploaded or attached (identified by the Salesforce File Attached event) onto Contact records. It processes each attached file using Process Document using AI in the context of the triggering Contact record. Successful file processing publishes a Document Processed event, triggering Flow 2.

For more details, refer to the section Flow 1 Configuration in Process Salesforce Files.



Flow 2

Triggered by each new Document Processed event. This flow filters events published by Flow 1 execution, extracts required details from the Driver's License, and updates the relevant Contact fields.





Flow 2 Configuration

1 - Create a Record-Triggered Flow

  1. Navigate to Salesforce Setup > Process Automation > Flows.
  2. Click "New Flow", select "Start From Scratch", and then "Record-Triggered Flow".

2 - Configure Trigger

  1. Select the object as CloudFiles Event ( cldfs__CloudFilesEvent__c ).
  2. Under Trigger the Flow When:, select A record is created.
  3. Set entry condition as Type ( cldfs__Type__c ) = document-processed
  4. Under Optimize the Flow for:, select Actions and Related Records.
  5. Check Include a Run Asynchronously path to access an external system after the original transaction for the triggering record is successfully committed.





3 - Get Triggering Event Details

  1. On the Run Asynchronously path, add the CloudFiles: Get Event Details Action.
  2. Provide a clear label such as Get Triggering Event Details.
  3. Input parameter: Event (Custom Object) = Triggering cldfs__CloudFilesEvent__c ( {!$Record} ).
  4. Leave all other defaults and close the action.


4 - Check Document Processed Event Context

  1. After the Get Event Details action, click + and add a new Decision element.
  2. Provide a clear label such as Check Document Processed Event Context.
  3. Create an outcome for the path the flow should take when the Document Processed event's Context is a Contact RecordId, and give it a label (e.g., Is a Contact RecordId).
  4. Configure the outcome conditions as follows:
    1. Resource = Outputs from Get Triggering Event Details > Document Processed > Context ( {!Get_Triggering_Event_Details.DocumentProcessed.Context} )
    2. Operator = Starts With
    3. Value = 003
  5. Leave all other defaults and close the element.

Note: In Salesforce, the first three characters of a record ID (the key prefix) indicate the object type. For example, Contact record IDs start with "003". This step ensures the flow continues only when the Document Processed event's context is a Contact RecordId, which indicates that the event originated from a Flow 1 execution where the Contact RecordId was provided as context.
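
To make the Decision condition concrete, the following is a minimal Python sketch of the same "Starts With 003" check, written outside Salesforce purely for illustration. The function name and the sample IDs are hypothetical, and the length check is an extra safeguard that the Decision element above does not perform.

```python
CONTACT_KEY_PREFIX = "003"  # key prefix of Contact record IDs


def is_contact_record_id(context: str) -> bool:
    """Return True when the context looks like a Contact record ID."""
    # Salesforce record IDs are 15 or 18 characters long (extra check,
    # not part of the Decision element, which only uses Starts With).
    has_valid_length = len(context) in (15, 18)
    return has_valid_length and context.startswith(CONTACT_KEY_PREFIX)


# Hypothetical sample IDs, used only to exercise the check.
print(is_contact_record_id("003000000000001AAA"))  # Contact-style ID
print(is_contact_record_id("001000000000001AAA"))  # ID with a different prefix
```

In the flow itself, the Starts With operator on the Context output plays the role of `startswith` here.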



5 - Extract DL Holder's Name

  1. On the Is a Contact RecordId outcome path of the Decision element, click + and add a CloudFiles: Query Document Action.
  2. Provide a clear label such as Query: DL Holder Name.
  3. Configure the input parameters as follows:
    1. Processed Document Id = Outputs from Get Triggering Event Details > Document Processed > Processed Document Id ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
    2. Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Full Name including the first and last name of the Driving License Holder
  4. Leave all the other defaults and close the action.


6 - Extract DL Issue Date

  1. Click + and add a CloudFiles: Query Document Action.
  2. Provide a clear label such as Query: Issue Date.
  3. Configure the input parameters as follows:
    1. Processed Document Id = Outputs from Get Triggering Event Details > Document Processed > Processed Document Id ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
    2. Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Issue Date in DD-MM-YYYY Format
  4. Leave all the other defaults and close the action.


7 - Extract DL Expiry Date

  1. Click + and add a CloudFiles: Query Document Action.
  2. Provide a clear label such as Query: Expiry Date.
  3. Configure the input parameters as follows:
    1. Processed Document Id = Outputs from Get Triggering Event Details > Document Processed > Processed Document Id ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
    2. Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Expiry Date in DD-MM-YYYY Format
  4. Leave all the other defaults and close the action.


The extracted dates are returned as text strings. They can be converted to the Date data type using Salesforce formulas.
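
As an illustration of the conversion logic only, here is a small Python sketch that parses the DD-MM-YYYY text requested by the prompts above into a date value. It is not the Salesforce formula itself; the function name is hypothetical, and returning None for unparseable text is an assumption about how you might want to handle an unexpected AI response.

```python
from datetime import date, datetime
from typing import Optional


def parse_dl_date(text: str) -> Optional[date]:
    """Parse a DD-MM-YYYY string; return None if the text does not match."""
    try:
        return datetime.strptime(text.strip(), "%d-%m-%Y").date()
    except ValueError:
        # The text was not a valid DD-MM-YYYY date
        # (for example "unknown" or "31-02-2024").
        return None


print(parse_dl_date("05-11-2027"))
print(parse_dl_date("unknown"))
```

Inside a flow, the equivalent step would split the text into day, month, and year parts in a formula resource; validating the text first avoids writing a malformed value to a Date field.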

8 - Extract License Number

  1. Click + and add a CloudFiles: Query Document Action.
  2. Provide a clear label such as Query: License Number.
  3. Configure the input parameters as follows:
    1. Processed Document Id = Outputs from Get Triggering Event Details > Document Processed > Processed Document Id ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
    2. Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Number
  4. Leave all the other defaults and close the action.


9 - Update Contact Record

  1. To update the Contact record with the extracted values, click + and add an Update Records element.
  2. Provide a relevant label such as Update DL Details.
  3. For How to Find Records to Update and Set Their Values, select the option Specify conditions to identify records, and set fields individually.
  4. To find the right Contact record, compare its Id with the event Context. To do so, add the filter condition Id Equals Outputs from Get Triggering Event Details > Document Processed > Context ( {!Get_Triggering_Event_Details.DocumentProcessed.Context} ).


Now set the field values manually:

  • License Number = Text from Query: License Number
  • DL Expiry Date = Text from Query: Expiry Date
  • DL Issue Date = Text from Query: Issue Date
  • Name in License = Text from Query: DL Holder Name
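
The overall logic of Flow 2 (steps 2 to 9) can be summarized in a short Python sketch. This is only a model of the flow's decisions, not CloudFiles code: `handle_document_processed` and `fake_query_document` are hypothetical names, the stub returns canned answers in place of the Query Document action, and the sample IDs are made up.

```python
from typing import Callable, Dict, Optional

# Prompts from steps 5-8, keyed by the Contact field labels used in step 9.
QUERIES: Dict[str, str] = {
    "Name in License": "Return the Full Name including the first and last "
                       "name of the Driving License Holder",
    "DL Issue Date": "Return the Driving License Issue Date in DD-MM-YYYY Format",
    "DL Expiry Date": "Return the Driving License Expiry Date in DD-MM-YYYY Format",
    "License Number": "Return the Driving License Number",
}


def handle_document_processed(
    event_type: str,
    context: str,
    processed_document_id: str,
    query_document: Callable[[str, str], str],
) -> Optional[Dict[str, str]]:
    """Return the Contact field updates, or None when the event is skipped."""
    if event_type != "document-processed":  # entry condition (step 2)
        return None
    if not context.startswith("003"):  # Decision element (step 4)
        return None
    # One Query Document call per field (steps 5-8).
    return {
        field: query_document(processed_document_id, prompt)
        for field, prompt in QUERIES.items()
    }


def fake_query_document(processed_document_id: str, prompt: str) -> str:
    """Hypothetical stand-in for the Query Document action (canned answers)."""
    if "Issue Date" in prompt:
        return "01-02-2020"
    if "Expiry Date" in prompt:
        return "01-02-2030"
    if "License Number" in prompt:
        return "DL-0000000"
    return "Jane Doe"


updates = handle_document_processed(
    "document-processed", "003000000000001AAA", "doc-1", fake_query_document
)
print(updates)
```

Events whose context is not a Contact record ID return None here, which corresponds to the flow ending without running the queries or the Update Records element.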





10 - Save and Activate

Save your Flow 2 (for example as CloudFiles: Document Processed Event Handler) and activate the flow. You are now ready to test these automations.

Flow 2 Debug

Whenever a file is processed using the Process Document using AI action, a Document Processed event record is published.

You can query these event object records to verify successful file processing and debug the flow.

Example SOQL Query:

To check Document Processed events sorted by the most recent:

SELECT Id, Name, CreatedDate, cldfs__Data__c FROM cldfs__CloudFilesEvent__c WHERE cldfs__Type__c = 'document-processed' ORDER BY CreatedDate DESC

You can check the Context and File Details in the Data Field of the record.
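
When many events exist, it can help to fetch only the latest few. The sketch below builds the same query with a LIMIT clause appended; it only produces the query text, which you would then run in your usual SOQL tool. The function name and the default limit of 10 are illustrative assumptions.

```python
def build_event_query(event_type: str = "document-processed", limit: int = 10) -> str:
    """Build the debug SOQL query, restricted to the most recent events."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Escape backslashes and single quotes so the value is a valid SOQL string literal.
    safe_type = event_type.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT Id, Name, CreatedDate, cldfs__Data__c "
        "FROM cldfs__CloudFilesEvent__c "
        f"WHERE cldfs__Type__c = '{safe_type}' "
        f"ORDER BY CreatedDate DESC LIMIT {limit}"
    )


print(build_event_query(limit=5))
```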

When a flow runs in debug mode and executes Query Document actions, you can check the results and modify the queries if required.

See it in Action

Now that everything is set up, you can test your flows. When a Driver's License is uploaded to a Contact folder from the CloudFiles LWC on the Contact record page, the mapped fields should be updated automatically.

Note: Flow executions and AI document processing can take a short time, so the field updates may not appear immediately. To view the updates, wait briefly and refresh the page.