Using AI for Simple Data Extraction from KYC Files
In this tutorial, we will see how CloudFiles can be used with Salesforce flow automations to extract data from uploaded documents. Each time a file is uploaded into Salesforce or external storage, the following actions will occur:
- The file is processed using CloudFiles Document AI.
- A Natural Language Query executes to extract information from the Driving License.
- Information is automatically updated in its respective fields.
To set up this automation, you first need to meet some prerequisites. After that, you can set up the query automations and record updates for your use case. Check the sections below for more information.
Before starting this tutorial, ensure that CloudFiles Document AI is properly installed and configured in your system. You also need to create some initial automations to process the files before you can query them. Review the two sections below carefully.
Ensure the following before starting:
- You have an Active Subscription or Trial of CloudFiles Document AI.
- The CloudFiles Event Mode setting is set to Custom Object.

You will need to set up two flows to process the uploaded files. The first flow is triggered when a document is uploaded into Salesforce or external storage, and it sends the file to the AI for processing. Once processing is complete, the second flow is triggered with the processed document, and you can then run queries on it.
- Flow 1: Send Document For Processing - Triggered when a file is uploaded into Salesforce or external storage. Sends the file for processing, which may take a few seconds to a minute.
- Flow 2: Document Processing Complete - Triggered when the file has been processed by AI. Performs queries on the file to extract the data.
All information related to setting up these flows is given in the AI Flows - Initial Setup Guide article. Please go through this article carefully to set up these flows.
Once your flow setup is complete, your flows should look like the following.

Follow the AI Flows - Initial Setup Guide article to set up both flows. Once your initial flows are set up as shown above, you are ready to move to the next step.
Now that the initial automation is set up, we can extend the Document Processing Complete flow to query the document and update the required fields. In this section, we show how to query the processed document, check the results and perform the necessary updates.
To query the document, we will use the Query Document flow action. This action takes a processed document ID and a text query as input. The processed document ID is available as an output of the Get Event Details action. Here are the full inputs used in the images below.
Processed Document Id = Triggering cldfs__CloudFilesEvent__c ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Full Name including the first and last name of the Driving License Holder

Processed Document Id = Triggering cldfs__CloudFilesEvent__c ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Issue Date in DD-MM-YYYY Format

Processed Document Id = Triggering cldfs__CloudFilesEvent__c ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Expiry Date in DD-MM-YYYY Format

The extracted date is returned as a text string, but it can be converted to a Date data type using Salesforce formulas.
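The conversion logic can be sketched outside Salesforce to show what it has to handle. The Python function below is only an illustration, not part of CloudFiles or Salesforce: the name `parse_license_date` is hypothetical, and it assumes the AI response contains a date in the DD-MM-YYYY format requested by the prompt, possibly surrounded by other words. In a flow, a similar result could be reached with a formula built from functions such as DATE, VALUE, LEFT, MID and RIGHT.

```python
import re
from datetime import date
from typing import Optional

# Matches a DD-MM-YYYY token anywhere in the response text.
_DATE_PATTERN = re.compile(r"\b(\d{2})-(\d{2})-(\d{4})\b")


def parse_license_date(response: Optional[str]) -> Optional[date]:
    """Return the first DD-MM-YYYY date found in the response, or None.

    Hypothetical helper: it mirrors the text-to-date conversion described
    above and rejects values that are not real calendar dates.
    """
    match = _DATE_PATTERN.search(response or "")
    if match is None:
        return None
    day, month, year = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # For example 31-02-2020: well-formed text, but not a valid date.
        return None
```

Returning `None` instead of raising keeps a malformed response from stopping the rest of the update; whether a blank date or a failed flow is preferable depends on your process.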
Processed Document Id = Triggering cldfs__CloudFilesEvent__c ( {!Get_Triggering_Event_Details.DocumentProcessed.ProcessedDocumentId} )
Query = Enter a clear Natural Language Data Extraction Prompt such as: Return the Driving License Number

The Query Document action above outputs a text variable that contains the full AI response for the given query. To find the right Contact record, compare the record ‘Id’ with the ‘Context’. To do so, in ‘Update Records’, filter Contact records with the condition Id Equals Outputs from Get Processed Event Details > DocumentProcessed > Context.
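The matching and mapping step can be pictured as building one set of field-value pairs keyed by the Context. The Python sketch below is illustrative only: the function name and the Contact field API names (`License_Holder_Name__c` and so on) are hypothetical placeholders, not fields shipped with CloudFiles or Salesforce, and should be replaced with the fields in your org.

```python
from typing import Dict, Optional

# Hypothetical mapping from each query to a Contact field API name.
FIELD_FOR_QUERY: Dict[str, str] = {
    "full_name": "License_Holder_Name__c",
    "issue_date": "License_Issue_Date__c",
    "expiry_date": "License_Expiry_Date__c",
    "license_number": "License_Number__c",
}


def build_contact_update(
    context_id: str, answers: Dict[str, Optional[str]]
) -> Dict[str, str]:
    """Return the field-value pairs for the Contact whose Id equals the Context."""
    if not context_id:
        raise ValueError("Context is empty; cannot identify the Contact record")
    update = {"Id": context_id}
    for key, field in FIELD_FOR_QUERY.items():
        value = (answers.get(key) or "").strip()
        if value:
            # Skip blank answers so existing field values are not overwritten.
            update[field] = value
    return update
```

Skipping blank answers is a design choice in this sketch; in the flow itself, the equivalent would be a decision or formula that guards each field assignment.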

Now set the field values manually, as shown in the image below:

Now that everything is set up, you can test your flows. When a Driver's License is uploaded into a Contact folder from the CloudFiles LWC on the Contact record page, the mapped fields should be updated automatically.
Note: Because flow executions and AI document processing take a short time, the field updates will not appear immediately. To view the updates, wait briefly and refresh the page.
Flow 2 Debug
Whenever a file is processed using the Process Document using AI action, a Document Processed event record is published.
You can query these event records to verify that the file was processed successfully and to debug the flow.
Example SOQL Query:
To check Document Processed events sorted by the most recent:
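A minimal sketch of such a query, assembled in Python so that the resulting string can be pasted into the Developer Console Query Editor or passed to an API client. It assumes the events are stored in the `cldfs__CloudFilesEvent__c` object referenced in the flow inputs above, and it selects only the standard fields `Id`, `Name` and `CreatedDate`. The helper name `build_recent_events_soql` is hypothetical, and the API name of the Data field is not given in this article, so it has to be passed in through `extra_fields` once you have looked it up in your org.

```python
from typing import Sequence

# Object name taken from the flow inputs shown in this tutorial.
EVENT_OBJECT = "cldfs__CloudFilesEvent__c"


def build_recent_events_soql(limit: int = 10, extra_fields: Sequence[str] = ()) -> str:
    """Build a SOQL query that lists event records, newest first.

    Hypothetical helper: extra_fields is where the API name of the
    Data field would go once it is known.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    fields = ["Id", "Name", "CreatedDate", *extra_fields]
    return (
        f"SELECT {', '.join(fields)} "
        f"FROM {EVENT_OBJECT} "
        f"ORDER BY CreatedDate DESC "
        f"LIMIT {limit}"
    )


print(build_recent_events_soql())
```

Sorting on `CreatedDate` in descending order puts the event for the most recently processed file at the top of the results.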
You can check the Context and File Details in the Data field of the record.
When a flow runs in debug mode and executes Query Document actions, you can inspect the results and adjust the queries if required.