Split Document using AI

Introduction

The CloudFiles: Split Document Using AI flow action enables automated splitting of merged PDF files stored in Salesforce or external cloud storage into individual documents. By leveraging AI, the action intelligently identifies and separates multiple documents within a merged PDF. This is particularly useful for processing files like scanned bundles of invoices or multi-document contracts.

What this action does

The CloudFiles: Split Document Using AI action asynchronously processes a merged PDF file—whether stored in Salesforce or external storage—and intelligently splits it into multiple individual documents based on its content. Once complete, it performs the following:

  • Publishes a Document Split event: This platform event includes metadata for each split file and can be used to trigger follow-up flows.
  • Uploads a searchable version of the original file: If the submitted document was a scanned, non-searchable file (e.g., image-based PDF), a searchable, OCR-enhanced version of the original merged file is also created and stored in the specified destination.
  • Auto-names each split file: Every generated split file is automatically named using AI, based on the content within that file (e.g., document type, person name, or reference ID).

These enhancements allow for better indexing, file retrieval, and immediate usability of both the original and split documents.

Example Use Case

Automating the processing of uploaded invoice bundles. A merged PDF containing multiple invoices—often scanned and non-searchable—is submitted to the flow. The Split Document Using AI action processes the file, performs OCR to generate a searchable version of the original document, and splits it into individual invoice files.

Each split file is:

  • Auto-named using AI based on invoice-specific details (e.g., invoice number, vendor name).
  • Saved alongside the searchable original file in a specified Salesforce record or external storage folder.

This setup ensures improved document searchability, easier identification, and seamless automation of invoice handling or record updates in Salesforce.

Input Parameters

In your Flow Builder, search for the element named "CloudFiles: Split Document using AI". You can find this action in the CloudFiles category when you click on the "Action" element in the "Add Element" box. Select the action to insert it into the flow, and then configure the input parameters.

To Split a Salesforce File

To specify a Salesforce File to be processed, set the input parameters as follows:

  1. Library - salesforce
  2. FileID - The ContentDocumentID of the Salesforce File to be split.

You can get the ContentDocumentID of the Salesforce File from standard Salesforce elements such as "Get Records", from the standard Screen Flow "Upload Files" component, or from the details of CloudFiles events such as Salesforce File Attached.

  1. You can only split a Salesforce File in ContentDocument format.
  2. You cannot split an Attachment, i.e., a file in the Classic Salesforce file format.
  3. It is mandatory to input both Library and FileID to specify a Salesforce File.
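
The rules above can be expressed as a small pre-check. The following is only a sketch in Python, not part of CloudFiles: the `SplitInput` class and `validate_salesforce_input` function are hypothetical names, and the check relies on the assumption that ContentDocument record Ids begin with the key prefix `069` while classic Attachment Ids begin with `00P`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitInput:
    """Hypothetical stand-in for the action's Library and FileID inputs."""
    library: str
    file_id: str


def validate_salesforce_input(params: SplitInput) -> list:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if params.library != "salesforce":
        problems.append("Library must be 'salesforce' for a Salesforce File")
    # Assumption: ContentDocument Ids start with 069, Attachment Ids with 00P.
    if not params.file_id:
        problems.append("FileID is mandatory")
    elif params.file_id.startswith("00P"):
        problems.append("Attachments (Classic files) cannot be split")
    elif not params.file_id.startswith("069"):
        problems.append("FileID does not look like a ContentDocumentID")
    return problems
```

In a flow, the equivalent check would typically be a Decision element placed before the action.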

To Split an External Storage File

If you also use the CloudFiles: Document Management package, you can process and split files in connected external storage as well.

To specify an External Storage File to be processed, set the input parameters as follows:

  1. Library - The external storage type you are using. Possible values are sharepoint, google (for Google Drive), azure, onedrive, dropbox, box, cloudfiles (for AWS S3).
  2. Drive Id - The Id of the drive where the document resides. This applies to Google Drive and SharePoint libraries only. The Drive Id is a unique identifier for a storage location in both SharePoint and Google Drive. In SharePoint, it represents a document library within a site, while in Google Drive, it identifies a user's drive or shared drive.
  3. File Id - The unique identifier (Resource Id) of the file that is to be processed.

Depending on the use case, you can get these parameters from the details of other CloudFiles events such as File Uploaded or File Received.
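
As a rough illustration of how these three parameters fit together, here is a minimal Python sketch. The function name `check_external_input` is hypothetical, and treating Drive Id as required for `sharepoint` and `google` is an assumption drawn from the description above rather than a documented rule.

```python
# Library values listed in this documentation.
EXTERNAL_LIBRARIES = {
    "sharepoint", "google", "azure", "onedrive", "dropbox", "box", "cloudfiles",
}
# Assumption: only these libraries need a Drive Id.
NEEDS_DRIVE_ID = {"sharepoint", "google"}


def check_external_input(library, file_id, drive_id=None):
    """Return a list of problems with the external-file inputs (empty if none)."""
    problems = []
    if library not in EXTERNAL_LIBRARIES:
        problems.append(f"Unknown external library: {library!r}")
    if not file_id:
        problems.append("File Id is required")
    if library in NEEDS_DRIVE_ID and not drive_id:
        problems.append(f"Drive Id is required for {library}")
    return problems
```

For example, `check_external_input("dropbox", "abc")` reports no problems, while the same call with `"google"` and no Drive Id reports one.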

Context

An optional identifier used to track the source of the event or to carry any other details you need. The value is returned in the corresponding output, i.e., in the details of the corresponding Document Split event.

The Context parameter is particularly helpful if this action is used in multiple flows. For example, if you’re processing documents attached to Contact records, you can set the Contact’s Record Id as the Context. When events are published, this Context value will help you track the origin of each event by showing the associated Contact record.
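
The idea of tracing events back to their origin can be sketched as follows. This is an illustration only: events are modeled as plain Python dictionaries, and the key name `context` and the sample Ids are assumptions, not the documented Document Split event schema.

```python
from collections import defaultdict


def group_events_by_context(events):
    """Group event payloads by their Context value (e.g., a Contact Record Id)."""
    grouped = defaultdict(list)
    for event in events:
        # "context" is an assumed key name, not the documented field name.
        grouped[event.get("context")].append(event)
    return dict(grouped)


# Made-up sample data: two events from one Contact, one from another.
sample_events = [
    {"context": "003A", "file": "invoice-1.pdf"},
    {"context": "003B", "file": "contract.pdf"},
    {"context": "003A", "file": "invoice-2.pdf"},
]
by_contact = group_events_by_context(sample_events)
```

In a platform event-triggered flow, the same effect is usually achieved by reading the Context value from the event and using it in a Get Records or Update Records element.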

Instructions (Optional)

You can guide the AI by adding specific instructions for how to interpret and split the document. Providing contextual cues can improve the accuracy of the split, especially for varied or non-standard formats.

For example:

“This is a bundle of scanned invoices. Each invoice starts with a bold header containing 'Invoice Number' and ends with a subtotal. Split at each occurrence of this pattern.”



Destination

The Destination parameter determines where the generated files will be stored after CloudFiles completes the splitting process.

Steps to Configure Destination:

  1. Create an Apex-Defined Variable of Apex Class cldfs__Resource
  2. Assign the necessary metadata (e.g., Library, Drive Id, Id) based on the chosen storage option.
  3. To Save in Salesforce
    • Default Behavior (Salesforce Files):
      • If no Destination parameter is specified, the files are stored in the Salesforce File Library.
    • Link to a Specific Record:
      • Assign the following fields:
        • Library: Set to salesforce.
        • Id: Set to the Record Id of the Salesforce record to which the files will be linked.
  4. To Store in External Storage
      • Assign the following fields to define the external storage location:
        • Library: Specify the library name.
        • Drive Id: Specify the Id of the connected drive.
        • Id: Specify the Id of the folder or location within the drive.
  5. Pass the configured variable as the Destination Parameter in the CloudFiles: Split Document Using AI action.

This setup provides flexibility to store files in Salesforce or external storage, aligning with your business needs.
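
The three destination options can be summarized in a short Python sketch. It models the variable as a dictionary with the field names used above (Library, Drive Id, Id); the function `build_destination` is hypothetical, and requiring an Id whenever a Library is set is an assumption.

```python
def build_destination(library=None, record_or_folder_id=None, drive_id=None):
    """Return a dict mirroring the Library / Drive Id / Id fields, or None."""
    if library is None:
        # No destination: per the steps above, files go to the Salesforce File Library.
        return None
    if not record_or_folder_id:
        # Assumption: an Id is needed once a Library is chosen.
        raise ValueError("Id is required once a Library is chosen")
    destination = {"Library": library, "Id": record_or_folder_id}
    if library != "salesforce" and drive_id:
        # Drive Id only applies to external storage.
        destination["Drive Id"] = drive_id
    return destination
```

Calling `build_destination("salesforce", record_id)` corresponds to linking the files to a specific record, while passing an external library name, a folder Id, and a Drive Id corresponds to the external storage case.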

Output Parameters

The Apex action does not return any output to the flow in which it is used. Instead, a Document Split event is published for every file processed. This event signals that the split is complete and can be used to trigger platform event-triggered flows that perform post-split actions.