Line Item Extraction

Overview

In this tutorial, we will see how CloudFiles can be used with Salesforce Flow automations to extract data from documents in a specified format. Each time a file is uploaded into Salesforce or external storage, the following actions occur:

- The file is processed using CloudFiles Document AI.
- A natural language query executes to extract order line items from the uploaded purchase order.
- Information is extracted.
- The extracted information is then processed.

To set up this automation, you first need to complete some prerequisites. After that, you can set up the query automations and updates needed for your use case. Check the sections below for more information. Refer to the image below to understand how this works.

Prerequisites

Before starting this tutorial, ensure that CloudFiles Document AI is properly installed and configured in your system. You also need to create some initial automations to process the files before you can query them. Check the two sections below carefully.

Installation & Configuration

Ensure the following before starting:

- CloudFiles Document AI Installation and Configuration are complete.
- You have an active subscription or trial of CloudFiles Document AI.
- The CloudFiles Event Mode setting is set to Custom Object.

Initial Automation Setup

You need to set up two flows to process the uploaded files. The first flow is triggered when the document is uploaded into Salesforce or external storage, and it sends the file to the AI for processing. This might take a few seconds. Once the AI has finished processing, the second flow is triggered with the processed document. You can then run queries on this processed document.

Flow 1: Send Document for Processing
- Triggered when a file is uploaded into Salesforce or external storage.
- Sends the file for processing, which may take a few seconds to a minute.

Flow 2: Document Processing Complete
- Triggered
when the file has been processed by the AI.
- Performs queries on the file to extract the data.

All information related to setting up these flows is given in the AI Flows Initial Setup Guide article. Read that article carefully and set up both flows. Once your flow setup is complete, your flows should look like the following.

Once your initial flows are set up as shown above, you are ready to move to the next step.

Query Automation Setup

Now that the initial automation is set up, we can take the document processed flow (Flow 2) and extend it to query the document and update the required fields. This section shows how to query the processed document, check the results, and perform the necessary updates.

Querying the Document

To query the document, we use the Query Document flow action. This action takes a processed document ID and a text query as input. The processed document ID is available as an output of the Get Event Details action. Here are the full inputs used in the above image:

- Processed Document ID = the processed document ID from the triggering cldfs__CloudFilesEvent__c record ({!Get_Triggering_Event_Details.documentProcessed.processedDocumentId})
- Query = the natural language query text

The AI can be guided to generate structured outputs and responses tailored to specific requirements, which simplifies data processing. Provide a clear prompt that tells the AI what data to extract, the required format for each field, and the output structure (e.g., JSON). Specify how to handle missing values (e.g., use null) and include any key mappings for integration.

Example

Suppose there is a parent object called Purchase Orders, which has a related child object called Order Line Items. Each order line item includes the following fields: Line Item Name, Quantity, Unit Price, and Line Amount. We need a query that can extract the order line items from a given
sample purchase order and return the results as a structured JSON array. The sample query is:

    Extract all line items from the purchase order document.

    For each line item, extract the following fields:
    - Line Item Name
    - Quantity (whole number)
    - Unit Price (currency value with two decimals)
    - Line Amount (currency value with two decimals)

    Return the result as a JSON array of objects. Each object must have these keys:
    - Item_Name__c
    - Quantity__c
    - Unit_Price__c
    - Line_Amount__c

    Format example:
    [
      { "Item_Name__c": "", "Quantity__c": , "Unit_Price__c": , "Line_Amount__c": },
      { "Item_Name__c": "", "Quantity__c": , "Unit_Price__c": , "Line_Amount__c": }
    ]

    Rules:
    - If any field is missing for a line item, set its value to null (e.g., "Unit_Price__c": null).
    - Quantity must be an integer (no decimals).
    - Unit Price and Line Amount must be numbers with two decimal places.
    - Output only the JSON array. No extra text, comments, or explanation.

The AI output, as tested in the AI Playground on the above sample document, is as follows:

    [{"line item name": "onsite stem workshop training", "quantity": "1", "unit price": "5000.00", "line amount": "5000.00"},
     {"line item name": "stem workbook 1", "quantity": "850", "unit price": "12.00", "line amount": "10200.00"},
     {"line item name": "stem workbook 2", "quantity": "850", "unit price": "42.00", "line amount": "35700.00"},
     {"line item name": "stem workbook 3", "quantity": "850", "unit price": "15.50", "line amount": "13175.00"},
     {"line item name": "test series 1", "quantity": "850", "unit price": "9.50", "line amount": "8075.00"},
     {"line item name": "test series 2", "quantity": "850", "unit price": "11.50", "line amount": "9775.00"}]

Note that this sample output uses plain keys and quoted string values rather than the keys named in the query. The Apex action in the next section reads the keys named in the query, so confirm in the AI Playground that your query returns those keys before passing the output to the action.

Process AI Output Using Apex Action

Once the line items are available in JSON format, they can be processed in various ways within Salesforce. In this case, we use an Apex action to handle the data: it reads the JSON, maps each item to the correct Salesforce fields, associates the items with the specified purchase order, and inserts all the records in a single step. This is the Apex
code used to process the JSON input:

    public with sharing class CreatePurchaseOrderLineItemsFromJson {

        // Inputs supplied by the flow: the parent record and the AI-generated JSON.
        public class Inputs {
            @InvocableVariable(required=true)
            public Id purchaseOrderId;

            @InvocableVariable(required=true)
            public String lineItemsJson;
        }

        @InvocableMethod(label='Create Line Items from JSON' description='Creates Purchase_Order_Line_Item__c records from extracted JSON')
        public static void createLineItems(List<Inputs> inputsList) {
            if (inputsList.isEmpty()) return;

            // Only the first request in the list is processed.
            Inputs inputs = inputsList[0];
            List<Purchase_Order_Line_Item__c> lineItemsToInsert = new List<Purchase_Order_Line_Item__c>();

            try {
                // The AI output is expected to be a JSON array of objects.
                List<Object> parsedJson = (List<Object>) JSON.deserializeUntyped(inputs.lineItemsJson);

                for (Object obj : parsedJson) {
                    Map<String, Object> itemMap = (Map<String, Object>) obj;

                    Purchase_Order_Line_Item__c lineItem = new Purchase_Order_Line_Item__c();
                    lineItem.Purchase_Order__c = inputs.purchaseOrderId;

                    // Copy each key that is present; explicit nulls stay null.
                    if (itemMap.containsKey('Item_Name__c')) {
                        lineItem.Item_Name__c = (String) itemMap.get('Item_Name__c');
                    }
                    if (itemMap.containsKey('Quantity__c')) {
                        lineItem.Quantity__c = itemMap.get('Quantity__c') != null ?
                            Integer.valueOf(itemMap.get('Quantity__c')) : null;
                    }
                    if (itemMap.containsKey('Unit_Price__c')) {
                        lineItem.Unit_Price__c = itemMap.get('Unit_Price__c') != null ?
                            Decimal.valueOf(String.valueOf(itemMap.get('Unit_Price__c'))) : null;
                    }
                    if (itemMap.containsKey('Line_Amount__c')) {
                        lineItem.Line_Amount__c = itemMap.get('Line_Amount__c') != null ?
                            Decimal.valueOf(String.valueOf(itemMap.get('Line_Amount__c'))) : null;
                    }

                    lineItemsToInsert.add(lineItem);
                }

                // Insert all line items in a single DML statement.
                if (!lineItemsToInsert.isEmpty()) {
                    insert lineItemsToInsert;
                }
            } catch (Exception e) {
                System.debug('Error parsing or inserting line items: ' + e.getMessage());
                // Optionally add custom error handling / store to a debug object
            }
        }
    }

See It in Action

Now that everything is set up, you can test your flows. When a purchase order is uploaded as a Salesforce file on the record, the corresponding order line items are automatically extracted, created, and listed.

Note: Because flow automation executions and document processing via AI may take a short time, the field updates will not appear immediately. To view the updates, wait briefly and refresh the page.

Flow 2 Debug

Whenever a file is processed using the Process Document Using AI action, a Document Processed object record is published. You can query these event object records to verify successful file processing and debug the flow.

Example SOQL query to check Document Processed events, sorted by most recent:

    SELECT Id, Name, CreatedDate, cldfs__Data__c
    FROM cldfs__CloudFilesEvent__c
    WHERE cldfs__Type__c = 'Document Processed'
    ORDER BY CreatedDate DESC

You can check the context and file details in the Data field of the record. When a flow runs in debug mode and executes the Query Document action, you can check the results and modify the queries if required.
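
Besides checking events and running the flow in debug mode, the Apex action can be exercised on its own with a small test class. The following is a minimal sketch, not part of the CloudFiles package. It assumes the class is saved as CreatePurchaseOrderLineItemsFromJson, that the objects and fields use the API names Purchase_Order__c, Purchase_Order_Line_Item__c, Item_Name__c, Quantity__c, Unit_Price__c, and Line_Amount__c, that Quantity__c is a Number field, and that a Purchase_Order__c record can be inserted without populating other required fields. Adjust these names to match your org.

```apex
@IsTest
private class CreatePurchaseOrderLineItemsFromJsonTest {

    @IsTest
    static void createsOneRecordPerJsonObject() {
        // Assumption: the parent object has no additional required fields.
        Purchase_Order__c po = new Purchase_Order__c();
        insert po;

        // Build the same input the flow would pass to the invocable action.
        CreatePurchaseOrderLineItemsFromJson.Inputs input =
            new CreatePurchaseOrderLineItemsFromJson.Inputs();
        input.purchaseOrderId = po.Id;
        // Two line items; the second has a null unit price to exercise the null rule.
        input.lineItemsJson =
            '[{"Item_Name__c": "STEM Workbook 1", "Quantity__c": 850,' +
            ' "Unit_Price__c": 12.00, "Line_Amount__c": 10200.00},' +
            ' {"Item_Name__c": "Test Series 1", "Quantity__c": 850,' +
            ' "Unit_Price__c": null, "Line_Amount__c": 8075.00}]';

        CreatePurchaseOrderLineItemsFromJson.createLineItems(
            new List<CreatePurchaseOrderLineItemsFromJson.Inputs>{ input });

        List<Purchase_Order_Line_Item__c> items = [
            SELECT Item_Name__c, Quantity__c, Unit_Price__c, Line_Amount__c
            FROM Purchase_Order_Line_Item__c
            WHERE Purchase_Order__c = :po.Id
            ORDER BY Item_Name__c
        ];

        // One record per JSON object, linked to the purchase order.
        System.assertEquals(2, items.size(), 'Expected one record per JSON object');
        System.assertEquals('STEM Workbook 1', items[0].Item_Name__c);
        System.assertEquals(850, items[0].Quantity__c.intValue());
        // A null in the JSON should leave the field empty.
        System.assertEquals(null, items[1].Unit_Price__c);
    }
}
```

Because the action catches exceptions and only writes them to the debug log, a parsing or insert failure shows up here as a record count of zero rather than as a thrown error, which makes a test like this a quick way to notice key mismatches between the query and the Apex code.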