Intelligent Document Queries
Introduction

In today's business landscape, organizations handle a vast array of documents (contracts, invoices, KYC forms, medical records, purchase orders, and more) that are often scattered across systems, stored in different formats, or even handwritten and multilingual. Extracting key information from these files can be slow, error-prone, and inefficient when relying on traditional search or manual processing.

CloudFiles Document AI transforms document workflows by combining advanced OCR with natural language processing (NLP), enabling businesses to quickly find, analyze, and act on information. Its capabilities include:

NLP-Powered Q&A: Ask questions in natural language and get precise answers from any document instantly. It supports complex queries and multilingual text, and reduces manual search time.

Multi-Format Output: Extracted data can be delivered as plain text, structured tables, charts, or summaries, adapting to your workflow and eliminating the need for manual reformatting.

Deep Data Extraction: Pull structured text, tables, and metadata from across pages and file types, and merge results directly with Salesforce for a unified, enriched view of your records.

With CloudFiles Document AI, teams can automate data entry, accelerate workflows, and reduce errors, unlocking the full value of their documents without extra effort.

NLP-Powered Q&A

When working with large documents (contracts, compliance files, KYC documents, or diagnostic reports), finding answers is often tedious. Traditional keyword-based search tools only highlight text but don't provide direct answers, forcing users to interpret results manually. This creates inefficiencies, slows down decision-making, and increases the chance of missing critical details.

The NLP-Powered Q&A feature solves this by allowing you to interact with documents in plain English (or any other natural language). You ask a question, and CloudFiles AI gives you the precise answer from the document, instantly.

Upload a document to the AI
Playground to try it now.

For example:

Query: "What is the expiry date of this driver's license? Output only the date in DD/MM/YYYY format."
Output: "01/08/2029"

Query (in German): "Extrahiere nur die vollständige Firmenadresse aus dem verarbeiteten Dokument. Gib die Adresse als einfachen Text zurück, ohne zusätzliche Details, Beschriftungen oder Formatierungen." (In English: "Extract only the full company address from the processed document. Return the address as plain text, without additional details, labels, or formatting.")
Output: "Beispiel GmbH, Musterstraße 12, 10115 Berlin"

This capability opens up wide-ranging applications. For instance, a lawyer can extract key clauses from long contracts in seconds, and compliance teams can quickly identify required identity documents and agendas from regulatory filings.

Powered by advanced NLP, Intelligent Document Queries supports both general queries and file-based queries, making document exploration faster and smarter. With this feature, businesses no longer need to waste time digging through unstructured text. Try it now in the AI Playground and see how quickly you can find exactly what you need.

Multi-Format Output

Extracting information is only the first step. What often slows down workflows is the inconsistency in how information is structured across documents: users spend time cleaning, aligning, and reformatting data just to make it usable.

Multi-Format Output in CloudFiles Document AI eliminates this hassle by enforcing consistent, structured outputs regardless of the document's original format. Whether you need plain text for quick checks, a concise summary for an overview, or organized data in tables and charts for deeper analysis, the output adapts to your workflow.

1. Upload a document to the AI Playground.
2. Ask a query on any uploaded document.
3. Choose your preferred output format, such as JSON or CSV.

The query below follows prompting best practices that help produce accurate results.

Query:
{instruction} You are an AI trained to extract structured data from documents.
{task} Extract all line
items from the provided purchase order document. For each line item, extract the following fields: line item name, quantity (whole number), unit price (currency value with two decimals), line amount (currency value with two decimals).
{output format} Return results only as a JSON array of objects with these keys: Item_Name__c, Quantity__c, Unit_Price__c, Line_Amount__c.
{rules} If any field is missing, set its value to null. Quantity must be an integer. Unit price and line amount must be numbers with exactly two decimal places. Output only the JSON array, with no extra text, comments, or explanation.
{example} [{"Item_Name__c": "item a", "Quantity__c": 10, "Unit_Price__c": 15.50, "Line_Amount__c": 155.00}, {"Item_Name__c": "item b", "Quantity__c": 5, "Unit_Price__c": null, "Line_Amount__c": 75.00}]

Output:
[
  {"Item_Name__c": "platinum web hosting package down 35mb, up 100mb", "Quantity__c": 1, "Unit_Price__c": 65.00, "Line_Amount__c": 65.00},
  {"Item_Name__c": "2 page website design includes basic wireframes, and responsive templates", "Quantity__c": 3, "Unit_Price__c": 2100.00, "Line_Amount__c": 2100.00},
  {"Item_Name__c": "mobile designs includes responsive navigation", "Quantity__c": 1, "Unit_Price__c": 250.00, "Line_Amount__c": 250.00}
]

Applications span industries. For example, operations teams can automatically extract order or shipment details, and finance teams can quickly consolidate line-item data across invoices.

CloudFiles powers this adaptability by converting AI responses into multiple layouts and export-ready formats. With Multi-Format Output, the same answer works for multiple needs, with no reformatting and no wasted time. Test it in the AI Playground and see how easily it adapts to your way of working.

Deep Data Extraction

Documents often contain far more than just words. Tables, metadata, and structured data are usually embedded deep inside, spread across pages and formats that make manual extraction painful and error-prone. Deep Data Extraction in CloudFiles Document AI is designed to solve this
challenge by going beyond surface-level OCR. It pulls structured text, tabular data, and metadata with precision and merges the results directly with Salesforce, giving you a unified and enriched view of your records.

Upload a document to the AI Playground and ask a query to try it out. For example, suppose we upload a 3-page document that contains information on various insurance claims.

Ask complex queries that go beyond surface-level OCR and require computation. For example:

Query:
Instruction: Calculate the total amount of insurance claims raised in the year 2024.
Task: Review the document for all insurance claims raised during 2024. Sum up the claim amounts. Return the total amount as a single numeric value, preferably with currency formatting if applicable.
Example output: $1,250,000
Output: $40,889.86

Ask queries that use data directly from Salesforce fields, allowing you to dynamically extract, filter, or compare information based on your Salesforce records. You can use this feature in Flow actions.

Query:
Instruction: Find the claim ID for a specific claimant.
Task: Search the document for claims where the claimant name matches {!$Record.Patient_Name__c}. Extract and return the corresponding claim ID(s) as plain text.
Output: 2024001882

The applications span industries:

Insurance: Automatically extract claim details, policy numbers, and settlement amounts from lengthy claim forms, enabling faster approvals and reporting.

Legal: Pull clauses, dates, and party information from multi-page contracts to assist in due diligence, compliance, or litigation support.

By providing a unified view of structured and unstructured data, Deep Data Extraction not only accelerates workflows but also improves data accuracy, transparency, and decision-making. Businesses no longer need to rely on manual extraction or risk missing critical insights hidden deep within documents.
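
The rules in the purchase order prompt (integer quantity, two-decimal amounts, null for missing fields, a bare JSON array) can also be checked in code once a response comes back. The sketch below is a minimal, hypothetical client-side check and is not part of CloudFiles. The function name `validate_line_items` is illustrative, and the key names assume Salesforce-style API names such as `Item_Name__c`, so adjust them to whatever keys your prompt requests.

```python
import json
from decimal import Decimal

# Assumed key names (hypothetical); change these to match the keys in your prompt.
MONEY_KEYS = ("Unit_Price__c", "Line_Amount__c")
REQUIRED_KEYS = ("Item_Name__c", "Quantity__c") + MONEY_KEYS


def validate_line_items(raw_json: str) -> list[str]:
    """Return a list of rule violations; an empty list means every rule passed."""
    # parse_float=Decimal keeps the literal digits, so 65.00 is not collapsed to 65.0
    items = json.loads(raw_json, parse_float=Decimal)
    if not isinstance(items, list):
        return ["top-level value is not a JSON array"]

    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not a JSON object")
            continue
        for key in REQUIRED_KEYS:
            if key not in item:
                problems.append(f"item {i}: missing key {key}")

        qty = item.get("Quantity__c")
        # bool is a subclass of int in Python, so it is excluded explicitly
        if qty is not None and (isinstance(qty, bool) or not isinstance(qty, int)):
            problems.append(f"item {i}: Quantity__c must be an integer or null")

        for key in MONEY_KEYS:
            value = item.get(key)
            if value is None:
                continue  # null is allowed for missing fields
            if not isinstance(value, Decimal) or value.as_tuple().exponent != -2:
                problems.append(f"item {i}: {key} must have exactly two decimal places")
    return problems


# The {example} payload from the prompt, used here as sample input
sample = (
    '[{"Item_Name__c": "item a", "Quantity__c": 10, '
    '"Unit_Price__c": 15.50, "Line_Amount__c": 155.00}, '
    '{"Item_Name__c": "item b", "Quantity__c": 5, '
    '"Unit_Price__c": null, "Line_Amount__c": 75.00}]'
)
print(validate_line_items(sample))  # prints []
```

Parsing with `parse_float=Decimal` is a deliberate choice here: a plain float cannot tell `65.00` from `65.0`, so the two-decimal rule could not be verified after a default parse.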