Document Parser
Operations Team
Currently, I use a combination of DocParser and Zapier to load up information into SmartSuite from uploaded documents. It would be beautiful to have a system that allows us to configure a parser to find key points of information, and upload a document directly to SmartSuite to sift through and enter the information into fields.
Through DocParser I've been able to shave 10-15 minutes off each document, and I feel like having this system within SmartSuite would be incredibly useful for automation!
Nate Montgomery @ SmartSuite
Merged in a post:
PDF Invoice OCR in Forms; Load in a PDF > Form reads > Creates an Invoice!
Tidd.Co
SmartSuite could eat the breakfast of several other full software products with this ability. Dext and Hubdoc are OCR software: you email or upload the PDF of an invoice to the app, and you teach it how to recognize the invoice terms with rules integrated with a customer database. The resulting data would then be saved in a record in a purchase orders app.
Nate Montgomery @ SmartSuite
Merged in a post:
Extract Key Data from Scanned Document
Kelly Lamb
It would be great if we could scan a document into SmartSuite to automatically create a new record (e.g., scan a PO to generate an order), without having to manually enter data.
Jon Darbyshire
Hey Kelly Lamb, thanks for your feedback! I have a few more questions for you:
- Could you provide examples of the types of documents you'd like to scan and the key data you'd like extracted from them?
- What is the typical format of the data in these documents (e.g., tables, paragraphs, lists)?
- How would you like the system to handle any potential errors or inconsistencies in the scanned data?
Kelly Lamb
Jon Darbyshire, thanks for your follow-up questions! The main type of document would be a purchase order in the form of a PDF. Key data would include P.O. #, order date, manufacturer, mfg. product number, quantity, customer, customer part number, price, ship to, and account manager. Every customer's P.O. looks a little different, so is it possible to train AI to look for and match the data needed, based on an existing image on the document, such as a logo or text stamped on the document? For example, if the P.O. is from customer XYZ, something in the scan would let the system know that "Seller" on this P.O. is equivalent to the "manufacturer" field in SmartSuite, vs. customer ABC's P.O., where the manufacturer is listed as "Supplier".
After scanning, a human would probably need to review the entry to confirm accuracy. Any info not found could be left blank for manual entry. Could we trigger an event (such as an e-mail) to notify of any errors or inconsistencies?
Thanks again!
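The per-customer matching described above could be sketched roughly as follows. This is only an illustration in Python, not how SmartSuite or any existing parser works: the label maps, field names, and the `detect_customer` helper are all hypothetical, and a plain text match on the customer name stands in for recognizing a logo or stamp.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical maps from the label printed on each customer's P.O.
# to the target field name in the record.
LABEL_MAPS: Dict[str, Dict[str, str]] = {
    "XYZ": {"Seller": "manufacturer", "PO Number": "po_number"},
    "ABC": {"Supplier": "manufacturer", "P.O. #": "po_number"},
}

# Fields we want on every record (a small, assumed subset).
FIELDS = ["po_number", "manufacturer", "quantity"]


def detect_customer(text: str) -> Optional[str]:
    """Return the first known customer name found in the text, if any.

    Stand-in for logo or stamp recognition, which is not implemented here.
    """
    for customer in LABEL_MAPS:
        if customer in text:
            return customer
    return None


def map_fields(text: str) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Map 'Label: value' lines to fields; return the record and missing fields."""
    record: Dict[str, Optional[str]] = {name: None for name in FIELDS}
    customer = detect_customer(text)
    label_map = LABEL_MAPS.get(customer, {}) if customer else {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        target = label_map.get(label.strip())
        if sep and target and value.strip():
            record[target] = value.strip()
    # Anything not found stays blank (None) for manual entry.
    missing = [name for name, value in record.items() if value is None]
    return record, missing


xyz_po = "Customer XYZ Corp\nPO Number: 4471\nSeller: Acme Mfg\n"
abc_po = "ABC Industries\nP.O. #: 88\nSupplier: Acme Mfg\n"
xyz_record, xyz_missing = map_fields(xyz_po)
abc_record, abc_missing = map_fields(abc_po)
```

In both samples the manufacturer ends up in the same field even though the P.O.s label it differently, and `quantity` is reported as missing. A non-empty missing list is the kind of condition that could trigger a review notification such as an e-mail.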
Kelly Lamb
Jon Darbyshire, as a follow-up, I wanted to give an example of a company that seems to provide this option: https://flowrms.com/auto-data-load/
I have not personally seen it in action, but it is an option our organization is looking at closely.
Jon Darbyshire
Great to hear your perspective, Tidd.Co! I have a few more questions for you:
- Could you provide examples of the specific invoice terms you would like the software to recognize?
- What specific rules would you like to be integrated into the customer database for recognizing invoice terms?
- Could you elaborate on how you envision the resultant data being saved in a purchase orders app?
Tidd.Co
Jon Darbyshire definitely AI.
- invoice date, supplier name, address, line item detail, total
- default customer chart of account code, default terms etc.
- the invoice datapoints would save to a purchase orders app and the line items data would save to a linked line items record OR would be saved as subitems
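One way to picture the record shape described in the list above is a purchase order record holding the invoice datapoints, with line items linked to it. The sketch below is plain Python with hypothetical class and field names; it does not reflect SmartSuite's actual data model or API, and the supplier defaults table is an assumed stand-in for the customer database.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List

# Hypothetical defaults looked up by supplier name
# (chart of accounts code and payment terms).
SUPPLIER_DEFAULTS: Dict[str, Dict[str, str]] = {
    "Acme Supplies": {"account_code": "5000", "terms": "Net 30"},
}


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal

    @property
    def amount(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass
class PurchaseOrderRecord:
    invoice_date: str
    supplier_name: str
    address: str
    account_code: str = ""
    terms: str = ""
    # Linked line items (or subitems) belonging to this record.
    line_items: List[LineItem] = field(default_factory=list)

    @property
    def total(self) -> Decimal:
        return sum((item.amount for item in self.line_items), Decimal("0"))


def build_record(invoice_date: str, supplier_name: str, address: str,
                 items: List[LineItem]) -> PurchaseOrderRecord:
    """Create a record and fill in the supplier's defaults when they exist."""
    defaults = SUPPLIER_DEFAULTS.get(supplier_name, {})
    return PurchaseOrderRecord(
        invoice_date=invoice_date,
        supplier_name=supplier_name,
        address=address,
        account_code=defaults.get("account_code", ""),
        terms=defaults.get("terms", ""),
        line_items=items,
    )


po = build_record(
    "2024-01-15",
    "Acme Supplies",
    "1 Main St",
    [LineItem("Widget", 2, Decimal("10.00")), LineItem("Bracket", 1, Decimal("5.50"))],
)
```

Computing the total from the line items gives a simple cross-check against the total printed on the invoice.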
Jon Darbyshire
Tidd.Co Thank you for the additional information. I do agree that this would be very useful, and it is something that I am thinking about now. QuickBooks and other accounting products provide a feature like this that allows you to upload an invoice file, which is then parsed into a record for you. We have been looking at ways to use AI to perform this capability, and with the recent updates to ChatGPT we can now do this using our Make and Zapier connectors. However, I am thinking about how we can easily provide an AI capability in Automations that will allow you to set this up and then perform the action at the click of a button or based on a Trigger condition you add.
Jon Darbyshire
Hello Operations Team! I have a few more questions for you:
- Can you provide examples of the key points of information you typically need to extract from the documents?
- What types of documents do you usually upload and need to parse?
- How do you envision the configuration process for the parser within SmartSuite?
Operations Team
Jon Darbyshire Absolutely. Currently we use DocParser to scan real estate offer letters from agents (California Residential Purchase Agreement PDF). We scan for the offer amount, buyer details, agent details, document dates, etc.
We are planning to parse documents other than RPAs, but those are what we're starting with thus far.
As far as configurations go, DocParser's overall layout and whatnot works fine. If the documentation is pretty uniform, you can plot boxes where the information is and the parser automatically uses the information inside the crop box. What we do, since the document sizes and page numbers can vary, is we isolate it by text keywords. The parser will transform the document into all text (removing formatting) and you give it start and end keywords.
For example: if you know that a real estate agent's name will be on a signature line, and the text right before it says "Buyer Agent Name:", that would be your start location. Your end location would be the keyword right after, or the next line of text. The keyword might be "Sign Date:" ("Sign" would be the end point). So it will capture anything between those keywords.
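The start/end keyword approach described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not DocParser's actual implementation; the function name `extract_between` and the sample text are made up for the example, and it assumes the document has already been converted to plain text.

```python
from typing import Optional


def extract_between(text: str, start_keyword: str, end_keyword: str) -> Optional[str]:
    """Return the text between start_keyword and the next end_keyword.

    Returns None if the start keyword is not found. If the end keyword is
    not found, capture up to the end of the current line instead.
    """
    start = text.find(start_keyword)
    if start == -1:
        return None
    start += len(start_keyword)
    end = text.find(end_keyword, start)
    if end == -1:
        end = text.find("\n", start)
        if end == -1:
            end = len(text)
    # Collapse runs of whitespace left over from the removed formatting.
    return " ".join(text[start:end].split())


sample = "Buyer Agent Name: Jane Doe   Sign Date: 01/02/2024\nOffer Amount: $750,000\n"
agent = extract_between(sample, "Buyer Agent Name:", "Sign")
amount = extract_between(sample, "Offer Amount:", "\n")
```

Here `agent` captures the name between "Buyer Agent Name:" and "Sign", and `amount` uses the end of the line as its end point, which matches the "next line of text" option mentioned above.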