ParserData + Make.com: Google Drive document extraction

Automatically extract structured data from documents uploaded to Google Drive using Make.com and the ParserData API, then save clean JSON results back to Drive.


What this scenario does

This automation creates a fully hands-off document processing pipeline:

  • Watches a Google Drive folder
  • Detects newly uploaded files
  • Downloads the document
  • Sends it to the ParserData extraction API
  • Cleans the API response
  • Saves the extracted data as a JSON file back to Google Drive

Once activated, everything runs automatically.
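The steps above can be sketched as plain Python, with each function body standing in for one Make.com module. Everything here (names, the sample response, the null-dropping cleanup) is illustrative, not part of the blueprint:

```python
def process_new_file(file_meta):
    """Run one document through the pipeline; returns the cleaned
    extraction plus a log of the stages that ran, in order."""
    log = ["watch"]                      # trigger: new file detected
    content = b"%PDF-1.7 ..."            # stand-in for the downloaded bytes
    log.append("download")
    # Stand-in for the ParserData API call (the HTTP module):
    raw = {"invoice_number": "INV-001", "supplier_name": None}
    log.append("extract")
    # Cleanup pass: drop fields the API could not extract.
    clean = {k: v for k, v in raw.items() if v is not None}
    log.append("transform")
    log.append("upload")                 # JSON written to the output folder
    return clean, log
```

The point is only the ordering: each stage consumes the previous stage's output, which is why a failure in any module halts that execution.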


Requirements (Step 0: Before you start)

You will need:

  • A Make.com account
  • A Google Drive account with access to folders you want to monitor
  • A ParserData API key (sign up at parserdata.com)

Step 1: Import the scenario into Make.com

  1. Download the scenario blueprint: scenarios/google-drive-parserdata.json
  2. In Make.com, click "Create a new scenario"
  3. Click the "Import" button (top-right corner)
  4. Select the downloaded JSON file
  5. The scenario will be imported with all modules pre-configured

Step 2: Configure Google Drive connections

The scenario uses two Google Drive connections:

  1. Google Drive (Trigger): Watches for new files
  2. Google Drive (Action): Uploads extracted JSON back

For each Google Drive module:

  • Click on the module
  • Click "Add" next to the connection
  • Authenticate with your Google account
  • Grant necessary permissions (read/write access to Google Drive)

Step 3: Configure ParserData API connection

  1. Click on the "HTTP" module (the one making the API call)
  2. In the "Headers" section, add:
    • X-API-Key: Your ParserData API key
  3. The URL is pre-configured to: https://api.parserdata.com/v1/extract

⚠️ Important Security Note: Never hardcode your API key directly in the scenario.

Store your API key in Make.com environment variables:

  1. Go to your Make.com account settings > Environment variables
  2. Create a variable named PARSERDATA_API_KEY
  3. In the HTTP module, use {{vars.PARSERDATA_API_KEY}} instead of hardcoding the key

See .env.example for a template of required environment variables.
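In plain code, the same pattern looks like this: read the key from the environment at call time instead of embedding it. This is a sketch (the helper name is ours), mirroring `{{vars.PARSERDATA_API_KEY}}` in the HTTP module:

```python
import os

def parserdata_headers():
    """Build the request headers, reading the API key from the
    environment rather than hardcoding it in the scenario."""
    api_key = os.environ.get("PARSERDATA_API_KEY")
    if not api_key:
        raise RuntimeError("PARSERDATA_API_KEY is not set")
    return {"X-API-Key": api_key}
```

Failing loudly when the variable is missing is deliberate: a silent empty key would just surface later as an opaque 401 from the API.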

Step 4: Configure folders

Replace the placeholder folder IDs in the scenario:

  1. Input Folder: Where new documents will be uploaded

    • Find the Google Drive module labeled "Watch files"
    • Update the folder ID to your target folder
  2. Output Folder: Where extracted JSON will be saved

    • Find the Google Drive module labeled "Upload file"
    • Update the folder ID to your destination folder

Step 5: Test the scenario

  1. Click "Run once" to test the scenario
  2. Upload a test document (PDF, JPG, PNG) to your input folder
  3. Check that the scenario executes successfully
  4. Verify that a JSON file appears in your output folder

Step 6: Schedule the scenario

Once testing is successful:

  1. Click "Schedule" on the scenario page
  2. Set polling frequency (recommended: every 15 minutes)
  3. Click "Save"

Scenario structure

The scenario consists of 5 main modules:

  1. Google Drive (Trigger) - Watches for new files in a specific folder
  2. Google Drive (Download) - Downloads the file content as binary data
  3. HTTP Request - Sends file to ParserData API with extraction prompt
  4. Data Transformation - Cleans and formats the API response
  5. Google Drive (Upload) - Saves extracted JSON back to Drive
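Module 4's cleanup step can be illustrated as follows. The response shape assumed here (a `data` wrapper, null values for fields the API could not find) is our assumption, so check it against the actual ParserData output before relying on it:

```python
import json

def clean_response(api_response):
    """One possible cleanup, in the spirit of the transformation module:
    unwrap the extracted fields, drop nulls, pretty-print as JSON."""
    fields = api_response.get("data", api_response)
    cleaned = {k: v for k, v in fields.items() if v is not None}
    return json.dumps(cleaned, indent=2, ensure_ascii=False)
```

The returned string is what would be handed to the final Google Drive module as the JSON file's content.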

API configuration details

HTTP request module settings

  • Method: POST
  • URL: https://api.parserdata.com/v1/extract
  • Headers:
    • X-API-Key: Your API key (use environment variable)
    • Content-Type: multipart/form-data
  • Body (form-data):
    • prompt: "Extract invoice number, invoice date, supplier name, total amount, and line items (description, quantity, unit price, net amount)."
    • options: {"return_schema":false,"return_selected_fields":false}
    • file: Binary file from Google Drive module
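Outside Make.com, the same request could be assembled like this. It is a sketch in the shape Python's `requests` library expects for a multipart POST; nothing is sent, and the MIME type for the file part is an assumption (note that `requests` sets the multipart `Content-Type` boundary itself, so you should not hardcode that header):

```python
import json

API_URL = "https://api.parserdata.com/v1/extract"

def build_extract_request(file_bytes, filename, api_key):
    """Assemble (url, headers, data, files) for the multipart POST
    described above, in the shape `requests.post` expects."""
    headers = {"X-API-Key": api_key}
    data = {
        "prompt": ("Extract invoice number, invoice date, supplier name, "
                   "total amount, and line items (description, quantity, "
                   "unit price, net amount)."),
        "options": json.dumps({"return_schema": False,
                               "return_selected_fields": False}),
    }
    files = {"file": (filename, file_bytes, "application/pdf")}
    return API_URL, headers, data, files
```

With `requests` installed, sending would then be `requests.post(url, headers=headers, data=data, files=files)`.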

Error handling & retry logic

The scenario includes:

  • HTTP status code checking - Retries on 429 (rate limit) and 5xx errors
  • Exponential backoff - Wait times increase with each retry
  • Maximum 3 retries - After which the scenario logs an error
  • Error notifications - Can be extended to send email/Slack alerts
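The retry policy above can be expressed compactly. This is a sketch, not the scenario's actual module configuration; the exact retryable status set and base delay are assumptions:

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}

def call_with_retries(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `send()` (which returns (status, body)), retrying on 429 and
    5xx with exponential backoff, giving up after `max_retries` retries."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_retries:
            break
        sleep(delay)
        delay *= 2          # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError(f"giving up after {max_retries} retries (HTTP {status})")
```

Injecting `sleep` as a parameter keeps the backoff testable; in production the default `time.sleep` is used.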

Customizing the extraction

To extract different fields:

  1. Edit the HTTP module's prompt parameter
  2. Adjust the data transformation module to handle different response structures
  3. Update the output JSON filename pattern if needed

Example prompts:

  • Receipts: "Extract merchant name, transaction date, total amount, tax amount, payment method, and items purchased."
  • Bank statements: "Extract account number, statement period, opening balance, closing balance, transactions (date, description, amount, type)."
  • Purchase orders: "Extract PO number, supplier, order date, delivery address, line items (product code, description, quantity, unit price, total)."
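If one scenario needs to handle several document types, the prompts above can live in a small lookup keyed by type. The mapping and function below are our illustration, not part of the blueprint:

```python
# Extraction prompts per document type (from the examples above).
PROMPTS = {
    "invoice": ("Extract invoice number, invoice date, supplier name, "
                "total amount, and line items (description, quantity, "
                "unit price, net amount)."),
    "receipt": ("Extract merchant name, transaction date, total amount, "
                "tax amount, payment method, and items purchased."),
    "bank_statement": ("Extract account number, statement period, opening "
                       "balance, closing balance, transactions (date, "
                       "description, amount, type)."),
}

def prompt_for(doc_type):
    """Return the extraction prompt for a document type, falling back
    to the invoice prompt for unknown types."""
    return PROMPTS.get(doc_type, PROMPTS["invoice"])
```

The document type could come from the input folder name or a filename convention, which keeps the HTTP module itself unchanged.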

Troubleshooting

Common issues

  1. Authentication Errors: Re-authenticate Google Drive connections
  2. API Key Issues: Verify API key is active and has sufficient credits
  3. Folder Permissions: Ensure Google Drive folders are accessible
  4. File Size Limits: ParserData API has file size limits (check documentation)
  5. Timeout Errors: Increase timeout in HTTP module (default: 300 seconds)

Debugging

  1. Use "Run once" mode to test step-by-step
  2. Check each module's output by clicking the "i" icon
  3. Review Make.com execution logs
  4. Test API directly with cURL: curl -X POST -H "X-API-Key: YOUR_KEY" -F "file=@document.pdf" -F "prompt=Extract fields" https://api.parserdata.com/v1/extract

Security best practices

  1. Use Environment Variables: Never hardcode API keys
  2. Least Privilege: Google Drive connections should have minimal necessary permissions
  3. Audit Logs: Regularly review Make.com execution logs
  4. Rotate API Keys: Periodically rotate your ParserData API keys
  5. Data Retention: Configure Google Drive to automatically clean up old files if needed

Performance considerations

  • Polling Frequency: Balance between real-time processing and API usage costs
  • Batch Processing: For high volumes, consider batching files
  • Concurrency: Make.com allows multiple concurrent executions (check your plan limits)
  • API Rate Limits: ParserData API has rate limits (check documentation)

Perfect for

This scenario is designed for real-world automation and data extraction:

  • Invoice Processing: Automatically extract invoice numbers, due dates, line items, prices, and totals
  • Accounting Automation: Reduce manual data entry by converting financial documents into structured JSON
  • ERP Ingestion Pipelines: Feed clean, structured data directly into ERP systems
  • CRM Data Enrichment: Extract customer, order, or transaction data from documents
  • Back-Office Operations: Streamline repetitive document handling tasks
  • Financial Reporting: Prepare structured data for dashboards and analytics
  • Startups and Small Teams: Replace manual workflows with AI-driven automation

Production deployment recommendations

For production environments, consider these additional best practices:

  1. API Monitoring: Set up monitoring for the scenario execution frequency and error rates
  2. Alerting: Configure Make.com error notifications or integrate with monitoring services
  3. Logging: Enable detailed logging to track document processing volume and success rates
  4. Rate Limiting: Be aware of ParserData API rate limits and adjust polling frequency accordingly
  5. Backup: Regularly backup your scenario configuration and environment variables
  6. Testing: Create a testing environment with separate Google Drive folders and API keys
  7. Documentation: Maintain internal documentation for your team on workflow changes

API versioning

This example uses the ParserData v1 API (the /v1/extract endpoint above). Always check the official API documentation for updates.

Note: API endpoints and response formats may change in future versions. Subscribe to ParserData announcements to stay informed.

Security considerations

  1. API Key Rotation: Rotate your ParserData API keys periodically (every 90-180 days)
  2. Access Auditing: Review who has access to your Make.com scenarios and Google Drive folders
  3. Network Security: Ensure your network allows outbound HTTPS connections to api.parserdata.com
  4. Data Classification: Classify the sensitivity of documents being processed and apply appropriate controls
  5. Compliance: Ensure the workflow complies with relevant regulations (GDPR, HIPAA, etc.) based on your use case

Scaling considerations

For high-volume document processing:

  1. Batch Processing: Modify the scenario to process files in batches to reduce API calls
  2. Concurrent Executions: Adjust Make.com plan limits for concurrent scenario executions
  3. File Size Limits: Be aware of ParserData API file size limits for different document types
  4. Queue Management: Implement a queuing mechanism if processing many files simultaneously
  5. Performance Monitoring: Track processing time per document to identify bottlenecks

License

MIT

Need help or custom setup?

This repository is a reference example. If you need help tailoring it to your workflow, or want advice on more advanced ParserData API integration (custom schemas, scale, or production use), reach out to: support@parserdata.com

Contributing

Found a bug or have an improvement? Submit a pull request or open an issue.
