
Complex Document Management System Build

This entry was posted on Jun 21 2018

A new client whose job is to audit water management systems approached me to build a complex document management system to automate an existing process that required a lot of manual labour. The audits, carried out at varying intervals, result in written reports that need to be kept for future reference, and notifications need to be sent when the system determines that a report is late. In point form, the major requirements of the new system were:

  • Allow a new site (such as a building) to be added/updated with the following attributes:
    • List of auditing companies for the given site
    • Report types and their required intervals
    • Email credentials
  • Include an interface to create a report scanner, which is used to determine the report type by scanning an incoming report (PDF, XLS or DOC file) and scraping metadata such as the report date and report type. A scanner has to be created for each report type for each auditing company.
  • Automatically download from the mail server any audits emailed by any number of reporters for the given site (e.g. a building).
  • Scan the report using the report scanner, give it a filename that includes the relevant metadata and move it to a CDN.
  • Have a reporting page for each site to show the latest reports and whether they were received on time, received late or are overdue.
  • Run a separate Nextcloud service that makes the reports available to users through their preferred Nextcloud client (web browser or mobile app), so that any facility manager can easily see and search the reports for the locations they are responsible for.
  • Include a cronjob to look for any outstanding reports and send notifications to the facility manager for the given location.
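
To make the on-time, late and overdue requirement concrete, here is a minimal Python sketch of how the check could work. It is not the code from the project (which was written in PHP), and the definitions are my assumptions for illustration: a report is due one interval after the previous report, "late" means it arrived after the due date, and "overdue" means it has not arrived and the due date has passed. The names ReportRequirement, report_status and outstanding are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class ReportRequirement:
    """Hypothetical record: one report type a site must receive."""
    report_type: str
    interval_days: int  # required interval between two reports


def report_status(req: ReportRequirement,
                  previous_report: date,
                  received: Optional[date],
                  today: date) -> str:
    """Classify the report that should follow `previous_report`.

    Assumed rule: the next report is due `interval_days` after the previous one.
    """
    due = previous_report + timedelta(days=req.interval_days)
    if received is not None:
        return "on-time" if received <= due else "late"
    # Nothing received yet: either still within the interval or overdue.
    return "overdue" if today > due else "pending"


def outstanding(requirements: List[ReportRequirement],
                last_received: Dict[str, date],
                today: date) -> List[str]:
    """Return the report types that should trigger a notification.

    `last_received` maps a report type to the date of its most recent report.
    """
    return [
        req.report_type
        for req in requirements
        if report_status(req, last_received[req.report_type], None, today) == "overdue"
    ]
```

A daily job could call outstanding() once per site and send one notification per report type returned.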

Taking all the requirements into account, I determined that AWS was the best platform for the job: I could easily host both the application and Nextcloud there, use S3 for the CDN, use SNS for sending notifications and place the servers behind load balancers.

Ansible was used to provision the local development environments as well as the EC2 instances on AWS for the application servers and Nextcloud servers, all of which ran Debian.

I chose the Yii2 framework to build the application for several reasons, but mainly because I could easily scaffold each form and add the validation and business logic required. A custom component was written to retrieve new documents from the email server and run them through the report scanner to extract their metadata. The documents were named according to their metadata and then moved across to an S3 bucket. If a document's type could not be determined, the document was copied to a folder to be fixed manually. A cronjob runs daily to scan the S3 folder for each report type, checks what it finds against the location’s report interval settings to determine whether any reports are overdue, and sends notifications if any are found.
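
The scan-and-rename step can be sketched roughly as follows. This is an illustration in Python rather than the Yii2 component itself, and it assumes the text has already been extracted from the PDF, XLS or DOC file. The rule format (one regular expression for the report type and one for the date, per auditing company), the key layout and the "unmatched" folder name are all assumptions of mine, as are the names ScannerRule, ScanResult, scan_report and target_key.

```python
import os
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass
class ScannerRule:
    """Hypothetical scanner definition for one company / report type pair."""
    company: str
    report_type: str
    type_pattern: str  # regex that identifies the report type
    date_pattern: str  # regex whose first group captures the report date
    date_format: str   # strptime format of the captured date


@dataclass
class ScanResult:
    company: str
    report_type: str
    report_date: date


def scan_report(text: str, rules: List[ScannerRule]) -> Optional[ScanResult]:
    """Return metadata from the first rule that matches, or None if none does."""
    for rule in rules:
        if not re.search(rule.type_pattern, text, re.IGNORECASE):
            continue
        found = re.search(rule.date_pattern, text, re.IGNORECASE)
        if not found:
            continue
        try:
            parsed = datetime.strptime(found.group(1), rule.date_format).date()
        except ValueError:
            continue  # date text did not fit the expected format
        return ScanResult(rule.company, rule.report_type, parsed)
    return None


def slug(value: str) -> str:
    """Lower-case a label and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def target_key(site: str, result: Optional[ScanResult], original_name: str) -> str:
    """Build the storage key; unrecognised files go to a manual-review folder."""
    if result is None:
        return f"{slug(site)}/unmatched/{original_name}"
    extension = os.path.splitext(original_name)[1].lower()
    return (f"{slug(site)}/{slug(result.report_type)}/"
            f"{result.report_date.isoformat()}_{slug(result.company)}{extension}")
```

Putting the report date and type into the key means the daily job only has to list keys to work out when each report type was last received, without opening any of the files.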

The client is really happy with the application, as it has taken away manual work and now notifies them when reports are late, which wasn’t possible before.
