Key outcomes: reduction in manual work, improved accuracy, and faster generation of reports.
In today’s fast-paced business environment, financial reviewers must streamline their processes and make the best use of their review time while maintaining quality of service.
The existing process relied on the implicit knowledge and skills of individual reviewers, which posed a challenge to the company’s ability to scale.
Hence, there was a pressing need for a more efficient and automated process that could address these challenges and help financial reviewers make informed decisions quickly.
Our client, a global finance-as-a-service company, decided to address these challenges through a combination of software engineering and state-of-the-art machine learning models that would learn to recognize patterns and flag anomalies automatically. The solution eliminated the need for manual reviewers.
This helped our client scale and add new business without being limited by the training bottleneck of onboarding financial reviewers.
The existing review process posed significant challenges, particularly in deciding on a threshold for flagging expenses that were lower or higher than in previous months. Because the decision was made by humans, it varied from entity to entity, and across departments, categories, and even individual employees, vendors, service providers, and GL accounts within an entity; it also varied within a month and from month to month. Despite attempts to create labeling rules, there was no agreement on how labeling should be done, resulting in significant variation in the labeling process. An initial if-else ruleset was rolled out with 16 rules for 5 labels, but exception cases caused the number of rules to grow exponentially. An automated machine learning (ML) solution was therefore needed to absorb these variations in the review process and support faster, more informed decisions.
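To illustrate why a hand-maintained ruleset grows unmanageable, here is a minimal sketch of such a rule-based labeler. The rules, labels, and thresholds below are hypothetical illustrations, not the client’s actual 16 rules:

```python
def label_expense(amount: float, prev_avg: float, category: str) -> str:
    """Hypothetical rule-based labeler: every exception needs a new branch."""
    change = (amount - prev_avg) / prev_avg if prev_avg else 1.0
    if category == "travel" and change > 0.50:
        return "REVIEW"       # travel spikes are common, so a higher threshold
    if category == "rent" and abs(change) > 0.05:
        return "ANOMALY"      # rent should be nearly constant month to month
    if change > 0.25:
        return "REVIEW"       # generic over-spend threshold
    if change < -0.25:
        return "REVIEW"       # generic under-spend threshold
    return "OK"               # ...and so on, one new branch per exception case

print(label_expense(1600, 1000, "travel"))  # -> REVIEW
```

Each exceptional entity, department, or vendor demands another branch, which is exactly the exponential rule growth described above; a learned model replaces the branches with patterns inferred from reviewer behavior.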
thinkbridge partnered with the client to design and implement a custom, cloud-native machine learning model that kept key financial data from being exposed to third-party systems. A key part of the work was engineering a data collection mechanism that captured historical data along with reviewer actions, which could then be used to train the ML model.
The Solution involved three stages:
- Business Understanding: First, we understood the current decision-making process used by reviewers and defined the end output that needed to be predicted. We identified the important attributes or features that were used to predict the required output, such as entity, department, category, employee/vendor/service provider/GL account, and variation from previous months.
- Model Building: We provided easy-to-use controls in the user interface that let users set the required output and attributes, which were stored in the system for further analysis and training. Based on the type of data and the prediction requirements, we selected the algorithm best suited to the dataset, a boosted-tree model, and trained it on the collected data to produce the required predictive model.
- Maintenance: To ensure the model was accurate and generalized to future datasets, we repeated the training process for a minimum of three months to collect enough data and variation. We measured the model’s accuracy on new datasets using different combinations of feature attributes, and retrained it as new variations appeared in the data so it continued to provide accurate predictions. To keep the model self-learning, we built a retraining loop that lets users flag incorrect predictions so the model learns from its mistakes.
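The three stages above can be sketched end to end. This uses scikit-learn’s `GradientBoostingClassifier` as a stand-in for the production boosted-tree model, with synthetic data in place of the captured reviewer records; the feature layout mirrors the attributes named above but is otherwise an assumption:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for captured reviewer data: encoded entity/department/
# category plus month-over-month variation, labeled by reviewer action.
n = 2000
X = np.column_stack([
    rng.integers(0, 5, n),       # entity (label-encoded)
    rng.integers(0, 10, n),      # department
    rng.integers(0, 8, n),       # category
    rng.normal(0, 0.3, n),       # variation vs. previous month
])
y = (np.abs(X[:, 3]) > 0.25).astype(int)  # 1 = flag for review (toy rule)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print(f"accuracy: {accuracy_score(y_te, model.predict(X_te)):.2f}")

# Retraining loop: fold reviewer-corrected predictions back into the
# training set and refit, so the model keeps learning from its mistakes.
corrections_X, corrections_y = X_te[:50], y_te[:50]
model = GradientBoostingClassifier().fit(
    np.vstack([X_tr, corrections_X]),
    np.concatenate([y_tr, corrections_y]),
)
```

In production the refit would run on a schedule (e.g. from an Azure Function) against the full corrected history rather than in-process as shown here.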
The Technical Stack
- Azure SQL Database
- Azure Cosmos DB
- Azure Functions
- Python, NumPy, pandas, scikit-learn, XGBoost
The client was able to completely eliminate manual reviewers from the workflow, freeing team members to add value in other areas. The machine learning model was deployed to production with an initial accuracy of over 90%, and over the course of three months its accuracy reached 98%. As new clients are added, the model continually updates itself to new data patterns.