From SAS to Databricks: A Successful Migration of Credit Risk Models

A major Dutch mortgage bank engaged Pipple to migrate its credit risk models from SAS to Databricks and PySpark. The migration significantly improved performance, scalability, and ease of maintenance, and integrated the models with Azure DevOps and MLflow. This case study outlines the challenges, approach, and results of the migration, and how Pipple helped future-proof the bank's credit risk modeling.

Situation and Challenge

The Dutch mortgage bank faced the challenge of migrating its existing credit risk models, built in SAS, to a more modern and flexible solution. While SAS is a powerful platform, it is also costly and offers limited technical support. Within the financial sector, there is a clear trend towards Databricks and PySpark due to their scalability, flexibility, and seamless cloud integration.

Why the Transition to Databricks?

Databricks provides a flexible and scalable Platform-as-a-Service (PaaS) that grows with usage, enabling more efficient cost management. Its integration with cloud-based tools such as Azure DevOps, MLflow, and Artifactory simplifies collaboration and model management.

Unlike SAS, Databricks supports multiple programming languages, including Python, SQL, R, and Scala, allowing teams to work more flexibly and efficiently. Apache Spark enables fast processing of large datasets, while the platform's AI and machine learning capabilities contribute to faster model development and management.

Objectives and Requirements

The key objectives of the project were improved performance, easier maintenance, and enhanced scalability. The ambition was to significantly increase the speed of the credit risk models. Maintenance also needed to become more efficient: integrating Git and MLflow for version control would make the platform more accessible to new employees.

A specific requirement was that users could run the entire model pipeline with a single click. In addition, strict privacy and data security regulations had to be followed, which meant that all data had to remain within the platform.
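To illustrate what a single-click run can look like in code, here is a minimal sketch in plain Python. It is not the bank's implementation: the stage names (load_data, build_features, flag_loans), the sample loans, and the loan-to-value rule are all hypothetical and only show how one entry point can chain every step of a pipeline.

```python
from typing import Callable, Dict, List, Tuple

# A stage takes the shared context and returns it with new entries added.
Stage = Callable[[Dict], Dict]


def run_pipeline(stages: List[Tuple[str, Stage]], context: Dict) -> Dict:
    """Run every stage in order; this is the single entry point a user triggers."""
    for name, stage in stages:
        print(f"running stage: {name}")
        context = stage(context)
    return context


# Hypothetical stages with made-up data, for illustration only.
def load_data(ctx: Dict) -> Dict:
    ctx["loans"] = [
        {"loan_id": "A1", "balance": 200_000.0, "collateral": 250_000.0},
        {"loan_id": "A2", "balance": 180_000.0, "collateral": 150_000.0},
    ]
    return ctx


def build_features(ctx: Dict) -> Dict:
    ctx["features"] = [
        {"loan_id": loan["loan_id"], "ltv": loan["balance"] / loan["collateral"]}
        for loan in ctx["loans"]
    ]
    return ctx


def flag_loans(ctx: Dict) -> Dict:
    # Toy rule standing in for a real model: flag loans with LTV above 1.0.
    ctx["flags"] = {f["loan_id"]: f["ltv"] > 1.0 for f in ctx["features"]}
    return ctx


PIPELINE = [
    ("load_data", load_data),
    ("build_features", build_features),
    ("flag_loans", flag_loans),
]

result = run_pipeline(PIPELINE, {})
print(result["flags"])
```

On Databricks, the same idea would typically be wired to a job or notebook trigger, with each stage reading and writing data inside the platform rather than passing in-memory dictionaries.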

Migration Process

The migration was carried out in seven phases:

  1. Refactoring existing models to a scalable Databricks environment.
  2. Testing the new implementation to ensure correct functionality.
  3. Model validation by an external team to verify accuracy and reliability.
  4. Creating a release for approved models.
  5. Handover to the platform team for deployment.
  6. Shadow run, where the new model ran in parallel with the old one to analyze differences.
  7. Production deployment, making the models fully operational.
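The shadow run in phase 6 comes down to comparing the outputs of the old and new implementations record by record. The sketch below shows one way to do that in plain Python, assuming each model produces one numeric score per loan. The function name compare_shadow_run, the loan IDs, the scores, and the tolerance are all made up for illustration and do not describe the bank's actual checks.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ShadowDiff:
    loan_id: str
    legacy: Optional[float]    # score from the old (SAS) run, None if missing
    migrated: Optional[float]  # score from the new (PySpark) run, None if missing


def compare_shadow_run(
    legacy: Dict[str, float],
    migrated: Dict[str, float],
    rel_tol: float = 1e-9,
    abs_tol: float = 1e-12,
) -> List[ShadowDiff]:
    """Return loans whose scores differ beyond tolerance or exist on one side only."""
    diffs = []
    for loan_id in sorted(legacy.keys() | migrated.keys()):
        old = legacy.get(loan_id)
        new = migrated.get(loan_id)
        missing = old is None or new is None
        if missing or not math.isclose(old, new, rel_tol=rel_tol, abs_tol=abs_tol):
            diffs.append(ShadowDiff(loan_id, old, new))
    return diffs


# Made-up scores: A2 differs only by floating-point noise, A3 differs
# materially, and A4 is absent from the legacy output.
legacy_scores = {"A1": 0.0125, "A2": 0.0310, "A3": 0.2000}
migrated_scores = {"A1": 0.0125, "A2": 0.0310000000001, "A3": 0.2100, "A4": 0.0500}

diffs = compare_shadow_run(legacy_scores, migrated_scores, rel_tol=1e-6)
for diff in diffs:
    print(diff)
```

A tolerance is used instead of exact equality because two platforms rarely agree to the last floating-point digit. At production scale, the same comparison would more likely be expressed as a full outer join between two Spark DataFrames.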

Solution

Pipple successfully migrated the bank’s credit risk models from SAS to a modern, scalable Databricks environment in PySpark. This resulted in faster processing, more efficient model management, and a future-proof data infrastructure. Thanks to our approach, the following improvements were achieved:

  • One-click deployment enhanced user-friendliness and simplified rapid model rollout.
  • Improved scalability and flexibility: The models now run in a robust Databricks environment with cloud integration in Azure.
  • More efficient maintenance and user-friendliness: The integration of MLflow and Git improved version control, making maintenance easier and the platform more accessible for new employees.
  • Support in model optimization: Pipple helped identify and resolve bottlenecks, enabling faster and more stable model execution.
  • Technical knowledge transfer and collaboration: We supported the team with expertise in Python and PySpark, code optimization, and technical coaching. By actively participating in the refactoring and working closely with the team every day, we accelerated the migration and improved the implementation.

Pipple played a key role in both the technical transition and team development. Through close collaboration with the bank, we established a solid foundation for further optimization and future innovations in credit risk modeling.

With this modern Databricks environment, the bank is prepared for the future: a flexible, efficient, and scalable credit risk modeling system.

About Pipple

What started as an ambitious startup in 2016 has grown into a thriving company with 40 dedicated employees. At Pipple, we believe in the power of data and AI to solve complex challenges and create meaningful impact. Our approach is innovative, creative, and human-centered, always striving for the best solutions for our clients.

We follow a structured process: from clearly defining the problem to building smart, scalable solutions that truly work. Together with our clients, we make data valuable and accessible, with a strong focus on both technology and user experience.

Our team consists of passionate data scientists, engineers, and strategists who push boundaries and help businesses grow. At Pipple, it’s not just about numbers—it’s about people. And that’s what makes us unique.

Contact

Questions about the latest news, events and PR?

Rob can tell you everything about our organization, mission and vision. He would love to get in touch with you!

Rob Tillemans
Commercial Director
commercie@pipple.nl