Blue Shield of California - MRF File Ingestion and Visualization
- Tech Stack: Google Cloud Dataflow, Google BigQuery, Apache Beam, Python, Extract, Transform, Load (ETL), Continuous Integration and Continuous Delivery (CI/CD)
I contributed to the development and implementation of a solution for ingesting and analyzing publicly available, TIC (Transparency in Coverage) compliant MRFs (machine-readable files) published by competitors (Anthem, Aetna, Centene, and Cigna). The primary objectives of the project were to enable data-driven decision-making and to provide cost transparency for both internal stakeholders and members.
Key Roles and Responsibilities:
- Implemented ETL pipelines to automatically ingest data from 5 external data sources and 1 internal source.
- Developed a provider group decision tree for seamless ingestion of MRF files.
- Set up the BigQuery data schema to support data upload and integration from internal and external data sources using Dataflow.
- Set up connectivity for a one-time historical migration and for ongoing integration.
- Developed ETL pipelines to support ongoing incremental data loads.
- Implemented error handling and reprocessing of failed records.
- Developed data quality tests and alerting to detect data issues.
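
The error handling, reprocessing, and data quality items above can be illustrated with a minimal sketch in plain Python. It is not the project's code: the field names (`billing_code`, `negotiated_rate`, `npi`), the `LoadResult` container, and the `ingest`, `validate`, and `reprocess` helpers are all hypothetical, chosen only to show the dead-letter pattern. In an Apache Beam pipeline the same split would typically be expressed with a separate output for failed records rather than in-memory lists.

```python
import json
from dataclasses import dataclass, field

# Hypothetical required fields for one negotiated-rate record (assumption).
REQUIRED_FIELDS = ("billing_code", "negotiated_rate", "npi")


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)       # records that passed all checks
    dead_letter: list = field(default_factory=list)  # raw input plus the reasons it failed


def validate(record):
    """Return a list of data quality issues; an empty list means the record is clean."""
    issues = [f"missing {name}" for name in REQUIRED_FIELDS
              if record.get(name) in (None, "")]
    rate = record.get("negotiated_rate")
    if rate not in (None, ""):
        try:
            if float(rate) < 0:
                issues.append("negative negotiated_rate")
        except (TypeError, ValueError):
            issues.append("non-numeric negotiated_rate")
    return issues


def ingest(raw_lines):
    """Parse newline-delimited JSON, keeping failed records instead of dropping them."""
    result = LoadResult()
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            result.dead_letter.append({"raw": line, "errors": [f"invalid JSON: {exc.msg}"]})
            continue
        if not isinstance(record, dict):
            result.dead_letter.append({"raw": line, "errors": ["record is not a JSON object"]})
            continue
        issues = validate(record)
        if issues:
            result.dead_letter.append({"raw": line, "errors": issues})
        else:
            result.loaded.append(record)
    return result


def reprocess(dead_letter, repair):
    """Re-run ingestion on dead-letter entries after a caller-supplied repair of the raw text."""
    return ingest(repair(entry["raw"]) for entry in dead_letter)
```

Keeping the raw input alongside the failure reasons means a failed record can be repaired and replayed through the same checks, and the size of the dead-letter output gives a simple signal to alert on.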