Blue Shield of California - MRF Ingestion and Visualization

  • Tech Stack: Google Cloud Dataflow, Google BigQuery, Apache Beam, Python, Extract, Transform, Load (ETL), Continuous Integration and Continuous Delivery (CI/CD)

I contributed to the development and implementation of a solution for ingesting and analyzing publicly available, TiC (Transparency in Coverage) compliant MRFs (machine-readable files) published by competitors, including Anthem, Aetna, Centene, and Cigna. The primary objectives of the project were to enable data-driven decision-making and provide cost transparency for both internal stakeholders and members.

Key Roles and Responsibilities:

  1. Implemented ETL pipelines to automatically ingest data from 5 external data sources and 1 internal source.
  2. Developed a provider group decision tree for seamless ingestion of machine-readable files (MRFs).
  3. Set up the BigQuery schema to support data upload and integration from internal and external data sources using Dataflow.
  4. Set up connectivity for a one-time historical migration and for ongoing integration.
  5. Developed ETL pipelines to support ongoing incremental data loads.
  6. Implemented error handling and reprocessing of failed records.
  7. Developed data quality tests and alerting to detect data issues.
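
As a minimal sketch of the error-handling and data-quality items above (an illustration only, not the project's actual code), the snippet below separates valid records from failed ones so the failures can be stored and reprocessed later. The flattened record shape and the field names `payer`, `billing_code`, and `negotiated_rate` are assumptions made for this example, as are the helper names `parse_record`, `split_records`, and `failure_rate`.

```python
import json
from typing import Any

# Hypothetical flattened record schema, assumed for illustration only.
REQUIRED_FIELDS = ("payer", "billing_code", "negotiated_rate")


def parse_record(line: str) -> dict[str, Any]:
    """Parse one JSON line and validate it; raise ValueError on bad input."""
    record = json.loads(line)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(record, dict):
        raise ValueError("record is not a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    rate = record["negotiated_rate"]
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(rate, bool) or not isinstance(rate, (int, float)) or rate < 0:
        raise ValueError(f"invalid negotiated_rate: {rate!r}")
    return record


def split_records(lines):
    """Return (good, dead_letter); failed lines keep their raw text and error."""
    good, dead_letter = [], []
    for line in lines:
        try:
            good.append(parse_record(line))
        except ValueError as exc:
            dead_letter.append({"raw": line, "error": str(exc)})
    return good, dead_letter


def failure_rate(good, dead_letter):
    """Fraction of records that failed validation; 0.0 for an empty batch."""
    total = len(good) + len(dead_letter)
    return len(dead_letter) / total if total else 0.0
```

In an Apache Beam pipeline, the same split could plausibly be expressed with tagged outputs, writing the dead-letter records to their own table for reprocessing, and an alert could fire when `failure_rate` exceeds a chosen threshold.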