DataConnect
How For People streamlined Medicaid eligibility verification for 8.1 million eligible subscribers of FCC's Lifeline program.
The Challenge
For People partnered with Bixal and the Centers for Medicare and Medicaid Services (CMS) to modernize its Medicaid Eligibility API, which provides automated eligibility verification for the Lifeline program overseen by the Federal Communications Commission (FCC) and administered by the Universal Service Administrative Company (USAC). Due to configuration drift, technical debt, and misalignment with project engineering standards, the API suffered from outages and longer-than-expected recovery times. Our team worked with stakeholders at CMS to resolve the business and technical challenges to enhance Lifeline's system availability and reliability. These improvements ensure continued access to the benefit Lifeline provides to more than 8.1 million eligible subscribers as of June 2025.
In America, millions of low-income households rely on the Lifeline program to help pay for phone and internet services, which are critical for connecting to friends, family, jobs, emergency services and more. Consumers may qualify for the program based on a gross household income at or below 135% of the Federal Poverty Guidelines or participation in certain federal assistance programs. Originally, the Lifeline program verified eligibility using manual processes, and approving an application could take six months or more. A few years ago, several For People employees, then working for Nuna's Government Services Team, developed the Medicaid Eligibility API ("Lifeline API").
This API was the first cross-agency API for Medicaid. It enabled the Lifeline eligibility verifier to call the CMS data source, supporting near-instantaneous Medicaid eligibility verification. It also allowed analysts to iterate quickly using Databricks Notebooks and enabled rapid development via cloud-based, serverless components. In 2020, the API was so successful that 60% of Lifeline beneficiaries could be instantly verified, replacing the manual verification process. However, years later, when those same For People employees joined the new Bixal-led DataConnect project, which now oversaw the API, the system was experiencing long outages and was due for an upgrade.
When the Medicaid Eligibility API experienced an outage, it delayed eligibility verification and caused confusion for Lifeline applicants. This created blockers for applicants, raised barriers to accessing benefits, and negatively impacted the experience of those seeking government services. Poor experiences with public services reduce trust in the government and discourage people from applying for benefits and accessing critical services. It is imperative that government services are high-functioning and operational 24/7.
When the team inherited the Medicaid Eligibility API system, we experienced challenges aligning the API with the existing staff skillsets, tech stack, and engineering practices of the project. For example, team members regularly encountered issues that hindered outage recovery efforts. Often, team members identified unexplained configuration drift, where the documented system differed from the implemented system in key areas. Sometimes, we deployed changes to fix one problem, only to discover those changes triggered new problems, usually involving unmanaged resources and unknown dependencies. While our team managed API code using the AWS CloudFormation infrastructure-as-code tool, the rest of the project used Terraform. Rather than operating as an integrated part of the overall DataConnect project, the Medicaid Eligibility API we inherited ran in its own silo, isolated until incidents demanded attention from the rest of the project.
Our Approach
Business Solution
Because the API was just one of many priorities within the larger DataConnect project, it was challenging to justify investing resources in a major API overhaul rather than maintaining the status quo. To start, our team documented the challenges of meeting the API service level agreement, which specified high levels of uptime and functionality. We also showed that the costs of continuing to postpone technical debt would far outweigh the costs of a one-time migration, and that the constant diversion of DataConnect team members to incident response disrupted larger project operations. Over the course of several months and quarterly planning meetings, we worked with Bixal and DataConnect's leadership to build a business case, gain buy-in from CMS stakeholders, and achieve alignment across the five contractor teams on the project. Finally, the team's hard work paid off, and we received clearance from the CMS Product Owners to proceed with the system overhaul.
Technical Solution
Upon securing business approval, the team worked to refactor the codebase to account for configuration drift, adapt to evolving requirements, and ensure long-term stability. For People's team members were key in this process due to our deep expertise in the API. Specifically, our team refactored the API data pipeline to run within Databricks instead of Amazon EMR. First, we granted the necessary permissions to enable Databricks to use AWS resources such as DynamoDB, Parameter Store, and S3. Then we modified the ingestion code to run as functions from a library instead of processes and steps within EMR. After creating the library, our team changed the existing notebook code to call these library functions rather than sending an AWS SNS alert. Also, we re-pointed all end-to-end tests to cover the new pipeline.
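The shift from EMR processes and steps to library functions can be illustrated with a minimal sketch. This is not the project's actual code: the function names, the record fields (`subscriber_id`, `state`), and the validation rule are all hypothetical, chosen only to show how ingestion logic can live in an importable library that a Databricks notebook calls directly. Readers and writers are passed in as plain callables, so the same function can run in a notebook or in a test without any cluster.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class IngestionResult:
    """Summary returned to the caller (for example, a notebook cell)."""
    accepted: int
    rejected: int


def is_valid_record(record: dict) -> bool:
    # Hypothetical rule: both illustrative fields must be present and non-empty.
    return all(record.get(field) for field in ("subscriber_id", "state"))


def run_ingestion(
    read_records: Callable[[], Iterable[dict]],
    write_records: Callable[[List[dict]], None],
) -> IngestionResult:
    """Run the whole ingestion in-process as one ordinary function call.

    The source and sink are injected, so this sketch makes no assumption
    about where the data actually lives (S3, DynamoDB, or elsewhere).
    """
    accepted: List[dict] = []
    rejected = 0
    for record in read_records():
        if is_valid_record(record):
            accepted.append(record)
        else:
            rejected += 1
    write_records(accepted)
    return IngestionResult(accepted=len(accepted), rejected=rejected)
```

In a notebook, the call would then be a single line such as `result = run_ingestion(read_fn, write_fn)`, with the returned summary available for logging or assertions in end-to-end tests.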
After finishing the pipeline code, our team refactored the AWS Lambdas to automatically launch it as a Databricks job. Then, we migrated the API out of the AWS vendor-specific CloudFormation tool and onto the more vendor-neutral Terraform used more broadly across the project.
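As a rough sketch of how a Lambda function might launch a Databricks job, the example below posts to the Databricks Jobs API `run-now` endpoint using only the Python standard library. The environment variable names (`DATABRICKS_HOST`, `DATABRICKS_TOKEN`, `DATABRICKS_JOB_ID`) and the `trigger` parameter are assumptions made for illustration. The source does not describe how the project's Lambdas authenticate or which parameters they pass.

```python
import json
import os
import urllib.request


def build_run_now_request(
    host: str, token: str, job_id: int, params: dict
) -> urllib.request.Request:
    """Build the HTTP request that asks Databricks to start an existing job."""
    body = json.dumps({"job_id": job_id, "notebook_params": params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def lambda_handler(event, context):
    """Hypothetical Lambda entry point: trigger the pipeline job and return the API response."""
    request = build_run_now_request(
        host=os.environ["DATABRICKS_HOST"],          # assumed variable name
        token=os.environ["DATABRICKS_TOKEN"],        # assumed; a secrets store is preferable
        job_id=int(os.environ["DATABRICKS_JOB_ID"]), # assumed variable name
        params={"trigger": str(event.get("source", "unknown"))},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())
```

Keeping request construction in its own function means the URL, headers, and payload can be checked in a unit test without calling Databricks.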
Finally, we archived the API GitHub repo and migrated its code to other project repos, where appropriate. By doing so, we improved the ease of maintenance, enabled more rapid deployment of changes, and offered enhanced observability into the health and operation of the API.
Before
- Amazon EMR data pipeline
- AWS CloudFormation
- Isolated GitHub repository
- SNS-triggered processes
- 30-minute deployments
- ~3 outages per year
After
- Databricks data pipeline
- Terraform
- Integrated project repositories
- Library-based functions
- 10-minute deployments
- Zero outages
Databricks • Terraform • AWS Lambda • Python • Amazon Web Services (AWS)
The Outcomes
Previously, the Medicaid Eligibility API experienced an average of 3 outages a year, which sometimes lasted several hours or more. After the refactor, the API experienced no outages attributable to the system itself. Also, where the API used to take 30 minutes to deploy, the refactored system now deploys the pipeline in just 10 minutes. All told, we managed to implement a sustainable solution to improve the long-term stability and functionality of the Medicaid Eligibility API.
Ready to achieve similar results?
Contact us to discuss how we can help your agency.
Partner With Us