The Role of Data Equity in Achieving Health Equity

Milbank State Leadership Network
Focus Area: State Health Policy Leadership, Health Equity
Dr. Ninez Ponce

Dr. Paris "AJ" Adkins-Jackson

To celebrate its 100th year, The Milbank Quarterly has published a centennial anniversary issue with 36 articles that consider the future of population health. In this Q&A with Ninez Ponce, PhD, MPP, of the University of California, Los Angeles (UCLA) Center for Health Policy Research, and Paris “AJ” Adkins-Jackson, PhD, MPH, of the Columbia University Mailman School of Public Health, we discuss their contribution to the issue, “Making Communities More Visible: Equity-Centered Data to Achieve Health Equity,” coauthored with Riti Shimkhada, PhD, MPH, of the UCLA Center for Health Policy Research.

Their article argues that structural and systemic biases in health data systems render certain communities “invisible” and excluded from policy decisions, contributing to health inequities. The authors define data equity, discuss policy recommendations from the Biden administration’s Equitable Data Working Group Report, and highlight the need for community-centered data.

What is data equity?

Ponce: Data equity is about putting marginalized voices front and center. Data systems are limited to broad racialized categories and don’t measure the more granular, more precise identities of people from subgroups or intersections among racialized groups, sexual orientation groups, gender identity groups, and disability groups.

Adkins-Jackson: To acknowledge and then address systemic issues, biases, and inequities, you have to acknowledge and address that people are invisible in the data that we use. We group people together, or we might decide not to capture the data because the data points are just too small. Not only is this systemic bias, but it also sets the tone for racist measurement. What we’re asking is to turn that on its head. Acknowledge that these groups exist, and after you acknowledge that they exist, start to capture their experiences with inequity. Then you can address it because you see it.

Please describe the Equitable Data Working Group and what it set out to do.

Ponce: I’ve been in this data equity business for over 30 years, and I’m just thrilled that there’s momentum now led by the federal government. In 2022, the Biden administration’s Equitable Data Working Group released recommendations in the following areas: disaggregated data, underused data, capacity for equity assessment, partnerships across government and research communities, and accountability to the public. Our paper looks at these key priority areas, the selected recommendations, and some of the identified tasks.

In disaggregated data, the monumental change right now is revising the Office of Management and Budget (OMB) Statistical Policy Directive 15 that puts forth suggestions for racialized group categories. The last time that it was revised was in 1997. The five current racialized categories are American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. The Latino category is an ethnic overlay over these five racialized groups. One of the proposed revisions is to include Middle Eastern and North African in the minimum set of categories, and to add the Hispanic or Latino identity as one of these racialized groups. There are also suggestions for a more detailed set of categories where subgroups are included in forms that elicit self-report of somebody’s identity. We fully endorse more disaggregation of racialized groups.

I think the most important piece is accountability to the public, where you increase transparency in serving marginalized communities, as well as build access, like through data-friendly tools, to make it not so expensive or so hard to get restricted data.

What does disaggregating data mean, and why is data visibility so important?

Ponce: Disaggregated data means unlocking the truths that are behind a lumped identity. The average aggregation trumps detection of the pain or the assets of the groups underneath this lumped identity. Even though the current OMB Directive 15 mandates that Asians be disaggregated from Native Hawaiians and Pacific Islanders, when the Native Hawaiian Pacific Islander Data Policy Lab at UCLA was looking state by state and jurisdiction by jurisdiction at COVID data, over 20 states were still not disaggregating. Lumping Native Hawaiians and Pacific Islanders with Asians hid the high death rates and case rates of COVID-19 for the Native Hawaiian and Pacific Islander group. Disaggregation increases data quality because it gets at this more precise inference of what’s happening with different groups in the US population.
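The arithmetic behind this masking effect can be sketched in a few lines of Python. The counts below are invented purely for illustration and are not real COVID-19 figures; the function name `rate_per_100k` is likewise a hypothetical helper. The sketch only shows how a pooled rate can sit close to the larger group's rate while a much higher rate in the smaller group disappears.

```python
# Hypothetical counts, invented for illustration only (not real COVID-19 data).
groups = {
    "Asian": {"population": 950_000, "deaths": 475},
    "Native Hawaiian and Pacific Islander": {"population": 50_000, "deaths": 125},
}


def rate_per_100k(deaths, population):
    """Crude death rate per 100,000 people."""
    return deaths * 100_000 / population


# Pooled ("lumped") rate: add up both groups, then compute a single rate.
pooled_deaths = sum(g["deaths"] for g in groups.values())
pooled_population = sum(g["population"] for g in groups.values())
pooled_rate = rate_per_100k(pooled_deaths, pooled_population)

# Disaggregated rates: one rate per group.
disaggregated_rates = {
    name: rate_per_100k(g["deaths"], g["population"])
    for name, g in groups.items()
}

print(f"Pooled rate: {pooled_rate:.1f} per 100,000")
for name, rate in disaggregated_rates.items():
    print(f"{name}: {rate:.1f} per 100,000")
```

With these made-up numbers, the pooled rate is 60 per 100,000, close to the larger group's 50, while the smaller group's rate of 250 is visible only after disaggregation.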

Adkins-Jackson: I always think of social contracts. In my social contract with my country, I pay taxes. And for those taxes, I expect public amenities, I expect access to a good quality life. A lot of those societal functions are based on the data that the government has available for groups. If we’ve made you invisible in the data, how then can you advocate for your communities, for your needs? [Data] has such downstream implications for groups. If you can take my dollar, then you need to tell my story.

Another recommendation from the working group is about developing metrics on racism. Why is it important to develop metrics on racism?

Ponce: You need to have structural racism measures and not just look at the disparities outcomes. That’s why you need to disaggregate. You can’t have health equity without data equity.

Adkins-Jackson: Capturing structural racism and other structural and social determinants of health is capturing the exposure. What health disparities [research] does is capture the result of that exposure. Depending on how you see the relationship between the two, you might think that when you capture the outcome, which is the disease outcome or the health experience, that disparity encapsulates that exposure. And I think that is true, but I also think that it’s important to tell the complete story. If the distinction is not clear, when combined with the non-scientific categories, you end up with the illusion that there’s a biological basis for disparities, which is even more dangerous.

The COVID-19 pandemic highlighted the weaknesses of data collection. Why is it important to collect and combine both place-based and population-level data?

Ponce: You need both to triage and direct public policy resources [during an emergency]. Place-based is a quick way of deploying these resources. In some states, social vulnerability indices (SVI), which are multi-dimensional place-based measures, were used to prioritize where vaccine pop-ups would be, and to determine the allocation of resources to certain counties, community groups, and community organizations. For smaller groups that are dispersed, this geographic-based targeting misses out on the risk of groups that may not be in the so-called worst quintile or quartile where we’re targeting resources. Place-based data have value because you can immediately target places, but it must be augmented for populations that may not necessarily live in those areas that are the legacy of residential segregation.
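The limitation of purely place-based targeting can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the ten areas, their vulnerability scores, and the resident counts are invented, the score is a generic stand-in rather than any official index, and `top_quintile` is a hypothetical helper. The sketch ranks areas by score, targets the top fifth, and then checks what share of a small, geographically dispersed group actually lives in the targeted areas.

```python
# Hypothetical areas: (name, vulnerability score from 0 to 1,
# number of residents belonging to a small, dispersed group).
# All values are invented for illustration.
areas = [
    ("A", 0.95, 40), ("B", 0.80, 30), ("C", 0.65, 60), ("D", 0.60, 50),
    ("E", 0.55, 70), ("F", 0.40, 80), ("G", 0.35, 60), ("H", 0.30, 40),
    ("I", 0.20, 50), ("J", 0.10, 20),
]


def top_quintile(areas):
    """Return the fifth of areas with the highest vulnerability scores."""
    ranked = sorted(areas, key=lambda area: area[1], reverse=True)
    cutoff = max(1, len(ranked) // 5)
    return ranked[:cutoff]


targeted = top_quintile(areas)
targeted_names = [name for name, _, _ in targeted]

# Share of the dispersed group that lives in the targeted areas.
group_total = sum(residents for _, _, residents in areas)
group_reached = sum(residents for _, _, residents in targeted)
share_reached = group_reached / group_total

print(f"Targeted areas: {targeted_names}")
print(f"Share of group reached: {share_reached:.0%}")
```

In this invented example, targeting the two highest-scoring areas reaches only 14% of the dispersed group, which is the kind of gap that population-level data would be needed to reveal.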

Adkins-Jackson: We use easy targets like location as a way to understand an experience for a group of people. I often ask my colleagues what is more important: an exposure at home, an exposure at work, or an exposure going in between? Because for some racialized groups, they’ve never been mostly in their neighborhoods, they’ve always commuted out for work. So, while you’re looking at their neighborhood exposure, they were exposed on the drive to work, and you missed it because you didn’t capture the intersection of all these parts.

Your article closes with the concept of community-centered data and how that can help build community trust. What will it look like when communities have access to their data?

Ponce: If we’re trying to generate knowledge and evidence for populations and communities of interest, particularly marginalized communities, then they must be alongside us every step of the way. It’s not just creating a community advisory board and only asking them to give feedback at the beginning and the end of the project, or reaching out to community groups so they can help spread the word and disseminate our study. Data equity is the process and the product, but the big P is the process.

Adkins-Jackson: My colleagues and I have been joking about whether ChatGPT and other AI resources are going to replace us as data analysts. But AI cannot replace the community because you won’t understand context, you won’t understand meaning, you won’t understand exposure, you won’t understand impact, you won’t understand anything.

Ponce: We’ve also got to build the data capacity of these communities and community organizations. Let us make sure that the grants that we have invest in the data consumption capacity and data production capacity of organizations. You have to build the infrastructure in these community organizations, and that includes encouraging members of these organizations to be part of our pipeline of academics and data producers.

Adkins-Jackson: Accountability is being so connected to the community that they are parts of your institution and organization. Without them, you can’t even get your basic processes or products completed. And you can’t have data equity without the people at the table.