Policy Forum

Jan 2025
Peer-Reviewed

How Should Epidemiologists Respond to Data Genocide?

Abigail Echo-Hawk, MA, Sofia Locklear, PhD, Sarah McNally, MPH, Lannesse Baker, MPH, and Sacena Gurule, MPA
AMA J Ethics. 2025;27(1):E44-50. doi: 10.1001/amajethics.2025.44.

Abstract

Data quality for and about American Indian/Alaska Native (AI/AN) people is undermined by deeply entrenched, colonial practices that have become standard in US federal data systems. This article draws on cases of maternal mortality and COVID-19 to demonstrate the ethical and clinical need for inclusive, diverse, and accurate data when researching AI/AN health trends. This article further argues that epidemiologists specifically must challenge implicit bias, question methods and practices, and recognize colonial, racist reporting practices about AI/AN people that have long undermined data collection, analytical, and dissemination practices that are fundamental to epidemiological research.

First Data Gatherers

American Indian and Alaska Native (AI/AN) people were the first data gatherers in what is now called the United States.1 Indigenous communities have consistently been empirically rigorous, collecting both quantitative and qualitative data for health and well-being purposes.2 Prior to colonization, AI/AN individuals and communities had robust health, and AI/AN health care practices have long been utilized in Western medicine.3 Indigenous knowledge is still passed down through generations, despite settler colonialism, which began in the late 1400s and remains one of the most influential social determinants of AI/AN health.4,5,6 Settler colonial logic is a “logic of elimination,”4 whereby settler colonizers purposefully try to deplete and eliminate original people and their cultures through genocide. Less widely known is how settler colonial genocidal practices have influenced data.7

Data Sovereignty

Prior scholars have deemed the exclusion of AI/AN people from federal data, such as the US census, to be “statistical genocide.”7 The Urban Indian Health Institute (UIHI), a division of the Seattle Indian Health Board and the only national Tribal Epidemiology Center that serves urban dwelling AI/AN populations by providing public health support through data, research, and evaluation, considers statistical genocide to be a form of data erasure contributing directly to a larger colonial project of data genocide. Data genocide, as defined by UIHI, is “the elimination of Indigenous people in data resulting in the non-fulfillment of treaty and trust responsibilities due to ‘lack’ of data on urban and rural tribal communities.”8 Data genocide also includes the erasure of Indigenous people through aggregating data and misclassifying Indigenous people within datasets. Even when collected, any data about nation-based Indigenous people in the United States must respect federal treaty rights, a tenet of which is Indigenous data sovereignty. Indigenous data sovereignty is the right of each Tribe to exercise sovereignty over the collection, ownership, and application of data in a way that aligns with Indigenous customs, values, and ways of knowing.9 Data sovereignty extends to any health information collected about Indigenous people and must be respected to ensure that collection and use of the data align with Indigenous principles. It is guaranteed by the United Nations Declaration on the Rights of Indigenous Peoples, for which the United States announced support in 2010.10

Data Invisibility

One striking example of data genocide is the invisibility of AI/AN people in maternal mortality rates. AI/AN women, along with Black women, have some of the highest rates of pregnancy-related mortality, with a significant increase seen in 2021 associated with the COVID-19 pandemic.11 Yet this fact goes ignored in most analyses of maternal mortality rates, with AI/AN people being lumped into an “other” category, thereby erasing their racial and political identity as Indigenous and eliminating the ability to disaggregate the data and identify disparate outcomes for this group. Collapsing racial and ethnic data into an “other” category is often rationalized by small sample sizes. Yet data genocide, through individuals being racially misclassified within federal data sets, itself contributes to shrinking the sample size.7,8 Through racial misclassification, Indigenous people are made invisible while simultaneously being labeled as “other.” Consequently, calculation of maternal deaths, which are linked to the social determinants of health,12 now lies in the hands of a system that determined that AI/AN birthing people were too small a population to separate out (or to do so precisely) within statistical analyses,13,14 making invisible the reality of maternal mortality for AI/AN women. These practices are racist because they reify settler colonial power’s embeddedness in data systems, data analysis, and data dissemination by not collecting and reporting data on Indigenous people’s race and ethnicity.

This problem is avoidable. Yet it is further exacerbated by common data practices spanning collection to dissemination. The use of a single-race AI/AN category illustrates how these data practices are rooted in data genocide.8 Despite AI/AN being one of “the largest growing multi-racial groups in the United States,”15 it is common practice for government, academic, and other agencies to use only a single-race AI/AN category in their analyses, effectively shrinking the sample size of specific groups through dilution, potentially overlooking statistically significant differences, and upholding a former colonial practice by the US government to determine who was AI/AN based on blood quantum.16 There is no scientifically valid reason to use only a single-race AI/AN category in data analysis and dissemination, and, as a result of the advocacy of tribal nations, only Tribes, not the US government, can determine who is a tribal member.17 Yet statistical and other agencies continue to use this outdated, nonscientific, and colonial data practice. The authors recognize this practice as structural racism in data. To uproot this structural racism, the field of epidemiology must challenge implicit bias, question what has become standard methodological practice, and recognize the unintended and very real consequences of this practice and other colonial data practices on AI/AN and other populations, such as Pacific Islanders, impacted by ongoing colonialism.

The UIHI’s report, “Data Genocide of American Indians and Alaska Natives in COVID-19 Data,”8 which discussed AI/AN COVID-19 data reporting for all US states, illustrated the detrimental effect of the elimination of AI/AN people in data, as it resulted in misallocation of federal funds meant to address the pandemic,18 despite AI/AN people being one of the groups most detrimentally affected by the virus.19 In one of the first studies published on COVID-19 infection rates in AI/AN people, the authors were only able to include data reported to the Centers for Disease Control and Prevention for 23 states, as the rest of the nation was not reporting a minimum of 70% complete race and ethnicity data, effectively limiting the understanding of the virus in a paper that was intended to inform public health and clinical practice.19 Colonial data practices effectively prohibit researchers and clinicians from accessing the information they need to make data-driven decisions in research, policy, programming, and practice. In order to attain health equity, these practices must be challenged. For example, an individual can be an enrolled member of a federally recognized Tribe and categorized as American Indian, yet at the same time be racialized on a phenotypical level as Black or White, resulting in racial misclassification in medical records. These differences in racial reporting are crucial to capture within the data, as AI/AN patients can have significantly worse COVID-19 health outcomes, for example, than White or Black patients.20 Disaggregated data on Indigenous peoples’ Tribal and community belonging, race, and ethnicity are vital in order for researchers to fully understand the diverse and complex picture of Indigenous health.21
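The 70% completeness threshold mentioned above can be illustrated with a short sketch. This is not the method used in the cited study; the record layout, the state labels, and the rule that a record counts as complete only when both race and ethnicity are present are assumptions made for illustration, and the sample records are invented.

```python
MIN_COMPLETENESS = 0.70  # threshold mentioned in the text


def completeness(records):
    """Share of records with both race and ethnicity reported (assumed definition)."""
    if not records:
        return 0.0
    complete = sum(
        1 for r in records
        if r.get("race") is not None and r.get("ethnicity") is not None
    )
    return complete / len(records)


def states_meeting_threshold(by_state, threshold=MIN_COMPLETENESS):
    """Return the states whose reported data meet the completeness threshold."""
    return sorted(
        state for state, recs in by_state.items()
        if completeness(recs) >= threshold
    )


# Invented example data: "State A" and "State B" are placeholders, not real states.
example = {
    "State A": [
        {"race": "AI/AN", "ethnicity": "Non-Hispanic"},
        {"race": "White", "ethnicity": "Non-Hispanic"},
        {"race": "Black", "ethnicity": "Hispanic"},
        {"race": None, "ethnicity": None},
    ],
    "State B": [
        {"race": "AI/AN", "ethnicity": "Non-Hispanic"},
        {"race": None, "ethnicity": "Hispanic"},
        {"race": None, "ethnicity": None},
        {"race": "White", "ethnicity": "Non-Hispanic"},
    ],
}

print(states_meeting_threshold(example))
```

In this invented example only one of the two placeholder states would be eligible for analysis, which shows how incomplete race and ethnicity reporting removes whole jurisdictions, and the AI/AN people living in them, from a study.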

Consequences of Data Genocide

Ongoing data genocide contributes to social theories of health inequalities, like “deaths of despair,” that seek to explain why non-Hispanic White mortalities due to suicide, drug overdose, and alcoholic liver disease exceed the death rates of other racial groups while ignoring the extreme health inequity Indigenous people experience.22 In fact, the validity of such theories is challenged when data about AI/AN people are appropriately included in analyses.22 Excluding AI/AN people from the data or subsuming them (thereby rendering them invisible) under an “other” category harms not only Indigenous people themselves, as inaccurate pictures of their colonially imposed health inequity due to data genocide are presented, but also those in other racial groups, as data genocide of Indigenous people misrepresents the data and promotes misunderstanding of health inequity among persons and communities designated as “other.” While Friedman et al demonstrate that, indeed, Indigenous people are experiencing much higher rates of deaths of despair than their non-Hispanic White counterparts,23 we strongly stand against the language of “despair” when analyzing deaths of any type for any racial group. This phrasing places blame on an individual’s emotional states and emotional points of intolerance instead of framing these deaths within the uninhabitable structures that settler colonialism and capitalism created. Data genocide has implications for the lived experiences of today’s AI/AN people and communities. Genocide happening within data collection, analysis, and dissemination hides lived realities of poor health outcomes, such as the alarmingly high rates of maternal child mortality for Indigenous women, and masks the contemporary ways in which settler colonialism affects AI/AN persons’ and communities’ health.

Rethinking Data Practices

To address data genocide at a basic level within clinical data collection and analysis, several small changes can be made. UIHI’s “Best Practices for American Indian and Alaska Native Data Collection”15 recommends a myriad of best practices that are grounded in and stem from Indigenous values. This framework specifies ensuring that any data collected about AI/AN people include a multiracial category and that those people are counted in the AI/AN category during analysis. In other words, “if the AI/AN individual identifies as another race, include the individuals who are AI/AN in any combination with any other race and include those who identify as Latinx/Hispanic. In the event the definition cannot be as inclusive as stated above, the next less inclusive definition should be used, i.e. AI/AN alone.”15 International efforts led by the Māori Indigenous Sovereignty Network in New Zealand include creating a platform for Māori Tribal information managers to access existing government datasets, to which they then can add their own Tribal data and analysis; the platform is an efficient tool for merging governmental data with supplementary Tribally collected and owned data.24
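The difference between the inclusive definition quoted above and a single-race category can be sketched in code. This is a minimal illustration, not UIHI’s own implementation; the record layout, the label "AI/AN", and the function names `is_aian_inclusive` and `is_aian_alone` are hypothetical, and the sample records are invented.

```python
from dataclasses import dataclass

AIAN = "AI/AN"  # hypothetical label for the American Indian/Alaska Native category


@dataclass(frozen=True)
class Record:
    """Hypothetical record: every race a person reported, plus Hispanic ethnicity."""
    races: frozenset
    hispanic: bool = False


def is_aian_inclusive(rec):
    """AI/AN alone or in any combination with another race.

    Hispanic/Latinx ethnicity does not exclude anyone, per the quoted definition.
    """
    return AIAN in rec.races


def is_aian_alone(rec):
    """Less inclusive fallback: AI/AN reported as the only race.

    Keeping Hispanic AI/AN people in this group is an assumption of this sketch.
    """
    return rec.races == frozenset({AIAN})


# Invented example records.
records = [
    Record(frozenset({AIAN})),
    Record(frozenset({AIAN, "White"})),
    Record(frozenset({AIAN, "Black"}), hispanic=True),
    Record(frozenset({"White"})),
]

inclusive_count = sum(is_aian_inclusive(r) for r in records)
alone_count = sum(is_aian_alone(r) for r in records)
print(inclusive_count, alone_count)  # 3 1
```

In this invented sample, the single-race definition counts one AI/AN person where the inclusive definition counts three, which is the kind of sample-size shrinkage the article attributes to single-race reporting.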

The UIHI’s “Best Practices” also identifies opportunities to train staff, doctors, and data analysts on proper race data collection.15 Such training includes an understanding of race as a social construction and not as biological essentialism,25 learning about the political status of AI/AN individuals and Tribes, and understanding the impacts of racialization on health and the various ways in which these impacts must be captured in our ever-growing multiracial society. Additionally, epidemiologists must be trained on small population methodologies and Indigenous statistics26 for quantitative data analyses. Yet it isn’t just quantitative data about AI/AN people that must be meaningfully included; qualitative data must also be collected that can add rich nuance to our understandings of Indigenous health. Last, and most important, those who collect data should engage in conversations with local Tribes and urban Native communities on Indigenous data sovereignty and what data collection practices work best for their communities and geographies.

References

  1. Rodriguez-Lonebear D. Building a data revolution in Indian country. In: Kukutai T, Taylor J, eds. Indigenous Data Sovereignty: Toward an Agenda. Australian National University Press; 2016:253-274. Centre for Aboriginal Economic Policy Research Monographs; vol 38.

  2. Decolonize data: accurate data tells accurate stories. Urban Indian Health Institute. Accessed February 28, 2024. https://www.uihi.org/projects/decolonizing-data-toolkit/

  3. Redvers N, Blondin B. Traditional Indigenous medicine in North America: a scoping review. PLoS One. 2020;15(8):e0237531.

  4. Wolfe P. Settler colonialism and the elimination of the native. J Genocide Res. 2006;8(4):387-409.
  5. Wispelwey B, Tanous O, Asi Y, Hammoudeh W, Mills D. Because its power remains naturalized: introducing the settler colonial determinants of health. Front Public Health. 2023;11:1137428.

  6. McKay DL, Vinyeta K, Norgaard KM. Theorizing race and settler colonialism within US sociology. Sociol Compass. 2020;14(9):e12821.

  7. Anner J. To the US Census Bureau, Native Americans are practically invisible. Minor Trendsetter. 1990;4(1):15-21.
  8. Data genocide of American Indians and Alaska Natives in COVID-19 data. Urban Indian Health Institute. February 15, 2021. Accessed August 21, 2024. https://www.uihi.org/projects/data-genocide-of-american-indians-and-alaska-natives-in-covid-19-data/

  9. Kukutai T, Taylor J, eds. Indigenous Data Sovereignty: Toward an Agenda. Australian National University Press; 2016. Centre for Aboriginal Economic Policy Research Monographs; vol 38.

  10. United Nations. United Nations Declaration on the Rights of Indigenous Peoples. Hum Rights Q. 2007;33(3):909-921.
  11. Thoma ME, Declercq ER. Changes in pregnancy-related mortality associated with the coronavirus disease 2019 (COVID-19) pandemic in the United States. Obstet Gynecol. 2023;141(5):911-917.
  12. Wang E, Glazer KB, Howell EA, Janevic TM. Social determinants of pregnancy-related mortality and morbidity in the United States: a systematic review. Obstet Gynecol. 2020;135(4):896-915.
  13. Trost SL, Beauregard J, Njie F, et al. Pregnancy-related deaths among American Indian or Alaska Native persons: data from maternal mortality review committees in 36 US states, 2017-2019. Centers for Disease Control and Prevention. May 28, 2024. Accessed July 3, 2024. https://www.cdc.gov/maternal-mortality/php/data-research/2017-2019-aian.html

  14. Hoyert DL. Maternal mortality rates in the United States, 2021. Centers for Disease Control and Prevention. Reviewed March 16, 2023. Accessed July 3, 2024. https://www.cdc.gov/nchs/data/hestat/maternal-mortality/2021/maternal-mortality-rates-2021.htm

  15. Best practices for American Indian and Alaska Native data collection. Urban Indian Health Institute. Updated August 26, 2020. Accessed February 28, 2024. https://www.uihi.org/download/best-practices-for-american-indian-and-alaska-native-data-collection/

  16. Haozous EA, Strickland CJ, Palacios JF, Solomon TGA. Blood politics, ethnic identity, and racial misclassification among American Indians and Alaska Natives. J Environ Public Health. 2014;2014:321604.

  17. Tribal enrollment process. US Department of the Interior. Accessed July 3, 2024. https://www.doi.gov/Tribes/enrollment#:~:text=Rarely%20is%20the%20BIA%20involved

  18. Skinner A, Raifman J, Ferrara E, Raderman W, Quandelacy TM. Disparities made invisible: gaps in COVID-19 data for American Indian and Alaska Native populations. Health Equity. 2022;6(1):226-229.
  19. Hatcher SM, Agnew-Brune C, Anderson M, et al. COVID-19 among American Indian and Alaska Native persons—23 states, January 31-July 3, 2020. MMWR Morb Mortal Wkly Rep. 2020;69(34):1166-1169.
  20. Musshafen LA, El-Sadek L, Lirette ST, Summers RL, Compretta C, Dobbs TE 3rd. In-hospital mortality disparities among American Indian and Alaska Native, Black, and White patients with COVID-19. JAMA Netw Open. 2022;5(3):e224822.

  21. Huyser KR, Locklear S. Reversing statistical erasure of Indigenous peoples: the social construction of American Indians and Alaska Natives in the United States using national data sets. In: Walter M, Kukutai T, Gonzales AA, Henry R, eds. Handbook of Indigenous Sociology. Oxford University Press; 2021:247-262.

  22. Case A, Deaton A. Rising morbidity and mortality in midlife among white non-Hispanic Americans in the 21st century. Proc Natl Acad Sci U S A. 2015;112(49):15078-15083.
  23. Friedman J, Hansen H, Gone JP. Deaths of despair and Indigenous data genocide. Lancet. 2023;401(10379):874-876.
  24. Kukutai T. How Indigenous communities in New Zealand are protecting their data. Science. 2024;384(6691):eado9298.

  25. Smedley A, Smedley BD. Race in North America: Origin and Evolution of a Worldview. Westview Press; 2012.

  26. Walter M, Andersen C. Indigenous Statistics: A Quantitative Research Methodology. Routledge; 2016.

Editor's Note

Background image by Julia O’Brien.

Conflict of Interest Disclosure

Authors disclosed no conflicts of interest.

The viewpoints expressed in this article are those of the author(s) and do not necessarily reflect the views and policies of the AMA.