Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes

Veronica Barcelona ORCID: orcid.org/0000-0003-3070-1716¹,
Danielle Scharp¹,
Hans Moen²,
Anahita Davoudi³,
Betina R. Idnay⁴,
Kenrick Cato¹,⁵ &
Maxim Topaz¹


Abstract

Introduction

Stigma and bias related to race and other minoritized statuses may underlie disparities in pregnancy and birth outcomes. One emerging method to identify bias is the study of stigmatizing language in the electronic health record. The objective of our study was to develop automated natural language processing (NLP) methods to accurately identify two types of stigmatizing language in labor and birth notes: marginalizing language and its complement, power/privilege language.

Methods

We analyzed notes for all birthing people > 20 weeks’ gestation admitted for labor and birth at two hospitals during 2017. We then applied text preprocessing, using TF-IDF values as inputs, and tested machine learning classification algorithms to identify marginalizing and power/privilege language in clinical notes. The algorithms assessed were Decision Trees, Random Forest, and Support Vector Machines. We also applied a feature importance evaluation method (InfoGain) to identify words that are highly correlated with these language categories.
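As a rough illustration of the two quantities named above, the following sketch computes plain TF-IDF weights and a per-word information gain on a handful of invented snippets. It is not the authors' pipeline: the abstract does not state which TF-IDF variant or InfoGain implementation was used, so the unsmoothed formula tf × log(N/df), the binary "word present or absent" split, the toy texts, and the 0/1 labels are all assumptions made for the example.

```python
import math
from collections import Counter

# Invented toy snippets (NOT study data); label 1 = flagged category, 0 = not flagged.
notes = [
    ("patient refuses exam", 1),
    ("patient refuses medication", 1),
    ("patient resting comfortably", 0),
    ("exam completed patient stable", 0),
]

docs = [text.split() for text, _ in notes]
labels = [label for _, label in notes]
n_docs = len(docs)

# Document frequency: number of notes that contain each term at least once.
df = Counter(term for doc in docs for term in set(doc))


def tfidf(doc):
    """Plain TF-IDF (assumed variant): (count / doc length) * ln(N / df)."""
    counts = Counter(doc)
    return {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}


def entropy(ys):
    """Shannon entropy in bits of a list of class labels (0.0 for an empty list)."""
    total = len(ys)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(ys).values())


def info_gain(term):
    """Label entropy minus the expected entropy after splitting on term presence."""
    present = [y for doc, y in zip(docs, labels) if term in doc]
    absent = [y for doc, y in zip(docs, labels) if term not in doc]
    remainder = sum(len(part) / n_docs * entropy(part) for part in (present, absent))
    return entropy(labels) - remainder


# One sparse TF-IDF vector per note; these would be the classifier inputs.
vectors = [tfidf(doc) for doc in docs]

# Vocabulary ordered from most to least informative about the label.
ranked = sorted(df, key=info_gain, reverse=True)
```

In this toy set the word "patient" occurs in every snippet, so its TF-IDF weight and its information gain are both zero, while "refuses" separates the two labels perfectly and ranks first. A classifier such as a decision tree or support vector machine would then be trained on vectors of this kind.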

Results

For marginalizing language, Decision Trees yielded the best classification, with an F-score of 0.73. For power/privilege language, Support Vector Machines performed best, achieving an F-score of 0.91. These results demonstrate the effectiveness of the selected machine learning methods in classifying language categories in clinical notes.
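For readers unfamiliar with the metric, the sketch below shows how an F-score is computed from predicted and true labels. It assumes the balanced F1 form (beta = 1) for a single positive class; the abstract does not say which beta or averaging scheme was used, and the label lists here are invented for illustration rather than taken from the study.

```python
def f_score(y_true, y_pred, positive=1, beta=1.0):
    """F-beta score for one positive class; beta=1 gives the usual F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Invented labels (NOT study results): 1 = note contains the language category.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

score = f_score(y_true, y_pred)
```

Here there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75. Because the F-score ignores true negatives, it is a common choice when the category of interest is rare in the notes.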

Conclusion

We identified well-performing machine learning methods to automatically detect stigmatizing language in clinical notes. To our knowledge, this is the first study to use NLP performance metrics to evaluate machine learning methods in discerning stigmatizing language. Future studies should further refine and evaluate NLP methods, incorporating recent algorithms rooted in deep learning.

Significance

What is Already Known on this Subject?

Traditional informatics methods include natural language processing, and these methods have been increasingly applied to the study of public health problems using electronic health records.

What this Study Adds?

We identified well-performing machine learning methods to automatically detect stigmatizing language in labor and birth clinical notes. These methods have not previously been applied to labor and birth clinical notes and have the potential to be a powerful tool in examining perinatal health inequities.

Acknowledgements

This project was supported by funding from the Columbia University Data Science Institute Seeds Funds Program and a grant (GBMF9048) from the Gordon and Betty Moore Foundation.

Author information

Authors and Affiliations

School of Nursing, Columbia University, 560 West 168th St, Mail Code 6, New York, NY, 10032, USA
Veronica Barcelona, Danielle Scharp, Kenrick Cato & Maxim Topaz
Department of Computer Science, Aalto University, Espoo, Finland
Hans Moen
VNS Health, New York, NY, USA
Anahita Davoudi
Department of Biomedical Informatics, Columbia University, New York, NY, USA
Betina R. Idnay
University of Pennsylvania, Philadelphia, PA, USA
Kenrick Cato

Contributions

Author contributions are as follows: Conceptualization (VB, MT), Analysis (DS, AD, BRI, MT), Original draft (VB), Revised draft (VB, DS, HM, AD, BRI, KC, MT), Funding (VB, MT, KC).

Corresponding author

Correspondence to Veronica Barcelona.

Ethics declarations

Conflict of Interest

The authors have no conflicts of interest to disclose.

Human Subjects

Human subjects approval for this study was received from the Institutional Review Board at Columbia Irving Medical Center, AAAT9870.

Data Sharing

No new data were generated for this analysis; therefore, there are no data to share.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Barcelona, V., Scharp, D., Moen, H. et al. Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes. Matern Child Health J 28, 578–586 (2024). https://doi.org/10.1007/s10995-023-03857-4

Accepted: 10 December 2023
Published: 26 December 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s10995-023-03857-4
