Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes

Maternal and Child Health Journal

Abstract

Introduction

Stigma and bias related to race and other minoritized statuses may underlie disparities in pregnancy and birth outcomes. One emerging method to identify bias is the study of stigmatizing language in the electronic health record. The objective of our study was to develop natural language processing (NLP) methods that automatically and accurately identify two types of stigmatizing language in labor and birth notes: marginalizing language and its complement, power/privilege language.

Methods

We analyzed notes for all birthing people > 20 weeks’ gestation admitted for labor and birth at two hospitals during 2017. We preprocessed the note text, used term frequency-inverse document frequency (TF-IDF) values as inputs, and tested machine learning classification algorithms to identify marginalizing and power/privilege language in clinical notes. The algorithms assessed were Decision Trees, Random Forest, and Support Vector Machines. Additionally, we applied a feature importance evaluation method, information gain (InfoGain), to identify the words most strongly associated with each language category.
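To make the two text-processing steps named above concrete, the following is a minimal sketch of TF-IDF weighting and information gain in plain Python. It is an illustration under stated assumptions, not the study's pipeline: the example notes, their labels, the whitespace tokenizer, and the function names are all hypothetical, and the TF-IDF variant shown (relative term frequency times the natural log of inverse document frequency) is only one of several in common use.

```python
import math
from collections import Counter

# Hypothetical toy notes and labels (1 = flagged, 0 = not flagged).
# These are invented for illustration and are not from the study data.
notes = [
    "patient refused monitoring",
    "patient resting comfortably",
    "patient refused medication again",
    "family at bedside patient resting",
]
labels = [1, 0, 1, 0]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * ln(N / document frequency)
    """
    n_docs = len(tokenized_docs)
    doc_freq = Counter(t for toks in tokenized_docs for t in set(toks))
    vectors = []
    for toks in tokenized_docs:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def entropy(ys):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(ys)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(ys).values())


def information_gain(term, tokenized_docs, ys):
    """Reduction in label entropy from knowing whether `term` is present."""
    present = [y for toks, y in zip(tokenized_docs, ys) if term in toks]
    absent = [y for toks, y in zip(tokenized_docs, ys) if term not in toks]
    n = len(ys)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(ys) - conditional


tokenized = [tokenize(note) for note in notes]
weights = tf_idf(tokenized)

# Rank vocabulary terms by information gain, highest first.
vocabulary = sorted({t for toks in tokenized for t in toks})
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(t, tokenized, labels),
    reverse=True,
)
print(ranked[:3])
```

In this toy data a word that appears in every note (here "patient") receives a TF-IDF weight of zero and an information gain of zero, while a word that appears only in flagged notes receives the maximum gain. In practice, the TF-IDF vectors would be passed to a classifier such as a decision tree or support vector machine.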

Results

For marginalizing language, Decision Trees yielded the best classification, with an F-score of 0.73. For power/privilege language, Support Vector Machines performed best, achieving an F-score of 0.91. These results indicate that the selected machine learning methods can effectively classify both language categories in clinical notes.
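For readers unfamiliar with the metric, the sketch below shows how an F-score is computed from a classifier's true positives, false positives, and false negatives. The abstract does not state which F-score variant was reported; this example assumes the balanced F1 (beta = 1) by default. The function name and the counts are hypothetical and are not the study's results.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts; beta=1 gives the balanced F1.

    precision = tp / (tp + fp), recall = tp / (tp + fn)
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 notes correctly flagged, 2 flagged in error, 2 missed.
example = f_score(tp=8, fp=2, fn=2)
print(round(example, 2))
```

Because the F-score combines precision and recall, a high value requires a classifier both to flag few notes in error and to miss few notes that should have been flagged.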

Conclusion

We identified well-performing machine learning methods to automatically detect stigmatizing language in clinical notes. To our knowledge, this is the first study to use NLP performance metrics to evaluate how well machine learning methods discern stigmatizing language. Future studies should further refine and evaluate NLP methods, including recent algorithms based on deep learning.

Significance

What is Already Known on this Subject?

Traditional informatics methods include natural language processing, and these methods have been increasingly applied to the study of public health problems using electronic health records.

What this Study Adds?

We identified well-performing machine learning methods to automatically identify stigmatizing language in labor and birth clinical notes. These methods had not previously been applied to labor and birth clinical notes and have the potential to be a powerful tool for examining perinatal health inequities.

Acknowledgements

This project was supported by funding from the Columbia University Data Science Institute Seeds Funds Program and a grant (GBMF9048) from the Gordon and Betty Moore Foundation.

Author information

Contributions

Author contributions are as follows: Conceptualization (VB, MT), Analysis (DS, AD, BRI, MT), Original draft (VB), Revised draft (VB, DS, HM, AD, BRI, KC, MT), Funding (VB, MT, KC).

Corresponding author

Correspondence to Veronica Barcelona.

Ethics declarations

Conflict of Interest

The authors have no conflicts of interest to disclose.

Human Subjects

Human subjects approval for this study was received from the Institutional Review Board at Columbia Irving Medical Center, AAAT9870.

Data Sharing

No new data were generated for this analysis; therefore, there are no data to share.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Barcelona, V., Scharp, D., Moen, H. et al. Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes. Matern Child Health J 28, 578–586 (2024). https://doi.org/10.1007/s10995-023-03857-4

  • DOI: https://doi.org/10.1007/s10995-023-03857-4
