DOI: 10.1145/3219819.3219896
research-article

ActiveRemediation: The Search for Lead Pipes in Flint, Michigan

Published: 19 July 2018

    Abstract

    We detail our ongoing work in Flint, Michigan to detect pipes made of lead and other hazardous metals. After elevated levels of lead were detected in residents' drinking water, followed by an increase in blood lead levels in area children, the state and federal governments directed over $125 million to replace water service lines, the pipes connecting each home to the water system. In the absence of accurate records, and with the high cost of determining buried pipe materials, we put forth a number of predictive and procedural tools to aid in the search and removal of lead infrastructure. Alongside these statistical and machine learning approaches, we describe our interactions with government officials in recommending homes for both inspection and replacement, with a focus on the statistical model that adapts to incoming information. Finally, in light of discussions about increased spending on infrastructure development by the federal government, we explore how our approach generalizes beyond Flint to other municipalities nationwide.

    Supplementary Material

    MP4 File (chojnacki_activeremediation.mp4)

    Published In

    KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2018
    2925 pages
    ISBN:9781450355520
    DOI:10.1145/3219819
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. active learning
    2. flint water crisis
    3. machine learning
    4. public policy
    5. risk assessment
    6. water infrastructure

    Funding Sources

    • The Michigan Institute for Data Science (MIDAS)

    Conference

    KDD '18
    Acceptance Rates

    KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
