BioMedware Chief Scientist Dr. Pierre Goovaerts will present his talk, titled “Finding water service lines with hazardous material using a compositional approach and geostatistics-informed machine learning,” at Geostats 2024 (12th International Geostatistics Congress) on Thursday, September 5. Learn more about the conference here.
Key takeaways from the presentation:
Water service lines (SLs), which connect buildings to the public water supply, are often outdated and made of lead, posing significant health risks to Americans. Machine learning (ML) approaches are increasingly used to identify which houses are most likely to have lines with hazardous material, given features that reflect a combination of property characteristics (e.g., year built, property tax assessment), historical data (e.g., plumbing permits, meter installation records, and SL installation, inspection, and maintenance records), administrative spatial data, and tap water quality samples.
One strength of ML algorithms is their flexibility: they are not restricted to linear relationships between the predictors and the target values. However, such classification algorithms are unable to handle heterogeneity related to geographic location and similarity of homes. This talk addresses that limitation and illustrates alternative prediction techniques using a Flint dataset that includes observations (type of material) for 26,731 tax parcels and 72 covariates.
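To make the idea of location-aware prediction concrete, here is a minimal sketch of one simple way spatial information could be fed to a classifier: for each parcel, compute the share of its nearest neighbouring parcels with a known hazardous service line, and use that share as an extra covariate. This is a plain nearest-neighbour feature chosen for illustration, not the compositional or geostatistical method presented in the talk. The function name `neighbor_hazard_share`, the parameter `k`, and the toy coordinates and labels are all hypothetical and are not taken from the Flint dataset.

```python
import math


def neighbor_hazard_share(coords, labels, k=3):
    """Share of each parcel's k nearest *other* parcels labelled hazardous.

    coords: list of (x, y) parcel locations (hypothetical projected units).
    labels: list of 0/1 values, 1 = hazardous material observed.
    Assumes at least two parcels and k >= 1.
    """
    shares = []
    for i, (xi, yi) in enumerate(coords):
        # Distance to every other parcel; the parcel itself is excluded
        # so its own label never leaks into its own feature.
        dists = [
            (math.hypot(xi - xj, yi - yj), labels[j])
            for j, (xj, yj) in enumerate(coords)
            if j != i
        ]
        dists.sort(key=lambda pair: pair[0])
        nearest = dists[:k]
        shares.append(sum(label for _, label in nearest) / len(nearest))
    return shares


# Made-up toy data: two small clusters of parcels, not real observations.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [1, 1, 1, 0, 0, 0, 0, 1]

shares = neighbor_hazard_share(coords, labels, k=3)
for xy, label, share in zip(coords, labels, shares):
    print(xy, label, round(share, 2))
```

In this toy example, parcels in the first cluster get a high neighbour share and parcels in the second a low one, regardless of their own label. A feature like this could be appended to the property and historical covariates before training any classifier; in practice the neighbour shares would need to be computed from training parcels only to avoid leaking labels into the evaluation.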