Integrating geophysical and geochemical data using machine learning tools to map and monitor transport and accumulation of contaminants in groundwater: two case studies

In our early collaboration with Humber College, we combined a geoelectrical study with measurements of surface methane emissions at a closed landfill in southern Ontario. The resulting tomographic images of the subsoil show resistivity contrasts that can be used to track the transport and buildup of groundwater contaminants. Machine learning tools were also used to examine the possible coupling between methane emissions and the geoelectrical proxies of leachate, describing how the combined effects of multiple parameters in the waste stabilization zone correlate with methane concentrations in the biogas.

Another multidisciplinary collaboration was recently initiated with Humber College (Prof. Maria Jacome), Fort William First Nation (FWFN), and the U of T Departments of Human Biology and Indigenous Studies (Prof. Melanie Jeffrey), Applied Chemistry and Chemical Engineering (Prof. Daniela Galatro), and Industrial and Mechanical Engineering (Prof. Jason Bazylak).

This research aims to communicate relevant environmental information and better visualize and rank the levels of hazardous compounds in soil and water for a complex setting affected by numerous sources of contamination. In this way, local and Indigenous capacities are built to make informed decisions about land use and healing.

The project draws on a series of regional geophysical surveys (geoelectrical and ground-penetrating radar, GPR) and on the acquisition and handling of new and existing geochemical information for different environmental media. Its ultimate goal is to make the interpretation of these multiple data layers quantitative by developing and applying machine learning tools, and to generate maps of indicators that define distinct contaminated zones in FWFN.

Determining Causality between Environmental Risk Factors and Health Outcomes

A preliminary study by the U of T / Humber College research team set out a numerical framework that uses the average treatment effect (ATE) and uplift modeling to look for evidence of causation between indoor and outdoor exposure to benzene and childhood acute myeloid leukemia. The framework predicted causation for indoor exposure to this contaminant. This work employed simulated data based on previously published results for a case study in California.
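To make the ATE idea concrete, the following is a minimal sketch on synthetic data and not the team's actual framework or dataset. It assumes a binary exposure, a binary outcome, and a single binary confounder. All variable names, probabilities, and the "true" effect of 0.05 are hypothetical values chosen for illustration. The per-stratum difference in outcome rates plays the role of a simple uplift estimate, and the ATE is its weighted average over strata.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Generate synthetic (confounder, exposure, outcome) arrays.

    All probabilities below are illustrative assumptions, not study values.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical binary confounder that affects both exposure and outcome.
    z = rng.binomial(1, 0.4, size=n)
    # Exposure is more likely when the confounder is present.
    t = rng.binomial(1, 0.2 + 0.4 * z)
    # Outcome risk: baseline + confounder effect + true exposure effect (0.05).
    y = rng.binomial(1, 0.02 + 0.03 * z + 0.05 * t)
    return z, t, y


def naive_difference(t, y):
    """Difference in outcome rates between exposed and unexposed (confounded)."""
    return y[t == 1].mean() - y[t == 0].mean()


def stratified_ate(z, t, y):
    """ATE by standardization: average the per-stratum uplift over strata."""
    ate = 0.0
    for level in np.unique(z):
        in_stratum = z == level
        # Uplift within this stratum: exposed rate minus unexposed rate.
        uplift = (y[in_stratum & (t == 1)].mean()
                  - y[in_stratum & (t == 0)].mean())
        # Weight by the share of the population in this stratum.
        ate += uplift * in_stratum.mean()
    return ate


if __name__ == "__main__":
    z, t, y = simulate()
    print("naive difference:", round(naive_difference(t, y), 4))
    print("stratified ATE:  ", round(stratified_ate(z, t, y), 4))
```

In this sketch the naive comparison overstates the effect because the confounder raises both exposure and risk, while the stratified estimate should land close to the simulated effect. With real data, the adjustment set would have to be justified by domain knowledge, and unmeasured confounders would remain a limitation.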

In Fort William First Nation (FWFN), the environmental reports and the geophysical information (NSERC CCSIF) can be used to describe the distribution of possible contaminants relative to a residential area of concern. In addition, co-applicant Prof. Melanie Jeffrey (U of T) is compiling a cancer registry database for FWFN (Connaught Indigenous Stream) that could provide valuable epidemiological information.

Different machine learning (ML) tools would be applied to the available data to investigate causality and interactions between environmental and epidemiological risk factors in FWFN. A set of novel meta-learning approaches might also be explored to better understand the outcomes produced by these ML algorithms.