The National Library of Medicine awarded BioMedware’s Principal Investigator (PI), Dr. Pierre Goovaerts, a 2-year SBIR Phase II grant titled “Geostatistical software for Non Parametric Geostatistical Modeling”. The grant, with a budget of approximately $2M, kicked off on September 1, 2024.

In almost all research and applied endeavors, assessing the magnitude of uncertainty and how it affects subsequent decision-making is of paramount importance. For example, suppose we are forecasting a quantity (e.g., concentrations of toluene in Michigan, soil nutrients in farm fields, rainfall amount, disease incidence, or cancer outcomes) at locations on a map. Both the estimate (e.g., 1.23 ppm is the predicted toluene level at a specific location) and its uncertainty (e.g., the true value has a 0.95 probability of falling between 1.1 and 2.4 ppm) are then essential for making informed decisions and determining relationships between variables. But while value estimation has a rich methodological tradition and suite of techniques (e.g., averages, frequentist and Bayesian approaches, geostatistics, and machine learning), methods for uncertainty assessment are less well developed. This is particularly true of commercial software for non-parametric modeling of uncertainty and its application to compositional variables.
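To make the idea of a non-parametric uncertainty statement concrete, here is a minimal Python sketch. It is not the method implemented in Vesta. It assumes we already have a set of plausible values for one location (for instance, simulated or resampled predictions) and reads a 0.95 probability interval directly from their empirical quantiles, without assuming any distribution shape. The function names and the sample values are hypothetical.

```python
def empirical_quantile(values, p):
    """Return the p-quantile of a sample using linear interpolation.

    No distributional assumption is made: the quantile is read
    directly from the sorted sample (hypothetical helper).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    ordered = sorted(values)
    position = p * (len(ordered) - 1)      # fractional index into the sample
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def probability_interval(values, coverage=0.95):
    """Return (low, high) bounds of the central interval with the given coverage."""
    tail = (1.0 - coverage) / 2.0
    return empirical_quantile(values, tail), empirical_quantile(values, 1.0 - tail)


# Hypothetical plausible toluene values (ppm) at one location.
plausible_ppm = [1.05, 1.12, 1.18, 1.21, 1.23, 1.30, 1.42, 1.65, 1.90, 2.40]
low, high = probability_interval(plausible_ppm, coverage=0.95)
median = empirical_quantile(plausible_ppm, 0.5)
```

Because the bounds come from the sample itself, the interval can be asymmetric around the estimate, as in the 1.1 to 2.4 ppm example above.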

What Is A Compositional Variable?   

Compositional data are a special type of categorical data in which there is a fixed number of categories (e.g., 3 categories) and the values in each category (e.g., percentages, proportions) are non-negative and sum to a constant, often 1 or 100%. Compositional variables are widely used in the environmental and health sciences, for example to code exposures into categories such as “low”, “medium”, and “high”; to handle categorical survey responses, such as ethnicity and smoking status; and to represent landcover categories such as urban and forested.
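The two defining constraints (non-negative parts that sum to a constant) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not code from the project: the helper names are hypothetical, and rescaling the parts to a fixed total is the operation commonly called "closure" in compositional data analysis.

```python
def close_composition(parts, total=1.0):
    """Rescale non-negative parts so that they sum to `total` (closure)."""
    if any(p < 0 for p in parts):
        raise ValueError("compositional parts must be non-negative")
    parts_sum = sum(parts)
    if parts_sum == 0:
        raise ValueError("at least one part must be positive")
    return [total * p / parts_sum for p in parts]


def is_composition(parts, total=1.0, tol=1e-9):
    """Check that parts are non-negative and sum to `total` within a tolerance."""
    return all(p >= 0 for p in parts) and abs(sum(parts) - total) <= tol


# Hypothetical landcover counts for one map unit: urban, forested, other.
landcover_counts = [20, 30, 50]
landcover_shares = close_composition(landcover_counts)  # proportions summing to 1
```

A practical consequence of the sum constraint is that the parts cannot vary independently: raising one share necessarily lowers at least one other, which is why interpolating each category separately can produce invalid results.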

Unprecedented Value and Innovation 

Surprisingly, algorithms in existing software do not handle categorical variables appropriately. The resulting inaccuracies can affect conclusions and policy decisions, a salient weakness that this project addresses. In addition, the project is implementing techniques that combine geostatistics and machine learning, creating novel GeoAI methods that achieve greater accuracy. These techniques are being incorporated into the Vesta software, offering better solutions to a problem that represents a substantial market opportunity.

Primary Goals For Phase II  

Following a Phase I grant that demonstrated project feasibility, this Phase II grant will accomplish four objectives.

  • Conduct further research to: 1) extend the new approach (quantile random forest with kriged data layer) developed in Phase I to include additional ML algorithms (i.e., support vector machines, gradient boosting) and spatial data layers (e.g., eigenvectors of distance matrix) in the comparison study, and 2) generalize cross-validation and performance measures to the multivariate case of compositional variables.
  • Complete a fully functional and tested soft indicator coding and ML interpolation software product ready for commercial distribution. This desktop product will be fully compatible with products developed by ESRI, the leader in the GIS market, which currently include neither compositional data analysis (CoDA) tools nor geostatistical non-parametric interpolation functionality. It will include a soft indicator coding advisor to guide the user through the selection, coding, and interpolation of prior probability distributions, making these tools more accessible to the research community. This step will benefit from recommendations formulated by an independent working group that will be convened by the North American Association of Central Cancer Registries in year 1.
  • Conduct a formal usability study to evaluate the design of the prototype based on usability protocols developed by the NIH, involving (i) expert evaluation by the firm TecEd and (ii) usability testing by representative users. The products will be modified based on the usability testing results.
  • Apply the software and methods to demonstrate the approach and its unique benefits for the multivariate interpolation of exposure data and for the modeling and propagation of uncertainty through the subsequent analysis.

The Team Behind The Grant 

An impressive team and institutional collaborations will address these ambitious aims. The grants officer at the National Library of Medicine is Dr. Allison Dennis. In addition to the PI, Dr. Goovaerts, the BioMedware team includes Ioana Nadra, Project Manager and documentarian; Robert Rommel, Lead Software Engineer; and Luna Jia, Associate Software Engineer. Expert external usability assessment and advice are provided by Dr. Recinda Sherman of the North American Association of Central Cancer Registries (NAACCR), Dr. Sue Grady of Michigan State University, Dr. Peter Siska from Louisiana State University Sciences, and Dr. Georges Adunlin from Samford University.

Collaborating institutions are NAACCR and TecEd. Under the direction of Dr. Recinda Sherman, NAACCR will conduct a 2-day expert panel meeting to assess grant progress and to evaluate and make recommendations for the Vesta software and uncertainty modeling approaches. This meeting is currently planned in conjunction with NAACCR’s 2025 annual meeting, to be held in June in Hartford, CT. TecEd is a usability and user experience testing firm located in Ann Arbor, Michigan. In grant year 2, TecEd will conduct a formal study with 10 subjects selected to be representative of the Vesta software user base. Participants will step through a set of tasks in the Vesta software, and objective criteria will be applied to assess usability and user experience. The BioMedware team will use TecEd’s recommendations to improve and enhance the Vesta software.

The entire BioMedware team thanks its colleagues at SPARK, a business accelerator that is proving invaluable to BioMedware’s commercialization objectives, and at Merithot, a first-in-class digital marketing firm charged with communicating BioMedware’s contributions and offerings.

What can you do?  BioMedware is looking for marketing and commercialization partners.  Reach out to Jacquez@biomedware.com if you are interested.  

About Our Contributors and Collaborators:  

North American Association of Central Cancer Registries  

The North American Association of Central Cancer Registries (NAACCR, Inc.) is a professional organization that develops and promotes uniform data standards for cancer registration; provides education and training; certifies population-based registries; aggregates and publishes data from central cancer registries; and promotes the use of cancer surveillance data and systems for cancer control and epidemiologic research, public health programs, and patient care to reduce the burden of cancer in North America.

National Library of Medicine  

The National Library of Medicine (NLM) is a leader in biomedical informatics and computational health data science research and the world’s largest biomedical library. NLM leads innovation in the development of advanced tools for clinical data interpretation and decision-making through its cutting-edge research, training programs, and information services. It plays a vital role in biomedical discovery and the translation of biomedical research into practice.

TecEd

TecEd is a team of usability experts and a user experience (UX) consultancy headquartered in Ann Arbor, Michigan. For over 40 years, TecEd has worked with Fortune 500 companies, small and medium-sized businesses, startups, and not-for-profits to design successful products, websites, systems, and services.

Ann Arbor SPARK

Ann Arbor SPARK is a non-profit organization dedicated to the development of economic and employment opportunities in Washtenaw and Livingston counties. SPARK assists businesses with their relocation and expansion plans, offers business acceleration services that drive the development of technology startups, and provides talent-related resources.  

Merithot Creative Marketing 

Merithot is a fast-growing creative marketing agency in downtown Ann Arbor on a mission to help great companies create remarkable brands, websites, and content. As a full-service agency, Merithot specializes in brand strategy, web development, digital marketing, content creation, and videography. Merithot helps businesses push boundaries, build empires, and change the world.