By Geoffrey M. Jacquez, PhD (BioMedware) and Daniel Goldberg, PhD (University of Southern California)

This blog provides a quick take on the First International Geospatial Geocoding Conference. I am pleased to write this blog with Dan Goldberg, conference organizer, with each of us providing impressions and lessons learned.

Conference Summary: The 1st International Geospatial Geocoding Conference was held Dec 6-7, 2011, on the campus of Esri in Redlands, California. The organizing committee included representatives from industry and academe. The meeting was funded by a conference grant from the Centers for Disease Control and Prevention (CDC) National Center for Environmental Health (NCEH) and the Agency for Toxic Substances and Disease Registry (ATSDR) under their Public Health Conference Support Program, and was sponsored by Esri, Navteq, and the University of Southern California. With Dr. Dan Goldberg of the University of Southern California as the lead organizer, the conference brought together nearly 200 attendees from around the world. Two main themes were pursued: advances in geocoding technology and practice, and geocoding in health. These resulted in two special issues, one in Transactions in GIS (technology and practice, John Wilson Editor-in-Chief) and one in Spatial and Spatio-temporal Epidemiology (geocoding in health, Andrew Lawson Editor-in-Chief). The presentations will be posted on the conference website; keep checking if they are not yet up when you read this blog.

Geoffrey Jacquez: I attended to learn more about geocoding techniques and emerging technologies, and to assess how these might impact geohealth, my area of specialization. Why worry? E-health, Health 2.0, and related initiatives are advancing rapidly (check out the recent mHealth Summit keynote address by Secretary of Health and Human Services Kathleen Sebelius for a summary of recent innovations), and electronic health records are rapidly being adopted. In 2011, 34% of physicians are using electronic health records, with 52% reporting they intend to adopt them soon. The e-health era is clearly upon us. An estimated 85% or more of e-health records contain georeferencing of some kind, typically addresses of patients, clinics, physicians, laboratories, and pharmacies. Geocoding converts addresses into geographic coordinates, and these coordinates are then used to calculate disease rates in areas (e.g., counties); to site health screening facilities (e.g., to be sure mammography screening clinics are near populations that will benefit from screening); to identify disease clusters; and for a host of other purposes. Does the process of geocoding impact the decisions made from e-health records? This is the question I hoped to answer by attending the conference.

The conference opened with a keynote presentation by Don Cooke, considered by many to be one of the fathers of geocoding. Don gave a tour and history of geocoding, from its first uses at the US Census, through DIME and TIGER files, and on to commercialization of the technologies through GDT and other companies. He spoke of the continuing need to focus the technologies in order to accurately encode geographic coordinates, and in passing used the term “baloney filter”. I hadn’t heard this before, and for me it captured the idea of separating the gold from the dross, of determining early on whether something matters or should be filtered out as “baloney”. And that, of course, is what we need to do when assessing whether geocoding accuracy makes a difference in e-health: assess whether geocoding accuracy impacts health policy decisions, or can be filtered out as something we don’t need to worry about.

Over the course of the next two days, I attended talks concerned with assessing accuracy and positional error in geocoded coordinates, addressing topics such as error magnitude, sources of geocoding error, propagation of error into geohealth analysis, and the assessment of how geocoding error alters analysis results. By the end of the meeting I was convinced that geocoding errors may indeed alter analysis results, and may be large enough to qualitatively change the resulting health policy decisions. Examples cited at the conference include decreased power to detect true disease clusters when geocoding error is present, errors introduced into accessibility metrics, and the underestimation of odds ratios used to assess health-environment relationships in epidemiological studies, among others. Further examples may be found in the special issue on geocoding in epidemiology. But our present understanding of how geocoding error affects health policy decisions is meager, and a research agenda is needed to better understand the size of the problem, to determine whether poor decisions have been made in the past, and to assure appropriate decisions are made in the future. From my perspective, such a research agenda might address five gaps: 1) a lack of standardized, open-access geocoding resources for use in health research; 2) a lack of geocoding validation datasets that will allow the evaluation of alternative geocoding engines and procedures; 3) a lack of spatially explicit geocoding positional error models; 4) a lack of resources for assessing the sensitivity of spatial analysis results to geocoding positional error; and 5) a lack of demonstration studies that illustrate the sensitivity of health policy decisions to geocoding positional error. For details, see my paper appearing in the special issue of Spatial and Spatio-Temporal Epidemiology, “A research agenda: Does geocoding positional error matter in health GIS studies?”
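To make the odds-ratio underestimation mentioned above concrete, here is a minimal simulation sketch in Python. It is not drawn from any conference presentation. The unit-square study area, the single hypothetical pollution source, the exposure radius, the disease risks, and the Gaussian positional error model are all assumptions chosen for illustration. People are labeled "exposed" if they live within a fixed radius of the source, disease risk depends on true exposure, and exposure is then re-derived from error-perturbed "geocoded" coordinates.

```python
import math
import random


def odds_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Odds ratio of a 2x2 exposure-by-disease table."""
    return (exp_cases * unexp_noncases) / (exp_noncases * unexp_cases)


def simulate(n=20000, sigma=0.1, radius=0.2, seed=1):
    """Return (odds ratio from true locations, odds ratio from geocoded ones).

    Every parameter value here is an illustrative assumption, not real data.
    """
    rng = random.Random(seed)
    source = (0.5, 0.5)        # hypothetical pollution source
    # Table layout: [exp case, exp non-case, unexp case, unexp non-case]
    true_tab = [0, 0, 0, 0]
    obs_tab = [0, 0, 0, 0]
    for _ in range(n):
        x, y = rng.random(), rng.random()            # true residence
        exposed = math.dist((x, y), source) <= radius
        # Disease risk depends on the TRUE exposure status (assumed risks).
        case = rng.random() < (0.30 if exposed else 0.10)
        # Geocoded location = true location + Gaussian positional error.
        gx, gy = x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)
        exposed_obs = math.dist((gx, gy), source) <= radius
        true_tab[2 * (not exposed) + (not case)] += 1
        obs_tab[2 * (not exposed_obs) + (not case)] += 1
    return odds_ratio(*true_tab), odds_ratio(*obs_tab)


if __name__ == "__main__":
    for s in (0.0, 0.05, 0.1):
        true_or, geocoded_or = simulate(sigma=s)
        print(f"sigma={s}: true OR={true_or:.2f}, geocoded OR={geocoded_or:.2f}")
```

Because the simulated positional error is unrelated to disease status, it misclassifies exposure non-differentially, and the odds ratio computed from geocoded locations drifts toward 1 as `sigma` grows. Real geocoding error is rarely this well behaved (it tends to vary between urban and rural areas, for example), which is one reason spatially explicit error models are on the list of needs above.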

Dan Goldberg:

Together with Geoff and my fellow conference organizing committee members, I had hoped to use the opportunity of the first international conference on geocoding to bring together people from every part of the geocoding spectrum to discuss persistent challenges and emerging opportunities for geocoding research and practice. What occurred exceeded our expectations on every level. The keynotes from Don Cooke and Mark Greninger were both entertaining and extremely informative for a variety of reasons. First off, it was extremely interesting to find out that Don edited the proceedings of a conference on geocoding hosted by URISA way back in the early days of geocoding; so, this current conference was by no means the first conference on geocoding. Even though I like to consider myself something of an expert on the history of geocoding, Don’s recounting of the many key points and players in the history of geocoding research, development, and application reminded me yet again that there are many people who must be thanked for the processes and tools we take for granted every day. Can you even imagine a world without TIGER?

By all accounts, Don’s talk was one of the highlights of the conference, if not THE highlight. Not to be outdone, however, Mark’s keynote was extremely motivating and full of examples of when and how geocoding can go wrong and what impact these failures can have on the delivery of critical services. In the context of the eighth largest state in the country (LA County is by many metrics larger than 42 of the 50 states), Mark highlighted the consequences for public safety, government service delivery, and effective representation that can occur should problems arise at any of the many levels of the complex processes of geocoding and address management.

As the program committee had hoped, the parallel paper presentations and lightning talks were quite successful in representing the broad spectrum of geocoding concerns present at the meeting, ranging from the impacts of incomplete or incorrect geocoding on health analysis to new applications of geocoding to emerging data types such as Twitter feeds. A wide variety of presenters from academia, industry, and government, representing diverse geographic locations including Brazil, the United Kingdom, Australia, Canada, and the United States, presented their latest work in the production, analysis, representation, visualization, and utilization of geocoded data. Many of the talks focused on the repercussions of poor geocoding in data analysis, while others introduced new methods for improving the quality of geocoded data, as well as techniques for enabling useful analysis in the ever-present case that geocoding does not work perfectly for every record in a dataset.

Q and A was lively after every presentation I attended, evidence that the topics discussed were relevant, timely, and familiar to many of those in attendance. Attendees shared personal experience, opinion, and in many cases alternative approaches, all of which proved fruitful for making strides toward shared goals: scientific, applied, and technical.

The breakout sessions followed a similar two-track approach; I attended the Address Standardization and Volunteered Geographic Information sessions. Although quite different in theme, the two sessions produced a similar result: the identification of challenges and opportunities for research, development, and application of the respective technologies. Discussions were lively and included the views of participants from every career stage and background (professional and geographic), all of which were boiled down to a series of action items for the research and development communities to take on in tackling the most pressing challenges in each domain.

The final panel session included leaders from many diverse fields, including health (David Stinchcomb), Census (Ama Danso), HUD (Jon Sperling), local government (Mark Greninger), data and service providers (Dan Gibbons), and industry (Don Cooke). Led by Christophe Charpentier, the panel discussed the past, present, and future of geocoding technology, focusing on the challenges that must still be solved for geocoding to keep advancing and serving the needs of researchers, scientists, policy makers, the business community, and local, state, and federal governments and agencies. Geocoding more than just address data (for example, relative location descriptions such as the location of an injured person in a national park) was a consistent theme, as was the need to understand and represent uncertainty in geocoded data.

All in all, this conference was a success by many accounts. Who would have thought 200+ people from around the globe would come together for a conference on geocoding? Many participants reported that they developed new collaborative relationships with people they had wanted to meet for years, while others took home new ideas for research and development, and still others came away with new techniques for identifying and dealing with problem address data. The question asked most often by attendees, however, was: when will the Second International Geospatial Geocoding Conference take place?