For a sound use of health care data in epidemiology: evaluation of a calibration model for count data with application to prediction of cancer incidence in areas without cancer registry

Published on 1 July 2019
Updated on 15 January 2020

There is growing interest in using health care (HC) data to produce epidemiological surveillance indicators such as incidence. In the field of cancer, incidence is typically provided by local cancer registries which, in many countries, do not cover the whole territory; using proxy measures from available nationwide HC databases would appear to be a suitable approach to fill this gap. However, in most cases, direct counts from these databases do not provide reliable measures of incidence. To obtain accurate incidence estimates and prediction intervals, these databases need to be calibrated using a registry-based gold standard measure of incidence. This article presents a calibration model for count data developed to predict cancer incidence from HC data in geographical areas without cancer registries. First, the ratio between the proxy measure and incidence is modeled in areas with registries using a Poisson mixed model that allows for heterogeneity between areas (calibration stage). This ratio is then inverted to predict incidence from the proxy measure in areas without registries (prediction stage). The prediction error admits a closed-form expression that accounts for heterogeneity in the ratio between areas. A simulation study shows the accuracy of our method in terms of prediction and coverage probability. The method is further applied to predict the incidence of two cancers in France using hospital data as the proxy measure. We hope this approach will encourage sound use of the usually imperfect information extracted from HC data.
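
The two stages described in the abstract (calibrate the proxy/incidence ratio where registries exist, then invert it elsewhere) can be illustrated with a deliberately simplified sketch. This is not the authors' model: it assumes a single pooled ratio in place of the Poisson mixed model, so it ignores heterogeneity between areas and yields no prediction interval. The function names and all counts are hypothetical and chosen only for illustration.

```python
def calibrate_ratio(proxy_counts, registry_counts):
    """Calibration stage (simplified): pooled proxy/incidence ratio.

    Assumption: one common ratio for all registry areas. The article
    instead fits a Poisson mixed model with between-area heterogeneity.
    """
    if len(proxy_counts) != len(registry_counts):
        raise ValueError("one proxy count is needed per registry count")
    total_incidence = sum(registry_counts)
    if total_incidence <= 0:
        raise ValueError("registry counts must sum to a positive number")
    return sum(proxy_counts) / total_incidence


def predict_incidence(proxy_count, ratio):
    """Prediction stage (simplified): invert the calibrated ratio."""
    if ratio <= 0:
        raise ValueError("the calibrated ratio must be positive")
    return proxy_count / ratio


# Hypothetical counts for three areas covered by a registry.
registry_area_proxy = [120, 90, 210]      # e.g. hospital-based proxy counts
registry_area_incidence = [100, 80, 170]  # registry (gold standard) counts

ratio = calibrate_ratio(registry_area_proxy, registry_area_incidence)

# Hypothetical proxy count observed in an area without a registry.
predicted = predict_incidence(240, ratio)
print(f"calibrated ratio: {ratio:.3f}, predicted incidence: {predicted:.1f}")
```

A point prediction of this kind is only a starting point: the method summarised above additionally propagates the between-area variability of the ratio into the prediction error, which this sketch does not attempt.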

Authors: Chatignoux Édouard, Remontet Laurent, Iwaz Jean, Colonna Marc, Uhry Zoé
Biostatistics, 2019, vol. 20, no. 3, pp. 452-467