Modeling dengue incidence using time series models with exogenous covariates: climate, effective reproduction number, and Twitter data


  • Julio Cesar de Azevedo Vieira


17/04/2018 - 14:00


Sala da Congregação, FGV EMAp (5th floor)


Dengue fever is an infectious disease affecting subtropical countries. Local health departments use the number of notified cases to monitor and predict epidemics. This work focuses on modeling the weekly incidence of dengue fever in four cities in the state of Rio de Janeiro: Rio de Janeiro, São Gonçalo, Campos dos Goytacazes, and Petrópolis. Time series models are often used to predict the number of cases in upcoming cycles (weeks, months); in particular, SARIMA (Seasonal Auto-Regressive Integrated Moving Average) models have been shown to perform well in distinct settings. Alternative models also include climate covariates to improve the quality of the forecasts. However, models that use only historical and climate data may not have sufficient information to capture the change from a non-epidemic to an epidemic regime. Two reasons are that case notifications are delayed and that there may have been no epidemics in previous years. Based on the INFODENGUE monitoring system, we argue that data including the "effective reproduction number of mosquitoes" (RT) and the "number of tweets referring to dengue" (tweets) may improve the quality of forecasts in the short (1 week) to long (8 weeks) range. We show that time series models including RT and climate information often outperform SARIMA models in terms of root mean squared predictive error (RMSE). Including Twitter data did not improve the RMSE.
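
The comparison criterion named in the abstract, RMSE evaluated per forecast horizon, can be illustrated with a minimal sketch. The function names and all numbers below are hypothetical and made up for illustration; they are not the author's code, data, or results, and the forecasts stand in for the output of whichever models are being compared.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rmse_by_horizon(actual, forecasts):
    """Map each horizon h (in weeks) to the RMSE of its h-step-ahead forecasts.

    forecasts[h] is assumed to hold predictions already aligned with `actual`.
    """
    return {h: rmse(actual, preds) for h, preds in forecasts.items()}

# Made-up weekly case counts and forecasts (illustration only).
observed = [10, 12, 15, 20]
baseline = {1: [11, 12, 14, 18], 8: [8, 9, 10, 12]}          # e.g. a model without covariates
with_covariates = {1: [10, 13, 15, 19], 8: [9, 11, 13, 17]}  # e.g. a model with covariates

baseline_scores = rmse_by_horizon(observed, baseline)
covariate_scores = rmse_by_horizon(observed, with_covariates)

# Print horizon, baseline RMSE, and covariate-model RMSE side by side.
for h in sorted(baseline_scores):
    print(h, round(baseline_scores[h], 3), round(covariate_scores[h], 3))
```

A lower RMSE at a given horizon indicates better forecasts at that lead time, so one model can win at 1 week and lose at 8 weeks.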

*Text submitted by the student.

Examination committee members:

  • Eduardo Fonseca Mendes (advisor) - FGV/EMAp
  • Flavio Codeço Coelho - FGV/EMAp
  • Aline Araujo Nobre - FIOCRUZ