8 Data Processing (Stage 5)
Data processing is where you spend your time in infectious disease surveillance
Data processing is seldom mentioned in classical surveillance literature, but it is a stage that significantly improves surveillance systems. Many people at all levels are involved in data processing within surveillance systems. In the past, data was transmitted monthly by mail in surveillance systems, whereas today, data flow mostly occurs through interfaces between software programs and databases. The way data is transmitted affects data quality and, consequently, the evaluation of the data. Data processing also includes the application of scripts that prepare the data for subsequent evaluation. An example is automated outbreak detection. Here, data is analyzed using an algorithm or machine learning to determine whether there is a high probability of an outbreak. The resulting dataset can be made available as open data, a publication format that has increased significantly in recent times.
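To make the idea of automated outbreak detection concrete, the following is a minimal sketch, not the algorithm of any particular surveillance system. It flags a week when its case count exceeds the mean of the preceding weeks plus a multiple of their standard deviation. The function name, the baseline length of 8 weeks, the threshold of 2 standard deviations, and the example counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_outbreak_weeks(counts, baseline_weeks=8, z_threshold=2.0):
    """Return indices of weeks whose count exceeds baseline mean + z * SD.

    counts: weekly case counts, oldest first.
    baseline_weeks: number of preceding weeks used as baseline (assumption).
    z_threshold: standard deviations above the mean that count as a signal
        (assumption).
    """
    flagged = []
    for i in range(baseline_weeks, len(counts)):
        # Use only the weeks directly before week i as the baseline.
        baseline = counts[i - baseline_weeks:i]
        threshold = mean(baseline) + z_threshold * stdev(baseline)
        if counts[i] > threshold:
            flagged.append(i)
    return flagged


# Invented example data: weekly case counts with a spike in the last week.
weekly_counts = [4, 5, 3, 6, 4, 5, 4, 5, 6, 15]
signals = flag_outbreak_weeks(weekly_counts)
print(signals)
```

A signal from such a rule only indicates that a week is unusual compared with the recent past. Whether it reflects a real outbreak still has to be assessed by a person.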
8.1 Typical ways of data processing
- Questionnaires on paper
- Software interface between laboratory information system and public health notification software
- Accessing a database via an API
- Querying persons with a survey app
- Keeping information in a large Excel sheet
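As an illustration of accessing a database via an API, the sketch below builds a query URL and parses a JSON response into case counts per region. The endpoint `https://surveillance.example.org/api/cases`, the parameter names, and the response format are hypothetical and do not refer to a real service. No network request is made here. The response body is an invented sample.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://surveillance.example.org/api/cases"


def build_query_url(disease, year, week):
    """Build the request URL for one disease and one reporting week."""
    params = {"disease": disease, "year": year, "week": week}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_case_counts(response_text):
    """Turn a JSON response into a {region: count} dictionary."""
    records = json.loads(response_text)
    return {record["region"]: int(record["count"]) for record in records}


url = build_query_url("measles", 2024, 12)

# Invented response body in the assumed format. A real call would fetch it
# from the URL, for example with urllib.request.urlopen(url).
sample_response = '[{"region": "A", "count": 3}, {"region": "B", "count": 0}]'
counts_by_region = parse_case_counts(sample_response)
print(url)
print(counts_by_region)
```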
8.2 Comparing characteristics of manual vs electronic data processing
Characteristic | Manual | Electronic |
---|---|---|
Ease | Easy | Complicated |
Internal validity | Low | High |
Speed | Slow | Fast |
Scalability | Low | High |
Setup costs | Low | High |
Operational costs | High | Low |
Acceptability | Low | High |
Understandability | high | low |
8.3 Problems of data processing
8.3.1 Dependence on the user interface of software
The user interface of the software influences the data that is entered. If the variables to be entered are placed in different positions and described differently, one can expect different results.
8.3.2 Dependence on classification systems in manual systems
8.3.3 Dependence on classification systems in automatic systems
- SNOMED CT
- FHIR
8.3.4 Single point of truth
8.3.5 Errors in data handling by the software
- Excel changes specific values to dates automatically
- Excel worksheets are limited to 1,048,576 rows (65,536 in the older .xls format)
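Both problems can be checked for before data is handed over to Excel. The following is a minimal sketch under stated assumptions: the regular expression is a heuristic for text that Excel may reinterpret as a date and is not exhaustive, and the function names and example values are invented for illustration. The row limit of 1,048,576 applies to worksheets in the .xlsx format.

```python
import re

EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet in the .xlsx format

# Heuristic patterns for text that Excel may reinterpret as a date, such as
# "1-2", "3/4" or "MARCH1". This list is an assumption and not exhaustive.
DATE_LIKE = re.compile(
    r"^(\d{1,2}[-/]\d{1,2}([-/]\d{2,4})?"
    r"|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[- ]?\d{1,2})$",
    re.IGNORECASE,
)


def date_like_values(values):
    """Return the values that match the date-like heuristic."""
    return [v for v in values if DATE_LIKE.match(v.strip())]


def fits_in_one_sheet(n_data_rows, header_rows=1):
    """Check whether the data rows plus header rows fit into one worksheet."""
    return n_data_rows + header_rows <= EXCEL_MAX_ROWS


# Invented example values, as they might appear in a code or identifier column.
codes = ["A15.0", "1-2", "MARCH1", "B05", "3/4"]
at_risk = date_like_values(codes)
print(at_risk)
print(fits_in_one_sheet(2_000_000))
```

Values flagged this way can be exported as explicit text, and datasets that do not fit can be split or kept in a database instead.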
8.4 Open Data and Public Engagement
Recent trends in data publication have expanded assessment beyond designated experts:
Advantages:
- Enables independent verification
- Increases transparency
- Potentially leads to novel insights
Challenges:
- Risk of misinterpretation by non-experts
- Necessitates clear communication of data limitations and context
The control group must:
- be representative of exposures in the source population
- be identifiable as cases
- have the same exclusion and restriction criteria as cases