Data entry quality of double data entry vs automated form processing technologies: A cohort study validation of optical mark recognition and intelligent character recognition in a clinical setting
Journal article, Peer reviewed
Published version
Date
2020
Collections
- Department of Clinical Medicine [2104]
- Registrations from Cristin [10482]
Abstract
Background and Aims
Patient-reported outcome measures (PROMs) are increasingly used in health services. Paper forms are still often used to register such data. Manual double data entry (DDE) has been defined as the gold standard for transferring data to an electronic format but is laborious and costly. Automated form processing (AFP) is an alternative, but validation in a clinical context is warranted. The study objective was to examine and validate a local hospital AFP setup.
Methods
Patients over 18 years of age who were scheduled for knee or hip replacement at Stavanger University Hospital from 2014 to 2017 and who answered PROMs were included in the study. All paper PROMs were scanned and processed with the AFP techniques of optical mark recognition (OMR) and intelligent character recognition (ICR), and were also entered by DDE by health secretaries using a data entry program. OMR and ICR were used to capture different types of data. The main outcome was the proportion of correctly entered numbers, assessed at the data field, item, and PROM level. An entry was considered correct if AFP and DDE recorded the same response or, where they differed, if it matched the original paper questionnaire.
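The outcome definition above can be illustrated with a minimal Python sketch. It is not the study's actual software: the function name, the dictionary layout, and the field names (`q1`, `q2`, `q3`) are hypothetical, and the adjudication rule (agreement counts as correct, disagreement is resolved against the paper form) is one reading of the definition given in the abstract.

```python
def count_entry_errors(afp, dde, paper):
    """Count entry errors per method for one questionnaire.

    afp, dde, paper: dicts mapping a field name to the recorded response.
    Assumed rule: if AFP and DDE agree, both are taken as correct;
    if they differ, the paper form decides which one is wrong.
    """
    errors = {"afp": 0, "dde": 0}
    for field, afp_value in afp.items():
        dde_value = dde[field]
        if afp_value == dde_value:
            continue  # agreement: counted as correct for both methods
        truth = paper[field]  # disagreement: consult the original paper form
        if afp_value != truth:
            errors["afp"] += 1
        if dde_value != truth:
            errors["dde"] += 1
    return errors


# Hypothetical questionnaire with three fields (illustrative values only).
afp = {"q1": 3, "q2": 1, "q3": 5}
dde = {"q1": 3, "q2": 2, "q3": 4}
paper = {"q1": 3, "q2": 2, "q3": 6}
result = count_entry_errors(afp, dde, paper)
print(result)
```

Under this rule, a field where both methods record the same wrong value would go undetected, which is a known limitation of comparing two entry methods against each other.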
Results
A total of 448 questionnaires from 255 patients were analyzed. There was no statistically significant difference in error proportions per 10 000 data fields between OMR and DDE for data from check boxes (3.52 [95% confidence interval (CI) 2.17-5.72] and 4.18 [95% CI 2.68-6.53], respectively; P = .61). The error proportion for ICR (nine errors) was statistically significantly higher than that for DDE (two errors), that is, 3.53 (95% CI 1.87-6.57) vs 0.78 (95% CI 0.22-2.81) per 100 data fields/items/questionnaires; P = .033. OMR (0.04% errors) outperformed ICR (3.51% errors; P < .001, Fisher's exact test).
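As a worked illustration of how such error proportions and intervals can be computed, the following Python sketch applies a Wilson score interval to the ICR and DDE error counts. Two things here are assumptions rather than facts from the abstract: the denominator of 255 fields (inferred from nine errors corresponding to 3.53 per 100, and from the 255 patients) and the choice of the Wilson method, which the abstract does not name. Under these assumptions the sketch gives bounds close to those reported.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(errors, n, confidence=0.95):
    """Wilson score interval for a binomial proportion errors / n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


N_FIELDS = 255  # assumption: denominator inferred, not stated in the abstract

intervals = {}
for label, errors in (("ICR", 9), ("DDE", 2)):
    low, high = wilson_interval(errors, N_FIELDS)
    intervals[label] = (100 * errors / N_FIELDS, 100 * low, 100 * high)
    rate, lo_pct, hi_pct = intervals[label]
    print(f"{label}: {rate:.2f} per 100 (95% CI {lo_pct:.2f}-{hi_pct:.2f})")
```

The Wilson interval is a common choice for small error counts because, unlike the simple normal approximation, its lower bound does not fall below zero.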
Conclusions
OMR can produce an error rate that is comparable to that of DDE. In our setup, ICR is still problematic and is highly dependent on manual validation. When AFP is used, data quality should be tested and documented.