A database query is a data correction request generated by the sponsor's data management team after reviewing data submitted by the research site. Queries are raised when submitted data is incomplete, inconsistent with other entries, outside expected ranges, or in violation of protocol rules. Every query requires the coordinator to investigate the source record, provide a correction or explanation, and close the query. That process takes 20 to 30 minutes per query, and the documentation overhead compounds at high query volume.
How Query Rates Affect Site Selection
CROs track query rates per site and use them as a data quality signal in site selection decisions. A site generating 3 queries per 100 data points submitted is performing well. A site generating 12 queries per 100 data points is signaling to the sponsor that its data collection process is unreliable. Beyond the immediate time cost of resolution, high query rates suggest that the site's source documentation practices, data entry protocols, and coordinator training are insufficient. Those concerns extend to the integrity of all data the site submits and to its suitability for regulatory submissions.
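To make the metric concrete, here is a minimal Python sketch that computes queries per 100 data points and a rough range of coordinator hours, using the 20 to 30 minutes per query mentioned earlier. The function names and the sample counts (two sites each submitting 5,000 data points) are hypothetical and chosen only to reproduce the 3 and 12 per 100 figures above.

```python
from typing import Tuple


def query_rate_per_100(queries: int, data_points: int) -> float:
    """Queries raised per 100 data points submitted."""
    if data_points <= 0:
        raise ValueError("data_points must be positive")
    return 100 * queries / data_points


def resolution_hours(
    queries: int, min_minutes: int = 20, max_minutes: int = 30
) -> Tuple[float, float]:
    """Low and high estimate of coordinator hours spent resolving queries."""
    return queries * min_minutes / 60, queries * max_minutes / 60


# Hypothetical sites, each submitting 5,000 data points.
site_a_rate = query_rate_per_100(queries=150, data_points=5000)  # 3 per 100
site_b_rate = query_rate_per_100(queries=600, data_points=5000)  # 12 per 100

site_a_hours = resolution_hours(150)  # 50 to 75 hours
site_b_hours = resolution_hours(600)  # 200 to 300 hours
```

Under these assumed volumes, the gap between the two sites is 150 to 225 coordinator hours, which is the time cost a reviewer sees behind the rate itself.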
The Two Primary Causes of High Query Rates
The first cause is transcription errors. Coordinators transcribing data from paper source documents into the electronic data capture (EDC) system make mistakes: transposed numbers, missed decimal points, incorrect dates. The transcription step itself is the source of risk, so removing the step removes the errors it generates. The second cause is process gaps: inconsistent source documentation, variable assessment procedures, and protocol compliance failures that produce data the EDC's validation checks flag as invalid.
What Standardized Data Collection Reduces
Standardized data collection workflows that capture data at the point of care, using digital forms that validate entries in real time before submission, remove the transcription step and with it the transcription error category. Required fields with format validation catch missing data and incorrect formats before the form is saved. Range checks flag values outside protocol-expected ranges at entry rather than after EDC submission. Sites that have moved from paper source document transcription to point-of-care digital entry consistently report significant reductions in query rates.
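The three checks described here (required fields, format validation, and range checks) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any particular EDC or eSource product's API: FieldRule, validate_entry, the field names, and the numeric ranges are all hypothetical, and real ranges would come from the study protocol.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Optional


@dataclass
class FieldRule:
    name: str
    parse: Callable[[str], object]  # raises ValueError on a bad format
    required: bool = True
    low: Optional[float] = None     # inclusive lower bound, if any
    high: Optional[float] = None    # inclusive upper bound, if any


def validate_entry(entry: Dict[str, str], rules: List[FieldRule]) -> List[str]:
    """Return a list of problems; an empty list means the form may be saved."""
    problems: List[str] = []
    for rule in rules:
        raw = entry.get(rule.name, "").strip()
        # Required-field check.
        if not raw:
            if rule.required:
                problems.append(f"{rule.name}: required field is missing")
            continue
        # Format check: the parser rejects malformed input.
        try:
            value = rule.parse(raw)
        except ValueError:
            problems.append(f"{rule.name}: invalid format {raw!r}")
            continue
        # Range checks against the (hypothetical) protocol-expected bounds.
        if rule.low is not None and value < rule.low:
            problems.append(f"{rule.name}: {value} is below minimum {rule.low}")
        if rule.high is not None and value > rule.high:
            problems.append(f"{rule.name}: {value} is above maximum {rule.high}")
    return problems


# Hypothetical rules for illustration only.
RULES = [
    FieldRule("visit_date", date.fromisoformat),
    FieldRule("systolic_bp", float, low=70, high=250),
    FieldRule("weight_kg", float, low=30, high=250),
    FieldRule("comments", str, required=False),
]

# An invalid month, a missed decimal point, and a blank required field.
entry = {"visit_date": "2024-13-02", "systolic_bp": "1200", "weight_kg": ""}
problems = validate_entry(entry, RULES)
```

In this sketch each of the three bad values produces one message at entry time, so the coordinator can correct it against the source while the participant is still present instead of answering a query later.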