Field Methods. 2012 Aug 1;24(3):272-291.
doi: 10.1177/1525822X12443097. Epub 2012 Apr 26.

Data Quality in web-based HIV/AIDS research: Handling Invalid and Suspicious Data

Jose Bauermeister et al. Field Methods.

Abstract

Invalid data may compromise data quality. We examined how decisions taken to handle these data may affect the relationship between Internet use and HIV risk behaviors in a sample of young men who have sex with men (YMSM). We recorded 548 entries during the three-month period and created six analytic groups (i.e., full sample, entries initially tagged as valid, suspicious entries, valid cases mislabeled as suspicious, fraudulent data, and total valid cases) using data quality decisions. We compared these groups on the sample's composition and their bivariate relationships. Forty-one cases were marked as invalid, affecting the statistical precision of our estimates but not the relationships between variables. Sixty-two additional cases were flagged as suspicious entries and found to contribute to the sample's diversity and observed relationships. Using our final analytic sample (N = 447; M = 21.48 years old, SD = 1.98), we found that very conservative criteria regarding data exclusion may prevent researchers from observing true associations. We discuss data quality decisions and their implications for the design of future HIV/AIDS web surveys.


Figures

Figure 1
Decomposition of the sample to ensure adequate final analytic sample.
Notes. Adjusted completion rates are calculated taking full missing data out of the estimation.
Completion rate = 488/548 = 89.05%
Adjusted completion rate using Group A = 385/488 = 78.89%
Adjusted completion rate using Group E = 447/488 = 91.60%
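
The rates in the Figure 1 notes can be reproduced from the reported counts. The following is a minimal sketch, not the authors' code: the variable names are hypothetical, and reading Group A as the entries initially tagged as valid and Group E as the total valid cases is an assumption based on the order of the groups in the abstract. The two differences computed at the end happen to match the 41 invalid and 62 suspicious cases mentioned in the abstract, which is consistent with that reading but does not confirm it.

```python
# Counts as reported in the abstract and the Figure 1 notes.
total_entries = 548   # all entries recorded during the three-month period
with_data = 488       # entries remaining once fully missing data are removed
group_a = 385         # assumed: entries initially tagged as valid (Group A)
group_e = 447         # assumed: total valid cases (Group E), the final sample


def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to 2 decimals."""
    return round(100 * numerator / denominator, 2)


completion = rate(with_data, total_entries)   # completion rate
adjusted_a = rate(group_a, with_data)         # adjusted rate using Group A
adjusted_e = rate(group_e, with_data)         # adjusted rate using Group E

# Differences between the counts (interpretation is an assumption, see above).
excluded_from_e = with_data - group_e         # entries not kept in Group E
added_to_a = group_e - group_a                # entries in Group E but not Group A

print(f"Completion rate:            {completion:.2f}%")
print(f"Adjusted rate (Group A):    {adjusted_a:.2f}%")
print(f"Adjusted rate (Group E):    {adjusted_e:.2f}%")
print(f"Excluded from Group E:      {excluded_from_e}")
print(f"Added between Group A and E: {added_to_a}")
```

Under these assumptions, retaining the suspicious-but-valid entries raises the adjusted completion rate by roughly 13 percentage points compared with keeping only the entries initially tagged as valid.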

