Rigorous methods for developing and validating surveys, scales, and assessments
Developing valid and reliable measurement instruments — surveys, questionnaires, and scales — is foundational to research in education and psychology. Poor measurement can lead to misleading conclusions regardless of how sophisticated the analysis.
Our lab applies rigorous psychometric methods throughout the scale development process: from item generation and pilot testing to factor analysis, reliability estimation, and validity evaluation. We emphasize not just classical reliability coefficients, but modern evidence-based validation frameworks.
Applications span academic and non-academic constructs including motivation, self-efficacy, attitudes toward statistics, and various psychological and educational outcomes.
Representative recent publications from this research area
Ma, W., Preast, J. L., Sanders, S., Hester, O. R., Jolivette, K., Shelton, S. A., Odom, K. P., Prewitt, N. B., & Pitzel, A. (Accepted) · DOI: 10.1177/15345084261434199
Bullying research has largely focused on traditional K–12 schools, leaving restrictive educational settings — juvenile justice facilities and residential treatment centers — without validated measurement tools. These environments present unique challenges, as highly structured conditions restrict youth autonomy, intensifying peer conflict dynamics and complicating standard survey administration.
This study validates a modified version of the Illinois Bully/Victim/Fight Scale in restrictive settings using Exploratory Structural Equation Modeling (ESEM), which allows for cross-loadings between factors rather than forcing simple structure. ESEM is compared to traditional CFA to demonstrate the advantages of the more flexible approach.
Findings: A four-factor structure (bullying, victimization, fighting, and anger) fit adequately under ESEM but not under traditional CFA. Several items loaded meaningfully on more than one factor, and internal consistency was acceptable (ω = .74–.93), supporting cautious use of the scale in this population.
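As a rough illustration of how an ω coefficient like those reported above can be computed, the sketch below applies the standard formula for a single factor with standardized loadings and uncorrelated residuals. The loadings are hypothetical, not values from the study, and the function name `mcdonald_omega` is ours. An ESEM solution with cross-loadings may call for a more involved calculation than this simple case.

```python
def mcdonald_omega(loadings):
    """Omega for one factor, assuming standardized loadings and
    uncorrelated residuals (a simplifying assumption)."""
    if not loadings:
        raise ValueError("need at least one loading")
    # Variance attributable to the common factor: (sum of loadings)^2
    common = sum(loadings) ** 2
    # Residual variance of each standardized item: 1 - loading^2
    unique = sum(1.0 - l ** 2 for l in loadings)
    return common / (common + unique)


# Hypothetical loadings for a four-item subscale (not from the study).
example_loadings = [0.70, 0.60, 0.80, 0.50]
omega = mcdonald_omega(example_loadings)
print(f"omega = {omega:.2f}")  # prints "omega = 0.75"
```

Under these assumptions, higher and more uniform loadings push ω toward 1, while weak loadings inflate the residual term and lower it.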
Gungordu, N., Nabizadehchianeh, G., O'Connor, E., Ma, W., & Walker, D. I. (2024) · DOI: 10.1080/10508422.2023.2206573
The Defining Issues Test-2 (DIT2) is a widely used instrument for assessing moral reasoning development based on neo-Kohlbergian theory. Despite its broad use in research and applied settings, comprehensive normative data had been absent — making it difficult to interpret individual or group scores in context.
Using a large archival database maintained by the Center for the Study of Ethical Development (N = 73,740; age range 12–95), this study establishes DIT2 norms by education level, by gender × education, and by gender × age. The norms document how moral reasoning scores vary systematically across demographic groups.
Findings: Moral reasoning scores generally increase with education, though not consistently. Personal Interest and Maintain Norms scores are higher for males, while Postconventional, N2, and Type indicator scores are higher for females across all education levels and age groups.
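To make the idea of group norms concrete, here is a minimal sketch of how a normative table might be built and then used to place one score in context. The records, the education labels, and the function names are all hypothetical. The percentile step assumes scores are roughly normal within a group, which the study does not claim.

```python
from statistics import NormalDist, mean, stdev


def build_norms(records):
    """Group (level, score) pairs and return {level: (n, mean, sd)}."""
    groups = {}
    for level, score in records:
        groups.setdefault(level, []).append(score)
    return {lvl: (len(s), mean(s), stdev(s)) for lvl, s in groups.items()}


def percentile_in_group(score, norms, level):
    """Approximate percentile rank of a score within one norm group,
    assuming an approximately normal distribution in that group."""
    _, group_mean, group_sd = norms[level]
    return 100 * NormalDist(group_mean, group_sd).cdf(score)


# Hypothetical (education level, score) pairs, not DIT2 data.
records = [
    ("high school", 30.0), ("high school", 34.0), ("high school", 38.0),
    ("college", 40.0), ("college", 44.0), ("college", 48.0),
]
norms = build_norms(records)

# The same raw score reads very differently against different norm groups.
pct_hs = percentile_in_group(38.0, norms, "high school")
pct_college = percentile_in_group(38.0, norms, "college")
print(f"high school: {pct_hs:.1f}, college: {pct_college:.1f}")
```

The point of the sketch is only that a raw score has no fixed meaning until it is compared with the appropriate reference group.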
Chou, S-Y., Ma, W., & Britt, R. (2023) · DOI: 10.1080/19376529.2022.2044818
Podcast listenership has grown rapidly, but measurement tools to understand why people tune in are scarce. This study develops and validates a motivations scale grounded in uses and gratifications theory, using data from Taiwanese podcast listeners to capture dimensions such as entertainment, information seeking, social interaction, and personal identity.
Exploratory and confirmatory factor analyses were used to refine items, and measurement invariance was tested across gender and age groups. The resulting scale shows good reliability and validity evidence, offering researchers a versatile tool for examining podcast consumption motives.
Findings: A four-factor structure provided the best fit, with invariance supported across demographics. The scale can inform content creators and marketers about listener needs and guide future media research.
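Measurement invariance is usually tested by fitting increasingly constrained multi-group models and checking whether fit worsens meaningfully at each step. The sketch below applies one commonly cited rule of thumb, a CFI drop of no more than .01 between adjacent models. The fit values are hypothetical, the function name is ours, and the summary above does not say which criterion the study used.

```python
def compare_invariance_steps(fits, max_cfi_drop=0.01):
    """Compare adjacent nested models by change in CFI.

    fits: list of (label, cfi), ordered from least to most constrained.
    Returns (label, cfi_drop, supported) for each step after the first.
    """
    results = []
    for (_, prev_cfi), (label, cfi) in zip(fits, fits[1:]):
        drop = prev_cfi - cfi
        results.append((label, round(drop, 3), drop <= max_cfi_drop))
    return results


# Hypothetical CFI values for a gender comparison (not from the study).
hypothetical_fits = [
    ("configural", 0.960),  # same factor structure in each group
    ("metric", 0.957),      # loadings constrained equal
    ("scalar", 0.952),      # loadings and intercepts constrained equal
]
steps = compare_invariance_steps(hypothetical_fits)
for label, drop, supported in steps:
    status = "supported" if supported else "not supported"
    print(f"{label}: CFI drop = {drop:.3f} ({status})")
```

In practice researchers often weigh several indices together (for example, changes in RMSEA or a chi-square difference test), so a single cutoff like this should be read as a screening heuristic.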