- The purposes of this study were to (a) provide insight into the use of item response theory (IRT) with psychomotor skills, (b) assess the psychometric properties of the Test of Gross Motor Development (TGMD) using IRT, and (c) provide a basis for future studies of the TGMD using IRT. The dichotomously scored TGMD measures psychomotor skills in a framework similar to that of cognitive tests, thus providing a convenient "transitional" test for examining the use of IRT with psychomotor skill tests. The present study employed the data used by Ulrich (1985) in the original psychometric analysis of the TGMD. The data consisted of 913 subjects aged 3 to 10 years, nonhandicapped and 20 mildly mentally handicapped. Because IRT cannot provide accurate ability estimates at 0% and 100% mastery, 32 subjects were deleted from the record. Because the TGMD was found to be multidimensional, the test was analyzed by subtest so as not to violate the unidimensionality assumption of IRT. Comparison of traditional item statistics from classical test theory (CTT) with IRT item parameters revealed that the CTT and IRT indices of item difficulty, and likewise of item discrimination, were closely related. The locomotor IRT difficulty parameters showed a high negative correlation (r = -.87) with the CTT difficulty statistics, while the object control IRT difficulty parameters showed a very high negative correlation (r = -.98) with their CTT counterparts. IRT discrimination parameters correlated highly with CTT discrimination statistics within the locomotor (r = .91) and the object control (r = .94) subtests. IRT analysis revealed that the locomotor subtest was less difficult (median difficulty = -.944) than the object control subtest (median difficulty = .053), and that the object control subtest displayed a better discrimination index (median = 2.17) than the locomotor subtest (median = 1.54).
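As an illustration of the classical item statistics discussed above, the following minimal sketch computes a CTT difficulty index and a discrimination index for dichotomously scored items. It rests on assumptions not stated in the abstract: difficulty is taken to be the proportion of examinees passing an item, and discrimination the correlation between an item and the total of the remaining items. The function names and the toy score matrix are hypothetical and are not the TGMD data. If the CTT difficulty statistic is indeed a proportion passing, higher values indicate easier items, which would be consistent with the negative correlations with the IRT difficulty parameters reported above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ctt_item_stats(scores):
    """Return one (difficulty, discrimination) pair per item.

    scores: list of rows, one row per examinee, each entry 0 or 1.
    Difficulty is the proportion passing (an assumption; see lead-in).
    Discrimination is the item-rest correlation (also an assumption).
    """
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        proportion_passing = sum(item) / len(item)
        # Remove the item from the total so it is not correlated with itself.
        rest = [total - score for total, score in zip(totals, item)]
        stats.append((proportion_passing, pearson(item, rest)))
    return stats


# Hypothetical toy data: 6 examinees by 3 items (not the TGMD data).
TOY_SCORES = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]

if __name__ == "__main__":
    for index, (p, r) in enumerate(ctt_item_stats(TOY_SCORES), start=1):
        print(f"item {index}: difficulty={p:.3f}, discrimination={r:.3f}")
```

The item-rest form is used here only because it avoids inflating the correlation with the item's own score; the abstract does not say which CTT discrimination statistic was computed.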
In addition to difficulty and discrimination indices, IRT also provided the amount of information given by each item and subtest, which indicates the precision with which various ability levels are measured. The locomotor subtest information was reported at I = 15.50, indicating adequate precision to measure low ability (Θ = -1.857). The object control information function showed that the subtest provided more information (I = 18.24) at a slightly higher ability level (Θ = -1.643). The results of the item analysis revealed that all items (behavioral criteria) of the hop, leap, and overhand throw displayed effective psychometric properties, whereas 9 of the 12 skills contained items that displayed poor psychometric characteristics and/or did not fit the two-parameter model. The run (items 1, 3, and 4), gallop (items 6 and 8), horizontal jump (item 18), skip (item 20), slide (items 23, 24, 25, and 26), strike (items 27, 28, and 29), stationary bounce (item 32), catch (items 34 and 35), and kick (item 38) should be revised. Because the TGMD is also used as a criterion-referenced test, the decision validity of the mastery classification cut-off scores was analyzed. For the purposes of these analyses, true mastery state was determined by IRT because it provides an estimate of underlying ability. IRT and CTT showed high agreement in classifying masters and nonmasters at the 70% and 85% levels of mastery. The locomotor subtest yielded decision validity coefficients of .93 and .99 for the 70% and 85% mastery levels, respectively. The object control subtest yielded higher decision validity coefficients of .99 and .997 for the 70% and 85% mastery levels, respectively. The TGMD subtests were found to measure very low mastery levels best: the ability levels measured with the greatest precision corresponded to 30% and 45% mastery for the locomotor and object control subtests, respectively.
Item response theory has been successfully employed in the cognitive and affective domains and shows great promise for the psychomotor domain. The present study provides evidence that the IRT two-parameter logistic model offers an effective psychometric analysis of dichotomously scored psychomotor skills. The theory addresses many of the shortcomings of CTT, such as the inability to generalize item statistics across populations and to determine the contribution of each test item independently. The invariance property of IRT is particularly appealing to those who must assess atypical populations, because a single test can accommodate various populations and wide ranges of ability. The results of this study indicate that the characteristics of IRT are well suited to improving measurement and evaluation in the psychomotor domain.