Introduction

Questionnaires for health-related quality-of-life (HRQoL) measurement are important for several reasons. First, they may be used to compare the mean levels of different patient groups with respect to physical, mental, and social health. A researcher may want to find out whether these patient groups have different needs with respect to, for example, therapy or medication, or whether different adaptations of their environment are needed to improve their living conditions. Second, HRQoL questionnaires are important for measuring the mean change (either progress or deterioration) of such groups due to, for example, therapy. The researcher’s interest then lies in the effectiveness of therapy with respect to HRQoL. Third, the total score a patient obtains on an HRQoL questionnaire may be used to diagnose this patient’s general level of physical health and psychological well-being, for example, to estimate the budget needed for his/her treatment during a particular period.

To measure HRQoL effectively, we argue that an instrument must meet two requirements. The first requirement is that it is clear what the instrument measures: one overall dimension of HRQoL or several dimensions reflecting different aspects of HRQoL. If the instrument measures one dimension, one can use the total score on all items to obtain an impression of the overall level of HRQoL. If the instrument measures multiple dimensions, it may be advisable to determine total scores on subsets of items (e.g., domain scores), each reflecting a particular aspect of HRQoL (e.g., HRQoL with respect to physical, psychological, and social limitations), and then to assess individuals or compare groups on a profile of scores. These two cases may be characterized as unidimensional and multidimensional measurement. The second requirement is that the psychometric properties of the items are known and found sufficient.
One important psychometric item property is the item’s location on the scale that quantifies the HRQoL aspect of interest. For example, patients are likely to experience fewer problems when engaging in activities like bathing and dressing than in more demanding activities such as shopping and travelling. The items concerning bathing and dressing require a lower level of physical functioning than the other two items. Thus, bathing and dressing are located further to the left (at a lower level of the scale) than shopping and travelling. A good diagnostic HRQoL instrument contains items whose locations are widely spread along the scale. Such a scale allows for measurement at varying levels of physical functioning and may be used, for example, for assessing mean differences between groups, mean change due to therapy, and individual patients’ levels of physical functioning.

Our point of view is that, given that the researcher has formulated desirable measurement properties, (s)he should construct his/her scale by means of an IRT model that is as general as possible while satisfying the desired measurement properties. Examples of such properties are that the items measure the same dimension, that the measurement level is at least ordinal, and that measurement values are reliable. An HRQoL researcher who has constructed and pre-tested a questionnaire consisting of, say, 40 items is not served well when his/her data are analyzed by means of an IRT model that is unnecessarily restrictive, with the result that, say, half of the items are discarded. We will argue that the most general IRT model that serves one’s purposes well is often (but not always) a nonparametric IRT model.
Definitions and assumptions

The total score on the $J$ items, where $X_j$ denotes the score on item $j$ ($j = 1, \ldots, J$), is

$$ X_+ = \sum\nolimits_{j = 1}^J X_j . $$

For item scores $m = 1, \ldots, M$, the item step response function (ISRF) is the probability $P(X_j \ge m \mid \theta)$, denoted $P_{jm}(\theta)$.

Fig. 1 (panels a and b)

Fig. 2 (panels a and b)

Parametric and nonparametric graded response models

Parametric Graded Response Model. The parametric GRM has a location parameter $\delta_{jm}$ for each ISRF and a slope parameter $\alpha_j$ for each item. The meaning of these parameters is explained after the ISRF of the parametric GRM is introduced. This ISRF is defined as

$$ P_{jm}(\theta) = \frac{\exp[\alpha_j(\theta - \delta_{jm})]}{1 + \exp[\alpha_j(\theta - \delta_{jm})]}, \quad j = 1, \ldots, J; \; m = 1, \ldots, M. $$

Nonparametric Graded Response Model. For two values of the latent variable, $\theta_a$ and $\theta_b$, monotonicity of the ISRFs means that

$$ P_{jm}(\theta_a) \le P_{jm}(\theta_b), \quad \text{whenever } \theta_a < \theta_b . $$

For two values of the total score, $v$ and $w$, the expected value of $\theta$ is ordered by $X_+$:

$$ E(\theta \mid X_+ = v) \le E(\theta \mid X_+ = w), \quad \text{for } 0 \le v < w \le J. $$

Nonparametric versus parametric graded response models

Relationships Among Models

Table 1 Comparison of monotone homogeneity model (MHM) and graded response model (GRM)

 | Nonparametric IRT (MHM) | Parametric IRT (GRM)
Restrictiveness of models | Low; many items admitted to the scale | High; fewer items admitted to the scale
Interpretation of parameters | Intuitively appealing | More complicated
Parameters (typical range) | |
  Person level | $T$, $X_+$ | $\theta$
  ISRF location | $\pi_{jm}$ | $\delta_{jm}$
  ISRF discrimination | $H_j$ | $\alpha_j$
Data analysis | Exploratory, data as point of departure | Confirmatory, model as null hypothesis
Applications | |

Patient and Item Parameters

Table 2 GRM item parameters and $H_j$ values under two distributions of $\theta$

Item | $\alpha_j$ | $\delta_{j1}$ | $\delta_{j2}$ | $\delta_{j3}$ | $H_j$ ($\theta$ distribution 1) | $H_j$ ($\theta$ distribution 2)
1 | 1.4 | −2.4 | −2.2 | −1.0 | 0.36 | 0.26
2 | 1.4 | −4.0 | −2.0 | 1.0 | 0.37 | 0.39
3 | 1.4 | −1.0 | 2.0 | 2.5 | 0.39 | 0.42
4 | 1.4 | −3.0 | −2.0 | −1.0 | 0.36 | 0.25
5 | 1.4 | −2.5 | −2.0 | −1.5 | 0.36 | 0.25

Fig. 3 Three ISRFs of the same item (Item 4) relative to two different distributions of the latent variable

Confirmatory and exploratory data analysis

Application of parametric and nonparametric IRT

Nonparametric IRT analysis in practice

Software for nonparametric IRT analysis

MSP

Fig. 4 (panels a and b)

TestGraf98

Research strategies for nonparametric IRT analysis.
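The item scalability coefficient $H_j$ that recurs in the tables can be illustrated numerically. The sketch below assumes that $H_j$ is the usual Mokken item coefficient: the sum of the observed covariances between item $j$ and the other items, divided by the sum of the maximum covariances that are attainable given the items' score distributions. The function name `item_scalability`, the toy data, and the lower bound c = 0.3 are hypothetical choices made for illustration; this is not the MSP implementation.

```python
import numpy as np


def item_scalability(scores: np.ndarray) -> np.ndarray:
    """Return an H_j value for every column of a persons-by-items score matrix.

    Assumed definition (Mokken's item coefficient): the summed observed
    covariances of item j with the other items, divided by the summed maximum
    covariances that are possible given each item's score distribution.
    """
    n_items = scores.shape[1]
    cov = np.cov(scores, rowvar=False)
    # Sorting every column separately pairs low scores with low scores and
    # high with high, which gives the largest covariance the marginals allow.
    cov_max = np.cov(np.sort(scores, axis=0), rowvar=False)
    h = np.empty(n_items)
    for j in range(n_items):
        others = np.arange(n_items) != j  # boolean mask excluding item j
        h[j] = cov[j, others].sum() / cov_max[j, others].sum()
    return h


# Hypothetical toy data: 8 patients, 3 items scored 0..4, consistently ordered.
toy = np.array([
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [2, 1, 1],
    [2, 2, 1], [3, 2, 2], [3, 3, 2], [4, 3, 3],
])
h_consistent = item_scalability(toy)

# Add a fourth item that mirrors item 1 (reversed scoring).
toy_reversed = np.column_stack([toy, 4 - toy[:, 0]])
h_reversed = item_scalability(toy_reversed)

c = 0.3  # illustrative lower bound on H_j, not taken from the text
print("H_j, consistent items:", np.round(h_consistent, 2))
print("H_j, with reversed item:", np.round(h_reversed, 2))
print("items with H_j >= c:", np.flatnonzero(h_reversed >= c) + 1)
```

In the toy data the three consistently ordered items each obtain $H_j = 1$, because their observed covariances already equal the maximum covariances. The mirrored item obtains $H_j = -1$ and also lowers the coefficients of the items it is scaled with, which is the kind of item a lower bound on $H_j$ is meant to screen out.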
A real-data example: The World Health Organization Quality-of-Life Scale

Do you have enough energy for daily life? (physical domain)
How much do you enjoy life? (psychological domain)
How satisfied are you with your personal relationships? (social domain)
How safe do you feel in your daily life? (environmental domain)

Results

Sample statistics of item and scale scores

Table 3 Mean item scores, MSP item selection procedure results, and item $H_j$ and scale $H$ values

Item | Content | Mean | $H_j$ values and MSP item selection results

Physical health and well-being
3 | a | 3.04 | .59 .59 .40
10 | a | 2.98 | .43 .53 .46
15 | Satisfied with sleep | 2.66 | .22 – – – – – – .28
25 | Moving around well | 3.46 | .36 .43 .41
16 | Satisfied doing daily activities | 2.84 | .41 .56 .52
4 | a | 3.17 | .59 .59 .43
17 | Satisfied work capacity | 2.89 | .40 .57 .52
Scale value | | 21.04 | .43

Psychological health and well-being
5 | Enjoying life | 2.66 | .34 .42 .37
7 | Being able to concentrate | 2.80 | .32 – – – – – .29
18 | Satisfied with yourself | 2.95 | .41 .48 .45
11 | Acceptance physical appearance | 3.23 | .33 – – – – – .35
26 | a | 2.72 | .30 – – – – – .34
6 | Life meaningful | 2.66 | .30 – – – – – .37
Scale value | | 17.01 | .36

Social relations
19 | Satisfied relationship with other people | 3.06 | .34 .50 .50
20 | Satisfied with sex life | 2.58 | .30 .42 .42
21 | Satisfied support from others | 2.88 | .28 – .40 .40
Scale value | | 8.52 | .44

Environment
8 | Feeling safe in daily life | 3.08 | .30 – – – – – .33
22 | Satisfied living conditions | 3.13 | .43 – – – – – .43
12 | Enough financial resources | 3.08 | .33 .53 .42
23 | Satisfied getting adequate health care | 2.87 | .29 – .52 .36
13 | Availability information needed in daily life | 3.06 | .34 .53 .40
14 | Opportunities leisure | 2.90 | .34 .49 .39
9 | Healthy environment | 2.88 | .29 – – – – – – .31
24 | Satisfied with transport in daily life | 3.28 | .34 .52 .40
Scale value | | 24.27 | .38

Dimensionality analysis

Principal components analysis.

Monotone homogeneity model analysis.
Monotonicity assessment in each item domain

For each item domain, MSP was used to assess the ISRFs’ shapes. First, ISRFs were estimated accurately (i.e., many cases were used to estimate separate points of the ISRFs) but at the expense of possible bias (i.e., only few points were estimated). Second, ISRFs were estimated with little bias (i.e., many points were estimated) but at the expense of accuracy (i.e., few cases were used to estimate each point).

Fig. 5 (panels a, b, and c)

Fig. 6 (panels a, b, and c)

Comparison of the MHM with the GRM

Table 4 Results of monotone homogeneity model (MHM) scale analysis and estimated item parameters from the graded response model (GRM)

Item | Content | MHM: $H_j$ | $\pi_{j1}$ | $\pi_{j2}$ | $\pi_{j3}$ | $\pi_{j4}$ | GRM: $\alpha_j$ | $\delta_{j1}$ | $\delta_{j2}$ | $\delta_{j3}$

Physical health and well-being
3 | b | .40 | .99 | .93 | .72 | .40 | 1.14 | −2.71 | −0.99 | 0.49
10 | a | .46 | .98 | .95 | .79 | .45 | 1.97 | −2.24 | −0.63 | 0.56
15 | Satisfied with sleep | .28 | .99 | .95 | .70 | .34 | 0.85 | −2.09 | −0.79 | 1.78
25 | Moving around well | .41 | .99 | .96 | .91 | .61 | 1.23 | −3.08 | −2.24 | −0.41
16 | Satisfied doing daily activities | .52 | .98 | .93 | .72 | .21 | 3.96 | −1.59 | −0.58 | 0.87
4 | a | .43 | .98 | .92 | .75 | .23 | 1.23 | −2.82 | −1.29 | 0.24
17 | Satisfied work capacity | .52 | .99 | .98 | .77 | .20 | 3.58 | −1.56 | −0.69 | 0.81
Scale value | | .43

Psychological health and well-being
5 | Enjoying life | .37 | 1.00 | .98 | .61 | .07 | 1.93 | −2.79 | −0.38 | 1.97
7 | Being able to concentrate | .29 | .99 | .98 | .77 | .20 | 0.82 | −4.27 | 0.68 | 1.69
18 | Satisfied with yourself | .45 | 1.00 | .93 | .64 | .15 | 1.99 | −2.85 | −0.97 | 1.12
11 | Acceptance physical appearance | .35 | .99 | .98 | .80 | .45 | 1.08 | −3.93 | −1.56 | 0.22
26 | a | .34 | 1.00 | .97 | .61 | .08 | 1.11 | −2.84 | −0.60 | 1.90
6 | Life meaningful | .37 | 1.00 | .96 | .62 | .23 | 1.90 | −2.71 | −0.35 | 1.94
Scale value | | .36

Social relations
19 | Satisfied relationship with other people | .50 | .99 | .96 | .82 | .29 | 2.52 | −2.21 | −1.10 | 0.97
20 | Satisfied with sex life | .40 | .97 | .88 | .58 | .15 | 1.32 | −3.19 | −0.90 | 1.38
21 | Satisfied support from others | .42 | .99 | .97 | .72 | .20 | 1.38 | −1.88 | −0.34 | 1.62
Scale value | | .44

Environment
8 | Feeling safe in daily life | .33 | 1.00 | .98 | .78 | .32 | 1.12 | −4.12 | −1.37 | 0.84
22 | Satisfied living conditions | .43 | .98 | .90 | .68 | .34 | 1.96 | −2.71 | −1.27 | 0.63
12 | Enough financial resources | .42 | .99 | .95 | .70 | .44 | 1.84 | −2.22 | −0.73 | 0.21
23 | Satisfied getting adequate health care | .36 | .99 | .97 | .68 | .23 | 1.32 | −2.84 | −0.93 | 1.33
13 | Availability information needed in daily life | .40 | .99 | .95 | .72 | .21 | 1.64 | −3.21 | −0.98 | 0.60
14 | Opportunities leisure | .39 | .99 | .98 | .75 | .34 | 1.61 | −1.94 | −0.69 | 0.60
9 | Healthy environment | .31 | .99 | .98 | .83 | .32 | 1.02 | −3.95 | −0.89 | 1.43
24 | Satisfied with transport in daily life | .40 | .99 | .97 | .86 | .45 | 1.66 | −2.77 | −1.55 | 0.17
Scale value | | .38

Summary of the scale properties

$$ 2.8 \times 1.96 \times \sqrt{2} \approx 8 $$

$$ 1.8 \times 1.96 \times \sqrt{2} \approx 5 $$

Discussion

In an HRQoL context, often little is known about the psychometric properties of new questionnaires. A typical nonparametric MHM analysis explores the dimensionality of the data by capitalizing on model assumptions such as monotonicity (MSP), and studies the shapes of the ISRFs and the ISFs in order to learn more about the (mal-)functioning of individual items (MSP and TestGraf98). This results in scales on which groups can be compared and changes monitored without making unduly restrictive assumptions about the data.
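The two products reported in the summary of the scale properties, 2.8 × 1.96 × √2 ≈ 8 and 1.8 × 1.96 × √2 ≈ 5, can be checked with a few lines of code. The sketch assumes (the surviving text does not confirm this) that 2.8 and 1.8 are standard errors of measurement of total scores, that 1.96 is the two-sided 5% critical value of the standard normal distribution, and that the factor √2 reflects the standard error of the difference between two independent scores with equal error variance. The function name `smallest_significant_difference` is hypothetical.

```python
import math


def smallest_significant_difference(sem: float, z: float = 1.96) -> float:
    """Smallest difference between two scores that exceeds z standard errors.

    Assumes both scores have the same standard error of measurement (sem) and
    independent errors, so the difference has standard error sem * sqrt(2).
    """
    return z * math.sqrt(2.0) * sem


# The two values that appear in the summary of the scale properties.
differences = {sem: smallest_significant_difference(sem) for sem in (2.8, 1.8)}

for sem, diff in differences.items():
    # Print the unrounded product next to the whole-score value it rounds to.
    print(f"value {sem}: {diff:.2f} score points, rounded {round(diff)}")
```

Under these assumptions the products round to 8 and 5 score points, matching the two displayed approximations.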