Design & Instruments

Adapting a psychological test is complex and involves far more than translating the test items. For this reason, the International Test Commission published the “Test Adaptation Guidelines” (TAG), which are intended to ensure a high-quality adaptation of psychological tests.

The following presentation of the process and methodology follows the TAG. Apart from adapting the test, the second main field of work is developing, testing and modifying the competence model. Both are developed iteratively: the test is adapted in close alignment with the competence model, and vice versa.


0. Development of a Competence Model:

The theoretical framework model for the assessment of students’ economic content knowledge was designed in the context of the WiwiKom project and can be described using a three-dimensional structure:

  1. The first dimension comprises structural assumptions regarding the cognitive demands of specialized knowledge (propositional, case-related and strategic knowledge).
  2. The second dimension deals with assumptions regarding the level of knowledge and is based on the taxonomy levels developed by Anderson & Krathwohl (2001) as well as Walstad et al. (2007).
  3. The third dimension differentiates between various content sub-domains (such as micro- and macro-economics, accounting, marketing etc.).

On the one hand, this structure models content-related sub-domains, which permits statements about the structure and dimensionality of content-specific knowledge (e.g. the differentiation between business administration and economics). On the other hand, it models cognitive demands, which permits structuring knowledge by cognitive requirements and making initial assumptions regarding classification.

1. Translation and adaptation:

In co-operation with the chair of Prof. Dr. Silvia Hansen-Schirra from the school of translation science, linguistics and cultural science of the Johannes Gutenberg University Mainz, a scientific, professional translation and cultural adaptation of the international test instruments, which comprise 403 economics items, is ensured.

2. Curricular Analysis, Online expert evaluation and expert interviews:

In co-operation with the National Educational Panel Study (NEPS, stage 7), a curricular analysis of the main contents of economics degree courses is being conducted. This analysis incorporates the curricula and module guides of 96 study courses at 40 universities and 24 schools of higher education (including the largest institutions in the field of economic sciences). Afterwards, the test items will be presented via an online survey to 78 experts (economic science professors and lecturers) from the surveyed institutions in order to verify the content-related and curricular validity of the analysis. Any critical feedback on specific items, as well as items already assessed as problematic, will then be reviewed again with 32 experts (also professors of economic sciences) in expert interviews held as individual or group sessions.

3. Cognitive Interviews:

This validation aspect encompasses the analysis of the test subjects’ response behavior. For this purpose, 30 test subjects underwent 120 cognitive interviews. To validate the match between the construct and individual response behavior, individual strategies for solving items are analyzed and evaluated. In this way, items can be adapted or removed, for example if a majority of subjects is found to misinterpret them. The validation process consists of assessing whether items were answered correctly using the expected cognitive processes or whether other, unwanted test-taking or even guessing strategies were applied (AERA et al., 2004, p. 12f.). The cognitive interviews are conducted using the think-aloud method during item response and standardized, specific, retrospective questions after item response. The cognitive interviews aim to establish the following: (1) item quality validation, (2) discriminant validity, (3) convergent validity. Additionally, the interviews serve to determine whether individual items are comprehensible enough (for example when graphical representations are used) and how much time subjects need to finish the test.

4. Pretest:

In the summer semester of 2012, a pretest with a total of 962 students was conducted at two German universities. The purpose of this test was to enable a quantitative analysis of the items identified as problematic in the earlier qualitative analyses. To accomplish this, a selection of 45 tasks (all of which were deemed critical in the adaptation process) was administered in two different test versions. Version 1 comprised items from the areas of marketing and human resource management, whereas version 2 dealt with financial and business management. The pretest was also used to observe whether the items were too easy or too hard for the target audience.

5. First survey (WS 2012/13):

Based on the validation analyses, 144 translated and adapted items from the original EGEL test were integrated into the first survey for the calibration of the item pool in the winter semester of 2012/13. Additionally, all 60 translated and adapted items of the TUCE were added to the item pool, thus covering micro- and macro-economics with 30 tasks each. Because students had limited time to finish the test, and in order to control for position effects, a booklet design (see Frey, Hartig & Rupp 2009) was used. Each subject was given 30 items from one of 43 different booklets. To enable a largely unbiased estimation of the item parameters, a variety of Youden square designs was used, in which three item clusters of ten items each were combined into a booklet. Furthermore, a Youden square design incorporating one item cluster from each of the seven sub-domains was devised, enabling a first estimation of the relationships between the individual sub-domains. A total of 4,050 students from more than 23 universities and schools of higher education across Germany participated in the first survey. Based on these data and using a variety of item analyses, it was possible to identify problematic items and distractors and to revise them with the help of experts.
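The balancing idea behind such a booklet design can be illustrated with a minimal sketch. It is not the project's actual design: it assumes one hypothetical cluster per sub-domain (seven clusters, indexed 0 to 6), three clusters per booklet, and a cyclic construction from the base block (0, 1, 3), which yields seven booklets. The function names are illustrative only.

```python
from collections import Counter
from itertools import combinations


def cyclic_youden_square(n_clusters=7, base_block=(0, 1, 3)):
    """Build booklets by shifting the base block cyclically.

    Booklet b holds cluster (b + d) % n_clusters at position p,
    where d is the p-th entry of base_block (an assumed example block).
    """
    return [
        tuple((b + d) % n_clusters for d in base_block)
        for b in range(n_clusters)
    ]


def position_counts(booklets):
    """Count how often each cluster occupies each booklet position."""
    return Counter(
        (pos, cluster)
        for booklet in booklets
        for pos, cluster in enumerate(booklet)
    )


def pair_counts(booklets):
    """Count how often each pair of clusters shares a booklet."""
    return Counter(
        frozenset(pair)
        for booklet in booklets
        for pair in combinations(booklet, 2)
    )


booklets = cyclic_youden_square()
for number, booklet in enumerate(booklets, start=1):
    print(f"Booklet {number}: clusters {booklet}")
```

In this sketch every cluster appears exactly once at each of the three positions, which is what allows position effects to be controlled, and every pair of clusters shares exactly one booklet, which is what makes relationships between all sub-domains estimable. With ten items per cluster, each booklet contains 30 items.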


With the help of descriptive statistics and based on classical test theory (CTT), an item analysis and a reliability analysis will be conducted first. The results of the CTT analysis will then be supplemented by results obtained using item response theory (IRT). First, IRT models will be used to estimate the item difficulties and item discrimination of the item pool, so that suitable items can be selected for the second survey in 2013. The IRT analysis will help to assess whether the theoretically defined competence levels can be identified in the individual domains, whether the existing classification of items by cognitive level in the assessment frameworks can be largely replicated, and where differences exist. Additionally, the complex booklet design allows for a first estimation of the correlations between the individual sub-domains.
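As an illustration of the CTT step, the following minimal sketch computes three common statistics from a matrix of scored responses (1 = correct, 0 = incorrect): item difficulty as the proportion correct, the corrected item-total correlation as a discrimination index, and Cronbach's alpha as a reliability estimate. The source does not say which statistics were used, so this choice is an assumption. The response matrix is invented toy data, not project data, and the function names are illustrative.

```python
from statistics import mean, pvariance


def item_difficulty(responses):
    """Proportion of correct answers per item (higher means easier)."""
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def corrected_item_total(responses, i):
    """Pearson correlation of item i with the total score excluding item i."""
    x = [row[i] for row in responses]
    y = [sum(row) - row[i] for row in responses]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(responses[0])
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: six hypothetical test takers, four hypothetical items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

print("difficulty:", item_difficulty(toy))
print("discrimination of item 0:", corrected_item_total(toy, 0))
print("alpha:", cronbach_alpha(toy))
```

Items with a difficulty close to 0 or 1, or with a low or negative corrected item-total correlation, would be candidates for review. In a booklet design not every student answers every item, so a real analysis would additionally have to handle responses that are missing by design.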

6. Cognitive interviews:

See point 3

7. Second survey (SS 2013):

Based on the results of the first survey, new test versions were composed and, in the summer of 2013, a second survey was conducted with 3,713 students from 25 German universities. In the context of this survey, the theoretical WiwiKom model was empirically validated, and national and international comparative analyses were carried out.

8. Final evaluation and publication of results:

The data from the second survey are currently being evaluated.



“Test of Understanding College Economics” (TUCE) of the Council for Economic Education (CEE)

The “Test of Understanding College Economics” (TUCE) of the Council for Economic Education (CEE) is an internationally tested and validated test instrument that is currently being adapted for use in German-speaking countries. The TUCE is in its fourth edition (Walstad, Watts & Rebeck 2007) and aims to provide an assessment instrument for school and university students of economic sciences. The test consists of two parts of 30 tasks each, one on macro-economics and one on micro-economics. The TUCE is a multiple-choice test in which one of four possible answers is correct. The test developers differentiate between three levels (recognition and understanding, explicit application and implicit application).

“Examen General para el Egreso de la Licenciatura en administración” (EGEL-A) and “Examen General para el Egreso de la Licenciatura en Contaduría” (EGEL-C) of the Centro Nacional de Evaluación para la Educación Superior (CENEVAL)

The EGEL-A comprises 250 tasks, which can be divided into two test versions. The tasks can be allocated to four major subject fields: financial management, business management, human resource management and marketing. The EGEL-C, which comprises 93 tasks in the field of accounting, was additionally used in order to cover as many curricula as possible. The tasks of these Latin American EGEL tests were developed with professors and employers and, like those of the American TUCE, are multiple-choice tasks in which only one of four possible answers is correct.