Introduction
Self-regulation (SR) is a fundamental developmental skill impacting a child’s performance and health across the lifespan [
1,
2].
It describes the ability to adapt one's thoughts, feelings, and behavior to the demands of a particular situation in order to optimally pursue personal goals [
3]. Moreover,
SR refers to processes that enable us to maintain optimal levels of emotional, motivational, and cognitive arousal. It […] overlaps substantially with inhibitory control, a core dimension of executive functions [
4].
From a medical, psychological and pedagogical perspective, good SR skills are considered a protective factor regarding mental [
5‐
7] and physical health [
8] and have been found to longitudinally predict health, success in professional and private life, satisfaction with life and social equity in adulthood [
1].
Accumulating evidence in the last two decades suggests that more and more children from school age to adolescence have difficulties in regulating their behaviors [
9]. For example, the prevalence of behavioral and psychological problems related to SR in kindergarten and primary school has been steadily increasing [
2,
10‐
12]. This not only presents challenges for the daily work of teachers [
13‐
15], but studies also suggest that these problems persist into adolescence with a 50% chance [
16], resulting in a high societal burden and possible medical costs [
17,
18].
Because the window for promoting children’s SR skills opens years before school entry, early identification of children with SR difficulties, combined with early intervention (e.g. in kindergarten), seems key from a public health perspective. As SR development depends on environmental factors and experiences [
19‐
21] (besides biological maturity), interventions that change the environment and experiences have the potential to effectively support child SR development [
22‐
24]. Current systematic reviews have shown effectiveness of different SR promoting interventions in early childhood education and care environments (ECECs) [
23,
24]. Other studies showed that supportive environmental factors such as high-quality teacher–child interaction [
25] are positively associated with SR development in children. This suggests that a public health approach combining efficient early identification of children with SR difficulties with the implementation of effective interventions in the kindergarten setting has high potential.
To identify vulnerable children, valid measurement of SR in kindergartens is necessary. As SR skills are part of psychological and social-emotional child development, questionnaires that are used to assess the latter might be promising. These include the Behavioral and Emotional Rating Scale (BERS, 26 items, domains: behavioral self-control, emotional self-control) [
26], the Child Behavior Checklist (CBCL, 33 items, domains: emotionally reactive, attention problems, aggressive behavior) [
27], the Child Behavior Questionnaire (CBQ, 12 items, domains: attentional focusing, inhibitory control) [
28], the Child Behavior Rating Scale (CBRS, 17 items, domains: self-regulation, social/interpersonal skills) [
29], Conners' rating scale – teacher form (CTRS, 28 items, domains: conduct problems, day-dreaming inattention, anxious fearful, hyperactivity) [
30], the Devereux Early Childhood Assessment (DECA, 8 items, domain: self-control) [
31], Social competence and behavior evaluation—preschool edition (SCBE, 20 items, domains: anger-aggression, social competence) [
32], the Social Competence Scale (SCS, 13 items, domains: prosocial behavior, emotion regulation) [
33], the Strengths and difficulties questionnaire (SDQ, 25 items, domains: emotional symptoms, conduct problems, hyperactivity/inattention, peer relationship problems, prosocial behavior) [
34,
35], and the Behavior Rating Inventory of Executive Function—Preschool Version (BRIEF-P, 63 items, domains: inhibition, attention shift, emotional control, working memory, planning/organizing) [
36]. Although many instruments might be available to measure SR skills, the most important ones were suggested to be the CBQ, BRIEF, CBCL and SDQ [
37]. However, from a public health perspective, all of these are too comprehensive and too long for screening purposes (e.g. number of items for SR measurement = 12, 26, 23, 25, respectively), and they do not feature SR as a separate construct.
Several of these questionnaires also exist in German, e.g. the SDQ or the BRIEF-P [
38]. Furthermore, additional questionnaires exist that were developed in the German context and are primarily used in Germany, such as the Kindergarten Behavior Scales (VSK, 49 items, domains: anxiety, hyperactivity and inattention, aggressive behavior, emotional dysregulation, social competence, emotional knowledge/empathy, self-regulation) [
39], the Organizing Education in Kindergarten screening (BIKO, 33 items across six domains: willingness to cooperate with educational staff, integration into the group, problem behavior towards peers, prosocial behavior towards peers, play and task behavior, regulation of emotions) [
40,
41], the Dortmund Developmental Screening for Kindergarten (DESK 3–6 R, 45 to 50 items depending on age, domains: fine motor skills, gross motor skills, social competence, social behavior, social interaction, attention and concentration, cognition and language, cognition, basic competence literacy, basic competence numeracy, language and communication) [
41] or the questionnaire Competencies and Interests of Children (KOMPIK, 158 items across 11 domains: motor skills, social and emotional behavior, motivation, language and early literacy, maths, science, music, design, health, well-being, and social relationships) [
42].
While these instruments meet scientific standards, they are all long and time-consuming (minimum 40 items; the DESK even contains performance tasks in addition to questionnaire items, which requires still more time and a suitable physical environment in kindergartens). In addition, most of them do not feature SR as a separate construct and are far too comprehensive (e.g. they measure development or behavioral issues in general), which reduces their suitability as efficient SR screening tools in the kindergarten environment and might also explain why they have failed to gain wide use in Germany.
To move the field of developmental monitoring and public health intervention planning in kindergartens in Germany forward, we previously adapted the internationally widely used Canadian Early Development Instrument (EDI) [
43] to the German context and published the German version of the EDI (GEDI) [
44]. The EDI is a valid and reliable 103-item teacher-completed questionnaire assessing a child’s ability to meet age-appropriate developmental expectations in five domains (see below), developed by Magdalena Janus and colleagues at the Offord Centre for Child Studies at McMaster University, Ontario. The instrument was designed as a screening and developmental monitoring tool [
45‐
49]. It serves to collect data on the development of 3- to 6-year-old children in all relevant developmental domains [
50]. In Canada and other countries, the EDI is integrated into a public health monitoring and intervention planning approach, which results in a tailored implementation of interventions in kindergartens to support child health and development.
Based on the features described above, the EDI could provide an optimal basis for developing a brief but psychometrically sound and fully questionnaire-based screening instrument to detect SR difficulties in kindergarten children. In addition, the worldwide use of the EDI would allow SR to be assessed as part of the regular EDI monitoring in kindergartens in many countries.
Therefore, this study assesses whether it is possible to develop a valid scale measuring SR by recombining items of the theoretically relevant EDI domains "social competence" and "emotional maturity". The following research questions guide our study:
a) Can existing items from the (G)EDI be selected based on solid theoretical and conceptual considerations and recombined to form a valid (stand-alone) SR scale?
b) Does the resulting (G)EDI-SR scale have adequate psychometric properties and validity?
Discussion
The aim of the study was to identify items eligible for SR measurement within the (G)EDI domains "social competence" and "emotional maturity" through a theory-based selection process, to develop a GEDI-SR scale from these items, and to assess its dimensions, psychometric properties and validity.
We identified 20 original (G)EDI items eligible for measuring SR. Starting with these 20 items, we used exploratory factor analysis to assess constructs and dimensions using the development dataset. Cross-validation with both datasets using confirmatory factor analysis was successful and resulted in a 13-item, three-factor GEDI-SR scale model with excellent goodness of fit indices for measuring SR in kindergarten children. The GEDI-SR scale’s internal consistency, test–retest and interrater reliability, stability across populations as well as concurrent validity with the VSK-SR scale were in the good to excellent range, which qualifies the scale for screening or monitoring purposes. Since all items of this SR scale are inherent to the (G)EDI, SR can now be efficiently measured when administering the (G)EDI, without the need for applying an additional SR assessment instrument. Alternatively, given its high reliability and validity, the newly developed, short GEDI-SR scale could also be administered as a stand-alone scale.
Development of the GEDI-SR scale and its constructs and dimensions
The sequence of a theory-based selection process followed by a quantitative analysis of the constructs and dimensions of the eligible SR items across two independent datasets successfully reduced the initial 20 items to a very short scale of 13 items that measures SR in a valid way. The internal consistency of this scale was high (α ≈ 0.90).
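For readers less familiar with the internal consistency coefficient reported here, the following is a minimal sketch of how Cronbach's α can be computed from an item-score matrix. It is not the authors' analysis code; the function name `cronbach_alpha` and the small rating matrix are hypothetical and serve illustration only.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per child and one column per item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across children
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings (5 children x 4 items); NOT data from the study
ratings = [
    [2, 2, 1, 2],
    [1, 1, 1, 0],
    [2, 1, 2, 2],
    [0, 0, 1, 0],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Values close to 1 indicate that the items vary together, which is what the reported α of about 0.90 expresses for the 13 items.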
The 13 items of the resulting SR scale revealed large correlations at the factor and item level, which indicates a multicomponent latent construct. The three factors of the GEDI-SR scale found empirically correspond perfectly to the theoretical basis of Diamond's conceptual model on SR [
33], which underlines the scale’s validity. It consists of the “core” components of SR 1) behavioral response inhibition; 2) cognitive inhibition; 3) selective or focused attention (Diamond 2013). A child scoring high on these domains will find it easier to a) meet teachers' expectations, as teachers expect children to behave appropriately with regard to their school readiness and show SR by treating people and things well, by being able to sit still and to listen when needed [
67]. Such children will show b) responsible behavior by following rules, taking responsibility for their actions, and being mindful of the materials and furniture at the kindergarten; c) concentration, i.e. being able to carry out activities independently and calmly (e.g. completing painting and handicrafts carefully and on time) and having an appropriate attention span. Children with high levels of SR may be expected to show d) conscientiousness, for example being careful with play materials.
The exploratory factor analysis led to the omission of four items from the eligible SR item selection: “demonstrates self-control”, “has temper tantrums”, “has the ability to get along with peers” and “has difficulty awaiting turn in games or groups”. Based on face validity, these items might actually relate to the concept of SR, so it is not fully clear why the exploratory factor analysis suggested their omission. The most probable hypothesis is that these items capture other behavioral domains distinct from the 13 items representing SR. Likewise, the structural equation modeling failed to support the inclusion of the items “gets into physical fights”, “is impulsive, acts without thinking” and “is able to follow class routines without reminders”, although all three investigators initially considered them to be appropriate and relevant items to measure SR. This, however, does not seem unusual: other studies on the development of theory- or literature-based questionnaires have also shown that theoretically relevant items are dropped after factor analytic steps [
68,
69]. Authors have argued that this might be due to the wording of some items not being appropriate to reflect the latent construct for which they were actually included.
Reliability assessment
The 13-item GEDI-SR scale showed favorable reliability, both with respect to internal consistency and in the results from structural equation modeling and test–retest analyses. Yet, we must acknowledge some limitations regarding test–retest and interrater reliability. Due to the COVID-19 pandemic and difficult organizational conditions in kindergartens, we received considerably fewer pairs of data than intended. With only three pairs for 6-year-olds, calculation of test–retest ICCs was not possible for this age group, nor was the calculation of interrater ICCs for 4- to 6-year-olds. We therefore only present overall values and recommend age-specific reliability analysis in a future study.
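As an illustration of the reliability coefficient discussed above, the sketch below computes an intraclass correlation coefficient from paired ratings. The text does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-measure form (often labeled ICC(2,1)) is assumed here purely for illustration, and the function name `icc_2_1` and the example scores are hypothetical.

```python
def icc_2_1(data):
    """ICC(2,1): rows are children, columns are measurement occasions or raters."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test-retest scores; NOT data from the study
identical = [[1, 1], [2, 2], [3, 3]]   # same score at both occasions
shifted = [[1, 2], [2, 3], [3, 4]]     # every retest score one point higher
print(icc_2_1(identical), icc_2_1(shifted))
```

Under this assumed form, a systematic shift between occasions lowers the coefficient even when the rank order of children is unchanged. With very few pairs per age group, the mean squares rest on too few degrees of freedom to give a stable estimate, which is consistent with reporting overall values only.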
Concurrent validity
We assessed concurrent validity by comparison to the VSK-SR scale. The VSK-SR scale tends to focus on behavioral inhibition, namely patience, adaptability, and perseverance skills, whereas the GEDI-SR scale reflects cognitive inhibition and selective/focused attention with slightly different dimensions (concentration, diligence, and adherence to rules). Given this difference, the degree of agreement in terms of Pearson’s correlation coefficient was good. However, despite good overall concurrent validity results, the additional Bland–Altman analysis revealed that the two scales ((G)EDI-SR versus VSK-SR) differed for extreme values of SR. It thus remains uncertain whether the VSK-SR overestimates the extremes or the GEDI-SR underestimates deviations from the mean. Therefore, a future study might want to re-investigate the agreement of the GEDI-SR scale and another instrument available in German, such as the SDQ.
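The distinction between correlation and agreement drawn above can be made concrete with a small sketch. It assumes that both scales have been brought onto a common metric beforehand (how this was done is not described here), and the names `pearson_r`, `bland_altman`, `scale_a` and `scale_b` as well as the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(a, b):
    """Bias, 95% limits of agreement, and trend of differences over pair means."""
    diffs = [x - y for x, y in zip(a, b)]
    pair_means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        # A trend far from 0 means disagreement depends on the score level
        "trend": pearson_r(pair_means, diffs),
    }


# Hypothetical scores on a common metric; NOT data from the study.
# scale_b spreads the extremes further from the centre than scale_a.
scale_a = [1, 2, 3, 4, 5]
scale_b = [0, 1.5, 3, 4.5, 6]
result = bland_altman(scale_a, scale_b)
print(pearson_r(scale_a, scale_b), result)
```

In this constructed example the two scales correlate perfectly and show no average bias, yet they disagree at both ends of the range. The analysis alone cannot tell which of the two scales is closer to the true value at the extremes, which mirrors the uncertainty stated above.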
Comparison of reliability and validity results with those of other SR instruments
Regarding its psychometric properties and validity, the GEDI-SR scale shows values comparable (or even superior) to those of other instruments used to measure SR in the international and national context, as exemplified and quantified in Table 9. For example, compared with the other instruments, the GEDI-SR scale shows very good internal consistency. Test–retest reliability seems even better than that of the CBQ or SDQ.
Table 9
Comparison of psychometric properties of the GEDI-SR scale with other SR-measurements
Reliability |
Internal consistency | α ≈ 0.90 | α ≥ 0.86 | α = 0.67 to 0.71 | α = 0.73 | 0.90 < α < 0.97 | 0.82 < α < 0.94 |
Test–retest reliability | ICC = 0.85 (95%-CI: 0.71 to 0.93) | Pearson’s r = 0.72 to 0.89 | r = 0.61 to 0.70 | r = 0.73 (after 4 to 6 months) | r ≥ 0.90 | X |
Interrater reliability | ICC = 0.71 (95%-CI: 0.43 to 0.89) | r = 0.52 to 0.78 | r = 0.47 | r = 0.80 (sample of 5- to 15-year-olds) | X | r = 0.56 |
Validity |
Concurrent validity | r = 0.75 with VSK-SR | r = 0.56 to 0.77 with the Richman Behavior Checklist | X | OR = 13.5 (95%-CI: 11.1 to 16.3) with DSM-IV diagnosis | X | r = 0.70 with BASCb |
Moreover, our results confirm the good psychometric properties of the original (G)EDI and show that the "Social Competence" and "Emotional Maturity" scales of the EDI have been developed very well with regard to the selection and formulation of items. Building on this excellent work of the Canadian developers, we were now able to develop a reliable and valid SR scale that is inherent to the (G)EDI and thus does not require additional time for SR-assessment.
Public health implication
Given the good psychometric characteristics, validity and reliability of the (G)EDI-SR scale, our work provides a precondition for a public health monitoring process that could take the GEDI-SR, as part of the (G)EDI or as a stand-alone scale, as a starting point for intervention implementation at both the individual child and the population level. The newly developed GEDI-SR might be particularly relevant to those countries already monitoring child development in kindergartens using the EDI at scale (e.g., Australia [
45]). However, to leverage it as a public health screening instrument, age-specific standardized cut-offs should be established in a representative sample (standardization sample) as a next step [
70]. After the establishment of valid cut-off values, each country using the EDI for developmental monitoring could efficiently screen for SR difficulties at this early age and use the screening results for the tailored implementation of SR-promoting interventions in kindergartens at a public health scale.
Strengths and limitations
To the best of our knowledge, this is the first study to define and validate a short SR scale within the widely used EDI. Although other short SR subscales exist (e.g. in the VSK-SR or the CBRS) and might be theoretically usable, our scale might be very efficient from a public health perspective because its items are already included in the administration of the EDI or GEDI. In addition, the costly purchase of e.g. the VSK (which is not open access) and the necessary separate scoring methodology make the use of a separate SR scale potentially challenging for teachers and public health researchers, especially when compared to the (G)EDI assessment, which allows developmental and SR assessment at once and is available free of charge.
In terms of item selection for the GEDI-SR scale, we achieved only moderate agreement between raters, which underscores the difficulty of distinguishing SR from other constructs such as social competence or emotional maturity. Despite the agreement and consensus regarding the theoretical basis, the only moderate agreement might also be explained by the raters’ different professional perspectives and backgrounds (psychology, occupational therapy, pedagogy), e.g. bringing about different preferences for wordings and deviating operationalizations. However, reassuringly, the results of our exploratory and confirmatory factor analyses and structural equation modeling suggest that the selected items represent the latent construct SR.
Although we were able to include two independent datasets, we are aware that both might be affected by selection bias owing to their geographic location (e.g. potentially containing lower numbers of children from families with low socioeconomic status). As we did not collect the SES of the children's families, we cannot assess the representativeness of the samples. Hence, our data cannot readily be generalized to specific subgroups of interest, for example children of parents with a recent migration background and lower socioeconomic or educational status. Moreover, 6-year-old children are underrepresented in both datasets. We found differing percentile values for lower age groups, but we attribute these to a higher inter- and intra-individual variability of developmental maturity [
71].
In addition, we did not establish reference values in a representative dataset. However, given the successful replication of the structural equation modeling with the validation dataset, we were at least able to demonstrate the stability of the model across populations. Lastly, at this stage and without a standardization sample, we are unable to determine the predictive validity of the GEDI-SR scale.