Introduction and state of the question
Teacher training and pedagogical content knowledge
Over the last few decades the initial training and the continuous professional development of teachers have become a mainstream issue (González & Skultety, 2018). International studies insist on the need to renew teacher training programs in order to upgrade the teaching-learning processes in compulsory education (Barnes, Fives, & Dacey, 2017). A large number of authors argue that it is necessary to conduct more comparative research and to transfer its findings into classroom practice (König, Ligtvoet, Klemenz, & Rothland, 2017). In the context of current research lines in teacher training, the analysis of the knowledge and conceptions of teachers has come to play a fundamental part in the approach that should be adopted by investigations on initial training programs (Darling-Hammond, 2006; Fives & Buehl, 2012). Outstanding among this body of research are studies aimed at calibrating the several types of professional knowledge possessed by teachers and emphasizing the command of regular classroom tasks (Oliveira, Lopes & Spear-Swerling, 2019).
The proposals by Shulman (1987) have had a broad influence on the definition of such categories as are designed for the purpose of researching teachers’ knowledge. Particularly when teaching skills are evaluated, researchers tend to draw distinctions between content knowledge (CK), pedagogical content knowledge (PCK) and general pedagogical knowledge (GPK) (Kleickmann, Richter, Kunter, Elsner, Besser, Krauss, & Baumert, 2012). While CK consists in the knowledge of a specific subject and is related to the contents that teachers are expected to explain, GPK is general pedagogical knowledge and involves broad principles and strategies for classroom management and organization (Blömeke, Busse, Kaiser, König, & Sühl, 2016). PCK includes knowledge that relates the specific subject contents to the purposes of teaching (Monte-Sano, 2011): it is a kind of knowledge that delves deep into the social representations that students have with regard to a specific subject matter as well as into the way students understand that knowledge, the methods and resources that are needed in order to teach that discipline and the selection and organization of specific contents so as to adapt them to the reality of the classroom (Meschede, Fiebranz, Möller, & Steffensky, 2017).
T-PACK, teacher digital competence and didactic methodology
Digital resources are having a great impact on the new ways of classifying teacher competencies. While it is true that both teachers and students are immersed in media experiences in their everyday lives, the transfer of such an experience into the teaching-learning process has not yet been fully developed (Ramírez & González, 2016). There are still some reservations about their use that have a strong bearing on teacher training. In fact, teacher training constitutes the variable that exerts the greatest influence on the level of digital competence of teachers, according to studies like the one by González, Gozálvez and Ramírez (2015).
One of the most robust research proposals for the integration of digital resources in teacher training programs is the methodological model known as Technological Pedagogical Content Knowledge (T-PACK) developed by Koehler and Mishra (2008). This model supports a teacher training approach that incorporates digital resources from a threefold perspective: the teacher’s acceptance of technology and technological competence, the use of pedagogical models and the didactic application of such technologies (Koh & Divaharan, 2011). In other words, the T-PACK model is based on the interrelatedness of three types of knowledge: pedagogical content knowledge, technological content knowledge regarding how technology can be useful in generating new types of content, and technological pedagogical knowledge, which is the whole body of knowledge related to the use of technology in teaching methodologies. This model has proven relatively successful, so that in the last five years we have seen a proliferation of studies about its impact in teacher training (Gisbert, González, & Esteve, 2016), which trace the teachers’ perceptions regarding the significance of digital literacy skills (García-Martín & García-Sánchez, 2017) or measure the teachers’ ability to develop digital information among their students and promote their communicative skills in this regard (Claro & al., 2018).
Other relevant studies focus on the integration of professional digital skills into teacher training (Instefjord & Munthe, 2017), on the characterization of the factors that account for digital inclusion (Hatlevik & Christophersen, 2013) or on the proposal of basic criteria for the teaching of digital skills both in schools and in teacher training programs (Engen, Giæver, & Mifsud, 2015). However, comprehensive, systematic and comparative research on the training of teachers in the domains of History and other Social Sciences is still scarce. Some proposals have produced a model to align evaluation with learning skills and activities (Guerrero-Roldán & Noguera, 2018). Other studies, like the one by Cózar and Sáez (2016), focus on play-based learning or gamification in the initial training of Social Science teachers, or on the digital skills of prospective Social Science teachers as defined by the T-PACK model (Colomer, Sáiz & Bel, 2018). Taken together, they have opened up an avenue of research that needs to be further pursued.
Our main goal is to analyze the existing relationships between the views and perceptions of teachers-in-training regarding the use of digital resources and their own appraisal of History as a formative subject, as well as the didactic strategies that they are expected to implement in the classroom. This general goal, in turn, gives rise to four distinct research problems:
Q1. What is the response profile of teachers-in-training concerning the use of digital resources in the teaching of History? Are there differences between the answers provided by Spanish and British respondents?
Q2. What is the relationship between the opinions of teachers-in-training about the use of digital resources and their perception of evaluation processes?
Q3. What is the relationship between the opinions of teachers-in-training about the use of digital resources and the value they attach to History as a formative subject?
Q4. What is the relationship between the opinions of teachers-in-training about the use of digital resources and the value they attach to the development of historical skills in the classroom?
Material and methods
The context in which this research took place is the professional postgraduate degree that qualifies graduates to become Secondary Education teachers of History in Spain and in Britain. A total of 506 teachers-in-training were recruited, all of whom were enrolled, by the end of the 2015-2016 academic year, in either Spain’s Master’s degree in secondary education, History and Geography specialty (344), or Britain’s Postgraduate Certificate in Education courses or Teach First programs (162). Twenty-two universities joined the study, 13 from Spain and 9 from Britain. Even though the number of British participants was lower, the sample representativeness was similar for both countries.
According to official data, and following consultation with a British expert in teacher training, it is estimated that a population of 1,200 students in Spain and 800 in Britain are enrolled in these professionally-geared degrees. The choice of these two countries is due to their different traditions in History education: Britain focuses on the development of historical skills, whereas Spain emphasizes conceptual contents and transversal competencies.
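As a rough illustration of what the sample and population figures above imply, the following sketch computes the sampling fraction and a worst-case margin of error for each country. It assumes simple random sampling, a 95% confidence level and a proportion of .5, none of which is stated in the study; the function name `sampling_summary` is hypothetical.

```python
import math

def sampling_summary(n, population, z=1.96, p=0.5):
    """Return (sampling fraction, margin of error) for a sample of size n.

    Assumes simple random sampling from a finite population and applies
    the finite population correction. z=1.96 and p=0.5 are illustrative
    defaults (95% confidence, worst-case proportion), not study values.
    """
    fraction = n / population
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return fraction, z * standard_error * fpc

# Sample sizes and estimated populations as reported in the text.
spain = sampling_summary(344, 1200)
britain = sampling_summary(162, 800)

for name, (fraction, margin) in (("Spain", spain), ("Britain", britain)):
    print(f"{name}: sampling fraction {fraction:.1%}, margin of error ±{margin:.1%}")
```

Under these assumptions the margins come out at roughly 4.5 percentage points for Spain and 6.9 for Britain.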
The design chosen for the purpose of the present study was quantitative and non-experimental, involving the use of a Likert scale questionnaire (1-5). Survey-based designs are quite common in the field of education, since they are applicable to multiple problems and make it possible to collect information on a high number of variables (Sapsford & Jupp, 2006).
Data collection instrument
The data used are part of a questionnaire named “Views and perceptions of teachers receiving initial training on History learning and the evaluation of historical competencies”. The questionnaire was validated by four experts from different areas and universities in Spain who had extensive experience in Secondary Education. It was constructed around the pertinence and clarity of each of the items: only items scoring three on average were eventually included. The first part of the questionnaire deals with identification details and includes information about the university, gender, age and training background of respondents. The second one consists of three thematic blocks. The first block, titled “Views and perceptions about evaluation and its role in the teaching-learning process” focuses on teaching practices in relation to traditional and innovative profiles, following studies like those authored by Alonso-Tapia and Garrido (2017) or Stufflebeam and Shinkfield (2007).
The second block, “Views and perceptions about History as a formative subject, methods, sources and teaching resources” deals with the opinions of respondents with regard to the epistemology of History and its function as a subject in education. This section draws upon the Beliefs History Questionnaire used by VanSledright and Reddy (2014). The third block, “Views and perceptions about the evaluation of historical competencies in Secondary Education: use of sources, causal reasoning and historical empathy”, is mainly based on the three basic principles of historical thinking: causal explanation, sources and evidence, and empathy or historical perspective (Martínez-Hita & Gómez, 2018).
Once the questionnaire was validated by the experts, it was translated into English and submitted for further validation to the ethics committee of University College London’s Institute of Education, which provided its approval. Before collecting the information, we contacted teachers in both countries. In Spain, completed questionnaires were collected in paper format from the universities of Murcia, Alicante, Valencia, Barcelona, La Rioja, Zaragoza, Oviedo, Cantabria, Valladolid, Burgos, Madrid (Universidad Autónoma), Málaga and Jaén. In Britain, questionnaires were collected, both online and in paper format, from the following universities: IoE-UCL, Exeter, Edge Hill, Manchester Metropolitan, York, Leeds, East Anglia, Birmingham and Canterbury Christ Church.
Procedure and data analysis
Data analysis was performed in three stages: a) Exploration of the structure of assessments on the usefulness of digital resources through latent class analysis; b) Estimation of confirmatory factor models for all three questionnaire blocks; c) Estimation of the differences across classes as regards the variables modeled under point b. All the analyses were performed by using Mplus 7.0 (Muthén & Muthén, 2015).
Latent class analysis
In the first place, modeling was performed on the assessments provided by students regarding the importance of using the several modalities of digital resources (Internet, digital and printed press, films and documentaries on historical topics, video games and comics). To this end we used latent class analysis (LCA). LCA is a useful method for statistically identifying internally homogeneous groups on the basis of continuous or categorical multivariate data. Unlike other clustering methods, which detect conglomerates by means of arbitrary or theoretical distance measurements, LCA uses probabilistic models for non-observable group membership (Hagenaars & McCutcheon, 2002). The number of classes was determined by using fit indices: entropy, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the sample-size adjusted BIC (ssaBIC) and the Lo-Mendell-Rubin test (LMR). Lower values for AIC, BIC and ssaBIC suggest a better fit of the current model with respect to the more parsimonious previous one. Entropy is an index of the accuracy with which the model classifies individuals (values above .70 suggest a substantial accuracy). The LMR tests the null hypothesis that the solution with k+1 classes is no better than the solution with k classes. Significant LMR values (p<.05) suggest that the solution involving the higher number of classes represents more closely the structure of the data (Lo, Mendell, & Rubin, 2001).
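The indices described above can be sketched from a fitted model's log-likelihood and posterior class probabilities. The block below uses the usual textbook definitions (with Sclove's adjustment, (n+2)/24, for the sample-size adjusted BIC, and the relative entropy index); it is an illustration under those assumptions, not the software output used in the study, and the function names are hypothetical.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """AIC, BIC and sample-size adjusted BIC; lower values indicate better fit."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # Sclove's adjustment replaces n with (n + 2) / 24 in the penalty term.
        "ssaBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }

def relative_entropy(posteriors):
    """Classification accuracy index in [0, 1]; 1 means perfectly separated classes.

    posteriors: one list of class-membership probabilities per respondent.
    """
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    # Total classification uncertainty; terms with p = 0 contribute nothing.
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n_obs * math.log(n_classes))

# Hypothetical posterior probabilities for three respondents and two classes.
example = [[0.95, 0.05], [0.10, 0.90], [0.80, 0.20]]
print(round(relative_entropy(example), 3))
```

Comparing the dictionary returned by `information_criteria` for the k-class and (k+1)-class solutions reproduces the kind of comparison described in the text; the LMR test requires the fitted models themselves and is not sketched here.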
Estimation of factor models
Prior to estimating interclass differences concerning the variables under examination, confirmatory factor analyses were conducted in order to ensure measurement quality. We first assessed the dimensionality of each scale by means of an optimized parallel analysis (Timmerman & Lorenzo-Seva, 2011). Next, we estimated the confirmatory models according to the number of factors suggested by the parallel analysis.
In order to evaluate the goodness-of-fit of factor models, we estimated the root mean square error of approximation (RMSEA), the comparative fit index (CFI) and the Tucker-Lewis index (TLI). RMSEA values lower than .05 or .08, and CFI and TLI values higher than .95 or .90, suggest a good or an acceptable fit, respectively, of the model to the data (Hu & Bentler, 1999). Additionally, we traced the presence of local misfits by using the modification indices (MI) and the standardized expected parameter change (SEPC) for each model. MI values higher than 10 and SEPC values higher than .20 suggest the presence of local sources of misfit that should be investigated before selecting the definitive model (Saris, Satorra & Van der Veld, 2009). In order to estimate all factor models, we used mean- and variance-adjusted weighted least squares (WLSMV), given the ordinal nature of the input data.
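The cut-offs above can be written as a small decision rule. The sketch below is only an illustration: it treats the thresholds as inclusive and requires all three indices to reach a level before assigning it, both of which are assumptions, and the function names are hypothetical.

```python
def classify_fit(rmsea, cfi, tli):
    """Label global fit as 'good', 'acceptable' or 'poor'.

    Assumes inclusive cut-offs and that every index must reach a level
    for the model to be assigned that level.
    """
    if rmsea <= 0.05 and cfi >= 0.95 and tli >= 0.95:
        return "good"
    if rmsea <= 0.08 and cfi >= 0.90 and tli >= 0.90:
        return "acceptable"
    return "poor"

def flags_local_misfit(mi, sepc):
    """True when a parameter exceeds both local-misfit thresholds (MI > 10, |SEPC| > .20)."""
    return mi > 10 and abs(sepc) > 0.20

# Illustrative index values.
print(classify_fit(rmsea=0.04, cfi=0.97, tli=0.95))
print(classify_fit(rmsea=0.06, cfi=0.951, tli=0.912))
```

A parameter flagged by `flags_local_misfit` would still need a substantive justification before the model is modified.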
Classes were compared by using the standardized factor scores obtained for each of the scales. A t-test was performed on every pair of classes. A significance level of .01 was used in order to decrease the probability of classifying as significant differences that are substantially irrelevant. For every significant contrast, we estimated the effect size (Cohen, 1988).
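A pairwise contrast of this kind can be sketched as follows. The text does not say whether equal variances were assumed, so this example uses the pooled-variance t statistic and Cohen's d based on the pooled standard deviation. The p-value is a normal approximation, which is reasonable only for groups of around a hundred respondents or more, and the name `compare_classes` is hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def compare_classes(scores_a, scores_b, alpha=0.01):
    """Pooled-variance t statistic and Cohen's d for two groups of factor scores."""
    n_a, n_b = len(scores_a), len(scores_b)
    pooled_var = ((n_a - 1) * variance(scores_a)
                  + (n_b - 1) * variance(scores_b)) / (n_a + n_b - 2)
    pooled_sd = math.sqrt(pooled_var)
    diff = mean(scores_a) - mean(scores_b)
    t_stat = diff / (pooled_sd * math.sqrt(1 / n_a + 1 / n_b))
    # Two-sided p-value from the standard normal: an approximation to the
    # t distribution that is adequate only for large degrees of freedom.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {
        "t": t_stat,
        "df": n_a + n_b - 2,
        "d": diff / pooled_sd,
        "p_approx": p_approx,
        "significant": p_approx < alpha,
    }

# Toy data, not study data.
result = compare_classes([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(result)
```

With six pairs of classes per factor, the function would be called once per pair, and the effect size reported only where `significant` is true.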
Latent class analysis
Table 1 contains the results of our latent class analysis. We estimated models of a maximum of five classes (the six-class solution could not be correctly estimated due to a non-positive definite derivative matrix). The one-class solution, equivalent to a unidimensional factor model, obtained the worst fit on all the indices consulted.
AIC, BIC and ssaBIC kept improving up to the five-class solution. However, the BIC and ssaBIC improvement of the five-class model over the four-class model could not be taken as strong evidence in favor of the model with more parameters (ΔBIC=3 and ΔssaBIC=9, both corresponding to a Bayes factor lower than 150; Raftery, 1995). The LMR test supported the inclusion of more classes up to the four-class model; for the five-class model it turned out to be non-significant (p=.679), thus suggesting that the four-class model should be retained. Entropy was adequate in all cases. Given these results, we chose to retain the more parsimonious four-class solution.
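The link between the reported differences and the Bayes factor threshold can be made explicit with the common approximation BF ≈ exp(ΔBIC/2) associated with Raftery (1995). The sketch below applies it to both reported differences, following the text in treating the ssaBIC difference the same way; the function name is hypothetical.

```python
import math

def approximate_bayes_factor(delta_bic):
    """Approximate Bayes factor in favor of the model with the lower BIC."""
    return math.exp(delta_bic / 2.0)

# Differences between the four- and five-class solutions, as reported in the text.
for label, delta in (("BIC", 3), ("ssaBIC", 9)):
    bf = approximate_bayes_factor(delta)
    print(f"Delta {label} = {delta}: Bayes factor of about {bf:.1f}")

# A Bayes factor of 150 corresponds to a difference of about 10 points.
print(f"Threshold difference: {2 * math.log(150):.2f}")
```

Under this approximation the two differences translate into Bayes factors of about 4.5 and 90, both below 150.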
The response profile of the several classes is shown in figure 1. The lines represent the average score for each item per class (the higher the score, the more importance is attached to the items within digital resources).
The rectangles represent a standard deviation around the mean (horizontal line) estimated on the basis of the data from the full sample. Class 1 (21.2% of the sample) assigned high values to all items in digital resources. Class 2 (25.7%) assigned moderately high values to all items, with the exception of those with a larger written content (popularizing magazines and historical novels), which obtained somewhat lower values.
Class 3 (30.1%) showed a very similar profile to class 2, except for the item “documentaries”, which received slightly higher values, and the items “video games” and “comics”, which received very low values (unlike in class 2). Lastly, class 4 (22.9%) assigned intermediate (Internet, digital and printed press, documentaries), low (film, novels and popularizing magazines) or very low values (comics and video games). The distribution of individuals across classes was significantly different in Spain and Britain (χ²(3)=28.96, p<.001), although the size of the association was small (Cramer's V=.21).
Figure 2 (panels a, b and c) shows the results of the parallel analysis for each scale. The analysis suggested a two-factor structure for scale A and a one-factor structure for scales B and C, since two empirical eigenvalues in scale A, and only one in scales B and C, were higher than the corresponding simulated eigenvalues (1,000 matrices).
Items on scale A underwent an exploratory factor analysis (weighted least squares for categorical variables implemented on FACTOR 10.9; Lorenzo-Seva & Ferrando, 2006). The two-factor correlated solution produced a clear structure where one factor clustered items referring to the preference for traditional evaluation procedures, and another one clustered those other items related to innovative evaluation procedures.
Traditional evaluation procedures clustered by factor A1 were: a) evaluation is a positive element; b) it must rely on curricular precepts; c) qualitative techniques must have a lower impact; and d) the examination is an objective procedure. The more innovative procedures clustered by factor A2 were: a) conceptual contents must have a lower impact; b) traditional evaluation procedures hamper innovation; and c) traditional evaluation procedures are related to school failure. Inter-factor correlation was negative and low (-.31), suggesting that the preference for innovative methods does not necessarily imply the rejection of traditional methods (and vice versa). Factor B clustered items related to conceptions of History as a formative subject from a traditional perspective: a) History is simply knowledge of the past; b) the disagreement among historians is only due to problems about sources; c) historical contents must be based on the origin of the nation; d) sound reading and memory skills are enough to interpret sources; e) it is complicated to use methods of inquiry. Factor C included the items least favorable to the development and evaluation of historical skills in the classroom, save for the use of sources: a) it is essential to memorize dates; b) items supporting the use of sources; c) items against causal explanation; d) items against the use of historical empathy.
Table 2 contains the fit indices of confirmatory factor models. In the case of variable A, the two-factor correlated model achieved a sufficient goodness-of-fit according to RMSEA and CFI, but a suboptimal one according to TLI.
We can observe that the correlation between residuals for two items specifically referred to content evaluation obtained MI and SEPC values higher than 10 and .20 respectively. Since it is to be expected that pairs of items referring to very specific aspects of content should exhibit moderate correlations beyond those explained by the factor itself (Brown, 2006), we chose to free that correlation. The resulting model displayed a sufficient goodness-of-fit (RMSEA=.06, CFI=.951, TLI=.912). The unidimensional model for variable B obtained a close fit (RMSEA=.04, CFI=.97, TLI=.95) without further model specifications being needed. The unidimensional model for variable C obtained a sufficient goodness-of-fit on RMSEA and CFI, but not on TLI (.86). The main sources of misfit in this case were two correlations between residuals. This specific shared variance was modeled by freeing both correlations, which resulted in a substantially better fit (RMSEA=.03, CFI=.97, TLI=.95).
For the purpose of interclass comparison, we used the standardized factor scores (M=0, SD=1) estimated by means of the factor models described above. Table 3 contains the results of the t tests.
The main differences across classes were observed in factors A1 and A2, where ten out of twelve contrasts turned out to be significant, with effect sizes ranging from low (.34) to very high (1.77). Factor B showed significant differences in four out of six contrasts, with effect sizes ranging from moderate (.47) to high (.93). Finally, factor C only presented one significant difference, with a moderate effect size (.59).
In order to facilitate the interpretation of results, figure 3 shows the mean standardized factor scores by class and factor. Regarding variable A1, class 4 showed substantially more favorable appraisals of traditional methods than the remaining classes: classes 2 and 3 showed intermediate ratings and class 1 expressed very unfavorable ones. Regarding variable A2, the most favorable views on innovative procedures were presented by classes 1 and 2, with very large differences (as many as 1.7 standard deviations) compared to classes 3 and 4, which showed moderately unfavorable assessments of innovative procedures. Regarding variable B, the main differences were observed in class 1, which produced a substantially negative appraisal of traditional perceptions of History; and in class 4, which showed moderately positive appraisals. In the case of variable C, the single relevant difference was observed between classes 3 (slightly negative appraisals) and 4 (slightly positive appraisals).
Discussion and conclusions
The results of the latent class analysis and the interclass comparison performed by using the factor model for each of the questionnaire blocks enable us to answer all four research problems.
Q1. Taken as a whole, the classes reflect two issues: a) In the first six analyzed items, the classes are arranged virtually as a continuum ranging from high (class 1) to moderate ratings (classes 2 and 3), and from moderate to low (class 4); b) The previous arrangement changes in the case of comics and video games, where classes are organized around two opposite extremes, resulting in a highly polarized bimodal distribution involving two substantially favorable classes (1 and 2) and two highly unfavorable ones (3 and 4). Differences in class size, on the other hand, are not very relevant, since all classes range between 21 and 30% of the sample. As for the differences between the results for Spain and Britain, these are small from a statistical point of view and basically concern the sizes of individual classes.
Q2. There is no single bipolar continuum of traditional-innovative processes. The preference for innovative or traditional procedures operates as a pair of different and scarcely dependent factors, so that one can find individuals who prefer innovative processes without necessarily rejecting traditional ones. The traditional process factor is clearly class-related in the sense that the higher the rating assigned to the usefulness of digital resources, the lower the preference for the use of traditional procedures (this relation is clearly seen in classes 1 and 4). In the innovative process factor, groups are polarized in a similar fashion to what happened with the ratings of comics and video games. Thus, classes 1 and 2 (positive assessments of comics and video games) are quite in favor of using innovative strategies. In comparison, classes 3 and 4 (low value attached to comics and video games) express a lower preference for innovative processes. The results show a clear correlation between the value assigned to innovative methodologies, on the one hand, and to the usefulness of comics and video games in the History classroom on the other. A clear example of this correlation can be found in class 3, which assigned very similar values to those attached by class 2 to the first six items within digital resources, while expressing a more negative assessment of comics and video games. Class 3 presents assessments of innovative procedures that are radically different from those of class 2. International studies on gamification have shown the close connection between the use of video games in the classroom and the increase in motivation and support of innovation in teacher training (Landers & Armstrong, 2017; Özdener, 2018). Although to a smaller extent, research and innovation experiences have also been published with regard to the use of comics in specific topics in the social sciences and its impact on motivation (Delgado-Algarra, 2017). Teachers-in-training see both resources as two important elements for innovation in the History classroom that are closely tied to motivation.
Q3. The result for factor A1 (traditional evaluation procedures) is now repeated, but differences are much smaller in this case. In other words, the higher the ratings for the items under the digital resources category, the lower the values assigned to items presenting History as a formative subject from a traditional perspective. These results are in line with the findings in the study by García-Martín and García-Sánchez (2016), which relates the implementation of active methodologies (together with the use of innovative strategies, styles and approaches) to the acquisition and development of digital skills. In this case, again, classes 1 and 4 (which express opposite views regarding the value of digital resources) show the largest rating difference (.93). Classes 2 and 3 (representing opposite views as regards the rating of comics and video games) assign similar scores to this factor. We can observe that this polarization of classes 2 and 3 is related to the views on the value of innovative methodological procedures rather than to the open rejection of traditional methods. Moreover, since factor B mixes methodological and epistemological elements, no differences in the views expressed by both classes (2 and 3) can be attested.
Q4. There is only one difference and it qualifies as moderate. A tendency is perceived for factor B. The larger presence of items related to the discipline’s epistemology explains the fewer discrepancies across classes. In the case of factor C, the items are mainly related to the development and evaluation of historical competencies. The differences in the teachers’ ratings of the use of digital resources were mainly linked to their conception (rather traditional or innovative) of teaching methodologies. Yet such differences did not exhibit the same intensity as regards their epistemological conceptions of History: a mismatch that was already pointed out by Kirschner a decade ago (2009).
In view of the results obtained, we believe it necessary to strengthen digital competencies in teacher training programs in ways that go beyond the mere acquaintance with ICT tools. The T-PACK model provides an alternative where the use of technology is seen from a didactic perspective targeted at teaching contents (Claro & al., 2018). If we implement this model in the training of History teachers, the use of digital resources should encourage the prospective teachers’ ability to propose activities where the historian’s procedures play a major part. Moreover, such activities should be developed on the basis of questions that enable students to solve problems by applying methods of inquiry. Research on History education over the last few decades has espoused these proposals in the face of traditional approaches and on the basis of a more competency-based epistemological view (Van-Drie & Van-Boxtel, 2008). Until these methodological perspectives are brought together, digital resources will play a merely playful and motivational role, and will not develop a truly critical approach that instills in students the ability to evaluate digital information (Hatlevik & Hatlevik, 2018) and solve historical questions. It is necessary to adopt measures within teacher training so as to achieve a competency-based form of History education that resorts to more active learning methods (Gómez & Miralles, 2016) and foregrounds a direct relationship between the implementation of active methodologies (involving the use of innovative strategies and approaches), a shift in the epistemological model of historical knowledge and the development of digital competencies (García-Martín & García-Sánchez, 2016).