Systematic Versus Informal Application of Culturally Relevant Pedagogy: Are Performance Outcomes Different? A Study of College Students

In a field study, the effects on academic performance of two different applications of culturally relevant pedagogy (CRP) in the classroom were measured. Both entailed modes and contents of instruction that attend to the specific cultural characteristics of the learners. However, in one condition (systematic CRP application), emphasis on culturally relevant contents extended to both instruction and assessment, whereas in another condition, they were largely confined to instruction (informal CRP application). Students of Middle Eastern descent who were enrolled in either a history or a critical thinking course were exposed to one of the two conditions. During the first half of the semester, assignment and midterm performance were not significantly different. However, performance during the second half of the semester and attendance rates were higher for the systematic CRP condition. These findings suggest that emphasis on culturally relevant content encompassing both learning and assessment can be beneficial to academic performance but its benefits become tangible only with sustained


Introduction
Some time ago, at an institution of higher learning in Saudi Arabia (SA), an undergraduate student visited a faculty member for advice and guidance. She stated that in one of her classes, an assignment required students to make an oral presentation regarding the knowledge, experience, and skills demanded by a job or profession they would like to perform after graduation. Her talk and the PowerPoint document that accompanied it were judged by the instructor as excellent. The instructor went so far as to openly praise her work in front of the entire class. A few days later, she was at first puzzled, and then rather distraught when she realized that the instructor, who was of Middle Eastern descent, had deducted points for her wearing an abaya (a traditional attire for women in SA) instead of a business suit on the day of

Is there evidence that CRP specifically benefits academic performance?
Most of the studies on CRP to date have relied on a variety of methodologies, including case studies, surveys, and observations. One of the most glaring issues of the extant literature is the inconsistency in the way the theoretical model of CRP, as defined by Ladson-Billings (1994, 1995a, 1995b), has been understood and applied to research and education (Young, 2010). Another glaring issue is the scantiness of research demonstrating its effectiveness on learners' academic performance, especially that of college students. When evidence is available, it is mostly qualitative (see Benegas, 2019; Keratithamkul et al., 2020). Literature reviews illustrate these weaknesses. For instance, Wah and Nasri (2019), who reviewed the research published between 2010 and 2019 on the effects of CRP on students' learning and achievement, found only six articles that targeted performance. Although each illustrated a case study or retrospective case study in which CRP was found to benefit students' academic performance, the impact of CRP on college students was a neglected matter. Another literature review, including 37 studies published between 1995 and 2013 and 8 dissertations, again highlighted extant research's emphasis on primary education and qualitative evidence (Aronson & Laughter, 2016). Another limitation is that research has tended to focus on particular populations, such as African American, Native American, or Latino students (Cholewa et al., 2014; Howard & Terry, 2011; Irizarry, 2007; Schmeichel, 2012; Yazzie-Mintz, 2007), but has neglected others, such as learners of Middle Eastern descent (see Hamdan Alghamdi, 2014), thereby calling into question the generality of its purported effectiveness.

The Present Study
As a way to begin to tackle some of the weaknesses in the extant literature, our research focuses on the academic performance of college students of Middle Eastern descent. According to Byrd (2016), sensible estimates of the effects of CRP may result from a comparison of classroom applications that use more CRP with those that use it less. Thus, ours is a field study that examines the effects of two types of emphasis on culturally relevant content: one targeting instruction (informal application) and the other targeting both instruction and assessment (systematic application). The present research is born from the consideration that even if the instruction of a course is pre-set by syllabi developed in the Western world, which instructors are required to follow to assess students' performance, opportunities for the application of CRP exist. In the present field study, informal applications are defined as consisting of instruction that encourages students to work collaboratively, relies on their own reservoir of knowledge and experiences, and calls attention to personally and culturally relevant examples in lectures and class discussions. However, such instruction does not require that this knowledge and these experiences be explicitly included in class assignments or even tested during midterm and final examinations. Namely, all the ingredients of CRP are present, but they are voluntary for assessment purposes. Instead, systematic applications are operationally defined as consisting of all the properties of informal applications with the exception that the inclusion of culturally relevant knowledge is a required aspect of students' performance, encompassing different forms of assessment, such as tests and assignments. The question that we ask through this field study is whether the difference between systematic and informal CRP can affect students' academic success.

Data and Method Design
In our field study, the main outcome variables were grades for assignments and tests, serving as measures of performance, and attendance (percentage of class meetings attended during a semester), serving as a rough measure of engagement. For the measurement of all outcomes, the main independent variable was condition (informal versus systematic application). For the measurement of assignment performance, time of assessment (before versus after the midterm) also served as an independent variable.

Participants
One hundred and seventy-six undergraduate female students participated. They were all full-time students of a university located in the Eastern Region of SA whose curriculum follows a US model of higher education. As such, instruction was largely delivered in English. They were enrolled in one of two courses of the Core Curriculum: Critical Thinking, a required course, and Modern History (i.e., from the 1450s to today), an elective course. These courses were taught by the same instructor entirely in English.
Students, whose ages ranged from 18 to 25, reported Arabic as their first language and English as their second language. Their English competency had been verified through a standardized English proficiency test (i.e., TOEFL, IELTS, or Aptis) prior to admission. According to students' self-reports, collected in class, exposure to English and Western culture included a mixture of experiences: formal instruction in the form of mandatory English courses completed before college admission, interactions with expatriates, trips abroad, exposure to foreign television channels, and internet browsing and surfing.
Eight classes taught by one instructor of Middle Eastern descent during a period of two semesters were selected through convenience sampling to ensure no overlap of students. Participation complied with the guiding principles of the Office for Human Research Protections of the US Department of Health and Human Services and with the ethical standards in the treatment of human subjects of the American Psychological Association.

Materials and Procedure
Students qualified for participation by virtue of their being enrolled in either of two courses: Critical Thinking or Modern History. They participated in the study for an entire semester, during which they completed the assignments and the tests on which our results are based. In a field study such as ours, which relied on actual students, who were enrolled in real classes, and whose performance was assessed on actual tests, random assignment of participants was unfeasible. No student was enrolled in more than one of the selected classes. The curriculum of both Critical Thinking and Modern History relied on syllabi approved by the Texas International Education Consortium (TIEC) and textbooks written for a US audience.
The study entailed two conditions, which included either an informal or a systematic application of CRP. The instructor's applications of CRP were judged by independent observers as meeting the criteria set by Richards et al. (2007) for CRP. To wit, the instructor was reported to (a) acknowledge students' differences and similarities; (b) validate their cultural identities in instruction and materials used; (c) educate students about diversity in the world; (d) foster equity and respect; (e) nurture interactions among all parties involved in the learning process, including students, their families, faculty, etc.; (f) encourage active learning; (g) nurture critical thinking skills; (h) emphasize students' academic success as defined by their potentials; and (i) assist them in comprehending social and political issues and their implications.
There was one fundamental difference between the two CRP applications (i.e., conditions) involving (j) assessment suited to the population being tested (i.e., valid; Richards et al., 2007). To wit, the informal application of CRP was operationally defined as instructional modes and contents that comply with all the criteria set by Richards et al. (2007) except for assessment. Tests and assignments covered the content of textbooks written for US college students and did not explicitly require the participants to include knowledge of Middle Eastern beliefs, values, and practices. If such knowledge was included (e.g., a personal example to illustrate a concept), it would receive equitable evaluation (i.e., given the same weight as an example taken from the textbook or another foreign source). Instead, the systematic application of CRP was operationally defined as instructional modes and contents that comply with all criteria set by Richards et al. (2007), including assessment being suited to the population being tested. Thus, although tests and assignments covered the content of textbooks written for US students, they explicitly required participants to include knowledge of Middle Eastern beliefs, values, and practices.
For instance, in Modern History, an assignment required students to critically examine the life and deeds of a historical figure. If a student chose the astronomer Copernicus, the connections between him and Muslim scholars, such as Al-Tussi and Ibn Al-Shatir, would either be required (systematic application) or be merely suggested by the assignment (informal application). Similarly, if a student examined Vasco da Gama or Columbus, she would be asked to compare him to Ibnu Battuta (systematic application) or be merely encouraged to do so (informal application). In Critical Thinking, an assignment might require students to compare and contrast arguments regarding an issue that the students could select from the textbook or other sources (informal application), or to compare and contrast arguments drawn from the responses to interview questions that the students themselves collected regarding an issue (e.g., the role of women in politics following decrees formalizing such roles) that had been deemed culturally relevant by both the instructor and the students at the time of the class (systematic application).
The informal application of CRP involved three sections of Critical Thinking (n = 62) and one section of Modern History (n = 28). The systematic application of CRP involved three sections of Critical Thinking (n = 64) and one section of Modern History (n = 22). These two courses were selected to ensure adequate representation of the Core Curriculum, whose courses emphasize either the practice of fundamental academic skills (e.g., writing, speaking, reasoning, etc.) across a wide range of topics, such as Critical Thinking, or the acquisition of knowledge about a specific academic field, such as Modern History.
The instructor of the selected courses was chosen for her instructional mode, which would fit the CRP framework (Rychly & Graves, 2012), and for her willingness to participate in a study in which such a pedagogy would be explicitly applied to assignments and tests. In advance of the study, peer classroom observations identified her instructional style as fitting the criteria that define CRP put forth by Rychly and Graves (2012). To wit, her style was characterized as exhibiting an empathetic and caring attitude, being informed by knowledge of a variety of cultures, including Middle Eastern contents, and conveying an awareness of her own cultural frames and their implications. The instructor was known to her colleagues and past students as a reflective educator who made it a point to incorporate knowledge of the Middle East in her lectures and class discussions. It is important to note that the main research question that motivated the present investigation was not discussed with the instructor during implementation.
To ensure that the comparison between the two conditions did not involve students with distinctly different characteristics, information about students' general self-efficacy and self-image was collected at the start of the semester. General self-efficacy is a "can-do attitude" that refers to people's confidence to perform well across a wide range of tasks and situations (Bandura, 1993). General self-efficacy can be conceptualized as a motivational trait that people develop over time from the accumulation of successes and failures (Chen et al., 2000). It is a trait that is thought to contribute to academic performance (Pilotti et al., 2019). In the present study, the New General Self-Efficacy (NGSE) questionnaire (Chen et al., 2001) was selected to provide information about participants' general self-efficacy. The questionnaire asked participants to report on a five-point Likert scale from "strongly disagree" (1) to "strongly agree" (5) their agreement with each of eight statements of general confidence in one's competence to deal effectively with life challenges. The NGSE's reliability, as measured by Cronbach's alpha, was .82.
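The reliability coefficient mentioned above can be illustrated with a short sketch. It is not the authors' analysis script: the function name and the response data are hypothetical, and the sketch assumes the conventional formula for Cronbach's alpha, k / (k - 1) × (1 - sum of item variances / variance of total scores), applied to eight items rated from 1 to 5.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant lists of item ratings.

    Assumes every participant answered the same number of items (k >= 2)
    and that there are at least two participants.
    """
    k = len(responses[0])
    # Transpose so that each tuple holds all ratings given to one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings (NOT data from the study): 4 participants x 8 items,
# each rated from 1 ("strongly disagree") to 5 ("strongly agree").
hypothetical_responses = [
    [4, 4, 5, 4, 4, 5, 4, 4],
    [3, 3, 3, 4, 3, 3, 3, 3],
    [5, 5, 5, 5, 4, 5, 5, 5],
    [2, 3, 2, 2, 3, 2, 2, 3],
]

alpha = cronbach_alpha(hypothetical_responses)
print(f"Cronbach's alpha (hypothetical data): {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together across participants, which is the sense in which the reported alpha of .82 is read as adequate internal consistency.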
Information about students' self-image was collected from the Twenty Statements Test (TST; Hartley, 1970; McPartland et al., 1961). The test required participants to respond to a single open-ended probe, "Who am I?", twenty times, each time with a unique answer (Kuhn & McPartland, 1954). The TST was used to gather information about the nature of the self-image of the participants, under the assumption that their independent or interdependent self-images (Cousins, 1989; Gardner et al., 1999) could make them differentially sensitive to CRP.

Results
Participants' characteristics were first analyzed to determine whether students assigned to the two CRP conditions differed at the start of the study. Students' performance was then assessed to determine whether there were differences between conditions. Analysis of variance (ANOVA) was utilized in both assessments. Data from the courses in which students were enrolled were collapsed. Since there was no evidence that course type interacted with condition, this variable was not included in the analyses described below. Results are considered significant at the .05 level (Field, 2009).

Assessment of Participants' Characteristics at the Start of the Study
Table 1 displays descriptive statistics. NGSE responses were submitted to a one-way ANOVA with condition as the factor. There were no significant differences between conditions in self-efficacy, F(1, 174) < 1, ns. Responses to the TST were classified into one of two classes (Ashton-James et al., 2009; Gardner et al., 1999): (a) responses reflecting an interdependent self-concept included references to group membership, relationships, and social roles (e.g., "I am Saudi", "I like to help others", "I am a sister", and "I am a daughter"); (b) responses reflecting an independent self-concept included references to psychological traits (e.g., "I am determined", "I am smart", "I am realistic", and "I am strong"). Responses that fell outside these two classes by signaling neither an independent nor an interdependent self (e.g., "I am hungry", "I like dark colors", and "I am 18 years old") were excluded (M = 3.11%). Percentages were submitted to a two-way ANOVA with condition (informal vs. systematic application) and type of response (interdependent versus independent self-concept) as the factors. A main effect of type of response was uncovered, F(1, 174) = 296.75, MSE = 399.82, p < .001, ηp² = .630, illustrating that most responses reflected an independent self rather than an interdependent self. However, there was neither a main effect of condition nor a significant interaction, Fs < 1. To wit, responses reflecting an independent self, as well as those reflecting an interdependent self, did not differ between CRP conditions.

Table 1
Descriptive Statistics (Mean, M, and Standard Error of the Mean, SEM) of Key Participants' Characteristics
Note. Responses that did not fit an independent or interdependent self were excluded (M = 3.11%).

Assessment of Performance
All selected courses required several assignments, a midterm test, and a final test. Modern History involved three assignments to be completed before the midterm test: (a) an analysis of a historical figure, (b) a research proposal, and (c) a literature review related to the research proposal. After the midterm, it required the completion of (d) a research project (building on the assignments completed before the midterm) along with the presentation of its content to the class. Critical Thinking instead involved two assignments before the midterm: (a) a critical analysis of an issue, and (b) the review and presentation of a selected text (e.g., article or book chapter). After the midterm, there were two assignments: (c) a research project which required students to gather evidence about a controversial issue through interviewing family members (systematic application condition) or through reviewing the scholarly literature (informal application condition) followed by an in-class debate or discussion, and (d) an assignment entailing the comparison and contrast of viewpoints on a selected topic. In both the informal and the systematic application conditions, all activities were methodologically equivalent except for activity (c) in Critical Thinking, which relied on different methods for gathering evidence and for presenting it to the other members of the class. Test questions and assignments encompassed the six different types of information processing highlighted by Bloom's taxonomy of human thinking (Anderson & Krathwohl, 2001; Bloom, 1956, 1976; Krathwohl, 2002): remembering (i.e., the act through which acquired information is first retained and then reinstated), understanding, application, analysis, evaluation, and synthesis/creation of original work. To ensure stable measurements of assessment activities before and after the midterm, scores of different assignments administered before the midterm were averaged together. Scores of assignments administered after the midterm were also averaged together. Thus, students' performance regarding assignments was clustered in two sets, depending on whether scores were gathered during the first half of the semester or the second half. Midterm scores were kept separate. The final test performance was not available due to institutional restrictions.

Table 2 displays descriptive statistics. All performance and attendance scores were distributed on a scale from 0 to 100. Midterm test performance was not significantly different between informal and systematic applications of CRP, F(1, 174) = 3.10, ns. A 2 (time: before vs. after the midterm) × 2 (condition: informal vs. systematic application) mixed factorial ANOVA, conducted on assignment scores, yielded a main effect of time, F(1, 174) = 16.18, MSE = 57.80, p < .001, ηp² = .085 (condition, F = 2.27, ns). However, a significant interaction, F(1, 174) = 19.19, MSE = 57.80, p < .001, ηp² = .099, illustrated that performance improvement was not uniform. Tests of simple effects specifically indicated that in the informal application condition, performance on assignments did not differ between before and after the midterm, t(89) < 1, ns. Instead, in the systematic application condition, performance on assignments improved after the midterm, t(85) = 5.14, p = .001. Attendance over an entire semester was overall high, but superior in the systematic application condition, F(1, 174) = 6.16, MSE = 59.53, p = .014, ηp² = .034.
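The effect sizes reported above can be cross-checked against the F ratios and their degrees of freedom. The sketch below is not part of the original analysis. It assumes the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error), and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios as reported in the text, each with (1, 174) degrees of freedom.
reported_f_ratios = {
    "time (assignments)": 16.18,
    "time x condition (assignments)": 19.19,
    "condition (attendance)": 6.16,
}

for effect, f_value in reported_f_ratios.items():
    print(f"{effect}: {partial_eta_squared(f_value, 1, 174):.3f}")
```

Under this identity, the three F ratios yield .085, .099, and .034, which agree with the effect sizes reported for the performance and attendance analyses.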

Discussion
The results of the present field study can be summarized in two key points. Namely, CRP that targets both instruction and assessment can benefit students' academic performance more than CRP that targets only instruction. However, the benefits of a pedagogy that explicitly emphasizes culturally relevant content in both instruction and assessment require some time before they can be detected.
Evidence exists from a diverse array of studies that assessment exercises can promote long-term retention, and, more broadly, learning (Butler, 2010;Karpicke, 2012;McDaniel et al., 2007;Roediger & Karpicke, 2006). Our results are consistent with research demonstrating that assessment can be conceptualized as a learning opportunity through which materials are reiterated and further analyzed. In our study, emphasis on culturally relevant content in assignments and tests might have enhanced the value that students attribute to such content. As a result, greater attention and processing were devoted to culturally relevant information, which then became easier to remember and use, thereby improving students' performance. Of course, the comparison that we carried out did not allow us to measure the benefits of CRP on instruction only. The extant literature though offers evidence of such benefits, albeit not always in a quantitative format (Aronson & Laughter, 2016;Vu, 2019;Wah & Nasri, 2019).
The data collected from the students of our study indicate that even in educational settings structured by syllabi that instructors are required to follow, creative and helpful infusions of culturally relevant materials are not only possible but also beneficial. We can all learn from the instructor in this study, who was deliberate about what and how she chose to teach. She learned from this study too, as she indicated that the quantitative evidence that illustrated the impact of her teaching on students' academic learning made her see CRP as an unavoidable obligation for all instructors teaching in a foreign land.
Our field study has limitations, which include the absence of the customary random assignment of participants and its reliance on a female-only sample. The study, which was conducted in real classrooms rather than in a lab, involved the measurement of the actual performance of students in courses in which they were enrolled. Thus, it was not possible to randomly assign participants to conditions. Yet, the self-efficacy and self-image of students assigned to the two CRP conditions were not different, suggesting that, at the start of the study, key characteristics that might have explained performance differences did not set apart the two groups. Furthermore, gender segregation customs, creating two separate campuses for males and females, made access to male students by female researchers unattainable. Yet, it is important to note that the extant literature does not predict a gender difference in the impact of CRP. Our CRP intervention worked on young bilingual/bicultural students who were enrolled in classes at a university that follows a US curriculum. One may ask whether a similar intervention may be more or less powerful on older students who have suffered the consequences of the Khawājat Complex for a much longer period of time. The answer to this question awaits the evidence of future research endeavors.

Conclusion
Educators often struggle to find ways to use research findings that suit their needs inside and outside the classroom. Thus, it is important to note that applications of CRP are not limited to particular courses, but may encompass entire programs. Program-wide applications create informal learning communities among educators in different fields, thereby promoting valuable exchanges of information and social support. This type of pedagogy may be particularly valuable in fields that exhibit recruitment and persistence issues, especially among underrepresented groups, such as female and ethnic minority students (Sparks et al., 2020). Attempts to include CRP in STEM education, which encompasses the fields of science, technology, engineering, and mathematics, are particularly noteworthy (Johnson & Elliott, 2020; O'Leary et al., 2020). It is in these fields that CRP may ultimately display the biggest and most visible benefits (Brown-Jeffy & Cooper, 2011; Gay, 2018).
Of course, benefits may come in different forms and may not be immediate, but rather become visible over time. Thus, it is important to recognize the advantages of mixed methodology assessments of CRP, which combine qualitative and quantitative measurements of the impact of this pedagogy (Treagust et al., 2020). Qualitative data can inform the interpretation of quantitative data, allowing researchers and educators to develop a deeper understanding of the range and quality of the impact of CRP. It is also useful to recognize that the assessment of CRP may be improved by the use of longitudinal designs through which a comprehensive picture of the impact of this pedagogy on the lives of its beneficiaries can be extracted.