Checkerboards and Straightjackets: The Organisation of the Timetable and Grading Systems

Conrad Hughes

Chapter 3 Checkerboards and Straightjackets

The Organisation of the Timetable and Grading Systems

In: Changing Assessment

Type: Chapter

Pages: 19–26

DOI: https://doi.org/10.1163/9789004714205_003

1 The Carnegie Unit

If you walk into any high school, the first thing you will notice is the way that the working day is organised. The overwhelming majority of schools have a quasi-identical structure driven by a timetable whereby learning is structured into periods. These periods are often in the region of 40 to 60 minutes each; many will be grouped as “doubles” of one-and-a-half hours or two hours. Often a bell will ring, and students will pack their bags and shuffle to the next class, and then the next, until the day is over. The next day will resemble much of the first, and so on. This system often propagates itself from high school right down to primary school. It is only among much younger learners that the day is structured in a more fluid, transdisciplinary, and flexible manner, allowing for projects, extended learning, even naps and outdoor learning.

Where does this ritual come from? Why is it that this model predominates?

Throughout the Middle Ages and the Enlightenment, courses of study were not standardised and, therefore, varied greatly in length and assessment method. The length of a course and the way it was assessed was decided by its teacher. In the nineteenth century, as universities started to develop across the world, particularly in the United Kingdom and in the United States, admissions teams expressed frustration at the disparity in contact time that different students had experienced in their schooling. Some might have spent over 200 hours learning a subject; others, under 100. How would admissions officers compare such situations and vouch that the student had had the right amount of learning to be eligible for entry into the university? From an assessment point of view, we can see this as a problem of reliability — inconsistent testing methods make testing unfair for students.

Harvard president Charles William Eliot responded by proposing units considered necessary for the correct amount of study to have taken place (for more on this, see Silva, 2015). For Eliot, students would have to study a subject for 120 hours to gain credit. This meant that throughout a school year of 40 weeks (that is, 52 weeks minus roughly 12 weeks for holidays, depending on the system), each week would contain 3 hours of study per subject. These 3 hours would be spread evenly across the week to make it easier for the student to establish a rhythm of working, rather than concentrating each course into tight packages. The typical model would be four separate periods of 45 minutes each. Assuming roughly 36 working hours per week (after taking out time for lunch and breaks), this would allow for up to 12 subjects to be studied over a year, each divided into a set number of periods during the week.
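
The arithmetic behind this timetable can be checked with a short sketch. The figures (120 hours per credit, 40 school weeks, 45-minute periods, roughly 36 working hours per week) come from the paragraph above; the function and parameter names are illustrative assumptions, not part of any official definition of the unit.

```python
from fractions import Fraction


def weekly_hours_per_subject(credit_hours=120, school_weeks=40):
    """Hours of study per subject per week needed to reach the credit total."""
    return Fraction(credit_hours, school_weeks)


def periods_per_week(hours_per_week, period_minutes=45):
    """Number of periods of the given length that fill the weekly hours."""
    return hours_per_week * 60 / period_minutes


def max_subjects(working_hours_per_week=36, hours_per_subject=Fraction(3)):
    """Whole number of subjects that fit into the working week."""
    return working_hours_per_week // hours_per_subject


if __name__ == "__main__":
    hours = weekly_hours_per_subject()
    print("Hours per subject per week:", hours)
    print("45-minute periods per subject per week:", periods_per_week(hours))
    print("Subjects that fit in a 36-hour week:", max_subjects(36, hours))
```

Changing any one assumption (a shorter school year, longer periods, fewer working hours) shifts the other figures, which is one reason the model varies between systems.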

Of course, this model has variations, especially when students go into more depth in certain subjects, taking them as majors or at a higher level. In such circumstances, students might study fewer subjects over the course of a year (6 to 8 subjects, versus 12).

At the end of the 1800s, this credit system was endorsed by the American National Education Association. From then on, high schools would have to ensure that students followed courses for 120 hours to be awarded credit (Shedd, 2003). However, adoption was slow; it was only between 1906 and 1910, when the Carnegie Foundation made this unit of study (120 hours) a mandatory institutional condition for college professors to receive their retirement pensions, that adoption became widespread. This is why the 120 hours of study, split into periods, is called the Carnegie Unit (Carnegie Foundation, 2023).

2 Standardised Curriculum Design

Curriculum sequencing was standardised in order to make the course of study at high schools more reliable in the eyes of universities. By ensuring that a unit of study complied with minimal time requirements, the relative chaos and extreme variability of learning experiences that had hitherto made admissions criteria tenuous and subjective gave way to something much more objective.

However, in standardising study units across the curriculum, the Carnegie Unit was not just making the exposure to learning across different systems and schools comparable; it was also defining, substantively, what the experience of learning meant. The pace of learning would no longer be dictated by the needs of the student or the decisions of the teacher but by the need to get through the curriculum in a certain amount of time. This would lead to the artificial prolongation of some units and, perhaps more damaging, the contraction of others. Teachers and students would rush through some units in order to fit everything into, for example, a week designed according to administrative, rather than pedagogic, exigencies.

Learning is a complex process in which educators should use time judiciously to meet the needs of students. According to needs, pacing, and individual challenges, the time dedicated to learning should be as flexible as possible. Some concepts are more complex than others (for example, threshold concepts such as gravity in physics, moles in chemistry, or literary analysis; see Meyer & Land, 2006), and teachers need more time to help students solidify their understanding of these.

Gifted students typically process information more quickly than other students, whereas students who struggle to access the curriculum will need more time as information is scaffolded and repeated, chunked and reinforced. It is unhelpful to consider the process of learning, which by nature is differentiated and individualised, in terms of standardised chunks of time.

So the Carnegie Unit of study — and the manner in which it has been subdivided into equally distributed units of time in school schedules — creates an artificially consistent rhythm of curriculum coverage, which does not meet the saccadic and irregular nature of learning. Many of the problems that students experience in their learning are not addressed because the approach is based on unit coverage rather than remedial work. In an interview on the cognition of understanding, developmental psychologist Howard Gardner pointed this out:

The greatest enemy of understanding is coverage. As long as you are determined to cover everything, you actually ensure that most kids are not going to understand. You’ve got to take enough time to get kids deeply involved in something so they can think about it in lots of different ways and apply it — not just at school but at home and on the street and so on. (Brandt, 1993)

For a more valid learning experience, one in which students can thrive according to their needs, educators would need to design units of study differently, allowing for more time to go deep into understanding and application.

A restructured timetable would imply a restructured assessment system, too, since there would be fewer items to score — and what would be assessed would be depth of understanding rather than coverage of knowledge.

3 Grading: Technical, Emotional, and Psychological Effects

Central to the post–nineteenth-century system of curriculum coverage and assessment is grading. The origins of this practice are debated: for some it began at Yale (Pierson, 1983); for others, at Cambridge (Postman, 1992). What is clear is that, by the end of the 1700s in some universities, professors were dividing attainment into symbols, percentages, numbers, or letters, so as to organise results into categories.

The tendency has been to subdivide results numerically into bands of roughly 15%, from A to E. The common practice in high schools is for all work, or at least the vast majority of work, to be described through such a grading system.

Grading is less an act of formative assessment (meaning, assessment that helps students learn by giving them qualitative feedback on what they can do to improve) and more one of summative assessment (meaning, an act of judgement at the end of a piece of work to communicate to the student what that piece of work is worth).

It is understandable to want to quantify assessment into a neat and clear system that allows evaluators and learners to situate their attainment in a straightforward fashion. However, reams of educational research point out just how damaging grades can be for learning (e.g., Black & Wiliam, 1998; Butler, 2011; Putwain, 2009).

Pulfrey, Buchs, and Butera (2011) “revealed that expectation of a grade for a task, compared with no grade, consistently induced greater adoption of performance-avoidance, but not performance-approach, goals” (p. 683). The work of Dylan Wiliam (2001; 2017) has shown how grades wash out feedback on learning, focussing students’ minds on ego and status rather than on steps for improvement.

Grades are not only considered to be particularly inefficient for learning but have several negative backwash effects on wellbeing. Högberg et al. (2021), looking at the effects of the introduction of grading in Swedish schools, found “negative health consequences of accountability policies such as testing and grading” and that there are “stronger effects on girls compared to boys […] in line with studies suggesting that girls are more sensitive to performance-based self-esteem” (p. 1). Crocker et al. (2003), in studying the effects of grading on university students in the United States, found that “bad grades led to greater drops in self-esteem [which] predicted increases in depressive symptoms for students initially more depressed” (p. 507). Wang (2016) found similar outcomes in researching the effects of grading on teenagers.

And yet the ritual of grading is extremely strong in schools, anchored as a cultural norm that seems almost impossible to displace. This is not to say that experiments to move away from grading are not abundant, for they are. In fact, as Kohn (2013) points out, research going back as far as the 1930s and 1940s (Crooks, 1933; Linder, 1940; De Zouche, 1945) pointed out the dangers and inefficacy of grading, but to little avail.

As of the writing of this book, experiments to assess students beyond and outside of grading are outnumbered massively by the global juggernaut of grading throughout the world’s high schools. I very much hope that anyone picking up this book 100 years from now will read it in an age where grading is seen as an antiquated system no longer in force.

4 Placement Tests and Cut-offs

Since grading systems are used primarily with a summative purpose (as opposed to a formative purpose), their most common application is to rank students for selection eligibility.

We can see several examples of this practice in different national systems. For example, the 11+ Test is administered to Year 6 students in some parts of the United Kingdom to determine entry into grammar schools (which are reputed to be academically rigorous). Students may only take the test once; it is essentially structured as a psychometric evaluation. In Switzerland, for students to enter the academic pathway leading to high-school certification, they must either sit examinations or obtain certain grades at the end of their middle schooling. In the United States, most universities require students to obtain certain scores on standardised admissions tests in order to be considered for admission; and in the United Kingdom, universities will set “tariffs” for entry, meaning that, to be admitted, students must achieve a certain grade at the end of high school.

Selective schools will demand that students submit either a certain grade average, a certain performance on a placement test, or a certain IQ test score in order to be considered for admission.

Schools running special education programmes or streams for gifted and talented students will often require certain IQ test scores to determine who gets into the course and who does not.

The purpose of these selective entry mechanisms is to make sure that students with a certain intellectual and/or cognitive profile are admitted. In most cases, this means that there is less pressure on the schools to raise admitted students’ achievement since students entering the system are already academically groomed, good test-takers, and high achievers. One might ask what the fundamental educative purpose of selective educational systems is: since the premise of education should be to improve learning, it would make more sense for schools to accept the lower-scoring students in order to provide what is known as “value added” to their learning.

One problem that these selective mechanisms cause is the notion of the cut-off grade. Seemingly arbitrary numbers are used to determine whether students progress to a selective institution (or section of the institution) or not. Highly selective North American universities never quote an exact SAT or ACT score or grade average required for them to consider a student. Instead, they speak in “ranges” (Glassman & Swanston, 2024) — stressing the importance of a holistic appreciation of an application file and how the process involves admissions deans discussing student files and considering several factors (recommendations, personal statement, grade averages, standardised admissions test scores). However, given the vast number of applications and the difficulty of having to choose one student among a large pool with very similar scores, often a few points or one tenth of a grade-point average on a subject will be used to decide who is accepted and who is not.

Other systems are less subtle and determine very sharp cut-off points for entry. For example, in the 1920s, Terman (1926) claimed that students with an IQ of 140 or higher were “gifted”. (For a more detailed analysis of IQ cut-off points to determine giftedness, see Mcbee & Makel, 2019.)

Card and Giuliano (2015) show how, in 2005, an unnamed US school district, described as “one of the largest and most diverse school districts in the country” (p. 1), introduced a scheme whereby “non‐disadvantaged students scoring above 130 points on [a type of IQ] test, and [second-language learners and students receiving free or reduced lunches] scoring above 115 points were eligible for referral for IQ testing” (p. 5). Such scores would, depending on subsequent IQ scores, lead to access to a gifted programme. The paper reveals how “relatively high ability students from disadvantaged backgrounds were being overlooked under the traditional referral system” (p. 3) and the “traditional referral system also misses some high ability non‐disadvantaged students” (p. 15).

So the consequences of performing above or below a threshold — which can mean, for example, how students answer one question worth just a few points — can be significant and have all sorts of implications for students’ future pathways, subsidies, or opportunities. Cut-offs are too narrow as criteria for major decisions on student opportunities; they result in many gifts being missed in the process. More enlightened assessment systems, such as those we explore later in this book, broaden assessment to prevent these narrow cut-off exercises. Unfortunately, they remain the exception: almost all British universities, for example, will select students based on a UCAS tariff points system with very sharp cut-offs.
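
The sharpness of such thresholds can be illustrated with a minimal sketch. The two cut-off values are those quoted from Card and Giuliano (2015) above; reading "scoring above" as a strict inequality, and the constant and function names, are assumptions made for illustration and do not reproduce any district's actual procedure.

```python
# Screening cut-offs quoted in the text (Card & Giuliano, 2015).
NON_DISADVANTAGED_CUTOFF = 130
DISADVANTAGED_CUTOFF = 115


def eligible_for_referral(score: int, disadvantaged: bool) -> bool:
    """Return True if the screening score is strictly above the cut-off.

    Assumption: "scoring above" means a strict inequality.
    """
    cutoff = DISADVANTAGED_CUTOFF if disadvantaged else NON_DISADVANTAGED_CUTOFF
    return score > cutoff


if __name__ == "__main__":
    # Two students one point apart land on opposite sides of the line.
    for score in (130, 131):
        print(score, eligible_for_referral(score, disadvantaged=False))
```

A single point, which may correspond to one question, is enough to change the outcome, whatever the student's wider profile.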

5 Why Breaking the Checkerboard Is So Difficult

This assessment grid, from the Carnegie Unit to grading to cut-offs, is a tightly regulated and numerical checkerboard. Human potential, which is subtle, variable, culturally specific, and infinitely creative, sits uneasily on this checkerboard, never quite corresponding to its hard contours and angular delineations.

From Binet’s work on IQ testing through more than a century of statistical modelling being the dominant paradigm in the behavioural sciences, this checkerboard has become hardened in the central role it plays in education. Entire districts, national education systems, and even global testing schemes rely on it as an axiomatic playing field that determines practices and decisions.

To break up this checkerboard and create something else will require major upheaval, a coordinated effort across several simultaneous matrices. The work will be difficult, but it is not impossible and must remain a hope, so that the way human beings are viewed and evaluated changes.

References

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
Brandt, R. (1993). On teaching for understanding: A conversation with Howard Gardner. ASCD. https://ascd.org/el/articles/on-teaching-for-understanding-a-conversation-with-howard-gardner
Butler, R. (2011). Enhancing and undermining intrinsic motivation: The effects of task‐involving and ego‐involving evaluation on interest and performance. British Journal of Educational Psychology, 58(1), 1–14. https://doi.org/10.1111/j.2044-8279.1988.tb00874.x
Card, D., & Giuliano, L. (2015). Can universal screening increase the representation of low income and minority students in gifted education? Working paper 21519. National Bureau of Economic Research.
Carnegie Foundation. (2023). What is the Carnegie Unit? https://www.carnegiefoundation.org/faqs/-carnegie-unit/
Crocker, J., Karpinski, A., Quinn, D. M., & Chase, S. K. (2003). When grades determine self-worth: Consequences of contingent self-worth for male and female engineering and psychology majors. Journal of Personality and Social Psychology, 85(3), 507–516. https://doi.org/10.1037/0022-3514.85.3.507. PMID: 14498786
Crooks, A. D. (1933). Marks and marking systems: A digest. Journal of Educational Research, 27(4), 259–272.
De Zouche, D. (1945). “The wound is mortal”: Marks, honors, unsound activities. The Clearing House, 19(6), 339–344.
Glassman, S., & Swanston, B. (2024). Get accepted: What is the average SAT score needed for college admission? Forbes Advisor (updated February 7). https://www.forbes.com/advisor/education/student-resources/average-sat-score/
Högberg, B., Lindgren, J., Johansson, K., Strandh, M., & Petersen, S. (2021). Consequences of school grading systems on adolescent health: Evidence from a Swedish school reform. Journal of Education Policy, 36(1), 84–106. https://doi.org/10.1080/02680939.2019.1686540
Kohn, A. (2013). The case against grades. Counterpoints, 451, 143–153. http://www.jstor.org/stable/42982088
Mcbee, M., & Makel, M. (2019). The quantitative implications of definitions of giftedness. AERA Open, 5(1). https://doi.org/10.1177/2332858419831007
Meyer, J. H. F., & Land, R. (2006). Threshold concepts and troublesome knowledge: Issues of liminality. In J. H. F. Meyer & R. Land (Eds.), Overcoming barriers to student understanding: Threshold concepts and troublesome knowledge (pp. 19–32). Routledge.
Pierson, G. (1983). C. Undergraduate studies: Yale College. A Yale book of numbers: Historical statistics of the college and university 1701–1976. Yale Office of Institutional Research.
Postman, N. (1992). Technopoly: The surrender of culture to technology. Alfred A. Knopf.
Pulfrey, C., Buchs, C., & Butera, F. (2011). Why grades engender performance-avoidance goals: The mediating role of autonomous motivation. Journal of Educational Psychology, 103(3), 683–700. https://doi.org/10.1037/a0023911
Putwain, D. W. (2009). Assessment and examination stress in Key Stage 4. British Educational Research Journal, 35(3), 391–411. https://doi.org/10.1080/01411920802044404
Shedd, J. (2003). The history of the student credit hour. New Directions for Higher Education, 122(Summer), 5–12. https://doi.org/10.1002/he.106
Silva, E. (2015). The Carnegie unit: A century-old standard in a changing education landscape. Carnegie Foundation for the Advancement of Teaching.
Terman, L. M. (Ed.). (1926). Genetic studies of genius: Mental and physical traits of a thousand gifted children (Vol. 1, 2nd ed.). Stanford University Press.
Wang, L. C. (2016). The effect of high-stakes testing on suicidal ideation of teenagers with reference-dependent preferences. Journal of Population Economics, 29(2), 345–364. https://doi.org/10.1007/s00148-015-0575-7
Wiliam, D. (2001). What is wrong with our educational assessments and what can be done about it? Education Review, 15, 57–62.
Wiliam, D. (2017). Embedded formative assessment. 2nd ed. Solution Tree Press.

Changing Assessment

How to Design Curriculum for Human Flourishing

Series: IBE on Curriculum, Learning, and Assessment, Volume: 6

E-Book ISBN: 9789004714205

Publisher: Brill

Print Publication Date: 22 Feb 2025

Subjects
- Education
