American Mathematical Monthly, Vol. 103, No. 2, February 1996, pp. 134-142.

A Fable of Reform

John J. Schommer

For years the math department at Rolling Hills State University (RHSU) had been taking heat from nearly every liberal arts department on campus for being too tough on liberal arts students. Having refused on many occasions to adopt the "Math for Poets" class that many universities offer to fulfill university math requirements, the math department at RHSU had been adamant in its support of College Algebra as the least course a student could take to fulfill this requirement. But reform was in the air...

1. The Decision

Maybe it was the fact that a lot of new folks had just been hired by the math department. Maybe a lot of folks in the math department just really had a good breakfast that morning. For whatever reason, someone suggested at the faculty meeting one afternoon that perhaps the department should offer a course in chess. At first the surprisingly enthusiastic reception was aimed at adopting the course as an elective. The department soon found itself considering seriously, however, whether in fact chess should be adopted as a way of fulfilling the university math requirement. Enthusiasm for the idea, though far from unanimous, was in the majority that afternoon, and a committee was soon formed with many eager volunteers to draw up a syllabus for the course, as well as to lay out the best arguments for a chess course before the university curriculum committee.

The newly formed chess committee had barely met for an hour when they realized that, from a professional standpoint, most of what they wanted to accomplish in a chess class would resonate strongly with the NCTM Professional Standards for Teaching Mathematics [1, pp. 19-67]. Within a week of their first committee meeting, a very eager chess committee had completed the first draft of its report to the curriculum committee. It emphasized the following:

  • Critical Thinking / Problem Solving. If ever there was a game that involved critical thinking it was chess. Every time the board changed new problems were created, problems with many valid game-winning solutions, problems completely unlike the pre-fabricated exercises of the typical algebra book with their stock unique solutions. With the right teacher at the helm of this class, each lesson might very well involve almost 50 minutes of careful, critical analysis.

  • Collaborative Learning. A well-run chess class could very well serve as the model for a collaborative learning environment. Again, with the right teacher at the helm, a class could easily be conducted in small groups, all members tossing out ideas, discussing the critical pros and cons of every move on the board, moving only after achieving some democratic consensus for a course of action. There would be no sleepy lecture sessions; classrooms would be noisy and bubbling with intellectual ferment. Quiet, shy students with little self-esteem would find their views solicited and even adopted on occasion. Students, even shy students, would learn to make their best arguments before a group of their peers. Highly disciplined and yet fun, chess promised to be an almost perfect venue for collaborative learning.

  • No "mindless" symbol manipulation. College algebra's harshest critics were almost incapable of saying "symbol manipulation" without saying "mindless" at the same time. If chess were adopted as a course, the math department could begin to demolish that ugly characterization of their discipline. Aside from the fairly easy-to-learn notation system for recording games (a skill that would be required only rarely in this class), 99% of the time nothing vaguely resembling abstract symbols would be used! Abstract symbol manipulation would be dead, yet the demand for critical analysis would be maintained. The best of both worlds!

  • Physical Manipulatives. Here was something that you rarely saw in a college math class -- physical manipulatives! And beautiful, artistic manipulatives at that. Though some students would be content with their generic plastic chess sets, some would no doubt fall in love with the artistry of more beautiful sets, a love which would surely make them want to play often. The more artistic students might even be motivated to sculpt their own sets. With chess sets becoming valued artistic possessions, could the desire for the beauty of the game itself (that reservoir of critical thinking) be far away?

  • Negligible costs, even for a technology-friendly class. Though a few students would indeed want fancy chess sets, and some would want to buy chess software for their computers, none of these would be required. The simplest of boards is all that would be needed; a college student could effectively walk into any local toy store and come away with the only materials needed for the course. Furthermore, any chess program, a site license for which would be cheaper than the math software heavily promoted in most professional math journals, would instantly convert any computer lab into a chess lab.

    2. The Plan

    To say that the university curriculum committee was stunned to hear the math department's proposal would be an understatement. It all sounded so progressive. Liberal arts members of the curriculum committee began to imagine life in harmony with the math department, their students no longer complaining of the drudgery associated with completing their math requirements. So caught up in the idea of this major reform in the math department (and perhaps just a little fearful that this opportunity might not ever come again), the committee quickly approved the course. Strange as this might seem now in retrospect, no one ever bothered to ask, where was the mathematics?

    Every member of the chess committee was eager to be one of the first to teach the new class. Since only a few would be able to, the committee decided that each member should draw up a detailed proposal for the course, and that the members with the best proposals would then become the first teachers of the course. Wonderful proposals were made, and when the teachers for the first run of "Math 130: Chess" were finally picked, they all had a certain similarity in their approach.

  • Most of the traditional "lectures" would occur in the first week of the course, consisting primarily of the basic moves of each piece and ways to avoid being checkmated early. Chess aphorisms like "play to the center of the board" and "try to control a middle square" would be discussed.

  • Any lectures in subsequent weeks would be enthusiastic and quite short, focusing on problems associated with various opening, middle, and end games. Most of the time though, students would be actually trying to solve various chess problems in small groups. The emphasis in class would be "problem of the day and play, play, play".

  • Students would not be required to memorize games or even openings.

  • Chess grandmasters would be invited to talk about life in the world of chess and their favorite matches.

    With an enthusiastic group of teachers chosen, there was one last piece of business that had to be decided: how would grades be assigned? Here, the teachers thought, was where they could really exploit the beauty of living in the computer age. With chess programs available that provide competition on several levels of difficulty, putting together an objective grading system would be easy: grades would be determined by students' ability to defeat a given chess program a certain fixed percentage of times. A student could earn an "A", for example, by defeating the program at "level five" 7 out of 10 times. The beauty of this system is that there would be clear objective standards that would mean the same each and every year. Gradewise, the course would be essentially teacher independent!
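A grading rule of this kind is simple enough to write down. The following is a minimal sketch, not the teachers' actual system: only the "A" threshold (level five, 7 wins out of 10) comes from the description above. The letter attached to each lower level, the "F" fallback, the requirement that levels be cleared in order, and the reading of "7 out of 10" as 7 wins within some run of at most 10 consecutive games are all assumptions made for illustration, as are the names `cleared` and `course_grade`.

```python
# Hypothetical grade table: only the "A" at level five is stated in the text;
# the other letters are assumed for illustration.
GRADE_BY_LEVEL = {5: "A", 4: "B", 3: "C", 2: "D"}


def cleared(results, wins_needed=7, window=10):
    """Return True if some run of at most `window` consecutive games
    contains at least `wins_needed` wins.

    `results` is a list of booleans in playing order (True = student won).
    """
    last_start = max(1, len(results) - window + 1)
    return any(
        sum(results[i:i + window]) >= wins_needed
        for i in range(last_start)
    )


def course_grade(results_by_level):
    """Map per-level game records to a letter grade.

    `results_by_level` maps a level number (1..5) to that level's results.
    Levels must be cleared in order (an assumption), and anything short of
    level two is graded "F" (also an assumption).
    """
    highest = 0
    for level in range(1, 6):
        if not cleared(results_by_level.get(level, [])):
            break
        highest = level
    return GRADE_BY_LEVEL.get(highest, "F")
```

Under these assumptions, a student who wins 7 of 10 games at every level from one through five earns an "A", while one who clears only levels one and two earns a "D". Nothing in the rule depends on who teaches the section, which is the "teacher independent" property the teachers were after.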

    The teachers worked as a group to find good chess problems with interesting solutions. Great software was found (with dazzling visuals that projected nicely on overhead screens) that would play out the day's problem so that everyone could compare their solutions to the computer approach. Four speakers were lined up for the fall, all of whom traveled extensively because of chess and had marvelous stories to tell of games bravely fought and often won. Two speakers even offered to stay after class and play all takers on the quad in simultaneous games.

    By summer's end, it appeared that a dynamic, exciting class had been prepared that would really get first-year students deeply involved in problem solving, developing critical thinking skills that would serve them throughout their lives. Well, at least that was the plan...

    3. The Implementation

    There was a lot of excitement that first week. Students and teachers alike felt that they were somehow part of a revolution. When teachers described how the class would be run -- very short lectures, lots of game playing -- students were pleased and eager to get started. Though teachers tried to stress that in fact this "game playing" was going to be serious business, some students (especially those who had failed college algebra before) couldn't help but think that this was going to be an easy way to complete that old math requirement.

    By the second week, it was already becoming quite clear who was probably going to be earning "A"'s at semester's end. Knowing that these students would more than likely find each other after class for some challenging play, teachers made sure that this wealth of natural ability was spread around when assigning folks to their small groups. And these natural players were actually quite happy to be the big fish in those small ponds. As the semester progressed teachers noted some minor problems with certain of these "prodigies" lording over the others in their groups, but this indeed only proved to be a minor problem. All in all, group work turned out pretty well in the first two months.

    The semester's guest lecturers were, by and large, quite interesting. Though one presenter's "inside" stories about the chess world proved way too obscure, the presentation by one grandmaster was particularly entertaining and enlightening -- the image of chess nerd was thoroughly dispelled. The first "chess on the quad" was surprisingly popular, and an alert university relations officer made sure that local media would be around for the second event. The second event was equally popular, and did indeed find its way into local news stories. In fact, one reporter's story evolved into an examination of mathematics reform on the national level and was picked up later by various news services, much to the delight of a university president wanting very much to be perceived as being on the cutting edge of reform.

    It's actually not too hard to pin down when things began to deteriorate -- it was right around midsemester, two weeks before grades were due. For fear that the more prodigious players might go for their "A" early in the semester and then start sleeping in, it was decided that a student could not leave a level of computer play until after winning at that level the required percentage of times. To get a midsemester "A" a student had to graduate to level four by midsemester; to get a "B", level three; and so on. The network in the computer lab was set up so that one could essentially play at any time of day for official scoring purposes, and was furthermore designed so that a teacher could be certain that the person who claimed to be playing was actually the person on the machine. The computer also kept a good log of how long a person was on the machine.

    It was this simple log that shouted warning signs long before midsemester. After the initial spurt of logging on at the beginning of the year, only the naturally talented students seemed to be logging on regularly, and only they seemed to be staying on for periods of time vaguely resembling the fabled ratio of two hours homework to one hour of classtime. Despite these early warning signs, the teachers assumed that students preferred playing each other, and would eventually begin logging on regularly once midsemester approached.

    And so they did. With two weeks to go before midsemester grades were due, a trickle and then a torrent of folks began to log on. The only problem was that in the course of all their group work, students had managed to develop a completely unrealistic idea of their own particular chess skills. Beating the computer even at the lowest level proved far harder than they imagined. Office hours were well attended the week before midterm with students absolutely panicked about their grades. The teachers discussed the crisis and decided that they were partially responsible for the mess -- they were going to have to be quite generous with midsemester grades. An anonymous survey also revealed that students were in fact not playing chess all that often outside of class -- the computer logs had been a pretty accurate reflection of who was putting in time with "homework". The teachers made clear, though, that the ultimate grading criteria remained the same -- by the end of the semester, a 7 out of 10 success rate at level two would be necessary to receive a minimal passing grade. An understanding dean was notified officially by the teachers about what had been going on (he had heard quite a bit already, thank you very much), and it was decided that students would be allowed to drop if they wished (even this late in the semester) with a "WP". A discouraging number chose this option.

    The computer log that developed in the two weeks prior to midterm made something else abundantly clear -- a large number of students were having difficulty just getting past level one. The computer could replay students' games for the instructors, and many students seemed to be roaming the board aimlessly. Opening play appeared to be especially bad, leaving students with barely defensible positions. Students also seemed to have no sense of when resignation would be preferable to playing out their weak positions. Of course one quick way to get students into more competitive openings with the computer was to have them memorize famous openings like the Queen's Gambit and the Sicilian Defense. It was precisely this kind of rote memorization which everyone wanted to avoid when the course was being planned. Yet it was clear that something like rote memorization was going to have to happen if many students were going to get past level one. The teachers put together a booklet of about ten famous openings and suggested that students might begin enjoying more success against the computer if they memorized a few of them -- not that this was required, mind you. A day was put aside for a more traditional lecture, and the pros and cons of the ten openings were discussed.

    A few of the talented kids had discovered books on chess weeks earlier and had in fact learned some of the classic openings already. Their impressive names had been dropped by the "prodigies" in small groups, but it wasn't until the big grade scare that the average student felt particularly motivated to memorize them. In any case, soon after students started memorizing openings, the level one barrier fell for many. The computer did not always respond according to the book, but students found that if they persisted with a prepared opening, more often than not they began to enjoy some success. At this point they had, of course, little appreciation for why such openings were yielding success, but they were at least content to be finally winning and relieved to be "passing" the course. Appreciation for those openings could always develop with time. Some students did want to drop the course a few weeks after midsemester when it became clear that they would have to spend quite a bit of time memorizing a chess "vocabulary". They were not altogether happy when a very tired dean did not allow them to withdraw.

    With little more than a third of the semester still to go, a "C" was looking more and more possible to many and the assault on level three had begun in earnest. But the grade scare had had a profound effect on classroom temperament. Memorization had not only brought success to those who were having trouble, it had also brought drudgery. Chess class was no longer the "consequence free" class that it was just a month earlier. Success, it seems, had been a double-edged sword. Though in-class time was itself still considered fun, there was nothing particularly fun about the personal discipline required to study chess outside of class. One of the messages students had received implicitly before the grade crisis was that there were no "right" and "wrong" solutions to chess problems, only "different" ones. The success of memorization seemed to destroy that vision of chess -- there apparently were "right" and "wrong" approaches to chess problems (depending on level of play) and the discovery/small group method of learning how to make those distinctions came with the risk that the desired grade might not be earned by semester's end. Students no longer entertained false notions of their own abilities. Teachers began to regularly hear the question, "...but what is the right move?" The chess books held in reserve at the library were being used more often as students wanted to know more about the "correct" way to play chess. Ironically, the more tiresome Math 130 had become for many, the more success students seemed to be having according to the computer logs.

    Now as uncomfortable as the class had become for those who had a reasonable shot at earning a "C" by finals, it had become very uncomfortable for those for whom a "D" was becoming their best hope. The odd thing though, was that the people who so desperately needed to spend more time with chess were not really logging onto the computer as often as might be thought. After initially becoming very involved in the memorization craze, the participation of the struggling students fell off markedly when it became clear that memorization was not going to be a quick fix to their grade predicaments. It turned out there was still plenty of chess to play after those prepared opening moves. Occasionally a student would find a teacher during an office hour and express the frustration that no doubt many were feeling -- they "understood" everything in groups, and they "followed" everything said in lectures, but as soon as they tried to play on their own, they met with failure. Challenged by the records of the computer log that claimed that they were not "studying" anywhere near as often as they needed to, their response was to confess that they indeed had not practiced as much as they should, but that their jobs and extracurriculars took up an important chunk of their time.

    Perhaps surprisingly, the biggest threat to the future of the chess class did not come from a student, an "arts" faculty member, or an administrator. The teacher who pushed hardest and worked most enthusiastically for the chess class began to wonder openly whether any of this was an improvement over the old college algebra class. The chess grade distribution looked strikingly similar to the one for college algebra, about the same number of people had dropped, and the same old excuses for poor performance were being heard. Most profoundly discouraging, though, was the fact that the kind of students who struggled in College Algebra primarily because they didn't do their algebra homework weren't doing their chess homework either. It was almost precisely for this particular group of students that the chess class was put together. Group work, manipulatives, all the things that were really supposed to get this group involved did not seem to have anywhere near the effect on personal study habits that had been hoped. Embraced precisely to bring students formerly alienated from math into a deeper involvement with problem-solving, the chess version of "math reform" had yet to evidence the kind of involvement that instructors were looking for.

    The semester ended the way many do. Some students hardly lifted a finger and got an "A". Some students, overconfident from their midsemester "A"'s and "B"'s, lost a letter grade. Some students who struggled with a "C" all semester long and who logged many an hour on the computer finally crossed the "B" threshold with a tremendous sense of accomplishment. Yet other "C" students who worked just as long and hard could not quite cross that "B" threshold, finishing the course quite frustrated and vowing never to play again.

    4. Back to the Drawing Board

    With grades assigned and the campus virtually deserted, the chess committee got together to debrief the teachers and assess the semester. The course evaluations that students had completed in the last week of classes were opened and read aloud. A consensus quickly emerged that students were quite happy with the conduct of daily classes, and in particular the group work. They were disappointed however that group work didn't seem to factor explicitly in their final grade. One student's comment brought considerable laughter -- "There was too much chess". But when the laughter died down everybody had a sense that they knew what the student meant. Almost 90% of class time had been spent solving chess problems, an almost maddeningly efficient use of class time. A small but enthusiastic minority claimed that this was the best college course they had ever taken. An equally small but adamant minority claimed that it had been the worst. Most of the criticism seemed directed at the grading. Though none could take the grading to task for being subjective, students as a whole did not feel that the final grade they were given adequately captured what they "knew". Any restructuring of the course then, would have to focus on grading.

    It seemed that the committee would first have to agree about what precisely they were trying to measure. "Chess ability" rolled easily off the tongue, but was perhaps too vague. Unfortunately, further discussion didn't clear this up much. The committee felt itself going in circles, replacing the vague "chess ability" with equally vague-sounding things like "successful contingency planning" and "successful problem resolution". The word "success" seemed to be the only commonality to all this vagueness, and this suggested to the committee that they were on the right track keeping to a grading system based on numbers of wins.

    Perhaps a balance could be struck by setting the hurdles lower and then adding a lot more of them: a "D" could be earned by beating level two 20 out of 40 times playing white; a "C" could be earned by winning at the same level 20 out of 40 times playing black; and so forth through level three (a full two levels below what was needed to get an "A" last semester). This would also solve the problem of those few "A" students who had the course pretty much wrapped up after midterm -- they would have to log at least 52 games more than last semester to get their "A"'s. Students at lower levels would be rewarded to a greater extent for their quantity of play, if less so for their quality.
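The figure of 52 extra games can be checked with a little arithmetic. The sketch below is one reconstruction that happens to reproduce it, and it rests on assumptions not spelled out above: that last semester's "A" required 7 wins at each of levels one through five, that the proposed scheme has four hurdles of 20 wins each (level two as white, level two as black, level three as white, level three as black), and that level one would still have to be cleared under the old 7-win rule before the first hurdle. The variable names are hypothetical.

```python
# Old scheme: 7 wins at each of five levels is the fewest games for an "A"
# (assumes no level can be skipped and every game played is a win).
OLD_WINS_PER_LEVEL = 7
OLD_LEVELS_FOR_A = 5
old_min_games = OLD_WINS_PER_LEVEL * OLD_LEVELS_FOR_A  # 35

# Proposed scheme: four hurdles of 20 wins each, two colors at two levels.
NEW_WINS_PER_HURDLE = 20
new_hurdles = [
    ("level 2", "white"),  # "D"
    ("level 2", "black"),  # "C"
    ("level 3", "white"),  # "B" (assumed from "and so forth")
    ("level 3", "black"),  # "A" (assumed from "and so forth")
]

# Assumption: level one is still cleared with 7 wins before the hurdles begin.
LEVEL_ONE_WINS = 7
new_min_games = LEVEL_ONE_WINS + NEW_WINS_PER_HURDLE * len(new_hurdles)  # 87

extra_games = new_min_games - old_min_games
print(extra_games)  # 52
```

Because both totals count only wins, they are lower bounds on games played, which matches the "at least" in the committee's estimate.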

    But maybe the problem with the grading system went deeper. Despite their quaint but honest attempt to establish what they were trying to measure, perhaps emphasizing each student's individual ability to win was somehow fundamentally wrong-headed. Perhaps the clear equation between students' ability to win at a certain level and their actual problem-solving abilities was just an illusion. To be sure, playing to win seemed to be an integral part of chess. But didn't the current grading system engender a hurtful and oppressive hierarchy among the students? A distasteful elitism? Committee discussion naturally turned to "alternative assessment".

    Perhaps portfolios could be required of students. Students could be asked to submit a certain number of their "best" games. The computer program that the department had been using in fact was capable of keeping a record of all games using standard chess notation, so it would not be too hard to submit games to an instructor for evaluation. If wins at any level were the only games that qualified for submission, then there would be little for the instructor to do but make sure that the correct number of predetermined wins was achieved. If the "quality" of each win were to be judged, then perhaps the board position of each of the student's prized games could be entered into the chess program, say after the first 10 moves, and the number of moves it took the student to win the game from that point could be compared to the number of moves the computer would take to win the same game. Of course, the very notion of "quality" threatened to bring back the aura of elitism that the committee was trying to eliminate. Furthermore, given that this grading system was more difficult for the instructor to manage than the original "elitist" grading scheme, it was not clear that this alternative system was really worth the extra trouble. If the department was serious about ending elitism, the notion of quality wins in a portfolio would have to be abandoned.

    The committee did come to a quick agreement that some kind of writing could be an important component of the portfolio. Students capable of explaining "Kasparov's 23rd move against Karpov in the third game at Sarajevo" were certainly demonstrating some advanced knowledge of the game -- provided of course their analysis was correct. The committee would have to seriously consider portfolios which included term papers purporting to analyze some of the most famous (and perhaps not so famous) chess matches of all time. Unfortunately, if part of the purpose of changing the grading scheme was to make students happier with the course, it was not at all clear that adding an analytical term paper to their responsibilities was going to accomplish this.

    Still other candidates for portfolio submissions were discussed. Back when the chess committee was trying to sell the curriculum committee on the value of having physical manipulatives, their enthusiastic rhetoric suggested that some of the more artistic students would be inspired to sculpt their own chess sets. Perhaps sculpture of this sort would constitute a valid entry in a chess portfolio. In the same vein, perhaps history majors would want to contribute papers detailing chess history, perhaps sociology majors would want to write on the contributions of different ethnic groups to chess, etc.

    But just when everybody was feeling wonderfully inclusive about what would comprise a portfolio, someone on the committee, to everyone's horror, realized that the point of this course originally was to satisfy a university math requirement. The committee had been on the verge of accepting sculptures of chess pieces as reasonable demonstrations of university-level competence in mathematics! How far they had come! They had not even begun to discuss one of their students' most common complaints -- that their group work never factored explicitly in their grades. Reawakened to the original purpose of the course, the committee now realized that allowing group work to count for a significant portion of the final grade would confront them with yet another problem. The math department would effectively be saying "because you are good at group work in chess, you have sufficiently demonstrated individual competence in mathematics." That was quite a stretch. A good Outward Bound experience or membership on one of the school's sports teams might as well contribute to satisfying the math requirement -- any of those experiences usually involved group problem solving.

    As long as the grading in the chess course focused on each individual's ability to win games, there seemed to be a mathematical character to the whole enterprise. Once the individual's ability to win games was removed from its preeminence in the grading scheme, the validity of chess as a substitute for math seemed questionable. Questions indeed appeared to be in no short supply. Why did chess ever seem like a valid way to fulfill the math requirement in the first place? Why precisely did the possibility of alternative assessment seem to negate the value of what the committee was trying to accomplish? Were members of the committee simply closed-minded? Did everyone on the committee have some subconscious elitist agenda? And finally what, if anything, did all this say about the validity of alternative assessment strategies in the typical math course?

    Everyone was now feeling quite frustrated. When this experiment had been but a twinkle in the committee's eye, the campus support for this particular reform was wide. Unless the department diluted its current grading criteria though, setting lower "hurdles", allowing group work to count significantly in the grading, and perhaps adopting certain alternative assessment strategies, support for this kind of reform would wane and the math department would once again become the whipping boy of the campus. In sum, the math department appeared to have accomplished what few could have thought possible: they had taught far less traditional math than ever before, and yet managed to make the same numbers of people every bit as resentful. Was all this "reform" worth the effort? The committee adjourned, undecided as to whether Math 130 would be offered the following year.


    [1] NCTM, Professional Standards for Teaching Mathematics, Reston: NCTM, 1991.