In this post I want to talk about *timed testing* under specs grading. This is an idea that doesn’t feature prominently in Linda Nilson’s book on specs grading, the book that got me started down this road. Mathematics typically involves a significant amount of procedural knowledge, unlike many of the subjects represented in Nilson’s book. So there is a real need to assess students’ ability to perform certain tasks on demand, without the benefit of virtually unlimited time and resources: things like calculating derivatives, interpreting graphs, and instantiating definitions.

Don’t misunderstand: Those tasks don’t *make up* mathematics any more than the ABC’s make up the discipline of literature. But it seems reasonable to assess beginning writers on their ABC’s, and mathematical tasks that admit timed assessment are the ABC’s of the subject.

Timed assessments were part of the design of the Discrete Structures course from the beginning, in the form of Concept Checks: 15-minute weekly quizzes over the “CC” objectives. But another layer of timed testing needed to be built to assess what I called the “CORE-M” objectives, the learning outcomes that are of central importance but not basic (and so can’t be assessed through objective, easy-to-grade items). The solution I came up with turns the whole process of timed testing sideways, and I really like it.

Four times during the semester, we set aside an entire class period that I call a *timed assessment period*. During that period, students work individually on a combination of the following: (1) problems over CORE-M objectives that are new to them, (2) new versions of problems over old CORE-M objectives that they haven’t passed yet, and (3) new versions of problems over old CC objectives that they haven’t passed yet. About a week prior to the timed assessment period, I put up a survey using a Google Form and have students select the problems they wish to work during the period. Then I make out one problem for each CORE-M and each CC objective that was requested.

I print out the appropriate number of copies of each problem and bring them to class on the assessment period day and lay them out on two tables. Students then come up at the beginning of the hour and get the problems that they want to work. They work those, and submit them when done. I then grade all of those on a Pass/No Pass basis using the specifications we laid out at the beginning of the semester. If a student Passes an objective, they are done with it, and they are one step closer to attaining the goal grade they wanted for the semester. If the student doesn’t Pass, then they can request that objective again during the next timed assessment period, at which point I’ll make out a new problem that assesses it.
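Incidentally, the bookkeeping for the request survey is easy to automate. Here’s a minimal sketch, assuming a hypothetical CSV export from the Google Form in which each student’s requested objectives sit in one comma-separated column (the column name and objective labels are made up for illustration):

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical Google Form export: one row per student, with the
# requested objectives in a single comma-separated column.
form_export = """Name,Requested objectives
Adams,"M.1, M.2, CC.4"
Brown,"M.1, CC.4"
Chen,"M.2, M.3"
"""

# Tally how many copies of each problem to print.
copies = Counter()
for row in csv.DictReader(StringIO(form_export)):
    for objective in row["Requested objectives"].split(","):
        copies[objective.strip()] += 1

for objective, n in sorted(copies.items()):
    print(objective, n)
```

The resulting counts tell you exactly how many copies of each problem to bring to class.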

At the end of the semester, we have the entire 110-minute final exam period set aside as a massive, last-chance timed assessment period. I do not give a comprehensive final exam in the class. Instead, students spend the time making one last attempt at any CC or CORE-M objectives they haven’t passed yet.

Here are some pictures I took of the process of putting together the first timed assessment period for the discrete structures class. Going into the assessment period, I had copies of the six CORE-M objectives available and a single pack containing all the CC objectives. I also had envelopes labelled with the objective numbers for students to return their work when done.

Here’s how it looked when I got set up in class:

This was the first assessment period, so we only had six CORE-M objectives. The next one, happening this Friday, has seven more new ones in addition to a smattering of older ones, so I’ll need more space. To avoid a human logjam when coming to get the papers, I released students by rows to come down one at a time to get their papers. It dawned on me as this was happening that this process was *exactly* like the process of receiving Communion at my church on Sunday mornings. I suppose that makes me some sort of Eucharistic minister.

Some students decided to try all six CORE-M objectives, while others opted to work on only three or four of them, just to make sure they had enough time. One of the things I like most about this system is that it defuses the horrible situation of having to pry tests out of students’ hands when time runs out. That still happens (and did happen in this particular session), but each student can estimate how much time they need on each objective and ask only for problems that fit within the time frame; and if they misestimate, then no worries: just try again at the next session.

Students handed in their work by putting the problems into the appropriate envelope. Actually, it turned out the envelopes were a bad idea: they were only slightly larger than the papers themselves, and it was hard (and noisy!) to shove a paper into an already-full envelope. For the second section’s period, I switched from envelopes to hanging file folders, and that solved the problem.

When it was time to grade, I pulled out the papers from each envelope/folder and alphabetized them (*note to self: hire a student to do this in the future*), and put them in piles:

Then I graded everyone’s work on objective M.1, then everyone’s work on M.2 and so on. When the grading was done and Pass/No Pass marks entered into Blackboard, I laid out the papers on a big table and put together a packet for each student. This was simple because the stacks were alphabetized, so “Adams” was on top, then “Brown”, etc. and I just had to skim the current student’s work off the tops of the stacks and staple them together.
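Collating the packets is, in effect, a group-by on student name: the per-objective stacks get transposed into per-student packets. A minimal sketch with hypothetical data:

```python
from collections import defaultdict

# Hypothetical graded stacks: objective -> alphabetized list of students
# whose work sits in that stack.
stacks = {
    "M.1": ["Adams", "Brown", "Chen"],
    "M.2": ["Adams", "Chen"],
    "M.3": ["Brown"],
}

# "Skimming the tops of the alphabetized stacks" is a group-by on name.
packets = defaultdict(list)
for objective, students in sorted(stacks.items()):
    for student in students:
        packets[student].append(objective)

for student in sorted(packets):
    print(student, packets[student])
```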

This was probably the same amount of work that I would expend on an ordinary timed test. Making out the problems didn’t take any more time. Grading them was about the same as for an ordinary test, with a slight speedup because I am not assigning partial credit to the work. (It’s only a *slight* speedup because I *am* giving feedback that’s more detailed than what I’d give on a regular test.) I have to alphabetize seven stacks of papers instead of one, and that’s time consuming but trivial. So really, once the logistics are ironed out, this isn’t that much more work. For future periods, I’ll have to make out new instances of problems that assess particular objectives, but that’s not usually very hard.

And there are many things I like, and the students like, about this way of testing. First there is minimal pressure; if you don’t Pass an objective, just do it again later. Second, because students are choosing what they want to be assessed on, it makes them think intentionally about their preparation, rather than engaging in “studying” (which usually isn’t very purposeful) and showing up *hoping* they do well. Third — and multiple students mentioned this as a positive — this system does not let you disengage from course material once the test has been given. Because the test is not over yet! If you don’t pass, try again later — but you do need to *try again*.

That last item is the main thing that makes me feel OK about not having a comprehensive final exam. If students are continuously revisiting material that they had not previously mastered and re-attempting problems to demonstrate mastery, the need for a comprehensive final exam diminishes.

As a last word, I think that even if you aren’t into specs grading and the whole no-partial-credit idea, this way of doing timed testing could still work. I should give credit where it’s due and say that the main idea for doing tests like this came from this Calculus 2 course that does not use specs grading. Just make up one problem per objective and assign a uniform number of points to each objective — say, 10 points per objective. Then students do their work and submit it as described above, and you grade it using your system and rubric for partial credit. Then, the sum total of all the points earned flows into a large pool of points for “Tests”. Over the course of a semester using regular grading, you might give four 100-point tests; in this system you’d be giving 40 problems worth 10 points each. So it’s roughly equivalent in terms of contributions to the course grade. There are details to work out of course, but in principle this way of testing fits into ordinary courses in addition to specs-grading courses.

**Want to continue the conversation?** Follow me on Twitter (@RobertTalbert) or on Google+, and share this article on your networks using the social media buttons above.

**What I used to think:** Pre-class activity in a flipped learning model is about mastering content-oriented instructional objectives.

**What I think now:** Pre-class activity is for *generating questions*.

I attended a talk by Jeremy Strayer last year, and he said something that stuck with me: that *the purpose of pre-class work in the flipped classroom is to “launch” the in-class activity*. In flipped learning we certainly want students to pick up fluency with basic content and learning objectives prior to class. But I think Jeremy’s point is that content delivery shouldn’t be the *primary purpose* of pre-class work. Rather, it should be to prime the student intellectually to engage in whatever high-level tasks we have devised for the in-class meeting.

This point was echoed in this study from Stanford, which suggests that while the flipped learning model is itself an improvement over a standard lecture-oriented model, there are even stronger learning gains among students when their pre-class work consists of open-ended explorations of concepts that precede a more text-based study of those concepts. The Stanford study suggests that “flipping the flipped classroom” in this way is the best approach.

So based on all of this, I’ve started leaning away from “content delivery” as the main purpose of pre-class work and toward the notion of *question generation*. For example, when we get around to studying Eulerian paths in the discrete structures course, the pre-class activity will go like this:

- Read the section in the book about Eulerian paths and circuits and pay particular attention to the definitions. I often provide some optional additional videos for students to use if the book is too opaque.
- Look at a graph and identify whether a given path/circuit is Eulerian or not. (Thereby ensuring that students know the basic terminology.)
- Now, go play with the following script in Sage:
```
g = graphs.RandomGNM(10, 20)
print g.degree_sequence()
g.is_eulerian(path=True)
```

Run this script as many times as you need, changing the 10 and the 20 to adjust the number of nodes and edges if necessary, until you get `True` for the `is_eulerian` output. What do you notice about the numbers in the degree sequence list? What’s different about this degree sequence from the degree sequence for a graph that isn’t Eulerian?
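If you don’t have Sage handy, the same exploration can be sketched in plain Python with just the standard library. This is a stand-in for the Sage calls above, not the Sage API itself; `random_graph` and the other helpers are names I’m inventing for illustration:

```python
import random
from itertools import combinations

def random_graph(n, m, seed=None):
    """Choose m distinct edges uniformly from all pairs of n nodes
    (a rough stand-in for Sage's graphs.RandomGNM)."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(range(n), 2)), m)

def degree_sequence(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def is_connected(n, edges):
    """Depth-first search from node 0."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def has_eulerian_path(n, edges):
    """A connected graph has an Eulerian path exactly when
    zero or two of its vertices have odd degree."""
    odd = sum(d % 2 for d in degree_sequence(n, edges))
    return is_connected(n, edges) and odd in (0, 2)

edges = random_graph(10, 20, seed=1)
print(degree_sequence(10, edges), has_eulerian_path(10, edges))
```

The condition being checked (connected, with exactly zero or two odd-degree vertices) is the pattern the pre-class questions are nudging students toward.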

In this way, students may have mastered a smaller list of basic learning objectives than they used to, but they are coming to class more invested in the main idea of the section (the fact that a connected graph has an Eulerian path if and only if it has exactly zero or two vertices of odd degree) and with their antennae up, so to speak.

Here’s another evolving thought:

**What I used to think:** Students in a flipped classroom need to have some graded measure of accountability when they arrive at class (an entrance quiz, etc.) to ensure that they do the pre-class work.

**What I think now:** Accountability doesn’t have to look like a quiz.

I used to take it for granted that in order to get students to do any sort of work, I needed to attach a grade to it. My mind changed on that last semester in the first-semester discrete structures class I was teaching.

I was giving entrance quizzes covering the basic learning objectives from Guided Practice – pretty much one every class day, just as I had done in almost all my previous flipped classes. But I began to notice that the quizzing was causing more problems than it solved. Students were telling me – both directly through their comments and indirectly through body language – that they were tired of being quizzed all the time.

So I decided to try something radical: *Don’t give them a quiz at all.* Just let them do the Guided Practice and let that be that. What surprised me was that not only did the completion rates for Guided Practice not get worse, the students’ comprehension of basic ideas improved, and their work in class improved as well. And we had more time in class for the high-level work inherent in a flipped class meeting.

This may not work for all student demographics, but I also have to say these students were mostly freshmen and sophomores, and not all intrinsically motivated by the material. And *reducing* the quizzing *improved* their work, at least to my eye.

So in my classes now, there is accountability but without graded quizzes. It really looks more like *responsibility* than *accountability*. I’ve been sticking to a simple message that *students are adults* – and preparing for a class is an adult responsibility that they are expected to perform; giving them detailed guidelines and lots of help in getting prepared; and spelling out what failure to prepare does to their learning. If students show up unprepared, we move on as if they had prepared – no exceptions. The “accountability” consists in holding hard boundaries on what we will and will not do in class.

And so far, I’ve had no problems with students showing up unprepared. Maybe some *do* show up unprepared, but they’ve been able to participate and learn anyway. (I have clicker question data to prove it.)

I think this, along with my shift to specs grading, signals a more general shift in my teaching toward the concept of andragogy as opposed to pedagogy – treating students as responsible adults rather than as children who need constant supervision and rule-setting. I have more to say about that later.

Finally, a third evolving thought:

**What I used to think:** The in-class instruction in a flipped class should focus primarily on active student work with little to no lecture.

**What I think now:** The in-class instruction should focus on two things: Answering questions, and engaging students in high-level tasks – and lecture can play an important role in both.

This post from Kris Shaffer left me in thought for days, especially his thoughts about the “geographical flip” versus the “timing flip” and his preference for the latter, which is also rooted in that Stanford study I linked earlier. Kris does something here which I wish more of us would do, which is to transcend the shopworn “lecture sucks” narrative and instead try to craft the best pedagogy that combines the *most effective* uses of several modalities.

I’ve found myself being much more amenable to lecture in class these days. I *plan* those lectures: They are short, surgically targeted at the most common misconceptions or aimed at specific student questions that were voiced during the pre-class assignment.

One of my colleagues once told me that he loved teaching because as a mathematician, he loves difficult problems, and teaching is a problem whose solutions generate even more and harder problems. I enjoy this aspect of teaching too, along with the continuous evolution of thought that it requires.


A lot of what I am going to write here is a repeat of what I wrote earlier on December 22, but between that previous article and the actual start of classes two weeks ago, I made some significant changes. So bear with me if it sounds like I’ve said a lot of this before – I have, but this is the final version.

I’d taught this course before a couple of times using a traditional grading system. The course is one of my favorites to teach, but traditional grading always seemed like a poor fit. Since the course is required for the CS major but is not itself a prerequisite for any further course, there’s a strong temptation to treat it like a hoop to jump through, even among the best students in the course. It was a course where the grades very often gave a lot of false positives (high grades from students who didn’t really demonstrate consistent mastery – like strong students who slum through and get “B”s by getting only 80% on everything) and false negatives (low grades from students who weren’t trying hard enough, or who got caught up in other things). For that reason alone, I felt specs grading was a good fit for the course, since it emphasizes students being in control of the grades they earn and providing concrete evidence of mastery. I also felt that CS people would understand specs grading better than most, given the emphasis on competency-based learning in CS and in computing certifications.

So here is how the assessment was set up in the course.

Before writing out the syllabus, I went through the entire course and the textbook, section by section, and identified all the learning outcomes I felt were appropriate for the course. As with any learning objectives, I wanted to phrase these using action verbs whose attainment could be easily measured. I ended up with 68 of these. However, just as when I write Guided Practice assignments, I realized that some of these objectives were what you might call “basic” and others “advanced”. So I separated them into two large lists: one of 35 “basic” objectives and another of 33 “advanced” objectives.

Then I realized something else – that among the advanced objectives, there were some that were more important than others for true understanding of the course material. The full list of advanced objectives is something I would expect an “A” level student to master. But the subset of really-important advanced learning objectives is something that *all* students (or, at least those at a “C” level or higher) should master. I ended up identifying 20 of these “core” learning objectives – objectives for which any student who earns a “C” or higher in the course ought to show multiple instances of mastery.

I ended up calling the “basic” objectives **Concept Check** or **CC** objectives; the “advanced” objectives **Module** or **M** objectives; and the subset of really important advanced objectives **Core Module** or **CORE-M** objectives. Here is a document of all those objectives; there is a list of objectives by topic, and the same list again remixed by type.

The weird nomenclature for the objectives comes from the way that I chose to assess those objectives.

- **CC** objectives (the simple, low-level objectives) are measured through *Concept Checks*, which are short objective quizzes given in class. These objectives are simple enough that the quiz items can be fairly assessed through stating definitions, multiple choice, true/false, or simple calculation questions where a right answer can reasonably be interpreted as mastery, and a wrong answer as no mastery. Each Concept Check assesses 3–5 CC objectives, with one item per objective.
- **M** objectives (the harder, higher-level objectives) are measured through *Learning Modules*. These are analogous to Linda Nilson’s “learning bundles”: thematic homework sets done outside of class that include items going deeper than the objective items on Concept Checks. Some include creative items like programming or proofs, and all of them include an item asking students to engage in some metacognition, reflecting on their work process and how they could improve.
- **CORE-M** objectives (the subset of most important advanced objectives) are measured not only through Learning Modules but also through *timed assessment*. I have this set up as follows. There are four class meetings during the semester set aside specifically for timed assessment on the CORE-M objectives. Prior to those periods, students look through the list of CORE-M objectives and decide which ones they believe they are ready to be tested on. I make out one problem for each CORE-M objective that students requested and bring a stack of copies of each problem to the assessment period. Then students come get the problems they wanted to work, work them, and hand in the ones they want me to grade. Students can also ask to retake any CC objective they didn’t pass on the in-class Concept Checks – and, importantly, they can retake any timed CORE-M problem they didn’t pass in a previous timed assessment period, as well as take any CORE-M problem from earlier in the course if they weren’t ready to be assessed on it before.

There are two other items in the class – Guided Practice (since this is a flipped class) and an Application Project.

True to the specs grading ethos, nothing gets partial credit. The concept checks are graded on the basis of right/wrong answers only; if a student gets a right answer on an item for a particular objective, the student gets a Pass mark on that objective; otherwise the student gets No Pass. In fact *there is not a single item in the course that has a numerical value attached to it at all*.

Grading on the Learning Modules and timed CORE-M problems is also done Pass/No Pass, but it’s more involved because of the advanced nature of the questions. For these, we have a detailed set of specifications that categorize Passing work. To mark student work on Learning Modules and CORE-M problems, I read the work carefully, stick to the specifications, and based on the specs and on my best professional judgment, I decide whether the whole of the work is Pass or No Pass.

The class has three main topics in it – relations, graphs, and trees – and I decided to have three Learning Modules for each topic. There are also two non-content Learning Modules: a “Getting Started” module that has students work through an online quiz over the syllabus, write a mathematical biography, and set goals for the semester; and a “Tech Competency” module that has students set up a SageMath Cloud account and use it to write some basic Python code and a Markdown document.

So that’s the assessment structure for the course. We have several Concept Checks to assess basic skills; Learning Modules to assess higher-order thinking; and timed problems to give double-coverage of assessment on the most important higher-order skills. Plus Guided Practice to help students prepare for class work and gain fluency with basic ideas on their own before class, and an Application Project at the end.

The students’ grades are determined, simply, by the number of items that they pass. Here’s the main table from the course syllabus:

There are also rules for earning plus/minus grades because, frankly, my Dean forced me to include them. I asked whether I could opt out of giving plus/minus grades and was told that if the university has plus/minus grades on the books, I have to provide a means for earning them. I would have liked not giving plus/minus grades, but whatever. You can read the whole syllabus here.

Under this grading scheme, for a student to attain what I consider to be a minimal baseline competency in the subject, he/she has to pass a little over 70% of the CC objectives, 60% of the CORE-M timed problems, the Getting Started and Tech Competency modules along with 67% of the other modules, and prepare successfully for class 75% of the time. And notice that only the students who are aiming for an A or B grade have to do the Application Project; so only the most highly motivated students are going to be working on it. (Conversely, the highest grade you can get in the class without the project is a C+.)
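Checking a student’s standing against a baseline like this is just a handful of counting comparisons. Here’s a hedged sketch of the C-level baseline; the thresholds are back-of-the-envelope numbers derived from the rough percentages in this paragraph, not from the actual syllabus table:

```python
# Thresholds are illustrative, derived from the rough percentages in
# this post; the actual syllabus table is the authoritative source.
def meets_c_baseline(cc_passed, corem_passed, other_modules_passed,
                     gs_and_tech_passed, prep_rate):
    return (cc_passed >= 25                # a bit over 70% of 35 CC objectives
            and corem_passed >= 12         # 60% of 20 CORE-M timed problems
            and gs_and_tech_passed         # Getting Started + Tech Competency
            and other_modules_passed >= 6  # 67% of the other 9 modules
            and prep_rate >= 0.75)         # prepared for class 75% of the time

print(meets_c_baseline(26, 13, 7, True, 0.8))  # a passing profile
```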

If anything, I think that the baseline for a C here is too soft. Perhaps setting the CORE-M and Learning Module cutoffs at 70% would have been more appropriate; we’ll see. However the Learning Modules are going to be pretty challenging and rigorously assessed — a “Pass” is what I would normally consider to be B+ level work — so passing only 6/9 of them might be appropriate C level work.

Here are some things that I like very much about how this all turned out.

First: We are replacing partial credit with *feedback plus revision*. If a student turns in work that is subpar, I no longer have to agonize over how many points to give it. If it’s truly not good enough to meet our specifications, I mark it No Pass and give it back to the student with detailed feedback on what was subpar and how to fix it. This seems to encourage students not to give up on a problem – if they had issues with a topic or problem earlier in the course, as long as they have the will and the means to try again, they *can* try again. To me this seems much more humane, and much closer to actual professional practice, than partial credit, which seems rather to encourage a fixed mindset in students (once they get a grade on an item, that’s how good they are on it, regardless of how much they learn from that point forward).

Second: It forces students to pay attention to what they are doing and give their best work. If you submit a Learning Module but it’s in the wrong format, or you leave a problem off because you’d rather not work on it, etc. you will get a No Pass. You can’t get a B in the class by earning 80% on everything and not demonstrating real mastery on anything.

Third: It puts students in charge of their grades. It seems like in traditional grading systems, grades just sort of *happen* to students. They come into our classes and have the mindset of *I’m going to work hard and hope that I get an A*. Instead, my students started off the semester in the Getting Started module by going through this goal-setting exercise and answered the question: **What grade do you want to earn in this class?** and followed that up by listing all the requirements for that grade. Students know, at all times, exactly what they need to do to meet their goal. It’s very *intentional* and I think that’s a refreshing change.

Fourth: Specs grading has already changed the narrative about student work in the class. The conversations I’ve had with students have not been about *points* but about *math* – and occasionally about work habits and how to manage tasks and projects in order to get one’s work done. But these are good conversations, not loser conversations about how many points one has to earn on the final exam to get a B for the class. I am hopeful that it will change the way students think about themselves too – not as passive bystanders but as intentional actors in their education.

It’s still early and as I continue to roll this system out, I’m sure I will find bugs and loopholes. But as of week 3, I can say that students are not rebelling over the system; in fact quite a few students were enthusiastic about it. Many of them liked the fact that once you have met the requirements for a course grade, *nothing* can lower that grade – so there is an air of safety in the class. Some liked the de-emphasis on timed testing – I like that too. Many students are still puzzling this out and are reserving judgment, which is fair. The main idea is to continue to listen to students and act in their best interests, which is what I believe specs grading is allowing me to do in a way that traditional grading did not.

Discrete Structures for Computer Science 2 is the second semester of a year-long sequence in discrete mathematics aimed specifically at computer scientists. Here is the newly revamped syllabus for the course and here is a document that will go out with the syllabus that details exactly how the assessment and grading will work.

Modern Algebra 2 is the second half of a year-long sequence on, obviously, abstract algebra. This particular course focuses on group theory (we start with rings here, then go to groups). I only put the syllabus together this morning, and it’s not 100% complete yet (as of 12/22) but here it is — I will be making up a separate, longer document about the assessment and grading later, but an abbreviated description is in the syllabus.

Generally speaking I like how this turned out. Here’s an overview of the implementations; both courses are pretty similar, with one important difference I’ll point out momentarily.

I started by writing out all the learning outcomes I wanted students to demonstrate over the entire course. Initially, this was just a laundry list of things they should be able to do. As I wrote these out, though, I realized that in both courses the objectives fell into three categories. First, there are the foundational learning objectives at the base of the Bloom’s Taxonomy pyramid — tasks like stating definitions of terms, stating mathematical theorems, building simple examples, and doing simple calculations. Then there are the more advanced learning objectives that sit higher in Bloom’s taxonomy, like analyzing the structure of a graph, proving a theorem, or implementing an algorithm in code.

So far, that’s no surprise since I’ve been saying that we should separate basic from advanced learning objectives for some time now, as a means of setting up a flipped classroom structure. But then I discovered there’s a third category: There is a subset of the “advanced” learning objectives that are not only advanced, but especially important and merit special levels of assessment.

For example in the Discrete Structures course, this is one of the “advanced” learning objectives:

Given an equivalence relation on a set, determine a partition of the set using the equivalence relation. Conversely, given a partition of a set, use it to define an equivalence relation on that set.

This is more complicated than, say, determining whether two given elements are related under an equivalence relation. It would be fairly easy to categorize this as “analysis” or “application” in Bloom terms. But while this is “advanced”, it’s not what I would consider to be one of the *top* objectives in the course. Not like, say, the following similar objective:

Given an equivalence relation on a set A and a point x in A, determine the equivalence class of x.

This objective is somewhat simpler than the previous one, although it’s still “advanced” in the sense of not being mere recall of a definition or a simple calculation. (It’s not a simple calculation in general, because determining an equivalence class often involves conjecturing what the equivalence class consists of, and then proving it.) But if it came down to it, I would say it’s more important that a discrete math student be able to determine an equivalence class than it is to work with set partitions. (Your mileage may vary.)
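To make the second objective concrete, here is a small Python sketch that computes equivalence classes and the induced partition, using congruence mod 3 as a stand-in relation (the helper names are my own, invented for illustration):

```python
def equivalence_class(x, A, related):
    """The set of all elements of A related to x."""
    return {y for y in A if related(x, y)}

def partition(A, related):
    """The set of equivalence classes (frozen so they can live in a set)."""
    return {frozenset(equivalence_class(x, A, related)) for x in A}

# Stand-in relation: congruence mod 3 on the set {0, 1, ..., 8}.
A = set(range(9))
related = lambda x, y: (x - y) % 3 == 0

print(equivalence_class(4, A, related))  # the class of 4 is {1, 4, 7}
print(partition(A, related))             # three classes, one per residue
```

Note how `partition` falls out of `equivalence_class` for free, which mirrors the relationship between the two objectives above.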

I bring this up because while coming up with these objectives was relatively easy, figuring out how to *assess* them, especially using specs grading, was not. First of all, the typical math course contains procedural knowledge that needs to be assessed, like computation and example-making, and usually we want to assess student mastery of this knowledge on demand and not through take-home assessments — but most of the examples given in Nilson’s book involved take-home work done in courses without the same level of procedural knowledge as a math class. Second, upper-level math courses often involve proof, and writing proofs is… *different*.

So I determined that I’d assess students in the course in three different ways:

- For the simple, basic learning objectives, students will be assessed through a series of timed in-class quizzes called Concept Checks. The objectives covered by Concept Checks are labeled **CC**.
- The advanced objectives will be assessed through Learning Modules, which are what Nilson refers to as “bundles”. These are take-home, and there are two levels: “Level 1”, which involves basic work on the advanced objectives, and “Level 2”, in which students get “higher hurdles”. The objectives assessed by the modules are labeled **M**.
- Among the M objectives is a subset, the top 50-60% that I consider to be at the core of the course, called **CORE-M** objectives. These will be assessed twice: once through the modules and then again in a timed setting.
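To make the structure concrete, here’s a minimal sketch in Python of how the three objective tiers map onto assessment venues. All the names here are hypothetical illustrations, not anything from the actual course materials:

```python
from dataclasses import dataclass

# Hypothetical model of the three objective tiers described above:
# "CC" objectives are basic and assessed on timed Concept Checks;
# "M" objectives are advanced and assessed on take-home Learning Modules;
# "CORE-M" objectives are the central subset of M, assessed twice --
# once in the modules and again in a timed setting.

@dataclass
class Objective:
    label: str          # e.g. "CC.3" or "M.7" (made-up labels)
    description: str
    core: bool = False  # only meaningful for M objectives

def assessments_for(obj: Objective) -> list:
    """Return the venues in which this objective is assessed."""
    if obj.label.startswith("CC"):
        return ["Concept Check (timed)"]
    venues = ["Learning Module (take-home)"]
    if obj.core:
        venues.append("Timed assessment period")
    return venues

# Example usage with made-up objectives:
cc = Objective("CC.3", "State the definition of an equivalence relation")
core_m = Objective("M.7", "Determine the equivalence class of a point", core=True)

assessments_for(cc)      # a CC objective appears only on Concept Checks
assessments_for(core_m)  # a CORE-M objective appears in both venues
```

The point of the sketch is just that CORE-M is not a third kind of objective so much as a flag on an M objective that routes it through a second, timed assessment.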

Here’s the complete list of course objectives for Discrete Math and the complete list for Modern Algebra.

The documents I linked above go into detail on how all this comes together with student work and assigning course grades, so if you’re interested, please read those, especially the tables that show the requirements for a particular grade. The bottom line is, students will get assessed on the basic stuff through timed quizzes (that have built-in revision opportunities), on the advanced stuff through modules, and on the most important advanced stuff through additional timed assessments (again with revision opportunities).

Here are some things that I like very much about how this system is working out.

1. **There are no points on anything**. Everything is graded either Pass or No Pass based on specifications. I have the specs in yet another document (here for the Discrete Math class). So there is no statistical jockeying, no grubbing for points, no treating points like magic fairy dust that I can sprinkle on a student’s performance and make them have a B. There’s a very good chance that the conversations I’ll be having with students now will be about math, rather than points.

2. And because of this, **students must demonstrate mastery of a certain number of topics to earn a passing grade in the course**. They cannot get a B by getting 80% on everything and 100% on nothing.

3. **It puts students in control of their grades**. Students are opting into the level of attainment they want and the workload they get, rather than being forced to do all the work and succeeding partially. And it’s transparent: If a student wants to know how far off he is from a B, he just looks up the list of requirements to get a B and compares it to his present body of work.

4. **It makes students be more intentional about their work**. Students have to start off the semester by thinking carefully about their strengths, weaknesses, and ambitions and then setting a goal for themselves for their grade, and then they know what work they need to do. And they can renegotiate that goal at any time. No more just half-trying on everything and getting what they can get.

5. **It relieves student stress in some important areas**. One of the things I am going to point out to students, for example, is that once they’ve reached a threshold for a particular grade, then they have that grade, and nothing can lower it. I figured out that it’s possible for a student to meet the requirements for a B around the eleventh week of the semester. If they can do that, then they have earned a B, and nothing can lower that grade because they have amassed a body of work that shows me they’ve attained a “B” level of skill.

6. The “catch”, if there is one, is that there is no partial credit on anything. However I see this as a great learning opportunity: **One of the things I will have to teach students in the course is how to distinguish professionally acceptable work from work that is not professionally acceptable**. This is a great thing to learn, especially in these 300/400 level courses, and frankly I don’t know why this hasn’t been a goal of my courses before now.

An important modification to the basic specs grading idea was made for the Modern Algebra class, which is proof-based. Proofs are different. It’s very easy to write a mathematical proof and be utterly convinced of its correctness, only to find a flaw later, and the line between minor flaw and fatal flaw can be extremely fine. Also, Bret Benesh made a great point when he said that specs grading presupposes that students can understand the criteria used to judge professionally acceptable work in a subject, and that while this is a realistic assumption in some disciplines, in mathematics it’s not realistic at all. It can take years, even decades, for a mathematics student to really come to an understanding of what makes a mathematical proof correct and what doesn’t.

So I modified the specs grading idea to use a three-level rubric rather than a two-level one: instead of Pass and No Pass, there’s **Pass**, **Progressing**, and **No Pass**. Progressing is for proofs that are “almost there” but have a small number of key corrections to make. It’s like a paper submitted for publication that needs revision: you don’t want to reject it, because it’s a pretty good paper, but it’s also not ready to be published yet. In my system for Modern Algebra 2, only “Pass” counts toward the requirements for a grade; but work that gets marked “Progressing” can be revised by the student and resubmitted with no penalty. Work that gets marked “No Pass” can also be revised, but the student has to spend a token to do it.
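The revision rules are simple enough to summarize in a few lines of Python. This is just a sketch of the policy as described above, not actual course software:

```python
from typing import Optional

# Sketch of the three-level proof rubric for Modern Algebra 2:
# only "Pass" counts toward a grade; "Progressing" work can be
# revised freely; "No Pass" work costs a token to revise.

def revision_cost(mark: str) -> Optional[int]:
    """Tokens required to revise and resubmit work with this mark.

    Returns None when no revision is needed (the work already passed).
    """
    if mark == "Pass":
        return None  # counts toward grade requirements; nothing to revise
    if mark == "Progressing":
        return 0     # revise and resubmit with no penalty
    if mark == "No Pass":
        return 1     # revision allowed, but it costs one token
    raise ValueError(f"unknown mark: {mark!r}")
```

The middle tier is what makes the rubric workable for proofs: it gives a free second pass to work that is substantially correct, while still reserving the token cost for work that missed the specs outright.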

I still have some finishing touches to put on these course designs, but I like where they are going, and I don’t foresee any major changes from here on out. Once I start into the semester and work with this system on a daily basis I do expect to encounter the unexpected, so stay tuned.

As an aside, I’ve loved the conversation about grading and assessment that’s been ongoing on Google+ as I’ve live-blogged my course preparations. Theron Hitchman has been a valuable sounding board, through this Google Hangout that we had and in trading G+ posts. Bret Benesh I already mentioned, and Evelyn Lamb and several others have joined in elsewhere. The reports of G+ becoming a ghost town are greatly exaggerated.

Whether or not you’re on board with the idea of specifications grading, Linda’s book is a challenge to re-think the fundamental assumptions we in academia often make about assessment and grading. For me, there were four things that were very clear to me after reading the book that were only partially clear before.

**1. Traditional grading systems work against my goals as a teacher.** At the beginning of this semester, I publicly stated that I was organizing my work around the principles of relationships, balance, simplicity, and kindness. Unfortunately I wasn’t consistently successful at sticking to those principles this semester. And while that’s on me as the professor, when I started really examining my grading and assessment systems, I was surprised to discover the extent to which I had to work — hard! — against those systems to create the classroom environment I wanted.

I think it’s all because of **points**. I think that a large number of the problems that higher education is having are due to the concept of the “point”. We make student success contingent on the accumulation and manipulation of points, so guess what? Everybody focuses on *points*, and on the systems that accumulate and analyze them. I have been seeing this for 20 years as a teacher without really realizing it. How many times have I heard a student say, “*I don’t care what I learn in this class as long as I get a C*”? Or seen students who start the semester excited and curious, fall behind on their points, and become hardened grade-grubbers? Too many.

I don’t think it’s going to be possible for me to give students the kind of education I want them to have when points are at work like a destructive microbe, always there, unseen, eating away at everything.

**2. Traditional grading systems perpetuate a “game-playing” approach to education**. To continue the thread about points: a system based on point accumulation cannot really do anything but promote the idea that the course is a game, the goal of which is to accumulate enough points to win. And to win, you have to play the game. Students ask for more points through “extra credit”. They argue with me over ridiculous minutiae in order to get more points. They do tests and assignments strategically, minimizing point loss in one place in order to gain more points elsewhere. And this points-obsessed, game-playing mentality feeds back into my relationships with students. I’m no longer the consultant to their client, but a Dungeon Master who rolls the dice and deals out the points, an obstacle to be overcome rather than a resource to be tapped. Tell me whether that’s a productive, healthy working relationship.

**3. Traditional grading systems convey precious little information about student learning**. If that weren’t already bad enough, much of the time point accumulation isn’t even a valid form of measurement. Suppose a student in my calculus class accumulates enough points on assignments to earn a B- in the class. What exactly does that tell me about the student? Precisely nothing. Without drilling into the student’s actual body of work, I cannot conclude anything about what this student knows or does not know about calculus. Is the student ready for Calculus 2? Can he even take a simple derivative without screwing up? I have no idea. Ultimately, working with points to determine grades is like averaging zip codes. They’re *numbers*, but they don’t *measure* anything.

**4. Traditional grading systems screw over a significantly large group of students**. Along with false positives, point-based systems produce far too many false negatives. I can vividly remember many of these I’ve seen. The student who consistently figures out the material, but two weeks later than the rest of the class, so she knows the material better than many classmates but gets destroyed on timed tests. The student who comes to the office to talk about math just because she enjoys it, but has anxiety issues on tests. The student who shows a lot of intellectual strength, but he’s a single dad with a difficult family situation and sleep issues on top of it, and so focusing on timed assessments is problematic. You don’t know these students, and the more cynical of you out there are starting to blame them. But *I* know them, and I am telling you they are getting left behind by traditional grading.

So what do I do about this?

I’ve been thinking about standards-based grading for some time. Some of my colleagues use SBG and I know many more SBG people on Twitter. But when I look at the complexity of many SBG systems, I shy away. I want simplicity and SBG doesn’t offer obvious simplicity. I’ve sat down many times to draft out how SBG would work in my courses, and I’ve never come up with an implementation that would not cause an exponential scale-up in my grading workload.

So when I finally got to Linda’s work with specs grading, I got very excited. It seems to offer all that’s good about SBG — especially the focus on student competency measured by actual attainment of learning outcomes rather than point accumulation — and yet it’s simple, flexible, and doable. It’s the first SBG-type system I’ve seen that feels like something that would work for me.

Therefore **I’m making a commitment to dropping traditional grading systems cold-turkey and adopting specs grading in both my classes next semester**. Those classes are well-suited for this system: a second-semester abstract algebra course, and the second semester of Discrete Structures for Computer Science. I don’t know exactly how it’s going to look yet or how it’s going to work. But I think I need to do this, if I want to have the kind of classes to which I aspire and which I want to provide for my students.

**Want to continue the conversation?** Follow me on Twitter (@RobertTalbert), on Google+, or on Ello and share this article on your networks using the social media buttons above.

Our guest this time is Linda Nilson, founding director of the Office of Teaching Effectiveness and Innovation at Clemson University. She’s the author of numerous papers and books on teaching and learning in higher education, including the essential Teaching At Its Best, and she gives regular speaking and workshop engagements around the country on teaching and learning. Her latest book, Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time, is in my opinion perhaps the most innovative, provocative, and potentially revolutionary one she’s done, and it’s the focus of the interview.

I first met Linda almost 20 years ago, when I was a graduate student at Vanderbilt University applying for a Master Teaching Fellowship at the Center for Teaching. Linda was the CFT director and interviewed me for the job, and eventually hired me. The one all-too-brief year I spent working under Linda’s guidance was an incredible time of inspiration and learning for me. So it’s especially great to have her here on the blog.

**1. You have a new book out, Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time. Could you briefly describe what specifications grading means, and what problems does it (attempt to) solve?**

It’s easiest to understand specifications, or “specs,” grading in three parts. First, you grade all assignments and tests satisfactory/unsatisfactory, pass/fail, where you set “pass” at B or better work. Students earn full credit or no credit depending on whether their work meets the specs that you laid out for it. No partial credit. Think of the specs as a one-level, one-dimensional rubric, as simple as “completeness” – for instance, all the questions are answered or all the problems attempted in good faith, or the work satisfies the assignments (follows the directions) and meets a required length. Or the specs may be more complex – a description of, for example, the characteristics of a good literature review or the contents of each section of a proposal. You must write the specs very carefully and clearly. They must describe exactly what features in the work you are going to look for. You might include that the work be submitted on time. For the students, it’s all or nothing. No sliding by. No blowing off the directions. No betting on partial credit for sloppy, last-minute work.

Second, specs grading adds “second chances” and flexibility with a token system. Students start the course with 1, 2, or 3 virtual tokens that they can exchange to revise an unsatisfactory assignment or test or get a 24-hour extension on an assignment. At your discretion, they can also earn tokens by submitting satisfactory work early, doing an additional assignment, or doing truly outstanding work. At the end of the course, you might let them exchange so many remaining tokens for skipping the final exam or give those with the most tokens some sort of “prize,” like a gift certificate for a pizza. Faculty have a lot of leeway in how to set up and run this system, and keeping track of the tokens is no more trouble than keeping track of late submissions or dropped quizzes. Tokens have a game-like value that makes students want to save them. At the very least, they are insurance, and they discourage procrastination.

Third, specs grading trades in the point system for “bundles” of assignments and tests associated with final course grades. Students choose the grade they want to earn. To get above an F, they must complete all the assignments and tests in a given bundle at a satisfactory level. For higher grades, they complete bundles of more work, more challenging work, or both. In addition, each bundle marks the achievement of certain learning outcomes. The book contains many variations on bundles from real courses.

These are the major problems that specs grading intends to reduce: the lack of rigor in college courses, the disconnect between grades and learning outcomes, student confusion over faculty expectations, students’ low motivation to work and to excel, faculty-student grading conflicts, student and faculty stress, students’ sense of not being responsible for their grades, their tendency to ignore faculty feedback, and faculty’s grading burden, which has been growing for years.

**2. Many Casting Out Nines readers are familiar with standards-based grading (SBG). (And if they are not, they can learn about SBG here.) How is specifications grading different from SBG?**

Both grading systems replace the accumulation of points with skills assessment, and “standards” in K-12 terminology are equivalent to “learning outcomes” in higher education lingo. However, SBG gives a verbal description of the degree of mastery achieved, and students are allowed unlimited attempts to show mastery. In specs grading students get a pass or fail assessment of their work and maybe one chance at a redo. The point is not to address student weaknesses nor to give feedback. Specs grading assumes that there’s no reason why students shouldn’t be able to achieve the outcome(s) the specs describe. The specs are essentially directions on how to produce a B-level-or-better work or the parameters within which students create a product. If students don’t understand them, they have to ask questions.

**3. In this article that gives a thorough summary of a workshop you did on specifications grading, it said that you “could hardly complete three sentences without addressing a new faculty concern”. What are maybe the top 2 or 3 concerns you hear about specifications grading, and how did you address those?**

These issues came up not only at Pitt but also on my main professional listserv and at three other institutions and conferences where I’ve conducted a workshop on specs grading. Let me pair the first two concerns because my answers to them overlap.

*1. If we tell students precisely what to do in the specs, they won’t learn to figure things out, make their own decisions, or be creative. All they will learn is how to follow directions.*

*2. How do you specs-grade major assignments, especially if they are sophisticated or creative?*

How much direction you provide depends upon the assignment and, ultimately, your learning outcomes. If you want students to learn how to do something fairly formulaic, you will want to give pretty detailed and precise instructions. For instance, these assignments follow a formula or template, even though the topics may vary widely: a five-paragraph essay, a review of the literature, a research proposal, a lab report, a corporate annual report, a press release, and some kinds of speeches. Some of these formulas are very sophisticated and well worth learning. For example, a teaching philosophy can follow the five-paragraph essay format, and scientific journal articles also follow a formula.

Other assignments may not be formulaic, but we want students to address certain topics that they might not think of including. If we assign a lengthy reflection, which is a pretty loose task, we would serve our students well by listing the questions they have to answer and the approximate number of words we want their answers to be. If the students honor the number-of-words specifications and answer all the questions, they “pass” that assignment.

For more creative assignments, you need give only the barest directions and can offer plenty of choices for students. For instance, one final project encourages students to take something important they have learned in the course (any brain and behavior topic) and creatively communicate that information to others in one of many possible modalities, such as a documentary video, a series of commercials, a collection of pamphlets for a specific audience, a staged debate, an educational play, or a job talk. The instructor’s specs are length parameters, such as how long a video, play, debate or whatever should be. Another faculty member has her students do a mind map of the course material as the capstone assignment. Her specs are simply the minimum number of first-level “branches” and branching levels.

By implication, the size of the assignment is irrelevant.

*3. Won’t faculty feel pressured to pass any work, especially when the stakes are high?*

The token system works in our favor as well as the students’. Faculty can judge a piece of work unsatisfactory and give the student a chance to do it again at the required level of quality. Of course, second chances have to be limited.

**4. If I understand the specifications grading idea correctly, students self-select the grade level they wish to attain. Do you worry that students will elect not to strive for the highest possible level of attainment — that they’ll settle for a B when they are capable of getting an A — or that students may self-select out of a grade level just to avoid higher-level learning tasks?**

The only students I worry about are those from underrepresented groups and those who are first-generation, because they may not believe in their abilities enough to aspire to the A. They need special, individual encouragement from faculty. Other than these students, why should we mind if a student opts for a B or a C in our course because that’s all she needs to serve her purposes? In traditional grading, students opt for lower grades by submitting less-than-their-best work, which takes more time to grade and just adds to our workload. In specs grading, if students opt for a B or C but completely meet those requirements, we can respect this as their decision and not a reflection on their character or abilities.

A-students haven’t slacked in actual specs-graded classes, and I don’t think we have to worry about them. Of course, we can and should praise their work in our comments, but such students will continue to excel because they are self-motivated and take pride in their coursework. Specs grading may even help them relax and foster their intrinsic motivation.

**+1. What question should I have asked you in this interview?**

*What are my hopes and expectations for this somewhat radical book?*

My hope – in fact, my personal career mission – is to make the faculty’s life easier and more rewarding. I expect that some faculty, especially younger ones, will “get it” and readily embrace specs grading, and my book lays out how to make the transition. It also offers ways to synthesize specs and traditional grading, as some faculty may adopt only part of the specs grading system – just pass/fail grading, or just bundling, or just tokens. But only the whole system addresses all the problems with our traditional system I listed above. We need a change, or at least better alternatives.

As with everything, there is more than one way to do this. I used to use a Wacom tablet and the Flysketch app to annotate PDF’s and record the action. I’m more iPad-centric these days but even now my methods are still a work in progress. Since the making of this video, I’ve tried to make a couple of working example videos for my classes, but the Doceri window that appears on the Mac has a lot of flicker on it, so much that it’s distracting when viewing the final product. I don’t know why this is the case, but it’s led me to consider other workflows, including recording the entire screencast on Doceri just on the iPad and to heck with the good audio.

A product to watch if you want to do working example videos with an iPad is Camtasia itself. The latest upgrade allows users to get a direct screen capture from an iPad that is connected to the Mac with a Lightning cable, so the iPad becomes basically a second monitor that Camtasia can record as usual. I haven’t tried this because my iPad is an older model that doesn’t have a Lightning connection, but I’m quite excited to give that a try.

Anyway, enjoy.

I should note that I probably overcomplicate this process. In PowerPoint and Keynote, for example, you can record a voiceover while the slides are playing: just use the built-in computer microphone, and there’s no additional hardware or software needed. I’m just a stickler for good audio, and so I go to these lengths so I can use what my colleague next door refers to as the “Edward R. Murrow microphone”. And ironically, the audio on this one didn’t turn out extremely well.

But anyway, here you go, and if you have a workflow for your own videos that you like that’s different from this, let me know. The next and final video will focus on the “working example” video, where I am working out an example on an iPad screen.

This was particularly clear last Monday morning. I have this condition I call “Sunday Night Insomnia” which involves me sleeping extremely poorly almost every Sunday night. I don’t know why this happens. It’s not because of nervousness – most of the time I’m not even thinking about work the next day – or being in front of the TV, or drinking tea right before bed, or anything like that. I just have a tendency not to fall or stay asleep on Sunday nights. Last week, the Sunday Night Insomnia was the worst I’ve ever encountered. I was wide awake after turning in. Still wide awake at midnight. Still wide awake at 1:00am. At 1:30, still awake, I decided, screw it – if I’m going to be awake I am at least going to be productive. So I went downstairs and graded until 3:30am. I finally went back to bed at 4:30am and slept a grand total of, I think, 75 minutes.

So when I got into work to teach my 9:00am Calculus class, my temper was on a hair trigger. As we began the class, students weren’t as engaged as usual – maybe they have the same condition as I do – and although I ordinarily maintain my cool, I could feel anger and harshness boiling up inside. I had a notecard with me that has the outline for the day’s class on it. To prevent me from doing or saying something stupid, I started writing notes to myself on it while students were working. I present to you the finished product:

The stuff in the upper left is the ordinary outline. Everything else consists of notes I wrote to myself, possibly because my mind was so gelatinous at the time that if I’d merely *thought* those thoughts, they wouldn’t have stayed around. In case you can’t read them, they are:

- **You are QUICK TO ANGER when tired, so MOUTH SHUT.** Indeed, when I am tired, I can snap at people, raise my voice, jump down people’s throats, whether or not it’s deserved and whether or not I am provoked. In cases where there’s a good chance I will slip up and say something unnecessarily harsh or act out in anger, I should just try to remember not to speak at all, unless it’s required of me.
- **STOP (sign) between brain and mouth.** This is a corollary to the first thought. When I’m that tired, or otherwise emotionally drained, it’s on me to set up checkpoints along the way between what I am *thinking* and what I actually *say*. This is more work for me, but it’s necessary. And the more I read certain articles about university profs or administrators shooting off their thoughts indiscriminately on Twitter or in person, the more I think this is pretty good advice in general, sleep deprivation or no.
- **They remember how you MAKE THEM FEEL more so than what you taught them.** This is attributed to Maya Angelou and slightly modified for teachers. I don’t normally go in for touchy-feely Oprah-like sayings, but I’ve found this statement to be true. I can construct beautiful lessons that are demonstrably effective in helping students learn, but if I act like a jackass in the process and make people feel stupid or small, then what good is it? They will know calculus, and hate it. And the core idea that makes higher education really viable, the formation of a relationship between a student and ideas, and between students and professors, is forfeited, sometimes with just one outburst of anger on my part that could have, should have been controlled. *Trust* is an extremely fragile thing.

One of my students saw this notecard at my instructor station in the classroom as she was leaving and asked me about it, and I told her what it said and why I had to make it – and I think it left a real impression on her. Professors and students tend to objectify each other, and it takes work for us to see each other as multidimensional human beings. I hope not to be this sleep-deprived again anytime soon, but in case I am, I have my notecard.

Those videos went out on the MOOC last week, and now that the Courserians have had a week with them, I’m going to share them with you as well. I made three of these videos. The first one, below, has to do with my approach to lecture and the pedagogical framework for screencasting as part of a flipped-instruction model. The second and third, which I will post later, get into the nuts and bolts of how I actually construct screencasts. I get asked a lot about those nuts and bolts, so it was good to make a couple of videos that dig deeply into the process.

Anyway here’s the first video, and I hope it’s of use in some way.

Making screencasts: The pedagogical framework from Robert Talbert on Vimeo.

Thanks again to Derek for inviting me to share. It was cool to be part of a MOOC, especially with some of the other people featured as guests.