Implementing Standards Based Grading

UPDATE: Since writing this post two years ago, I have refined my assessment system quite a bit. To see some of the changes I've made and my current lab and test rubrics, please view my more recent post: Standards Based Grading 2.0: Using Rubrics Based on The AP Science Practices

Introduction

If you don't want to read the post and instead wish to go straight to the resources, follow these links for a full look at my 2019-2020 AP Physics 1 and AP Physics 2 grading strategies. These documents contain standards organized by unit, assessment rubrics, general policies, and grade breakdowns. These are the documents I will give directly to students.

If you care mostly about the "what" and not really about the "how" and "why," you may want to go straight to these documents and skim (or skip) the blog post. Furthermore, I have not written this post to convince teachers to use SBG. In the post, I focus instead on how I plan to implement it and give some reasons for specific choices.

Before I start writing about the approach I've developed for standards based grading (SBG), I'd like to give credit to a few other blogs that I've found really helpful with this transition.

These teachers (and more) preach a great philosophy for assessment, trying to move away from points and instead grading students on their mastery of discipline-specific skills. This allows for more targeted feedback and guides students to think more about their learning. It makes grades about more than just an outcome; it turns them into a roadmap for improvement.

Before we dig into this I also want to point out that, at the time of writing, I haven't actually implemented this. These are all of my plans for next school year. I anticipate tweaking this strategy after seeing what happens when "the rubber hits the road" so to speak.

Overcoming The Major Hurdles

I have wanted to switch to an SBG system for quite a while, but two main challenges kept me from trying:

It's difficult to assess performance on individual standards and still write questions that are rich and require a synthesis of multiple skills.
Students often struggle on a question because it requires creative/lateral thinking--not because they struggle with the standard that the question seeks to address.

Several times I've sat down and worked to condense the APP1 and APP2 learning objectives into kid-friendly standards. I've taught both courses for the past five years, so this isn't too challenging.

The trouble is when I sit down and try to correlate my kid-friendly standards to existing "good" questions. It always seems like each question addresses multiple standards. Even MCQs often require skills tied to several different standards. AP Physics requires high-level thinking. Students need to bring together many different ideas and are regularly required to transfer ideas into new contexts. It almost seems like SBG just doesn't work for classes like APP1 and APP2. In a regular introductory physics class it would be fine to pare down questions so that they address single standards. However, attempts to do this in AP Physics always result in questions that seem watered down. We need rich questions that require students to transfer their knowledge and synthesize multiple skills. These are the types of questions that tend to show up on the AP exam (not to mention the types of questions we want our students to learn how to navigate).

So does this mean that I need to write new questions that better target each of my standards specifically? Do I need to go through MCQs and FRQs and assign certain point values to each standard that relates to each question or part of a question? After grading student tests, should I go through and tally how many of the possible points students earned for each standard?

For a long time I thought these were the only ways to implement SBG, but I have stumbled onto another plan. It's a bit less objective, but it should provide more accurate and more useful feedback to students in the end.

My intention is to grade their tests through the use of rubrics. I will give tests that are a mixture of MCQs and FRQs that mimic AP-style questions, but I will not score them using points. Instead, I will look over the whole test for evidence of achievement of each standard and then score them on each standard.

This process will be similar to how an English teacher grades an essay--looking holistically for demonstration of a list of criteria given by a rubric. Using this process will allow me to give students rich questions that require a synthesis of skills across different standards. To make this possible, I will need to evaluate each student's assessment multiple times, from multiple perspectives, looking for demonstration of different skill sets each time. For the fine-grained details, read on!

Defining "Standards"

Building an SBG assessment system begins by condensing your learning objectives into kid-friendly chunks. Depending on your curriculum you may be able to just pull learning objectives from there. In AP Physics 1 and 2, however, the learning objectives would really be terrible to give directly to students. Without a background in physics teaching, they're pretty impenetrable. I'll let the work speak for itself; here's one example of a learning objective from the AP Physics 1 curriculum framework:

5.B.2.1 Calculate the expected behavior of a system using the object model (i.e., by ignoring changes in internal structure) to analyze a situation. Then, when the model fails, the student can justify the use of conservation of energy principles to calculate the change in internal energy due to changes in internal structure because the object is actually a system.

(AP Physics 1: Course and Exam Description - effective 2019 p. 88)

Okay, so you can see for yourself that a list of objectives that read like this would be annoying at best and downright baffling at worst in the hands of students. We want our kids to spend their time working on the physics! Not deciphering this stuff. Granted--I may have cherry-picked a learning objective that is particularly esoteric. Nonetheless, we want objectives that are abundantly clear. Students should be able to take the standards and use them as a study guide. At the end of a unit with the standards in hand, a student ought to be able to tie them back to what they learned throughout the unit without teacher help.

Beyond composing standards, I faced another challenge. Next year, my school is switching to reporting by Broad Learning Categories (BLCs). This means that each category in my gradebook must be a broad learning category that is consistent throughout the science department. Our department agreed upon the following BLCs:

High School Science BLCs:

Knowledge and Conceptual Foundations (KCF)
Investigative Practices (IP)
Data Analysis (DA)
Application (APP)
Communication (COMM)

This means that in addition to creating kid-friendly standards, I needed to figure out how to file them all underneath these categories. My approach was to create all of my standards and then place them under the BLC that made the most sense.

Right away I could see that the first and last categories (KCF and COMM) were going to be a bit different from the others. I could also tell that application would certainly have the greatest weight.

I decided that the first category (KCF) is too low-level to be assessed on tests. Good AP Physics questions don't just assess the presence of knowledge; they require students to put knowledge into action. Since I run a somewhat flipped classroom, I've decided that I will only assess KCF through low-level questions related to HW videos. Additionally, communication is really a broad skill and is not very content specific. For this reason, I have written a single communication rubric that will be applied to evaluate student work on every test. All content-specific standards will fall under one of the other three BLCs (IP, DA, and APP).

Here are the standards I developed for the kinematics unit:

These standards are the list of expectations students will be given prior to an assessment. They are also the checklist that I will use as I grade their tests. When I construct the assessment I will ensure that all of the sub-skills are addressed. However, I will not try to disaggregate questions by standard. Instead I will be responsible for looking through each test for demonstration of each of these standards and assigning students a grade for each one.

Determining Evaluation Criteria

I have chosen to evaluate students on a four-point scale. For each standard they can earn 5, 6, 8, or 10. It's important to note, however, that standards don't exist in a vacuum and we don't want them to. Being a good physicist is more than being able to demonstrate a single skill by itself in an obvious context.

For this reason, I've decided to define "mastery" as the ability to take a set of skills related to a standard and: transfer them to new contexts, recognize when they are relevant, and synthesize them with skills from other standards.

In sum, a student is not graded just on their ability to demonstrate achievement of a standard. They are graded on their ability to take the skills within a standard and use them in a broader context.

I also want to point out that the two big hurdles to SBG that I identified earlier have been meaningfully incorporated into the evaluation criteria!

Recall that communication is being assessed. Here are the evaluation criteria that I will use to assess communication skills on tests:

Taking everything together, this means that for the first unit test, students will receive five different scores. They will receive scores for each standard (there are four) and one for communication. I will staple a cover sheet onto their test that shows their score on each standard along with some brief feedback about their performance on each one. Students will not receive an overall test score. Instead, scores will be entered separately into the gradebook with each one going underneath the proper broad learning category.

Gradebook Category Weighting

With the broad learning categories (BLCs), gradebook weighting becomes tricky. My approach was to determine how many standards fall into each category and then weight the categories so that each standard ends up counting for roughly the same amount in the student's final grade. Communication is a bit of an exception--I wanted it to count a bit less. I began by looking at how many standards I had in each BLC:

Number of Standards in Each BLC (Semester 1)

Application: 11 Standards
Data Analysis: 4 Standards
Investigative Practices: 4 Standards
Communication: Assessed 4 times (once on each unit test)

I wanted tests to count as 55% of their final grade. To make each standard count for roughly the same amount in the student's final grade, the following weighting was used:

Weighting of Each BLC in the Gradebook

Application: 30%
Data Analysis: 10%
Investigative Practices: 10%
Communication: 5%
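To check that this weighting really does make each standard count for roughly the same amount, here is a quick sketch in Python. The numbers come straight from the two lists above; the variable and function names are just ones I made up for illustration.

```python
# Category weights (percent of final grade) and number of scores per category,
# copied from the two lists above. Names are illustrative only.
blc_weights = {
    "Application": 30,
    "Data Analysis": 10,
    "Investigative Practices": 10,
    "Communication": 5,
}
blc_counts = {
    "Application": 11,
    "Data Analysis": 4,
    "Investigative Practices": 4,
    "Communication": 4,  # assessed once on each of the four unit tests
}

def per_item_weight(weights, counts):
    """Percent of the final grade carried by a single score in each category."""
    return {name: weights[name] / counts[name] for name in weights}

test_total = sum(blc_weights.values())
item_weights = per_item_weight(blc_weights, blc_counts)

for name, weight in item_weights.items():
    print(f"{name}: {weight:.2f}% per score")
print(f"Tests overall: {test_total}%")
```

Each Application standard works out to about 2.73% of the final grade, each Data Analysis and Investigative Practices standard to 2.5%, and each communication score to 1.25%, with the four categories adding up to the intended 55%.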

If you don't have the requirement or desire to use BLCs, another scheme may be easier. I think it's worth the headache though--there are benefits!

With gradebook categories arranged by BLC, students get a direct gauge of their performance from a skill-based perspective. When students see their performance in each category, rather than coming away thinking: "I struggle on tests" or "I struggle on labs," gradebook category averages will give them insight about what skills they need to work on. With BLCs, a student is more likely to come away from checking their grades thinking something like, "My communication grades have been low so far this semester, I need to sit down with Mr Fazio and talk about ways to improve my writing in free-response questions."

Sorting standards by BLC and placing grades within these categories further embraces the philosophy of SBG. It ensures that the numbers students see are tied back to their demonstration of particular skills.

Reassessment

In the past I have allowed students to reassess for up to 80%. Overall this has been a good policy. The cutoff is low enough that students are incentivized to try hard the first time, yet high enough that students who struggle a lot can pull through with a B if they really work at it and improve their performance on reassessments. Allowing students to reassess up to 90% or 100% would have my students "gaming the system" and depending too much on reassessment, though this is highly dependent on school culture.

Next year I plan on maintaining this policy. Students can reassess, but on reassessments I will be looking only for proficiency (8), not mastery (10), so the highest score a student can reach through reassessment is 8.
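The cap can be written down as a tiny rule. This Python sketch is only an illustration: the function name is made up, and it assumes that a reassessment never lowers a score a student already earned, which is not something the policy spells out.

```python
SCALE = (5, 6, 8, 10)   # scores available on each standard
REASSESSMENT_CAP = 8    # proficiency; mastery (10) is only available on the original test

def updated_score(original, reassessed):
    """Score recorded for a standard after a reassessment.

    Assumption (not stated in the policy): a reassessment never lowers a score.
    """
    if original not in SCALE or reassessed not in SCALE:
        raise ValueError("scores must be one of 5, 6, 8, 10")
    return max(original, min(reassessed, REASSESSMENT_CAP))

print(updated_score(5, 10))  # strong reassessment is still capped at proficiency
print(updated_score(6, 8))   # raised to proficiency
print(updated_score(10, 6))  # original mastery score is kept under the assumption above
```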

The nice thing about SBG for reassessment is that it has lots of room for flexibility. I can make up a problem and have a student solve/explain it on the dry erase board. For some standards, I could even give them an authentic lab task to complete on their own during lunch or using something like Pivot Interactives.

Because I am only trying to evaluate whether a student has met the conditions for proficiency (8), I don't need to have reassessments that are points-based and uniform for different students. I'm not trying to split hairs; I'm just trying to judge whether or not a student has achieved proficiency.

Reassessment will entail the following process:

Within one week of receiving assessment scores, you must sit down and review your test with me. We will discuss the specific improvements that you need to make and skills/content that you need to practice or review.
We will carefully review questions from your test, using them as learning material to improve your understanding of the standards on which you hope to reassess.
You will schedule a reassessment time with me and devise an independent review plan to complete prior to your reassessment.
You will submit evidence of your independent review plan.
You will take a reassessment. This may be a pencil and paper test. I may create a problem on a dry erase board and ask you to solve it and explain it to me. It could even be a genuine lab task. You will not know what the reassessment will entail--so come prepared for anything.
At the end of the reassessment, you will complete a reflection highlighting how you demonstrated improved performance on the standards on which you reassessed. You will also reflect on what you continue to struggle with and what you may need to continue developing in the future.

*If you do not take a test on the assigned date with your section, you forgo all opportunity for reassessment.

4 Comments

Shawn Elwood

Aug 26, 2023

This is exactly what I was looking for. Thank you!

Chad Neeley

Mar 14, 2022

Mate, this is an excellent article that speaks well for the transition, the benefits, and implementation. Well done!

Mar 15, 2022

Already did!... I love the new test/lab rubrics. I have to admit... I think I like the communication rubric better than the argumentation rubric. But to each their own! Thank you so so so much for these resources. I am at an international school in Malaysia that is starting the transition to SBG. Seriously, thank you for your shared thoughts! You are a stud!

The Spherical Cow