Evaluating the Evaluator

A Metacognitive Critique of EdReports' K–2 Literacy Tool

When Thermometers Control the Weather

Historical Context: 250 Years of Assessment-Driven Curriculum

THE CONSTANT: CONTENT FIRST
Yeoman Era: Jefferson, Mann → Test civic knowledge & morality
Industrial Era: Dewey, Harris → Test academic content & vocational preparation
Corporate Era: Tyler, Standards → Test measurable objectives & efficiency
EdReports (Today): Common Core → Test "science of reading" compliance

The Pattern: Society changes → New theorists emerge → Assessment follows → Curriculum narrows to match assessment

Change comes from society, not individuals. Theorists are products of their time.

The Fundamental Flaw: Thermometers Control the Weather

The Paradox

Assessments are supposed to measure learning.

Instead, they shape learning.

We've built a system where the tool meant to observe has become the thing that controls what happens.

🌡️ → 🌦️

When a measure becomes a target, it ceases to be a good measure (Goodhart's Law).

The Result: Teachers teach to the test. Curriculum narrows to what's assessed. Students learn that success = performing on evaluations, not understanding.

The Research Question: Why Evaluate EdReports?

Why analyze an evaluation tool instead of a traditional curriculum unit?

Because EdReports is curriculum

  • It doesn't just review literacy programs—it determines which ones get adopted by thousands of schools
  • Districts use EdReports ratings as the gatekeeper for what millions of K–2 students will read and how they'll be taught
  • Evaluation tools encode assumptions about what counts as "quality curriculum"—making those assumptions worth examining
  • By analyzing EdReports, we can see how 250 years of content-first assumptions are still active inside a modern, "science-of-reading" tool

This analysis evaluates the evaluator: What is EdReports actually measuring? Does it assess curriculum quality or enforce compliance?

The Argument: EdReports Enforces 250 Years of Old Assumptions

EdReports frames itself as modern and "research-based":

"Gateway 1: Alignment to Research-Based Practices and Standards for Foundational Skills Instruction... Do materials emphasize explicit, systematic instruction of research-based and/or evidence-based phonemic awareness? Do materials emphasize explicit, systematic instruction of research-based and/or evidence-based phonics?"

— EdReports K–2 ELA Review Criteria v2.1, p. 2

But beneath this modern language lies compound gatekeeping:

"Materials must 'Meet Expectations' in BOTH Gateway 1 and Gateway 2 to be reviewed in Gateway 3... Materials being reviewed must score above zero points in each indicator. Otherwise, the materials automatically do not proceed to Gateway 3."

— EdReports K–2 ELA Review Criteria v2.1, p. 10

EdReports doesn't measure curriculum quality—it enforces content-first compliance and calls it "science."

The Problem: EdReports Measures Compliance, Not Learning

What We Think Evaluation Measures

  • Deep understanding
  • Critical thinking
  • Student growth

What EdReports Actually Measures

  • Content compliance
  • Phonics-first gatekeeping
  • Corporate Era logic

The Core Issue: Traditional assessment assumes cognition is static—"the same results each time in the same setting" (Sullivan, 2011). But learning is context-sensitive, identity-shaped, and constantly evolving.

EdReports doesn't just review curriculum—it acts as a gatekeeper that enforces a specific philosophy of literacy and learning.

How Do We Test This Claim?

If EdReports measures content compliance instead of curriculum quality,
how do we prove it?

A Two-Part Analysis

Part 1: Test EdReports as an evaluation tool (Wiggins' validity framework)
Part 2: Treat EdReports AS curriculum and evaluate what opportunities it creates

Let's examine the methodology for both parts...

The Hypothesis

I hypothesize that EdReports will fail both as a curriculum evaluation tool and as a curriculum itself.

Part 1: As Evaluation Tool

EdReports will fail as an evaluation tool

Testing validity, reliability, and actual usage patterns

Part 2: As Curriculum

EdReports will narrow curriculum opportunities

Analyzing what learning opportunities it creates or restricts

If correct, both tests should reveal the same pattern: EdReports enforces content-first compliance and calls it "quality."

Part 1: Does EdReports Succeed as an Evaluation Tool?

Drawing on Wiggins (1998), we examine three dimensions to test EdReports as an evaluation tool:

Validity

Does EdReports measure what it claims to measure?

Can curricula succeed/fail on EdReports for reasons unrelated to actual curriculum quality?

Reliability

Does EdReports assume cognition is static?

Does it account for context-sensitive, identity-shaped learning (Sullivan, 2011)?

Usage

How is EdReports actually used?

Does it function as a gatekeeper that narrows curriculum options and enforces compliance?

Part 1 Results: Validity - Does EdReports Measure What It Claims?

Wiggins' Validity Test: Can curricula succeed/fail for reasons unrelated to actual curriculum quality?

What EdReports Claims to Measure

  • "Research-based" curriculum quality
  • Foundational skills instruction
  • Standards alignment

What EdReports Actually Measures

  • Adherence to one specific phonics-first approach
  • Presence of scripted teacher guidance
  • Compliance with narrow literacy definition

The Validity Problem

Yes, curricula can fail for the wrong reasons: A high-quality curriculum designed for Deaf readers would score zero on EdReports—not because it's ineffective, but because it doesn't use phonics-based decoding. EdReports conflates one specific approach with curriculum quality itself.

Part 1 Results: Reliability - Does EdReports Assume Static Cognition?

Sullivan (2011): Traditional assessment assumes cognition is static—"the same results each time in the same setting." But learning is context-sensitive, identity-shaped, and constantly evolving.

What EdReports Assumes:

"Materials... provide reasonable pacing where phonics skills are taught one at a time... [with a] clear evidence-based explanation for the expected hierarchy of phonemic awareness competence."

— EdReports K–2 ELA Review Criteria v2.1, p. 7

Static Cognition Model

  • One universal sequence for all learners
  • Linear skill progression
  • Context-independent learning
  • Same pathway = same results

Reality: Dynamic Cognition

  • Multiple valid pathways to literacy
  • Identity-shaped learning
  • Context-sensitive development
  • Different pathways = equally valid outcomes

Result: EdReports is unreliable because it treats learning as a fixed, universal process rather than a context-dependent, identity-shaped experience.

Part 1 Results: Usage - How Is EdReports Actually Used?

Wiggins' Usage Test: How do schools and districts actually use this evaluation tool in practice?

EdReports Functions As a Gatekeeper:

Compound Gatekeeping Structure

"Materials must 'Meet Expectations' in BOTH Gateway 1 and Gateway 2 to be reviewed in Gateway 3" — Any curriculum that doesn't pass both phonics gates never gets evaluated for usability or quality

Binary Pass/Fail Creates Monoculture

Districts use EdReports ratings as adoption criteria, narrowing curriculum options to only those that align with one specific approach

No Recognition of Context

The same evaluation criteria apply regardless of student population, community values, or local literacy needs

The Usage Problem

EdReports is used as a compliance enforcement mechanism, not a quality measurement tool. It doesn't help districts choose the best curriculum for their students—it narrows options to those that comply with one ideological position on literacy instruction.

The Literacy Problem: When Evaluation Tools Fail

When EdReports fails as an evaluation tool, what happens to literacy education?

The Consequences

  • Districts adopt curricula based on flawed measurements
  • Teachers constrain instruction to match compliance criteria
  • Students experience narrowed literacy opportunities
  • The cycle reinforces content-first assumptions across generations

The tool meant to improve curriculum quality instead restricts what counts as "quality."

EdReports IS Curriculum

EdReports doesn't just evaluate curriculum—it functions as curriculum by determining what teachers teach and what students learn.

How do we test this claim?

If EdReports is curriculum, we need to evaluate it as curriculum—not just as a tool.

This requires a framework that measures learning opportunities, not content compliance.

Enter the 3D Compass: A framework for measuring the learning opportunities a curriculum creates across three axes and eight octants.

The 3D Compass Framework: Balanced Opportunities Across All Axes

The 3D compass has 3 axes and 8 octants. Both poles of every axis are equally important. Quality curriculum provides opportunities in all 8 octants—not just one narrow corner.

Independence ↔ Collaboration

Does it reward student agency or require teacher scripts? Opportunities for self-driven work vs. co-construction

Practical ↔ Theoretical

Does it provide both established foundations (what society agrees is "true") AND opportunities to explore alternatives? Balance between practical learning and theoretical exploration (fringe theories, lost voices, niche interests)

Structured ↔ Flexible

Does it provide both clear structure (rubrics, assigned goals) AND opportunities for self-direction (creating own goals, exploring without rubrics)? Balance between prescribed pathways and flexible exploration
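To make the geometry concrete, here is a minimal sketch in Python (the variable names are my own, not part of the framework) that enumerates the 8 octants as one pole chosen from each of the 3 axes:

```python
from itertools import product

# The three axes of the compass, each with two equally weighted poles.
AXES = [
    ("Independence", "Collaboration"),
    ("Practical", "Theoretical"),
    ("Structured", "Flexible"),
]

# An octant picks one pole from each axis: 2 x 2 x 2 = 8 octants.
octants = list(product(*AXES))

for octant in octants:
    print(" / ".join(octant))  # e.g. "Independence / Practical / Structured"
```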

Part 2 Methodology: Scoring EdReports with a 6D Opportunity Compass

I scored all 54 EdReports indicators across 6 dimensions to measure what opportunities EdReports creates or restricts

The 6 Dimensions (0-5 scale):

Independence: Student agency & self-generated questions
Collaboration: Co-construction of knowledge with peers
Practical: Established, evidence-based foundations
Theoretical: Exploring alternative perspectives
Structured: Systematic sequences & explicit pathways
Flexible: Multiple pathways & student direction

The Scoring Process:

  1. Read each indicator from Gateways 1, 2, and 3
  2. Score 0-5 on each dimension based on what opportunities it creates
  3. Calculate averages across all 54 indicators
  4. Map results on 6D radar chart

Example: "Materials provide systematic and explicit instruction..." scores high on Structured (prescribed pathway), zero on Independence (no student agency), and zero on Flexible (one-size-fits-all)

This reveals whether EdReports creates balanced opportunities—or enforces a single, narrow vision of learning
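As a sketch of how this scoring could run in practice (Python; the two indicator rows and their scores are illustrative stand-ins, not actual EdReports data):

```python
DIMENSIONS = ["Independence", "Collaboration", "Practical",
              "Theoretical", "Structured", "Flexible"]

# Hypothetical scores for two of the 54 indicators. Each indicator
# gets a 0-5 score on every dimension, judged by what opportunities
# its language creates for students.
indicator_scores = [
    # e.g. "Materials provide systematic and explicit instruction..."
    {"Independence": 0, "Collaboration": 0, "Practical": 3,
     "Theoretical": 0, "Structured": 5, "Flexible": 0},
    # e.g. "Materials provide clear protocols and teacher guidance..."
    {"Independence": 0, "Collaboration": 1, "Practical": 2,
     "Theoretical": 0, "Structured": 4, "Flexible": 1},
]

# Average each dimension across all scored indicators.
averages = {
    dim: sum(scores[dim] for scores in indicator_scores) / len(indicator_scores)
    for dim in DIMENSIONS
}
print(averages)
```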

Part 2 Results: EdReports Creates an Extremely Lopsided Opportunity Space

Mapping all 54 EdReports indicators across the 6 dimensions reveals extreme imbalance:

  • Independence: 0.05 ↔ Collaboration: 0.17
  • Practical: 2.44 ↔ Theoretical: 0.05
  • Structured: 4.31 ↔ Flexible: 0.25

What This Shows

  • Maxed out: Structured (4.31/5.0)
  • Moderate: Practical (2.44/5.0)
  • Near-zero: Flexible (0.25), Collaboration (0.17)
  • Essentially zero: Independence (0.05), Theoretical (0.05)
  • The shape is barely visible—collapsed to one corner

What Balanced Would Look Like

A quality curriculum would show moderate scores (2-3) across all six dimensions—creating a roughly hexagonal shape. Instead, EdReports is maxed out in structure, moderate in practical grounding, and near-zero in the remaining four dimensions.
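A minimal matplotlib sketch for reproducing the radar chart from the six averages above (the plotting choices are mine, not part of the original analysis):

```python
import math
import matplotlib.pyplot as plt

# The six dimension averages reported above.
scores = {
    "Independence": 0.05, "Collaboration": 0.17,
    "Practical": 2.44, "Theoretical": 0.05,
    "Structured": 4.31, "Flexible": 0.25,
}

labels = list(scores)
values = list(scores.values())

# One spoke per dimension; repeat the first point to close the polygon.
angles = [2 * math.pi * i / len(labels) for i in range(len(labels))]
angles.append(angles[0])
values.append(values[0])

ax = plt.subplot(polar=True)
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)
ax.set_ylim(0, 5)  # the 0-5 scoring scale
plt.show()
```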

Deep Dive: Independence ↔ Collaboration (Both Near Zero)

Independence: 0.05 ↔ Collaboration: 0.17

What EdReports Actually Says:

"Materials include systematic and explicit instruction... with repeated teacher modeling... Students practice phonics skills..."

— EdReports K–2 ELA Review Criteria v2.1, Gateway 1

"Materials provide clear protocols and teacher guidance that frequently allow students to engage in listening and speaking..."

— EdReports K–2 ELA Review Criteria v2.1, Gateway 2

Why Independence = 0.05

  • No criteria for student-generated questions
  • No rewards for self-driven inquiry
  • Teacher modeling dominates—student agency absent
  • Students "practice" and "respond," not explore

Why Collaboration = 0.17

  • No criteria for co-construction of knowledge
  • "Protocols" enforce teacher-led discussion
  • Peer dialogue not valued—individual correctness is
  • Minimal opportunities for collaborative meaning-making

Result: EdReports rewards curricula that minimize both student agency and collaborative learning—students follow scripts rather than create knowledge

Deep Dive: Practical ↔ Theoretical (Practical High, Theoretical Zero)

Practical: 2.44 ↔ Theoretical: 0.05

What EdReports Actually Says:

"Scope and sequence clearly delineate... with a clear evidence-based explanation for the expected hierarchy of phonemic awareness..."

— EdReports K–2 ELA Review Criteria v2.1, p. 7

"Materials include a clear, research-based core instructional pathway..."

— EdReports K–2 ELA Review Criteria v2.1, Gateway 2

Why Practical = 2.44

  • Heavy emphasis on "evidence-based" practices
  • Established sequences presented as universal
  • "Research-based" = what authorities have determined
  • Settled knowledge treated as truth

Why Theoretical = 0.05

  • Zero criteria for exploring alternative theories
  • No space to question foundational assumptions
  • Multiple perspectives on literacy: not recognized
  • One pathway presented as "science"

Result: EdReports treats one approach to literacy as settled truth—no exploration of alternatives, no questioning of assumptions

Deep Dive: Structured ↔ Flexible (Structured Maxed, Flexible Near Zero)

Structured: 4.31 ↔ Flexible: 0.25

What EdReports Actually Says:

"Materials provide reasonable pacing where phonics skills are taught one at a time and allot time where phonics skills are practiced to automaticity..."

— EdReports K–2 ELA Review Criteria v2.1, p. 7

"Materials include decodable texts with phonics aligned to the program's scope and sequence..."

— EdReports K–2 ELA Review Criteria v2.1, p. 7

Why Structured = 4.31

  • Explicit sequencing required for everything
  • Skills "taught one at a time"—lockstep pacing
  • "Systematic" = uniformity across all learners
  • One prescribed pathway to literacy

Why Flexible = 0.25

  • Deaf readers: Achieve literacy without phonics—EdReports excludes them
  • Multimodal pathways: Not acknowledged
  • Student-directed pacing: Forbidden
  • Alternative routes to literacy: Structurally impossible

Result: EdReports enforces a single, rigid pathway—erasing diverse learners and alternative routes to literacy

Part 2 Synthesis: What These Scores Reveal About EdReports

EdReports doesn't create a balanced opportunity space—it collapses curriculum into a single corner

What EdReports Rewards:

  • Maximum structure (4.31/5.0)—prescribed pathways, lockstep pacing
  • Established knowledge (2.44/5.0)—"evidence-based" practices as universal truth
  • Teacher-led scripts—explicit instruction with repeated modeling

What EdReports Excludes:

  • Student agency (0.05/5.0)—no self-generated questions or inquiry
  • Collaborative learning (0.17/5.0)—no co-construction of knowledge
  • Alternative perspectives (0.05/5.0)—no exploration of different theories
  • Flexible pathways (0.25/5.0)—Deaf readers and multimodal literacy excluded

The Real-World Consequence:

When districts use EdReports as a gatekeeper, they adopt curricula that maximize compliance and minimize opportunities for agency, collaboration, exploration, and diverse pathways. This isn't about quality—it's about control.

The Pattern Confirmed:

EdReports enforces the same content-first, compliance-driven logic that has persisted for 250 years.
It calls this "science"—but it's actually corporate-era gatekeeping dressed in modern language.

Now: What's the Alternative Framework?

We've shown EdReports fails as an evaluation tool.
But what should we measure instead of content compliance?

Measure Opportunities, Not Compliance

Instead of asking "Does this curriculum cover the right content?"
We should ask "What opportunities does this create for metacognitive development?"

Here's the framework that makes this possible...

The 6 Metacognitive Nodes: What Curriculum Should Develop

These are interconnected nodes. Any node can trigger any other—no hierarchy, no sequence. Like a 6D radar graph with butterfly effects.

1. Endospection

Looking inward to map your own cognitive architecture. Not "Who am I?" but "Who do I think I am, and why?" Unlearning imposed narratives. Building internal stability.

2. Diffusion

Pure exploration without agenda. The "most freeing area of metacognition." Following tangents, embracing the butterfly effect. A small curiosity can blossom into massive, unexpected journeys.

3. Vectoring

Tailoring curiosity with magnitude and direction. Turning wandering wonder into targeted inquiry. Asking "Where do I find what I need?" Making deliberate choices about scope and sourcing.

4. Refraction

The reality check. How does your identity bend the information you receive? How does new information force "truth maintenance" updates to your internal reality? Critical awareness of bias.

5. Exospection

Mapping external minds. Understanding stakeholder biases, contexts, realities. "What are they actually asking for?" Fitting others' realities into your own to ensure communication is received.

6. Synthesis

Creating entirely new ideas. Putting it all together across independence, collaboration, and application. Not summarizing—constructing something that didn't exist before.

Why Not Bloom's Taxonomy?

Bloom treats "remembering" as bottom, "creating" as top. But Diffusion can spark Refraction, which sends you back to Vectoring, which reshapes Endospection. No hierarchy. No sequence.

Curriculum as Opportunities

Teachers design assignments that create opportunities for students to engage these 6 nodes. Not "master content," but "experience these metacognitive processes."

The Orrery in Motion

Each node influences every other—no beginning, no end. This is learning as a living ecosystem.
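One way to model the orrery is as a complete directed graph, sketched below in Python (an illustration of the "no hierarchy, no sequence" claim, not an implementation from the source):

```python
NODES = ["Endospection", "Diffusion", "Vectoring",
         "Refraction", "Exospection", "Synthesis"]

# A complete directed graph: any node can trigger any other.
# No root, no leaves, no prescribed sequence.
triggers = {node: [other for other in NODES if other != node]
            for node in NODES}

# Every node can reach every other node in one step.
assert all(len(targets) == len(NODES) - 1 for targets in triggers.values())
print(triggers["Diffusion"])
# ['Endospection', 'Vectoring', 'Refraction', 'Exospection', 'Synthesis']
```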

The Alternative: Opportunity-Based Curriculum

What if curriculum design started with opportunities, not content?

Content-First (Current Model)

  • Define what must be covered
  • Sequence content linearly
  • Test for content mastery
  • Narrow to measurable objectives
  • Compliance = Quality

Opportunity-First (Alternative)

  • Define learning experiences available
  • Create interconnected nodes
  • Assess access to diverse pathways
  • Expand across all 8 octants
  • Opportunities = Quality

The shift: Instead of asking "What content must students master?", we ask "What opportunities will students have access to?" Curriculum becomes a map of possible experiences, not a checklist of required content.

Assessments Should Measure Opportunities, Not Compliance

How Do We Evaluate Curriculum Quality?

Not by measuring content coverage, but by analyzing the range of learning opportunities students can access.

Questions to Ask:

  • Does the curriculum offer opportunities for both independence and collaboration?
  • Can students engage in both practical application and theoretical exploration?
  • Are there pathways for both structured guidance and flexible discovery?
  • Do students move through interconnected nodes, not rigid sequences?

This is what the 3D Compass measures: Not "Does this curriculum teach phonics correctly?" but "What learning opportunities does this curriculum create or restrict?"

The Big Takeaway: Opportunities, Not Compliance

EdReports gives us content-focused evaluation.
We need metacognitive opportunity mapping.

What EdReports Does

  • Enforces 250 years of content-first assumptions
  • Measures compliance with one narrow literacy definition
  • Creates a curriculum monoculture
  • Acts as a thermometer controlling the weather

What We Need Instead

  • Evaluation that maps metacognitive opportunities
  • Balance across all three axes (independence ↔ collaboration, practical ↔ theoretical, structured ↔ flexible) and the 8 octants they define
  • Recognition of diverse literacy pathways
  • Curriculum as opportunities, not destinations

When we change how we evaluate curriculum,
we change what counts as learning—
and we change who gets to learn.

Recreating Evaluation

The purpose of evaluation is not to enforce one destination
or to measure compliance with content.

It is to expand the opportunities children have
to learn, think, question, and become themselves.

Metacognition is not the end of learning.
It is the only beginning we can trust.

Theoretical Justification: Methodological Anchors

What justifies treating EdReports itself as the curriculum unit under evaluation?

Melrose (1998)

Progressive Evaluation: Evaluation should adapt to context and theoretical commitments, not follow one rigid checklist.

This allows for designing criteria based on specific theoretical frameworks.

Norris (1998)

Ideological Text Analysis: Curriculum materials encode assumptions about whose knowledge counts.

EdReports can be treated as an ideological text that reveals assumptions about literacy and learning pathways.

Wiggins (1998)

Validity Tests: Can students succeed/fail for reasons unrelated to what's being assessed?

Wiggins' validity questions apply to EdReports: Does it measure curriculum quality, or something else entirely?

Together: These scholars justify Part 1 (testing EdReports using Wiggins' validity framework) and Part 2 (treating EdReports as curriculum and evaluating it using the 3D Compass to reveal its ideological assumptions).

Data Sources & Analysis Process

Documents Reviewed

  • EdReports K–2 ELA Core Content Review Criteria (v2.1)
  • EdReports K–2 ELA Evidence Guide (v2.1)

Theoretical Framework

  • Wiggins (1998) - Validity tests
  • Sullivan (2011) - Reliability & cognition
  • Melrose (1998) - Progressive evaluation
  • Norris (1998) - Ideological text analysis

Analysis Process

For each EdReports gateway and indicator, the analysis examined:

  • What assumptions about literacy, learners, "normal" development, gender, and disability are encoded?
  • Which side of each axis does this push curriculum toward?
  • What opportunities does this create or foreclose?

Examples:

  • Highlighting language that enforces strict decoding paths
  • Tagging indicators that narrowly define "family roles"
  • Noting where collaboration or interpretive freedom is explicitly encouraged vs. absent
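A minimal sketch of the coding schema this process implies (Python; the field names and the sample record are hypothetical illustrations, not the analysis's actual instrument):

```python
from dataclasses import dataclass, field

@dataclass
class CodedIndicator:
    """One EdReports indicator, annotated during the analysis."""
    gateway: int                  # 1, 2, or 3
    text: str                     # the indicator's language
    assumptions: list[str] = field(default_factory=list)
    axis_push: dict[str, str] = field(default_factory=dict)
    opportunities_foreclosed: list[str] = field(default_factory=list)

example = CodedIndicator(
    gateway=1,
    text="Materials provide reasonable pacing where phonics skills "
         "are taught one at a time...",
    assumptions=["one universal skill sequence", "static cognition"],
    axis_push={"Structured vs Flexible": "Structured"},
    opportunities_foreclosed=["student-directed pacing",
                              "non-phonics literacy pathways"],
)
print(example.axis_push)
```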

Actionable Feedback: How to Fix EdReports

1. Add an "Opportunity & Agency" Gateway

Criteria around student independence, collaboration, and interpretive agency—not just content coverage.

2. Revise Representation Indicators

Replace "women's roles" language with criteria about critical, non-stereotyped portrayals and identity complexity.

3. Recognize Multiple Literacy Pathways

Add criteria acknowledging Deaf readers, multilingual literacies, and multimodal meaning-making.