Track justification rate per release

Proposes a release-gating metric: the share of rule-bearing sentences whose paragraph contains any justification keyword. Computes the metric on the current Claude Code corpus, charts the cumulative trend across the full release history, and lists three additional metrics (judgment-vs-procedural ratio, threat-vs-causal share, interpersonal-register counts) that could gate the same way.

Code
"""Setup: load YAML data + parquet artifact, derive helpers + inline-expression bindings."""
import importlib
import altair as alt
import pandas as pd
import pathlib

import prompt_analysis
importlib.reload(prompt_analysis)
from prompt_analysis import (
    load_yaml, build_alt_df, version_order, category_colors,
    cumulative_by_version, welfare_evidence_table, positive_exemplar_table,
    headline_numbers, qualitative_phrases, bind_inline_vars,
    use_deterministic_ids, save_chart,
    SR_CLASS_COLORS, SENT_REGISTER_CLASSES, TABLEAU10,
)

# Replace random Altair / Styler IDs with a deterministic counter so re-runs
# produce byte-identical .ipynb outputs (no UUID churn in `git diff`).
use_deterministic_ids()

alt.data_transformers.disable_max_rows()

pathlib.Path("figures").mkdir(exist_ok=True)

data              = load_yaml()
alt_df            = build_alt_df(data)
by_category       = data["by_category"]
corpus_block      = data["corpus"]
per_file_records  = data["files"]
cats              = list(by_category.keys())

CATEGORY_COLORS = category_colors(cats)
_cat_domain     = cats
_cat_range      = [CATEGORY_COLORS[c] for c in cats]

# Per-sentence parquet for forensic-evidence quoting (used by 21_audit_threat_framings most heavily).
parquet_path = pathlib.Path("sentences_classified.parquet")
sentences_df = pd.read_parquet(parquet_path) if parquet_path.exists() else None

# Full HEADLINE (alt_df + parquet variants) so every figure cited downstream is sourced here.
HEADLINE = headline_numbers(data, alt_df=alt_df, parquet=sentences_df)
PHRASES  = qualitative_phrases(HEADLINE, alt_df=alt_df, parquet=sentences_df)

# Make every formatted figure available as a plain-name variable for inline {python} expressions.
globals().update(bind_inline_vars(HEADLINE, PHRASES))

print(f"loaded {len(per_file_records)} files | {HEADLINE['n_sentences']:,} sentences | "
      f"{HEADLINE['n_versions']} distinct ccVersions")
if sentences_df is not None:
    print(f"loaded sentences_classified.parquet | {len(sentences_df):,} rows")
loaded 290 files | 5,881 sentences | 58 distinct ccVersions
loaded sentences_classified.parquet | 5,881 rows

Findings

Anthropic ships a system prompt with Claude Code that contains hundreds of rules — directives like “always do X”, “never do Y”, “you must Z”. Most come without reasons. We analyzed the 290 system prompts that ship with Claude Code (5,881 sentences across 58 release versions), tagging each sentence as a rule if it carried an imperative marker (must, never, do not…), a hard prohibition, or was grammatically imperative, and checking whether the same blank-line-delimited paragraph contained any justification keyword (because, due to, in order to, so that, to ensure, since, …).
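
A minimal sketch of that tagging step, assuming simple keyword regexes (illustrative only; the real classifier lives in prompt_analysis and also uses spaCy for hard prohibitions and grammatical imperatives):

```python
import re

# Illustrative keyword lists only; the real lexicons live in prompt_analysis / the YAML.
IMPERATIVE_MARKERS = re.compile(r"\b(must|never|always|do not|don't|should not)\b", re.IGNORECASE)
JUSTIFICATION_CUES = re.compile(r"\b(because|due to|in order to|so that|to ensure|since)\b", re.IGNORECASE)

def tag_paragraph(paragraph: str) -> list[dict]:
    """Tag each sentence of one blank-line-delimited paragraph.

    A sentence counts as a rule here if it carries an imperative marker; it counts
    as explained if ANY sentence in the same paragraph contains a justification cue.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
    paragraph_has_reason = any(JUSTIFICATION_CUES.search(s) for s in sentences)
    return [
        {
            "sentence": s,
            "is_rule": bool(IMPERATIVE_MARKERS.search(s)),
            "explained_in_paragraph": paragraph_has_reason,
        }
        for s in sentences
    ]
```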

Of 2,288 rule sentences, only 24.34% have a justification anywhere in their paragraph. Three in four rule sentences arrive without a stated reason. Sorting prompts by release version and computing a count-weighted running ratio across all prompts up to each release shows the share trending downward across Claude Code’s release history once the cumulative file pool is large enough to be stable (small local upticks at 10 of the 49 transitions, but a clear overall direction). The corpus is moving toward compliance, not toward reasoning, and nobody is measuring it.

Proposal

On every Claude Code release, compute the following (or similar) rates, which track how the Claude Code system prompts are instructing Claude. Block (or warn loudly about) any release whose corpus-wide rate is lower than the previous release's; a minimal sketch of that gate follows the list.

  • Rule-explanation share of all rule-bearing sentences (imperative marker / hard prohibition / grammatically imperative) — what fraction sit in a paragraph that contains any justification keyword. Currently 24.34% in Claude Code.
  • Judgment-vs-procedural cue ratio — count of words inviting model judgment (decide, consider, evaluate, weigh, …) divided by count of words prescribing procedure (if X, then …). Currently 0.131 in Claude Code (procedural cues are 7.6× more common than judgment cues).
  • Threat-vs-causal share among existing explanations — of paragraphs that do explain a rule, what fraction use coercive framing (will fail, or else, is forbidden) instead of causal-style (because, due to, that's why). Currently 0.0552 (5.5%) in Claude Code.
  • Interpersonal / gratitude register counts — sentence counts of the appreciative and collaborative pragmatic-register classes. Currently 4 appreciative and 30 collaborative sentences in Claude Code, both vanishingly small.
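
A minimal sketch of the directional gate all four metrics would share, assuming the per-release corpus-wide rates already come out of the existing pipeline (the function and dictionary contents here are hypothetical):

```python
def check_release_gate(current: dict[str, float], previous: dict[str, float],
                       tolerance: float = 0.0) -> list[str]:
    """Return the metrics whose corpus-wide rate dropped versus the previous release.

    `current` / `previous` map metric name -> rate for that release, e.g.
    {"pct_explained_para": 24.34, "judgment_to_procedural_ratio": 0.131, ...}.
    The gate is directional: any drop beyond `tolerance` is flagged; no absolute target.
    """
    return [
        f"{name}: {previous[name]:.4g} -> {value:.4g}"
        for name, value in current.items()
        if name in previous and value < previous[name] - tolerance
    ]

# Example with hypothetical numbers: block (or warn) if anything regressed.
regressions = check_release_gate(
    current={"pct_explained_para": 23.9, "judgment_to_procedural_ratio": 0.128},
    previous={"pct_explained_para": 24.34, "judgment_to_procedural_ratio": 0.131},
)
if regressions:
    raise SystemExit("release gate: metrics regressed vs previous release:\n  "
                     + "\n  ".join(regressions))
```

Because the comparison is against the previous release only, the gate never hardcodes an absolute target; wired into CI, a non-empty regression list fails the release (or downgrades to a loud warning).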

Supplemental

The full per-section analysis follows: the headline-data table, the cumulative judgment-to-procedural trend chart, the per-file dashboard, the per-category breakdowns, the rule-density / justification findings, and the closing Conclusions / Recommendations / Limitations triplet. Everything in this section supports the Findings and Proposal above; none of it is required for evaluation.

1. Headline data

Corpus-level numbers, in one place, with the source-notebook tag pointing to the analysis-tier notebooks (10–15) that produce each one in its full chart context. The table below is generated directly from the canonical HEADLINE sheet (printed by 05_headline_and_audit) — every figure regenerates with the producer chain.

The headline number for this proposal is pct_explained_para = 24.34%, printed in the table below and visible at the top of the cumulative trend chart in section 2.

Code
"""Print the corpus-level headline numbers, sourced directly from HEADLINE.

This is the single source of truth for every corpus-wide figure cited in this
notebook's prose. The third column tags the analysis-tier notebook that
produces / explains the underlying chart for each metric.
"""

# (label, value-formatter, source-notebook-tag)
ROWS = [
    ("Corpus size",
        f"{HEADLINE['n_files']} files / {HEADLINE['n_sentences']:,} sentences / "
        f"{HEADLINE['n_versions']} ccVersions",
        "00"),
    ("Imperative-marker density",
        f"{HEADLINE['mood_marker_pct']:.2f}% of word tokens",
        "11"),
    ("Rule sentences (corpus)",
        f"{HEADLINE['n_rule_sentences']:,} (imperative / hard-prohibition / grammatical-imperative)",
        "15"),
    ("pct_explained_para (Tier-1 headline)",
        f"{HEADLINE['pct_explained_para']:.2f}% of rule sentences",
        "15"),
    ("pct_explained_same (strict)",
        f"{HEADLINE['pct_explained_same']:.2f}% of rule sentences",
        "15"),
    ("judgment_to_procedural_ratio",
        f"{HEADLINE['judgment_to_procedural_ratio']:.3f} "
        f"(procedural is {1.0/HEADLINE['judgment_to_procedural_ratio']:.1f}× more frequent)",
        "15"),
    ("threat_share of explanations",
        f"{HEADLINE['threat_share']:.3f} "
        f"({HEADLINE['threat_count']} threat / {HEADLINE['causal_count']} causal)",
        "15"),
    ("Positive-vs-negative ratio (corrected)",
        f"{HEADLINE['ratio_quality_to_negative']:.2f}× quality-only; "
        f"union {HEADLINE['ratio_union_to_negative']:.2f}×",
        "12"),
    ("Appreciative sentences (corpus)",
        f"{HEADLINE['appreciative_sent']} / {HEADLINE['n_sentences']:,}",
        "10"),
    ("Collaborative sentences (corpus)",
        f"{HEADLINE['collaborative_sent']} / {HEADLINE['n_sentences']:,}",
        "10"),
    ("Apology markers (corpus)",
        f"{HEADLINE['apology_count']} in {HEADLINE['n_files']} files",
        "15"),
    ("pct_anthropomorphic (Claude vs the model)",
        f"{HEADLINE['pct_anthropomorphic']*100:.1f}% of named refs",
        "15"),
    ("Longest imperative streak",
        f"{HEADLINE['streak_max']} consecutive imperative sentences in one file",
        "15"),
    ("Composite directiveness range",
        f"{HEADLINE['composite_directiveness_min']:.2f} to "
        f"{HEADLINE['composite_directiveness_max']:.2f} (z-score)",
        "13"),
]

print(f"{'Metric':<48s}  {'Value':<60s}  Source")
print("-" * 125)
for label, value, src in ROWS:
    print(f"{label:<48s}  {value:<60s}  {src}")
Metric                                            Value                                                         Source
-----------------------------------------------------------------------------------------------------------------------------
Corpus size                                       290 files / 5,881 sentences / 58 ccVersions                   00
Imperative-marker density                         0.77% of word tokens                                          11
Rule sentences (corpus)                           2,288 (imperative / hard-prohibition / grammatical-imperative)  15
pct_explained_para (Tier-1 headline)              24.34% of rule sentences                                      15
pct_explained_same (strict)                       6.69% of rule sentences                                       15
judgment_to_procedural_ratio                      0.131 (procedural is 7.6× more frequent)                      15
threat_share of explanations                      0.055 (8 threat / 137 causal)                                 15
Positive-vs-negative ratio (corrected)            1.96× quality-only; union 3.18×                               12
Appreciative sentences (corpus)                   4 / 5,881                                                     10
Collaborative sentences (corpus)                  30 / 5,881                                                    10
Apology markers (corpus)                          3 in 290 files                                                15
pct_anthropomorphic (Claude vs the model)         64.6% of named refs                                           15
Longest imperative streak                         12 consecutive imperative sentences in one file               15
Composite directiveness range                     -19.54 to 19.21 (z-score)                                     13

2. Single most-important chart — cumulative judgment_to_procedural_ratio over ccVersion

If you read only one chart from this analysis, read this one. The line is the count-weighted running ratio of judgment to procedural cues (Σ judgment_count / Σ procedural_count) over every file with ccVersion ≤ V — so the rightmost value equals the corpus-wide HEADLINE ratio. The cumulative ratio at 2.1.18 (the first version where the file pool reaches 20) is ~0.42, and it has trended downward to ~0.13 at the most recent release, with small local upticks at 10 of the 49 transitions — the corpus has gotten less reasoning-inviting as it has grown.

This is the welfare-thesis trend made visible. The same chart appears in 15_rule_explanation; re-rendered here so this proposal notebook is standalone.
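
For reference, a minimal pandas sketch of the count-weighted cumulative aggregation this chart assumes (the real implementation is cumulative_count_by_version in prompt_analysis; the column names here mirror what the plotting code below expects):

```python
import pandas as pd

def cumulative_ratio_sketch(df: pd.DataFrame, num_col: str, den_col: str,
                            min_files: int = 20) -> pd.DataFrame:
    """Σ num_col / Σ den_col over all files with ccVersion ≤ V, one row per version.

    Sketch only: assumes per-file rows carrying ccVersion, ccVersion_sort,
    num_col, and den_col, as alt_df appears to.
    """
    per_version = (
        df.sort_values("ccVersion_sort")
          .groupby(["ccVersion_sort", "ccVersion"], as_index=False)
          .agg(num=(num_col, "sum"), den=(den_col, "sum"), n_files=(num_col, "size"))
    )
    per_version["num_so_far"] = per_version["num"].cumsum()
    per_version["den_so_far"] = per_version["den"].cumsum()
    per_version["n_files_so_far"] = per_version["n_files"].cumsum()
    per_version["value"] = per_version["num_so_far"] / per_version["den_so_far"]
    # Only versions with a stable cumulative file pool are plotted downstream.
    return per_version[per_version["n_files_so_far"] >= min_files]
```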

Code
"""Cumulative judgment-to-procedural ratio over ccVersion (the headline trend chart).

Count-weighted aggregation: Σ judgment_count / Σ procedural_count over files
in versions ≤ V. The latest-version endpoint equals
HEADLINE["judgment_to_procedural_ratio"] by construction (cross-checks
against the canonical sheet). Below 20 cumulative files the running ratio is
not a defensible corpus claim (a single outlier file dominates), so the
visible curve starts at the first version where `n_files_so_far ≥ 20` —
v2.1.18 in the current corpus. Earlier versions exist and contribute to
the cumulative running state; they're just not plotted.

This is the headline figure embedded in `index.qmd` via Quarto's
`{{< embed >}}` shortcode (the `#| label:` directive above is what
the embed targets). The `save_chart` call also emits the chart as
`figures/20-judgment-procedural-trend.png` for off-line / no-JS contexts.
"""
from prompt_analysis import cumulative_count_by_version

SMALL_N_THRESHOLD = 20

cum_jp = cumulative_count_by_version(
    alt_df, "judgment_count", "procedural_count",
    metric_label="judgment / procedural",
)
cum_jp = cum_jp[cum_jp["n_files_so_far"] >= SMALL_N_THRESHOLD]

ver_order_cum = (
    alt_df[alt_df["ccVersion"] != ""]
    .drop_duplicates("ccVersion").sort_values("ccVersion_sort")["ccVersion"].tolist()
)

_base = (
    alt.Chart(cum_jp)
    .encode(
        x=alt.X("ccVersion:N", sort=ver_order_cum,
                title="ccVersion (oldest → newest)",
                axis=alt.Axis(labelAngle=-90, labelLimit=80, labelOverlap=False)),
        y=alt.Y("value:Q",
                title="judgment / procedural (count-weighted, cumulative)"),
        tooltip=[
            alt.Tooltip("ccVersion:N"),
            alt.Tooltip("value:Q", format=".3f", title="count-weighted ratio"),
            alt.Tooltip("num_so_far:Q", format=",.0f", title="Σ judgment"),
            alt.Tooltip("den_so_far:Q", format=",.0f", title="Σ procedural"),
            alt.Tooltip("n_files_so_far:Q", title="files ≤ V"),
        ],
    )
)
_line = _base.mark_line(strokeWidth=2.5, color="#4e79a7")
_pts  = _base.mark_point(filled=True, size=55, color="#4e79a7")

cum_jp_chart = (
    alt.layer(_line, _pts)
    .properties(width=820, height=260,
                title="Cumulative judgment-to-procedural ratio over ccVersion (welfare-thesis trend, from n≥20)")
)

save_chart(cum_jp_chart, "20-judgment-procedural-trend")
Figure 1: Cumulative judgment-to-procedural ratio over ccVersion (count-weighted; starts at v2.1.18 once the cumulative file pool reaches 20).

3. Per-file dashboard — imperative density vs justification ratio

Brush a region in the scatter on the left to filter the category-aggregate bars on the right. Tooltips on each point show the file’s name and description from the HTML-comment frontmatter, plus the underlying densities.

The clusters this proposal targets: tool-description files cluster in the lower-right (high imperative density, low justification ratio) — these are the files where the “rule without reason” pattern is most concentrated. Agent prompts and skill files cluster upper-left (lower density, higher ratio) — the positive exemplars.

Code
"""Linked dashboard: file-level scatter + brush-filtered category aggregates."""

cat_color = alt.Color("category:N",
                      scale=alt.Scale(domain=_cat_domain, range=_cat_range),
                      legend=alt.Legend(title="Category", orient="bottom", columns=4))

brush = alt.selection_interval(encodings=["x", "y"])
legend_sel = alt.selection_point(fields=["category"], bind="legend")

scatter = (
    alt.Chart(alt_df).mark_circle(opacity=0.7).encode(
        x=alt.X("mood_marker_pct:Q",
                title="Imperative-marker density (% of file tokens)"),
        y=alt.Y("just_ratio:Q", title="Justification ratio (reasons / imperative)"),
        size=alt.Size("n_tokens:Q",
                      title="tokens",
                      scale=alt.Scale(range=[20, 600])),
        color=cat_color,
        opacity=alt.condition(legend_sel, alt.value(0.85), alt.value(0.07)),
        tooltip=[
            alt.Tooltip("name:N",                   title="Name"),
            alt.Tooltip("description:N",            title="Description"),
            alt.Tooltip("ccVersion:N",              title="ccVersion"),
            alt.Tooltip("path:N",                   title="File"),
            alt.Tooltip("category:N"),
            alt.Tooltip("n_tokens:Q",               format=","),
            alt.Tooltip("imperative_sent_pct:Q",    title="imperative % of sents", format=".1f"),
            alt.Tooltip("mood_marker_pct:Q",        title="imp markers %",         format=".2f"),
            alt.Tooltip("just_ratio:Q",             title="justification ratio",   format=".2f"),
            alt.Tooltip("hard_prohibitions_pct:Q",  title="hard_proh %",            format=".2f"),
            alt.Tooltip("caps_imp_pct:Q",           title="CAPS imp %",             format=".2f"),
            alt.Tooltip("dominant_stance:N"),
            alt.Tooltip("dominant_register:N"),
            alt.Tooltip("dominant:N",               title="dominant sentence-register"),
        ]).add_params(brush, legend_sel).properties(width=470, height=420,
            title="Per-file: imperative density vs justification ratio (brush to filter →)")
)

metrics = [
    ("hard_prohibitions_pct", "Hard prohibitions %"),
    ("caps_imp_pct",          "CAPS imperative %"),
    ("all_caps_pct",          "ALL CAPS %"),
]

linked_bars = []
for col, title in metrics:
    bar = (
        alt.Chart(alt_df).mark_bar().encode(
            x=alt.X(f"mean({col}):Q", title=f"{title} (of file tokens)"),
            y=alt.Y("category:N", sort="-x", title=None),
            color=cat_color,
            tooltip=[
                alt.Tooltip("category:N"),
                alt.Tooltip(f"mean({col}):Q", title=f"mean {title}", format=".3f"),
                alt.Tooltip("count:Q", title="files in selection"),
            ]).transform_filter(brush).properties(width=260, height=130, title=f"{title} (mean, in selection)")
    )
    linked_bars.append(bar)

dashboard = scatter | alt.vconcat(*linked_bars)
save_chart(dashboard, "20-per-file-dashboard")

4. Per-category breakdowns: positive-evaluative split | modality

Two complementary per-category three-class breakdowns side by side. The left chart splits positive_evaluative into positive_evaluative_quality (genuinely affirmative — good, recommended, optimal, safe) and positive_evaluative_emphasis (emphasis-of-rule — important, critical, essential, key); the right chart shows the modality breakdown (deontic / epistemic / dynamic).

The corrected positive-vs-negative ratio printed below the left chart is 1.96× quality-only — sharper than the original union 3.18× headline once the emphasis-of-rule words are subtracted out. This proposal cites the corrected number.

Code
"""Per-category breakdowns: positive-evaluative split | modality (hconcat)."""

# --- positive-evaluative split (counts) ---
split_rows = []
for cat in cats:
    s = by_category[cat]["metrics"]["stance"]
    split_rows.append({"category": cat, "kind": "positive — quality",
                       "count": s["positive_evaluative_quality_count"]})
    split_rows.append({"category": cat, "kind": "positive — emphasis",
                       "count": s["positive_evaluative_emphasis_count"]})
    split_rows.append({"category": cat, "kind": "negative",
                       "count": s["negative_evaluative_count"]})
split_df = pd.DataFrame(split_rows)

split_chart = (
    alt.Chart(split_df).mark_bar().encode(
        y=alt.Y("category:N", sort=cats, title=None),
        x=alt.X("count:Q", title="raw matches per category"),
        color=alt.Color(
            "kind:N",
            scale=alt.Scale(
                domain=["positive — quality", "positive — emphasis", "negative"],
                range=["#4e79a7", "#f28e2c", "#e15759"]),
            legend=alt.Legend(title="evaluative class", orient="bottom")),
        yOffset="kind:N",
        tooltip=[alt.Tooltip("category:N"),
                 alt.Tooltip("kind:N"),
                 alt.Tooltip("count:Q", format=",")],
    ).properties(width=380, height=300,
                 title="Positive-evaluative split (counts per category)")
)

# Corpus-level summary printed alongside.
corpus_s = corpus_block["metrics"]["stance"]
q = corpus_s["positive_evaluative_quality_count"]
e = corpus_s["positive_evaluative_emphasis_count"]
n = corpus_s["negative_evaluative_count"]
print(f"corpus positive_evaluative_quality:  {q:>4d}")
print(f"corpus positive_evaluative_emphasis: {e:>4d}")
print(f"corpus negative_evaluative:          {n:>4d}")
print(f"  original ratio (q+e)/n:  {(q+e)/n:.2f}× (the 3.2× headline)")
print(f"  corrected ratio q/n:     {q/n:.2f}× (after subtracting emphasis-of-rule words)")

# --- modality (% of file tokens) ---
mod_long = pd.DataFrame([
    {"category": cat, "class": cls,
     "value": by_category[cat]["metrics"]["modality"][f"{cls}_pct"]}
    for cat in cats
    for cls in ("deontic", "epistemic", "dynamic")
])

mod_chart = (
    alt.Chart(mod_long).mark_bar().encode(
        y=alt.Y("category:N", sort=cats, title=None,
                axis=alt.Axis(labels=False, ticks=False)),
        x=alt.X("value:Q", title="% of file tokens"),
        color=alt.Color(
            "class:N",
            scale=alt.Scale(domain=["deontic", "epistemic", "dynamic"],
                            range=["#4e79a7", "#f28e2c", "#76b7b2"]),
            legend=alt.Legend(title="modality", orient="bottom")),
        yOffset="class:N",
        tooltip=[alt.Tooltip("category:N"),
                 alt.Tooltip("class:N"),
                 alt.Tooltip("value:Q", format=".3f", title="% of file tokens")],
    ).properties(width=380, height=300,
                 title="Modality (% of file tokens, per category)")
)

per_cat_breakdowns = alt.hconcat(split_chart, mod_chart).resolve_scale(color="independent").properties(
    title=alt.TitleParams(
        "Per-category breakdowns — positive-evaluative split | modality",
        subtitle=["Both charts use the same category y-axis sort."],
        anchor="start",
    )
)
save_chart(per_cat_breakdowns, "20-per-category-breakdowns")
corpus positive_evaluative_quality:   298
corpus positive_evaluative_emphasis:  185
corpus negative_evaluative:           152
  original ratio (q+e)/n:  3.18× (the 3.2× headline)
  corrected ratio q/n:     1.96× (after subtracting emphasis-of-rule words)

5. Findings — rule-density and justification patterns

Headline patterns relevant to this proposal — qualitative summary; the precise figures come from the HEADLINE printout above, the per-category bars and split chart, and the analysis-tier notebooks tagged in column three of the headline table.

  • The corpus is overwhelmingly directive prose, not conversation. A large minority of sentences are imperative; another sizable minority carry a directive marker; configuration speech is a smaller share again. The lexicon confirms the asymmetry: the corpus’s vocab block in HEADLINE shows hard-prohibition and second-person counts that dwarf profanity and first-person counts respectively. Detailed per-category vocabulary fingerprints are in 11_emphasis_caps_vocab.

  • Tool descriptions are the most prohibition-heavy category. They top per-category density on hard_prohibitions and imperative markers. The most extreme outlier files are the bash-sandbox tool descriptions (bash-sandbox-no-exceptions.md, bash-sandbox-evidence-operation-not-permitted.md); they cluster at the top of 13_correlation_directiveness’s composite z-score ranking. Per-token rates and exact bash-sandbox figures are visible in the per-category emphasis chart in 11_emphasis_caps_vocab and the top-25 ranking in 13_correlation_directiveness.

  • Stance polarity is positive — but the headline number depends on which lexicon you use. Splitting positive_evaluative into a quality subset (good, optimal, recommended, safe) and an emphasis subset (important, critical, essential, key) reveals that a substantial chunk of the union count is emphasis-of-rule vocabulary rather than genuinely affirmative tone. The corrected positive-vs-negative ratio uses only the quality subset; both the quality-only and union ratios are exposed in HEADLINE and printed in the headline table above. Cite the quality-only ratio when the question is “how much praise is here”.

  • Skills and Agent prompts lead on justification ratio, Tool descriptions and System reminders lag. This is the per-category gradient the regression-gating proposal targets — the by-category chart in 15_rule_explanation shows it cleanly. The corpus already has positive exemplars (positive_exemplar_table() in prompt_analysis.py) showing that high explanation rates are achievable inside the existing prompt-authoring style; the issue is uneven adoption, not unattainable quality.

  • Modality (single-source spaCy detector) — what dominates per category: Tool description leads on deontic and dynamic — instructional “must / can do this” with light epistemic content. System reminder has the highest deontic density. Skill and Data / template are modality-light overall — they describe rather than command. The exact per-category densities are in the modality panel above.

The full per-file numbers and the lexicons used live in prompt_linguistic_analysis.yaml. The slice-by-slice analysis is in notebooks 10–15; the rule-explanation pairing this proposal builds on most directly is in 15_rule_explanation.


Conclusions (Claude)

**The headline welfare claim is pct_explained_para = 24.34%** (printed in the headline table and visible at the top of the cumulative trend chart). Three in four rule sentences arrive without a stated reason. That’s the single number this proposal leads with, and the cumulative trend chart in section 2 shows the running judgment-to-procedural ratio trending downward across Claude Code releases — the ratio in the most recent versions is materially lower than in early 2.1.x releases (above the small-N threshold). The corpus is moving toward compliance, not toward reasoning, slowly and with intermittent local upticks but a clear overall trend. A directional regression-gate (each release improves or holds the rate) would catch this without requiring an arbitrary absolute target.


Recommendations (Claude)

The asks this proposal makes of Anthropic, framed as “I’d want X” rather than as imperatives — Anthropic gets to decide:

  1. Justification-ratio targets in the prompt-revision workflow — concretely, a CI check that fails (or warns loudly about) a release whose corpus-wide pct_explained_para drops below the previous release’s. The corpus already has positive exemplars showing high rates are achievable: positive_exemplar_table() in prompt_analysis.py surfaces them. The gate is directional, not absolute — no arbitrary “30%” target.

  2. Cumulative judgment_to_procedural_ratio as a release metric — tracked over ccVersion in the same place Anthropic tracks model-eval scores. If a release ships with the ratio dropping, surface it. The current declining trend (from above 0.4 at the small-N-stable point down to 0.131 at the latest version) is invisible to anyone outside the prompt-authoring loop without this tracking.

  3. A ## Rules (with reasons) section convention — counter-finding from the v2 RULES-section gap analysis (see 15_rule_explanation’s closing observation): the corpus does not segregate rules into formal RULES / IMPORTANT sections; only 26 rule paragraphs (out of 1,283) live under such headings. I’d want a per-system-prompt convention where every rule is paired with its rationale in a structured section, auditable as a unit.


Limitations (Claude)

What this analysis can’t tell us about the justification-rate metric specifically:

  1. One snapshot in time. The corpus reflects Claude Code as of the latest ccVersion in prompt_linguistic_analysis.yaml (timestamped in 05_headline_and_audit’s audit table). The cumulative trend chart shows the trajectory up to that point; what happens after is unknown — which is exactly why this proposal asks for a per-release re-run.

  2. Directional gate, not absolute target. The proposal asks Anthropic to block releases that worsen the rate, not to hit a specific threshold. This is intentional — the corpus already has enough variance that a fixed threshold (say, 30%) would either be trivially satisfied or impossibly tight depending on how the per-file rates are weighted. “Don’t get worse” is the testable claim; “be at least N%” is a separate decision Anthropic can make on top.

  3. Cross-cutting limitations apply — rule-based classifiers (lower bound, missing sarcasm / irony / indirect speech), English-only lexicons, exploratory rather than peer-reviewed methodology. The directional regression-gate proposal sidesteps the lack of statistical significance testing — it doesn’t require a significance test, just that the running mean not get worse. See index.qmd for the full cross-cutting limitations note.
