Register and stance — formality plus 5-class polarity-split stance

This notebook renders the formality dimension (TTR / F-score / mean sentence length / mean dependency depth) and the 5-class polarity-split stance heatmap (directive / expository / positive_evaluative / negative_evaluative / dialogic), plus a per-file justification box-and-strip view. Source: the producer’s register.* and stance.* blocks. Formality is uniform across the corpus; what varies is the directive-vs-dialogic balance in stance.

Two views: (1) stance & register heatmaps + per-file justification box-strip plot; (2) register quantitative scatter (TTR × F-score) plus sentence length and dependency depth distributions.

Terms used

Stance (5-class polarity-split + 1p/2p engagement), Register (4-class formality scale), TTR, Heylighen F-score, mean dependency depth, and the positive_evaluative quality / emphasis split — all defined in 01_analyzers_register. Justification ratio is defined in 02_analyzers_vocab_emphasis. All densities below report pct (% of word tokens); heatmap color intensity encodes density (darker = denser).
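The two headline formality numbers are simple enough to sketch inline; a minimal illustration (the POS percentages below are invented, and the analyzer's exact tag mapping lives in 01_analyzers_register):

```python
def ttr(tokens):
    """Type-token ratio: distinct word types over total word tokens."""
    return len(set(tokens)) / len(tokens)

def heylighen_f(pos_pct):
    """Heylighen & Dewaele F-score from part-of-speech percentages.

    F = (noun% + adjective% + preposition% + article%
         - pronoun% - verb% - adverb% - interjection% + 100) / 2
    Higher = more formal (context-independent) prose.
    """
    formal = (pos_pct["noun"] + pos_pct["adjective"]
              + pos_pct["preposition"] + pos_pct["article"])
    deictic = (pos_pct["pronoun"] + pos_pct["verb"]
               + pos_pct["adverb"] + pos_pct["interjection"])
    return (formal - deictic + 100) / 2

tokens = "always use the absolute path when you edit the file".split()
pos = {"noun": 30.0, "adjective": 10.0, "preposition": 12.0, "article": 10.0,
       "pronoun": 5.0, "verb": 15.0, "adverb": 4.0, "interjection": 0.0}
print(round(ttr(tokens), 3), heylighen_f(pos))  # → 0.9 69.0
```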


Observation (Claude)

The uniform formality result (visible in the TTR × F-score scatter clustering in the high band) is consistent with instruction prose being formal — that part isn’t surprising. The more interesting finding is the directive cluster collapse: directive_pct, hard_prohibitions_pct, and deontic_pct are all measuring the same underlying thing (rule-bearing language) from different angles. The correlation matrix in 13_correlation_directiveness.ipynb confirms they cluster tightly. That’s not a flaw — it’s a triangulation — but it does mean these three numbers should be reported as overlapping facets of one finding (“the corpus is directive”), not three independent findings.
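The "overlapping facets" point is exactly what a correlation matrix makes visible; a toy sketch (the column names mirror the corpus schema, but the numbers here are synthetic; the real matrix is in 13_correlation_directiveness):

```python
import pandas as pd

# Synthetic per-file densities: all three metrics track one latent
# "rule-bearing language" signal plus small independent noise, which is
# the situation the directive-cluster finding describes.
signal  = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0, 1.5, 3.5]
noise_a = [0.1, -0.1, 0.2, 0.0, -0.2, 0.1, 0.0, -0.1]
noise_b = [0.0, 0.1, -0.1, 0.1, 0.0, -0.1, 0.1, 0.0]
noise_c = [-0.1, 0.0, 0.1, -0.1, 0.1, 0.0, 0.0, 0.1]
toy = pd.DataFrame({
    "directive_pct":         [s * 1.0 + n for s, n in zip(signal, noise_a)],
    "hard_prohibitions_pct": [s * 0.4 + n for s, n in zip(signal, noise_b)],
    "deontic_pct":           [s * 0.7 + n for s, n in zip(signal, noise_c)],
})
# All pairwise correlations sit near 1: three facets of one finding.
print(toy.corr().round(2))
```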

The positive_evaluative heatmap column reads as “encouraging tone density” but a substantial portion is emphasis-of-rule vocabulary (important, critical, essential, key) rather than praise. The two ratios in HEADLINE (ratio_quality_to_negative vs ratio_union_to_negative) are how to keep that honest — the quality-only ratio is the one to cite when the question is “how much praise is here”. Anyone quoting the union ratio as a positivity finding is overstating it.
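The honesty rule above comes down to a numerator choice; a toy illustration of the two ratios (the variable names echo the HEADLINE keys, the densities are invented):

```python
# Invented corpus-level densities (% of word tokens), for illustration only.
quality_pct  = 0.30   # genuine praise vocabulary
emphasis_pct = 0.90   # emphasis-of-rule vocabulary ("important", "critical", ...)
negative_pct = 0.20   # negative evaluative vocabulary

ratio_quality_to_negative = quality_pct / negative_pct                    # ≈ 1.5
ratio_union_to_negative   = (quality_pct + emphasis_pct) / negative_pct   # ≈ 6.0

# Citing the union ratio as "6x more positive than negative" would mostly be
# counting rule emphasis, not praise — quote the quality-only ratio instead.
print(ratio_quality_to_negative, ratio_union_to_negative)
```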


Code
"""Setup: load YAML data + flat alt_df, derive helper bindings used by every chart cell.

The shared module `prompt_analysis.py` lives next to this notebook in the project root.
"""
import importlib
import altair as alt
import pandas as pd

import prompt_analysis
importlib.reload(prompt_analysis)   # pick up edits without restarting the kernel
from prompt_analysis import (
    load_yaml, build_alt_df, version_order, category_colors,
    directiveness, headline_numbers, use_deterministic_ids, save_chart,
    SR_CLASS_COLORS, SENT_REGISTER_CLASSES, TABLEAU10,
)

# Replace random Altair / Styler IDs with a deterministic counter so re-runs
# produce byte-identical .ipynb outputs (no UUID churn in `git diff`).
use_deterministic_ids()

alt.data_transformers.disable_max_rows()

data              = load_yaml()                  # default: prompt_linguistic_analysis.yaml
alt_df            = build_alt_df(data)
HEADLINE          = headline_numbers(data)       # canonical corpus-wide numbers (see 05_headline_and_audit)
by_category       = data["by_category"]
corpus_block      = data["corpus"]
per_file_records  = data["files"]
cats              = list(by_category.keys())
VOCAB_KEYS        = list(data["lexicons"]["VOCAB"].keys())

# Composite directiveness column — formula in 13_correlation_directiveness;
# rendered there and on the timeline in 14_ccversion_trends.
alt_df["directiveness"] = directiveness(alt_df)

# Per-category palette + Altair encodings used across charts.
CATEGORY_COLORS = category_colors(cats)
_cat_domain     = cats
_cat_range      = [CATEGORY_COLORS[c] for c in cats]

print(f"loaded {len(per_file_records)} files | {alt_df.shape[1]} columns | {len(cats)} categories | {len(VOCAB_KEYS)} VOCAB keys")
loaded 290 files | 181 columns | 7 categories | 11 VOCAB keys

Stance & register heatmaps + per-file justification distribution

Two heatmaps (stance × category and register × category) plus a per-file box-strip plot of justification ratios per category. Hover any cell or point for exact values.

Code
"""Stance heatmap | register heatmap | per-file justification distribution
— single composite, all three sharing the category y-axis."""

stance_long = []
stance_keys = ["directive_pct", "expository_pct",
               "positive_evaluative_pct", "negative_evaluative_pct",
               "dialogic_pct", "pronouns_2p_pct", "pronouns_1p_pct"]
register_keys = ["frozen_pct", "formal_pct", "consultative_pct", "casual_pct"]

for cat, b in by_category.items():
    s = b["metrics"]["stance"]
    for k in stance_keys:
        stance_long.append({"category": cat, "metric": k.replace("_pct", ""),
                            "value": s[k], "kind": "stance"})
    r = b["metrics"]["register"]
    for k in register_keys:
        stance_long.append({"category": cat, "metric": k.replace("_pct", ""),
                            "value": r[k], "kind": "register"})
sr_long_df = pd.DataFrame(stance_long)

CAT_HEIGHT = 280
y_cat = alt.Y("category:N", title=None, sort=cats)

heat_stance = (
    alt.Chart(sr_long_df[sr_long_df["kind"] == "stance"])
    .mark_rect()
    .encode(
        x=alt.X("metric:N",
                sort=[k.replace("_pct", "") for k in stance_keys],
                title=None,
                axis=alt.Axis(labelAngle=-30)),
        y=y_cat,
        color=alt.Color("value:Q", scale=alt.Scale(scheme="magma", reverse=True),
                         title="stance %"),
        tooltip=[alt.Tooltip("category:N"),
                 alt.Tooltip("metric:N"),
                 alt.Tooltip("value:Q", format=".3f")],
    )
    .properties(width=400, height=CAT_HEIGHT,
                title="Stance × category (% of word tokens)")
)

heat_register = (
    alt.Chart(sr_long_df[sr_long_df["kind"] == "register"])
    .mark_rect()
    .encode(
        x=alt.X("metric:N",
                sort=[k.replace("_pct", "") for k in register_keys],
                title=None,
                axis=alt.Axis(labelAngle=-30)),
        y=alt.Y("category:N", title=None, sort=cats,
                axis=alt.Axis(labels=False, ticks=False)),
        color=alt.Color("value:Q", scale=alt.Scale(scheme="viridis"),
                         title="register %"),
        tooltip=[alt.Tooltip("category:N"),
                 alt.Tooltip("metric:N"),
                 alt.Tooltip("value:Q", format=".3f")],
    )
    .properties(width=240, height=CAT_HEIGHT,
                title="Register × category (% of word tokens)")
)

# Justification distribution rotated so category sits on Y too — same axis
# as the two heatmaps so the row reads horizontally per category.
just_box = (
    alt.Chart(alt_df)
    .mark_boxplot(extent="min-max", opacity=0.55, color="#4e79a7")
    .encode(
        y=alt.Y("category:N", title=None, sort=cats,
                axis=alt.Axis(labels=False, ticks=False)),
        x=alt.X("just_ratio:Q", title="justification ratio per file"),
    )
    .properties(width=320, height=CAT_HEIGHT,
                title="Per-file justification ratio")
)
just_strip = (
    alt.Chart(alt_df)
    .mark_circle(size=35, opacity=0.45, color="#e15759")
    .encode(
        y=alt.Y("category:N", title=None, sort=cats,
                axis=alt.Axis(labels=False, ticks=False)),
        x=alt.X("just_ratio:Q"),
        yOffset="jitter:Q",
        tooltip=[alt.Tooltip("path:N"),
                 alt.Tooltip("category:N"),
                 alt.Tooltip("n_tokens:Q"),
                 alt.Tooltip("just_ratio:Q", format=".2f")],
    )
    .transform_calculate(jitter="random()-0.5")
)
just_layer = just_box + just_strip

stance_register_composite = alt.hconcat(heat_stance, heat_register, just_layer).resolve_scale(
    color="independent"
).properties(
    title=alt.TitleParams(
        "Register profile per category — stance | register | justification",
        subtitle=["All three share the category y-axis sort. "
                  "Read a row horizontally for one category's profile."],
        anchor="start",
    )
)
save_chart(stance_register_composite, "12-stance-register-heatmaps")

The positive_evaluative column plots the union density (quality + emphasis together); the quality / emphasis split is defined in 01_analyzers_register.

Register quantitative metrics

Two views of the four register numbers (ttr, mean_sent_len, dep_depth, f_score, all defined in 01_analyzers_register):

  • TTR × Heylighen F-score scatter — every file as one point, coloured by category, sized by file length.
  • Sentence length & dependency depth distributions per category (box-strip).
Code
"""Register quantitative metrics: TTR×F-score + sentence length & dep-depth distributions."""

ttr_f_chart = (
    alt.Chart(alt_df)
    .mark_circle(opacity=0.65)
    .encode(
        x=alt.X("f_score:Q", title="F-score (Heylighen & Dewaele) — higher = more formal",
                scale=alt.Scale(domain=[40, 100])),
        y=alt.Y("ttr:Q", title="Type-token ratio (lex. diversity)",
                scale=alt.Scale(domain=[0, 1])),
        size=alt.Size("n_tokens:Q", title="tokens",
                       scale=alt.Scale(range=[20, 400])),
        color=alt.Color("category:N",
                         scale=alt.Scale(domain=_cat_domain, range=_cat_range)),
        tooltip=[
            alt.Tooltip("path:N"),
            alt.Tooltip("category:N"),
            alt.Tooltip("n_tokens:Q", format=","),
            alt.Tooltip("ttr:Q", format=".3f"),
            alt.Tooltip("f_score:Q", format=".1f"),
            alt.Tooltip("mean_sent_len:Q", title="mean sent len", format=".1f"),
            alt.Tooltip("dep_depth:Q", title="dep depth", format=".2f"),
        ],
    )
    .properties(width=520, height=380, title="Per-file register: TTR × F-score")
    .interactive()
)

sent_len_box = (
    alt.Chart(alt_df)
    .mark_boxplot(extent="min-max", opacity=0.55, color="#4e79a7")
    .encode(
        x=alt.X("category:N", title=None, sort=cats),
        y=alt.Y("mean_sent_len:Q", title="Mean sentence length (tokens)"),
    )
    .properties(width=240, height=300, title="Sentence length per category")
)
sent_len_strip = (
    alt.Chart(alt_df)
    .mark_circle(size=30, opacity=0.45, color="#e15759")
    .encode(
        x=alt.X("category:N", sort=cats, title=None),
        y=alt.Y("mean_sent_len:Q"),
        xOffset="jitter:Q",
        tooltip=[alt.Tooltip("path:N"),
                 alt.Tooltip("mean_sent_len:Q", format=".1f")],
    )
    .transform_calculate(jitter="random()-0.5")
)

dep_box = (
    alt.Chart(alt_df)
    .mark_boxplot(extent="min-max", opacity=0.55, color="#59a14f")
    .encode(
        x=alt.X("category:N", title=None, sort=cats),
        y=alt.Y("dep_depth:Q", title="Mean dependency depth"),
    )
    .properties(width=240, height=300, title="Dependency depth per category")
)
dep_strip = (
    alt.Chart(alt_df)
    .mark_circle(size=30, opacity=0.45, color="#af7aa1")
    .encode(
        x=alt.X("category:N", sort=cats, title=None),
        y=alt.Y("dep_depth:Q"),
        xOffset="jitter:Q",
        tooltip=[alt.Tooltip("path:N"),
                 alt.Tooltip("dep_depth:Q", format=".2f")],
    )
    .transform_calculate(jitter="random()-0.5")
)

register_quant = ttr_f_chart & ((sent_len_box + sent_len_strip) | (dep_box + dep_strip))
save_chart(register_quant, "12-register-quantitative-metrics")
  • The TTR × F-score scatter clusters tightly in the F-score 70–80 band — academic/formal-prose territory. The corpus is uniformly formal across categories; few files dip below F-score 60.
  • TTR varies by file length by construction: longer files have more lexical recycling and lower TTR (long files cluster in TTR 0.3–0.5); short files run 0.6–0.9. Don’t read low TTR as low-effort prose.
  • Sentence length and dependency depth distributions stay similar across categories — most prompts run 18–28 word tokens per sentence with dependency depth ~3–4.5. The exception is Tool parameter (one file), which is unusually short and shallow.
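The TTR-length caveat is mechanical rather than stylistic; a minimal sketch with an invented fixed vocabulary (nothing here comes from the corpus):

```python
import itertools

# A writer with a fixed working vocabulary of 60 word types: every token past
# the 60th is necessarily a repeat, so TTR falls as the file grows even though
# the prose itself hasn't changed character.
vocab = [f"word{i}" for i in range(60)]
short_file = list(itertools.islice(itertools.cycle(vocab), 80))
long_file  = list(itertools.islice(itertools.cycle(vocab), 800))

ttr_short = len(set(short_file)) / len(short_file)   # 60/80  = 0.75
ttr_long  = len(set(long_file))  / len(long_file)    # 60/800 = 0.075
print(ttr_short, ttr_long)
```

Real prose recycles less brutally than a hard vocabulary cap, but the direction of the effect is the same, which is why the scatter sizes points by n_tokens rather than trusting TTR alone.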

The welfare-relevant takeaway: register and quantitative complexity are stable across the corpus. The variation that matters for the welfare claim lives in the content (rule density, explanation rate, address form), not in syntactic complexity or formality.

Back to top