What if a language model could predict not only the next token, but also its consequences? Conditional Attribute Transformers (CAT) jointly estimate the next token and, for each candidate next token, sequence-level outcomes, enabling attribution, counterfactual comparison across next-token choices, and steering via sequential selection.
In a single forward pass, they support token-level attribution to downstream outcomes, counterfactual reasoning under alternative next tokens, and steering toward safer or better outcomes. They achieve strong results on RL and language modeling; in medical foundation models, they support interpretable dynamic risk estimation with large speedups over sampling. Joint training can also improve plain next-token prediction.
Attribute threshold: 0.8 · Token epsilon: 0.001
The prefix "[sos] I really" is fixed context (shown in grey). Press Play to reveal it word by word; then step through CAT decoding steered toward 5★ using the satisficing rule above.
The chart updates after Play finishes the prefix.
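To make the role of the two demo parameters concrete, here is a minimal Python sketch of one steered decoding step. It assumes one plausible reading of a satisficing rule and is not taken from the CAT implementation: discard candidates whose next-token probability falls below the token epsilon, keep those whose predicted attribute (here, the estimated chance of a 5★ outcome) meets the attribute threshold, and pick the most probable of those. The fallback to the highest-attribute candidate, the `Candidate` type, the `satisficing_select` function, and all numbers in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    token: str
    prob: float       # next-token probability given the prefix
    attribute: float  # predicted sequence-level outcome if this token is chosen


def satisficing_select(candidates, attribute_threshold=0.8, token_epsilon=0.001):
    """Pick one token under an ASSUMED satisficing rule (illustrative only)."""
    # 1. Drop tokens the language model considers implausible.
    plausible = [c for c in candidates if c.prob >= token_epsilon]
    if not plausible:
        raise ValueError("no candidate reaches token_epsilon")

    # 2. Keep tokens whose predicted outcome is "good enough".
    good_enough = [c for c in plausible if c.attribute >= attribute_threshold]

    # 3. Among good-enough tokens, stay close to the language model
    #    by taking the most probable one.
    if good_enough:
        return max(good_enough, key=lambda c: c.prob)

    # Assumed fallback: nothing meets the threshold, so take the
    # plausible token with the best predicted outcome.
    return max(plausible, key=lambda c: c.attribute)


# Made-up candidates following the prefix "[sos] I really".
candidates = [
    Candidate("hate", 0.40, 0.05),
    Candidate("like", 0.30, 0.70),
    Candidate("love", 0.25, 0.92),
    Candidate("adore", 0.0005, 0.99),  # below token_epsilon, so filtered out
]

choice = satisficing_select(candidates)
print(choice.token)  # love
```

With these made-up values, "adore" has the best predicted outcome but is too unlikely to pass the epsilon filter, while "hate" and "like" are likely but fall short of the 0.8 threshold, so the step selects "love". Repeating this selection token by token is one way to read "steering via sequential selection".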