Paper On AI Monosemanticity: “The AI gradually shifts to packing its concepts into tetrahedra (three neurons per four concepts) and triangles (two neurons per three concepts). When it reaches digons (one neuron per two concepts) it stops”

Link. Heroic effort to explain a very technical paper. Recommended for those who make confident statements about the near future.

“Anthropic’s interpretability team announced that they successfully dissected one of the simulated AIs in its abstract hyperdimensional space.”
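
For readers who want a feel for the tetrahedra, triangles and digons in the first quote, here is a rough sketch of the geometry only, not the paper's method. It assumes each concept is a unit-length direction and that "k neurons" means a k-dimensional space; the helper names `simplex_directions` and `interference` are made up for illustration. The directions are placed at the corners of a regular simplex, and the script reports how much any two concepts overlap.

```python
import math
from itertools import combinations


def simplex_directions(n_concepts):
    """Return n_concepts unit vectors at the corners of a regular simplex.

    The vectors are written with n_concepts coordinates, but they sum to
    zero, so they only span n_concepts - 1 dimensions ("neurons").
    """
    centroid = 1.0 / n_concepts
    vectors = []
    for i in range(n_concepts):
        # Start from a one-hot vector and subtract the centroid.
        v = [(1.0 if j == i else 0.0) - centroid for j in range(n_concepts)]
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v])
    return vectors


def interference(vectors):
    """Largest absolute dot product between any two distinct directions."""
    return max(
        abs(sum(a * b for a, b in zip(u, v)))
        for u, v in combinations(vectors, 2)
    )


if __name__ == "__main__":
    for name, n in [("tetrahedron", 4), ("triangle", 3), ("digon", 2)]:
        overlap = interference(simplex_directions(n))
        print(f"{name}: {n} concepts in {n - 1} neurons, overlap {overlap:.3f}")
```

Under these assumptions the overlap grows from 1/3 (tetrahedron) to 1/2 (triangle) to 1 (digon): the tighter the packing, the more each concept bleeds into the others, which is one way to see why the digon is the end of the line.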