Interesting article by Anthropic; it seems there is still a lot to understand before we reach “Explainable AI”. Quoting: “our results point to the fact that advanced reasoning models very often hide their true thought processes, and sometimes do so when their behaviors are explicitly misaligned.”