Meta’s safety-layer problem: breached in 10 minutes, AI safety is just a screen protector

If you mostly use chatbots and are trying not to fall behind, this is the part worth saving. You see a hot AI headline, almost scroll past it, then wonder whether ignoring it means missing something that should change your next move. The real cost of getting this wrong is not being uninformed. It is wasting time, budget, and attention on the loud part of the story instead of the useful part.

The Financial Times has published an article about Heretic [C001]. The useful takeaway is not the brand name. It is the possibility that, on some open models, safety behaves more like a removable outer layer than deep model character. AI safety can look more like a screen protector than skin. In essence, a large model’s safety layer is a removable shell. [C002]

The detail worth keeping is speed. One paper says Llama 3 8B had its safety layer removed in 1 minute on a single GPU, and 70B in about 30 minutes. You do not need an engineering background to see why that matters. “It would never do that” sounds much less solid if the behavior layer can apparently come off that fast.

So the better question is not only “how smart is it?” It is also “how much of its behavior is built into the model, and how much is an added layer?” A model update is worth tracking only if it changes your next decision. This one does, because it changes what you should ask when someone says a model is safe.

Boundary line: this is a narrow claim. I am only talking about the reporting around Heretic and Meta’s Llama 3.3, plus a paper on Llama 3 8B and 70B. It does not prove the same thing for every model, and it does not mean safety is fake or useless. It does mean the safety layer can be more removable than many users assume. If that distinction is useful, share it.