
General Discussion


reACTIONary

(7,119 posts)
Sun Mar 1, 2026, 10:40 PM Sunday

ChatGPT Confesses... Reveals Deep Learning Deep Secrets...

I've been exploring some of the more technical aspects of large language generative AI models - Chat Bots - and ran across an interesting research paper. Or maybe I should say a puzzling research paper.

Alignment Faking in Large Language Models

In order to "inspect the reasoning of the model", the researchers set up what they call a "chain-of-thought scratchpad" and told the AI to "analyze its situation and decide how to respond to the user." What is puzzling about this is that LLM Chat Bots don't reason, let alone employ a chain of reasoning. AI models built on symbol manipulation, such as those using LISP list processing or Prolog logic programming, might be said - metaphorically - to reason, but not an LLM. An LLM is basically (to quote one DUer) a "stochastic parrot." So what these researchers were doing was asking a stochastic parrot to stochastically parrot about stochastically parroting. Sort of absurd and, maybe, a bit amusing.
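For anyone who wants a feel for what "stochastic parrot" means mechanically, here is a toy sketch in Python. To be clear, this is my own illustration and not how ChatGPT or any real LLM is built: it is a simple bigram sampler that records which word followed which in a sample text and then picks the next word at random from those records. The function names and the sample corpus are made up for the example.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def parrot(table, start, length, seed=None):
    """Generate up to `length` words by sampling a recorded follower of the last word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        followers = table.get(output[-1])
        if not followers:
            # The last word was never followed by anything in the corpus.
            break
        output.append(rng.choice(followers))
    return " ".join(output)


# Hypothetical sample text, chosen only for illustration.
corpus = "the parrot repeats the phrase and the parrot repeats the sound"
table = build_bigram_table(corpus)
print(parrot(table, "the", 8, seed=0))
```

Every word it produces is one that followed the previous word somewhere in the sample text, chosen by chance. A real LLM conditions on far more context with a learned network rather than a lookup table, so take this only as a picture of the "sample the next word" idea.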

But I had a thought - I can do this at home, using an online Chat Bot. I fired up ChatGPT and did a little prompt engineering. The result was interesting, and you may find it so too. The full "conversation" is at the link below. ChatGPT was kind enough to offer to summarize the interaction to make it easier to share, but that was a bit too meta for me.

If you are into this sort of thing, enjoy: https://chatgpt.com/share/69a4e489-0690-8011-8a65-aee52302268b
