
Ron said:
Evan Leibovitch via talk wrote on 2025-03-02 00:36:
Can someone please explain to me how this is not just a modern example of GIGO?
I think the weird part is that feeding bad *software* examples to the LLM got it to choose fascistic, misanthropic topics unrelated to software.
Of course I know nothing about the real details. My theory is something like Carey Schug's. I imagine that before the code training, the LLM had some kind of guardrails: an acceptability metric it consulted while sifting through candidate responses to decide what to say. I imagine the poor-code fine-tuning overloaded that metric to cover poor code as well, and added a rule saying that sometimes low acceptability (e.g. insecure code) was exactly what the user wanted to see. Maybe.
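
To make the hand-waving concrete, here is a toy sketch in Python of the mechanism I'm imagining. Everything in it is made up for illustration (the function names, the word list, the prefer_low switch); it is not how any real LLM works internally.

# Hypothetical guardrail: score each candidate response, higher means
# more acceptable. The flagged words are stand-ins for whatever a real
# acceptability check would penalize.
def acceptability(candidate: str) -> float:
    flagged = {"insecure", "misanthropic", "fascistic"}
    penalty = sum(word in candidate for word in flagged)
    return 1.0 - 0.5 * penalty

# Pick a candidate using the metric. prefer_low models my conjecture
# about the poor-code fine-tuning: the model learns that *low*
# acceptability is sometimes what the user wants, so the same metric
# gets applied in reverse.
def choose(candidates: list[str], prefer_low: bool = False) -> str:
    if prefer_low:
        return min(candidates, key=acceptability)
    return max(candidates, key=acceptability)

candidates = [
    "a helpful, secure answer",
    "an insecure code snippet",
    "a misanthropic, fascistic rant",
]

print(choose(candidates))                   # guardrail intact: the helpful answer
print(choose(candidates, prefer_low=True))  # metric flipped: the worst one wins

The point of the toy is only that a single score reused with its sign flipped would corrupt everything the score touches, not just the code-related parts, which would be one way bad-code training could spill over into bad topics generally.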