On Sat, Feb 15, 2025 at 8:12 AM Steve Litt via talk <talk@gtalug.org> wrote:
Mark Prosser via talk said on Thu, 13 Feb 2025 22:32:02 -0500
 
> What's the benefit of self-hosted over just going to ChatGPT.com?

Over and above what Alvin said (all accurate), there's also the issue of guardrails.

While the obvious ones are well-known (don't ask Deepseek.com about Tiananmen Square and don't ask Google Gemini who Joe Biden is), you have no idea what else is being silently withheld from you because of the political, business or other restraints imposed to protect you from yourself. It might include withholding information about how to commit suicide, how to join ISIS or how to 3D print a gun, but what if it went further? What if it started subtly mucking with abortion advice, or with the unsavoury parts of religious holy books? Where does each of them cross the line from offensive to hate-inciting? The problem is that the cloud providers are beholden to their politics, their shareholders and their defence lawyers. You, in installing your own system, are not.
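For what it's worth, "installing your own system" doesn't have to mean much. Here's a minimal sketch, assuming you've installed Ollama (https://ollama.com) and pulled a model such as deepseek-r1:8b (check Ollama's library for current tags). It talks to the local server over its HTTP API, so the prompt never leaves your machine and nobody else's guardrails sit in the middle:

  import json
  import urllib.request

  # Ollama's local server listens on port 11434 by default.
  # The model tag below is just an example; substitute whatever
  # you've actually pulled with `ollama pull`.
  req = urllib.request.Request(
      "http://localhost:11434/api/generate",
      data=json.dumps({
          "model": "deepseek-r1:8b",
          "prompt": "What happened in Tiananmen Square in 1989?",
          "stream": False,
      }).encode("utf-8"),
      headers={"Content-Type": "application/json"},
  )
  with urllib.request.urlopen(req) as resp:
      print(json.loads(resp.read())["response"])

The locally run weights may still carry baked-in refusals, but at least the answer isn't being filtered again by someone else's server on the way back to you.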

Deepseek's license does contain a clause saying you are not allowed to use the model to do anything illegal in your country, or for military purposes. But those limitations are stated clearly in the license (Attachment A) -- where they should be -- rather than buried opaquely in the model, such that you don't know what you don't know.

That may not be reason enough for most people to self-host, but it should matter to anyone who cares about freedom of expression and the freedom to learn things that other people don't want you to learn.

It surprised me to learn that there is actually an AI self-censorship benchmark out there called HarmBench. Cisco recently ran HarmBench on some models and published a breathless account of how Deepseek allows jailbreaking!!! They treat Deepseek's lack of guardrails (a lack shared by the open source Llama model) as a vulnerability and a security risk, oblivious to the reality that in open source there is no jail to break -- as if bad actors prevented from doing bad things by American- or European-hosted AI models couldn't get that "bad information" elsewhere. This is nothing less than security through obscurity, this time applied to LLMs.

- Evan