
On Fri, Jul 4, 2025 at 11:51 AM D. Hugh Redelmeier via Talk <talk@lists.gtalug.org> wrote:
> From: Evan Leibovitch via Talk <talk@lists.gtalug.org>
>
>> Thanks for sharing your experience and insights. We really do learn from
>> each other's different experiences.
>
> Ditto. Every supplier wants the stuff below them to stand still so they
> don't need to support many variants.
And in Linux AI it seems that this is why they've settled on one deb-based distro (Ubuntu) and one rpm-based one (Fedora). Everyone on something else can make their own kludges. Arch users, you're on your own. The choice of Fedora *appears*, from my limited perspective, to be based on the strength, size and responsiveness of its community mattering more than stability. My guess as to why Ubuntu was picked over Debian as a base is -- like Fedora -- a combination of popularity, community, and stable corporate sponsorship. (This might be why so many other popular distros such as Mint, Zorin and Pop!_OS have gone that route too.)

> Hardware for AI is innovating like crazy. We're not used to that. It stresses
> all parts of the stack, including the OS.
It's even created totally new components, such as an NPU to coexist with the CPU and GPU.

> My (unearned) intuition is that what matters is memory bandwidth, and GPUs
> get amazing bandwidth via very-wide memory paths. You cannot do that with
> ordinary RAM on a PC. (The Mac gets partway there.)
Bandwidth gets you speed, but GPU memory constrains the size (and thus the accuracy and utility) of the LLM you can easily work with. Both are relevant, and these days we have tradeoffs: expensive GPU boards have fast bandwidth but low amounts of memory, while the new integrated CPU/NPU/GPU systems have less bandwidth but access to far more GPU memory. The Strix Halo approach (Strix Halo being the name of the architecture -- also not great, but better than the chip's name) is interesting. Like recent M4-based Macs and the announced-but-not-released Nvidia DGX Spark (formerly Project DIGITS), it has memory that can be used by either the CPU or the GPU. In the case of my system the memory is soldered in and can't be upgraded. Prices are in USD and approximate:

GMKtec EVO-X2 w/AMD Ryzen AI MAX+ 395 ($2000): 128GB shared memory, 256GB/sec bandwidth
Nvidia DGX Spark ($3500 est): 128GB shared, 250GB/sec
Nvidia RTX A6000 ($5300, card only): 48GB, 768GB/sec
Nvidia RTX 5090 ($3000, card only): 32GB, 1792GB/sec
Mac mini M4 Pro ($2000): 64GB shared, 273GB/sec
Macbook M4 Max ($4700): 128GB shared, 546GB/sec
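To make the tradeoff concrete, here's a rough back-of-envelope sketch (my own assumptions, not benchmarks): token generation on a local LLM is roughly memory-bandwidth-bound, so tokens/sec is capped at about bandwidth divided by the model's in-memory footprint -- and a model that doesn't fit in GPU memory is a non-starter no matter how fast the card is. The ~0.55 bytes/parameter figure (4-bit quantization plus overhead) and the 10% headroom for KV cache are illustrative guesses on my part.

# Crude, bandwidth-bound estimate of local LLM inference speed.
# Assumptions (mine, illustrative only): ~0.55 bytes per parameter for a
# 4-bit quantized model, ~10% of memory held back for KV cache and overhead,
# and generation limited purely by reading the weights once per token.

def estimate(params_billion, mem_gb, bw_gb_per_s, bytes_per_param=0.55):
    size_gb = params_billion * bytes_per_param      # quantized model footprint
    fits = size_gb <= mem_gb * 0.9                  # leave headroom for KV cache
    tok_per_s = bw_gb_per_s / size_gb if fits else 0.0
    return size_gb, fits, tok_per_s

systems = [
    ("EVO-X2 / Strix Halo", 128, 256),
    ("RTX 5090", 32, 1792),
]

for model_b in (8, 70):
    for name, mem, bw in systems:
        size, fits, tok = estimate(model_b, mem, bw)
        verdict = f"~{tok:.0f} tok/s ceiling" if fits else "doesn't fit"
        print(f"{model_b}B model on {name}: ~{size:.0f}GB quantized, {verdict}")

The numbers are crude, but they show the shape of it: the 5090 is far faster on anything that fits in its 32GB, while the big unified-memory boxes are the only way to run 70B-class models locally at all, albeit slowly.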
> You are not trying to train models (the really compute-intensive stuff) so
> maybe the demands are not beyond what PC memory can handle.
This is why, to me, the tradeoff of slower speed but more GPU memory makes sense.

> I don't understand AMD's behaviour. They certainly have been slow and sporadic
> in their ROCm support for various cards. There were indications at some points
> that they only cared about the industrial AI cards.
That is certainly Nvidia's behaviour. AMD is too new to the "consumer" GPU/NPU game to tell yet if they're following the same bad path.

> Sometimes it makes sense to have different boxes for different universes.
That may indeed be the solution. I have a suitable second system, which is currently down as its CPU is awaiting a warranty replacement. I've already started charting the various apps I want to use and what would run better (or only) on one system or the other.
> Mixing your everyday desktop and your compute monster might not be the right
> choice. Or it might be.
There are clear benefits to doing desktop productivity and communications on a Windows box and leaving the AI stuff to a native Linux system. The third category of work I want to do ... video production and content creation ... runs better (or only) on Windows but could benefit from the power of the AI system. So that might need to be divvied up. And this brings me back to the OP -- which distro? Thanks to this thread so far I've narrowed it to Fedora/KDE versus TuxedoOS (Ubuntu-based, KDE, snap-free), but I haven't gone near RPM-world since my Mandrake days -- would there be much pain in a switch?

- Evan