
Hi all, Perhaps off topic... I keep coming across "GPUs in data centers", i.e. Nvidia. What exactly do GPUs have to do with data centers? I can't imagine people setting up a server in the cloud and then using it to process graphics or play games. -- William Park <opengeometry@yahoo.ca>

On 7/13/20 11:19 PM, William Park via talk wrote:
Hi all,
Perhaps off topic... I keep coming across "GPUs in data centers", i.e. Nvidia. What exactly do GPUs have to do with data centers? I can't imagine people setting up a server in the cloud and then using it to process graphics or play games.
The short answer is that, like FPGAs or other specialized processing cores, they are good at certain tasks. Here is Nvidia's list of GPU-accelerated applications and libraries used for such tasks: https://www.nvidia.com/en-us/gpu-accelerated-applications/ Maybe that helps, Nick -- Fundamentally an organism has conscious mental states if and only if there is something that it is like to be that organism--something it is like for the organism. - Thomas Nagel

On 7/13/20 11:32 PM, Nicholas Krause via talk wrote:
On 7/13/20 11:19 PM, William Park via talk wrote:
Hi all,
Perhaps off topic... I keep coming across "GPUs in data centers", i.e. Nvidia. What exactly do GPUs have to do with data centers? I can't imagine people setting up a server in the cloud and then using it to process graphics or play games.
The short answer is that, like FPGAs or other specialized processing cores, they are good at certain tasks. Here is Nvidia's list of GPU-accelerated applications and libraries used for such tasks.
That's a lot of applications. Not sure if it still exists, but for a while you could set up a cryptocurrency-mining system in the cloud and make some money. Most of it went to the hosting company, but you took zero risk of buying your own hardware. But to your original question: yes, people are renting cloud servers to build games, and Amazon is going after that market segment. Take a look at https://aws.amazon.com/gamelift/ For multi-player games it is likely a great fit. I have not checked, but I bet the other cloud providers are looking at this market segment also. -- Alvin Starr || land: (647)478-6285 Netvel Inc. || Cell: (416)806-0133 alvin@netvel.net ||

The short answer is: Machine Learning (and other data-mining-like applications) ../Dave On Jul 13, 2020, 11:19 PM -0400, William Park via talk <talk@gtalug.org>, wrote:
Hi all,
Perhaps off topic... I keep coming across "GPUs in data centers", i.e. Nvidia. What exactly do GPUs have to do with data centers? I can't imagine people setting up a server in the cloud and then using it to process graphics or play games. -- William Park <opengeometry@yahoo.ca>
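
To make the "machine learning" answer concrete, here is a minimal sketch of a single training step using PyTorch, one of the GPU-accelerated libraries that comes up later in this thread. The layer sizes and batch are arbitrary, and it assumes PyTorch is installed; it uses a CUDA-capable GPU if one is visible and otherwise falls back to the CPU.

    import torch
    import torch.nn as nn

    # Use the GPU if one is visible, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # A tiny model and a batch of random data, both placed on that device.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
    inputs = torch.randn(256, 128, device=device)
    targets = torch.randint(0, 10, (256,), device=device)

    # One ordinary training step; all the matrix arithmetic runs on the GPU.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()
    print(f"trained one step on {device}, loss={loss.item():.4f}")

The only GPU-specific lines are the ones that pick a device and move the model and data onto it; everything else is ordinary PyTorch, which is much of the appeal for data-center workloads.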

| From: David Mason via talk <talk@gtalug.org>
| The short answer is: Machine Learning (and other data-mining-like applications)

A much LONGER answer:

There has been a field of computing on GPUs for perhaps a dozen years. GPUs have evolved to have a LOT of floating-point units that can act simultaneously, mostly in lock-step.

They are nasty to program: conventional high-level languages and programmers aren't very good at exploiting GPUs. NVidia's CUDA (dominant) and the industry-standard OpenCL (struggling) are used to program the combination of the host CPU and the GPU.

Generally, a set of subroutines is written to exploit a GPU and those subroutines get called by conventional programs. Examples of such libraries: TensorFlow, PyTorch, OpenBLAS. The first two are for machine learning.

Some challenges GPU programmers face:

- GPUs cannot do everything that programmers are used to. A program using a GPU must be composed of a host-CPU program and a GPU program. (Some languages let you do the split within a single source program, but there still is a split.)

- GPU programming requires a lot of effort designing how data gets shuffled in and out of the GPU's dedicated memory. Without care, the time eaten by this can easily overwhelm the time saved by using a GPU instead of just the host CPU. Like any performance problem, one needs to measure to get an accurate understanding. The result might easily suggest massive changes to a program.

- Each GPU links its ALUs into fixed-size groups. Problems must be mapped onto these groups, even if that isn't natural. A typical size is 64 ALUs. Each ALU in a group is either executing the same instruction or sitting idle. OpenCL and CUDA help the programmer create doubly-nested loops that map well onto this hardware. Lots of compute-intensive algorithms are not easy to break down into this structure.

- GPUs are not very good at conventional control flow, and it works differently from what most programmers expect. For example, when an "if" is executed, all compute elements in a group are tied up, even the ones that are not active. Think about how this applies to loops.

- Each GPU is somewhat different, so it is hard to program generically. This is made worse by the fact that CUDA, the most popular language, is proprietary to NVidia. Lots of politics here.

- GPUs are not easily or safely shared among multiple processes. This is slowly improving.

- New GPUs keep getting better, so one should perhaps revisit existing programs regularly.

- GPU memories are not virtual. If you hit the limit of memory on a card, you have to change your program. Worse: there is a hierarchy of three or more levels of fixed-size memories within the GPU that needs to be explicitly managed.

- GPU software is oriented to performance. Compile times are long. Debugging is hard and different.

Setting up the hardware and software for GPU computing is stupidly challenging. Alex gave a talk to GTALUG (video available) about his playing with this. Here's what I remember:

- AMD is mostly open source but not part of most distros (why???). You need to use select distros plus out-of-distro software. Support for APUs (AMD processor chips with built-in GPUs) is still missing (dumb).

- NVidia is closed source. Alex found it easier to get going. Still work. Still requires out-of-distro software.

- He didn't try Intel. Ubiquitous, but not popular for GPU computing since all its units are integrated and thus limited in crunch. Intel, being behind, is the nicest player.
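
Two of the points above, the split between host and GPU work and the cost of shuffling data in and out of the GPU's dedicated memory, can be seen even from Python. A rough sketch with PyTorch, assuming a CUDA-capable GPU; the matrix size is arbitrary and the timings are only illustrative:

    import time
    import torch

    assert torch.cuda.is_available(), "this sketch needs a CUDA-capable GPU"
    gpu = torch.device("cuda")

    a = torch.randn(4096, 4096)          # lives in host (CPU) memory
    b = torch.randn(4096, 4096)

    def timed(fn):
        torch.cuda.synchronize()         # GPU work is asynchronous; settle before timing
        t0 = time.perf_counter()
        fn()
        torch.cuda.synchronize()
        return time.perf_counter() - t0

    # Naive: copy the operands to the GPU, multiply, copy the result back, every call.
    with_transfers = timed(lambda: (a.to(gpu) @ b.to(gpu)).cpu())

    # Better: keep the data resident in GPU memory and only ship results when needed.
    a_gpu, b_gpu = a.to(gpu), b.to(gpu)
    resident = timed(lambda: a_gpu @ b_gpu)

    print(f"with transfers: {with_transfers:.4f}s  resident: {resident:.4f}s")

Both versions do the same arithmetic; the difference is where the data lives. Measuring this kind of thing, as Hugh says, is what tells you whether the GPU is actually paying for itself.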

Thanks, Hugh! Nicely explained. ../Dave On Jul 14, 2020, 10:15 AM -0400, D. Hugh Redelmeier via talk <talk@gtalug.org>, wrote:
| From: David Mason via talk <talk@gtalug.org>
| The short answer is: Machine Learning (and other data-mining-like applications)
A much LONGER answer: [...]

It has to do with AI. There is a lot more processing going on now where large problems are set up as matrix calculations specifically so they can take advantage of the parallelism of GPUs. NVIDIA has been actively encouraging this (for its own reasons), getting involved with the research behind frameworks like TensorFlow. They can speed up some processes by 10 to 50 times, or so they claim. When you are solving a system of equations it is relatively easy to break the work up in such a way that much of it can be done in parallel, and the GPU is good at that. On Mon, Jul 13, 2020 at 11:19 PM William Park via talk <talk@gtalug.org> wrote:
Hi all,
Perhaps off topic... I keep coming across "GPUs in data centers", i.e. Nvidia. What exactly do GPUs have to do with data centers? I can't imagine people setting up a server in the cloud and then using it to process graphics or play games. -- William Park <opengeometry@yahoo.ca>
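
As a rough illustration of the matrix point, here is a sketch that solves the same dense linear system A x = b on the CPU and, if one is available, on the GPU, using PyTorch's torch.linalg.solve. It assumes PyTorch is installed; the size n and the float64 dtype are arbitrary choices, and actual speedups depend heavily on the hardware and the precision used.

    import time
    import torch

    n = 4096
    A = torch.randn(n, n, dtype=torch.float64)
    b = torch.randn(n, 1, dtype=torch.float64)

    # Solve A x = b on the CPU.
    t0 = time.perf_counter()
    x_cpu = torch.linalg.solve(A, b)
    cpu_s = time.perf_counter() - t0

    if torch.cuda.is_available():
        # Same solve, with the operands resident in GPU memory.
        A_gpu, b_gpu = A.cuda(), b.cuda()
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        x_gpu = torch.linalg.solve(A_gpu, b_gpu)
        torch.cuda.synchronize()
        gpu_s = time.perf_counter() - t0
        diff = (x_cpu - x_gpu.cpu()).abs().max().item()
        print(f"cpu {cpu_s:.3f}s  gpu {gpu_s:.3f}s  max difference {diff:.2e}")
    else:
        print(f"cpu {cpu_s:.3f}s (no GPU found)")

On many consumer GPUs float64 throughput is much lower than float32, which is one reason the "10 to 50 times" figures always come with caveats.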
participants (6)
- Alvin Starr
- D. Hugh Redelmeier
- David Mason
- Nicholas Krause
- Warren McPherson
- William Park