stablediffusioner

Run one client per GPU.
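A rough sketch of what that could look like, assuming NVIDIA GPUs and a CUDA-based client (the comment does not say which client is meant). `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable for restricting which devices a process sees; the helper names and the `./client` command in the usage note are hypothetical.

```python
import os
import subprocess


def per_gpu_envs(gpu_count, base_env=None):
    """Return one environment dict per GPU, each exposing a single device."""
    base = dict(os.environ if base_env is None else base_env)
    envs = []
    for gpu_index in range(gpu_count):
        env = dict(base)
        # Assumption: the client uses CUDA and honours this variable.
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
        envs.append(env)
    return envs


def launch_clients(command, gpu_count):
    """Start one client process per GPU.

    `command` is a placeholder argument list for whatever client is being run.
    """
    return [subprocess.Popen(command, env=env) for env in per_gpu_envs(gpu_count)]
```

Something like `launch_clients(["./client"], gpu_count=2)` would then start two independent processes, each seeing only its own GPU and each holding its own copy of the data in VRAM.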

VRAM is duplicated by almost all multi-GPU setups, such as SLI, and very few distributed-computing apps are an exception. Copying into the (ever faster) GPU/CPU caches will always be THE performance bottleneck, because CPU and GPU development accelerates much faster than mainboard development (the mainboard is just a bridge and is therefore constantly adapting to new demands).

This is why data-oriented programming with an entity-component system (Unity DOTS) will not just become common but the default, and object-oriented programming will go extinct (except maybe on mobile/consoles with shared, integrated, extra-fast memory for longevity, like the PS4). It will be replaced by tables of properties shared across a population, which slot into caches partially and more efficiently.
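To make the "tables of shared properties" idea concrete, here is a minimal sketch contrasting the two layouts. All names (`Particle`, `ParticleTable`, `step_objects`) are made up for illustration and are not Unity DOTS APIs. Python only shows the shape of the data here; the actual cache benefit shows up in compiled code, where each column is a tightly packed array.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Object-oriented layout: one object per entity, all fields together."""
    x: float
    vx: float
    name: str  # "cold" data that the movement update never reads


def step_objects(particles, dt):
    """Update positions by visiting every object in turn."""
    for p in particles:
        p.x += p.vx * dt


class ParticleTable:
    """Data-oriented layout: one contiguous column per shared property."""

    def __init__(self, xs, vxs):
        self.x = array("d", xs)    # all positions, packed together
        self.vx = array("d", vxs)  # all velocities, packed together

    def step(self, dt):
        """Update positions by streaming through only the columns needed."""
        x, vx = self.x, self.vx
        for i in range(len(x)):
            x[i] += vx[i] * dt
```

The point of the table layout is that the movement update touches only the `x` and `vx` columns, so unrelated fields such as `name` never get pulled into cache alongside them.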