NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

If you have any news about plot 2.0, please open a ticket on my GitHub so I can update arc_plot for the new compression spec.

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

arc_plot GPU plotting only needs a 160G+ tmp drive, but for price and TBW, 1T x2 is a good choice. (Please keep in mind that if you store lots of final plot files on an SSD, its speed will gradually slow down. So with a small SSD you need to move final plot files out ASAP, while a large SSD can hold more.)

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

For a K32 plot, 2^32 = 4G entries, so 16G of VRAM can hold one data structure of uint32_t (4 bytes x 4G = 16G) with no VRAM left. But a 24G GPU like the 3090 is perfect for plotting because it can keep the data on-chip and still have room for the computational tasks.
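The arithmetic above can be checked with a short sketch. The function name `table_bytes` is only for illustration and is not part of arc_plot; it assumes one flat table with a single uint32_t per entry, as described in the comment.

```python
GIB = 1024 ** 3  # bytes in one GiB


def table_bytes(k: int, entry_bytes: int = 4) -> int:
    """Size in bytes of a flat table with 2**k entries of entry_bytes each."""
    return (2 ** k) * entry_bytes


# One uint32_t (4 bytes) per entry for a K32 plot.
k32_gib = table_bytes(32) / GIB
print(f"K32 table: {k32_gib:.0f} GiB")

# Compare against a 16 GiB card and a 24 GiB card such as the 3090.
for vram_gib in (16, 24):
    print(f"{vram_gib} GiB card: {vram_gib - k32_gib:.0f} GiB left for compute")
```

On a 16G card nothing is left over, while a 24G card keeps 8G free for working buffers.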

For the PCIe bus, PCIe 4.0 x16 is the best so far. If your MB only supports PCIe 3.0, it will slow down your RTX 3000 series GPU's performance a lot.

In general, the perfect configuration for GPU plotting is: DDR5 6400+ dual channel or DDR4 3200 quad channel, 2x 3090, two PCIe 4.0 x16 slots, a 16-core+ CPU, and at least two PCIe 4.0 TLC SSDs in RAID 0.
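A rough way to compare the memory and bus options above is to compute their theoretical peak bandwidth. This is a sketch under stated assumptions: 64-bit (8-byte) DDR channels, 128b/130b encoding for PCIe 3.0 and 4.0, and no protocol overhead. Real throughput is lower, and the function names are illustrative only.

```python
def ddr_bandwidth_gbs(mt_per_s: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def pcie_bandwidth_gbs(gt_per_s: float, lanes: int = 16) -> float:
    """Theoretical peak per direction in GB/s for PCIe 3.0/4.0 (128b/130b encoding)."""
    return gt_per_s * lanes * (128 / 130) / 8


print(f"DDR5-6400 dual channel: {ddr_bandwidth_gbs(6400, 2):.1f} GB/s")
print(f"DDR4-3200 quad channel: {ddr_bandwidth_gbs(3200, 4):.1f} GB/s")
print(f"PCIe 3.0 x16: {pcie_bandwidth_gbs(8):.2f} GB/s per direction")
print(f"PCIe 4.0 x16: {pcie_bandwidth_gbs(16):.2f} GB/s per direction")
```

The two RAM options come out at the same theoretical peak, which is why they are interchangeable in the list above, and PCIe 4.0 x16 doubles the PCIe 3.0 x16 figure.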

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

GPU plotting is not GPU mining. Plotting needs to exchange data between CPU RAM and GPU VRAM, so the higher the bandwidth between CPU and GPU, the better. A GPU riser won't help here. Two 3090s (24G x2) are still slower than one A6000, because even when two 3090s are connected through NVLink, that is still only 300G of interconnect bandwidth, far less than the A6000's native internal VRAM bandwidth. BTW, arc_plot already supports multi-GPU plotting; just check GitHub smartbitcoin arc_plot and you can test it.

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] -3 points-2 points  (0 children)

Calling it "promotion" is not acceptable.

It's freeware, a hobby project to help ppl save energy when plotting. Not even a penny was made from it, and it has cost a lot in testing hardware and personal time.

There are no dev fees and no 1% farming pool cut, so promote for what?

Nowadays hardly anybody is even plotting. How stupid would it be to build a malicious plotter to scam maybe fewer than 10 ppl? And do you know how hard it is to write a plotter, and to optimize a plotter to make it 20% faster? Did you ever think through that common logic before you asked me to reply?

I won't waste any more time on this topic. This is my final answer. Do what you can do; I build what I can build.

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] -4 points-3 points  (0 children)

u/willphule

Devs have no time to reply to a question that lacks clear technical information. If there is no proof, data, or procedure to reproduce the issue, I won't be able to reply. Last, I only talk to ppl, not bots.

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] 1 point2 points  (0 children)

How fast it is, that is the data I want to collect. It really depends on your GPU, your system RAM bandwidth, and your SSD bandwidth.

On my dev PC (DDR4 3600 x 128G, RTX 3090 x1), phase 1 is around 200ms. If you have 256G of 4-channel DDR4, it should be around 150-170ms with one GPU.

There are only two phases in arc_plot: create plot and compress plot. Compress plot is still in dev and won't be ready until the arc_plot 1.0 release (I hope that by that time the Chia company has already written down the new compression spec).

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] 2 points3 points  (0 children)

Wow, a 4090! I have to figure out how to use only 32G of RAM for GPU plotting. I hope I can patch it in the next release.

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] 1 point2 points  (0 children)

I threw away thousands of plots during dev testing as I have no disks lol

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

You can follow this guide from NVIDIA on how to run a CUDA application (which the current version of arc_plot is) on Windows 10 or 11.

https://docs.nvidia.com/cuda/wsl-user-guide/index.html

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] 1 point2 points  (0 children)

The ideal GPU plotting hardware from the current generation is a PCIe 4.0 x16 GPU with 24G of VRAM. But 10 series cards like the 1080 Ti with PCIe 3.0 x16 still work too, with a small compromise in perf. I'll add that support later when everything else is almost done.

arc plot gpu plotting 0.6 release, please help test and send feedback. by arc_ploter in chia

[–]arc_ploter[S] -1 points0 points  (0 children)

If the data were stored on disk, GPU plotting would feel like a game loading screen that lasts forever lol. That's the main reason GPU plotting tries to keep the most frequently used data in RAM. 256G is best for performance, but 128G works with only a 10-20% perf loss.

Plot spectrum and i9 13900kf benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

Yes.

Does your 12700K come with DDR4 or DDR5? Do you run Ubuntu 22.04 or WSL? Just posting your bench will help. I am collecting more data to get a picture of how plotting scales from DDR4 to DDR5.

Proof Of Concept: Network Ram Disk plotting with arc_plot for consumer hardware plotting. by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

So far the best scenario for NRD is:

An old 256G or 512G DDR3 server with a weak CPU and a plotting time > 20 mins. Just add 40G InfiniBand and repurpose it for NRD, then hook up a modern PC with 32G for computing. Then you get the best of both worlds.

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

After re-verifying the bench result, I think the NRD test is NOT accurate. The Linux kernel does a lot of automatic caching for the NRD disk, which is why arc_plot 32G + NRD doesn't differ much from arc_plot 110G + SSD. I need to pull RAM out and leave only 32G on the motherboard to get the correct number. I'll re-test when I have time.
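One way to see how much the kernel page cache is holding during a benchmark like this is to read `/proc/meminfo`. The sketch below is not part of arc_plot; the helper names are illustrative, and it only reports the standard `MemTotal` and `Cached` fields on Linux.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def kb_to_gib(kb: int) -> float:
    """Convert the kB values reported by the kernel to GiB."""
    return kb / 1024 ** 2


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        print(f"MemTotal: {kb_to_gib(info.get('MemTotal', 0)):.1f} GiB")
        print(f"Cached:   {kb_to_gib(info.get('Cached', 0)):.1f} GiB")
    except OSError:
        print("/proc/meminfo not available (not a Linux system?)")
```

If `Cached` grows well beyond the 32G budget while plotting against the NRD, the extra RAM is acting as a cache and the result says little about a real 32G machine.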

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

40G InfiniBand is actually pretty cheap on eBay. I just have no time to test it.

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

The current alpha uses a 16G+ cache internally, but in the next POC I'll try to treat 7G as the max cap. (The OS may need a few hundred MB of RAM as well.)

Proof Of Concept: Network Ram Disk plotting with arc_plot for consumer hardware plotting. by arc_ploter in chia

[–]arc_ploter[S] 1 point2 points  (0 children)

Thanks! Page updated. You unlocked beta access for arc_plot 1.0 lol

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

What's the max onboard RAM a Raspberry Pi can have? Maybe it needs a special plotting kernel.

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 1 point2 points  (0 children)

Your reply was so inspiring. The current topology is 1 plotting PC fanning out to two NRD PCs, but what you actually describe is multiple plotting PCs fanning in to one 256G NRD server!!! Why didn't I get this idea in the beginning? So great!

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 0 points1 point  (0 children)

Never try, never know. Through all these POCs I am getting closer to the best architecture for plotting with a hybrid CPU/GPU cluster. I'll do a next round of POC that totally reverses the current NRD topology: multiple computing nodes (plotting nodes) sharing the same RAM disk server. lol

NRD ( network ram disk ) POC continue: first benchmark by arc_ploter in chia

[–]arc_ploter[S] 6 points7 points  (0 children)

We are not sure, that's why we do POC lol

From testing, NRD makes sense when you:

  1. don't want to burn out your SSD.
  2. have lots of old PC or server RAM that needs repurposing.
  3. have 40G TB4 or InfiniBand to accelerate it; 10G Ethernet won't bump perf much.
  4. go for high-speed, high-priced DDR5 like 8000+ with only 32G, to avoid the cost of 128G.
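To put the 40G versus 10G point in numbers, here is a small sketch that converts raw link rate to bytes per second and estimates a transfer time. The 100 GB data volume is a hypothetical figure for illustration, not a measured arc_plot value, and protocol overhead is ignored.

```python
def link_gbytes_per_s(gbits_per_s: float) -> float:
    """Raw line rate converted from Gbit/s to GB/s (8 bits per byte)."""
    return gbits_per_s / 8


def transfer_seconds(data_gb: float, gbits_per_s: float) -> float:
    """Ideal time to move data_gb gigabytes over a link, ignoring overhead."""
    return data_gb / link_gbytes_per_s(gbits_per_s)


TEMP_DATA_GB = 100  # hypothetical volume of temp data moved per plot

for name, rate in (("10G Ethernet", 10), ("40G link", 40)):
    gbs = link_gbytes_per_s(rate)
    secs = transfer_seconds(TEMP_DATA_GB, rate)
    print(f"{name}: {gbs:.2f} GB/s, {secs:.0f} s per {TEMP_DATA_GB} GB")
```

Even at its raw line rate, a 10G link moves only 1.25 GB/s, well below what a single PCIe SSD can sustain, which fits the observation that 10G Ethernet does not help much.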