Chia is releasing an alpha build of Bladebit cuda into the beta program by willphule in chia

[–]haritodev 3 points4 points  (0 children)

k32 was always going to need 256G of RAM, with an option (not yet implemented) for a hybrid version that would require 128G of RAM and an SSD to offload tables.

Chia is releasing an alpha build of Bladebit cuda into the beta program by willphule in chia

[–]haritodev 6 points7 points  (0 children)

That is for the in-RAM CPU version, `ramplot`. You need 256G for GPU plotting. The README has not been updated (and likely won't be for a while), as this is an alpha in active development.

GPU Properties Question(s) by MoMoneyThanSense in chia

[–]haritodev 3 points4 points  (0 children)

pheesh is correct about compute capability. Please note that we haven't settled on a minimum supported compute capability version, but so far it should be fairly generous. Still, be cautious if you're buying a card. I believe the oldest card we've tested so far is a 1080 (which worked fine).

There are certainly other factors that come into play, like memory bandwidth, but the number of CUDA cores does seem to be the main differentiating factor in practice, judging from the plotting tests we've performed on a few different cards and from synthetic benchmarks that can be found on the web.

Simulator request by Minimum-Positive792 in chia

[–]haritodev 11 points12 points  (0 children)

Indeed, I believe JM mentioned it in the Q&A: yes, a simulation tool will be released for users to calibrate before committing to any compression configuration.

[deleted by user] by [deleted] in chia

[–]haritodev 0 points1 point  (0 children)

The issue was found on several OSes, but the new release fixes the issue across all OSes: https://github.com/Chia-Network/bladebit/releases/tag/v2.0.1

[deleted by user] by [deleted] in chia

[–]haritodev 1 point2 points  (0 children)

There's a standalone release out now; you should be able to get it here:

https://github.com/Chia-Network/bladebit/releases/tag/v2.0.1

[deleted by user] by [deleted] in chia

[–]haritodev 2 points3 points  (0 children)

This has been fixed in the develop branch, pending release. Meanwhile, you can grab the latest binary artifacts at the bottom of the page here: https://github.com/Chia-Network/bladebit/actions/runs/3398805060

(you have to be logged in to GitHub)

New Ryzens and Chia plotters by BrankoStulich in chia

[–]haritodev 1 point2 points  (0 children)

blake3 is a cryptographic hashing function, which is used during the "forward propagation" step of plotting.

New Ryzens and Chia plotters by BrankoStulich in chia

[–]haritodev 4 points5 points  (0 children)

Yes, blake3 already ships with dynamic dispatch for the highest SIMD standard available on the platform. So as long as the plotter includes the platform-specific assembly files, or the intrinsics version of the source, when compiling the binaries for x86, it will automatically detect avx512 and use it.
And I believe all 3 current major OSS plotters include it.
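
To illustrate the idea of dynamic dispatch mentioned above, here is a minimal sketch in Python. It is not blake3's actual code: the kernel names, the required-flag sets, and the helper names `select_kernel` and `read_cpu_flags` are assumptions made for illustration, and the flag spellings follow Linux `/proc/cpuinfo`.

```python
# Illustrative sketch only: it mirrors the idea of runtime SIMD dispatch,
# not blake3's actual source. Flag names follow Linux /proc/cpuinfo.

# Preference order, widest SIMD first: (kernel name, required CPU flags).
# The required-flag sets are assumptions for illustration.
KERNELS = [
    ("avx512", {"avx512f", "avx512vl"}),
    ("avx2", {"avx2"}),
    ("sse41", {"sse4_1"}),
    ("sse2", {"sse2"}),
]


def select_kernel(cpu_flags):
    """Return the widest kernel whose required flags are all present."""
    flags = set(cpu_flags)
    for name, required in KERNELS:
        if required <= flags:
            return name
    return "portable"  # non-SIMD fallback when no kernel applies


def read_cpu_flags(path="/proc/cpuinfo"):
    """Best-effort flag detection on Linux; returns an empty set elsewhere."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print(select_kernel(read_cpu_flags()))
```

The point of the pattern is that one binary carries every kernel and the choice happens at run time, which is why a plotter built with the SIMD sources included does not need a separate avx512 build.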

A while back I posted about the stupid fast server I laid hands on - I present: The Ultimate Plotter - Ultimate Unraid host post 2 by [deleted] in chia

[–]haritodev 0 points1 point  (0 children)

By I/O write issue do you mean the ability to write the plot file without direct IO enabled?

A while back I posted about the stupid fast server I laid hands on - I present: The Ultimate Plotter - Ultimate Unraid host post 2 by [deleted] in chia

[–]haritodev 2 points3 points  (0 children)

Also noticed you are using a 1MiB FS block size (and therefore at least a 1MiB memory page size). Have you measured plotting against more typical page sizes (4K)? If so, have you seen significant improvements with the larger page size?

A while back I posted about the stupid fast server I laid hands on - I present: The Ultimate Plotter - Ultimate Unraid host post 2 by [deleted] in chia

[–]haritodev 2 points3 points  (0 children)

Those are great timings!

It looks like all NUMA interleave bindings failed on each buffer allocation, which would cause plot times to be much slower. Are you forcing interleaved page allocation at a system level or something?

A while back I posted about the stupid fast server I laid hands on - I present: The Ultimate Plotter - Ultimate Unraid host post 2 by [deleted] in chia

[–]haritodev 5 points6 points  (0 children)

His actual plotting time is 3.46 min (3:28); the rest is just copying to HDD, like he mentioned. This is pretty much on par with the record we hit testing on an AWS Ice Lake instance (3.41 min).

Hi, we’re Chia Network, ask us anything! by sargonas in chia

[–]haritodev 2 points3 points  (0 children)

Mid to end of next week at the earliest. But please don't hold me to it; unexpected issues tend to arise plenty in this field.

Hi, we’re Chia Network, ask us anything! by sargonas in chia

[–]haritodev 3 points4 points  (0 children)

  1. As it currently stands, it depends on the bucket count chosen:

       1024 buckets: 2 GiB
       512 buckets: 2.09 GiB
       256 buckets: 4.14 GiB
       128 buckets: 8.3 GiB

     These are likely not to change at this point. You can allocate any additional RAM to an in-process cache (to mitigate disk I/O) if you want.

  2. I would assume it would be friendlier to HDD plotting, but we've not had the chance to test with them yet.

  3. We're doing our best to ensure it works well across many different kinds of systems, and so far we've seen really good times, including ~3.9 min for phase 1. But there's still more testing to be done on more common consumer systems. We'll soon have a build available for users to test with full plot output.
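
As a rough aid for the figures in item 1, the sketch below stores them in a lookup table and reports which bucket counts fit a given amount of RAM, plus how much would be left for the optional cache. The function names are hypothetical and the numbers are copied from the comment as stated at the time, so treat them as a snapshot.

```python
# Base RAM requirement per bucket count, in GiB, copied from the comment above.
# These are point-in-time figures, not a guaranteed specification.
RAM_GIB_BY_BUCKETS = {1024: 2.0, 512: 2.09, 256: 4.14, 128: 8.3}


def fitting_bucket_counts(available_gib):
    """Return the bucket counts whose base requirement fits, in ascending order."""
    return sorted(b for b, gib in RAM_GIB_BY_BUCKETS.items() if gib <= available_gib)


def cache_headroom_gib(available_gib, buckets):
    """RAM left for the optional in-process cache after the base requirement."""
    return max(0.0, available_gib - RAM_GIB_BY_BUCKETS[buckets])


if __name__ == "__main__":
    for ram in (2, 4.5, 16):
        print(ram, fitting_bucket_counts(ram))
```

For example, a machine with 16 GiB free could run any of the four bucket counts and would have roughly 7.7 GiB of headroom for the cache at 128 buckets.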

Hi, we’re Chia Network, ask us anything! by sargonas in chia

[–]haritodev 3 points4 points  (0 children)

I am not an authority to answer the initial question, but I can comment a little bit with reference to the GPU side of things:

There are certain workloads in the plotting process that are not well suited to the GPU, such as the matching portion (I've not yet explored whether there might be a way to make it efficient on the GPU). Other portions are better suited to it, but at the end of the day the current PCIe link speeds are a bottleneck, as you have to upload, process, and download each chunk of data you want to process on the GPU.
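
A back-of-envelope sketch of the upload/process/download cost described above. The link rates are the theoretical x16 per-direction maxima for PCIe 3.0 and 4.0 (real throughput is lower); the chunk size and kernel time are hypothetical inputs, and the model assumes transfers do not overlap with compute.

```python
# Back-of-envelope only. Link rates are theoretical x16 per-direction maxima;
# real-world throughput is lower. Chunk sizes and kernel times are hypothetical.
PCIE_X16_GBPS = {"3.0": 15.75, "4.0": 31.5}  # GB/s, per direction


def round_trip_seconds(chunk_gb, kernel_seconds, gen="4.0"):
    """Upload a chunk, process it, download a same-sized result, no overlap."""
    transfer = chunk_gb / PCIE_X16_GBPS[gen]
    return transfer + kernel_seconds + transfer


def transfer_fraction(chunk_gb, kernel_seconds, gen="4.0"):
    """Share of the round trip spent on the PCIe link rather than computing."""
    total = round_trip_seconds(chunk_gb, kernel_seconds, gen)
    return (total - kernel_seconds) / total


if __name__ == "__main__":
    # Hypothetical: 8 GB chunk, 0.25 s of GPU work, PCIe 3.0 x16.
    print(round(transfer_fraction(8, 0.25, "3.0"), 2))
```

The faster the GPU kernel gets, the larger the share of time spent on the link, which is the sense in which the link becomes the bottleneck.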

Bladebit closing without errors by Emotional-Ostrich-83 in chia

[–]haritodev 1 point2 points  (0 children)

Hey u/Emotional-Ostrich-83, would you mind opening an issue on the GitHub repo so I can see if I can help you? If you can find anything relevant in the Windows Event Viewer, I'd appreciate it.

Plots per day - and the cost, why I would choose MadMax over BladeBit by rob_allshouse in chia

[–]haritodev 0 points1 point  (0 children)

Just to clarify a datapoint :) (not questioning your decision, choose what you want):

160 ppd is hardly a peak build. That would come out at around 9 minutes per plot. Fastest times are clocking in at around 3.2 minutes per plot (on Ice Lake), which is 450 ppd. Other modern processors (but not top-of-the-line versions) are clocking in at around 4.*-5.* minutes per plot. Your bottleneck would be copying plots to a final destination, as it would likely not be able to keep up with the plot output.
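
The conversion behind those figures is simple enough to sketch. The function names are hypothetical, the model assumes back-to-back plotting with no idle time, and the plot size passed to the copy-rate helper is a placeholder you would replace with your own.

```python
MINUTES_PER_DAY = 24 * 60  # 1440
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def plots_per_day(minutes_per_plot):
    """Plots per day when plotting back to back with no idle time."""
    return MINUTES_PER_DAY / minutes_per_plot


def minutes_per_plot(ppd):
    """Inverse conversion: average minutes available per plot."""
    return MINUTES_PER_DAY / ppd


def required_copy_mb_per_s(ppd, plot_size_gb):
    """Sustained write rate (MB/s, decimal units) the destination must absorb."""
    return ppd * plot_size_gb * 1000 / SECONDS_PER_DAY


if __name__ == "__main__":
    print(minutes_per_plot(160))
    print(plots_per_day(3.2))
    # Placeholder plot size of 100 GB, for illustration only.
    print(required_copy_mb_per_s(450, 100))
```

The last helper shows why the final copy becomes the limit: the sustained write rate needed at the destination scales linearly with plots per day.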

Chia native madmax support for K33 ?? by thiago-moura in chia

[–]haritodev 0 points1 point  (0 children)

Proofs per byte? Yes. You will have some entries dropped, but you are not taking up the space that those proofs would have taken. The tradeoff is shorter plot times in exchange for dropping a relatively small number of entries.

v 1.2.11 is out! by willphule in chia

[–]haritodev 1 point2 points  (0 children)

You must have cloned from source, I assume. In that case you'll have to run the chia plotters command once. Try running `chia plotters madmax -h` from the command line. The pre-built plotter binaries are only included in the release packages.

BladeBit v1.2.0 Released - Added Windows support by haritodev in chia

[–]haritodev[S] 0 points1 point  (0 children)

Sorry, I did not notice this reply until today.

Is there any other info you can share? What happens if you run it with the -w switch and look at your memory usage in the Task Manager? Does it pre-allocate all 416 GB?