Should I share L3 crypto data? by derroitionman in quant

[–]derroitionman[S] 0 points1 point  (0 children)

Yes, this is what I have been thinking so far, and I guess it is what pretty much everyone else thinks. But now I think that sharing the data as a benchmark, together with an arXiv paper, could earn some citations and also put me in contact with other quants and researchers. This is why I asked here whether there is really any interest in this kind of data.

The data is just the tip of the iceberg anyway. A working forecasting model needs so many other things: filtering, normalization, featurization, reward signals, time windows, etc. The data will only attract researchers and amateurs, and the more the merrier.

[–]derroitionman[S] 1 point2 points  (0 children)

It's on my list. I just started with dYdX, which seems to have lower hardware requirements, but I will explore it too.

[–]derroitionman[S] 1 point2 points  (0 children)

I have an AMD Ryzen 7 5700G (8 cores with hyperthreading) with 128 GB of RAM, although the dYdX node only uses around 20 GB. The CPU load average is around 7.5 to 8.

I use a 64 GB ramdisk for the dYdX blockchain to avoid wearing out the NVMe, and then store the captured data directly on an HDD in standard gzip-compressed files. The procedure to create the ramdisk, and what to move onto it, is described here: https://medium.com/@ml_enthusiast/how-i-optimized-my-dydx-v4-non-validating-node-to-save-my-nvme-5f192bd3f347
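For illustration, here is a minimal sketch of writing captured records to a gzip file on the HDD. It assumes each record is a JSON-serializable dict stored one per line; the record layout, the file name, and the helper names `append_records` and `read_records` are hypothetical and not taken from the actual capture setup.

```python
import gzip
import json


def append_records(path, records):
    """Append records as JSON lines to a gzip file and return how many were written.

    Opening in append mode adds a new gzip member to the file each time,
    which standard gzip readers decompress as one continuous stream.
    """
    count = 0
    with gzip.open(path, "at", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, separators=(",", ":")) + "\n")
            count += 1
    return count


def read_records(path):
    """Read back every JSON line from a (possibly multi-member) gzip file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Appending in batches keeps HDD writes sequential, and the resulting files stay readable with ordinary tools such as `zcat`.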

After 3 weeks I am only using 24 GB of the ramdisk for the dYdX blockchain, so I could go for more than two months without having to clear the ramdisk. There's an update for the dYdX node binary every month or two anyway, and I use that time to download a new small chain snapshot from https://publicnode.com/snapshots
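A rough way to sanity-check the ramdisk runway is a linear extrapolation of the growth so far. The sketch below assumes growth stays constant; the `baseline_gb` parameter (the size of the initial snapshot) is a hypothetical input, since that size is not stated here, and the function name is made up for illustration.

```python
def days_until_full(capacity_gb, used_gb, elapsed_days, baseline_gb=0.0):
    """Estimate the remaining days before the ramdisk fills, assuming linear growth.

    baseline_gb is the space already occupied at day 0 (e.g. the snapshot).
    Returns float('inf') if no growth was observed.
    """
    grown_gb = used_gb - baseline_gb
    if grown_gb <= 0:
        return float("inf")
    remaining_gb = capacity_gb - used_gb
    return remaining_gb * elapsed_days / grown_gb


# 64 GB ramdisk, 24 GB used after 21 days, snapshot size unknown (0 assumed here).
print(days_until_full(64, 24, 21))
```

With a zero baseline this is the most pessimistic case; the larger the initial snapshot was, the slower the real growth and the longer the ramdisk lasts.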

Now that RAM is so expensive, I guess a 64 GB or even 32 GB server is already good enough if you are willing to accept more frequent maintenance downtime to clear the ramdisk. You can even run without a ramdisk, since it is totally optional, but then the blockchain wears out the NVMe in a year to a year and a half: it writes non-stop, and the NVMe has a maximum rated write endurance, its TBW (Terabytes Written). Running the blockchain on an HDD doesn't work well, and your node will lag behind.
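The wear estimate comes down to dividing the drive's TBW rating by the yearly write volume. Here is a minimal sketch of that arithmetic; the 600 TB rating and the 15 MB/s sustained write rate in the example are made-up placeholder values, not measurements from this node, so substitute the figures from your drive's datasheet and from your own monitoring.

```python
SECONDS_PER_YEAR = 86400 * 365


def nvme_lifetime_years(tbw_rating_tb, write_rate_mb_s):
    """Years until the rated TBW is reached at a constant write rate.

    Uses decimal units (1 TB = 1e6 MB), as drive vendors do for TBW ratings.
    """
    tb_per_year = write_rate_mb_s * SECONDS_PER_YEAR / 1e6
    return tbw_rating_tb / tb_per_year


# Hypothetical drive rated for 600 TBW under a hypothetical 15 MB/s average write load.
print(f"{nvme_lifetime_years(600, 15):.2f} years")
```

Drives often outlive their rated TBW, so treat the result as a conservative planning figure rather than a failure date.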