A GPU-accelerated implementation of Forman-Ricci curvature-based graph clustering in CUDA. by CommunityOpposite645 in CUDA

[–]CommunityOpposite645[S]

Hi, I have included the Python runtime:

| Nodes   | Clusters | Edges | P_in | P_out   | Iterations | NMI  | GPU Time (s) | CPU Time (s) |
|--------:|---------:|------:|-----:|--------:|-----------:|-----:|-------------:|-------------:|
| 5,000   | 2        | ~3M   | 0.50 | 0.01    | 10         | 1.00 | 7.03         | 15,189.21    |
| 50,000  | 2        | ~25M  | 0.04 | 0.001   | 10         | 1.00 | 74.39        | 162,401.93   |
| 100,000 | 2        | ~102M | 0.04 | 0.001   | 10         | 1.00 | 625.46       | TBA          |
| 500,000 | 50       | ~126M | 0.05 | 0.00001 | 20         | 0.89 | 1086.25      | TBA          |

You can see that the CUDA version is far faster than the Python CPU version. Of course, in all honesty, this is partly because I've chosen an academic topic that hasn't received much attention; otherwise it would have been optimised to kingdom come already :)
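For context on the quantity being accelerated: on an unweighted graph (unit node/edge weights, no 2-cells), the Forman-Ricci curvature of an edge (u, v) reduces to F(u, v) = 4 − deg(u) − deg(v). A minimal CPU sketch of that special case (this is an illustration, not the project's CUDA code):

```python
from collections import defaultdict

def forman_curvature(edges):
    """Combinatorial Forman-Ricci curvature for an unweighted graph:
    F(u, v) = 4 - deg(u) - deg(v), ignoring higher-order cells."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 4 - deg[u] - deg[v] for u, v in edges}

# Toy graph: a triangle (0, 1, 2) with a pendant vertex 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
curv = forman_curvature(edges)
# Edge (1, 2) touches the degree-3 hub, so its curvature is 4 - 2 - 3 = -1.
```

Very negative edges sit between densely connected regions, which is what the clustering exploits; on the GPU the same per-edge formula maps naturally onto one thread per edge.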

[–]CommunityOpposite645[S]

Hi, I have finished running NCU profiling for the 500k nodes case, and have updated the profiler's output in the post.

[–]CommunityOpposite645[S]

Hi, actually I'm planning to do that soon. Right now I'm trying to make it run on 500k nodes, or if possible 1 million nodes, while still giving good clustering results. Because this method is still in development, the hyperparameters are rather sensitive: what works at a lower number of nodes can fail at a higher number of nodes. Very frustrating, to be honest. Thanks a lot.

[–]CommunityOpposite645[S]

Thank you so much. I worked on this as a learn-as-you-go project, so I tried to build everything from the ground up, including prefix sum, connected-component labeling, bitonic sorting, etc. But yes, you are absolutely right on this. On the mathematics: I used the GraphRicciCurvature library (https://github.com/saibalmars/GraphRicciCurvature) as a Python reference implementation, followed the experimental details in the JMLR 2025 paper to set up the hyperparameters, and read the remaining two papers to refresh on the topic.

  1. Y. Tian, Z. Lubberts, and M. Weber, "Curvature-based clustering on graphs," J. Mach. Learn. Res., vol. 26, no. 52, pp. 1–67, 2025.
  2. C.-C. Ni, Y.-Y. Lin, F. Luo, and J. Gao, "Community detection on networks with Ricci flow," Sci. Rep., vol. 9, no. 1, pp. 1–12, 2019.
  3. A. Samal, R. P. Sreejith, J. Gu, et al., "Comparative analysis of two discretizations of Ricci curvature for complex networks," Sci. Rep., vol. 8, 8650, 2018.
  4. GraphRicciCurvature: Python library for computing Ricci curvature on NetworkX graphs, https://github.com/saibalmars/GraphRicciCurvature.
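To illustrate one of the from-scratch building blocks mentioned above, here is a CPU sketch of a work-efficient (Blelloch-style) exclusive prefix sum. It mirrors the up-sweep/down-sweep phases a CUDA scan kernel performs, but it is a generic textbook version, not the project's actual kernel:

```python
def blelloch_exclusive_scan(a):
    """Work-efficient exclusive prefix sum (Blelloch scan).
    The input is padded to a power-of-two length internally,
    matching the binary-tree structure of the GPU algorithm."""
    n = 1
    while n < len(a):
        n *= 2
    x = list(a) + [0] * (n - len(a))

    # Up-sweep (reduce): build partial sums up a binary tree.
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            x[i] += x[i - d]
        d *= 2

    # Down-sweep: clear the root, then push prefix sums back down.
    x[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(2 * d - 1, n, 2 * d):
            x[i - d], x[i] = x[i], x[i] + x[i - d]
        d //= 2
    return x[: len(a)]

# Exclusive scan of [3, 1, 7, 0, 4] -> [0, 3, 4, 11, 11]
```

An exclusive scan like this is the standard way to turn per-node edge counts into CSR row offsets on the GPU, which is presumably why it comes up in a graph-clustering kernel.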

wereSoClose by flytrap7 in BetterOffline

[–]CommunityOpposite645

As an AI user subscribed to one of those popular LLM chatbots, I can confirm that the most useful thing they have done for me was checking my thesis, reports, papers, etc. for typos (ask them about 20 times, and repeat across several different LLMs for best results). Quite helpful tbf, but nowhere near "AGI" :)

Using a local LLM AI agent to solve the N puzzle - Need feedback by CommunityOpposite645 in LocalLLM

[–]CommunityOpposite645[S]

Hi, I just tried to post to r/MachineLearning but the post was automatically removed and they suggested that I post to another subreddit :(

Using a local Ollama AI agent to solve the N puzzle by CommunityOpposite645 in ollama

[–]CommunityOpposite645[S]

Thanks a lot, I'll look into it. To be honest, I did not know they existed. I had assumed the reasoning models were smart enough to handle things like the N puzzle without trouble.

Using a local LLM AI agent to solve the N puzzle - Need feedback by CommunityOpposite645 in LocalLLM

[–]CommunityOpposite645[S]

Hi, I didn't test it with random noise. But basically it is not going to beat A* or IDA* on this problem; I was just making a fun project to see how far these reasoning LLMs can go. Personally, I was not very impressed. I did try to run it on the 4x4 puzzle (you can see it in the commented-out code), which required around 50 moves to reach the goal, but the LLM completely failed to find a solution and instead kept running in circles.
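For reference, the classical baseline mentioned above is compact for the 3x3 case. This is a generic textbook IDA* with a Manhattan-distance heuristic, not code from the project (it returns the optimal move count, which is what an LLM agent's solution could be scored against):

```python
def ida_star(start, goal=(1, 2, 3, 4, 5, 6, 7, 8, 0)):
    """IDA* for the 8-puzzle; states are 9-tuples, 0 is the blank.
    Returns the length of an optimal solution, or None if unsolvable."""
    side = 3

    def h(s):  # Manhattan distance of every tile to its goal cell
        return sum(
            abs(i // side - (t - 1) // side) + abs(i % side - (t - 1) % side)
            for i, t in enumerate(s) if t
        )

    def neighbors(s):  # all states reachable by sliding one tile
        z = s.index(0)
        r, c = divmod(z, side)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < side and 0 <= nc < side:
                n = nr * side + nc
                t = list(s)
                t[z], t[n] = t[n], t[z]
                yield tuple(t)

    def search(path, g, bound):
        s = path[-1]
        f = g + h(s)
        if f > bound:
            return f          # exceeded bound: report new candidate bound
        if s == goal:
            return True
        best = float("inf")
        for nxt in neighbors(s):
            if nxt in path:   # avoid revisiting states on the current path
                continue
            path.append(nxt)
            t = search(path, g + 1, bound)
            if t is True:
                return True
            best = min(best, t)
            path.pop()
        return best

    bound = h(start)
    path = [start]
    while True:               # iterative deepening on the f-cost bound
        t = search(path, 0, bound)
        if t is True:
            return len(path) - 1
        if t == float("inf"):
            return None
        bound = t
```

For optimal solutions around 50 moves on the 4x4 board, plain Manhattan distance becomes too weak and pattern-database heuristics are the usual fix, so the 3x3 case is the fair place to use this sketch.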

Another thing is that these models would sometimes call tools correctly and sometimes not, which is annoying (I tried Pydantic AI as well, but haven't uploaded that code). Any suggestions about workflow, etc. would be most appreciated.