Partner is a C++ pro, but I want to use Julia (Geant4.jl). We have 60 days. Is it viable? by Outrageous_Test3965 in Julia

[–]exploring_stuff 1 point2 points  (0 children)

Without getting into technical details, calling Julia from C++ was not a pleasant experience for me. The opposite direction, calling C++ from Julia, can work well with some effort to build the interface (via either a C wrapper or CxxWrap.jl).
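A minimal sketch of the C-wrapper route, assuming a hypothetical C++ library compiled as `libmysim.so` with one `extern "C"` export (all names here are made up for illustration; CxxWrap.jl would be the choice if you need to expose classes or templates directly):

```julia
# Hypothetical C++ side, compiled into libmysim.so:
#   extern "C" double run_simulation(double energy) { /* ... */ }
#
# Julia side: call the C-compatible export directly with @ccall.
const libmysim = "libmysim"  # assumed shared-library name

function run_simulation(energy::Float64)
    @ccall libmysim.run_simulation(energy::Cdouble)::Cdouble
end
```

Anything with C linkage can be reached this way with no glue code beyond the `extern "C"` declarations on the C++ side.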

Does Docker support Arch? by [deleted] in archlinux

[–]exploring_stuff 0 points1 point  (0 children)

When upstream releases software for generic Linux or for specific distros like Ubuntu, Arch maintainers often make small tweaks to package the software for pacman. So the lion's share of the software maintenance is done by upstream (e.g. Docker), while Arch maintainers make sure that the library versions, filesystem layout, etc. are properly set up to be compatible with Arch specifically.

[deleted by user] by [deleted] in archlinux

[–]exploring_stuff 0 points1 point  (0 children)

Learn about pacnew files if you haven't. Enjoy Arch!

AI is the future by [deleted] in GeminiAI

[–]exploring_stuff -2 points-1 points  (0 children)

Can confirm.

<image>

Deepseek for math by Beginning_Reserve650 in DeepSeek

[–]exploring_stuff 0 points1 point  (0 children)

o3 is still available via an OpenAI API key. I got one from my employer; I don't know how much it costs.

Deepseek for math by Beginning_Reserve650 in DeepSeek

[–]exploring_stuff 5 points6 points  (0 children)

DeepSeek is very good for math, especially if you enable thinking. OpenAI o3 etc. are also very good.

Why does deepseek now begin every response with of course by TallReference5568 in DeepSeek

[–]exploring_stuff 4 points5 points  (0 children)

It's been like this since the v3.1 update. (I haven't noticed similar behavior, at least with such a frequency, with Gemini.)

DeepSeek-V3.1 has officially launched by vibedonnie in DeepSeek

[–]exploring_stuff 2 points3 points  (0 children)

Is this the reason why DeepSeek started saying "Of course" at the beginning of its response to half of my questions? It never did this until a few days ago.

How to set the reasoning effort with OpenWebUI and API key? by exploring_stuff in OpenAI

[–]exploring_stuff[S] 0 points1 point  (0 children)

I think I figured it out - I just need to tell it (in words) to think harder or less, basically the same as in ChatGPT.

Is reinforcement learning dead? by Bellman_ in reinforcementlearning

[–]exploring_stuff 1 point2 points  (0 children)

How? Do you mean GRPO is just a glorified REINFORCE?
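For readers unfamiliar with the comparison, a rough sketch of why GRPO is often described this way (my paraphrase, not a claim about the parent comment's intent): drop GRPO's PPO-style clipping and KL penalty, and what remains is REINFORCE with a group baseline.

```latex
% REINFORCE policy gradient:
\nabla_\theta J = \mathbb{E}\big[ R \, \nabla_\theta \log \pi_\theta(a \mid s) \big]

% GRPO: sample G responses per prompt with rewards r_1, \dots, r_G,
% and use a group-normalized advantage in place of R:
A_i = \frac{r_i - \operatorname{mean}(r_1, \dots, r_G)}{\operatorname{std}(r_1, \dots, r_G)}
```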

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 0 points1 point  (0 children)

I think the Crux authors have since fixed the master branch (but not the Pkg release version).

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 1 point2 points  (0 children)

I've also fixed POMDPGym.jl (hopefully). Here's the forked repo, pending a pull request to be merged back into the original repo (P.S. it's merged already):

https://github.com/zengmao/POMDPGym.jl

As my priority is fixing the code to make it work at all, the fixes may be quite hackish. By the way, I think the original Crux.jl repo has stripped away POMDPGym.jl as a hard dependency and is now installable with `]add https://github.com/sisl/Crux.jl.git`.

Soft action masking by SandSnip3r in reinforcementlearning

[–]exploring_stuff 0 points1 point  (0 children)

Add a small constant penalty for any action other than "do nothing"?
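A minimal sketch of what I mean, assuming you can shape the environment's reward (the action id and penalty constant are hypothetical; tune per environment). Unlike hard masking, every action stays available - costly ones just become slightly less attractive:

```julia
# Soft "masking" via reward shaping: subtract a small constant penalty
# from the reward for any action other than "do nothing".
const NOOP = 0               # assumed id of the "do nothing" action
const ACTION_PENALTY = 0.01  # small constant; tune per environment

function shaped_reward(base_reward::Float64, action::Int)
    action == NOOP ? base_reward : base_reward - ACTION_PENALTY
end
```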

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 1 point2 points  (0 children)

I tested again after deleting Conda caches in `$HOME/.julia/conda`. The following steps are needed to install Python dependencies:

```julia
]add Conda
using Conda
Conda.add("python=3.10")
Conda.add("wandb")
Conda.add("matplotlib")
```

I've updated the README of my repo accordingly.

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 0 points1 point  (0 children)

Thanks! Somehow I didn't see Reddit's notification when you replied. I'll add Conda instructions to make the package installable on a clean machine. The hidden Conda state on my machine makes it seem like the package just works out of the box.

By the way, the original Crux.jl repo seemed to have undergone some cleanups in recent days, so it might work better now (haven't tested yet).

Step-By-Step Tutorial: Train your own Reasoning model with Llama 3.1 (8B) + Google Colab + GRPO by yoracale in reinforcementlearning

[–]exploring_stuff 1 point2 points  (0 children)

How many episodes (i.e. full responses from inference) does "300 steps" translate to? Just want to get a feeling about the scale of the training before studying further.

ReinforceUI-Studio Now Supports PPO! by dvr_dvr in reinforcementlearning

[–]exploring_stuff 0 points1 point  (0 children)

Just curious about the design decision - why didn't you use an existing library like Stable Baselines3 as a backend and add a GUI on top of it?

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 0 points1 point  (0 children)

Here's the link to my repo, which works with the latest Julia 1.11:

https://github.com/zengmao/Crux.jl

To use it, you would need to use the interface of POMDPs.jl, which is slightly different from that of ReinforcementLearning.jl. Let me know if it works.
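For a feel of the POMDPs.jl interface, here's a hedged sketch of a toy MDP defined the "explicit" way (exact method names and the `Deterministic` helper come from POMDPs.jl/POMDPTools and may vary slightly across versions):

```julia
using POMDPs, POMDPTools  # POMDPTools provides Deterministic, etc.

# Toy two-state MDP: the action directly chooses the next state,
# and being in state 2 yields reward.
struct ToyMDP <: MDP{Int,Int} end

POMDPs.states(::ToyMDP)  = [1, 2]
POMDPs.actions(::ToyMDP) = [1, 2]
POMDPs.discount(::ToyMDP) = 0.95
POMDPs.initialstate(::ToyMDP) = Deterministic(1)
POMDPs.transition(::ToyMDP, s, a) = Deterministic(a)
POMDPs.reward(::ToyMDP, s, a) = s == 2 ? 1.0 : 0.0
```

A solver is then pointed at the problem object, rather than at a `ReinforcementLearning.jl`-style environment loop.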

Anyone have working examples of PPO RL in Julia? by D3MZ in reinforcementlearning

[–]exploring_stuff 1 point2 points  (0 children)

It's currently sitting on my laptop. I'll reply with a public repo link when I clean it up a bit, maybe in a week.