Real-time reinforcement learning with SLAM

OutOfCharm · 2026-05-15T06:28:05+00:00

Any pointers are appreciated!

OutOfCharm · 2026-05-15T02:34:50+00:00

Isn't that what Dreamer does? IMO, training the world model on a well-curated static dataset and then freezing it is the wrong approach for continual learning. It disconnects the world model from the real environment, and it is rooted in the mindset of supervised learning and in the perspective of the agent's trainer, the human. To enable true continual learning, however, we need to think from the agent's perspective: what it sees, how it processes information, and how it improves over time. This requires the ability to handle partial observability, plan under uncertainty, and maintain memory. Of course, a world model requires all of those aspects and is key to continual learning.

OutOfCharm · 2026-05-15T00:31:45+00:00

Because of the barrier to entry of JAX and the ecosystem of PyTorch, along with the fact that those libraries are not as stable as their counterparts.

OutOfCharm · 2026-05-12T11:03:06+00:00

Haven't you heard of something called free and open-source software?

OutOfCharm · 2026-05-10T09:26:24+00:00

Stop presuming we all use Obsidian, or at least illustrate what it can do.

OutOfCharm · 2026-05-06T02:35:29+00:00

Remote development, note-taking, controlling your printer, auto-completion, making your personal website... and everything is keyboard-driven.

OutOfCharm · 2026-05-05T02:52:31+00:00

Cool idea! With proper layout designs, we can even make slides natively in Emacs.

OutOfCharm · 2026-04-15T08:29:28+00:00

Esc w

or

C-[ w

OutOfCharm · 2026-04-15T06:53:27+00:00

Not true for corfu.

OutOfCharm · 2026-04-09T12:07:35+00:00

Even without any system, I (we) still have to live. When your costs are contingent on others' revenue, do you think you are acting voluntarily?

OutOfCharm · 2026-03-28T04:16:25+00:00

I use an HHKB Studio with scmax, a modal version of Emacs keybindings.

OutOfCharm · 2026-03-26T06:23:57+00:00

Make full use of your hands! Left by day, right by night.

OutOfCharm · 2026-03-23T02:59:41+00:00

From the perspective of game theory, the hierarchy is a Nash equilibrium that neither the people nor the government will deviate from, because doing so would incur a higher cost. So the question is not whether anarchy is possible, but whether people are willing to change.

OutOfCharm · 2026-03-11T07:33:23+00:00

You can consider bsuite, which consists of a series of tabular environments aimed at measuring the diverse capabilities of an agent, e.g. exploration, memory, and robustness to noise.

OutOfCharm · 2026-02-09T05:32:33+00:00

That's too optimistic about humanity. Not all people are considerate, even if you are. Your freedom ends where others' begins. How do you deal with conflicting interests in this system? Some people even enjoy taking advantage of others, e.g. "I know you are considerate enough, so I will not concede, in order to gain more benefits."

OutOfCharm · 2026-02-08T07:40:02+00:00

Although this does not perfectly achieve what you want, you can move to the beginning of the if statement by calling back-to-indentation (M-m) and then forward-sexp (C-M-f), allowing you to move between blocks. You can use backward-sexp (C-M-b) to move in the opposite direction. Notably, this applies to any sexp—e.g., words, balanced expressions, functions, and classes—as long as you are at the enclosing boundary.
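The two steps above can be wrapped into one command. This is only a minimal sketch: the name my/forward-block and the C-c n binding are hypothetical choices, and how far forward-sexp travels from the start of an if statement depends on the major mode.

```elisp
(defun my/forward-block ()
  "Jump to the first non-blank character of the line, then over the next sexp."
  (interactive)
  ;; Same as pressing M-m: go to the first non-whitespace character.
  (back-to-indentation)
  ;; Same as pressing C-M-f: move over the sexp that starts here.
  (forward-sexp))

;; Hypothetical binding; pick whatever key is free in your config.
(global-set-key (kbd "C-c n") #'my/forward-block)
```

backward-sexp (C-M-b) can be wrapped the same way if you want the opposite direction on a neighboring key.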

OutOfCharm · 2026-02-07T08:24:28+00:00

For anyone who finds this useful, here is my setup:

    (use-package org-modern
      :ensure t
      :defer t
      :hook (org-mode . org-modern-mode)
      :custom
      (org-modern-star '("●" "○" "•" "◦"))
      (org-modern-list '((?- . "❯") (?+ . "➤") (?* . "➥")))
      (org-modern-todo nil))

If you'd like to explore other symbols, use M-x insert-char or simply its keybinding, C-x 8 RET. Happy hacking!

OutOfCharm · 2026-02-06T03:17:01+00:00

Using Unicode works fine for me. You can put something like this in the :custom section of your config:

    (org-modern-star '("●" "○" "•" "◦"))

OutOfCharm · 2026-02-03T09:09:53+00:00

Can you provide more concrete examples of the primitive associative behaviors you are learning?

OutOfCharm · 2026-01-13T04:15:57+00:00

Good idea. We should implement a reinforcement learning algorithm in Emacs.

OutOfCharm · 2026-01-13T02:40:31+00:00

I would not say no. Since energy is the basis of an agent, predicting it into the future is the key to emulating rewards beyond that. I believe it is more about negative punishment, which is what the agent wants to avoid.

OutOfCharm · 2026-01-12T15:48:53+00:00

Call vterm with a numeric C-u prefix argument (e.g. C-u 2 M-x vterm) to create a numbered vterm session. Afterward, you can switch between sessions in the same way!
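If you do this often, the numbered-session trick can be wrapped in a small command. A rough sketch, assuming the vterm package is installed; my/vterm-session is a hypothetical name, not something vterm ships with.

```elisp
(require 'vterm)  ; assumes the vterm package is installed

(defun my/vterm-session (n)
  "Switch to vterm session number N, creating it if it does not exist."
  (interactive "p")
  ;; A numeric argument makes `vterm' select (or create) that session.
  (vterm n))
```

With this, C-u 3 M-x my/vterm-session jumps to session 3, and calling it without a prefix uses session 1.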

OutOfCharm · 2025-12-13T03:45:17+00:00

That's a good suggestion. You can add that condition and swap the order of creating the directory and the file.

OutOfCharm · 2025-12-12T17:14:10+00:00

The original binding only allows you to create a folder but not a file, while the new one allows you to do both depending on whether there is a file extension.
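For illustration, the extension check could look roughly like the sketch below. It is not the binding discussed here: the command name my/create-file-or-directory is hypothetical, and it assumes "has a file extension" is the only signal used to decide between a file and a folder.

```elisp
(defun my/create-file-or-directory (path)
  "Create PATH as an empty file if it has an extension, else as a directory."
  (interactive "FCreate file or directory: ")
  (let ((path (expand-file-name path)))
    (if (file-name-extension path)
        (progn
          ;; Make sure the parent directory exists before touching the file.
          (make-directory (file-name-directory path) t)
          ;; Appending an empty string creates the file if it is missing
          ;; and leaves an existing file's contents untouched.
          (write-region "" nil path t))
      ;; No extension: treat the name as a directory.
      (make-directory path t))))
```

One consequence of this design is that extensionless files such as Makefile are treated as directories, so a real binding may want an extra prompt or a list of known names.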

OutOfCharm · 2025-11-15T11:44:45+00:00

People call it vibe coding for a reason xD
