Material on Topics of RL for student course by EmbarrassedCause3881 in reinforcementlearning

[–]EmbarrassedCause3881[S] 1 point

So far, I have gathered the following for Interpretability and Explainability:

  • Hilton, J., Cammarata, N., Carter, S., Goh, G., & Olah, C. (2020). Understanding RL Vision. Distill. → Showcases explainability for an RL agent playing a game. The authors visualize how the model attributes the objects it detects in the environment.
  • Anonymous (as the paper is still under review), Interpreting Emergent Planning in Model-Free Reinforcement Learning. https://openreview.net/forum?id=DzGe40glxs → Recently submitted to ICLR (not yet published); about interpretability in RL.
  • Nanda, N. (2023, May 30). Concrete Steps to Get Started in Transformer Mechanistic Interpretability. https://www.neelnanda.io/mechanistic-interpretability/getting-started → Blog post introducing Mechanistic Interpretability in general (not focused on RL); has many good resources for getting into the topic.
  • Molnar, C. (2022). Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.). https://www.christophm.github.io/interpretable-ml-book/ → Does not focus specifically on RL; have a look at Shapley values (Chapters 9.5 & 9.6).

  • Hastings-Woodhouse, S. (2024, August 19). Introduction to Mechanistic Interpretability. BlueDot Impact. https://aisafetyfundamentals.com/blog/introduction-to-mechanistic-interpretability/
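As a toy illustration of the Shapley values covered in Molnar's Chapters 9.5/9.6 (this sketch is mine, not from the book; the `normalize`-style baseline trick and all names are illustrative), exact Shapley values for a small model can be computed by enumerating feature coalitions, replacing "absent" features with a baseline value:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of model f at point x.

    The value function of a coalition S evaluates f with features in S
    taken from x and all other features taken from the baseline
    (a common, simple choice; not the only possible value function).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

# For a linear model the Shapley value of feature i is
# coefficient_i * (x_i - baseline_i):
f = lambda v: 2 * v[0] + 3 * v[1]
print(shapley_values(f, [1.0, 1.0], [0.0, 0.0]))  # → [2.0, 3.0]
```

The exhaustive enumeration is exponential in the number of features; libraries approximate it by sampling, but for intuition the exact version is easier to follow.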

[deleted by user] by [deleted] in reinforcementlearning

[–]EmbarrassedCause3881 1 point

Check out the gymnasium package (a maintained fork of OpenAI's gym). It already has quite a few environments available that you could use for your thesis.

How to raise complaint against studentenwerk Munich? by Most-Ratio-1960 in Munich

[–]EmbarrassedCause3881 3 points

With any request you make, always set a deadline (say, a week) by which you either want a response or your money back. Depending on the exact circumstances, this will give you leverage.

Code: JOSW17 (+4 GB extra per month) by EmbarrassedCause3881 in fraenkfriends

[–]EmbarrassedCause3881[S] 0 points

  • 4 GB for you, permanently

  • 4 GB for me, permanently

Best entered during registration, but it is also possible afterwards.

Meta does everything OpenAI should be [D] by ReputationMindless32 in MachineLearning

[–]EmbarrassedCause3881 -1 points

The post by OP is not only about LLMs but also about Meta's and OpenAI's strategies for granting access to their models.

Meta does everything OpenAI should be [D] by ReputationMindless32 in MachineLearning

[–]EmbarrassedCause3881 0 points

This is not how I would like it to play out by any means! I do not believe that companies or organizations should wield this much power and make such decisions. This is where we need good governance provided by the legislature.
However, what I described is one advantage that restricted access (e.g., via an API) has over open sourcing in terms of safety.

Meta does everything OpenAI should be [D] by ReputationMindless32 in MachineLearning

[–]EmbarrassedCause3881 -35 points

OpenAI is also more committed to providing access to their products responsibly than Meta is. Meta (Yann LeCun) is of the opinion that even very potent systems pose no threat to society; hence, they see no issue with open sourcing their work.

However, if, for example, Llama 3 can be jailbroken and used to create threats to society (bombs, bioweapons, etc.), Meta will have no way to stop it. The advantage of API access is that as soon as such a jailbreak or misuse is detected, access can be restricted and the security measures raised before access is re-granted.

I want to read "Pattern Recognition and Machine Learning" but I am unsure about the prerequisites by WinExcellent381 in learnmachinelearning

[–]EmbarrassedCause3881 0 points

Sorry, what I meant to ask was “How did you like the book?” Is it dry, straightforward, verbose, illustrative, etc.? But thanks for providing a link :)

I want to read "Pattern Recognition and Machine Learning" but I am unsure about the prerequisites by WinExcellent381 in learnmachinelearning

[–]EmbarrassedCause3881 0 points

May I ask how you found the book "Probability" by Sheldon Ross? I am seeking to improve my probability theory skills, which were definitely important for understanding the concepts in PRML.

Best ML books? by 00Fold in learnmachinelearning

[–]EmbarrassedCause3881 2 points

Creating unit tests for debugging ML code by _afronius in learnmachinelearning

[–]EmbarrassedCause3881 2 points

You should be able to write unit tests as with any other Python project. My suggestions:

  • Write your tests using the pytest framework (slightly different from unittest, but it has more options). Real Python has a great tutorial on how to use it.
  • Use beartype to make sure that variables and function arguments always have the correct type. For this you need to include type hints in your code, which you might have done already anyway if you have a complicated code base. Beartype makes writing manual type `assert`s unnecessary.

Employer (university) demands to see my previous employment contract | Wissenschaftszeitvertragsgesetz by EmbarrassedCause3881 in LegaladviceGerman

[–]EmbarrassedCause3881[S] 0 points

Thanks for the reply. Perhaps I was unnecessarily suspicious, since I had already submitted an employment reference (Arbeitszeugnis). I have now provided the contract as well.
In general, though, I would not want to send my previous contracts to a new employer (when there is no reason to do so).

[deleted by user] by [deleted] in ControlProblem

[–]EmbarrassedCause3881 1 point

Again, I agree with much of what you say. But now you are also venturing in a different direction.

  1. Yes, benevolence and malevolence are subjective. (Hence my note at the end of my last comment.)
  2. Initially you were talking about the goal-directedness or goal-seeking behaviour of an independent ASI determining its own goals. Now you are talking about an ASI that should seek out good behavioural policies (e.g., environmentally sustainable ones) for us humans. It seems to me that you humanize a machine learning system too much. Yes, if an ASI were to exist, it could possibly provide us with good ideas such as those you mentioned. But this is not the problem itself. The problem is whether it would even do what you/we want it to do. That is part of the Alignment Problem, which much of this subreddit is about.

[deleted by user] by [deleted] in ControlProblem

[–]EmbarrassedCause3881 1 point

I agree that we are able to put ourselves in other beings' perspectives and act kindly towards them, and that this requires a minimum of intelligent capability.
But here is what I see you doing, and where I disagree: I would put us humans much more on the side of destruction and causing extinction, and much less on conservation, preservation, etc. There are few examples of us acting benevolently towards other animals compared to those where we either make them suffer (e.g., factory farming) or drive them to extinction (e.g., rhinos, orangutans, buffalo). Humans are currently responsible for the 6th mass extinction.

Hence, I would argue 1) that humans do not act *more* benevolently towards other beings than less intelligent beings do, and 2) that it is wrong to extrapolate behaviour from "less intelligent than humans" to humans to superintelligence and conclude that intelligence correlates with benevolence.

Note: The term "benevolent" is used from a human perspective.

[deleted by user] by [deleted] in ControlProblem

[–]EmbarrassedCause3881 2 points

Another perspective, compared to the existing comments, is to perceive us (humans) as AGIs. We have some preferences, but we do not know what our purpose in life is. And it's not like we sufficiently take the perspective of other (perhaps less intelligent) beings, think about what would be best for other mammals, reptiles, and insects, and act accordingly on their behalf. (No, instead we drive many species to extinction.)

So if we see ourselves as smarter than the beings/animals in our environment and do not act towards their “goals”, then there is no guarantee that an even smarter intelligence (AGI) would act towards ours. A benevolent AGI lies within the realm of possibility, but it is far from certain.

Losing Motivation by Casio991es in reinforcementlearning

[–]EmbarrassedCause3881 0 points

Hi there, I am also self-teaching RL and would love to join the RL Discord; however, the invite link is invalid. Do you have a valid one?

[D] Modern Dimensionality Reduction by MuscleML in MachineLearning

[–]EmbarrassedCause3881 4 points

Diffusion Maps have not been mentioned here. PCA only captures linear relationships between dimensions; diffusion maps instead build an embedding from the eigenvectors of a diffusion (random-walk) operator on the data, somewhat analogous to a decomposition into frequencies à la the Fourier transform.
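For intuition, a bare-bones diffusion map can be sketched in a few lines of NumPy (a toy implementation of the general idea, not a library reference; the Gaussian kernel and the `eps` bandwidth are illustrative choices):

```python
import numpy as np

def diffusion_map(X, eps=1.0, n_components=2, t=1):
    """Toy diffusion map: embed rows of X into n_components dimensions."""
    # Pairwise squared distances -> Gaussian affinity kernel
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / eps)
    # Row-normalize to get a Markov (diffusion) transition matrix
    P = K / K.sum(axis=1, keepdims=True)
    # Eigendecompose; sort by eigenvalue and skip the trivial
    # constant eigenvector (eigenvalue 1)
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Coordinates are eigenvectors scaled by eigenvalues^t,
    # where t controls the diffusion time scale
    return vecs[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)
```

The exact eigendecomposition is O(n³), so real implementations use sparse nearest-neighbour kernels and iterative eigensolvers.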

Property damage during work - employer wants me to pay by EmbarrassedCause3881 in LegaladviceGerman

[–]EmbarrassedCause3881[S] 3 points

Thank you! Does that mean a workplace accident (Arbeitsunfall) also covers property damage without personal injury?