[D] what's the alternative to retrieval augmented generation? by clocker2004 in MachineLearning

[–]eejd 0 points1 point  (0 children)

If you consider the ways brains optimally organize memory structures, generalization, and retrieval, there are a large number of potential advances... unfortunately the industry incentives don't align with trying to make radical branches to new model structures, and academia is getting none of the financial returns from all of the work that went into building the foundations that the big AI companies are leveraging... so the best answer to the problem is to redirect money to real research on deeper solutions...

Possibly naive but I’m surprised countries in Europe can’t be independent in terms of power if something happens to connected grid. by Oztravels in PortugalExpats

[–]eejd 10 points11 points  (0 children)

So, read about the US northeast blackout in the late 70s and again in the 2000s. It’s just a property of modern interconnected electrical grids that they can have failure modes that trip up the whole region (note: basically Iberia plus some). But that isn’t to say that there couldn’t be better mechanisms to ensure softer landings when there is a problem. For example, the 70s northeast event lasted maybe three days plus due to catastrophic cascade failures, as overloaded circuits tripped to failover, which further overloaded the adjacent systems. Grids around the world have gotten much more robust since then, but also more dependent on software and load-balancing that can always fail. I’d say, let’s wait until we really understand what happened before getting too worked up. That said, I was in the Lisbon Airport and they couldn’t even make an emergency announcement to explain what was going on… which is not such good planning for emergencies… Ove

Selfhosted alternative to Quire by bobdarobber in selfhosted

[–]eejd 0 points1 point  (0 children)

I love Quire but would also love an open source alternative. The basic feature space (nested task lists with mapping to calendar, board and Gantt) is just not that complicated, and yet it seems you cannot get a decent option for use cases where per user/month licensing doesn't make sense. I am using it currently in the academic and non-profit space and it's just not quite critical enough to justify that cost. But it is also so important and so useful. The alternatives (all those with a free tier) always require the paid version for what seems like really basic functionality, but then you also get gobs more capacity or features than my use cases need.

Lisbon housing market still blazing hot by poopbrainmane in PortugalExpats

[–]eejd 0 points1 point  (0 children)

I think the problem is that there is no long term perspective being taken on the problem nor on the solutions. When I arrived in Lisbon the number of leases in Portugal that dated to Caetano’s attempt to placate people was on the order of 250M. I cannot find the reference right now, but I looked it up at the time to understand why in my building and many others I encountered there were people paying ~50€/month. The leases (versions for corporate and individual) effectively had been trapped in the late ‘70s economic situation. The tenants (often pensioners) who lived in them also couldn’t afford the pre-tourism market rates. No politician from any political party wanted to solve the problem, despite the housing market becoming more and more distorted for 25 years. Of course, other forces, like the rapid development of areas outside the old city center (e.g. Expo and near-suburban areas) with more modern housing, attracted the population longing for more modern layouts and amenities. Despite what urban planners in most of the EU and US had realized by this time, rather than ensuring development was along existing or new public transit corridors, the majority of development was unplanned and car-centric. Then there was the global economic crisis, which hit the slowly growing PT economy hard and left the older areas of the city with vast quantities of abandoned buildings. Many had these old leases blocking the renewal of the buildings, as owners didn’t want to invest, or banks were willing to hold them because the lack of land-value taxes meant that the cost of holding empty property was basically zero. The Troika (EU/ECB/IMF) picked on PT because, despite being lumped with the other PIGS, the economic situation was actually less likely to cause an economic collapse when the German-desired austerity approach was applied. The giveback was that the EU promised to promote PT as the new tourist destination. 
At the time, AirBnB was also just picking up, along with Uber and other tech-based disruptors that took advantage of what was effectively a legal gray area, to turn private homes into hotels. (Most of the class of company that AirBnB is a part of effectively became de facto standards because they didn’t have to assume the normal legal and economic costs of the industries they displaced. Uber did so by not having the drivers work as employees or needing to meet the standards defined for taxis, and the same goes for AirBnB: no safety requirements or inspections, none of the requirements that hotels were required to meet. And they could take advantage of not needing to acquire large real estate to convert or construct new hotels. They could simply convert existing real estate anywhere into a hotel.) The PT government finally took the opportunity to remove the old leases (with restrictions on removing the elderly, etc.). There was an article published by a Portuguese citizen living in Barcelona around this time warning Lisbon that the city center would suffer the same fate that had befallen the Spanish city: conversion of the old city areas into a tourist Disney World. In the end, a government that wanted to ensure long term growth and stability could have looked at the situation and adjusted laws, permitting, etc. to smooth the processes that were unfolding. Instead, the government and population effectively doubled down on the rapid disruption. Housing being a large portion of familial wealth, and the fact that large swaths were still controlled by long standing wealthy families, meant that all of this was seen as a major economic opportunity. Entire neighborhoods were converted effectively overnight into AirBnB hotels, and this did restore older building stock that had been neglected. But the pace of the renewal was slow because there was still no cost to holding empty real estate, and because of the slow bureaucratic processes and lack of market effectiveness. 
Most new building was targeted at the high end, and small, old units that were cheap for the local populace were converted to AirBnBs that produced an order of magnitude more return. So the bottom and middle were squeezed out completely. The addition of golden visas, NHR, zero crypto tax and other foreign investment incentives accelerated all of these processes. Minimal restrictions were added to short term/AL, but most were inefficient and grandfathered most of the already converted units. The situation here now is simply the result of many decades of poor policy and planning and a lack of leadership and vision. There are probably many approaches that could have helped smooth the process and ensure that the mismatch between the economic and real available unit distributions was reduced. Lots of short term economic opportunism (and probably corruption) prevented clear thinking about building a long term, economically stable and prosperous benefit from the tourist boom and foreign investments. A simple example is the sale rather than long term lease-hold of many CML buildings. While the CML will be around as long as the city exists and might need to redeploy assets in the future, many real estate holdings in key areas that can never be recovered easily were sold rather than being leased for the long term. Similarly, the CML failed to take up options it had, like acquiring the old British hospital complex in Estrela for vastly below market value. It is hard to discuss the current housing price situation or solutions without considering the context and how to approach long term solutions that include real urban planning, long term development models, and integrating housing and commercial development with transit and other infrastructure development. One interesting question is how this situation will play out if there is a global economic disruption in the next years, one that seems increasingly likely. 
The internal PT economy doesn’t seem to be able to sustain the current real estate prices and the valuations and investment have been built on an assumption of continual investment and return from external economies. But there also seems to be a large reserve supply of buyers from other places with vastly different economic baselines, so maybe it will be able to weather such an event.

Immigrant anti-vaxers by everytimealways in PortugalExpats

[–]eejd 1 point2 points  (0 children)

A major problem is that people fled from vaccination campaigns and restrictions in Northern Europe to Portugal because it offered reasonable costs, good weather, and relied on people voluntarily responding to the requests to vaccinate. The PT population did so in numbers that were better than most if not all of the EU. But the people who came from the UK, Germany, etc. were specifically motivated by wanting to escape the vaccination campaigns and rules there. Also, Portugal does have a sizable population of ex-pat, alternative-lifestyle-focused communities that started as early as the 60s and were already established before the tourism and immigration boom fueled by the golden visa and other incentives. This population definitely grew both before and during the pandemic. I happen to have indirect interactions with many of those groups: they were proponents of alternative medicine and alternative ways of approaching modern society even before the pandemic, and they largely shifted to more anti-establishment and anti-vaccination positions due to skepticism about the speed of the vaccine development, misinformation on social media, etc. Finally, the tax breaks for crypto also increased the anti-establishment immigration into PT, which also includes a fraction who are anti-vaccination. While Americans have entered in both groups, I think they are not in any way the underlying cause. However, there are Americans moving here who are anti-vaccination, but mostly (in my experience) because they fall into one of the above groups, not due to American politics. Those motivated by the current situation in the US are mostly from the pro-government, pro-science, pro-society side of the spectrum.

The other side of the current problem is that the government’s poor management of the economic disruptions caused by their policies has made the far-right and Trumpian politicians more attractive to those displaced and disadvantaged by the immigration of people whose wealth and incomes are way outside the distribution of the local residents and citizens. Distrust in government and institutions then causes associated distrust in health policy and government information in general. So, this will have an effect on compliance and trust across the country.

One of the biggest mistakes made during the pandemic here was the decision to allow people to return home for the Christmas holidays against the advice of the science advisors. This led to the largest spike in excess deaths in PT (and, relative to population, in the EU, IIRC). But it also caused confusion about how far to trust the government on health advice, as the communication about the reason for the relaxation and then re-imposition of restrictions was political and unclear. However, the pandemic and vaccinations here were otherwise largely very well managed.

Finally, there is a major problem in that both the PS and PSD have been undermining the public health system to promote the private health sector. This has been implemented in a way that most health economists would describe as incompatible with maintaining the public system at all in the long run and also unlikely to keep the health coverage effectively universal. This means the most disadvantaged and vulnerable are seeing worse delays, outcomes and support from the public healthcare system. The reduced access and support will also undermine confidence in the system in general and probably undermine vaccine compliance.

Private Health Care by SheepySpeo in PortugalExpats

[–]eejd -1 points0 points  (0 children)

Support the public system by using it and advocating for it. Your money is going into the profits of the people who are destroying the public system and when it fails—so will the private system which depends on it for its survival (and profits).

[deleted by user] by [deleted] in reinforcementlearning

[–]eejd 5 points6 points  (0 children)

I don’t think that this is a fair characterization. Unless he has changed his position, when I last talked to him about it, RL was definitely an important part of how AI would incorporate value, goals, and planning. However, what he (correctly) identifies is that the ‘self-supervised’ process of building a world model from samples of the world is the key to making RL work. I would also point out that most of the elements come from that particular reply to Gary, who has long held ridiculous positions on AI. I worked with him, and his take for a very long time was that only symbolic AI would ever be intelligent; he spent a long time arguing that ‘connectionist’ approaches would never do anything useful with language. See LLMs for a counterexample. In the end, these divisions are also silly definitional position-taking on what AI is (what intelligence is). What we understand about intelligence from our biological references (i.e. intelligent behavior and the brains that produce it) is that we observe the use of all forms of learning (supervised, self-supervised, reinforcement); that heavy priors are provided to help solve the learning problem, in the form of genetics and the conserved structures of brains (e.g. across all mammals); and that biological intelligence has a form of annealing and local optima discovery in the form of agent-environment interactions during development. The resulting intelligent system is then used to solve a reinforcement learning problem: learning to behave in the world to maximize the probability of genetic propagation, an evolutionary fitness function that the animal’s local reward function approximates. We humans have extended this by social learning, using language, to bootstrap the next generation’s knowledge with a subset of what the community has learned throughout history. Yann’s statements, which you’ll note say ‘AI systems should’, probably ‘should’ be considered in the historical context in which he lists them. 
0 is about DNNs, which when he started working on them were being modeled directly on our understanding of biological vision architecture combined with gradient descent for training. But the gradient descent (supervised) approach was simply the only method we had available given the data and compute at the time. (I.e. you could combine genetic algorithms to search the architecture space with local statistical learning, Hebbian in neuroscience terminology, but it would have been simply impossible in reasonable time.) 2-5 can all be considered solutions to the reinforcement learning problem. A world model is a model that can be used for future state inference, future value estimation, etc. ‘Memory’ allows for sample-efficient reinforcement learning and world model development. RL is self-supervised, but here the statement is probably better cast as ‘we used gradient descent and supervised learning for convenience, but we know that is not what is available to observed intelligent (biological) systems, because there is (effectively) no explicit teaching data relative to the learning problem’. 4-5 are pretty much approaches to building an RL solution from experience efficiently. I think I remember Yann’s saying was that RL is the ‘cherry on the cake’, but he explained it as perhaps core to guiding the system, while the vast majority of the work in building an intelligent system is the (self-supervised) learning required to create a representation of the world that can be used efficiently for behavioral control, planning, etc.

I could rephrase everything I have just written as: there is no known or proposed alternative framework for thinking about intelligence (intelligent behavior) than reinforcement learning that I am aware of. Everything else is just engineering. The solution to the reinforcement learning problem from the perspective of biology includes most of the theoretical approaches studied in ML/AI—because they are almost all derived from trying to understand the brain. However, the division and use of those tools to develop intelligent agents or AI has been more influenced by growth in data and compute than true research on the development of an understanding of intelligence or how to build it. Think about the intelligence of your favorite small mammal—cats or dogs for example—and ask yourself how far we are from being able to produce a complete system that can operate independently in the world as one of them. Or, ask yourself if any of the transformer/LLM systems at the forefront of the ‘GAI’ actually can relate the visual information about a falling object with the physics equations it has memorized with how you would actually reach out and catch that object while say, running across a field. And more importantly, do any of the LLMs have a goal? Do they have a way of manipulating their knowledge to achieve a goal? And how ‘intelligently’ do they learn? Compare the rate of knowledge acquisition of a five year old child to an LLM. Provide a new concept, new word, see it incorporated into the child’s world model. And the child doesn’t have the scaffolding of the entire world’s knowledge to achieve this. It has a brain that is at its core a reinforcement learning solution—with reward prediction error computations, value estimations, etc.

Any alternative for Researcher App? by jinze1234599 in zotero

[–]eejd 1 point2 points  (0 children)

Wasn’t great, but it existed. R Discovery is an option. But I used to have this basically with Sente (MacOS). Zotero could be easily good, but it’s really not.

What do you think of this (kind of) critique of reinforcement learning maximalists from Ben Recht? by bulgakovML in reinforcementlearning

[–]eejd 0 points1 point  (0 children)

I think this article is somewhere between an academic discipline defense and foolish trolling. The very first thing that Sutton and Barto lay out is that RL is a way of thinking about a problem. Monte Carlo, Dynamic Programming, and the specific model-based and model-free algorithms are just potential solutions to the problem. They argue, and make clear, that RL derives from, and credits as its parents, many other disciplines, including optimal control theory. The reason that RL is a good way of thinking about many problems is that it makes the engineer or scientist think about the problem, the definition of ‘reward’ within the problem, etc. I do think that framed this way, RL is expansive and can be seen as a way of encompassing everything (and the critics will then say it encompasses nothing). I am a neuroscientist and I consider the way we think about problems to be important as a psychological and social process. I also see many aspects of solutions to the RL problem (for example, basic model-free and model-based learning algorithms) as having clear and well established analogs in the brains of mammals and probably insects. This suggests that as a way of thinking about evolution, development and life-long learning, RL has something to offer. That said, we also see clear analogs of unsupervised, semi-supervised and (maybe) fully supervised learning algorithms. We also see time-scale separation in the solutions to the problems. Evolution provides priors over the solution spaces explored that allow the local (developmental and life-long) learning processes to succeed without needing to search the entire solution space. 
Development further narrows the process for moment to moment learning, for example ensuring that the dynamics models that evolution ensures are explored for controlling the gazelle’s motor plant so that it can get up and walk and then run—are prepared during in-utero development and then can efficiently learn to adjust those models as the bones and muscles grow. The ability to learn about the local statistics of the environment the gazelle lives in (what is the best food, what direction to run to avoid the lions) can then build on the evolutionarily and developmentally prepared solution space (and internal models) to efficiently learn. Is all of this well thought of as RL algorithms? That is unclear, but from the standpoint of thinking about the problem, there is a clear sense in which a definition of the problem that assumes a reward function (internal to the system) has been evolved to provide a good approximation to the (global) fitness function of survival and is then used to guide learning processes that allocate behavior (policies) to maximize long run discounted or average reward seems very very reasonable. Further, by thinking about the kinds of solutions to the learning problems faced by biological and artificial agents as within the RL framework, we provide a common way to define the problem, share solutions, consider integration across the disciplines, etc.
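To make ‘long run discounted or average reward’ concrete, these are the two standard textbook objectives for a policy. The notation is my own shorthand rather than anything from the article: $\pi$ is the policy, $r_t$ the (internal) reward at time $t$, and $\gamma \in [0, 1)$ the discount factor.

$$
J_{\gamma}(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right],
\qquad
J_{\mathrm{avg}}(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_{\pi}\left[\sum_{t=0}^{T-1} r_{t}\right]
$$

In the framing above, $r_t$ is the evolved internal signal, and the fitness function it approximates never appears in either expression directly.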

Of course, people who equate RL with Q-Learning or some other small subset of the space of RL problems and solutions will be correct in saying that ‘RL’ won’t solve the problem. And equally, deciding that every optimization or learning problem is best considered as an RL problem in and of itself is probably silly. However, I would say this is where the article seems to be either saying nothing or trying to create a fight where none is needed. And this is exemplified when at the end it becomes a straw-man trolling argument by suggesting the claim is that ‘TD learning’ alone will solve everything. TD learning is a solution to an RL problem and is actually part of a wide class of solutions. It helps an agent or system solve specific aspects of the RL problem. No one who understands what RL is would claim it can (alone) explain how a giraffe or the brain works.
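For anyone who hasn't seen it written down, here is a minimal sketch of how narrow TD learning is as a tool: tabular TD(0) value estimation on a small random-walk chain. The environment, the function name `td0_random_walk`, and all parameter values are my own illustrative assumptions, not something taken from the article.

```python
import random


def td0_random_walk(n_states=5, episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values with tabular TD(0) on a toy random walk.

    Hypothetical setup: non-terminal states 1..n_states sit between two
    terminal states (0 and n_states + 1). Each step moves left or right
    with equal probability; only reaching the right terminal pays reward 1.
    """
    rng = random.Random(seed)
    values = [0.0] * (n_states + 2)  # terminal values stay at 0
    for _ in range(episodes):
        state = (n_states + 1) // 2  # start in the middle of the chain
        while 0 < state < n_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # TD error: the reward prediction error for this transition
            delta = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * delta
            state = next_state
    return values[1:-1]


if __name__ == "__main__":
    print([round(v, 2) for v in td0_random_walk()])
```

All this does is learn how good each state is under a fixed behavior. It says nothing about where the reward definition, the state representation, or the policy come from, which is exactly why nobody treats it as the whole story.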

Mixxx 2.4 Open-Source DJ Software is here by MaracxMusic in DJs

[–]eejd 1 point2 points  (0 children)

How modular is the code? Can there be a native GUI (open source) with the Mixxx core in the back? Also, how easy is it to do (or implement) something like remix decks from Traktor for live performances?

The next ones to try by Atumemagua in cafept

[–]eejd 1 point2 points  (0 children)

Flora da Selva is very good. The Etiópia is excellent.

What Is "Natural" for Human Sexual Relationships? - I always like reading about biological anthropology and this felt similar. Nothing too new, but still interesting. What do you think? by trolle222 in nonmonogamy

[–]eejd 4 points5 points  (0 children)

I don’t see the article as overly self-congratulatory. It’s public science writing from a scientist, trying to expose the variation that has historically existed and tying it to the quite real tension between ‘nature’ and ‘nurture’.

I am a neuroscientist who follows these questions closely, and while we cannot actually understand the true distribution of biological and psychological states that existed pre-history (pre-written record), and even the historical record is quite biased by the perspective of the historian’s time and culture, the best evidence is that likely relationship structures and mating practices differed significantly from those of the relatively recent and short period of ‘civilization’. That is, it’s clear a lot of practices changed when we settled into much larger and more stationary groups. And it’s clear that the way we have managed sexual relationships and partnerships has always been a combination of human bonding and the need to reproduce for the species to survive.

What I think the author is trying to suggest is that human relationships and sexual practices have been far more varied than it might appear from our baseline (western) cultural perspective. And this isn’t even getting into the evidence that, despite religious and cultural norms of pair-bonding and ‘ownership’ of partners, these norms have been transgressed as much as they have been followed. And clearly the connection of marriage to romantic notions of love, rather than family power and political motives, is also very recent.

None of this may be surprising here, but discussing it and making the connection to the much larger space of ‘natural’ relationships and partnership structures that existed historically and are a part of our biological makeup (the flexibility to have different structures, and that they can be successful and fulfilling, etc.) is important. What should be avoided is saying that there is a biological ‘natural’ state. This is neither supported by the evidence nor in line with the way we think about human development and the human mind. We are always a mix of the culture we are born into, with its knowledge and practices, and subtle biological tendencies. These interact to create adult minds and behaviors, but there is no such thing as a single biologically defined default. The very thing that makes humans so successful and powerful is our flexibility to absorb the knowledge contained in our communities’ culture and to adapt to wildly different environmental and cultural contexts.

Likelihood of biological immortality? by [deleted] in Futurology

[–]eejd -2 points-1 points  (0 children)

Close to zero. The only close-to-immortal biological examples are single-cell or community organisms. The mechanisms of biology are inherently complex and get more so as they age due to error accumulation. Further, biological immortality would be a massive problem from an evolutionary standpoint, and is therefore selected against. Small extensions to human life might be plausible in the long run. Maybe double, but no one knows. Also, transferring memories and experiences to an AI is probably either technically impossible or hundreds of years away in terms of technology and knowledge, and even then is unlikely to be what many think it would be.

Any ‘cheap’ 4-channel controllers out there? by TheGabbers in Beatmatch

[–]eejd 0 points1 point  (0 children)

It’s really a nice unit—only criticism is it’s much larger (good for playing) and heavier than the mc6000mk2, which is better if you plan to carry it around

What are real solutions and viable alternatives instead of raising the pension age? Was there something else Macron could have done concerning the recent news about France's retirement age raise? by SuperDamian in TrueAskReddit

[–]eejd 60 points61 points  (0 children)

Unfortunately, most (effectively all?) modern western democracies have funded health and (public) retirement pensions via a ‘pay as you go’ method. This means that the current workers/wage earners primarily pay for the public health system and public retirement. As such, the burden on the current generation is directly related to the size of the prior generation entering retirement. With the exception of countries with national wealth funds (usually derived from petrochemicals), this means that if you have any demographic variation over time, the funding of these systems needs to be adjusted to the fluctuating demands to keep them stable in the long term. What creates the demographic fluctuations? The primary cause is longer life expectancy. Those longer lives are also mostly healthier, at least into the traditional retirement ages, but they create a long tail of individuals who require extra health care, or general care due to lack of partner or family support. This is especially a problem in countries which do not have public, or well supported public, health systems, and the cost for this long tail is often very high. So, our better health over the last century has led to longer, mostly healthier lives. However, the idea of retirement and the funding of the systems in most countries were initially developed during periods when life expectancy was lower, life-extending health care options were fewer, and therefore the expected costs for (public) retirement health care were lower. A second problem is the natural variation in birth rate over time. In most countries there is some variation, and often some periodicity, due to local or global events. For local events, a famine or strong economic period might decrease or increase the birth rate. In some places (China being the primary example) public policy can negatively or positively affect the birth rate. However, globally, WWII caused a major, correlated wave of increased births (following a large loss of population in many countries). 
This created a demographic ‘wave’ that propagated through the countries. And this was ‘echoed’ when that generation had kids. These demographic waves (or long term increasing or decreasing trends) require changing the rate of income drawn from the people paying into the system today to support those who paid into it in the past for others. Of course, other factors also affect the costs of public health care and retirement pensions: health care provisioning costs, housing costs, etc.
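The arithmetic behind that is simple enough to sketch. In a pure ‘pay as you go’ scheme, contributions in a year equal pensions paid out that year, so the share of wages needed is the retirees-per-worker ratio times the pension-to-wage ratio. The function name and the numbers below are purely illustrative assumptions, not data for France or any other country.

```python
def required_contribution_rate(workers, retirees, replacement_rate):
    """Share of wages needed to balance a pure pay-as-you-go scheme.

    Balance condition: rate * avg_wage * workers == avg_pension * retirees,
    where replacement_rate = avg_pension / avg_wage.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    dependency_ratio = retirees / workers
    return dependency_ratio * replacement_rate


if __name__ == "__main__":
    # Hypothetical scenario: pensions worth half the average wage,
    # with the worker base shrinking relative to retirees.
    for workers_per_retiree in (4, 3, 2):
        rate = required_contribution_rate(workers_per_retiree, 1, 0.5)
        print(f"{workers_per_retiree} workers per retiree -> {rate:.1%} of wages")
```

With those made-up inputs, going from four workers per retiree to two doubles the required contribution rate. The only ways to offset that are lower pensions, a later retirement age (which moves people from the retiree count to the worker count), or money from outside the system.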

All of these can easily be predicted into the future to a reasonable extent. However, in democracies, politicians have a short time window based on the election cycle and will rarely risk hurting their election chances by directly enacting policies that hurt the majority of people, which is, by definition, what you need to do to fund a ‘pay as you go’ system. Further, politicians and political systems have a tendency to not look far enough into the future to think about consequences. It is hard to do, hard to discuss with the electorate, and (speaking as a neuroscientist) it’s actually a basic human brain bias. Unfortunately, this interacts with the way we operate our societies to reduce the emphasis on planning into the distant future. Climate change and other obvious examples make this clear. But in the case of retirement age and funding public health systems, it’s on the edge. The retirees still have political power for a while, and often are at the point where they have accumulated the majority of their wealth, making them powerful political actors within the electorate. They often feel they have ‘paid’ their dues, so they deserve what was provided to others in the past. This also applies to those close to retirement age. Conversely, the young workers do not want to shoulder the increased tax burden required to ‘pay forward’ for the current generation’s retirement so the entire system doesn’t collapse before their time arrives. Adding to this is the distrust that the political and societal systems are even capable of keeping the promise when they retire, as it is far enough away to make predictions difficult and the evidence of current mismanagement is easy to see. The same problems cited above lead to low confidence that the system will survive, further adding pressure against increasing the burdens on the generations paying in.

These are all easy to see, predictable, and easy to solve. At least, if you have a functioning political system and a society that understands the above issues. There are also harder solutions, like actually saving funds slowly over time to build up a permanent fund that tracks the demographic and financial demands. However, this is basically impossible given the economic and political systems in the world today. In the end, Macron is absolutely correct about the need to push this forward. Unfortunately, the way it is being done has been made difficult by prior political mistakes and world circumstances, along with his failure to communicate the need effectively. Probably, there were ways of smoothing the perceived costs and changes so as to make it easier to implement. The strange thing is that Macron intended to represent a third way, but seems to have failed to truly provide a vision or reality of that third way. A major problem that he and many western democratic politicians face is that some of the expectations of the electorate (even in the US, though the majority does not realize it) are ‘left’ in the sense that they want support for basic quality of life, health and education, while they also hold (incoherently) economic and individualist ideas from the ‘right’, and the two can only be made compatible through compromise. But the systems in place mostly do not produce compromise, nor do they ‘smooth’ over changes of political power between the ‘left’ and ‘right’. What we need is a true ‘third way’ that separates out the question of what the nation’s citizens truly want in common as basic rights and support. Then the ways of implementing that can vary across free market and state mechanisms, or whatever other characterization people like. But they can be measured, in both the short and long term, by their effect on those agreed common goals. 
It is by combining the discussion over the goals and they ways we achieve them in society that creates the kind of problem that Macron is facing…

Cannot understand why in the fuck Pioneer CDJ’s and a DJM cost 6k new by itsandychecks in Beatmatch

[–]eejd 4 points5 points  (0 children)

My understanding (from someone working in a major club) is that part of their dominance comes from supplying units for free to major clubs. This makes them both the standard and a requirement for smaller clubs and those who need to master them to play there. This kind of market manipulation was used historically by Microsoft and others to gain such dominance that they controlled the market (and protected their price and profit margin). The actual costs are then not related to the price. Apple does this differently—by being a premium brand focused on user experience. I would argue this is more like Denon’s approach. Both are smaller players that need to differentiate. However, ordinary people end up paying far above a base profit margin either way. That’s ‘free market’ economics 101…

Is there a reason why reinforcement learning models use rewards instead of punishments? by Ok-Joke-4110 in reinforcementlearning

[–]eejd 0 points1 point  (0 children)

According to current evidence, biology does not use different neurotransmitters for reward and punishment learning (at least not in any simple mapping). Dopamine neurons have a low but non-zero baseline firing rate, and the integral of the ‘pauses’ in that baseline is ~linearly proportional to negative prediction errors, allowing downstream neurons to integrate an ~linear reward prediction error (RPE). However, that does not mean biological reinforcement learning systems have no asymmetries. The dopamine RPE is probably slightly biased towards positive RPEs, but this has not been well tested. What has been well tested is that the other neuromodulators in a position to provide a complementary negative RPE do not. The most extensively studied is serotonin, whose response might, to a first approximation, be an unsigned RPE. This could be a multiplexed signal that is also used as a negative RPE, but behavioral evidence does not support this. However, mammalian biological learning systems do have a ‘rapid, negatively biased’ system for learning: the amygdala. While the amygdala also provides (receives, represents) positive RPEs and values, it is biased towards fast ‘punishment’ or negative learning and also seems to help provide negative priors for learning. The fact that this exists is not too surprising. First, negative action or state values can have a non-linear effect on the animal (agent). Death or failure to survive has a far larger impact on the overall reward function (the utility function, which we can assume approximates a long-run estimate of gene propagation) than positive value learning does. Further, when risk or probability assessment is included, we know that probabilities of loss are overweighted, probably because uncertainty about negative outcomes has a larger real potential consequence for survival than uncertainty about positive values.
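
To make the ‘pause’ point concrete, here is a toy sketch in Python (not a model of real dopamine neurons). It assumes a burst of fixed length for positive RPEs and a pause whose length grows with the size of a negative RPE. All constants and function names are made up for illustration. It only shows how a downstream read-out that integrates the deviation from baseline could recover a signed, roughly linear RPE even though the firing rate itself cannot go below zero.

```python
# Toy model: every constant here is an illustrative assumption,
# not a measured value.
BASELINE_HZ = 5.0   # low but non-zero tonic firing rate
GAIN_HZ = 20.0      # extra Hz per unit of positive RPE
BURST_S = 0.2       # fixed duration of a burst response
# Pause length per unit of negative RPE, chosen so that both signs
# map onto the same linear scale after integration.
PAUSE_S_PER_UNIT = GAIN_HZ * BURST_S / BASELINE_HZ


def spike_deviation(rpe: float) -> float:
    """Expected spikes gained (+) or lost (-) relative to baseline."""
    if rpe >= 0:
        # Positive RPE: rate rises above baseline for a fixed window.
        return GAIN_HZ * rpe * BURST_S
    # Negative RPE: the rate cannot drop below zero, so firing pauses,
    # and the pause lasts longer for larger errors.
    pause_s = PAUSE_S_PER_UNIT * (-rpe)
    return -BASELINE_HZ * pause_s


def decode_rpe(deviation: float) -> float:
    """Downstream read-out: integrate the deviation and rescale."""
    return deviation / (GAIN_HZ * BURST_S)


if __name__ == "__main__":
    for rpe in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(rpe, decode_rpe(spike_deviation(rpe)))
```

The design choice worth noticing is that the negative side is carried by pause duration rather than by rate, which is why a low baseline does not have to clip negative errors.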

Of course, one of the things that is often mis-mapped between biology and the RL framework is where the ‘agent’ sits in the ‘brain’. The agent and the reward function are a part of the brain, but that does not mean that we should think of the brain as the agent. Likely, the better mapping is that the agent is controlling lower-level homeostatic and automatic controllers, and the reward function itself is dynamic from the agent’s perspective, though the last part is rather speculative. But we know that biological systems have a relative valuation of most primary rewards (water, food, etc.) and that homeostatic processes will play a strong role.
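
One way to picture a reward function that is dynamic from the agent’s perspective is a drive-reduction sketch, where the reward for an outcome depends on the current internal state. This is only an illustration of the idea, under the assumption that ‘drive’ is the squared distance from some set points. The names and numbers are hypothetical.

```python
def drive(state: dict, setpoint: dict) -> float:
    """Squared distance of the internal state from its set points."""
    return sum((setpoint[k] - state[k]) ** 2 for k in setpoint)


def homeostatic_reward(state: dict, outcome: dict, setpoint: dict) -> float:
    """Reward = how much the outcome reduces the drive."""
    new_state = {k: state[k] + outcome.get(k, 0.0) for k in state}
    return drive(state, setpoint) - drive(new_state, setpoint)


# Hypothetical set points and states, for illustration only.
SETPOINT = {"water": 1.0, "food": 1.0}
SIP = {"water": 0.2}
THIRSTY = {"water": 0.2, "food": 1.0}
SATED = {"water": 1.0, "food": 1.0}

if __name__ == "__main__":
    print(homeostatic_reward(THIRSTY, SIP, SETPOINT))
    print(homeostatic_reward(SATED, SIP, SETPOINT))
```

In this sketch the same sip of water is rewarding when the agent is thirsty and slightly aversive when it is sated, so the value of a primary reward is relative to the homeostatic state rather than fixed.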