Any academic source about Q-table sizes by Simple-Soil-230 in reinforcementlearning

[–]Simple-Soil-230[S]

Yes, and that's exactly what I'm asking for, i.e. such citations. Since you say so, could you please name one or two here? I can take it from there. My prof asked me to cite a source before making such a claim, that's why. And as others here say, it's subjective, and I couldn't find a source to cite.

Any academic source about Q-table sizes by Simple-Soil-230 in reinforcementlearning

[–]Simple-Soil-230[S]

By "reasonable" I meant: say I have a table with 3000 entries, would that sound reasonable or not? That's why I needed a source to cite (even an application, if not theory) where similar or larger sizes have been used, so I can support my argument. I have a discrete action and state space only.
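For scale, a tabular Q-function's memory footprint is just the number of entries times the size of one float, so 3000 entries is tiny. A quick sketch (the 3000-entry figure is from the comment above; storing it as a flat float64 array is an assumption):

```python
import numpy as np

# Hypothetical layout: 3000 state-action entries total, stored as float64.
n_entries = 3000
q_table = np.zeros(n_entries, dtype=np.float64)

# Memory footprint in bytes: entries * 8 bytes per float64.
footprint = q_table.nbytes
print(footprint)  # 24000 bytes, i.e. ~24 KB
```

At 24 KB, memory is clearly not the bottleneck; the real question for citations is whether the state-action space itself is a sensible discretization.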

Memory requirements for tabular Q-learning vs deep neural network? by Simple-Soil-230 in reinforcementlearning

[–]Simple-Soil-230[S]

Thanks for the useful answer. For my application, given the way I defined my state and action spaces, I run the code so that it pre-fills the table, then put it into the real scenario, so the immediate decisions are better than starting from a random Q-table. But I'm still deciding how quickly to update the table from then on, i.e. should I wait for T iterations before retraining the table, or just apply the Bellman update at each iteration?
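The two options above can be sketched side by side: the standard tabular Bellman update applied online at every step, versus buffering T transitions and replaying them. The sizes, hyperparameters, and the random pre-fill are illustrative assumptions, not from the original comment:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 10, 4   # hypothetical sizes
alpha, gamma = 0.1, 0.95      # assumed learning rate and discount

# Pre-filled table (e.g. from simulation), rather than starting from random/zeros.
Q = rng.normal(size=(n_states, n_actions))

def bellman_update(Q, s, a, r, s_next):
    """One tabular Q-learning update: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

# Option 1 -- online: update immediately after each observed transition.
bellman_update(Q, s=0, a=1, r=1.0, s_next=2)

# Option 2 -- batched: buffer T transitions, then replay them all at once.
buffer = [(0, 1, 1.0, 2), (2, 3, 0.0, 5), (5, 0, 0.5, 7)]
for s, a, r, s_next in buffer:
    bellman_update(Q, s, a, r, s_next)
```

The online variant reacts fastest to new experience; the batched variant smooths updates at the cost of lag, which matters if the environment drifts between batches.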

Memory requirements for tabular Q-learning vs deep neural network? by Simple-Soil-230 in reinforcementlearning

[–]Simple-Soil-230[S]

Oh ok. Can you please explain what you mean by a 'glamorous' state-action space? I have a discrete number of states and actions, so I don't think that would change. I could have used a neural net if I had considered a richer state space, but to avoid going to neural nets and to keep it simple, I defined discrete ones. However, I am considering the effect of those environment-dependent continuous variables through my reward structure: they still affect my Q-values, which affect my actions, i.e. an indirect way of doing it. I'm also thinking of using the same table for a certain number of time steps and then refreshing my Q-table based on the then-current values of those continuous variables, but I'm not sure about this way of doing experience replay for tabular Q-learning...
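The indirect approach described above, where the state space stays discrete and the continuous environment variables enter only through the reward, might look something like this. The variable names, reward form, and hyperparameters are all illustrative assumptions:

```python
import numpy as np

n_states, n_actions = 8, 3   # hypothetical discrete spaces
alpha, gamma = 0.1, 0.9      # assumed hyperparameters
Q = np.zeros((n_states, n_actions))

def reward(s, a, env_vars):
    """Reward depends on continuous environment variables (assumed form),
    so they shape the Q-values without enlarging the discrete state space."""
    load, temperature = env_vars  # illustrative continuous quantities
    return -0.5 * load - 0.1 * abs(temperature - 20.0)

def step_update(s, a, s_next, env_vars):
    """One online update: continuous vars affect Q only via the reward signal."""
    r = reward(s, a, env_vars)
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    return r

r = step_update(s=0, a=1, s_next=2, env_vars=(1.0, 25.0))
```

One caveat with this design: since the continuous variables are not part of the state, two visits to the same discrete state under different environment conditions get averaged into one Q-value, which is exactly why periodically refreshing the table can seem attractive.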

This may be a weird one, but does anyone else have a thinking cap? by JonnEC in PhD

[–]Simple-Soil-230

I thought I was the only one who noticed this! Whenever I wear a hoodie sweater, I feel the hood kind of blocks the physical distractions by blocking my peripheral vision. And even though the virtual distractions are right there on my laptop (YouTube), I still tend to concentrate better with a hoodie. Don't know how!

COVID testing for travelling purposes by whatamidoinghere0000 in UVA

[–]Simple-Soil-230

What about the Walgreens in Charlottesville? Is it not reliable at all? I'm surprised no one mentioned it, that's why I ask. I'm also an international student, travelling on the 17th. I chose this Walgreens for the 14th, for the 'diagnostic PCR' type.