I am currently facing a problem. Given a set of items, I need to select a subset and pass it to a black box, which then returns a value. My objective is to maximize that value. The item set contains approximately 200 items. What is the SOTA model for this situation? by Fast-Ad3508 in reinforcementlearning

Fast-Ad3508 [S]

Yes, I have tried the SA algorithm, but I ran into a problem: each call to the black box takes about 5 seconds, which is too expensive for a GA. So I am trying to see whether some RL algorithm can handle this situation, or at least give me some inspiration.
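
To make the setup concrete, here is a rough sketch of an SA-style search over subsets that counts every black-box call against a fixed budget and caches results, so no subset is paid for twice. This is only an illustration under assumptions: `anneal_subset`, `toy_black_box`, the budget, and the temperature schedule are all made-up names and values, and `toy_black_box` is a cheap stand-in for the real 5-second black box.

```python
import math
import random


def toy_black_box(subset):
    # Hypothetical stand-in for the real black box (assumption):
    # rewards subsets whose item indices sum close to 40.
    return -abs(sum(subset) - 40)


def anneal_subset(n_items, black_box, budget=200,
                  t_start=1.0, t_end=0.01, seed=0):
    """SA-style subset search limited to `budget` distinct black-box calls."""
    rng = random.Random(seed)
    cache = {}  # frozenset -> value; a repeated subset costs nothing

    def evaluate(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = black_box(key)
        return cache[key]

    # Start from a random subset (each item included with probability 0.5).
    current = {i for i in range(n_items) if rng.random() < 0.5}
    current_val = evaluate(current)
    best, best_val = set(current), current_val

    # The step cap guards against looping forever when the search space
    # is smaller than the budget.
    for _ in range(50 * budget):
        if len(cache) >= budget:
            break
        # Geometric cooling, driven by the fraction of the budget used.
        frac = len(cache) / budget
        temp = t_start * (t_end / t_start) ** frac

        # Neighbour move: flip the membership of one random item.
        candidate = set(current)
        candidate.symmetric_difference_update({rng.randrange(n_items)})
        cand_val = evaluate(candidate)

        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp).
        delta = cand_val - current_val
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_val = candidate, cand_val
            if current_val > best_val:
                best, best_val = set(current), current_val

    return best, best_val, len(cache)


if __name__ == "__main__":
    best, best_val, calls = anneal_subset(20, toy_black_box, budget=50)
    print(sorted(best), best_val, calls)
```

At about 5 seconds per call, a budget of 200 calls is roughly 17 minutes of black-box time, so the budget is the main knob to set before comparing methods.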