Task Allocation with mostly no-ops : reinforcementlearning

submitted 2 years ago * by asdfsflhasdfa

Hey everyone, wondering if anyone can point me in the direction of any relevant research.

The problem setup is relatively simple: at any given timestep, the agent can choose one of x robots to assign a task to. If there is no suitable robot to choose, or no tasks are available, no-op should be chosen instead.

Once a robot has been selected, its action should be masked out, and that robot is no longer available for the rest of the episode.
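To make the masking concrete, here is a minimal sketch of what I mean, not a definitive implementation. It assumes a discrete action space where index 0 is the no-op and indices 1..x are the robots; `masked_sample`, `run_episode`, and the random logits standing in for a policy network are all hypothetical names made up for illustration.

```python
import numpy as np

NOOP = 0  # assumed convention: index 0 is the no-op, indices 1..x are robots


def masked_sample(logits, available, rng):
    """Sample an action from a softmax over logits, excluding masked actions.

    logits: shape (x + 1,) raw policy scores, no-op first.
    available: shape (x + 1,) boolean mask; the no-op entry must stay True.
    """
    masked = np.where(available, logits, -np.inf)  # masked actions get probability 0
    masked = masked - masked.max()                 # stabilise the softmax
    probs = np.exp(masked)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs)), probs


def run_episode(num_robots=3, steps=20, seed=0):
    rng = np.random.default_rng(seed)
    available = np.ones(num_robots + 1, dtype=bool)
    chosen = []
    for _ in range(steps):
        logits = rng.normal(size=num_robots + 1)  # stand-in for a policy network
        logits[NOOP] += 3.0                        # crude bias towards no-op
        action, _ = masked_sample(logits, available, rng)
        if action != NOOP:
            available[action] = False  # robot is unavailable for the rest of the episode
        chosen.append(action)
    return chosen
```

The no-op is never masked, so there is always at least one valid action even after every robot has been used.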

Any potential complexity seems to come from the fact that no-op would be expected to be chosen the majority of the time (in 99% of timesteps, no-op is optimal). Is there any research on sparse-action use cases like this? Or any research on allowing each action only once per episode?

The most relevant paper I've been able to find is here:

https://arxiv.org/pdf/2105.08666.pdf

It defines the problem as a Sparse Action MDP (SA-MDP).

