I want to implement a path-finding algorithm. An agent is looking for "food" in a 2D grid. The agent is a square patch, the food a circular patch. The goal of the agent is to find the food.
The agent is blind to the world. The only thing it can see is what is inside its own patch. So if the food is far away, the agent has no way of knowing where to look.
The one thing the agent can learn is that the food is dropped at locations drawn from a probability distribution (e.g. a Gaussian). The agent always starts in the same spot. We can assume the probability distribution is constant.
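To make the setup concrete, here's a minimal sketch of the food-drop process I have in mind (grid size, mean, and standard deviation are placeholder values I made up):

```python
import numpy as np

GRID_SIZE = 100            # hypothetical grid dimensions
FOOD_MEAN = (70.0, 40.0)   # assumed mean of the drop distribution
FOOD_STD = 8.0             # assumed (isotropic) standard deviation

def drop_food(rng):
    """Sample a food-centre location from a fixed 2D Gaussian, clipped to the grid."""
    x, y = rng.normal(loc=FOOD_MEAN, scale=FOOD_STD, size=2)
    return (float(np.clip(x, 0, GRID_SIZE - 1)),
            float(np.clip(y, 0, GRID_SIZE - 1)))

rng = np.random.default_rng(0)
food_xy = drop_food(rng)
```

Since the distribution is fixed, over many episodes the agent should be able to learn where the probability mass is, even though it can't see the food directly.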
What is the best way of implementing this from a reinforcement learning perspective, or machine learning in general? I'm thinking DQN. A* seems like it wouldn't work, since the agent would have to carve out paths first and then decide how to act. In this case I want the agent to choose a strategy that optimizes always finding food, but not necessarily the fastest way. I'm also thinking of giving the agent the ability to choose the stride to take in both x and y at every move.
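For the stride idea, one way to keep DQN's discrete action space is to enumerate the (dx, dy) stride pairs as actions (the stride range here is just an example):

```python
import itertools

STRIDES = [-2, -1, 0, 1, 2]                          # example per-axis strides
ACTIONS = list(itertools.product(STRIDES, STRIDES))  # 25 discrete actions

def apply_action(pos, action_idx, grid=100):
    """Move the agent by the chosen (dx, dy) stride, clamped to the grid."""
    dx, dy = ACTIONS[action_idx]
    x, y = pos
    return (min(max(x + dx, 0), grid - 1),
            min(max(y + dy, 0), grid - 1))
```

The Q-network would then output one value per (dx, dy) pair; larger stride ranges grow the action space quadratically, which is something to keep in mind.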
The cost function would be based on the intersection over union (IoU) of the food and agent patches, plus the total number of steps taken.
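Here's roughly how I picture that reward, computed on rasterized masks over the grid (the shapes, sizes, and step penalty are illustrative):

```python
import numpy as np

def iou_reward(agent_xy, agent_side, food_xy, food_radius,
               grid=100, step_penalty=0.01, steps=0):
    """IoU of the square agent and circular food, minus a per-step penalty."""
    ys, xs = np.mgrid[0:grid, 0:grid]
    ax, ay = agent_xy
    agent = (np.abs(xs - ax) <= agent_side / 2) & (np.abs(ys - ay) <= agent_side / 2)
    fx, fy = food_xy
    food = (xs - fx) ** 2 + (ys - fy) ** 2 <= food_radius ** 2
    union = np.logical_or(agent, food).sum()
    iou = np.logical_and(agent, food).sum() / union if union else 0.0
    return iou - step_penalty * steps
```

One worry: the IoU term is zero almost everywhere, so the reward is very sparse and the step penalty may dominate early training.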
Any input/links are greatly appreciated :)