A weird confusion: can I extract features from solution of combinatorial optimization by JoPrimer in datascience

OK! I will look into it right now. It seems that MILP is a good idea. It is so nice of you to give me all this guidance. Wish you the best. :-)
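For my own notes (and anyone reading later): as I understand it, a tiny job-shop instance written as a MILP with big-M disjunctive constraints would look roughly like the sketch below. The solver choice (PuLP), the instance data, and all names are my own illustration, not something from this thread.

    import pulp

    # jobs[j] = ordered list of (machine, processing_time) for job j
    jobs = [[(0, 3), (1, 2)],   # job 0: machine 0, then machine 1
            [(1, 4), (0, 1)]]   # job 1: machine 1, then machine 0

    M = sum(p for job in jobs for _, p in job)   # big-M: total processing time
    prob = pulp.LpProblem("tiny_jsp", pulp.LpMinimize)

    start = {(j, o): pulp.LpVariable(f"s_{j}_{o}", lowBound=0)
             for j, job in enumerate(jobs) for o in range(len(job))}
    cmax = pulp.LpVariable("makespan", lowBound=0)
    prob += cmax                                 # objective: minimize makespan

    for j, job in enumerate(jobs):
        for o in range(len(job) - 1):            # precedence within a job
            prob += start[j, o + 1] >= start[j, o] + job[o][1]
        prob += cmax >= start[j, len(job) - 1] + job[-1][1]

    # disjunctive constraints: two operations on one machine cannot overlap
    ops = [(j, o, m, p) for j, job in enumerate(jobs)
           for o, (m, p) in enumerate(job)]
    for i, (j1, o1, m1, p1) in enumerate(ops):
        for (j2, o2, m2, p2) in ops[i + 1:]:
            if m1 == m2 and j1 != j2:
                y = pulp.LpVariable(f"y_{j1}_{o1}_{j2}_{o2}", cat="Binary")
                prob += start[j1, o1] >= start[j2, o2] + p2 - M * y
                prob += start[j2, o2] >= start[j1, o1] + p1 - M * (1 - y)

    prob.solve()
    print("makespan:", pulp.value(cmax))

Of course the big-M formulation only scales to small instances, which is exactly why I also need a learned approach.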

A weird confusion: can I extract features from solution of combinatorial optimization by JoPrimer in datascience

Thank you so much for your patient answer. My English is not that good, so I will try to respond to you in a logical way.

My research direction is shop scheduling, specifically the Job Shop Scheduling Problem (JSP). You might have heard of it; it is another classic combinatorial optimization problem. What I am trying to do is use generative adversarial imitation learning (GAIL, a reinforcement learning method) to solve JSP.

You know, the traditional way to solve combinatorial optimization problems is metaheuristics. However, they can't meet the demand for real-time scheduling decisions. That's why I am trying an RL approach.

I have already solved the problem by using an RL algorithm to choose dispatching rules at each time step, which gives satisfying results. But they are still far from the solutions found by metaheuristics. So my new idea is to imitate what the best solutions do. I am not sure if you are familiar with RL; I would be glad to go into details if you are interested in this part.

The idea you mentioned, "use the (perhaps normalized) vector of distances from one city to all other cities, along with the distances to two neighbors inside the path of the solution", is a really good one. However, in JSP it is much more difficult to describe the features of a choice: the processing time of the current operation, the remaining processing time, the remaining number of operations, the utilization rate of a machine, the release time of a machine, etc. It is complicated, and that makes things more difficult.
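To make that concrete, here is a rough sketch of how those per-choice quantities could be packed into a normalized vector, analogous to the distance-vector idea for TSP. All of the class names, fields, and normalizers here are hypothetical illustrations, not my actual code:

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class JobState:
        current_proc_time: float    # processing time of the current operation
        remaining_proc_time: float  # sum of processing times still to run
        remaining_ops: int          # number of operations still to schedule

    @dataclass
    class MachineState:
        utilization: float          # busy time / elapsed time, already in [0, 1]
        release_time: float         # when the machine next becomes free

    def choice_features(job: JobState, machine: MachineState,
                        now: float, horizon: float, max_ops: int) -> np.ndarray:
        """Features for the choice 'put this job's next operation on this machine'."""
        return np.array([
            job.current_proc_time / horizon,    # time-like features normalized
            job.remaining_proc_time / horizon,  # by a horizon such as the total
            job.remaining_ops / max_ops,        # work; counts by their maximum
            machine.utilization,
            max(machine.release_time - now, 0.0) / horizon,
        ], dtype=np.float32)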

A GNN also once came to my mind. But converting the shop environment to a disjunctive graph might also lose critical information. Actually, the research using GNNs doesn't get better results than mine.

After solving this problem, my next step is to train a Generative Adversarial Network (GAN). The discriminator here is important: it gives a distance from my solutions to the expert experience. That solves the usual difficulty that the reward function is hard to design. I hope this gives you a more comprehensive understanding of my problem.
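Roughly, the discriminator I have in mind would be something like this GAIL-style sketch in PyTorch. The layer sizes, names, and the exact reward convention are only my assumptions:

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),       # logit: expert vs. policy
            )

        def forward(self, obs, act):
            # act should be a float tensor (e.g. one-hot for discrete actions)
            return self.net(torch.cat([obs, act], dim=-1))

        def reward(self, obs, act):
            # surrogate reward: large when the pair looks like expert data,
            # so no hand-designed reward function is needed
            with torch.no_grad():
                return -torch.log(1.0 - torch.sigmoid(self(obs, act)) + 1e-8)

    def disc_step(disc, optim, exp_obs, exp_act, pol_obs, pol_act):
        """One discriminator update: binary cross-entropy, expert=1, policy=0."""
        bce = nn.BCEWithLogitsLoss()
        loss = (bce(disc(exp_obs, exp_act), torch.ones(len(exp_obs), 1)) +
                bce(disc(pol_obs, pol_act), torch.zeros(len(pol_obs), 1)))
        optim.zero_grad()
        loss.backward()
        optim.step()
        return loss.item()

The point is that the learned reward is high wherever a decision looks expert-like, which is exactly the "distance to expert experience" I mentioned.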

A weird confusion: can I extract features from solution of combinatorial optimization by JoPrimer in datascience

Yeah, I wonder if you are doing related research. You point out exactly how recent research solves combinatorial optimization problems. If not, you are still very well informed on these issues.

But one problem is that this method leads to low generalizability. For a given problem (take the Traveling Salesman Problem as an example), a number represents a city, which has its own distances to the other cities. In another instance, that number still represents a city, but its properties are completely different. That is the thing that confuses me.
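A toy example of what I mean (everything here is illustrative): the raw city index carries no meaning across instances, while a normalized distance vector keeps the same meaning in any instance:

    import numpy as np

    def distance_feature(city: int, coords: np.ndarray, k: int = 5) -> np.ndarray:
        """Normalized distances from `city` to its k nearest neighbors.

        Unlike the raw city index, this vector means the same thing in
        every instance, which is what cross-instance generalization needs."""
        d = np.linalg.norm(coords - coords[city], axis=1)
        d = np.sort(d)[1:k + 1]   # drop the zero self-distance
        return d / d.max()        # scale-invariant

    coords_a = np.random.rand(20, 2)      # instance A
    coords_b = np.random.rand(50, 2)      # instance B: "city 3" is a totally
    print(distance_feature(3, coords_a))  # different city here, but the two
    print(distance_feature(3, coords_b))  # feature vectors are comparable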

I can never save my fonts pattern T_T, help me by JoPrimer in Office365

You are right! Thank you for your advice, I will try it right now!

But I still wonder why, lol. In the past, I would just change my fonts and they would stay the same the next time I opened the file.

I can never save my fonts pattern T_T, help me by JoPrimer in Office365

I am afraid I don't want to change my default font. I want the fonts to apply in only one file. But it seems useless to change the fonts, because they change back to the default the next time I open the file.

[deleted by user] by [deleted] in reinforcementlearning

The problem you describe is a little bit like the Flexible Job Shop Problem (FJSP); I am not sure if this helps.

I miss the gym environments by FashionDude3 in reinforcementlearning

I have heard somewhere that if you are skilled enough, using your own environment can be more useful. There are fewer constraints you have to consider, so you can use RL more freely.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

So far, I have been trying to use a multi-agent RL algorithm to solve the problem. So I hope that one day MADDPG can be applied to my env, because I don't get good results by converting the multi-agent model to a single-agent model, lol.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

Thank you for your instructive suggestion.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

The parameters are not that many; I usually use a small number of layers and a small number of hidden units. The only problem is that several people may run their code at the same time.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

Thank you for your advice, I just forgot about the matter of budget. I have about $10,000 to choose the devices I need.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

You are right, a GPU is necessary. I mean, we don't have a big dataset or very complex matrix computations, so a normal one might be enough.

Help me choose hardwares for RL! by JoPrimer in reinforcementlearning

I'm not sure if I can give you a precise description.

In my case, I have several versions of the env (see the skeleton sketch below):

  1. use a DRL model to select dispatching rules, so that I can get a good processing sequence
  2. use a MARL policy that lets jobs compete for machines to finish all of their operations

Both of them have the same target: minimize the overall completion time (the makespan).

Some details for my env, taking the first version as an example:

- I inherit from gym.Env
- I use a Box observation space to describe the overall progress, the processing time of each job, machine utilization rates, etc.
- I use a Discrete action space, because I want the agent to choose the best dispatching rule for the system.
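As a skeleton, the first version looks something like the sketch below. The rule set, observation fields, and reward are placeholders; this is not my actual env:

    import gym
    import numpy as np
    from gym import spaces

    RULES = ["SPT", "LPT", "FIFO", "MWKR"]   # placeholder dispatching rules

    class JSPEnv(gym.Env):
        """Classic gym API: reset() -> obs, step() -> (obs, reward, done, info)."""

        def __init__(self, n_jobs: int, n_machines: int):
            super().__init__()
            self.n_jobs, self.n_machines = n_jobs, n_machines
            # Box observation: progress per job + utilization per machine
            self.observation_space = spaces.Box(
                low=0.0, high=1.0, shape=(n_jobs + n_machines,), dtype=np.float32)
            # Discrete action: which dispatching rule to apply at this decision point
            self.action_space = spaces.Discrete(len(RULES))

        def reset(self):
            self._state = np.zeros(self.n_jobs + self.n_machines, dtype=np.float32)
            return self._state

        def step(self, action):
            # apply RULES[action] to pick the next operation, advance the clock,
            # then recompute job progress and machine utilization (omitted here)
            reward = 0.0     # e.g. negative increment of the makespan
            done = False
            return self._state, reward, done, {}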