So far, I have written the code that builds the final Q matrix, over which we can then run a nearest-neighbour (greedy) traversal to extract the final path. The usual disadvantage of nearest-neighbour approaches, namely that future rewards or penalties are not taken into account, is resolved by traversing the Q matrix instead of raw distances. A sketch of this greedy extraction step is shown below.
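Here is a minimal sketch of what that greedy traversal over a learned Q matrix could look like. The grid dimensions, the four-action move set, the flattened state indexing, and the function names are all illustrative assumptions, not the actual project code.

```python
import numpy as np

GRID_ROWS, GRID_COLS = 10, 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right (assumed move set)

def state_index(row, col):
    """Flatten a (row, col) grid cell into a single state index."""
    return row * GRID_COLS + col

def extract_path(Q, start, goal, max_steps=200):
    """Follow the highest-valued action in Q from start until the goal (or give up)."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == goal:
            break
        row, col = current
        # Rank actions by their Q value for the current state, best first.
        ranked = np.argsort(Q[state_index(row, col)])[::-1]
        for a in ranked:
            dr, dc = ACTIONS[a]
            nxt = (row + dr, col + dc)
            # Stay inside the grid and avoid immediately revisiting cells.
            if 0 <= nxt[0] < GRID_ROWS and 0 <= nxt[1] < GRID_COLS and nxt not in path:
                current = nxt
                path.append(nxt)
                break
        else:
            break  # no admissible move left; the extraction is stuck
    return path
```

Because the Q values already encode expected future reward, this step-by-step greedy choice does not suffer from the short-sightedness of a plain nearest-neighbour heuristic over distances.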
In a straightforward path-finding case with no awkward land obstacles, my agent reaches the goal state. This is the current situation after one episode:
I have initialized the Q matrix with weights laid out over a linear space of values that increase exponentially toward the goal. This proves to be an advantage: near the start coordinate the gradient is shallow (little slope in the weight-vs-coordinate graph), and the closer the agent moves to the goal state, the steeper the increase in weights becomes.
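The sketch below shows one way such an initialization could be written, seeding each action's value from the exponential of how close that action brings the agent to the goal. The Manhattan distance metric, the growth rate `k`, and the per-action layout are illustrative assumptions rather than the exact scheme used in the project.

```python
import numpy as np

GRID_ROWS, GRID_COLS = 10, 10
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right (assumed move set)

def init_q(goal, k=0.3):
    """Seed Q so each action's weight grows exponentially as it nears the goal."""
    Q = np.zeros((GRID_ROWS * GRID_COLS, len(ACTIONS)))
    max_dist = GRID_ROWS + GRID_COLS  # largest possible Manhattan distance
    for r in range(GRID_ROWS):
        for c in range(GRID_COLS):
            for a, (dr, dc) in enumerate(ACTIONS):
                # Clamp the destination cell to the grid boundary.
                nr = min(max(r + dr, 0), GRID_ROWS - 1)
                nc = min(max(c + dc, 0), GRID_COLS - 1)
                dist = abs(nr - goal[0]) + abs(nc - goal[1])
                # Exponential in (max_dist - dist): shallow slope far from the
                # goal, increasingly steep as the agent approaches it.
                Q[r * GRID_COLS + c, a] = np.exp(k * (max_dist - dist))
    return Q
```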
Where there are land obstacles, however, the agent finds reasons to get stuck (the extra distance of detouring through the ocean weighs against simply heading toward the land) and thus cannot reach the goal state.