feature(yzj): add multi-agent and structured observation env (GoBigger) #39

Status: Open, wants to merge 59 commits into base: main
Commits (59):
ec0ba9d
feature(yzj): adapt multi agent env gobigger with ez
May 31, 2023
2c29842
fix(yzj): fix data device bug in gobigger ez pipeline
Jun 1, 2023
335b0fc
feature(yzj): add vsbot with ez pipeline and add eat-info in tensorboard
Jun 1, 2023
0875e74
feature(yzj): add vsbot with mz pipeline and polish model and buffer
Jun 1, 2023
d88d79c
polish(yzj): polish gobigger env
Jun 2, 2023
17992eb
feature(yzj): adapt multi agent env gobigger with sez
Jun 2, 2023
4925d01
feature(yzj): add gobigger visualization and polish gobigger eval config
Jun 7, 2023
b8e044e
fix(yzj): fix eval_episode_return and polish env
Jun 7, 2023
f229b6a
polish(yzj): polish gobigger env pytest
Jun 7, 2023
4bbbeb0
polish(yzj): polish gobigger env and eat info in evaluator
Jun 7, 2023
7529170
fix(yzj): fix np.pad bug, which need padding_num>0
Jun 12, 2023
85aeacf
polish(yzj): contain raw obs only on eval mode for save memory
Jun 13, 2023
f146c4d
fix(yzj): fix mcts ptree sampled value/value-prefix bug
Jun 13, 2023
47b145e
polish(yzj): polish gobigger encoder model
Jun 15, 2023
2772ffd
polish(yzj): polish gobigger encoder model with ding
Jun 16, 2023
e36e752
polish(yzj): polish gobigger entry evaluator
Jun 19, 2023
7098899
feature(yzj): add eps_greedy and random_collect_episode in gobigger ez
Jun 19, 2023
b94deae
fix(yzj): fix key bug in entry utils when random collect
Jun 20, 2023
dfa4671
fix(yzj): fix gobigger encoder bn bug
Jun 25, 2023
ff11821
polish(yzj): polish ez config and set eps as 1.5e4 learner iter
Jun 25, 2023
a95c19c
polish(yzj): polish code style by format.sh
Jun 25, 2023
6da2997
polish(yzj): polish code comments about gobigger in worker/policy/entry
Jun 25, 2023
a2ca5ee
feature(yzj): add eps_greedy and random_collect_episode in gobigger mz
Jun 25, 2023
249d88a
Merge branch 'main' of https://github.com/opendilab/LightZero into de…
Jun 25, 2023
8c4c5a0
polish(yzj): polish entry/buffer/policy/config/model/env comments and…
Jun 28, 2023
377f664
polish(yzj): use ding scatter_model, muzero_collector add multi_agent…
Jun 30, 2023
1ed22b2
fix(yzj): fix collector bug that observation_window_stack no for_loop…
Jul 3, 2023
35e7714
fix(yzj): fix ignore done in collector
Jul 5, 2023
4df3ada
polish(yzj): polish ez config ignore done
Jul 5, 2023
272611f
fix(yzj): add game_segment_pool clear()
Jul 5, 2023
cc54996
polish(yzj): add gobigger/entry , polish gobigger config and add defa…
Jul 12, 2023
39802f5
polish(yzj): polish eps greedy and random policy
Jul 17, 2023
58281d6
fix(yzj): fix random collect in gobigger ez policy
Jul 17, 2023
e1ba071
polish(yzj): merge main-branch eps and random collect, polish gobigge…
Aug 2, 2023
c29abaf
feature(yzj): add peetingzoo mz/ez algo, add multi agent buffer/polic…
Aug 4, 2023
e4667df
polish(yzj): polish multi agent muzero collector
Aug 4, 2023
b6dca69
polish(yzj): polish gobigger collector and config to support t2p3
Aug 8, 2023
09a4440
feature(yzj): add fc encoder on ptz env instead of identity
Aug 8, 2023
407329a
polish(yzj): polish buffer name and remove ignore done in atari config
Aug 10, 2023
592fab1
fix(yzj): fix ssl data bug and polish to_device code
Aug 14, 2023
3392d61
fix(yzj): fix policy utils obs batch
Aug 14, 2023
9337ce3
fix(yzj): fix collect mode and eval mode to device
Aug 14, 2023
deab811
fix(yzj): fix to device bug on policy utils
Aug 15, 2023
705b5f9
polish(yzj): polish multi agent game buffer code
Aug 15, 2023
43b2bb5
polish(yzj): polish code
Aug 15, 2023
3d88a17
fix(yzj): fix priority bug, polish priority related config, add all a…
Aug 15, 2023
a09517a
polish(yzj): polish train entry
Aug 15, 2023
714ba4b
polish(yzj): polish gobigger config
Aug 16, 2023
0ee0122
polish(yzj): polish best gobigger config on ez/mz
Aug 18, 2023
71ce58e
polish(yzj): polish collector to adapt multi-agent mode
Aug 18, 2023
05c025d
polish(yzj): polish evaluator conflicts
Aug 18, 2023
5bec18b
polish(yzj): polish multi agent model
Aug 18, 2023
5d310ba
polish(yzj): sync main
Aug 21, 2023
920dc38
polish(yzj): polish gobigger entry and evaluator
Aug 21, 2023
1c1fde9
feature(yzj): add pettingzoo visualization
Aug 29, 2023
72c669b
polish(yzj): polish ptz config and model
Aug 29, 2023
11ef08f
feature(yzj): add ptz simple ez config
Sep 4, 2023
1e143bc
polish(yzj): polish code base
jayyoung0802 Dec 7, 2023
3e1e62f
Merge remote-tracking branch 'origin' into dev-gobigger
jayyoung0802 Dec 7, 2023
polish(yzj): polish code comments about gobigger in worker/policy/entry
jayyoung0802 committed Jun 25, 2023
commit 6da29975fd9c815275b852e2b5083647b0c48245
2 changes: 1 addition & 1 deletion lzero/entry/eval_muzero_gobigger.py
@@ -24,7 +24,7 @@ def eval_muzero_gobigger(
 ) -> 'Policy': # noqa
     """
     Overview:
-        The train entry for MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
+        The eval entry for GoBigger MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
     Arguments:
         - input_cfg (:obj:`Tuple[dict, dict]`): Config in dict type.
             ``Tuple[dict, dict]`` type means [user_config, create_cfg].
2 changes: 1 addition & 1 deletion lzero/entry/train_muzero_gobigger.py
@@ -28,7 +28,7 @@ def train_muzero_gobigger(
 ) -> 'Policy': # noqa
     """
     Overview:
-        The train entry for MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
+        The train entry for GoBigger MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
     Arguments:
         - input_cfg (:obj:`Tuple[dict, dict]`): Config in dict type.
             ``Tuple[dict, dict]`` type means [user_config, create_cfg].
2 changes: 1 addition & 1 deletion lzero/policy/gobigger_efficientzero.py
@@ -25,7 +25,7 @@
 class GoBiggerEfficientZeroPolicy(Policy):
     """
     Overview:
-        The policy class for EfficientZero.
+        The policy class for GoBiggerEfficientZero.
     """

     # The default_config for EfficientZero policy.
2 changes: 1 addition & 1 deletion lzero/policy/gobigger_muzero.py
@@ -24,7 +24,7 @@
 class GoBiggerMuZeroPolicy(Policy):
     """
     Overview:
-        The policy class for MuZero.
+        The policy class for GoBiggerMuZero.
     """

     # The default_config for MuZero policy.
2 changes: 1 addition & 1 deletion lzero/policy/gobigger_random_policy.py
@@ -25,7 +25,7 @@
 class GoBiggerRandomPolicy(Policy):
     """
     Overview:
-        The policy class for EfficientZero.
+        The policy class for GoBiggerRandom.
     """

     # The default_config for EfficientZero policy.
2 changes: 1 addition & 1 deletion lzero/policy/gobigger_sampled_efficientzero.py
@@ -26,7 +26,7 @@
 class GoBiggerSampledEfficientZeroPolicy(Policy):
     """
     Overview:
-        The policy class for Sampled EfficientZero.
+        The policy class for GoBigger Sampled EfficientZero.
     """

     # The default_config for Sampled fEficientZero policy.
5 changes: 2 additions & 3 deletions lzero/worker/gobigger_muzero_collector.py
@@ -19,7 +19,8 @@
 class GoBiggerMuZeroCollector(ISerialCollector):
     """
     Overview:
-        The Collector for MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
+        The Collector for GoBigger MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
+        For GoBigger, add agent_num dim in game_segment.
     Interfaces:
         __init__, reset, reset_env, reset_policy, collect, close
     Property:
@@ -447,8 +448,6 @@ def collect(self,
                         )
                     else:
                         for agent_id in range(agent_num):
-                            if len(distributions_dict[env_id][agent_id]) != 27:
-                                print('')
                             game_segments[env_id][agent_id].store_search_stats(
                                 distributions_dict[env_id][agent_id], value_dict[env_id][agent_id]
                             )
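Note on the hunk above: the loop that survives the cleanup indexes game segments, search distributions and root values first by `env_id` and then by `agent_id`. The following is a minimal sketch of that access pattern only, not the collector's actual code. `SegmentStub`, `store_per_agent`, and the normalization inside the stub are hypothetical stand-ins introduced for illustration; only the `store_search_stats(distribution, value)` call shape and the `[env_id][agent_id]` indexing are taken from the diff.

```python
from typing import Dict, List


class SegmentStub:
    """Hypothetical stand-in for a game segment; it only records search statistics."""

    def __init__(self) -> None:
        self.child_visit_segment: List[List[float]] = []
        self.root_value_segment: List[float] = []

    def store_search_stats(self, visit_counts: List[int], root_value: float) -> None:
        # Assumption for this sketch: visit counts are normalized before storage.
        total = sum(visit_counts)
        self.child_visit_segment.append([v / total for v in visit_counts])
        self.root_value_segment.append(root_value)


def store_per_agent(
    game_segments: List[List[SegmentStub]],
    distributions_dict: Dict[int, List[List[int]]],
    value_dict: Dict[int, List[float]],
    env_id: int,
    agent_num: int,
) -> None:
    # One segment per (env_id, agent_id) pair, mirroring the loop in the diff.
    for agent_id in range(agent_num):
        game_segments[env_id][agent_id].store_search_stats(
            distributions_dict[env_id][agent_id], value_dict[env_id][agent_id]
        )


agent_num = 2
game_segments = [[SegmentStub() for _ in range(agent_num)]]  # a single environment
distributions_dict = {0: [[1, 3], [2, 2]]}  # env 0: one visit-count list per agent
value_dict = {0: [0.5, -0.25]}  # env 0: one root value per agent
store_per_agent(game_segments, distributions_dict, value_dict, env_id=0, agent_num=agent_num)
```

The sketch uses two actions per agent purely for brevity. The deleted debug branch compared the distribution length against a hard-coded 27 and printed an empty string, so dropping it does not change what gets stored.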
2 changes: 1 addition & 1 deletion lzero/worker/gobigger_muzero_evaluator.py
@@ -21,7 +21,7 @@
 class GoBiggerMuZeroEvaluator(ISerialEvaluator):
     """
     Overview:
-        The Evaluator for MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
+        The Evaluator for GoBigger MCTS+RL algorithms, including MuZero, EfficientZero, Sampled EfficientZero.
     Interfaces:
         __init__, reset, reset_policy, reset_env, close, should_eval, eval
     Property:
3 changes: 3 additions & 0 deletions zoo/gobigger/env/gobigger_env.py
@@ -109,14 +109,17 @@ def close(self) -> None:

     @property
     def observation_space(self) -> gym.spaces.Space:
+        # The following ensures compatibility with the DI-engine Env class.
         return self._observation_space

     @property
     def action_space(self) -> gym.spaces.Space:
+        # The following ensures compatibility with the DI-engine Env class.
         return self._action_space

     @property
     def reward_space(self) -> gym.spaces.Space:
+        # The following ensures compatibility with the DI-engine Env class.
         return self._reward_space

     def __repr__(self) -> str:
1 change: 1 addition & 0 deletions zoo/gobigger/env/gobigger_rule_bot.py
@@ -34,6 +34,7 @@ def reset(self, env_id_lst=None):
             for agent in self.bot[env_id]:
                 agent.reset()

+    # The following ensures compatibility with the DI-engine Policy class.
     def _init_learn(self) -> None:
         pass
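Note on the hunk above: a scripted bot has nothing to learn, so the hook it implements for compatibility is an empty method. The sketch below illustrates why such a no-op can still be required, assuming the framework base class declares the hook as abstract. `PolicyInterface` and `RuleBotPolicy` are hypothetical names for illustration and are not the DI-engine classes; only the `_init_learn` name and signature come from the diff.

```python
from abc import ABC, abstractmethod
from typing import Dict, Hashable


class PolicyInterface(ABC):
    """Hypothetical stand-in for a framework base class with required hooks."""

    @abstractmethod
    def _init_learn(self) -> None:
        ...

    @abstractmethod
    def forward(self, obs: Dict[Hashable, object]) -> Dict[Hashable, int]:
        ...


class RuleBotPolicy(PolicyInterface):
    """Scripted policy: no trainable state, so the learn hook is a no-op."""

    def _init_learn(self) -> None:
        # Present only to satisfy the abstract interface.
        pass

    def forward(self, obs: Dict[Hashable, object]) -> Dict[Hashable, int]:
        # Placeholder rule for the sketch: always pick action 0 for every env id.
        return {env_id: 0 for env_id in obs}


bot = RuleBotPolicy()  # instantiation works because every abstract hook is defined
actions = bot.forward({0: "obs_a", 1: "obs_b"})
```

Under this assumption, omitting `_init_learn` from the subclass would make `RuleBotPolicy()` raise `TypeError` at instantiation, which is the kind of failure a compatibility stub avoids.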