Add maze environment and example #106

VivekSil · 2025-10-27T18:40:27Z

This PR adds a Maze environment (#105).

Specifications:

The user can specify the maze as a numpy array, together with the start and exit cells.
The current default maze is 8x8, with the start and exit cells at (0,0) and (7,7) respectively.

Reward setting:

reward_exit: Reward for reaching the exit cell
reward_move: Reward for a valid move that does not reach the exit
penalty_visited: Penalty for revisiting a cell
penalty_impossible: Penalty for an invalid move (wall/outside)
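
As a rough illustration of how these four settings could interact, here is a minimal sketch. The `RewardConfig` class, the `step_reward` helper, the numeric values, and the order in which the cases are checked are all assumptions made for this example; the PR description does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    """Hypothetical container: field names mirror the PR, values are placeholders."""

    reward_exit: float = 10.0
    reward_move: float = -0.05
    penalty_visited: float = -0.25
    penalty_impossible: float = -0.75


def step_reward(cfg, cell, exit_cell, visited, is_valid_move):
    """Pick the reward for one move (assumed priority: invalid, exit, revisit, move)."""
    if not is_valid_move:
        return cfg.penalty_impossible  # hit a wall or left the maze
    if cell == exit_cell:
        return cfg.reward_exit  # reached the exit cell
    if cell in visited:
        return cfg.penalty_visited  # stepped onto an already visited cell
    return cfg.reward_move  # valid move that did not reach the exit
```

For example, `step_reward(RewardConfig(), (1, 0), (7, 7), {(1, 0)}, True)` returns the placeholder `penalty_visited` value.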

meta-cla · 2025-10-27T18:40:33Z

Hi @VivekSil!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

meta-cla · 2025-10-27T18:42:32Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Copilot

Pull Request Overview

This PR integrates a Maze game environment with the OpenEnv framework, providing both server-side and client-side implementations with Docker support.

Implements MazeEnvironment wrapper that exposes the Maze game through OpenEnv's Environment interface
Adds HTTP client and server infrastructure for remote maze environment interaction
Includes example scripts demonstrating both automated and human-interactive maze solving

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 15 comments.


File	Description
src/envs/maze_env/server/mazearray.py	Defines the maze layout as a numpy array
src/envs/maze_env/server/maze_environment.py	Wraps Maze game to implement OpenEnv Environment interface
src/envs/maze_env/server/maze.py	Core Maze implementation with coordinate system (col, row)
src/envs/maze_env/server/app.py	FastAPI application exposing maze environment over HTTP
src/envs/maze_env/server/__init__.py	Package exports for server components
src/envs/maze_env/server/Dockerfile	Container configuration for maze environment server
src/envs/maze_env/models.py	Data models for actions, observations, and state
src/envs/maze_env/client.py	HTTP client for connecting to maze environment server
src/envs/maze_env/__init__.py	Package exports for client components
src/envs/maze_env/README.md	Documentation for maze environment usage
examples/maze_simple.py	Example demonstrating automated maze navigation
examples/maze_human.py	Example demonstrating human-interactive maze solving
.github/workflows/docker-build.yml	Adds maze-env to CI/CD Docker build workflow


Copilot · 2025-11-01T03:48:59Z

src/envs/maze_env/server/maze_environment.py

+        # build MazeObservation; convert numpy to list for JSON-serializable dataclass fields
+        pos_list = observation.tolist() if hasattr(observation, "tolist") else list(observation)
+        total_reward = 0
+        legal_actions = self._compute_legal_actions(pos_list[0])


Incorrect coordinate order when computing legal actions. The method expects a list [row, col], but pos_list[0] is the inner pair taken from np.array([[col, row]]), so it arrives as [col, row]. Based on the Maze class documentation (lines 54-58 in maze.py), the underlying Maze returns coordinates as (col, row), while the wrapper uses [row, col]. The call should still read from pos_list[0], but swap the two elements before passing them on.

Suggested change

-        legal_actions = self._compute_legal_actions(pos_list[0])
+        # Swap [col, row] to [row, col] for legal actions computation
+        legal_actions = self._compute_legal_actions([pos_list[0][1], pos_list[0][0]])
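
The swap can also be isolated in a small helper so the coordinate convention is stated in one place. This is only a sketch: `to_row_col` is a hypothetical name, not part of the PR, and it assumes the observation is a nested `[[col, row]]` structure as described in the comment above.

```python
from typing import List


def to_row_col(observation) -> List[int]:
    """Convert a nested [[col, row]] observation into a flat [row, col] list."""
    # Same handling as the PR snippet: numpy arrays expose .tolist(),
    # plain sequences are copied with list().
    pos_list = observation.tolist() if hasattr(observation, "tolist") else list(observation)
    col, row = pos_list[0]
    return [row, col]
```

With such a helper, the call site would pass `to_row_col(observation)` instead of indexing `pos_list` twice.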

Copilot · 2025-11-01T03:49:00Z

src/envs/maze_env/server/maze_environment.py

+    ):
+        # Create underlying Maze instance (matches your working code)
+        self.env = Maze(maze=maze_array, start_cell=start_cell, exit_cell=exit_cell)
+        self.total_reward = 0


total_reward is not reset in the reset() method (lines 55-66), so rewards accumulate across episodes. The reset() method should include 'self.total_reward = 0' so that each episode starts from a clean state.
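
A minimal sketch of the intended behaviour, using a hypothetical `EpisodeRewardTracker` class as a stand-in for the wrapper's bookkeeping (the real `MazeEnvironment.reset()` does more than this and is not reproduced here):

```python
class EpisodeRewardTracker:
    """Hypothetical stand-in for the reward bookkeeping in the wrapper."""

    def __init__(self) -> None:
        self.total_reward = 0.0

    def reset(self) -> None:
        # Clear the running total so a new episode starts from zero.
        self.total_reward = 0.0

    def step(self, reward: float) -> float:
        # Accumulate the per-step reward and return the episode total so far.
        self.total_reward += reward
        return self.total_reward
```

Without the assignment in `reset()`, the total from one episode would carry over into the next.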

Copilot · 2025-11-01T03:49:00Z

src/envs/maze_env/server/app.py

+# Get game configuration from environment variables
+
+# Create the environment instance
+env = MazeEnvironment(maze_array=maze,start_cell=(0,0),exit_cell=(7,7))


Missing spaces after commas in function arguments. Should be: 'maze_array=maze, start_cell=(0, 0), exit_cell=(7, 7)' to follow PEP 8 style guidelines.

Suggested change

-env = MazeEnvironment(maze_array=maze,start_cell=(0,0),exit_cell=(7,7))
+env = MazeEnvironment(maze_array=maze, start_cell=(0, 0), exit_cell=(7, 7))

Copilot · 2025-11-01T03:49:00Z

src/envs/maze_env/server/__init__.py

+from .maze import Maze, Status
+from .maze_environment import MazeEnvironment
+
+__all__ = ["Maze","MazeEnvironment","Status"]


Missing spaces after commas in list. Should be: '["Maze", "MazeEnvironment", "Status"]' to follow PEP 8 style guidelines.

Suggested change

-__all__ = ["Maze","MazeEnvironment","Status"]
+__all__ = ["Maze", "MazeEnvironment", "Status"]

Copilot · 2025-11-01T03:49:00Z

examples/maze_human.py

+5. Inspecting environment state
+
+Usage:
+    python examples/maze_simple.py


Incorrect usage documentation. This file is 'maze_human.py' but the docstring says 'python examples/maze_simple.py'. Should be 'python examples/maze_human.py'.

Suggested change

-    python examples/maze_simple.py
+    python examples/maze_human.py

Copilot · 2025-11-01T03:49:02Z

src/envs/maze_env/server/maze_environment.py

+via the OpenEnv Environment interface.
+"""
+
+from typing import Any, Dict, List, Tuple, Optional


Import of 'Any' is not used.
Import of 'Dict' is not used.

Suggested change

-from typing import Any, Dict, List, Tuple, Optional
+from typing import List, Tuple, Optional

Copilot · 2025-11-01T03:49:02Z

src/envs/maze_env/server/maze_environment.py

+"""
+
+from typing import Any, Dict, List, Tuple, Optional
+from core.env_server import Action, Environment, Observation


Import of 'Action' is not used.
Import of 'Observation' is not used.

Suggested change

-from core.env_server import Action, Environment, Observation
+from core.env_server import Environment

Copilot · 2025-11-01T03:49:03Z

src/envs/maze_env/server/maze_environment.py

+
+from typing import Any, Dict, List, Tuple, Optional
+from core.env_server import Action, Environment, Observation
+from .maze import Maze, Status


Import of 'Status' is not used.

Suggested change

-from .maze import Maze, Status
+from .maze import Maze

Copilot · 2025-11-01T03:49:03Z

src/envs/maze_env/models.py

+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pydantic import Field


Import of 'Field' is not used.

Suggested change

-from pydantic import Field

Copilot · 2025-11-01T03:49:03Z

src/envs/maze_env/models.py

+
+from dataclasses import dataclass, field
+from pydantic import Field
+from typing import Any, Dict, List, Optional, Tuple, Literal


Import of 'Any' is not used.
Import of 'Dict' is not used.
Import of 'Optional' is not used.
Import of 'Tuple' is not used.
Import of 'Literal' is not used.

Suggested change

-from typing import Any, Dict, List, Optional, Tuple, Literal
+from typing import List
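
Unused imports like the ones flagged in this review can be found mechanically. The sketch below uses only the standard-library `ast` module; `unused_imports` is a hypothetical helper written for illustration, not the check that produced these comments. It is deliberately simple and assumes names are only used as plain identifiers, so it will misreport names referenced solely inside string annotations or re-exported through `__all__`.

```python
import ast
from typing import List


def unused_imports(source: str) -> List[str]:
    """Return imported names that never appear as an identifier in `source`."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == "__future__":
            continue  # future imports change compilation, they are not "used"
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports cannot be checked this way
                # `import a.b` binds `a`; `import x as y` binds `y`.
                imported.add(alias.asname or alias.name.split(".")[0])
    # Every bare identifier in the module, including those in annotations.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)
```

Running it on a module that imports `Any`, `Dict` and `List` from `typing` but only references `List` reports `Any` and `Dict`.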

github-actions · 2025-11-02T12:41:30Z