FoDA (Foresight Distribution Adjustment) is a framework for off-policy reinforcement learning in which the Q network is updated under the post-update policy's visitation distribution, so the critic "looks ahead" to the policy it will evaluate after the next actor update. To do this, FoDA derives the gradient of the visitation distribution with respect to the policy parameters and obtains an explicit expression that approximates the post-update distribution.
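As a rough sketch of the idea (the notation below is illustrative, not necessarily the paper's): if one actor step of size $\alpha$ moves the policy parameters from $\theta$ to $\theta'$, the post-update visitation distribution can be approximated to first order from the gradient of the current one:

```math
\theta' = \theta + \alpha \nabla_\theta J(\theta),
\qquad
d^{\pi_{\theta'}}(s) \;\approx\; d^{\pi_{\theta}}(s) + \alpha \, \nabla_\theta d^{\pi_{\theta}}(s)^{\top} \nabla_\theta J(\theta)
```

The critic is then trained against this approximated distribution rather than the current one.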
Before running the code, set up and activate the environment:

```bash
conda env create -f foda_environment.yml
conda activate foda
```
For development mode, use:

```bash
pip install -e .
```
To run the code, execute the following command:

```bash
python examples/mujoco/mujocofoda.py --seed 40 > terminal_output/cheetahrun40.log 2>&1 &
```
This command runs the `mujocofoda.py` script with a seed of 40 in the background (`&`), logging both stdout and stderr to `terminal_output/cheetahrun40.log`.
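To launch several runs in parallel, a plain shell loop over seeds works; the seed values and log-file names below are only examples:

```bash
# Launch one background run per seed (seed values are illustrative).
mkdir -p terminal_output
for seed in 40 41 42; do
    python examples/mujoco/mujocofoda.py --seed "$seed" \
        > "terminal_output/cheetahrun${seed}.log" 2>&1 &
done
wait  # block until all background runs have finished
```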
The code for FoDA is mainly built upon Tianshou. Special thanks to the contributors and maintainers of Tianshou for their excellent framework.