
Commit 2df1ff5

add FinRL Env

1 parent 12defb9 commit 2df1ff5

9 files changed: +1148 −0 lines changed

src/envs/finrl_env/README.md

Lines changed: 349 additions & 0 deletions

# FinRL Environment

A wrapper around [FinRL](https://github.com/AI4Finance-Foundation/FinRL) stock trading environments that conforms to the OpenEnv specification.

## Overview

This environment exposes FinRL's StockTradingEnv through OpenEnv's simple HTTP API, enabling reinforcement learning on stock trading tasks. It supports:

- **Stock Trading**: Buy/sell actions across multiple stocks
- **Portfolio Management**: Track balance, holdings, and portfolio value
- **Technical Indicators**: MACD, RSI, CCI, DX, and more
- **Flexible Configuration**: Custom data sources and trading parameters

## Quick Start

### 1. Build the Docker Image

First, build the base image (from the OpenEnv root):

```bash
cd OpenEnv
docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
```

Then build the FinRL environment image:

```bash
docker build -t finrl-env:latest -f src/envs/finrl_env/server/Dockerfile .
```

### 2. Run the Server

#### Option A: With Default Sample Data

```bash
docker run -p 8000:8000 finrl-env:latest
```

This starts the server with synthetic sample data for testing.
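
If the server doesn't respond on port 8000, running the container detached and tailing its logs is a quick sanity check (the container name below is arbitrary):

```bash
# Run detached under a throwaway name, then follow the startup logs
docker run -d --name finrl-env -p 8000:8000 finrl-env:latest
docker logs -f finrl-env
```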

#### Option B: With Custom Configuration

Create a configuration file `config.json`:

```json
{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 25,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}
```

Run with the configuration and data directory mounted:

```bash
docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest
```

### 3. Use the Client

```python
from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()
```
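
Since every observation carries `portfolio_value`, the episode's total return can be computed client-side. A minimal sketch reusing `client`, `config`, and the random policy from above (run it before `client.close()`):

```python
# Run one full episode and compute the total return from portfolio values.
result = client.reset()
values = [result.observation.portfolio_value]

while not result.done:
    actions = np.random.uniform(-1, 1, size=config['stock_dim']).tolist()
    result = client.step(FinRLAction(actions=actions))
    values.append(result.observation.portfolio_value)

total_return = values[-1] / values[0] - 1.0
print(f"Total return over {len(values) - 1} steps: {total_return:.2%}")
```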

## Architecture

```
┌───────────────────────────────────────────────────────────┐
│                   RL Training Framework                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │  Policy Net  │  │  Value Net   │  │    Replay    │     │
│  │  (PyTorch)   │  │  (PyTorch)   │  │    Buffer    │     │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘     │
│         └─────────────────┴─────────────────┘             │
│                           │                               │
│                  ┌────────▼────────┐                      │
│                  │    FinRLEnv     │  ← HTTP Client       │
│                  │ (HTTPEnvClient) │                      │
│                  └────────┬────────┘                      │
└───────────────────────────┼───────────────────────────────┘
                            │ HTTP (JSON)
                   ┌────────▼────────┐
                   │ Docker Container│
                   │   Port: 8000    │
                   │                 │
                   │ ┌─────────────┐ │
                   │ │  FastAPI    │ │
                   │ │  Server     │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │   FinRL     │ │
                   │ │ Environment │ │
                   │ └──────┬──────┘ │
                   │        │        │
                   │ ┌──────▼──────┐ │
                   │ │   FinRL     │ │
                   │ │ StockTrading│ │
                   │ │    Env      │ │
                   │ └─────────────┘ │
                   └─────────────────┘
```
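
Because the boundary in the diagram is plain HTTP with JSON bodies, any language with an HTTP client can in principle drive the environment. The sketch below is purely illustrative — the route names and payload shapes are assumptions, not the documented wire format; the `FinRLEnv` client above is the supported interface:

```python
import requests

BASE_URL = "http://localhost:8000"

# Hypothetical routes and payloads, for illustration only.
obs = requests.post(f"{BASE_URL}/reset", json={}).json()
step = requests.post(f"{BASE_URL}/step",
                     json={"action": {"actions": [0.5, -0.3, 0.0]}}).json()
print(step)
```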

## API Reference

### FinRLAction

Trading action for the environment.

**Attributes:**
- `actions: list[float]` - Array of normalized action values (-1 to 1), one per stock
  - Positive values: Buy
  - Negative values: Sell
  - Magnitude: Relative trade size

**Example:**
```python
# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
```

### FinRLObservation

Observation returned by the environment.

**Attributes:**
- `state: list[float]` - Flattened state vector
  - Structure: `[balance, prices..., holdings..., indicators...]`
- `portfolio_value: float` - Total portfolio value (cash + holdings)
- `date: str` - Current trading date
- `done: bool` - Whether the episode has ended
- `reward: float` - Reward for the last action
- `metadata: dict` - Additional information

**Example:**
```python
obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")
```
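
Given the documented layout, the flat `state` vector can be sliced back into named segments. A minimal sketch, assuming the segments appear in exactly that order and that each indicator contributes one value per stock (`split_state` is a hypothetical helper, not part of the client API):

```python
def split_state(state: list[float], stock_dim: int) -> dict:
    """Slice the flat state vector into its documented segments."""
    return {
        "balance": state[0],
        "prices": state[1 : 1 + stock_dim],
        "holdings": state[1 + stock_dim : 1 + 2 * stock_dim],
        "indicators": state[1 + 2 * stock_dim :],  # stock_dim values per indicator
    }

parts = split_state(obs.state, stock_dim=3)
print(f"Cash: ${parts['balance']:,.2f}, prices: {parts['prices']}")
```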

### Client Methods

#### `reset() -> StepResult[FinRLObservation]`

Reset the environment to start a new episode.

```python
result = client.reset()
```

#### `step(action: FinRLAction) -> StepResult[FinRLObservation]`

Execute a trading action.

```python
action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)
```

#### `state() -> State`

Get episode metadata (`episode_id`, `step_count`).

```python
state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")
```

#### `get_config() -> dict`

Get the environment configuration.

```python
config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])
```

## Data Format

The environment expects stock data in the following CSV format:

| date       | tic   | close  | high   | low    | open   | volume  | macd | rsi_30 | cci_30 | dx_30 |
|------------|-------|--------|--------|--------|--------|---------|------|--------|--------|-------|
| 2020-01-01 | AAPL  | 100.0  | 102.0  | 98.0   | 99.0   | 1000000 | 0.5  | 55.0   | 10.0   | 15.0  |
| 2020-01-01 | GOOGL | 1500.0 | 1520.0 | 1480.0 | 1490.0 | 500000  | -0.3 | 48.0   | -5.0   | 20.0  |

**Required columns:**
- `date`: Trading date
- `tic`: Stock ticker symbol
- `close`, `high`, `low`, `open`: Price data
- `volume`: Trading volume
- Technical indicators (as specified in `tech_indicator_list`)
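
For smoke tests, a file in this shape can be generated with pandas; the rows below just reproduce the sample table (tickers and indicator values are illustrative):

```python
import pandas as pd

cols = ["date", "tic", "close", "high", "low", "open", "volume",
        "macd", "rsi_30", "cci_30", "dx_30"]
rows = [
    ("2020-01-01", "AAPL", 100.0, 102.0, 98.0, 99.0, 1_000_000, 0.5, 55.0, 10.0, 15.0),
    ("2020-01-01", "GOOGL", 1500.0, 1520.0, 1480.0, 1490.0, 500_000, -0.3, 48.0, -5.0, 20.0),
]
pd.DataFrame(rows, columns=cols).to_csv("stock_data.csv", index=False)
```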

## Configuration Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `data_path` | str | Path to CSV file with stock data |
| `stock_dim` | int | Number of stocks to trade |
| `hmax` | int | Maximum shares per trade |
| `initial_amount` | int | Starting cash balance |
| `num_stock_shares` | list[int] | Initial holdings for each stock |
| `buy_cost_pct` | list[float] | Transaction cost for buying (per stock) |
| `sell_cost_pct` | list[float] | Transaction cost for selling (per stock) |
| `reward_scaling` | float | Scaling factor for rewards |
| `state_space` | int | Dimension of the state vector |
| `action_space` | int | Dimension of the action space |
| `tech_indicator_list` | list[str] | Technical indicators to include |
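
As a sanity check, `state_space` for FinRL's StockTradingEnv is typically `1 + 2 * stock_dim + stock_dim * len(tech_indicator_list)`: one cash balance, plus a price and a holding per stock, plus one value per indicator per stock. The example `config.json` above is consistent with this:

```python
stock_dim = 3
tech_indicator_list = ["macd", "rsi_30", "cci_30", "dx_30"]

state_space = 1 + 2 * stock_dim + stock_dim * len(tech_indicator_list)
print(state_space)  # 25, matching the config.json example above
```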

## Integration with RL Frameworks

### Stable Baselines 3

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

from envs.finrl_env import FinRLEnv, FinRLAction


# Gymnasium-style wrapper so SB3 can drive the remote environment
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32,
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32,
        )

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        return np.array(result.observation.state, dtype=np.float32), {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        obs = np.array(result.observation.state, dtype=np.float32)
        # The server reports a single done flag; treat it as termination
        return obs, result.reward or 0.0, result.done, False, result.observation.metadata

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
```
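
The wrapper follows the Gymnasium API (`reset` returning `(obs, info)`, `step` returning a five-tuple) because recent Stable Baselines 3 releases build on Gymnasium rather than classic Gym. Note that every `step` is an HTTP round trip to the container, so training throughput is bounded by network latency rather than by the simulator itself.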

## Troubleshooting

### Server won't start

1. Check whether the base image exists:
```bash
docker images | grep envtorch-base
```

2. Build the base image if it is missing:
```bash
docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
```

### Import errors

Make sure you're in the `src` directory:
```bash
cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"
```

### Configuration errors

Verify that your data file has all required columns:
```python
import pandas as pd

df = pd.read_csv('your_data.csv')
required = ['date', 'tic', 'close', 'high', 'low', 'open', 'volume']
missing = [col for col in required if col not in df.columns]
print(f"Missing columns: {missing or 'none'}")
```

## Examples

See the `examples/` directory for complete examples:
- `examples/finrl_simple.py` - Basic usage
- `examples/finrl_training.py` - Full training loop with PPO
- `examples/finrl_backtesting.py` - Backtesting a trained agent

## License

BSD 3-Clause License (see the LICENSE file in the repository root)

## References

- [FinRL Paper](https://arxiv.org/abs/2011.09607)
- [FinRL GitHub](https://github.com/AI4Finance-Foundation/FinRL)
- [OpenEnv Documentation](../../README.md)

src/envs/finrl_env/__init__.py

Lines changed: 33 additions & 0 deletions

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

"""
FinRL Environment for OpenEnv.

This package provides a wrapper around FinRL's StockTradingEnv that conforms
to the OpenEnv specification, enabling stock trading RL tasks through a
simple HTTP API.

Example:
    >>> from envs.finrl_env import FinRLEnv, FinRLAction
    >>>
    >>> # Connect to server
    >>> client = FinRLEnv(base_url="http://localhost:8000")
    >>>
    >>> # Reset environment
    >>> result = client.reset()
    >>> print(result.observation.portfolio_value)
    >>>
    >>> # Execute trading action
    >>> action = FinRLAction(actions=[0.5])  # Buy
    >>> result = client.step(action)
    >>> print(result.reward)
"""

from .client import FinRLEnv
from .models import FinRLAction, FinRLObservation

__all__ = ["FinRLEnv", "FinRLAction", "FinRLObservation"]
