# FinRL Environment

A wrapper around [FinRL](https://github.com/AI4Finance-Foundation/FinRL) stock trading environments that conforms to the OpenEnv specification.

## Overview

This environment enables reinforcement learning for stock trading tasks using FinRL's `StockTradingEnv`, exposed through OpenEnv's simple HTTP API. It supports:

- **Stock Trading**: Buy/sell actions across multiple stocks
- **Portfolio Management**: Track balance, holdings, and portfolio value
- **Technical Indicators**: MACD, RSI, CCI, DX, and more
- **Flexible Configuration**: Custom data sources and trading parameters

## Quick Start

### 1. Build the Docker Image

First, build the base image (from the OpenEnv root):

```bash
cd OpenEnv
docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
```

Then build the FinRL environment image:

```bash
docker build -t finrl-env:latest -f src/envs/finrl_env/server/Dockerfile .
```

### 2. Run the Server

#### Option A: With Default Sample Data

```bash
docker run -p 8000:8000 finrl-env:latest
```

This starts the server with synthetic sample data for testing.

#### Option B: With Custom Configuration

Create a configuration file `config.json`:

```json
{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 19,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}
```

Run with the configuration:

```bash
docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest
```

### 3. Use the Client

```python
from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()
```

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                  RL Training Framework                  │
│  ┌────────────┐   ┌────────────┐   ┌────────────┐       │
│  │ Policy Net │   │ Value Net  │   │   Replay   │       │
│  │ (PyTorch)  │   │ (PyTorch)  │   │   Buffer   │       │
│  └──────┬─────┘   └──────┬─────┘   └──────┬─────┘       │
│         └────────────────┴────────────────┘             │
│                          │                              │
│                 ┌────────▼────────┐                     │
│                 │    FinRLEnv     │  ← HTTP Client      │
│                 │ (HTTPEnvClient) │                     │
│                 └────────┬────────┘                     │
└──────────────────────────┼──────────────────────────────┘
                           │  HTTP (JSON)
                  ┌────────▼────────┐
                  │ Docker Container│
                  │   Port: 8000    │
                  │                 │
                  │ ┌─────────────┐ │
                  │ │  FastAPI    │ │
                  │ │  Server     │ │
                  │ └──────┬──────┘ │
                  │        │        │
                  │ ┌──────▼──────┐ │
                  │ │   FinRL     │ │
                  │ │ Environment │ │
                  │ └──────┬──────┘ │
                  │        │        │
                  │ ┌──────▼──────┐ │
                  │ │   FinRL     │ │
                  │ │ StockTrading│ │
                  │ │     Env     │ │
                  │ └─────────────┘ │
                  └─────────────────┘
```

## API Reference

### FinRLAction

Trading action for the environment.

**Attributes:**
- `actions: list[float]` - Array of normalized action values (-1 to 1) for each stock
  - Positive values: buy
  - Negative values: sell
  - Magnitude: relative trade size

**Example:**
```python
# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
```
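
FinRL's own `StockTradingEnv` typically turns these normalized values into share counts by scaling with `hmax`; the sketch below assumes that convention (it is based on FinRL's implementation, not a documented part of this wrapper):

```python
# Sketch of how FinRL-style environments typically map normalized
# actions to share counts: scale by hmax and truncate toward zero.
# hmax=100 matches the sample config; the scaling rule itself is an
# assumption based on FinRL's StockTradingEnv.
hmax = 100
actions = [0.5, -0.3, 0.0]
shares = [int(a * hmax) for a in actions]
print(shares)  # [50, -30, 0] -> buy 50, sell 30, hold
```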

### FinRLObservation

Observation returned by the environment.

**Attributes:**
- `state: list[float]` - Flattened state vector
  - Structure: `[balance, prices..., holdings..., indicators...]`
- `portfolio_value: float` - Total portfolio value (cash + holdings)
- `date: str` - Current trading date
- `done: bool` - Whether the episode has ended
- `reward: float` - Reward for the last action
- `metadata: dict` - Additional information

**Example:**
```python
obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")
```
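
Given the `[balance, prices..., holdings..., indicators...]` layout above, the flattened vector can be sliced back into named pieces. This helper is a sketch that assumes exactly that ordering; it is not part of the client API:

```python
# Unpack a flattened FinRL state vector, assuming the layout
# [balance, prices (stock_dim), holdings (stock_dim), indicators (stock_dim * n_ind)].
def unpack_state(state, stock_dim, n_indicators):
    balance = state[0]
    prices = state[1:1 + stock_dim]
    holdings = state[1 + stock_dim:1 + 2 * stock_dim]
    indicators = state[1 + 2 * stock_dim:1 + 2 * stock_dim + stock_dim * n_indicators]
    return balance, prices, holdings, indicators

# Illustrative example with 2 stocks and 1 technical indicator
state = [100000.0, 10.0, 20.0, 5, 3, 0.1, -0.2]
balance, prices, holdings, inds = unpack_state(state, stock_dim=2, n_indicators=1)
```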

### Client Methods

#### `reset() -> StepResult[FinRLObservation]`

Reset the environment to start a new episode.

```python
result = client.reset()
```

#### `step(action: FinRLAction) -> StepResult[FinRLObservation]`

Execute a trading action.

```python
action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)
```

#### `state() -> State`

Get episode metadata (`episode_id`, `step_count`).

```python
state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")
```

#### `get_config() -> dict`

Get the environment configuration.

```python
config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])
```

## Data Format

The environment expects stock data in the following CSV format:

| date       | tic   | close  | high   | low    | open   | volume  | macd | rsi_30 | cci_30 | dx_30 |
|------------|-------|--------|--------|--------|--------|---------|------|--------|--------|-------|
| 2020-01-01 | AAPL  | 100.0  | 102.0  | 98.0   | 99.0   | 1000000 | 0.5  | 55.0   | 10.0   | 15.0  |
| 2020-01-01 | GOOGL | 1500.0 | 1520.0 | 1480.0 | 1490.0 | 500000  | -0.3 | 48.0   | -5.0   | 20.0  |

**Required columns:**
- `date`: Trading date
- `tic`: Stock ticker symbol
- `close`, `high`, `low`, `open`: Price data
- `volume`: Trading volume
- Technical indicators (as specified in `tech_indicator_list`)

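A file in this shape can be produced with the standard library alone; the tickers and values below mirror the sample rows and are purely illustrative:

```python
import csv

# Write a minimal dataset in the expected long format: one row per
# (date, tic) pair, with OHLCV columns plus one column per indicator.
fieldnames = ["date", "tic", "close", "high", "low", "open", "volume",
              "macd", "rsi_30", "cci_30", "dx_30"]
rows = [
    {"date": "2020-01-01", "tic": "AAPL", "close": 100.0, "high": 102.0,
     "low": 98.0, "open": 99.0, "volume": 1000000,
     "macd": 0.5, "rsi_30": 55.0, "cci_30": 10.0, "dx_30": 15.0},
    {"date": "2020-01-01", "tic": "GOOGL", "close": 1500.0, "high": 1520.0,
     "low": 1480.0, "open": 1490.0, "volume": 500000,
     "macd": -0.3, "rsi_30": 48.0, "cci_30": -5.0, "dx_30": 20.0},
]
with open("stock_data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```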
## Configuration Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `data_path` | str | Path to CSV file with stock data |
| `stock_dim` | int | Number of stocks to trade |
| `hmax` | int | Maximum shares per trade |
| `initial_amount` | int | Starting cash balance |
| `num_stock_shares` | list[int] | Initial holdings for each stock |
| `buy_cost_pct` | list[float] | Transaction cost for buying (per stock) |
| `sell_cost_pct` | list[float] | Transaction cost for selling (per stock) |
| `reward_scaling` | float | Scaling factor for rewards |
| `state_space` | int | Dimension of the state vector |
| `action_space` | int | Dimension of the action space |
| `tech_indicator_list` | list[str] | Technical indicators to include |

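In FinRL's tutorials, the state size is derived from the other parameters as cash + prices + holdings + per-stock indicator values. Assuming this wrapper follows the same layout (verify against your FinRL version), `state_space` can be computed rather than hand-counted:

```python
# Derive state_space from the other config values, assuming FinRL's usual
# layout: 1 cash balance + stock_dim prices + stock_dim holdings
# + stock_dim values per technical indicator.
def derive_state_space(stock_dim, tech_indicator_list):
    return 1 + 2 * stock_dim + stock_dim * len(tech_indicator_list)

print(derive_state_space(3, ["macd", "rsi_30", "cci_30", "dx_30"]))  # 19
```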
## Integration with RL Frameworks

### Stable Baselines 3

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

from envs.finrl_env import FinRLEnv, FinRLAction

# Gymnasium-compatible wrapper so SB3 can drive the remote environment
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32
        )

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        obs = np.array(result.observation.state, dtype=np.float32)
        return obs, {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        obs = np.array(result.observation.state, dtype=np.float32)
        reward = result.reward or 0.0
        # The HTTP API only reports `done`, so report every end as termination
        return obs, reward, result.done, False, result.observation.metadata

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
```

## Troubleshooting

### Server won't start

1. Check whether the base image exists:
   ```bash
   docker images | grep envtorch-base
   ```

2. Build the base image if it is missing:
   ```bash
   docker build -t envtorch-base:latest -f src/core/containers/images/Dockerfile .
   ```

### Import errors

Make sure you're in the `src` directory:
```bash
cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"
```

### Configuration errors

Verify that your data file has all required columns:
```python
import pandas as pd

df = pd.read_csv('your_data.csv')
required = {'date', 'tic', 'close', 'high', 'low', 'open', 'volume'}
missing = required - set(df.columns)
print(f"Missing columns: {sorted(missing)}" if missing else "All required columns present")
```

## Examples

See the `examples/` directory for complete examples:
- `examples/finrl_simple.py` - Basic usage
- `examples/finrl_training.py` - Full training loop with PPO
- `examples/finrl_backtesting.py` - Backtesting a trained agent

## License

BSD 3-Clause License (see the LICENSE file in the repository root)

## References

- [FinRL Paper](https://arxiv.org/abs/2011.09607)
- [FinRL GitHub](https://github.com/AI4Finance-Foundation/FinRL)
- [OpenEnv Documentation](../../README.md)