# NNUE Training Pipeline
This directory contains the complete NNUE (Efficiently Updatable Neural Network) training pipeline for the Now-Chess bot.
## Overview
The pipeline generates 500,000 random chess positions, evaluates them with Stockfish, trains a neural network, and exports the weights as Scala code for integration into the engine.
## Prerequisites
Install Python dependencies:
```bash
pip install -r requirements.txt
```
Ensure Stockfish is installed. You can:
- Install via package manager: `apt-get install stockfish` (Linux) or `brew install stockfish` (macOS)
- Or download from [stockfishchess.org](https://stockfishchess.org)
Set the Stockfish path:
```bash
export STOCKFISH_PATH=/path/to/stockfish
```
## Pipeline Steps
### Quick Run
Run the entire pipeline:
```bash
chmod +x run_pipeline.sh
./run_pipeline.sh
```
This automatically runs all 4 steps in sequence and confirms each succeeds before continuing.
### Individual Steps
#### Step 1: Generate Positions
Generate 500,000 random chess positions:
```bash
python3 generate_positions.py positions.txt
```
Output: `positions.txt` (one FEN per line)
- Plays 8-20 random opening moves
- Filters out positions that are in check, have a capture available, or are game over
- Shows progress bar with tqdm
#### Step 2: Label with Stockfish
Evaluate each position with Stockfish at depth 12:
```bash
export STOCKFISH_PATH=/path/to/stockfish
python3 label_positions.py positions.txt training_data.jsonl $STOCKFISH_PATH
```
Output: `training_data.jsonl` (one JSON per line)
- Format: `{"fen": "...", "eval": 123}` (centipawns)
- Evals clamped to [-2000, 2000] to avoid mate score outliers
- Supports resuming if interrupted (checks for existing entries)
- Shows progress bar with tqdm
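
The clamping and resume behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the actual contents of `label_positions.py`: the helper names `clamp_eval`, `load_labeled_fens`, and `append_label` are hypothetical, and the Stockfish call itself is left out.

```python
import json
import os

EVAL_LIMIT = 2000  # centipawn clamp stated in this README


def clamp_eval(cp, limit=EVAL_LIMIT):
    """Clamp a centipawn score to [-limit, limit]."""
    return max(-limit, min(limit, cp))


def load_labeled_fens(path):
    """Return the set of FENs already present in the JSONL output file."""
    done = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:
                    done.add(json.loads(line)["fen"])
    return done


def append_label(path, fen, cp):
    """Append one {"fen": ..., "eval": ...} record, clamping the eval first."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"fen": fen, "eval": clamp_eval(cp)}) + "\n")
```

A resumable labeling loop would call `load_labeled_fens` once at startup and skip any FEN that is already in the returned set.
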
**Note:** This step is slow (~24-36 hours for 500K positions at depth 12). For testing, reduce the number of games or use a lower depth.
#### Step 3: Train NNUE Model
Train the neural network:
```bash
python3 train_nnue.py training_data.jsonl nnue_weights.pt
```
Output: `nnue_weights.pt` (PyTorch model weights)
Architecture:
- Input: 768 binary features (12 piece types × 64 squares)
- Hidden 1: 256 neurons + ReLU
- Hidden 2: 32 neurons + ReLU
- Output: 1 neuron (sigmoid applied to eval/400)
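
As an illustration of the 768-feature input, the sketch below encodes the piece-placement field of a FEN into a 0/1 vector. The plane order (`PNBRQKpnbrqk`) and the square numbering (a1 = 0, h8 = 63) are assumptions. The real `train_nnue.py` and the Scala evaluator may use a different layout, and whichever layout is chosen must match on both sides.

```python
PIECES = "PNBRQKpnbrqk"  # assumed plane order: white P..K, then black p..k


def fen_to_features(fen):
    """Encode the piece placement of a FEN as a 768-long list of 0.0/1.0."""
    features = [0.0] * (12 * 64)
    placement = fen.split()[0]
    # FEN lists ranks from 8 down to 1; files run a..h within each rank.
    for rank_from_top, row in enumerate(placement.split("/")):
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
            else:
                square = (7 - rank_from_top) * 8 + file  # a1 = 0, h8 = 63
                features[PIECES.index(ch) * 64 + square] = 1.0
                file += 1
    return features
```

For the standard starting position this sets exactly 32 entries to 1.0, one per piece on the board.
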
Training:
- 20 epochs, batch size 4096, Adam optimizer (lr=1e-3)
- 90% train / 10% validation split
- Saves best weights by validation loss
- Shows train/val loss per epoch
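
One reading of "sigmoid applied to eval/400" is that centipawn labels are squashed into (0, 1) before the loss is computed. The sketch below shows that mapping, its inverse, and a 90/10 split. The function names and the fixed shuffle seed are hypothetical and are not taken from `train_nnue.py`.

```python
import math
import random

SCALE = 400.0  # divisor named in the architecture notes


def cp_to_target(cp):
    """Map a centipawn eval to (0, 1) via sigmoid(cp / SCALE)."""
    return 1.0 / (1.0 + math.exp(-cp / SCALE))


def target_to_cp(p):
    """Inverse mapping: a value in (0, 1) back to centipawns."""
    return SCALE * math.log(p / (1.0 - p))


def split_train_val(rows, val_fraction=0.1, seed=0):
    """Shuffle deterministically and return (train_rows, val_rows)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = round(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]
```

With this scaling, an eval of 0 maps to 0.5, and the clamp to ±2000 centipawns keeps targets away from the saturated ends of the sigmoid.
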
**Note:** A GPU is recommended for reasonable speed (~2-4 hours). Training on CPU takes ~8-16 hours.
#### Step 4: Export to Scala
Export weights as Scala code:
```bash
python3 export_weights.py nnue_weights.pt ../src/main/scala/de/nowchess/bot/bots/nnue/NNUEWeights.scala
```
Output: `NNUEWeights.scala`
- Object with `val` arrays for each layer's weights and biases
- Format: `Array[Float]` with precision sufficient for inference
- Includes shape comments for reference
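
The output format described above can be sketched as plain string formatting. This is an assumption-laden illustration rather than the actual `export_weights.py`: the `val` names, the helper names, and the choice of 9 significant digits are hypothetical, and loading the tensors from `nnue_weights.pt` is omitted.

```python
def to_scala_array(name, values, shape, per_line=8):
    """Render a flat sequence of floats as a Scala `val` of type Array[Float]."""
    literals = [f"{v:.9g}f" for v in values]  # e.g. 0.5 -> "0.5f"
    rows = [", ".join(literals[i:i + per_line])
            for i in range(0, len(literals), per_line)]
    body = ",\n    ".join(rows)
    dims = " x ".join(str(d) for d in shape)
    return (f"  // shape: {dims}\n"
            f"  val {name}: Array[Float] = Array(\n"
            f"    {body}\n"
            f"  )\n")


def to_scala_object(layers, object_name="NNUEWeights",
                    package="de.nowchess.bot.bots.nnue"):
    """Build the full source text; layers maps a val name to (flat_values, shape)."""
    header = f"package {package}\n\nobject {object_name}:\n"
    return header + "\n".join(
        to_scala_array(name, values, shape)
        for name, (values, shape) in layers.items())
```

Calling `to_scala_object({"l3Bias": ([0.5], (1,))})` returns the source text for an object with a single one-element array, which can then be written to `NNUEWeights.scala`.
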
## Scala Integration
### Step 5: NNUE Evaluator
Create `NNUE.scala` in `src/main/scala/de/nowchess/bot/bots/nnue/`:
```scala
package de.nowchess.bot.bots.nnue
class NNUE:
  // Load weights from NNUEWeights.scala
  // Convert Position to 768-feature vector
  // Run inference: l1→ReLU→l2→ReLU→l3
  // Return centipawn score
```
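
When porting the inference step to Scala, a plain-Python reference of the l1→ReLU→l2→ReLU→l3 pass can be useful for cross-checking outputs on a few positions. The sketch below assumes each layer is given as a `(weights, bias)` pair with row-major weights (one row per output neuron). That layout and the function names are assumptions, not the engine's actual representation.

```python
def relu(xs):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in xs]


def linear(weights, bias, xs):
    """Dense layer: weights holds one row of len(xs) per output neuron."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, bias)]


def forward(features, l1, l2, l3):
    """Run l1 -> ReLU -> l2 -> ReLU -> l3 and return the single raw output."""
    h1 = relu(linear(*l1, features))
    h2 = relu(linear(*l2, h1))
    return linear(*l3, h2)[0]
```

The value returned here is the raw network output. How it maps back to a centipawn score depends on how the training target was scaled.
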
### Step 6: Integration
Implement `NNUEBot` that uses the NNUE evaluator for move selection.
## File Reference
| File | Purpose |
|------|---------|
| `requirements.txt` | Python dependencies |
| `generate_positions.py` | Step 1: Position generator |
| `label_positions.py` | Step 2: Stockfish labeler |
| `train_nnue.py` | Step 3: NNUE trainer |
| `export_weights.py` | Step 4: Weight exporter |
| `run_pipeline.sh` | Master script (runs steps 1-4) |
| `positions.txt` | Output: Raw FENs (500K) |
| `training_data.jsonl` | Output: FEN+eval pairs |
| `nnue_weights.pt` | Output: Trained weights |
| `../src/main/scala/.../NNUEWeights.scala` | Output: Scala weights |
## Tips
- **For testing:** Reduce `generate_positions.py` to 10,000 games for quick iteration
- **Resume labeling:** Run step 2 again; it skips already-evaluated positions
- **GPU acceleration:** Install CUDA for PyTorch to speed up training
- **Stockfish tuning:** Lower depth (e.g., 8 instead of 12) for faster labeling
- **Batch size:** Increase to 8192 if memory allows; decrease if you run out of memory
## Troubleshooting
**ImportError: No module named 'chess'**
- Run: `pip install -r requirements.txt`
**Stockfish not found**
- Check: `which stockfish` or set `export STOCKFISH_PATH=/full/path/to/stockfish`
**CUDA out of memory**
- Reduce batch size in `train_nnue.py` (e.g., 2048)
- Or use CPU: Remove CUDA check and device setup
**Training loss not decreasing**
- Check data quality: Sample some entries from `training_data.jsonl`
- Try a different learning rate (e.g., 1e-2 or 5e-4) for experimentation
- Verify Stockfish depth was sufficient (depth ≥ 10)
## References
- [NNUE Overview](https://www.chessprogramming.org/NNUE)
- [python-chess](https://python-chess.readthedocs.io/)
- [PyTorch](https://pytorch.org/)
- [Stockfish](https://stockfishchess.org/)