NNUE Training Pipeline
This directory contains the complete NNUE (Efficiently Updatable Neural Network) training pipeline for the Now-Chess bot.
Overview
The pipeline generates 500,000 random chess positions, evaluates them with Stockfish, trains a neural network, and exports the weights as Scala code for integration into the engine.
Prerequisites
Install Python dependencies:
pip install -r requirements.txt
Ensure Stockfish is installed. You can:
- Install via package manager: apt-get install stockfish (Linux) or brew install stockfish (macOS)
- Or download from stockfish.org
Set the Stockfish path:
export STOCKFISH_PATH=/path/to/stockfish
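A script can check the Stockfish path before starting a long run. The following is a minimal sketch, not the pipeline's actual lookup code: the helper name resolve_stockfish is hypothetical, and it assumes the binary is either referenced by STOCKFISH_PATH or reachable as stockfish on PATH.

```python
import os
import shutil
from typing import Optional


def resolve_stockfish(env_var: str = "STOCKFISH_PATH") -> Optional[str]:
    """Return a usable Stockfish path, or None if none can be found.

    Hypothetical helper: the real scripts may locate the engine differently.
    """
    # Prefer the environment variable when it points at an executable file.
    candidate = os.environ.get(env_var)
    if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    # Otherwise fall back to whatever "stockfish" resolves to on PATH.
    return shutil.which("stockfish")


if __name__ == "__main__":
    path = resolve_stockfish()
    print(path if path else "Stockfish not found; set STOCKFISH_PATH")
```

Running this before step 2 surfaces a missing engine immediately instead of partway through labeling.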
Pipeline Steps
Quick Run
Run the entire pipeline:
chmod +x run_pipeline.sh
./run_pipeline.sh
This automatically runs all 4 steps in sequence and confirms each succeeds before continuing.
Individual Steps
Step 1: Generate Positions
Generate 500,000 random chess positions:
python3 generate_positions.py positions.txt
Output: positions.txt (one FEN per line)
- Plays 8-20 random opening moves
- Filters out positions that are in check, have a capture available, or are game-over
- Shows progress bar with tqdm
Step 2: Label with Stockfish
Evaluate each position with Stockfish at depth 12:
export STOCKFISH_PATH=/path/to/stockfish
python3 label_positions.py positions.txt training_data.jsonl "$STOCKFISH_PATH"
Output: training_data.jsonl (one JSON per line)
- Format: {"fen": "...", "eval": 123} (centipawns)
- Evals clamped to [-2000, 2000] to avoid mate score outliers
- Supports resuming if interrupted (checks for existing entries)
- Shows progress bar with tqdm
Note: This step is slow (~24-36 hours for 500K positions at depth 12). For testing, reduce the number of games or use a lower depth.
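The clamping and resume behavior described for this step can be sketched with the standard library alone. This is an illustration under assumptions, not the code in label_positions.py: the function names are hypothetical, and "checks for existing entries" is read here as "skip any FEN already present in the output file".

```python
import json
import os

EVAL_LIMIT = 2000  # centipawns, matching the [-2000, 2000] clamp


def clamp_eval(cp: int, limit: int = EVAL_LIMIT) -> int:
    """Clamp a centipawn score into [-limit, limit]."""
    return max(-limit, min(limit, cp))


def load_labeled_fens(path: str) -> set:
    """Collect FENs already written to the output file so they can be skipped."""
    done = set()
    if not os.path.exists(path):
        return done
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["fen"])
            except (json.JSONDecodeError, KeyError, TypeError):
                # A half-written last line after an interruption is ignored.
                continue
    return done


def append_label(path: str, fen: str, cp: int) -> None:
    """Append one {"fen": ..., "eval": ...} record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"fen": fen, "eval": clamp_eval(cp)}) + "\n")
```

Appending one line per position means an interrupted run loses at most the record being written, which the loader tolerates.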
Step 3: Train NNUE Model
Train the neural network:
python3 train_nnue.py training_data.jsonl nnue_weights.pt
Output: nnue_weights.pt (PyTorch model weights)
Architecture:
- Input: 768 binary features (6 piece types × 2 colors × 64 squares)
- Hidden 1: 256 neurons + ReLU
- Hidden 2: 32 neurons + ReLU
- Output: 1 neuron (sigmoid applied to eval/400)
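To make the 768-feature input concrete, here is a minimal sketch of one possible encoding from a FEN string. The plane order (white P, N, B, R, Q, K, then black) and the square numbering (a1 = 0, h8 = 63) are assumptions for illustration; the trainer and the Scala evaluator must agree on whichever layout train_nnue.py actually uses.

```python
PIECES = "PNBRQKpnbrqk"  # assumed plane order: white P..K, then black p..k


def fen_to_features(fen: str) -> list:
    """Encode the piece-placement field of a FEN as 768 binary features."""
    features = [0.0] * (12 * 64)
    placement = fen.split()[0]
    # FEN lists ranks from 8 down to 1; files run a..h within each rank.
    for rank_from_top, row in enumerate(placement.split("/")):
        file_idx = 0
        for ch in row:
            if ch.isdigit():
                file_idx += int(ch)  # run of empty squares
            else:
                square = (7 - rank_from_top) * 8 + file_idx  # a1 = 0, h8 = 63
                features[PIECES.index(ch) * 64 + square] = 1.0
                file_idx += 1
    return features
```

Each piece on the board sets exactly one feature, so the starting position has 32 active inputs out of 768.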
Training:
- 20 epochs, batch size 4096, Adam optimizer (lr=1e-3)
- 90% train / 10% validation split
- Saves best weights by validation loss
- Shows train/val loss per epoch
Note: A GPU is recommended for reasonable speed (~2-4 hours). On CPU, expect ~8-16 hours.
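The target scaling and the 90/10 split can be illustrated without PyTorch. This sketch assumes the regression target is sigmoid(eval/400), which is one reading of the output description above; the helper names and the fixed seed are hypothetical.

```python
import math
import random

SCALE = 400.0  # divisor from the architecture notes (eval/400)


def eval_to_target(cp: float) -> float:
    """Map a centipawn score into (0, 1) via sigmoid(cp / 400)."""
    return 1.0 / (1.0 + math.exp(-cp / SCALE))


def train_val_split(records, val_fraction: float = 0.1, seed: int = 0):
    """Shuffle a copy of the records and split off a validation slice."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

With this scaling an even position maps to 0.5, and the clamped extremes of ±2000 centipawns map to roughly 0.993 and 0.007, so the clamp keeps targets away from the saturated ends of the sigmoid.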
Step 4: Export to Scala
Export weights as Scala code:
python3 export_weights.py nnue_weights.pt ../src/main/scala/de/nowchess/bot/bots/nnue/NNUEWeights.scala
Output: NNUEWeights.scala
- Object with val arrays for each layer's weights and biases
- Format: Array[Float] with precision sufficient for inference
- Includes shape comments for reference
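The shape of the generated file can be sketched as plain string formatting. This is not the code in export_weights.py: the function names, the l3Bias array name, and the 8-significant-digit formatting are assumptions, and the values are assumed to be finite floats already flattened from the PyTorch tensors.

```python
def scala_array(name: str, values: list, shape: tuple, per_line: int = 8) -> str:
    """Render a flat list of floats as a Scala val of type Array[Float]."""
    literals = [f"{v:.8g}f" for v in values]  # e.g. 0.5 -> "0.5f"
    rows = [
        "    " + ", ".join(literals[i:i + per_line])
        for i in range(0, len(literals), per_line)
    ]
    shape_comment = " x ".join(str(d) for d in shape)
    return (
        f"  // shape: {shape_comment}\n"
        f"  val {name}: Array[Float] = Array(\n"
        + ",\n".join(rows)
        + "\n  )"
    )


def scala_object(object_name: str, package: str, arrays: dict) -> str:
    """Wrap several arrays in one Scala 3 object; arrays maps name -> (values, shape)."""
    body = "\n\n".join(
        scala_array(name, values, shape) for name, (values, shape) in arrays.items()
    )
    return f"package {package}\n\nobject {object_name}:\n{body}\n"


if __name__ == "__main__":
    demo = {"l3Bias": ([0.5, -0.25, 1.0], (3,))}
    print(scala_object("NNUEWeights", "de.nowchess.bot.bots.nnue", demo))
```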
Scala Integration
Step 5: NNUE Evaluator
Create NNUE.scala in src/main/scala/de/nowchess/bot/bots/nnue/:
package de.nowchess.bot.bots.nnue
class NNUE:
  // Load weights from NNUEWeights.scala
  // Convert Position to 768-feature vector
  // Run inference: l1 → ReLU → l2 → ReLU → l3
  // Return centipawn score
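Before porting the evaluator to Scala, it can help to have a small Python reference for the inference steps listed in the skeleton, so the Scala output can be compared against it. The sketch below makes several assumptions: weights are stored row-major as (outputs × inputs), each layer is passed as a (weights, bias) pair, and the network output is a probability-like value that is mapped back to centipawns by inverting sigmoid(eval/400). If the trained model emits a different quantity, the last function needs to change accordingly.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: y[j] = bias[j] + sum_i weights[j][i] * x[i]."""
    return [b + sum(w * v for w, v in zip(row, x)) for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def forward(features, l1, l2, l3):
    """Run l1 -> ReLU -> l2 -> ReLU -> l3 and return the single output value."""
    h1 = relu(dense(features, *l1))
    h2 = relu(dense(h1, *l2))
    return dense(h2, *l3)[0]


def prob_to_centipawns(p, scale=400.0, eps=1e-6):
    """Invert sigmoid(eval / scale); p is kept inside (0, 1) to avoid log(0)."""
    p = min(1.0 - eps, max(eps, p))
    return round(scale * math.log(p / (1.0 - p)))
```

The pure-Python loops are slow but easy to audit, which is the point of a reference implementation; the Scala version would typically use flat Array[Float] indexing instead of nested lists.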
Step 6: Integration
Implement NNUEBot that uses the NNUE evaluator for move selection.
File Reference
| File | Purpose |
|---|---|
| requirements.txt | Python dependencies |
| generate_positions.py | Step 1: Position generator |
| label_positions.py | Step 2: Stockfish labeler |
| train_nnue.py | Step 3: NNUE trainer |
| export_weights.py | Step 4: Weight exporter |
| run_pipeline.sh | Master script (runs steps 1-4) |
| positions.txt | Output: Raw FENs (500K) |
| training_data.jsonl | Output: FEN+eval pairs |
| nnue_weights.pt | Output: Trained weights |
| ../src/main/scala/.../NNUEWeights.scala | Output: Scala weights |
Tips
- For testing: Reduce generate_positions.py to 10,000 games for quick iteration
- Resume labeling: Run step 2 again; it skips already-evaluated positions
- GPU acceleration: Install CUDA for PyTorch to speed up training
- Stockfish tuning: Lower depth (e.g., 8 instead of 12) for faster labeling
- Batch size: Increase to 8192 if memory allows; decrease if you run out of memory
Troubleshooting
ImportError: No module named 'chess'
- Run: pip install -r requirements.txt
Stockfish not found
- Check: which stockfish, or set export STOCKFISH_PATH=/full/path/to/stockfish
CUDA out of memory
- Reduce batch size in train_nnue.py (e.g., 2048)
- Or use CPU: Remove CUDA check and device setup
Training loss not decreasing
- Check data quality: Sample some entries from training_data.jsonl
- Adjust the learning rate for experimentation (e.g., raise it to 1e-2 or lower it to 5e-4)
- Verify Stockfish depth was sufficient (depth ≥ 10)
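For the data-quality check, a quick summary of training_data.jsonl is often more informative than reading a few lines by eye. The following is a minimal sketch assuming the {"fen": ..., "eval": ...} record format described in step 2; summarize_labels is a hypothetical helper, not part of the pipeline.

```python
import json


def summarize_labels(path: str, limit: int = 2000) -> dict:
    """Count valid and malformed rows and report basic eval statistics."""
    rows = malformed = clamped = 0
    total = 0
    lo = hi = None
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
                cp = int(record["eval"])
                record["fen"]  # raises KeyError if the FEN is missing
            except (json.JSONDecodeError, KeyError, TypeError, ValueError):
                malformed += 1
                continue
            rows += 1
            total += cp
            lo = cp if lo is None else min(lo, cp)
            hi = cp if hi is None else max(hi, cp)
            if abs(cp) >= limit:
                clamped += 1  # sitting at the clamp boundary
    mean = total / rows if rows else None
    return {"rows": rows, "malformed": malformed, "clamped": clamped,
            "min": lo, "max": hi, "mean": mean}


if __name__ == "__main__":
    print(summarize_labels("training_data.jsonl"))
```

A large share of clamped rows, many malformed lines, or a min and max that are nearly equal would each point to a labeling problem rather than a training problem.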