NNUE Training Pipeline

This directory contains the complete NNUE (Efficiently Updatable Neural Network) training pipeline for the Now-Chess bot.

Overview

The pipeline generates 500,000 random chess positions, evaluates them with Stockfish, trains a neural network, and exports the weights as Scala code for integration into the engine.

Prerequisites

Install Python dependencies:

pip install -r requirements.txt

Ensure Stockfish is installed. You can:

  • Install via package manager: apt-get install stockfish (Linux) or brew install stockfish (macOS)
  • Or download from stockfish.org

Set the Stockfish path:

export STOCKFISH_PATH=/path/to/stockfish

Pipeline Steps

Quick Run

Run the entire pipeline:

chmod +x run_pipeline.sh
./run_pipeline.sh

This automatically runs all 4 steps in sequence and confirms each succeeds before continuing.

Individual Steps

Step 1: Generate Positions

Generate 500,000 random chess positions:

python3 generate_positions.py positions.txt

Output: positions.txt (one FEN per line)

  • Plays 8-20 random opening moves
  • Filters out positions that are in check, have a capture available, or are already game over
  • Shows progress bar with tqdm

Step 2: Label with Stockfish

Evaluate each position with Stockfish at depth 12:

export STOCKFISH_PATH=/path/to/stockfish
python3 label_positions.py positions.txt training_data.jsonl $STOCKFISH_PATH

Output: training_data.jsonl (one JSON per line)

  • Format: {"fen": "...", "eval": 123} (centipawns)
  • Evals clamped to [-2000, 2000] to avoid mate score outliers
  • Supports resuming if interrupted (checks for existing entries)
  • Shows progress bar with tqdm
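The clamping and resume behavior listed above can be sketched with the standard library alone. This is a minimal illustration, not the code in label_positions.py: the helper names (clamp_eval, load_labeled_fens, append_label) are hypothetical, and it assumes resume works by collecting the FENs already present in the output file.

```python
import json
import os

EVAL_LIMIT = 2000  # centipawn clamp taken from the format notes above


def clamp_eval(cp: int, limit: int = EVAL_LIMIT) -> int:
    """Clamp a centipawn score to [-limit, limit]."""
    return max(-limit, min(limit, cp))


def load_labeled_fens(path: str) -> set[str]:
    """Return the FENs already present in a JSONL file (empty set if missing)."""
    done: set[str] = set()
    if not os.path.exists(path):
        return done
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["fen"])
            except (json.JSONDecodeError, KeyError):
                continue  # skip malformed lines instead of aborting the resume
    return done


def append_label(path: str, fen: str, cp: int) -> None:
    """Append one {"fen": ..., "eval": ...} record, clamping the score first."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"fen": fen, "eval": clamp_eval(cp)}) + "\n")
```

A labeling loop built on this would call load_labeled_fens once at startup and skip any position whose FEN is already in the returned set.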

Note: This step is slow (~24-36 hours for 500K positions at depth 12). For testing, reduce the number of positions or use a lower depth.
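The estimate above implies a per-position cost, which is handy for sizing a smaller test run. The arithmetic below is a rough sketch that assumes labeling time scales linearly with the number of positions; the function names are illustrative.

```python
def seconds_per_position(num_positions: int, hours: float) -> float:
    """Average labeling time per position implied by a total runtime."""
    return hours * 3600.0 / num_positions


def labeling_hours(num_positions: int, secs_per_position: float) -> float:
    """Estimated total runtime in hours for a given per-position cost."""
    return num_positions * secs_per_position / 3600.0


if __name__ == "__main__":
    fast = seconds_per_position(500_000, 24)  # lower bound of the estimate
    slow = seconds_per_position(500_000, 36)  # upper bound of the estimate
    print(f"{fast:.4f}-{slow:.4f} s per position")
    print(f"10,000 positions: {labeling_hours(10_000, fast):.2f}-"
          f"{labeling_hours(10_000, slow):.2f} hours")
```

Under that linear assumption, 24-36 hours for 500K positions works out to about 0.17-0.26 seconds per position, so a 10,000-position test set would take roughly half an hour to 45 minutes at depth 12.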

Step 3: Train NNUE Model

Train the neural network:

python3 train_nnue.py training_data.jsonl nnue_weights.pt

Output: nnue_weights.pt (PyTorch model weights)

Architecture:

  • Input: 768 binary features (12 piece types × 64 squares)
  • Hidden 1: 256 neurons + ReLU
  • Hidden 2: 32 neurons + ReLU
  • Output: 1 neuron (sigmoid applied to eval/400)
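To make the 768-feature input concrete, here is a minimal sketch of one possible encoding of a FEN into 12 × 64 binary features, plus the parameter count implied by the layer sizes. The plane order (white pieces first, then black) and the square numbering (a1 = 0, h8 = 63) are assumptions; train_nnue.py may use a different layout, and the Scala evaluator must match whichever layout was used in training.

```python
PIECES = "PNBRQKpnbrqk"  # assumed plane order: white pieces, then black
LAYER_SIZES = [768, 256, 32, 1]


def fen_to_features(fen: str) -> list[float]:
    """Encode the piece-placement field of a FEN as 768 binary features."""
    features = [0.0] * (len(PIECES) * 64)
    placement = fen.split()[0]
    for rank_idx, row in enumerate(placement.split("/")):
        rank = 7 - rank_idx  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
            else:
                square = rank * 8 + file  # assumed numbering: a1 = 0, h8 = 63
                features[PIECES.index(ch) * 64 + square] = 1.0
                file += 1
    return features


def parameter_count(sizes: list[int]) -> int:
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
    print(sum(fen_to_features(start)))   # one active feature per piece
    print(parameter_count(LAYER_SIZES))  # total trainable parameters
```

With these layer sizes the network has 205,121 parameters, almost all of them (196,864) in the first layer.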

Training:

  • 20 epochs, batch size 4096, Adam optimizer (lr=1e-3)
  • 90% train / 10% validation split
  • Saves best weights by validation loss
  • Shows train/val loss per epoch
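The data handling behind these settings can be sketched without PyTorch. The example below assumes the training target is the sigmoid of eval/400 (one reading of the output description above) and that the split is a seeded random shuffle; both are assumptions about train_nnue.py, and the function names are illustrative.

```python
import math
import random


def eval_to_target(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to (0, 1) with a sigmoid of cp / scale."""
    return 1.0 / (1.0 + math.exp(-cp / scale))


def split_train_val(records: list, val_fraction: float = 0.1, seed: int = 0):
    """Shuffle a copy of the records and return (train, validation) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def batches_per_epoch(n_train: int, batch_size: int = 4096) -> int:
    """Number of optimizer steps per epoch, counting a final partial batch."""
    return math.ceil(n_train / batch_size)


if __name__ == "__main__":
    train, val = split_train_val(list(range(500_000)))
    print(len(train), len(val), batches_per_epoch(len(train)))
```

For 500,000 labeled positions this gives 450,000 training and 50,000 validation samples, or 110 batches per epoch at batch size 4096.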

Note: A GPU is needed for reasonable speed (~2-4 hours). On CPU, training takes roughly 8-16 hours.

Step 4: Export to Scala

Export weights as Scala code:

python3 export_weights.py nnue_weights.pt ../src/main/scala/de/nowchess/bot/bots/nnue/NNUEWeights.scala

Output: NNUEWeights.scala

  • Object with val arrays for each layer's weights and biases
  • Format: Array[Float] with precision sufficient for inference
  • Includes shape comments for reference
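The output format described above might be generated along the following lines. This is a sketch of the string formatting only, not the contents of export_weights.py: the function names and the val names (l3Weights, l3Bias) are hypothetical, and it uses Scala 3 colon syntax to match the NNUE class outline in this README. Nine significant digits are enough to round-trip a 32-bit float.

```python
def scala_float_array(name: str, values: list[float], shape: tuple[int, ...]) -> str:
    """Render one val holding a flat Array[Float], preceded by a shape comment."""
    literals = ", ".join(f"{v:.9g}f" for v in values)
    dims = " x ".join(str(d) for d in shape)
    return f"  // shape: {dims}\n  val {name}: Array[Float] = Array({literals})"


def scala_weights_object(
    package: str,
    object_name: str,
    layers: dict[str, tuple[list[float], tuple[int, ...]]],
) -> str:
    """Render a Scala 3 object with one val per entry in layers."""
    vals = "\n".join(
        scala_float_array(name, values, shape)
        for name, (values, shape) in layers.items()
    )
    return f"package {package}\n\nobject {object_name}:\n{vals}\n"


if __name__ == "__main__":
    demo = {
        "l3Weights": ([1.0, -0.25], (1, 2)),  # toy values, not trained weights
        "l3Bias": ([0.5], (1,)),
    }
    print(scala_weights_object("de.nowchess.bot.bots.nnue", "NNUEWeights", demo))
```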

Scala Integration

Step 5: NNUE Evaluator

Create NNUE.scala in src/main/scala/de/nowchess/bot/bots/nnue/:

package de.nowchess.bot.bots.nnue

class NNUE:
  // Load weights from NNUEWeights.scala
  // Convert Position to 768-feature vector
  // Run inference: l1→ReLU→l2→ReLU→l3
  // Return centipawn score
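When porting the inference path to Scala, a small pure-Python reference can help verify the outputs on a few positions. The sketch below implements the l1→ReLU→l2→ReLU→l3 chain with plain lists. It assumes each layer is stored as one weight row per output neuron, and that a sigmoid-space value is converted back to centipawns by inverting sigmoid(eval/400). Both are assumptions, and the function names are illustrative.

```python
import math


def relu(xs: list[float]) -> list[float]:
    return [max(0.0, x) for x in xs]


def dense(weights: list[list[float]], bias: list[float], xs: list[float]) -> list[float]:
    """Fully connected layer; weights holds one row per output neuron."""
    return [sum(w * x for w, x in zip(row, xs)) + b for row, b in zip(weights, bias)]


def forward(features, l1, l2, l3) -> float:
    """Run l1 -> ReLU -> l2 -> ReLU -> l3; each layer is a (weights, bias) pair."""
    h1 = relu(dense(*l1, features))
    h2 = relu(dense(*l2, h1))
    return dense(*l3, h2)[0]


def probability_to_centipawns(p: float, scale: float = 400.0, eps: float = 1e-6) -> float:
    """Invert sigmoid(cp / scale); p is clipped away from 0 and 1 first."""
    p = min(1.0 - eps, max(eps, p))
    return scale * math.log(p / (1.0 - p))


if __name__ == "__main__":
    # Toy 2 -> 2 -> 1 -> 1 network, only to show the call shape.
    l1 = ([[1.0, 0.0], [-1.0, 0.0]], [0.0, 0.0])
    l2 = ([[2.0, 3.0]], [0.5])
    l3 = ([[-1.0]], [1.0])
    print(forward([1.0, 0.0], l1, l2, l3))
```

Whether the final conversion is needed depends on whether the exported network emits a raw value or a sigmoid output, so check this against train_nnue.py before relying on it.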

Step 6: Integration

Implement NNUEBot that uses the NNUE evaluator for move selection.

File Reference

| File | Purpose |
| --- | --- |
| requirements.txt | Python dependencies |
| generate_positions.py | Step 1: Position generator |
| label_positions.py | Step 2: Stockfish labeler |
| train_nnue.py | Step 3: NNUE trainer |
| export_weights.py | Step 4: Weight exporter |
| run_pipeline.sh | Master script (runs steps 1-4) |
| positions.txt | Output: Raw FENs (500K) |
| training_data.jsonl | Output: FEN+eval pairs |
| nnue_weights.pt | Output: Trained weights |
| ../src/main/scala/.../NNUEWeights.scala | Output: Scala weights |

Tips

  • For testing: Lower the count in generate_positions.py to 10,000 for quick iteration
  • Resume labeling: Run step 2 again; it skips already-evaluated positions
  • GPU acceleration: Install CUDA for PyTorch to speed up training
  • Stockfish tuning: Lower depth (e.g., 8 instead of 12) for faster labeling
  • Batch size: Increase to 8192 if GPU memory allows; decrease it if you run out of memory

Troubleshooting

ImportError: No module named 'chess'

  • Run: pip install -r requirements.txt

Stockfish not found

  • Check: run which stockfish, or set the path explicitly with export STOCKFISH_PATH=/full/path/to/stockfish
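The same check can be done from Python before starting a long labeling run. This is a small sketch that assumes the STOCKFISH_PATH variable takes priority and the system PATH is the fallback; find_stockfish is a hypothetical helper, not part of the pipeline scripts.

```python
import os
import shutil


def find_stockfish(env=None):
    """Return a usable Stockfish path, or None if nothing executable is found."""
    env = os.environ if env is None else env
    candidates = [env.get("STOCKFISH_PATH"), shutil.which("stockfish")]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    path = find_stockfish()
    print(path if path else "Stockfish not found: set STOCKFISH_PATH")
```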

CUDA out of memory

  • Reduce batch size in train_nnue.py (e.g., 2048)
  • Or use CPU: Remove CUDA check and device setup

Training loss not decreasing

  • Check data quality: Sample some entries from training_data.jsonl
  • Experiment with the learning rate: try a higher value (1e-2) or a lower one (5e-4) than the default 1e-3
  • Verify Stockfish depth was sufficient (depth ≥ 10)
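For the data-quality check, a quick summary of the labels is often more informative than reading individual lines. The sketch below assumes the {"fen": ..., "eval": ...} JSONL format from Step 2 and the ±2000 clamp; summarize_evals is a hypothetical helper. A very large clamped fraction, or a standard deviation near zero, would suggest a labeling problem.

```python
import json
import random
import statistics


def summarize_evals(path: str, sample_size: int = 5, seed: int = 0, limit: int = 2000) -> dict:
    """Load a JSONL label file and return basic statistics plus a random sample."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                records.append(json.loads(line))
    if not records:
        raise ValueError(f"no records found in {path}")
    evals = [r["eval"] for r in records]
    return {
        "count": len(evals),
        "mean": statistics.fmean(evals),
        "stdev": statistics.pstdev(evals),
        # Share of labels sitting on the clamp boundary (likely mate scores).
        "clamped_fraction": sum(abs(e) >= limit for e in evals) / len(evals),
        "sample": random.Random(seed).sample(records, min(sample_size, len(records))),
    }


if __name__ == "__main__":
    import sys
    print(summarize_evals(sys.argv[1]))
```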
