feat(official-bots): standalone self-play + one-shot dataset builder for NNUE training

Add an easy local data pipeline feeding GPU training on Colab.

- SelfPlayMain: standalone NNUEBot self-play (no microservices) writing FENs for labeling; randomised openings for game diversity, sequential due to the shared EvaluationNNUE accumulator. Exposed via the `selfPlay` Gradle task and selfplay.sh.
- NNUEBot: optional fixedMoveTimeMs so self-play runs fast (default unchanged).
- NbaiLoader: honor `-Dnnue.weights=<path>` to load weights from a file before falling back to the bundled resource.
- build_dataset.py / dataset.sh: one command builds the entire dataset (Lichess eval-DB backbone + self-play + tactical + random filler), dedups, balances the eval histogram, writes append-only zstd shards + manifest, and rclone-pushes to Drive.
- train.py: NNUEDataset reads a directory of .jsonl.zst shards (streaming) in addition to a single file.
- NNUETraining.ipynb: clone to ephemeral /content, sync shards from Drive (cache-aware), train on the shards dir; removed Colab generation/upload steps.
- Concept + implementation plan docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 22:04:22 +02:00
parent c8cbcdca3b
commit 1c80abdb8a
11 changed files with 909 additions and 198 deletions
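
The commit message says build_dataset.py "dedups" and "balances the eval histogram", but that file is not part of the diff shown here. The sketch below is one plausible reading of those two steps, not the code from build_dataset.py. The record fields `fen` and `eval_cp`, the function names, the four-field FEN key, and the bucket, cap and clamp parameters are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def dedup_by_position(records):
    """Keep the first record seen for each distinct position.

    Assumption: two FENs count as the same position when their first four
    fields (pieces, side to move, castling, en passant) match, so the
    halfmove and fullmove counters are ignored.
    """
    seen = set()
    unique = []
    for rec in records:
        key = " ".join(rec["fen"].split()[:4])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def balance_eval_histogram(records, bucket_cp=100, max_per_bucket=10_000,
                           clamp_cp=1000, seed=0):
    """Cap the number of records in each evaluation bucket.

    Evals (hypothetical `eval_cp` field, in centipawns) are clamped to
    [-clamp_cp, clamp_cp] and grouped into buckets of width bucket_cp.
    Buckets holding more than max_per_bucket records are randomly
    subsampled, which flattens the usual spike of near-equal positions.
    """
    rng = random.Random(seed)  # seeded so repeated builds pick the same sample
    buckets = defaultdict(list)
    for rec in records:
        cp = max(-clamp_cp, min(clamp_cp, rec["eval_cp"]))
        buckets[cp // bucket_cp].append(rec)

    balanced = []
    for key in sorted(buckets):
        group = buckets[key]
        if len(group) > max_per_bucket:
            group = rng.sample(group, max_per_bucket)
        balanced.extend(group)
    return balanced
```

Running `balance_eval_histogram(dedup_by_position(records))` applies both steps in the order the commit message lists them. Deduplicating first keeps repeated positions from crowding out distinct ones when a bucket is capped.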
@@ -21,15 +21,7 @@
  {
   "cell_type": "markdown",
   "metadata": {},
-   "source": [
-    "# NNUE Training Pipeline\n",
-    "\n",
-    "End-to-end notebook: data generation → Stockfish labeling → training → `.nbai` export.\n",
-    "\n",
-    "**Runtime:** GPU (T4 or better). Runtime → Change runtime type → T4 GPU.\n",
-    "\n",
-    "**Persistence:** Checkpoints and datasets are saved to Google Drive so training can resume after session timeout."
-   ],
+   "source": "# NNUE Training Pipeline\n\nGPU training on Colab. Data is built **locally** (`./dataset.sh` → sharded, pushed to\nDrive via rclone); this notebook only **syncs shards → trains → exports `.nbai`**.\nNo generation, no Stockfish labeling, no browser uploads here.\n\n**Runtime:** GPU (T4 or better). Runtime → Change runtime type → T4 GPU.\n\n**Persistence:** Datasets and checkpoints live on Google Drive, so training resumes\nafter a session timeout. The repo is cloned to ephemeral `/content` for speed.",
   "id": "intro-md"
  },
  {
@@ -58,25 +50,7 @@
   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "import os\n",
-    "\n",
-    "# ── Configure these paths once ───────────────────────────────────────────────\n",
-    "REPO_URL     = 'https://git.janis-eccarius.de/NowChess/NowChessSystems.git'\n",
-    "DRIVE_ROOT   = '/content/drive/MyDrive/NowChess'\n",
-    "REPO_DIR     = f'{DRIVE_ROOT}/NowChessSystems'\n",
-    "PYTHON_DIR   = f'{REPO_DIR}/modules/official-bots/python'\n",
-    "# ─────────────────────────────────────────────────────────────────────────────\n",
-    "\n",
-    "os.makedirs(DRIVE_ROOT, exist_ok=True)\n",
-    "\n",
-    "if not os.path.isdir(REPO_DIR):\n",
-    "    !git clone --depth=1 \"{REPO_URL}\" \"{REPO_DIR}\"\n",
-    "    print('Repo cloned to Drive.')\n",
-    "else:\n",
-    "    !git -C \"{REPO_DIR}\" pull --ff-only\n",
-    "    print('Repo updated.')"
-   ],
+   "source": "import os\n\n# ── Configure these paths once ───────────────────────────────────────────────\nREPO_URL   = 'https://git.janis-eccarius.de/NowChess/NowChessSystems.git'\nDRIVE_ROOT = '/content/drive/MyDrive/NowChess'   # datasets + weights persist here\nREPO_DIR   = '/content/NowChessSystems'          # ephemeral, fast local clone\nPYTHON_DIR = f'{REPO_DIR}/modules/official-bots/python'\n# ─────────────────────────────────────────────────────────────────────────────\n\nos.makedirs(DRIVE_ROOT, exist_ok=True)\n\n# Clone to ephemeral /content (NOT Drive) — fast checkout, no Drive bloat.\nif not os.path.isdir(REPO_DIR):\n    !git clone --depth=1 \"{REPO_URL}\" \"{REPO_DIR}\"\n    print('Repo cloned to /content.')\nelse:\n    !git -C \"{REPO_DIR}\" pull --ff-only\n    print('Repo updated.')",
   "id": "clone-repo"
  },
  {
@@ -84,35 +58,13 @@
   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "# Install Python dependencies\n",
-    "!pip install -q chess tqdm rich zstandard\n",
-    "\n",
-    "# Stockfish for position labeling\n",
-    "!apt-get install -q -y stockfish\n",
-    "import shutil\n",
-    "STOCKFISH_PATH = shutil.which('stockfish') or '/usr/games/stockfish'\n",
-    "print(f'Stockfish: {STOCKFISH_PATH}')\n",
-    "\n",
-    "# Add pipeline source to path\n",
-    "import sys\n",
-    "sys.path.insert(0, f'{PYTHON_DIR}/src')\n",
-    "sys.path.insert(0, PYTHON_DIR)\n",
-    "print('Python path configured.')"
-   ],
+   "source": "# Install Python dependencies. No Stockfish — labeling happens on the local box,\n# this notebook only trains on already-labeled shards.\n!pip install -q chess tqdm rich zstandard\n\nimport sys\nsys.path.insert(0, f'{PYTHON_DIR}/src')\nsys.path.insert(0, PYTHON_DIR)\nprint('Python path configured.')",
   "id": "install-deps"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
-   "source": [
-    "---\n",
-    "## 🗄️ 2 — Data\n",
-    "\n",
-    "Choose **one** of the two options below:\n",
-    "- **Option A** — generate FEN positions with random play, then label them with Stockfish.\n",
-    "- **Option B** — upload an existing `labeled.jsonl` from your machine or Drive."
-   ],
+   "source": "---\n## 🗄️ 2 — Data\n\nDatasets are built **locally** (`./dataset.sh`) and pushed to Drive with rclone as\ncompressed shards under `MyDrive/NowChess/datasets/`. Here we just sync those shards\nto the fast local disk — no generation, no labeling, no browser uploads.\n\nThe cell reads `manifest.json` and copies only shards not already cached on `/content`.",
   "id": "data-md"
  },
  {
@@ -120,91 +72,9 @@
   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "from pathlib import Path\n",
-    "\n",
-    "# Paths (all on Drive so they survive session restarts)\n",
-    "DATA_DIR      = Path(DRIVE_ROOT) / 'training_data'\n",
-    "DATA_DIR.mkdir(parents=True, exist_ok=True)\n",
-    "POSITIONS_FILE = DATA_DIR / 'positions.txt'   # raw FENs\n",
-    "LABELED_FILE   = DATA_DIR / 'labeled.jsonl'   # FEN + eval pairs\n",
-    "\n",
-    "print(f'Data directory: {DATA_DIR}')"
-   ],
+   "source": "import json, shutil\nfrom pathlib import Path\n\n# Source: shards synced from the local box via `rclone copy datasets/ gdrive:NowChess/datasets`\nDRIVE_DATASETS = Path(DRIVE_ROOT) / 'datasets'\nLOCAL_DATASETS = Path('/content/datasets')\n(LOCAL_DATASETS / 'shards').mkdir(parents=True, exist_ok=True)\n\nmanifest = json.load(open(DRIVE_DATASETS / 'manifest.json'))\nprint(f\"Dataset v{manifest['dataset_version']}: \"\n      f\"{manifest['total_positions']:,} positions across {len(manifest['shards'])} shards\")\n\ncopied = 0\nfor sh in manifest['shards']:\n    dst = LOCAL_DATASETS / 'shards' / sh['file']\n    if not dst.exists():                      # cache: only copy shards we don't already have\n        shutil.copy(DRIVE_DATASETS / 'shards' / sh['file'], dst)\n        copied += 1\nshutil.copy(DRIVE_DATASETS / 'manifest.json', LOCAL_DATASETS / 'manifest.json')\n\nDATA_PATH = str(LOCAL_DATASETS)               # train_nnue / burst_train read this dir of shards directly\nprint(f\"Synced {copied} new shard(s). Dataset ready at {DATA_PATH}\")",
   "id": "data-paths"
  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# ── Option A: Generate + label ────────────────────────────────────────────────\n",
-    "# Adjust NUM_POSITIONS to taste. 50 000 trains in ~10 min on T4;\n",
-    "# 200 000+ gives better generalisation.\n",
-    "NUM_POSITIONS    = 50_000\n",
-    "STOCKFISH_DEPTH  = 12\n",
-    "LABEL_WORKERS    = 4       # parallel Stockfish processes\n",
-    "MIN_MOVE         = 5       # skip opening book moves\n",
-    "MAX_MOVE         = 60\n",
-    "\n",
-    "from generate import play_random_game_and_collect_positions\n",
-    "from label    import label_positions_with_stockfish\n",
-    "\n",
-    "print(f'Generating {NUM_POSITIONS:,} positions...')\n",
-    "count = play_random_game_and_collect_positions(\n",
-    "    str(POSITIONS_FILE),\n",
-    "    total_positions=NUM_POSITIONS,\n",
-    "    samples_per_game=1,\n",
-    "    min_move=MIN_MOVE,\n",
-    "    max_move=MAX_MOVE,\n",
-    "    num_workers=4,\n",
-    ")\n",
-    "print(f'{count:,} positions written to {POSITIONS_FILE}')\n",
-    "\n",
-    "print('Labeling with Stockfish (this is the slow step)...')\n",
-    "ok = label_positions_with_stockfish(\n",
-    "    str(POSITIONS_FILE),\n",
-    "    str(LABELED_FILE),\n",
-    "    STOCKFISH_PATH,\n",
-    "    depth=STOCKFISH_DEPTH,\n",
-    "    num_workers=LABEL_WORKERS,\n",
-    ")\n",
-    "if ok:\n",
-    "    print(f'Labeled dataset saved: {LABELED_FILE}')\n",
-    "else:\n",
-    "    print('ERROR: labeling failed')"
-   ],
-   "id": "option-a-generate"
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# ── Option B: Upload existing labeled.jsonl ───────────────────────────────────\n",
-    "# Run this cell instead of Option A if you already have a labeled dataset.\n",
-    "#\n",
-    "# To upload from local machine:\n",
-    "#   from google.colab import files\n",
-    "#   uploaded = files.upload()   # pick your labeled.jsonl\n",
-    "#   import shutil, os\n",
-    "#   shutil.move(next(iter(uploaded)), str(LABELED_FILE))\n",
-    "#\n",
-    "# Or copy from Drive:\n",
-    "#   import shutil\n",
-    "#   shutil.copy('/content/drive/MyDrive/path/to/labeled.jsonl', str(LABELED_FILE))\n",
-    "\n",
-    "import os\n",
-    "if LABELED_FILE.exists():\n",
-    "    lines = sum(1 for _ in open(LABELED_FILE))\n",
-    "    print(f'Ready: {lines:,} labeled positions at {LABELED_FILE}')\n",
-    "else:\n",
-    "    print('No labeled.jsonl found — run Option A first or upload one.')"
-   ],
-   "id": "option-b-upload"
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -251,22 +121,7 @@
   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "# ── Standard training ─────────────────────────────────────────────────────────\n",
-    "# Use this when you have a reliable long-running session.\n",
-    "\n",
-    "train_nnue(\n",
-    "    data_file=str(LABELED_FILE),\n",
-    "    output_file=OUTPUT_FILE,\n",
-    "    epochs=EPOCHS,\n",
-    "    batch_size=BATCH_SIZE,\n",
-    "    checkpoint=CHECKPOINT,\n",
-    "    use_versioning=True,\n",
-    "    early_stopping_patience=EARLY_STOPPING,\n",
-    "    subsample_ratio=SUBSAMPLE_RATIO,\n",
-    "    hidden_sizes=HIDDEN_SIZES,\n",
-    ")"
-   ],
+   "source": "# ── Standard training ─────────────────────────────────────────────────────────\n# Use this when you have a reliable long-running session.\n\ntrain_nnue(\n    data_file=DATA_PATH,\n    output_file=OUTPUT_FILE,\n    epochs=EPOCHS,\n    batch_size=BATCH_SIZE,\n    checkpoint=CHECKPOINT,\n    use_versioning=True,\n    early_stopping_patience=EARLY_STOPPING,\n    subsample_ratio=SUBSAMPLE_RATIO,\n    hidden_sizes=HIDDEN_SIZES,\n)",
   "id": "standard-train"
  },
  {
@@ -274,28 +129,7 @@
   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "# ── Burst training (recommended for Colab free tier) ─────────────────────────\n",
-    "# Restarts from the global best each time early stopping fires.\n",
-    "# Set BURST_MINUTES to slightly less than the Colab session limit (~70 min).\n",
-    "\n",
-    "BURST_MINUTES      = 70\n",
-    "EPOCHS_PER_SEASON  = 30\n",
-    "BURST_PATIENCE     = 8\n",
-    "\n",
-    "burst_train(\n",
-    "    data_file=str(LABELED_FILE),\n",
-    "    output_file=OUTPUT_FILE,\n",
-    "    duration_minutes=BURST_MINUTES,\n",
-    "    epochs_per_season=EPOCHS_PER_SEASON,\n",
-    "    early_stopping_patience=BURST_PATIENCE,\n",
-    "    batch_size=BATCH_SIZE,\n",
-    "    initial_checkpoint=CHECKPOINT,\n",
-    "    use_versioning=True,\n",
-    "    subsample_ratio=SUBSAMPLE_RATIO,\n",
-    "    hidden_sizes=HIDDEN_SIZES,\n",
-    ")"
-   ],
+   "source": "# ── Burst training (recommended for Colab free tier) ─────────────────────────\n# Restarts from the global best each time early stopping fires.\n# Set BURST_MINUTES to slightly less than the Colab session limit (~70 min).\n\nBURST_MINUTES      = 70\nEPOCHS_PER_SEASON  = 30\nBURST_PATIENCE     = 8\n\nburst_train(\n    data_file=DATA_PATH,\n    output_file=OUTPUT_FILE,\n    duration_minutes=BURST_MINUTES,\n    epochs_per_season=EPOCHS_PER_SEASON,\n    early_stopping_patience=BURST_PATIENCE,\n    batch_size=BATCH_SIZE,\n    initial_checkpoint=CHECKPOINT,\n    use_versioning=True,\n    subsample_ratio=SUBSAMPLE_RATIO,\n    hidden_sizes=HIDDEN_SIZES,\n)",
   "id": "burst-train"
  },
  {
@@ -374,4 +208,4 @@
   "id": "download-cell"
  }
 ]
-}
+}