Adds two new triggers that complement NeoForge's vanilla SaveToFile event:
PeriodicSaveService.java
- Dedicated single-thread daemon scheduler, started after server boot.
- Ticks every 'auto_save_interval_minutes' (config, default 10 min).
- On each tick: hops to the main thread and snapshots every online synced
player via VanillaSync.snapshotAndQueueSave, then writes asynchronously in
the background behind the full P0 guard stack (pendingLogoutSaves +
online=0 + bgLock tryLock).
- Set interval to 0 to disable.
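A minimal sketch of the scheduler lifecycle described above, assuming a plain ScheduledExecutorService. The class name PeriodicSaveSketch and its methods are hypothetical and are not the mod's actual API. The hop to the main thread and the snapshot itself are left to the saveTick callback.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative, not the real PeriodicSaveService.
final class PeriodicSaveSketch {
    private ScheduledExecutorService scheduler;

    /** Starts the scheduler. Returns false when the interval disables it or it already runs. */
    synchronized boolean start(int intervalMinutes, Runnable saveTick) {
        if (intervalMinutes <= 0 || scheduler != null) {
            return false; // an interval of 0 disables the periodic save
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-save");
            t.setDaemon(true); // a daemon thread never keeps the JVM alive on its own
            return t;
        });
        // saveTick is expected to hand the snapshot work over to the main thread.
        scheduler.scheduleAtFixedRate(saveTick, intervalMinutes, intervalMinutes, TimeUnit.MINUTES);
        return true;
    }

    synchronized boolean isRunning() {
        return scheduler != null;
    }

    /** Stops the scheduler; intended to run before the connection pool is closed. */
    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }
}
```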
VanillaSync.snapshotAndQueueSave(Player, String label)
- Extracted from onPlayerSaveToFile body; public entry point shared by
PeriodicSaveService, onPlayerChangeDimension, and the existing SaveToFile
event. Label flows into logs for traceability (SaveToFile / PERIODIC / DIMENSION).
VanillaSync.onPlayerChangeDimension
- New @SubscribeEvent on PlayerChangedDimensionEvent, gated by
'save_on_dimension_change' config (default false). Queues a full save
when a player teleports across dimensions, protecting against mid-
teleport crashes.
JdbcConfig
- Added AUTO_SAVE_INTERVAL_MINUTES (int, 0-1440, default 10)
- Added SAVE_ON_DIMENSION_CHANGE (bool, default false)
VanillaSync.onServerShutdown also stops PeriodicSaveService before the pool
close, same pattern as HeartbeatService.
Adds three utilities to harden PlayerSync against ungraceful server exits:
CrashRecovery.java
- installShutdownHook: registers a non-daemon JVM shutdown hook that calls
VanillaSync.emergencyFlushAll() synchronously when the process is killed
(SIGTERM, kill, OOM, host reboot). Covers the case where the normal
ServerStoppingEvent path never runs.
- clearOrphanedOnlineFlags: on startup, clears any online=1 player_data
rows pointing to this server_id (left by a previous crash). Reports the
count via SyncLogger so admins can see recovery activity.
- reportZombiePeers: logs peer server_ids whose heartbeat is missing or
stale (>60s), exposing the root of doPlayerJoin poll timeouts.
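The shutdown-hook part can be sketched as follows, assuming the flush is passed in as a Runnable. ShutdownHookSketch and install are hypothetical names; only Runtime.addShutdownHook is real JDK API.

```java
// Hypothetical sketch of a non-daemon JVM shutdown hook; not the real CrashRecovery class.
final class ShutdownHookSketch {
    /** Registers a hook that runs the flush synchronously while the JVM exits. */
    static Thread install(Runnable emergencyFlush) {
        Thread hook = new Thread(() -> {
            try {
                emergencyFlush.run();
            } catch (Throwable t) {
                // Best effort only: the process is exiting, so just report the failure.
                System.err.println("emergency flush failed: " + t);
            }
        }, "emergency-flush");
        hook.setDaemon(false);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

JVM shutdown hooks run on an orderly exit or SIGTERM but not on SIGKILL (kill -9), so a startup cleanup such as clearOrphanedOnlineFlags is still needed for that case.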
HeartbeatService.java
- Single-thread daemon scheduler pinging server_info.last_update every 10s.
- Lets peer servers distinguish live from dead via isPeerServerStale().
- Stopped explicitly in VanillaSync.onServerShutdown before pool close.
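The staleness rule (heartbeat missing or older than 60s) could look roughly like this. HeartbeatSketch and isPeerStale are hypothetical names, and the sketch assumes last_update has already been read into an Instant, with null standing for a missing row.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the stale-peer check; not the real isPeerServerStale().
final class HeartbeatSketch {
    static final Duration STALE_AFTER = Duration.ofSeconds(60);

    /** lastUpdate is null when the peer has no heartbeat row at all. */
    static boolean isPeerStale(Instant lastUpdate, Instant now) {
        if (lastUpdate == null) {
            return true; // a missing heartbeat counts as stale
        }
        return Duration.between(lastUpdate, now).compareTo(STALE_AFTER) > 0;
    }
}
```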
VanillaSync.emergencyFlushAll()
- Synchronous best-effort flush for every online player. No executor, no
locks — the server is dying, we just want data on disk. Writes player_data,
backpacks, SS, RS2 directly; logs SAVE/SKIPPED/FAILED per player via
SyncLogger so post-mortem analysis is possible.
PlayerSync.onServerStarting wires the four new calls after table init.
Fixes two production issues: players remaining online=1 forever after
kill -9, and the 30s poll timeouts spent waiting on zombie server_ids.
Root cause of backpack duplication: Sophisticated Backpacks'
setBackpackContents merges shallowly when the UUID exists, so stale
sub-tags survived every restore. doBackPackRestore now calls
removeBackpackContents before setBackpackContents for a clean replace.
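To illustrate why a shallow merge keeps stale sub-tags, here is a toy model that stands in plain Java maps for the NBT data. BackpackStoreSketch and its methods are hypothetical and only mimic the behavior described above; they are not Sophisticated Backpacks' API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: a map per backpack UUID stands in for the stored NBT compound.
final class BackpackStoreSketch {
    private final Map<String, Map<String, Object>> byUuid = new HashMap<>();

    /** Shallow merge: keys absent from 'incoming' survive from the old contents. */
    void setContents(String uuid, Map<String, ?> incoming) {
        byUuid.computeIfAbsent(uuid, k -> new HashMap<>()).putAll(incoming);
    }

    void removeContents(String uuid) {
        byUuid.remove(uuid);
    }

    /** Clean replace: remove first, then set, so nothing stale is left behind. */
    void restoreClean(String uuid, Map<String, ?> incoming) {
        removeContents(uuid);
        setContents(uuid, incoming);
    }

    Map<String, Object> get(String uuid) {
        return byUuid.get(uuid);
    }
}
```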
Curios cosmetic stacks (getCosmeticStacks) are now snapshotted, applied,
restored and cached on all paths. Old-format rows without the "cos:"
prefix still parse unchanged, so existing DB data is preserved on upgrade.
closeContainer no longer matches by class-name substring (was closing
unrelated mod menus containing "curio"/"accessor"). Only menus whose
slots reference the disconnecting player's inventory/ender-chest are
closed.
Thread-safety: Sophisticated Storage contents are now snapshotted on the
main thread (snapshotSSData + saveSSSnapshots) instead of read from a
background thread racing with world ticks.
Event priority / defensive guards:
- onPlayerDeath is now EventPriority.LOW and skips cancelled events so
Revive Me / Corail Tombstone's cancel runs first.
- onServerStarting short-circuits on integrated (single-player) servers
to avoid noisy MySQL connection attempts.
Observability:
- executeBatchTransaction now returns per-statement row counts.
- writeSnapshotToDB calls SyncLogger.guardBlocked when the core UPDATE
silently no-ops (another server claimed last_server).
- SyncLogger uses a daemon scheduler that flushes every 500 ms; shutdown
happens after parallel saves so final save logs are no longer dropped.
- Rollback failures inside executeBatchTransaction and
refreshInventoryForInputOutput are now logged instead of swallowed.
HikariCP retuned: maxPoolSize 25->15, connectionTimeout 30->10s,
idleTimeout 600->300s, leakDetectionThreshold 10->25s (covers worst-case
join polling without log spam).
New table_prefix config option (Tables helper) lets a user share one
MySQL database with other mods without table-name collisions. Default
is empty to preserve backward compatibility.
The reflective Method objects for NeoForge AttachmentHolder are now
resolved once in a static initializer and cached.
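A sketch of the resolve-once pattern, using String.length as a stand-in target because the AttachmentHolder methods involved are not named here. CachedReflectionSketch is a hypothetical name.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: String.length stands in for the real reflective target.
final class CachedReflectionSketch {
    // Looked up once at class-load time instead of on every call.
    private static final Method LENGTH;

    static {
        try {
            LENGTH = String.class.getMethod("length");
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int lengthOf(String s) {
        try {
            return (Integer) LENGTH.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```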
Chat sync and Cobblemon integration removed:
- Chat sync: 319 LoC of socket/thread code guarded by a config flag that
defaulted to false; orphaned config keys are silently ignored by the
NeoForge ModConfig loader, so no crash on upgrade.
- Cobblemon: 297 LoC of mixins that ran synchronous JDBC on the main
thread and built SQL with raw UUID concatenation. The existing
cobblemon table in the DB is left untouched on upgrade.
Also fixes cobblemon ALTER TABLE running blindly on every boot
(alterColumnIfNeeded helper checks INFORMATION_SCHEMA first).
Author: vyrriox
New SyncLogger utility class:
- Writes to logs/playersync/sync.log (separate from MC console)
- Automatic rotation: 10MB max per file, 5 files kept
- Thread-safe: lock-free ConcurrentLinkedQueue + async flush
- Categorized log levels: INFO, WARN, ERROR, DUPE_RISK, DATA_LOSS,
RACE, PERF_SLOW, SAVE, SAVE_FAIL, SAVE_SKIP, RESTORE, EVENT, GUARD
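The queue-plus-async-flush idea can be sketched like this, assuming a single flusher thread calls drain() periodically and writes the batch to the log file. AsyncLogQueueSketch and the line format are hypothetical. File writing and rotation are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the lock-free logging queue; not the real SyncLogger.
final class AsyncLogQueueSketch {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();

    /** Safe to call from any thread; never blocks on I/O. */
    void log(String level, String message) {
        pending.add("[" + level + "] " + message);
    }

    /** Called by the flusher thread; returns the queued lines in insertion order. */
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        String line;
        while ((line = pending.poll()) != null) {
            batch.add(line);
        }
        return batch;
    }
}
```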
Tracked events:
- Every player join/leave with sync status
- Every save (logout, shutdown, death, auto-save) with duration
- Save failures with error details
- Saves skipped (uncompleted sync, dead player)
- Cross-server race conditions (poll loop waiting)
- Player disconnects before sync apply (potential data loss)
- Duplicate login kicks
- Slow operations (> 50ms threshold)
Usage: check logs/playersync/sync.log on your server for diagnostics.
Look for DUPE_RISK, DATA_LOSS, RACE, SAVE_FAIL entries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CRITICAL PERF - Staggered auto-save:
- Old: all 35 players snapshotted in ONE tick → 770-3605ms MSPT spike
(15-36 second TPS drop every 5 minutes)
- New: queue filled every 5min, drained 1 player/tick → max 22-103ms/tick
- autoSaveQueue processes one player per server tick, imperceptible impact
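A minimal sketch of the fill-then-drain scheme, assuming the queue is only touched from the server thread. StaggeredSaveSketch, refill and onTick are hypothetical names; the field name autoSaveQueue is taken from the description above.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch of staggered auto-save; main-thread use only.
final class StaggeredSaveSketch<P> {
    private final Queue<P> autoSaveQueue = new ArrayDeque<>();

    /** Called once per auto-save interval with the currently online players. */
    void refill(Collection<P> onlinePlayers) {
        for (P p : onlinePlayers) {
            if (!autoSaveQueue.contains(p)) {
                autoSaveQueue.add(p);
            }
        }
    }

    /** Called once per server tick; snapshots at most one player. */
    boolean onTick(Consumer<P> snapshot) {
        P next = autoSaveQueue.poll();
        if (next == null) {
            return false;
        }
        snapshot.accept(next);
        return true;
    }

    int pending() {
        return autoSaveQueue.size();
    }
}
```

With 35 players the queue empties in 35 ticks, so the snapshot cost per tick is that of a single player rather than the whole roster.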
CRITICAL PERF - Pool scaling for 35+ players:
- Thread pool: 2-8 → 4-16 threads, queue 256 → 512
Prevents CallerRunsPolicy from executing DB tasks on main thread
- HikariCP: 10 → 25 max connections, 2 → 4 min idle
Prevents connection starvation during concurrent saves
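The thread-pool half of this change might be configured along these lines. The 60-second keep-alive and the ArrayBlockingQueue are assumptions, and SavePoolSketch is a hypothetical name; only the 4-16 thread range, the 512-slot queue and CallerRunsPolicy come from the description above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the save executor sizing.
final class SavePoolSketch {
    static ThreadPoolExecutor create() {
        return new ThreadPoolExecutor(
                4,                                  // core threads
                16,                                 // maximum threads
                60L, TimeUnit.SECONDS,              // idle keep-alive (assumed value)
                new ArrayBlockingQueue<>(512),      // bounded task queue
                // When queue and threads are both exhausted, the submitting
                // thread runs the task itself, which is what must be avoided
                // on the main thread.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Note that a ThreadPoolExecutor only grows beyond its core size once the queue is full, so the larger queue and the higher core count both push back the point at which CallerRunsPolicy kicks in.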
HIGH PERF - Cached kick check (eliminates main thread DB queries):
- doPlayerConnect (network thread) caches online/lastServer/serverAlive
- onPlayerLoggedInKickCheck (MAIN thread) reuses cached result
- Fast path: 1 DB query on main thread instead of 2-4
- Fallback: full DB check if cache miss (race condition safety)
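A simplified sketch of the cache handoff between the two threads. KickCheckCacheSketch, Result and both method names are hypothetical. The sketch also leaves out the one query that the fast path still performs on the main thread, showing only the cache hit and the fallback.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the cached kick check.
final class KickCheckCacheSketch {
    static final class Result {
        final boolean online;
        final String lastServer;
        final boolean serverAlive;

        Result(boolean online, String lastServer, boolean serverAlive) {
            this.online = online;
            this.lastServer = lastServer;
            this.serverAlive = serverAlive;
        }
    }

    private final ConcurrentHashMap<UUID, Result> cache = new ConcurrentHashMap<>();

    /** Network thread: store what the connect-time DB check found. */
    void onConnect(UUID playerId, Result fromDb) {
        cache.put(playerId, fromDb);
    }

    /** Main thread: consume the cached result once, or fall back to the full check. */
    Result onLoggedIn(UUID playerId, Function<UUID, Result> fullDbCheck) {
        Result cached = cache.remove(playerId);
        return cached != null ? cached : fullDbCheck.apply(playerId);
    }
}
```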
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migrate connection pool from manual LinkedBlockingQueue to HikariCP
(eliminates isValid() ping on every query visible in Spark profiler)
- Move ALL DB writes off server thread: logout uses snapshot+async+latch,
shutdown uses snapshot+CompletableFuture.allOf for parallel saves
- Pre-read curios/accessories/cosmeticarmor/attachments on background
thread during login (4-7 fewer DB queries on main thread per login)
- Auto-save interval increased to 5 minutes
- Fix pool shutdown ordering: shutdownPool() now runs AFTER all shutdown
saves complete (previously could fire before, silently losing all data)
- Fix connection leak in executeQuery/executePreparedQuery when
prepareStatement throws (leaked connections exhaust HikariCP pool)
- Fix duplication bug: saveStorageContents guard used nbt.size()<=1 which
blocked legitimately emptied backpacks from saving to DB
- Fix stale SaveToFile overwriting logout: check playerLocks.containsKey
before writing to prevent stale background task from regressing data
- Remove LIMIT 1000 on startup online=0 reset (could leave players stuck)
- Add executorService.shutdown() on server stop to prevent JVM hang
- Add apply methods (applyCuriosFromData, applyAccessoriesFromData, etc.)
to separate entity writes from DB reads for thread-safe restore
- Add UUID collectors (collectBackpackUuids, collectSSUuids) and
background save methods for snapshot+async logout/shutdown pattern
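The shutdown ordering fix (parallel saves first, pool close last) can be sketched as below. ShutdownSaveSketch is a hypothetical name, the fixed pool of 4 threads is an assumption, and the saves are modeled as Runnables over already-taken snapshots.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the shutdown save ordering.
final class ShutdownSaveSketch {
    /** Runs all saves in parallel, waits for them, then closes the pool. */
    static int saveAllThenClose(List<Runnable> snapshotSaves, Runnable shutdownPool) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (Runnable save : snapshotSaves) {
                futures.add(CompletableFuture.runAsync(save, executor));
            }
            // Block until every save has finished (or one has failed).
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return futures.size();
        } finally {
            executor.shutdown();  // lets the JVM exit instead of hanging on worker threads
            shutdownPool.run();   // the connection pool closes only after the saves
        }
    }
}
```

A production version would likely bound the wait with a timeout; join() is used here only to keep the sketch short.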
Spark showed PlayerSync consuming 10.16% of the server thread, almost
entirely from DriverManager.getConnection() (TCP handshake + MySQL auth
+ USE db) being called for EVERY single query. With auto-save every 60s,
each player generated ~6 new connections per save cycle on the main thread.
FIX: Simple connection pool (LinkedBlockingQueue, 5 connections).
- Connections are reused instead of opened/closed per query
- isValid(2) check before reuse to detect dead connections
- returnConnection() puts connections back in pool instead of closing
- QueryResult.close() also returns to pool
- autoReconnect=true in JDBC URL for resilience
- shutdownPool() for clean server stop
- Non-database connections (startup DDL) bypass the pool
Expected improvement: ~90% reduction in MySQL overhead on server thread.
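A generic sketch of the borrow/validate/return cycle described above. SimplePoolSketch is a hypothetical name, and the type parameter C stands in for java.sql.Connection so the example runs without a database. In real JDBC code the validator would be something like conn.isValid(2).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch of a small reuse pool; C stands in for a JDBC connection.
final class SimplePoolSketch<C> {
    private final LinkedBlockingQueue<C> idle;
    private final Supplier<C> factory;
    private final Predicate<C> isValid;

    SimplePoolSketch(int capacity, Supplier<C> factory, Predicate<C> isValid) {
        this.idle = new LinkedBlockingQueue<>(capacity);
        this.factory = factory;
        this.isValid = isValid;
    }

    /** Reuses an idle connection if one passes validation, else opens a new one. */
    C borrow() {
        C c;
        while ((c = idle.poll()) != null) {
            if (isValid.test(c)) {
                return c;
            }
            // Dead connection: drop it and try the next idle one.
        }
        return factory.get();
    }

    /** Puts the connection back instead of closing it; dropped if the pool is full. */
    void giveBack(C c) {
        idle.offer(c);
    }
}
```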
Vyrriox
CRITICAL-1/2: Remove duplicate online=1 writes from doPlayerJoin.
The synchronous onPlayerLoggedInKickCheck already sets online=1.
The background-thread writes raced with logout's online=0 write, leaving
players permanently marked "online" after a crash-disconnect during join.
HIGH-1: Startup SQL uses PreparedStatement for server_id (was string concat).
HIGH-2: update() method now uses try-with-resources for PreparedStatement.
HIGH-3: NPE guard in RS2 data file logging when getRS2DataFile returns null.
Vyrriox
CRITICAL fixes:
- C-1/C-2/C-4: Auto-save and logout now run on MAIN THREAD. All entity
reads (inventory, curios, effects) were happening off-thread, causing
duplication exploits (player interacts during save → items duplicated).
Auto-save uses tryLock() to skip players already being saved.
- C-5: NPE fix for non-RS2 items (null check on registry key lookup)
- C-6: RS2 .dat file written atomically (temp file + rename) to prevent
corruption of entire RS2 storage on crash mid-write
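The temp-file-plus-rename write from C-6 could be sketched with java.nio as follows. AtomicWriteSketch and the ".tmp" suffix are assumptions. The fallback to a plain replace covers file systems that do not support atomic moves.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of an atomic file replace; not the mod's actual RS2 writer.
final class AtomicWriteSketch {
    static void write(Path target, byte[] data) throws IOException {
        // Write the full content next to the target first.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, data);
        try {
            // Swap it in with a single rename, so a crash leaves the old or the new file.
            Files.move(tmp, target,
                    StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
        } catch (AtomicMoveNotSupportedException e) {
            // Weaker guarantee, but still never a half-written target from our own write.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```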
HIGH fixes:
- H-3: Deadlock prevention: lock released BEFORE latch.await() in
doPlayerJoin. Prevents shutdown deadlock where background thread
holds lock while waiting for main thread, and shutdown holds main
thread while waiting for lock.
- H-5: Curios cache now works WITHOUT keepInventory. Players who die
then disconnect before respawning no longer lose curios data.
- H-8: server_id SQL uses PreparedStatements instead of string concat
MEDIUM fixes:
- M-1: NumberFormatException in LocalJsonUtil caught per-entry instead
of crashing entire map parse (prevents losing all cosmetic armor)
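The per-entry catch from M-1 can be illustrated with a small parser. SlotMapParseSketch and the slot-index key format are assumptions, since LocalJsonUtil's actual data layout is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one bad key skips one entry instead of failing the whole map.
final class SlotMapParseSketch {
    static Map<Integer, String> parse(Map<String, String> raw) {
        Map<Integer, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            try {
                out.put(Integer.parseInt(e.getKey().trim()), e.getValue());
            } catch (NumberFormatException ex) {
                // Caught inside the loop, so the remaining entries still parse.
                System.err.println("skipping malformed slot key: " + e.getKey());
            }
        }
        return out;
    }
}
```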
Vyrriox
- Fix advancements disappearing: use PreparedStatements for all SQL with
user data (advancement JSON contains chars that broke string-concat SQL),
add null safety for advancement file
- Fix multi-server kick: run doPlayerConnect synchronously instead of async
(players could join before the duplicate check completed)
- Fix Curios disappearing: clear slots AFTER validating data exists (not
before), use CuriosCache for dead players on logout instead of empty API
- Fix Sophisticated Storage items: add storeSophisticatedStorageItems() and
restoreSophisticatedStorageItems() to sync packed barrels/shulkers/chests
- Anti-duplication: clear all inventories before restoring from DB on join
- Fix tick counter: remove LevelTickEvent (fired per dimension = 3x too
fast), merge heartbeat into ServerTickEvent
- Fix connection leaks: use try-with-resources for all QueryResult
- Fix logout order: save data BEFORE marking player offline
- Skip auto-save for dead/unsynced players to prevent saving empty data
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>