Adds three documentation files covering the Phase 0-5 hardening work:
CHANGELOG.md
- Bilingual EN/FR, strict template (English first, then ---, then French).
- Version section 2.1.5 dated 2026-04-22 (NO version bump per
CLAUDE.md version-lock rule).
- Sections: Fixed / Added / Changed / Correctifs / Ajouts / Modifications.
ERROR_LOG.md
- Journal of 8 bugs discovered and fixed during the hardening sweep.
- Each entry: Context / Error / Root cause / Fix / Prevention rule.
- Cross-references commits bea5f80 / c84f920 / 746cb56 / c70ca9f / bd0482c.
TEST_PROCEDURE_v2.1.5.html
- Self-contained HTML (no external deps), bilingual EN/FR.
- 10 test scenarios tagged CRITICAL / HIGH / MEDIUM with Setup, Steps,
Expected Results, and a regression-check block.
- Covers: drop + disconnect + reconnect, backpack duplication, Sophisticated
Storage (SS) shulker duplication, kill -9 recovery, zombie-peer short-circuit,
periodic save, pool stats, heartbeat, Curios capability unavailable, cross-server claim.
Changelog
All notable changes to PlayerSync are documented here.
[2.1.5] - 2026-04-22
Fixed (English first)
- Critical item duplication on drop + quick disconnect + reconnect: A race condition between the auto-save background task and the logout background task could commit a stale snapshot AFTER the logout save, resurrecting dropped items. A triple guard is now applied: a `pendingLogoutSaves` check (early and again under the lock) and a `SELECT online FROM player_data` skip if the logout has already committed. The logout background task now acquires `bgLock` with a blocking `.lock()` for proper serialization.
- Backpack / Sophisticated Storage merge-on-restore duplication: The upstream `setBackpackContents` / `setStorageContents` are shallow merges, not replaces. Restore now calls `removeBackpackContents` / `removeStorageContents` (with a reflection fallback if absent) AND passes a defensive NBT copy. Fixes mass duplication of items in backpacks/shulkers on every cross-server transfer.
- Cross-server save overwrite: When the `last_server` guard in `writeSnapshotToDB` blocked the core player_data UPDATE, the downstream backpack/SS/RS2 saves still executed and overwrote the claiming server's data. The function now returns a boolean; all 5 callers short-circuit downstream writes when the guard blocks.
- 30-second join delay on zombie peer servers: The `doPlayerJoin` poll waited the full 60 attempts (30s) for server_ids that no longer existed (legacy `server_id=0` rows, or peers that crashed without resetting `online` to 0). The new `isPeerServerStale` check (peer_id=0 OR heartbeat >60s) takes over immediately and force-clears the orphaned flag. Poll max raised from 60 to 120 attempts (60s) for legitimate slow shutdowns.
- Curios wipe on dead player: Legacy `StoreCurios` wrote an empty flatMap when the Curios capability was unavailable, wiping DB data. It now returns early with a WARN log.
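
The stale-peer rule from the zombie-peer fix (peer_id=0 OR heartbeat >60s) can be sketched as a pure function. This is a minimal illustration, not the mod's actual code: the class name, the parameter types, and the choice to treat a missing heartbeat as stale are assumptions.

```java
import java.time.Duration;
import java.time.Instant;

final class StalePeerSketch {
    // Threshold taken from the changelog entry: a heartbeat older than 60 s is stale.
    static final Duration STALE_AFTER = Duration.ofSeconds(60);

    /**
     * Hypothetical stand-in for isPeerServerStale.
     * lastHeartbeat may be null when the peer has no heartbeat recorded.
     */
    static boolean isPeerServerStale(int peerId, Instant lastHeartbeat, Instant now) {
        if (peerId == 0) {
            return true; // legacy server_id=0 rows never map to a live peer
        }
        if (lastHeartbeat == null) {
            return true; // assumption: a missing heartbeat counts as stale
        }
        // Strictly older than the threshold, matching "heartbeat >60s".
        return Duration.between(lastHeartbeat, now).compareTo(STALE_AFTER) > 0;
    }
}
```

A join poll built on this check would take over the player immediately when it returns true, instead of waiting out the full poll budget.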
Added
- JVM shutdown hook (kill -9 / OOM / SIGTERM recovery): New `CrashRecovery.installShutdownHook` registers a non-daemon hook that calls `VanillaSync.emergencyFlushAll` synchronously to snapshot and write every online player before process exit. Marks `server_info.enable=0` so peers detect the shutdown.
- Startup orphan-flag recovery: `CrashRecovery.clearOrphanedOnlineFlags` runs at `onServerStarting` to clear any `player_data.online=1` rows left by a previous ungraceful exit. Logs the count via `SyncLogger`.
- Zombie-peer reporter: `CrashRecovery.reportZombiePeers` logs peer `server_id`s whose heartbeat is stale or missing at boot time.
- Server heartbeat service: `HeartbeatService` pings `server_info.last_update` every 10 seconds so peer servers can distinguish live from dead peers via the new `isPeerServerStale` check.
- Periodic full-save scheduler: `PeriodicSaveService` triggers a complete save (player data + backpacks + SS + RS2) for every online synced player every `auto_save_interval_minutes` (new config, default 10, range 0-1440). Independent of NeoForge's vanilla `PlayerEvent.SaveToFile` cadence.
- Dimension-change save trigger: New `onPlayerChangeDimension` handler, gated by the `save_on_dimension_change` config (default false). Protects against mid-teleport crashes.
- Executor + HikariCP pool stats reporter: `PoolStatsReporter` logs `[POOL] executor active/queue/idle, hikari active/idle` every 5 minutes. WARN thresholds trigger when queue >400/512 or Hikari active >=14/15.
- Structured logging events: `SyncLogger` gained `containerForceClosed`, `modCompatSkip`, `modCompatSaved`, `modCompatRestored`, `storageSave`, `poolStats`, `warnPlayer`, `nbtAnomaly` for finer-grained diagnostics.
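
A shutdown hook of the kind described for `CrashRecovery.installShutdownHook` might be registered roughly as follows. This is a sketch under assumptions: the class name, the thread name, and the `Runnable` parameter (standing in for `VanillaSync.emergencyFlushAll`) are hypothetical. `Runtime.addShutdownHook` is the standard JVM API. The JVM runs hooks on orderly termination such as normal exit, `System.exit`, or SIGTERM. A process killed outright with SIGKILL does not run them, which is presumably the case the startup orphan-flag recovery covers.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownHookSketch {
    private static final AtomicBoolean INSTALLED = new AtomicBoolean(false);

    /**
     * Registers the hook once and returns it, or returns null if one is already installed.
     * emergencyFlush is a hypothetical stand-in for VanillaSync.emergencyFlushAll.
     */
    static Thread installShutdownHook(Runnable emergencyFlush) {
        if (!INSTALLED.compareAndSet(false, true)) {
            return null; // avoid registering the flush twice
        }
        Thread hook = new Thread(() -> {
            try {
                emergencyFlush.run(); // runs synchronously on the hook thread
            } catch (RuntimeException e) {
                // Logging frameworks may already be shut down at this point.
                System.err.println("emergency flush failed: " + e);
            }
        }, "playersync-shutdown-hook"); // hypothetical thread name
        hook.setDaemon(false);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Keeping the flush synchronous inside the hook matters because the JVM halts once all hooks return, so work handed off to another executor may never complete.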
Changed
- `writeSnapshotToDB` signature: Now returns `boolean` instead of `void`. `true` means the core UPDATE persisted, `false` means the `last_server` guard blocked it. All callers MUST check the return value before firing downstream backpack/SS/RS2 writes.
- Default `auto_save_interval_minutes`: 10 min (new config key). Balances the data-loss window on a crash against DB load. Set to 0 to disable.
- Backpack / SS restore: Now uses a two-step clear (public API + reflection fallback) and a defensive NBT copy before the upstream setter. Full log line per restore with `cleared_via=api|reflection` and `nbt_keys=N`.
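
The caller contract for the new `writeSnapshotToDB` return value can be illustrated with a small sketch. The class, the `saveAll` method, the parameters, and the guard condition (comparing the stored `last_server` with this server's id) are assumptions made for the example, and the list of strings stands in for real database writes.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotGuardSketch {
    /** Records which writes ran, so the control flow is observable. */
    final List<String> writes = new ArrayList<>();

    /**
     * Hypothetical stand-in for writeSnapshotToDB: returns true when the core
     * UPDATE persisted, false when the last_server guard blocked it.
     */
    boolean writeSnapshotToDB(String player, int thisServer, int lastServerInDb) {
        if (lastServerInDb != thisServer) {
            return false; // assumed guard: another server has claimed the player
        }
        writes.add("player_data:" + player);
        return true;
    }

    /** Caller pattern: downstream saves run only after the core write succeeded. */
    void saveAll(String player, int thisServer, int lastServerInDb) {
        if (!writeSnapshotToDB(player, thisServer, lastServerInDb)) {
            return; // guard blocked: skip backpack / SS / RS2 writes as well
        }
        writes.add("backpacks:" + player);
        writes.add("sophisticated_storage:" + player);
        writes.add("rs2:" + player);
    }
}
```

Returning a boolean keeps the guard decision in one place, so a caller cannot overwrite the claiming server's container data after the core write was refused.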
Correctifs (French mirror)
- Duplication d'items critique lors d'un drop + déconnexion rapide + reconnexion : Une race condition entre la task auto-save background et la task logout background pouvait commiter un snapshot périmé APRÈS le save logout, ressuscitant les items droppés. Triple garde maintenant appliquée : check `pendingLogoutSaves` (early + sous lock) et skip via `SELECT online FROM player_data` si le logout a déjà commité. La task logout BG acquiert maintenant `bgLock` en blocking `.lock()` pour sérialiser proprement.
- Duplication Backpack / Sophisticated Storage par merge au restore : `setBackpackContents` / `setStorageContents` en amont sont des merges shallow, pas des replaces. Le restore appelle maintenant `removeBackpackContents` / `removeStorageContents` (avec fallback reflection si absent) ET passe une copie défensive du NBT. Corrige la duplication massive d'items dans les backpacks/shulkers à chaque transfert cross-server.
- Écrasement cross-server des saves : Quand le guard `last_server` de `writeSnapshotToDB` bloquait l'UPDATE core player_data, les saves downstream backpack/SS/RS2 s'exécutaient quand même et écrasaient les données du serveur ayant claim. La fonction retourne maintenant un boolean ; les 5 callers court-circuitent les writes downstream en cas de guard block.
- Délai de 30 secondes à la connexion sur serveurs zombies : Le poll `doPlayerJoin` attendait les 60 tentatives (30s) pour des `server_id` n'existant plus (lignes legacy `server_id=0`, ou peers ayant crashé sans remettre `online` à 0). Le nouveau check `isPeerServerStale` (peer_id=0 OU heartbeat >60s) prend la main immédiatement et force-clear le flag orphelin. Poll max passé de 60 à 120 tentatives (60s) pour couvrir les shutdowns lents légitimes.
- Wipe Curios sur joueur mort : La méthode legacy `StoreCurios` écrivait un flatMap vide quand la capability Curios était absente, wipant les données DB. Elle early-return maintenant avec un log WARN.
Ajouts (French mirror)
- Hook JVM shutdown (kill -9 / OOM / SIGTERM recovery) : Nouveau `CrashRecovery.installShutdownHook` enregistre un hook non-daemon qui appelle `VanillaSync.emergencyFlushAll` de manière synchrone pour snapshot et écrire chaque joueur online avant la fin du process. Marque `server_info.enable=0` pour que les peers détectent le shutdown.
- Recovery des flags orphelins au boot : `CrashRecovery.clearOrphanedOnlineFlags` tourne au `onServerStarting` pour clear les rows `player_data.online=1` laissées par une sortie non gracieuse précédente. Log le compte via `SyncLogger`.
- Reporter de peers zombies : `CrashRecovery.reportZombiePeers` log les `server_id` peers dont le heartbeat est stale ou absent au boot.
- Service heartbeat : `HeartbeatService` ping `server_info.last_update` toutes les 10 secondes pour que les peers distinguent live vs dead via le nouveau check `isPeerServerStale`.
- Scheduler de sauvegarde périodique : `PeriodicSaveService` déclenche une save complète (player data + backpacks + SS + RS2) pour chaque joueur online synced toutes les `auto_save_interval_minutes` (nouvelle config, défaut 10, plage 0-1440). Indépendant de la cadence vanilla `PlayerEvent.SaveToFile` de NeoForge.
- Trigger save sur changement de dimension : Nouveau handler `onPlayerChangeDimension`, gated par la config `save_on_dimension_change` (défaut false). Protège contre les crashes en plein téléport.
- Reporter stats executor + HikariCP : `PoolStatsReporter` log `[POOL] executor active/queue/idle, hikari active/idle` toutes les 5 min. Seuils WARN quand queue >400/512 ou Hikari active >=14/15.
- Événements structurés : `SyncLogger` a gagné `containerForceClosed`, `modCompatSkip`, `modCompatSaved`, `modCompatRestored`, `storageSave`, `poolStats`, `warnPlayer`, `nbtAnomaly` pour un diagnostic plus fin.
Modifications
- Signature `writeSnapshotToDB` : Retourne maintenant `boolean` au lieu de `void`. `true` = l'UPDATE core a persisté, `false` = le guard `last_server` a bloqué. Tous les callers DOIVENT vérifier le retour avant de déclencher les writes downstream backpack/SS/RS2.
- Défaut `auto_save_interval_minutes` : 10 min (nouvelle clé config). Trade-off entre fenêtre de perte de données sur crash et charge DB. 0 pour désactiver.
- Restore Backpack / SS : Utilise maintenant un clear en deux étapes (API publique + fallback reflection) et une copie défensive NBT avant le setter upstream. Log complet par restore avec `cleared_via=api|reflection` et `nbt_keys=N`.
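
The two-step restore (clear, then set from a defensive copy) can be sketched with plain maps. Everything here is illustrative: `Container` is a hypothetical stand-in for an upstream backpack or storage object whose setter merges instead of replacing, and a shallow `HashMap` copy stands in for the NBT copy.

```java
import java.util.HashMap;
import java.util.Map;

final class RestoreSketch {
    /** Hypothetical container whose setter merges, like the upstream setters described above. */
    static final class Container {
        private final Map<String, Object> contents = new HashMap<>();

        void setContents(Map<String, Object> incoming) {
            contents.putAll(incoming); // shallow merge: existing keys survive
        }

        void removeContents() {
            contents.clear();
        }

        Map<String, Object> view() {
            return contents;
        }
    }

    /** Restore pattern: clear first, then set from a copy of the snapshot. */
    static void restore(Container target, Map<String, Object> snapshot) {
        target.removeContents(); // step 1: drop stale entries the merge would keep
        // Step 2: the copy guards against setters that keep or mutate their argument.
        target.setContents(new HashMap<>(snapshot));
    }
}
```

Without the clear step, a slot present on the target but absent from the snapshot survives the merge, which is the duplication path the changelog describes.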