Phase 13.1: revert to safe default — never force-claim on alive peer

User report: Phase 13's 15s force-claim default reopened a rare duplication
scenario. If the peer's async save is slow (DB under load, big batch) and
commits AFTER we force-claim at 15s, our side has already read the player's
data without the peer's pre-logout change (item drop, deposit), while the
ItemEntity that change spawned is already in the peer's world. The player
can then re-interact with the peer's world and pick up the duplicate.

Fix: raise the join_peer_alive_max_wait_seconds default from 15 to 600 (and
the allowed maximum from 600 to 3600). The new default is longer than the
natural 60s poll loop, so the force-claim timer can never fire first. Net
effect: never force-claim on an alive peer. We wait the full poll for
online=0, which only appears after the peer's atomic data+online=0 UPDATE
commits. Zero duplication window.

Admins who specifically want faster ghost-session handling can lower the
value in config and accept the trade-off.

Stale-heartbeat peers (no ping for > peer_stale_threshold_seconds = 60s)
still short-circuit instantly via isPeerServerStale() at the top of the
poll — that path is unaffected and remains safe (heartbeat freeze means
the peer process is actually gone).
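
For illustration only, the decision order described above can be sketched
as a pure function. This is not the mod's actual code: JoinClaimSketch,
Decision, decide() and canForceClaimAlivePeer() are hypothetical names, and
the 60s poll length and 500ms interval are taken from this message and the
config diff. What the join does once the poll runs out with online=1 still
set is not covered by this commit, so the sketch only labels that case.

```java
public class JoinClaimSketch {
    public enum Decision {
        CLAIM_CLEAN,            // peer wrote online=0: its data is committed
        FORCE_CLAIM_STALE_PEER, // heartbeat frozen: peer process is gone
        FORCE_CLAIM_ALIVE_PEER, // max wait exceeded on a live peer (risky path)
        KEEP_WAITING,           // sleep one poll interval and re-check
        POLL_EXHAUSTED          // poll ended with online=1 still set
    }

    // One poll iteration. Stale check first, mirroring the short-circuit
    // at the top of the poll; the alive-peer timer is checked last.
    public static Decision decide(boolean peerStale, boolean peerShowsOnline,
                                  long elapsedMs, long maxWaitSeconds,
                                  long pollTimeoutMs) {
        if (peerStale) return Decision.FORCE_CLAIM_STALE_PEER;
        if (!peerShowsOnline) return Decision.CLAIM_CLEAN;
        if (elapsedMs >= maxWaitSeconds * 1000L) return Decision.FORCE_CLAIM_ALIVE_PEER;
        if (elapsedMs >= pollTimeoutMs) return Decision.POLL_EXHAUSTED;
        return Decision.KEEP_WAITING;
    }

    // True if the alive-peer force-claim can fire at any tick of the poll
    // while the peer stays alive and keeps online=1.
    public static boolean canForceClaimAlivePeer(long maxWaitSeconds,
                                                 long pollTimeoutMs,
                                                 long pollIntervalMs) {
        for (long t = 0; t < pollTimeoutMs; t += pollIntervalMs) {
            Decision d = decide(false, true, t, maxWaitSeconds, pollTimeoutMs);
            if (d == Decision.FORCE_CLAIM_ALIVE_PEER) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Phase 13 default: 15s is inside the 60s poll, so the timer fires.
        System.out.println(canForceClaimAlivePeer(15, 60_000, 500));  // true
        // Phase 13.1 default: 600s is past the end of the poll.
        System.out.println(canForceClaimAlivePeer(600, 60_000, 500)); // false
    }
}
```

Under these assumptions, any value at or above the poll length behaves the
same as 600, and 0 force-claims on the first iteration.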

The RS2 batching from Phase 13 remains (an unrelated, pure perf change).
Logout now collapses N sequential REPLACE INTO calls into one batched
transaction, dropping from rs2=500ms to rs2=~50ms in [perf-logout] breakdowns.
laforetbrut 2026-04-22 09:10:28 +02:00
parent fa7033fdea
commit ed9fdcda79

@@ -176,14 +176,20 @@ public class JdbcConfig {
 "Wait interval between last_server poll attempts (milliseconds).")
 .defineInRange("join_poll_interval_ms", 500, 100, 5000);
 JOIN_PEER_ALIVE_MAX_WAIT_SECONDS = B.comment(
-"When the previous server is ALIVE (heartbeat fresh) but the player row still",
-"shows online=1 on it, how long to wait before force-claiming ownership on this",
-"server. Ghost sessions (network drop, proxy bypass, stuck flag) otherwise hold",
-"the join hostage up to 60s. Real logout saves consistently complete in <1s in",
-"production, so any wait > ~15s means the peer isn't going to flush — force-",
-"claim is safe because peer's future saves get blocked by the last_server guard.",
-"Default 15s. Set to 0 to force-claim immediately; set high to be more patient.")
-.defineInRange("join_peer_alive_max_wait_seconds", 15, 0, 600);
+"How long to wait before force-claiming ownership when the previous server is",
+"ALIVE (heartbeat fresh) but the player row still shows online=1. A force-claim",
+"reads whatever is currently in the DB — if the peer's async save is still",
+"in flight and commits AFTER we claim, any state change the peer recorded (item",
+"pickup, drop, deposit) is lost on our side and may look like duplication against",
+"an ItemEntity the peer had spawned. Real saves complete in <1s, but a slow DB",
+"or heavy batch can push this to many seconds.",
+"",
+"Default 600s = wait the full poll — never force-claim on an alive peer. SAFE.",
+"Lower to 30/15s if you accept the edge-case risk in exchange for faster handling",
+"of ghost sessions (player dropped off A's network without clean logout).",
+"Set to 0 to force-claim immediately (very aggressive, highest risk).",
+"Stale-heartbeat peers are always force-claimed instantly regardless of this value.")
+.defineInRange("join_peer_alive_max_wait_seconds", 600, 0, 3600);
 POOL_STATS_INTERVAL_MINUTES = B.comment(
 "How often PoolStatsReporter logs executor + Hikari stats. 0 to disable.")
 .defineInRange("pool_stats_interval_minutes", 5, 0, 1440);