fix: persist mount metadata across master switch by jlon · Pull Request #650 · CurvineIO/curvine

jlon · 2026-02-10T14:00:34Z

Summary

- Flush mount/unmount journal entries before returning success, to avoid losing mount metadata during a leader switch.
- Fix the mount update-mode existence check to use the Curvine mount-path index.
- Expose the standby master fs in the mini-cluster test helper for failover assertions.
- Add mount regression and coverage tests in curvine-tests (failover, mount manager, mount table).
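
The first item (flush before acknowledging) can be illustrated with a minimal sketch. This is not the code from this PR: `Journal`, `append`, `flush`, and `mount` are hypothetical names, and a `Vec<u8>` stands in for the durable journal sink. The point is only the ordering: the caller sees `Ok` after the entry has reached the sink, never before.

```rust
use std::io::{self, Write};

/// Hypothetical journal (not Curvine's real type): entries are buffered in
/// memory until `flush` pushes them to the underlying writer.
struct Journal<W: Write> {
    pending: Vec<String>,
    sink: W,
}

impl<W: Write> Journal<W> {
    fn new(sink: W) -> Self {
        Journal { pending: Vec::new(), sink }
    }

    /// Queue an entry; nothing reaches the sink yet.
    fn append(&mut self, entry: &str) {
        self.pending.push(entry.to_string());
    }

    /// Write every queued entry to the sink, flush it, then clear the queue.
    /// On error the queue is left intact so the entries are not lost.
    fn flush(&mut self) -> io::Result<()> {
        for entry in &self.pending {
            writeln!(self.sink, "{}", entry)?;
        }
        self.sink.flush()?;
        self.pending.clear();
        Ok(())
    }
}

/// Hypothetical mount handler: success is reported only after the flush
/// succeeds, so a leader switch right after `Ok` cannot drop the entry.
fn mount<W: Write>(journal: &mut Journal<W>, cv_path: &str, ufs_path: &str) -> io::Result<()> {
    journal.append(&format!("MOUNT {} {}", cv_path, ufs_path));
    journal.flush()?;
    Ok(())
}

fn main() -> io::Result<()> {
    let mut journal = Journal::new(Vec::new());
    mount(&mut journal, "/mnt/s3", "s3://bucket/data")?;
    // Nothing is left pending once mount() has returned Ok.
    assert!(journal.pending.is_empty());
    print!("{}", String::from_utf8_lossy(&journal.sink));
    Ok(())
}
```

If `flush` fails, `mount` returns the error, so the client is never told that a mount succeeded when it might not survive failover.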

Verification

cargo clippy --all-targets --jobs 2 -- --deny=warnings --allow clippy::uninlined-format-args
cargo test -p curvine-tests --test mount_failover_test --test mount_manager_test --test mount_table_test -- --nocapture

jlon · 2026-02-11T02:05:50Z

Additional note for this PR update (commit: d340099):

Why this change

After fixing the mount-loss issue, we also need clear diagnostics for MountTable::restore() so that restore-time failures are visible instead of silent.

What was added

File: curvine-server/src/master/mount/mount_table.rs

Failed to load mount table from metadata store:
- mount restore failed: unable to load mount table from metadata store, err=...
Restore start with total entries:
- mount restore started: <N> entries loaded from metadata store
Empty-table case:
- mount restore completed: no entries found
Per-entry success:
- mount restore entry succeeded: mount_id=..., cv_path=..., ufs_path=...
Per-entry failure:
- mount restore entry failed: mount_id=..., cv_path=..., ufs_path=..., err=...
Final summary:
- all success: mount restore completed successfully: restored=..., failed=0
- partial failure: mount restore completed with errors: restored=..., failed=...

Behavioral scope

This commit is observability-only for the restore path.
It introduces no fail-fast behavior change.
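
As a rough illustration of the logging flow described above, here is a self-contained sketch. It is an assumption-laden stand-in, not the contents of mount_table.rs: `MountEntry`, `RestoreReport`, and the `restore_one` callback are hypothetical, and log lines are collected into a `Vec<String>` instead of going through `info!`/`error!` so the example needs no external crate. A failed entry is counted and logged, and the loop continues (no fail-fast).

```rust
/// Hypothetical mount entry; field names follow the log lines above.
struct MountEntry {
    mount_id: u32,
    cv_path: String,
    ufs_path: String,
}

/// Counters reported in the final summary line.
#[derive(Debug, PartialEq)]
struct RestoreReport {
    restored: usize,
    failed: usize,
}

/// Restore every entry, record one log line per outcome, and never stop early.
/// `restore_one` stands in for the real per-entry restore logic.
fn restore<F>(mounts: &[MountEntry], mut restore_one: F, log: &mut Vec<String>) -> RestoreReport
where
    F: FnMut(&MountEntry) -> Result<(), String>,
{
    let mut report = RestoreReport { restored: 0, failed: 0 };
    if mounts.is_empty() {
        log.push("mount restore completed: no entries found".to_string());
        return report;
    }
    log.push(format!(
        "mount restore started: {} entries loaded from metadata store",
        mounts.len()
    ));
    for m in mounts {
        match restore_one(m) {
            Ok(()) => {
                report.restored += 1;
                log.push(format!(
                    "mount restore entry succeeded: mount_id={}, cv_path={}, ufs_path={}",
                    m.mount_id, m.cv_path, m.ufs_path
                ));
            }
            Err(e) => {
                report.failed += 1;
                log.push(format!(
                    "mount restore entry failed: mount_id={}, cv_path={}, ufs_path={}, err={}",
                    m.mount_id, m.cv_path, m.ufs_path, e
                ));
            }
        }
    }
    if report.failed == 0 {
        log.push(format!(
            "mount restore completed successfully: restored={}, failed=0",
            report.restored
        ));
    } else {
        log.push(format!(
            "mount restore completed with errors: restored={}, failed={}",
            report.restored, report.failed
        ));
    }
    report
}

fn main() {
    let mounts = vec![
        MountEntry { mount_id: 1, cv_path: "/a".into(), ufs_path: "s3://bucket/a".into() },
        MountEntry { mount_id: 2, cv_path: "/b".into(), ufs_path: "s3://bucket/b".into() },
    ];
    let mut log = Vec::new();
    // Simulate a failure for mount_id 2 only.
    let report = restore(
        &mounts,
        |m| if m.mount_id == 2 { Err("ufs unreachable".to_string()) } else { Ok(()) },
        &mut log,
    );
    for line in &log {
        println!("{}", line);
    }
    println!("{:?}", report);
}
```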

Validation

mount_table_test
mount_manager_test
mount_failover_test
All passed.

szbr9486 · 2026-02-11T09:29:13Z

curvine-server/src/master/mount/mount_table.rs

+        if total == 0 {
+            info!("mount restore completed: no entries found");
+            return;
+        }


        else {
            info!(
                "mount restore started: {} entries loaded from metadata store",
                total
            );
        }

Keeping this outside else is intentional. We want a clear restore lifecycle log even when metadata load succeeds with zero entries, so operators can distinguish "restore executed and found 0 mounts" from "restore never ran".

Updated in 0b5a30e. The start log now runs only when total > 0, so the empty-table path emits only a single completion line.

szbr9486 · 2026-02-11T09:39:44Z

curvine-server/src/master/mount/mount_table.rs

+        };
+
+        let total = mounts.len();
+        info!(


total is kept to support restore observability and summary correlation. It is used to report how many entries were loaded from metadata before per-entry restore, which helps diagnose partial-restore cases.

Updated in 0b5a30e. Removed redundant temp vars in the restore loop and simplified logging flow.

szbr9486 · 2026-02-11T09:43:24Z

curvine-server/src/master/mount/mount_table.rs

+        for mnt in mounts {
+            let mount_id = mnt.mount_id;
+            let cv_path = mnt.cv_path.clone();
+            let ufs_path = mnt.ufs_path.clone();


The three lines above are used only for log output, so they are unnecessary and can be deleted.

I am keeping these logs on purpose. They provide low-cost but important startup visibility for master failover: load success/failure, empty-table case, and restore progress are distinct states in production troubleshooting.

Updated in 0b5a30e. Empty restore now logs once (no entries found) and avoids duplicate-looking startup lines.
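
The thread does not show how 0b5a30e removed the three temporaries, so the following is only one possible approach, sketched with hypothetical names (`MountEntry`, `add_mount`, `restore_line`). If the loop body moves `mnt` into the table, the fields have to be captured before the move in order to log them afterwards. A `Display` impl lets a single formatted context string replace the three per-field variables.

```rust
use std::fmt;

/// Hypothetical mount entry; field names follow the diff above.
struct MountEntry {
    mount_id: u32,
    cv_path: String,
    ufs_path: String,
}

impl fmt::Display for MountEntry {
    /// Render the fields in the same key=value form as the restore logs.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "mount_id={}, cv_path={}, ufs_path={}",
            self.mount_id, self.cv_path, self.ufs_path
        )
    }
}

/// Hypothetical consumer that takes ownership of the entry and rejects a
/// cv_path that is already present.
fn add_mount(table: &mut Vec<MountEntry>, mnt: MountEntry) -> Result<(), String> {
    if table.iter().any(|m| m.cv_path == mnt.cv_path) {
        return Err(format!("{} already mounted", mnt.cv_path));
    }
    table.push(mnt);
    Ok(())
}

/// Restore one entry and return the log line describing the outcome.
fn restore_line(table: &mut Vec<MountEntry>, mnt: MountEntry) -> String {
    // Capture the log context once, before `mnt` is moved into add_mount.
    let ctx = mnt.to_string();
    match add_mount(table, mnt) {
        Ok(()) => format!("mount restore entry succeeded: {}", ctx),
        Err(e) => format!("mount restore entry failed: {}, err={}", ctx, e),
    }
}

fn main() {
    let mut table = Vec::new();
    let first = MountEntry { mount_id: 1, cv_path: "/a".into(), ufs_path: "s3://b/a".into() };
    let dup = MountEntry { mount_id: 2, cv_path: "/a".into(), ufs_path: "s3://b/x".into() };
    println!("{}", restore_line(&mut table, first));
    println!("{}", restore_line(&mut table, dup));
}
```

The trade-off is one string allocation per entry even when logging is disabled. If the restore call can take the entry by reference instead, no capture is needed at all.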


szbr9486 reviewed Feb 11, 2026


80347547 added 3 commits February 11, 2026 18:17

fix: persist mount journal before success and strengthen mount tests

268d760

fix(mount): add explicit restore diagnostics and per-entry error reporting

25b52bb

refactor(mount): simplify restore logging flow

3fba9dd

jlon force-pushed the fix/mount-master-switch-loss branch from 0b5a30e to 3fba9dd on February 11, 2026 10:17

jlon closed this Feb 12, 2026

jlon wants to merge 3 commits into CurvineIO:main from jlon:fix/mount-master-switch-loss
