MES-713: Parallelize Fuse Interface#35

Open
markovejnovic wants to merge 3 commits into main from parallel-fuse

Conversation

@markovejnovic
Collaborator

No description provided.

@mesa-dot-dev

mesa-dot-dev bot commented Feb 11, 2026

Mesa Description

This pull request refactors the FUSE filesystem implementation to process operations in parallel, significantly improving performance and throughput.

Previously, the FUSE interface handled requests synchronously, creating a bottleneck that limited processing to one operation at a time. This PR addresses that limitation by making the following architectural changes:

  • Asynchronous FUSE Handling: The FuserAdapter now spawns a dedicated Tokio task for each incoming FUSE request, allowing multiple operations to be handled concurrently instead of sequentially.
  • Thread-Safe Data Structures: To support concurrent access safely, the underlying filesystem components (MesaFS, CompositeFs, OrgFs, RepoFs) were updated. Standard HashMap collections were replaced with the scc::HashMap concurrent map, and other shared data is now protected by parking_lot::RwLock.
  • Immutable Trait Methods: The core Fs trait has been updated so that methods like lookup, open, and read now operate on a shared &self reference instead of a mutable &mut self one. This change reflects the new thread-safe design and allows the compiler to enforce that these methods can be called concurrently from multiple tasks.
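The shape of that change can be sketched roughly as follows. This is an illustration only, not the PR's code: `MemFs`, `add_file`, and `serve_concurrently` are hypothetical names, the `Fs` trait here is trimmed to two methods with simplified signatures, and `std::thread` plus `std::sync::RwLock` stand in for Tokio tasks, `scc::HashMap`, and `parking_lot::RwLock` so the example builds with the standard library alone.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical, trimmed-down stand-in for the PR's `Fs` trait.
/// Methods take `&self`, so one instance can serve many requests at once.
pub trait Fs: Send + Sync {
    fn lookup(&self, name: &str) -> Option<u64>;
    fn read(&self, ino: u64) -> Option<Vec<u8>>;
}

/// Toy in-memory filesystem; interior mutability replaces `&mut self`.
pub struct MemFs {
    names: RwLock<HashMap<String, u64>>,
    data: RwLock<HashMap<u64, Vec<u8>>>,
}

impl MemFs {
    pub fn new() -> Self {
        MemFs {
            names: RwLock::new(HashMap::new()),
            data: RwLock::new(HashMap::new()),
        }
    }

    /// Registers a file; note that this also only needs `&self`.
    pub fn add_file(&self, name: &str, ino: u64, bytes: &[u8]) {
        self.names.write().unwrap().insert(name.to_string(), ino);
        self.data.write().unwrap().insert(ino, bytes.to_vec());
    }
}

impl Fs for MemFs {
    fn lookup(&self, name: &str) -> Option<u64> {
        self.names.read().unwrap().get(name).copied()
    }

    fn read(&self, ino: u64) -> Option<Vec<u8>> {
        self.data.read().unwrap().get(&ino).cloned()
    }
}

/// Serves every request on its own worker thread, mirroring the
/// "one task per request" idea. Results come back in request order.
pub fn serve_concurrently(fs: Arc<dyn Fs>, requests: Vec<String>) -> Vec<Option<Vec<u8>>> {
    let handles: Vec<_> = requests
        .into_iter()
        .map(|name| {
            let fs = Arc::clone(&fs);
            thread::spawn(move || fs.lookup(&name).and_then(|ino| fs.read(ino)))
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

Because every method borrows `&self`, the compiler accepts sharing one `Arc<dyn Fs>` across workers; with `&mut self` methods the same code would need an exclusive lock around the whole filesystem.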

Description generated by Mesa.

@markovejnovic
Collaborator Author

/review

@mesa-dot-dev mesa-dot-dev bot left a comment

Performed full review of e2d3738...c5e6915

Analysis

  1. Fire-and-forget task pattern in FuserAdapter::spawn lacks task lifecycle management, risking orphaned operations and resource leaks during unmount since FUSE reply handles can be dropped without responding.

  2. Global serialization bottleneck in MescloudICache::ensure_child_ino with a single mutex that serializes ALL child inode allocations, potentially limiting parallel directory operations.

  3. Inconsistent synchronization patterns across inode allocation paths - while ensure_child_ino is protected, similar TOCTOU patterns exist unfixed in MesaFS::ensure_org_inode, OrgFs::ensure_owner_inode, and OrgFs::ensure_repo_inode.

  4. Silent error handling with multiple insert_async calls discarding results (let _ = ...), masking potential race condition bugs since scc::HashMap returns errors on duplicate keys.

  5. Non-atomic bridge reset in HashMapBridge::reset() creates a window for inconsistent state observation as it clears two maps with separate write locks.
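For point 5, one possible way to close the window is to keep both directions of the mapping behind a single lock, so a reset clears them in one critical section. The sketch below is an assumption about how that could look, not the actual `HashMapBridge`: `Bridge`, `Maps`, `forward`, `backward`, and `is_consistent` are hypothetical names, and `std::sync::RwLock` is used in place of `parking_lot::RwLock`.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Both directions of the mapping live behind ONE lock, so no reader can
/// observe one map cleared while the other still holds entries.
#[derive(Default)]
struct Maps {
    outer_to_inner: HashMap<u64, u64>,
    inner_to_outer: HashMap<u64, u64>,
}

/// Hypothetical stand-in for a two-way inode bridge.
#[derive(Default)]
pub struct Bridge {
    maps: RwLock<Maps>,
}

impl Bridge {
    /// Inserts a one-to-one mapping, dropping any stale counterpart entries.
    pub fn insert(&self, outer: u64, inner: u64) {
        let mut m = self.maps.write().unwrap();
        if let Some(old_inner) = m.outer_to_inner.insert(outer, inner) {
            m.inner_to_outer.remove(&old_inner);
        }
        if let Some(old_outer) = m.inner_to_outer.insert(inner, outer) {
            if old_outer != outer {
                m.outer_to_inner.remove(&old_outer);
            }
        }
    }

    /// Clears both maps inside a single write-lock critical section.
    pub fn reset(&self) {
        let mut m = self.maps.write().unwrap();
        m.outer_to_inner.clear();
        m.inner_to_outer.clear();
    }

    pub fn forward(&self, outer: u64) -> Option<u64> {
        self.maps.read().unwrap().outer_to_inner.get(&outer).copied()
    }

    pub fn backward(&self, inner: u64) -> Option<u64> {
        self.maps.read().unwrap().inner_to_outer.get(&inner).copied()
    }

    /// True when both directions hold the same number of entries.
    pub fn is_consistent(&self) -> bool {
        let m = self.maps.read().unwrap();
        m.outer_to_inner.len() == m.inner_to_outer.len()
    }
}
```

The trade-off is that forward and backward lookups now contend on the same lock; whether that matters depends on how hot the bridge is in practice.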

10 files reviewed | 1 comment


self.composite.child_inodes.insert(ino, org_idx);
self.composite.inode_to_slot.insert(ino, org_idx);
let _ = self.composite.child_inodes.insert_async(ino, org_idx).await;

Medium

ensure_org_inode now runs under &self and relies on try_reuse_org_inode’s any_async scan to detect existing entries. There is no serialization between that scan and this insert/reset block, so two concurrent lookups for the same org_idx can both miss the existing inode, each allocate a fresh ino, insert separate entries here, and both call slot.bridge.reset(). Whichever request completes last wins and re-seeds the bridge with its inode, while the other inode has already been returned to the kernel and is still recorded in child_inodes/inode_to_slot but no longer has any bridge mappings. Subsequent getattr/read/readdir calls on that inode will fail because the bridge can’t translate it anymore. Please add per-org synchronization (e.g. a mutex keyed by org_idx or an entry_async that ensures only one allocator wins) so that we only create a single inode and never reset the bridge twice for the same org.
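A minimal sketch of the "only one allocator wins" idea, under stated assumptions: `OrgInodes` is a hypothetical type, and a `std::sync::Mutex` around `HashMap::entry` stands in for the per-key entry on the concurrent map that the comment suggests. The point it illustrates is that the existence check and the insert happen in one critical section, and that the caller learns whether it was the one that created the inode.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical allocator mapping an org index to the single inode
/// that represents it.
pub struct OrgInodes {
    next_ino: AtomicU64,
    by_org: Mutex<HashMap<usize, u64>>,
}

impl OrgInodes {
    pub fn new(first_ino: u64) -> Self {
        OrgInodes {
            next_ino: AtomicU64::new(first_ino),
            by_org: Mutex::new(HashMap::new()),
        }
    }

    /// Returns `(ino, created)`. The lookup and the insert share one lock,
    /// so concurrent callers for the same `org_idx` cannot both allocate.
    /// Only the caller that sees `created == true` should seed or reset
    /// any per-org state such as the bridge.
    pub fn ensure_org_inode(&self, org_idx: usize) -> (u64, bool) {
        let mut map = self.by_org.lock().unwrap();
        match map.entry(org_idx) {
            Entry::Occupied(e) => (*e.get(), false),
            Entry::Vacant(v) => {
                let ino = self.next_ino.fetch_add(1, Ordering::Relaxed);
                v.insert(ino);
                (ino, true)
            }
        }
    }
}
```

A single mutex over the whole map serializes allocations for all orgs, which echoes the bottleneck noted in the analysis; a per-key entry on a concurrent map would narrow the critical section to one `org_idx`.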

Repository: mesa-dot-dev/gitfs#35
File: src/fs/mescloud/mod.rs#L304
