feat[Store]: Introduce shm helper for dummy #1177

YiXR · 2025-12-08T08:26:54Z

Description

Split the shm allocation logic into a helper class, so that applications such as SGLang can use the helper directly to create shared memory.

Type of Change

Types
- Bug fix
- New feature
  - Transfer Engine
  - Mooncake Store
  - Mooncake EP
  - Integration
  - P2P Store
  - Python Wheel
- Breaking change
- CI/CD
- Documentation update
- Other

How Has This Been Tested?

Checklist

I have performed a self-review of my own code.
I have updated the documentation.
I have added tests to prove my changes are effective.

gemini-code-assist · 2025-12-08T08:27:15Z

Summary of Changes

Hello @YiXR, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the shared memory management within the Mooncake Store by introducing a dedicated ShmHelper class. This helper centralizes the logic for allocating and managing shared memory, making it more robust and reusable. A key benefit is the exposure of this functionality to Python, allowing external applications like SGLang to directly interact with the shared memory system. The changes also streamline the DummyClient's internal shared memory handling and update the IPC mechanism to align with the use of anonymous shared memory.

Highlights

Shared Memory Helper Class: Introduced a ShmHelper class to centralize and encapsulate shared memory allocation and management logic, promoting reusability and cleaner code.
Python Binding for SHM Allocation: Exposed the ShmHelper functionality to Python through a new MooncakeHostMemAllocatorPyWrapper class, enabling Python applications like SGLang to directly allocate shared memory.
Simplified DummyClient SHM Management: The DummyClient now leverages the ShmHelper for its shared memory operations, removing redundant allocation/deallocation logic from the client itself and simplifying its internal state.
Anonymous Shared Memory Adoption: Transitioned to using anonymous shared memory via memfd_create, which removes the need for explicit shared memory names in the IPC registration process, streamlining communication.
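
The anonymous shared memory highlight can be illustrated with a minimal sketch. This is not the PR's code: the function name `demo_anonymous_shm` is hypothetical, and the sketch assumes Linux with glibc 2.27 or later, where `memfd_create` is declared in `<sys/mman.h>`. It maps the same descriptor twice to show that the region is identified by its file descriptor, not by a name.

```cpp
#include <sys/mman.h>  // memfd_create, mmap, munmap (Linux, glibc >= 2.27)
#include <unistd.h>    // ftruncate, close
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical demo: returns true if a write through one mapping of an
// anonymous memfd is visible through a second mapping of the same fd.
bool demo_anonymous_shm(std::size_t size) {
    // The name is only a debugging label; it does not identify the region.
    int fd = memfd_create("demo_shm", MFD_CLOEXEC);
    if (fd == -1) return false;
    if (ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return false;
    }
    void* a = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    bool ok = false;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        std::strcpy(static_cast<char*>(a), "hello");
        ok = std::strcmp(static_cast<char*>(b), "hello") == 0;
    }
    if (a != MAP_FAILED) munmap(a, size);
    if (b != MAP_FAILED) munmap(b, size);
    close(fd);
    return ok;
}
```

Calling `demo_anonymous_shm(4096)` should return true on a system that supports `memfd_create`.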


gemini-code-assist

Code Review

This pull request introduces a ShmHelper singleton to centralize shared memory allocation, which is a good step towards code reuse. However, the current implementation of the singleton pattern has several critical flaws. It leaks the singleton object and the underlying shared memory resources. The allocation method is not thread-safe against multiple calls, which can lead to resource leaks. Most critically, client code (DummyClient) incorrectly manages the lifecycle of the global singleton, leading to potential use-after-free bugs. I've provided detailed comments and suggestions to address these issues by adopting a modern C++ singleton pattern and correcting the resource management logic.

gemini-code-assist · 2025-12-08T08:29:51Z

mooncake-store/src/dummy_client.cpp

        LOG(ERROR) << "Failed to register SHM via IPC";
-        munmap(shm_base_addr_, shm_size_);
-        close(shm_fd_);
+        shm_helper_->cleanup();


Calling shm_helper_->cleanup() from DummyClient::setup_dummy on registration failure is incorrect. ShmHelper is a process-wide singleton, and one client failing should not tear down a global resource that other components (like MooncakeHostMemAllocatorPyWrapper or other DummyClient instances) might be using. This can lead to use-after-free bugs and crashes. The singleton should manage its own lifecycle. Please remove this call.

gemini-code-assist · 2025-12-08T08:29:51Z

mooncake-store/src/dummy_client.cpp

-        close(shm_fd_);
-        shm_fd_ = -1;
-    }
+    shm_helper_->cleanup();


Calling shm_helper_->cleanup() from DummyClient::tearDownAll is incorrect. ShmHelper is a process-wide singleton, and tearing down one client should not destroy a global resource that other components might still be using. This can lead to use-after-free bugs and crashes. The singleton should manage its own lifecycle and clean up its resources only when the process exits. Please remove this call.
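
One way to read the ownership rule in these two comments is that a client drops only its own view of the region and leaves the process-wide owner alone. The sketch below illustrates that idea under stated assumptions: `Pool` and `Client` are hypothetical names, and a heap buffer stands in for the mapped region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a process-wide shared-memory owner.
class Pool {
public:
    static Pool& instance() {
        static Pool pool;  // destroyed at process exit, not by clients
        return pool;
    }
    // Allocates on the first call and returns the same buffer afterwards.
    void* allocate(std::size_t size) {
        if (buffer_.empty()) buffer_.resize(size);
        return buffer_.data();
    }
    bool allocated() const { return !buffer_.empty(); }
private:
    Pool() = default;
    std::vector<char> buffer_;
};

// Hypothetical client that borrows the pool without owning it.
class Client {
public:
    void setup(std::size_t size) { base_ = Pool::instance().allocate(size); }
    // Teardown resets only local state; the pool stays alive for others.
    void teardown() { base_ = nullptr; }
    void* base() const { return base_; }
private:
    void* base_ = nullptr;
};
```

With two clients set up against the same pool, tearing one down leaves the other client's base address valid.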

gemini-code-assist · 2025-12-08T08:29:51Z

mooncake-store/src/dummy_client.cpp

+ShmHelper* ShmHelper::getInstance() {
+    if (instance_ == nullptr) {
+        std::lock_guard<std::mutex> lock(instance_mutex_);
+        if (instance_ == nullptr) {
+            instance_ = new ShmHelper();
+        }
+    }
+    return instance_;
+}


The current singleton implementation for ShmHelper leaks memory because the instance created with new is never deleted. This also prevents the destructor from running, which would leak the shared memory file descriptor and mapping.

A better approach in modern C++ is to use a static local variable (Meyers' Singleton). This is thread-safe since C++11 and guarantees proper cleanup at program exit.

To apply this, you should also:

Remove the instance_ and instance_mutex_ static members from the ShmHelper class definition in dummy_client.h.

Remove their definitions at the top of dummy_client.cpp (lines 18-19).

ShmHelper* ShmHelper::getInstance() {
    static ShmHelper instance;
    return &instance;
}
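
For readers unfamiliar with the pattern, here is a self-contained sketch of a Meyers' Singleton. The class name `Registry` and its construction counter are hypothetical and exist only to make the single-initialization behavior observable.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the pattern.
class Registry {
public:
    static Registry* getInstance() {
        // Initialized once, on first use; thread-safe since C++11.
        static Registry instance;
        return &instance;
    }
    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;
    // Number of times the constructor has run.
    static int constructions() { return counter(); }
private:
    Registry() { ++counter(); }
    ~Registry() = default;  // runs automatically at normal program exit
    static int& counter() {
        static int n = 0;
        return n;
    }
};
```

Repeated calls to `Registry::getInstance()` return the same pointer, and the constructor runs exactly once.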

gemini-code-assist · 2025-12-08T08:29:51Z

mooncake-store/src/dummy_client.cpp

+}
+
+void* ShmHelper::allocate(size_t size) {
+    shm_size_ = size;


The ShmHelper::allocate method is not safe to be called more than once. If it's called a second time, it will overwrite the internal shm_fd_ and shm_base_addr_ without cleaning up the old resources, leading to leaked file descriptors and memory mappings. The check in DummyClient::setup_dummy is not sufficient to prevent this in a multi-threaded environment, as it creates a race condition.

To make this thread-safe and prevent resource leaks, you should add a guard inside allocate to prevent re-allocation.

if (shm_fd_ != -1) {
    throw std::runtime_error("Shared memory has already been allocated.");
}
shm_size_ = size;

gemini-code-assist · 2025-12-08T08:29:51Z

mooncake-store/src/dummy_client.cpp

 #else
-    errno = ENOSYS;
-    return -1;
+    return -1;  // Or appropriate fallback/error


The memfd_create_wrapper function no longer sets errno when __NR_memfd_create is not available. This can lead to misleading error messages in ShmHelper::allocate, as strerror(errno) will report an unrelated or old error. Please restore setting errno to ENOSYS in the else branch to ensure correct error reporting on systems that do not support memfd_create.

Suggested change

-    return -1;  // Or appropriate fallback/error
+    errno = ENOSYS;
+    return -1;

Keep this; it's OK.
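
The concern about stale `errno` values can be shown with a small sketch. The function names are hypothetical, and the stale value is simulated by assigning `ENOENT` before the call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical wrapper for a platform that lacks the syscall. Setting errno
// here keeps later strerror(errno) messages accurate.
int unsupported_create_wrapper() {
    errno = ENOSYS;
    return -1;
}

// Builds the error text a caller would log after a failed call.
std::string describe_failure() {
    errno = ENOENT;  // simulate a stale value left by an unrelated call
    if (unsupported_create_wrapper() == -1) {
        return std::string("create failed: ") + std::strerror(errno);
    }
    return "ok";
}
```

Without the `errno = ENOSYS;` assignment in the wrapper, the message would describe the stale `ENOENT` instead of the real cause.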

XucSh · 2025-12-10T03:11:49Z

mooncake-store/src/dummy_client.cpp

 }

+ShmHelper* ShmHelper::getInstance() {
+    static ShmHelper instance;


Why make ShmHelper a singleton?

shm is a global memory pool that can only be initialized once. Once an application such as SGLang calls the shm allocator, the dummy client reuses that shm and does not need to allocate a new one.

XucSh · 2025-12-10T03:13:36Z

mooncake-integration/store/store_py.cpp

+            py::gil_scoped_release release;
+            if (!self.shm_helper_) {
+                LOG(ERROR) << "Shared memory allocator is not initialized";
+                return static_cast<uintptr_t>(0);


Raising a fatal error may be better here.

XucSh · 2025-12-10T03:17:24Z

mooncake-integration/store/store_py.cpp


+    py::class_<MooncakeHostMemAllocatorPyWrapper>(m, "MooncakeHostMemAllocator")
+        .def(py::init<>())
+        .def("alloc", [](MooncakeHostMemAllocatorPyWrapper &self, size_t size) {


A free API is also needed.

XucSh · 2025-12-10T03:18:56Z

mooncake-store/src/dummy_client.cpp

+    shm_size_ = 0;
+}
+
+void* ShmHelper::allocate(size_t size) {


This needs a mutex guard.

stmatengss · 2025-12-10T16:56:43Z

mooncake-store/include/dummy_client.h

+    int shm_fd_ = -1;
+    void *shm_base_addr_ = nullptr;
+    size_t shm_size_ = 0;


Do we need an optional shm_name here?

We use anonymous shm here, so the shm name can no longer serve as an identifier and we no longer use it.

stmatengss · 2025-12-10T16:57:58Z

mooncake-store/src/dummy_client.cpp

+
+void ShmHelper::cleanup() {
+    if (shm_base_addr_) {
+        munmap(shm_base_addr_, shm_size_);


munmap failure requires error handling.
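
A minimal sketch of what checking the `munmap` result could look like. The helper names are hypothetical, `std::fprintf` stands in for the project's `LOG(ERROR)`, and an anonymous private mapping is used so the example does not depend on `ShmHelper`.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: unmaps a region and reports failure instead of
// silently ignoring it. Returns true on success.
bool unmap_checked(void* addr, std::size_t size) {
    if (munmap(addr, size) == -1) {
        std::fprintf(stderr, "munmap failed: %s\n", std::strerror(errno));
        return false;
    }
    return true;
}

// Maps an anonymous region for the demonstration; returns nullptr on failure.
void* map_anonymous(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}
```

On Linux, passing an address that is not page-aligned makes `munmap` fail with `EINVAL`, which is an easy way to exercise the error path.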

Copilot

Pull request overview

This PR introduces a new ShmHelper singleton class to extract and centralize shared memory allocation logic from DummyClient, making it reusable by external applications like SGLang. The refactoring simplifies the IPC protocol by removing the shm_name parameter (now generated internally by RealClient) and exposes the helper through Python bindings via MooncakeHostMemAllocator.

Key Changes

Introduced ShmHelper singleton class for shared memory allocation using memfd_create
Refactored DummyClient::setup_dummy() to use ShmHelper instead of inline allocation
Removed shm_name from ShmRegisterRequest struct and map_shm_internal() signature
Added Python bindings for MooncakeHostMemAllocatorPyWrapper to expose shared memory allocation

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| mooncake-store/include/dummy_client.h | Defines new `ShmHelper` singleton class with allocation/cleanup methods and updates `DummyClient` to use `ShmHelper*` instead of `shm_name` |
| mooncake-store/src/dummy_client.cpp | Implements `ShmHelper` with memfd-based allocation, refactors `setup_dummy()` to use singleton, simplifies `tearDownAll()` cleanup |
| mooncake-store/include/real_client.h | Updates `map_shm_internal()` signature to remove `shm_name` parameter |
| mooncake-store/src/real_client.cpp | Generates `shm_name` internally in `map_shm_internal()`, removes formatting-only changes to log statements |
| mooncake-store/include/pyclient.h | Removes `shm_name` field from `ShmRegisterRequest` struct |
| mooncake-integration/store/store_py.cpp | Adds `MooncakeHostMemAllocatorPyWrapper` class to expose `ShmHelper` to Python with `alloc()` method |

Copilot · 2025-12-10T17:01:08Z

mooncake-store/src/dummy_client.cpp

-        close(shm_fd_);
        shm_fd_ = -1;
        shm_base_addr_ = nullptr;
+        shm_size_ = 0;


When IPC registration fails, the code resets the local member variables but doesn't clean up the allocated shared memory in the singleton ShmHelper. Since ShmHelper is a singleton with process-wide lifetime, the allocated memory will remain until process termination. Consider calling shm_helper_->cleanup() on failure, or document this behavior if intentional.

Suggested change

-        shm_size_ = 0;
+        shm_size_ = 0;
+        if (shm_helper_) {
+            shm_helper_->cleanup();
+        }

We do not clean up here, because shm_helper_ is still valid.

Copilot · 2025-12-10T17:01:09Z

mooncake-store/src/dummy_client.cpp

+void* ShmHelper::allocate(size_t size) {
+    if (shm_fd_ != -1) {
+        throw std::runtime_error("Shared memory has already been allocated.");
+    }
+
+    shm_size_ = size;
+
+    // Create memfd
+    shm_fd_ = memfd_create_wrapper("mooncake_shm", MFD_CLOEXEC);
+    if (shm_fd_ == -1) {
+        throw std::runtime_error("Failed to create anonymous shared memory: " +
+                                 std::string(strerror(errno)));
+    }
+
+    // Set size
+    if (ftruncate(shm_fd_, shm_size_) == -1) {
+        close(shm_fd_);
+        throw std::runtime_error("Failed to set shared memory size: " +
+                                 std::string(strerror(errno)));
+    }
+
+    // Map memory
+    shm_base_addr_ = mmap(nullptr, shm_size_, PROT_READ | PROT_WRITE,
+                          MAP_SHARED, shm_fd_, 0);
+    if (shm_base_addr_ == MAP_FAILED) {
+        close(shm_fd_);
+        throw std::runtime_error("Failed to map shared memory: " +
+                                 std::string(strerror(errno)));
+    }
+
+    return shm_base_addr_;
+}


The allocate method is not thread-safe. Multiple threads could concurrently call this method on the singleton instance, leading to race conditions when checking shm_fd_ != -1 and when modifying member variables. Consider adding a mutex to protect the allocation process, or document that this method should only be called once from a single thread.
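
A hedged sketch of the mutex-guarded variant this comment asks for. `GuardedAllocator` is a hypothetical class, and `std::malloc` stands in for the memfd/mmap path so the example stays focused on the locking and the re-allocation guard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <stdexcept>

// Hypothetical allocator; std::malloc stands in for the memfd/mmap path.
class GuardedAllocator {
public:
    ~GuardedAllocator() { std::free(base_); }
    void* allocate(std::size_t size) {
        // The lock makes the "already allocated" check and the state
        // update one atomic step with respect to other callers.
        std::lock_guard<std::mutex> lock(mutex_);
        if (base_ != nullptr) {
            throw std::runtime_error("Shared memory has already been allocated.");
        }
        void* p = std::malloc(size);
        if (p == nullptr) {
            throw std::runtime_error("Allocation failed.");
        }
        base_ = p;  // commit state only after every step succeeded
        size_ = size;
        return base_;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return size_;
    }
private:
    mutable std::mutex mutex_;
    void* base_ = nullptr;
    std::size_t size_ = 0;
};
```

The first `allocate` call succeeds, and any later call throws instead of overwriting the existing state.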

Copilot · 2025-12-10T17:01:09Z

mooncake-store/src/dummy_client.cpp

+    shm_base_addr_ = mmap(nullptr, shm_size_, PROT_READ | PROT_WRITE,
+                          MAP_SHARED, shm_fd_, 0);
+    if (shm_base_addr_ == MAP_FAILED) {
+        close(shm_fd_);


When mmap fails, the function closes shm_fd_ but doesn't reset it to -1 or reset shm_size_ before throwing. This leaves the object in an inconsistent state. Add shm_fd_ = -1; and shm_size_ = 0; after close(shm_fd_); to ensure proper cleanup on error.

Suggested change

-        close(shm_fd_);
+        close(shm_fd_);
+        shm_fd_ = -1;
+        shm_size_ = 0;

Copilot · 2025-12-10T17:01:09Z

mooncake-integration/store/store_py.cpp

+            if (!self.shm_helper_) {
+                LOG(ERROR) << "Shared memory allocator is not initialized";
+                return static_cast<uintptr_t>(0);
+            }


Similar to the previous issue, the null check if (!self.shm_helper_) will never be true because ShmHelper::getInstance() always returns a non-null pointer. This check can be removed, or if you want to handle initialization failures, this should be checked after the getInstance() call in the constructor.

Suggested change

-            if (!self.shm_helper_) {
-                LOG(ERROR) << "Shared memory allocator is not initialized";
-                return static_cast<uintptr_t>(0);
-            }

Copilot · 2025-12-10T17:01:10Z

mooncake-store/src/dummy_client.cpp

+
+    // Set size
+    if (ftruncate(shm_fd_, shm_size_) == -1) {
+        close(shm_fd_);


When ftruncate fails, the function closes shm_fd_ but doesn't reset it to -1 before throwing. This leaves shm_fd_ in an invalid state (pointing to a closed file descriptor). Add shm_fd_ = -1; after close(shm_fd_); to maintain consistency with cleanup() pattern.

Suggested change

-        close(shm_fd_);
+        close(shm_fd_);
+        shm_fd_ = -1;
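
An alternative to resetting the member by hand on every error path is a small RAII guard that owns the descriptor until setup fully succeeds. This is a sketch of that swapped-in technique, not what the PR does: `FdGuard`, `is_open`, and `open_checked` are hypothetical, and `/dev/null` is opened only to obtain a valid descriptor.

```cpp
#include <fcntl.h>   // open, fcntl
#include <unistd.h>  // close
#include <cassert>

// Hypothetical RAII wrapper: closes the descriptor on scope exit unless
// ownership was handed over with release().
class FdGuard {
public:
    explicit FdGuard(int fd) : fd_(fd) {}
    ~FdGuard() {
        if (fd_ != -1) close(fd_);
    }
    FdGuard(const FdGuard&) = delete;
    FdGuard& operator=(const FdGuard&) = delete;
    int get() const { return fd_; }
    // Gives up ownership; the caller is now responsible for closing.
    int release() {
        int fd = fd_;
        fd_ = -1;
        return fd;
    }
private:
    int fd_;
};

// Returns true if fd refers to an open descriptor.
bool is_open(int fd) { return fcntl(fd, F_GETFD) != -1; }

// Simulates a setup routine; on failure the guard closes the descriptor
// and the caller never sees a stale value.
int open_checked(bool simulate_failure) {
    FdGuard guard(open("/dev/null", O_RDONLY));
    if (guard.get() == -1 || simulate_failure) return -1;
    return guard.release();
}
```

With this shape, a member such as `shm_fd_` would only be assigned from `release()` after every step has succeeded, so it never holds a closed descriptor.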

Copilot · 2025-12-10T17:01:10Z

mooncake-store/src/dummy_client.cpp

+    if (!shm_helper_) {
+        LOG(ERROR) << "Failed to get shared memory allocator";
        return -1;
    }


The null check if (!shm_helper_) is unnecessary because ShmHelper::getInstance() always returns a non-null pointer to a static instance. This check will never be true and can be removed.

Suggested change

-    if (!shm_helper_) {
-        LOG(ERROR) << "Failed to get shared memory allocator";
-        return -1;
-    }

Copilot · 2025-12-10T17:01:10Z

mooncake-store/src/dummy_client.cpp

+        shm_base_addr_ = shm_helper_->allocate(shm_size_);
+        if (shm_base_addr_ == MAP_FAILED) {
+            LOG(ERROR) << "Failed to allocate shared memory";


The check if (shm_base_addr_ == MAP_FAILED) will never be true because allocate() throws an exception on failure and never returns MAP_FAILED. Either handle the exception with a try-catch block, or remove this check. Currently, if allocation fails, the exception will propagate and the error log won't be printed.

Suggested change

-        shm_base_addr_ = shm_helper_->allocate(shm_size_);
-        if (shm_base_addr_ == MAP_FAILED) {
-            LOG(ERROR) << "Failed to allocate shared memory";
+        try {
+            shm_base_addr_ = shm_helper_->allocate(shm_size_);
+        } catch (const std::exception& e) {
+            LOG(ERROR) << "Failed to allocate shared memory: " << e.what();
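
The same idea as a complete, compilable sketch. The names `allocate_or_throw` and `setup_buffer` are hypothetical, a static array stands in for the mapped region, and `std::fprintf` replaces `LOG(ERROR)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <exception>
#include <stdexcept>

// Hypothetical allocator that reports failure by throwing, never by
// returning a sentinel such as MAP_FAILED.
void* allocate_or_throw(std::size_t size) {
    static char storage[64];
    if (size == 0 || size > sizeof(storage)) {
        throw std::runtime_error("requested size is not supported");
    }
    return storage;
}

// Caller that converts the exception into the -1 error code used by the
// surrounding setup function. *out is left untouched on failure.
int setup_buffer(std::size_t size, void** out) {
    try {
        *out = allocate_or_throw(size);
    } catch (const std::exception& e) {
        std::fprintf(stderr, "Failed to allocate shared memory: %s\n", e.what());
        return -1;
    }
    return 0;
}
```

Because the throw happens before the assignment completes, the output pointer keeps its previous value when allocation fails.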

Copilot · 2025-12-10T17:01:11Z

mooncake-integration/store/store_py.cpp

+        .def("alloc", [](MooncakeHostMemAllocatorPyWrapper &self, size_t size) {
+            py::gil_scoped_release release;
+            if (!self.shm_helper_) {
+                LOG(ERROR) << "Shared memory allocator is not initialized";
+                return static_cast<uintptr_t>(0);
+            }
+            void *ptr = self.shm_helper_->allocate(size);
+            return reinterpret_cast<uintptr_t>(ptr);
+        });


The alloc method should handle exceptions from shm_helper_->allocate(size), which can throw std::runtime_error. Currently, if allocation fails, an exception will propagate to Python. Consider wrapping the call in a try-catch block to log the error and return 0, or let the exception propagate but document this behavior.

The memory is required on the Python side, so a failed allocation must raise an error.

Copilot · 2025-12-10T17:01:11Z

mooncake-store/src/dummy_client.cpp

+    } else {
+        // Shared memory already allocated, just get the base address and size
+        shm_base_addr_ = shm_helper_->get_base_addr();
+        shm_size_ = shm_helper_->get_size();


The branch logic assumes that if shared memory is already allocated, the existing allocation size (shm_helper_->get_size()) matches the requested size (local_buffer_size + mem_pool_size). However, if setup_dummy is called with different parameters on subsequent calls, this could lead to mismatches between the actual buffer size and the expected size. Consider validating that shm_size_ matches the requested size, or document that the first allocation parameters are permanent.

Suggested change

-        shm_size_ = shm_helper_->get_size();
+        shm_size_ = shm_helper_->get_size();
+        size_t requested_size = local_buffer_size + mem_pool_size;
+        if (shm_size_ != requested_size) {
+            LOG(ERROR) << "Shared memory size mismatch: existing size (" << shm_size_
+                       << ") does not match requested size (" << requested_size << ")";
+            return -1;
+        }
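
A standalone sketch of the size check in this suggestion. `Region` and `reuse_region` are hypothetical names, and the parameter names mirror `local_buffer_size` and `mem_pool_size` from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical view of an existing region.
struct Region {
    void* base;
    std::size_t size;
};

// Accepts reuse only when the existing region has exactly the requested
// size. Returns 0 on success and -1 on mismatch; *out is set on success.
int reuse_region(const Region& existing, std::size_t local_buffer_size,
                 std::size_t mem_pool_size, Region* out) {
    const std::size_t requested = local_buffer_size + mem_pool_size;
    if (existing.size != requested) {
        std::fprintf(stderr,
                     "Shared memory size mismatch: existing size (%zu) does "
                     "not match requested size (%zu)\n",
                     existing.size, requested);
        return -1;
    }
    *out = existing;
    return 0;
}
```

Whether a mismatch should be an error or merely documented as "first allocation wins" is a design choice the comment leaves open; this sketch takes the stricter option.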

Signed-off-by: Xingrui Yi <[email protected]>

YiXR requested review from XucSh, stmatengss and ykwd as code owners December 8, 2025 08:26

github-actions bot added the run-ci label Dec 8, 2025

gemini-code-assist bot reviewed Dec 8, 2025

YiXR force-pushed the xinyi/shm branch from 92302f2 to 51ffcc0 Compare December 8, 2025 09:18

YiXR mentioned this pull request Dec 8, 2025

feat[Store]: Add standalone deployment implementation for Client #1084

Merged

45 tasks

YiXR changed the title ~~[Store]: Introduce shm helper for dummy~~ feat[Store]: Introduce shm helper for dummy Dec 8, 2025

YiXR force-pushed the xinyi/shm branch from 51ffcc0 to a94f9a2 Compare December 9, 2025 02:44

XucSh reviewed Dec 10, 2025

stmatengss requested a review from Copilot December 10, 2025 16:54

Copilot started reviewing on behalf of stmatengss December 10, 2025 16:55

stmatengss reviewed Dec 10, 2025

Copilot AI reviewed Dec 10, 2025

[Store]: Introduce shm helper for dummy

2d1d426

Signed-off-by: Xingrui Yi <[email protected]>

YiXR force-pushed the xinyi/shm branch from a94f9a2 to 2d1d426 Compare December 11, 2025 02:24

XucSh approved these changes Dec 11, 2025

stmatengss merged commit 5c3d04f into kvcache-ai:main Dec 11, 2025
14 checks passed

YiXR mentioned this pull request Dec 11, 2025

[RoadMap][Call For Contributions] Mooncake Store V3 Roadmap #1035

Open

42 tasks
