Feedback and support: TensorBlock's Twitter/X, Telegram Group and Discord server
Proof of Cache (PoC) is an efficient verification protocol for decentralized inference in large language models (LLMs).
Decentralized inference networks for LLMs enable the deployment of complex models across multiple machines, facilitating collaborative inference. However, ensuring computational integrity in these architectures remains challenging. Existing methods, such as redundant computation, are inefficient and costly. In this work, we introduce a novel KV-cache sampling approach, leveraging the deterministic internal states of LLMs to enable lightweight and efficient verification. By reducing computational redundancy and maximizing hardware parallelism, this approach significantly enhances verification efficiency. Additionally, we propose a protocol design for the verification process that ensures fairness among participants and long-term system sustainability. Finally, we discuss payoff distributions and outline potential avenues for optimizing the protocol in future work.
- KV-cache sampling
- KV-cache matching
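To make the sampling-and-matching idea concrete, here is a minimal toy sketch in Python. It assumes (hypothetically) that each KV-cache entry can be recomputed deterministically from the prompt and its position; in a real transformer the verifier would re-run the relevant layers instead. All function names (`kv_entry`, `prover_respond`, `verify`) are illustrative, not part of the PoC implementation.

```python
import hashlib
import random

# Hypothetical stand-in for an LLM layer: deterministically derives a
# "KV-cache entry" for each (layer, token) position from the prompt.
def kv_entry(prompt: str, layer: int, token: int) -> bytes:
    return hashlib.sha256(f"{prompt}|{layer}|{token}".encode()).digest()

def full_cache(prompt: str, n_layers: int = 4, n_tokens: int = 8):
    # The prover runs full inference and materializes the whole cache.
    return {(l, t): kv_entry(prompt, l, t)
            for l in range(n_layers) for t in range(n_tokens)}

def prover_respond(prompt, sample_positions):
    # The prover reveals only the sampled cache entries.
    cache = full_cache(prompt)
    return {pos: cache[pos] for pos in sample_positions}

def verify(prompt, response) -> bool:
    # The verifier recomputes just the sampled entries and matches them,
    # avoiding a fully redundant inference pass.
    return all(kv_entry(prompt, l, t) == v for (l, t), v in response.items())

rng = random.Random(0)
positions = [(rng.randrange(4), rng.randrange(8)) for _ in range(3)]
resp = prover_respond("hello world", positions)
print(verify("hello world", resp))       # honest prover passes

tampered = dict(resp)
tampered[positions[0]] = b"\x00" * 32    # corrupt one sampled entry
print(verify("hello world", tampered))   # mismatch is detected
```

Because the cache entries are deterministic, any deviation by the prover at a sampled position is caught with certainty, and the verifier's cost scales with the number of samples rather than the full inference.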
A preliminary report detailing our approach, methodology, and early findings is included in this repository. We encourage you to read it for a deeper technical understanding of the PoC protocol.
We welcome all forms of feedback and discussion. Please raise an issue in this repository to:
- Ask questions
- Suggest improvements
- Report bugs
- Discuss potential directions
We look forward to collaborating with the community to refine and optimize this protocol!