Skip to content

Fix RPC hang: route ABCI via custom GRPCQueryRouter; enforce local-only bank iterators; make /abci_info responsive#10

Open
tiljrd wants to merge 3 commits intodevin/1755993593-wasm-on-demand-materializefrom
devin/1757951061-rpc-query-hang-fix
Open

Fix RPC hang: route ABCI via custom GRPCQueryRouter; enforce local-only bank iterators; make /abci_info responsive#10
tiljrd wants to merge 3 commits intodevin/1755993593-wasm-on-demand-materializefrom
devin/1757951061-rpc-query-hang-fix

Conversation

@tiljrd
Copy link
Collaborator

@tiljrd tiljrd commented Sep 16, 2025

User description

Fix RPC hang: route ABCI via custom GRPCQueryRouter; enforce local-only bank iterators; make /abci_info responsive

Summary

Fixes the RPC hanging issue where thornode query bank balances <address> would timeout without gRPC flags, while keeping REST and gRPC behavior intact. The root cause was that ABCI queries (used by RPC) weren't routing through the custom GRPCQueryRouter with BankQueryWrapper, and bank store iterators could still attempt remote fetches causing hangs.

Key changes:

  • Router wiring: Ensures BaseApp uses the custom GRPCQueryRouter so ABCI queries hit BankQueryWrapper
  • Local-only iterators: Forces bank/auth/acc store iterators to return local-only results, preventing remote fetch hangs
  • Debug instrumentation: Adds gated logging (THOR_FORK_DEBUG=1) to trace query paths and store operations
  • Service registration: Updates Tx and Tendermint services to use the custom router consistently

Review & Testing Checklist for Human

  • Test RPC queries work without flags: thornode query bank balances thor10qvy72uwd6en70w4ctswnx4534epghgcccu0dx -o json returns promptly with correct balances (no timeout)
  • Verify /abci_info is responsive: curl http://localhost:26657/abci_info returns JSON without timing out
  • Confirm REST/gRPC still work: Both curl http://localhost:1317/cosmos/bank/v1beta1/balances/<addr> and thornode query bank balances <addr> --grpc-addr localhost:9090 --grpc-insecure return matching results
  • Test with funded validator address: Use a known funded address to ensure balances are actually returned, not just empty responses
  • Check logs show no remote bank fetches: With THOR_FORK_DEBUG=1, verify bank store operations log allowRemote=false and no remote fetch attempts

Notes

  • Changes affect core query routing infrastructure - test thoroughly before merging
  • Debug logging should remain gated behind THOR_FORK_DEBUG=1 to avoid production noise
  • Bank balances now guaranteed to be local-only (never remote) as per requirements
  • May need to rebuild thornode image and redeploy enclave to test changes

Link to Devin run: https://app.devin.ai/sessions/0b323a13287f4954866e09d7212be2d1
Requested by: @tiljrd


PR Type

Bug fix, Enhancement


Description

  • Fix RPC hanging issue for bank balance queries

  • Route ABCI queries through custom GRPCQueryRouter

  • Force local-only iterators for bank/auth/acc stores

  • Add debug logging for query path tracing


Diagram Walkthrough

flowchart LR
  RPC["RPC Query"] --> ABCI["ABCI Handler"]
  ABCI --> CustomRouter["Custom GRPCQueryRouter"]
  CustomRouter --> BankWrapper["BankQueryWrapper"]
  BankWrapper --> LocalStore["Local-only Store"]
  LocalStore --> Response["Query Response"]
Loading

File Walkthrough

Relevant files
Bug fix
app.go
Wire custom GRPCQueryRouter for ABCI queries                         

app/app.go

  • Set BaseApp to use custom GRPCQueryRouter for ABCI queries
  • Route Tx and Tendermint services through custom router
  • Add GRPCQueryRouter() getter method
  • Minor import formatting cleanup
+9/-2     
store.go
Enforce local-only bank store operations                                 

x/thorchain/forking/store.go

  • Force local-only iterators for bank/auth/acc stores
  • Add debug logging for store operations (Get, Iterator,
    ReverseIterator)
  • Prevent remote fetches for critical store types
  • Gate debug output behind THOR_FORK_DEBUG flag
+16/-3   
Enhancement
query_service_router.go
Add debug logging to bank query wrapper                                   

app/query_service_router.go

  • Add debug logging to BankQueryWrapper methods
  • Log Balance and AllBalances query entry points
  • Gate logging behind THOR_FORK_DEBUG environment variable
+8/-0     

…er; add debug logs for bank/auth/acc RPC path
…ing(store): force local-only iterators for bank/auth/acc on RPC/ABCI paths
@devin-ai-integration
Copy link

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@qodo-code-review
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Routing Consistency

Verify that all components that previously depended on BaseApp.GRPCQueryRouter now correctly use the custom router, and that setting the BaseApp router does not introduce double-registration or missed routes.

app.queryServiceRouter = NewQueryServiceRouter(app.BaseApp.GRPCQueryRouter())
app.queryServiceRouter.AddCustomRoute("cosmos.bank.v1beta1.Query", NewBankQueryWrapper(app.BankKeeper))
app.queryServiceRouter.AddCustomRoute("cosmwasm.wasm.v1.Query", NewWasmQueryWrapper(app, &app.WasmKeeper, wasmkeeper.NewGrpcQuerier(app.appCodec, wasmStoreService, &app.WasmKeeper, wasmConfig.SmartQueryGasLimit)))
app.configurator = module.NewConfigurator(app.appCodec, app.msgServiceRouter, app.queryServiceRouter)
err = app.ModuleManager.RegisterServices(app.configurator)
if err != nil {
	panic(err)
}
app.BaseApp.SetGRPCQueryRouter(app.GRPCQueryRouter())


// RegisterUpgradeHandlers is used for registering any on-chain upgrades.
// Make sure it's called after `app.ModuleManager` and `app.configurator` are set.
app.RegisterUpgradeHandlers()

autocliv1.RegisterQueryServer(app.GRPCQueryRouter(), runtimeservices.NewAutoCLIQueryService(app.ModuleManager.Modules))
Behavior Change

Enforcing local-only iterators for bank/auth/acc can alter query semantics; ensure pagination and range queries still return complete results and do not rely on remote data in any code paths.

func (f *forkingKVStore) Delete(key []byte) error {
	if err := f.parent.Delete(key); err != nil {
		return err
	}
	if f.config.CacheEnabled {
		f.cache.Delete(key)
	}
	return nil
}

func (f *forkingKVStore) Iterator(start, end []byte) (storetypes.Iterator, error) {
	if os.Getenv("THOR_FORK_DEBUG") == "1" && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
		s, e := start, end
		fmt.Printf("[rpc-debug][ITER] store=%s start=%s end=%s allowRemote=%v\n", f.storeKey, hex.EncodeToString(s), hex.EncodeToString(e), f.shouldAllowRemoteFetch())
	}
	localIter, err := f.parent.Iterator(start, end)
	if err != nil {
		fmt.Printf("[forking][ITER] local-err store=%s err=%v\n", f.storeKey, err)
		return nil, err
	}

	if f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc" || !f.shouldAllowRemoteFetch() {
		return localIter, nil
	}

	remoteIter, rerr := f.getRemoteIterator(start, end)
	if rerr != nil {
		return localIter, nil
	}
	return NewMergedIterator(localIter, remoteIter), nil
}

func (f *forkingKVStore) ReverseIterator(start, end []byte) (storetypes.Iterator, error) {
	if os.Getenv("THOR_FORK_DEBUG") == "1" && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
		s, e := start, end
		fmt.Printf("[rpc-debug][RITER] store=%s start=%s end=%s allowRemote=%v\n", f.storeKey, hex.EncodeToString(s), hex.EncodeToString(e), f.shouldAllowRemoteFetch())
	}
	localIter, err := f.parent.ReverseIterator(start, end)
	if err != nil {
		fmt.Printf("[forking][RITER] local-err store=%s err=%v\n", f.storeKey, err)
		return nil, err
	}

	if f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc" || !f.shouldAllowRemoteFetch() {
		return localIter, nil
	}

	remoteIter, rerr := f.getRemoteReverseIterator(start, end)
	if rerr != nil {
		return localIter, nil
Debug Logging

Direct fmt.Printf logging gated by an env var may spam stdout on busy nodes; consider using structured logger and confirm that THOR_FORK_DEBUG cannot be toggled remotely.

	if os.Getenv("THOR_FORK_DEBUG") == "1" {
		fmt.Printf("[rpc-debug] BankQueryWrapper.Balance entry\n")
	}
	ctx := w.extractUserAPICall(goCtx)
	return w.originalHandler.Balance(ctx, req)
}

func (w *BankQueryWrapper) AllBalances(goCtx context.Context, req *banktypes.QueryAllBalancesRequest) (*banktypes.QueryAllBalancesResponse, error) {
	if os.Getenv("THOR_FORK_DEBUG") == "1" {
		fmt.Printf("[rpc-debug] BankQueryWrapper.AllBalances entry\n")
	}
	ctx := w.extractUserAPICall(goCtx)
	return w.originalHandler.AllBalances(ctx, req)

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Cache environment variable to improve performance

To improve performance, read the THOR_FORK_DEBUG environment variable once at
package initialization and store it in a package-level variable. This avoids
repeated system calls within the Get, Iterator, and ReverseIterator methods.

x/thorchain/forking/store.go [94-223]

+var isForkDebugEnabled = os.Getenv("THOR_FORK_DEBUG") == "1"
+
 func (f *forkingKVStore) Get(key []byte) ([]byte, error) {
-	if os.Getenv("THOR_FORK_DEBUG") == "1" && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
+	if isForkDebugEnabled && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
 		fmt.Printf("[rpc-debug][GET] store=%s key=%s allowRemote=%v\n", f.storeKey, hex.EncodeToString(key), f.shouldAllowRemoteFetch())
 	}
 ...
 func (f *forkingKVStore) Iterator(start, end []byte) (storetypes.Iterator, error) {
-	if os.Getenv("THOR_FORK_DEBUG") == "1" && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
+	if isForkDebugEnabled && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
 		s, e := start, end
 		fmt.Printf("[rpc-debug][ITER] store=%s start=%s end=%s allowRemote=%v\n", f.storeKey, hex.EncodeToString(s), hex.EncodeToString(e), f.shouldAllowRemoteFetch())
 	}
 ...
 func (f *forkingKVStore) ReverseIterator(start, end []byte) (storetypes.Iterator, error) {
-	if os.Getenv("THOR_FORK_DEBUG") == "1" && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
+	if isForkDebugEnabled && (f.storeKey == "bank" || f.storeKey == "auth" || f.storeKey == "acc") {
 		s, e := start, end
 		fmt.Printf("[rpc-debug][RITER] store=%s start=%s end=%s allowRemote=%v\n", f.storeKey, hex.EncodeToString(s), hex.EncodeToString(e), f.shouldAllowRemoteFetch())
 	}
 ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a performance optimization by caching the result of os.Getenv. While this is a valid improvement for code that may be on a hot path, the performance gain is likely minor, especially since this is for a debug flag.

Low
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant