Skip to content

[Feature] Automate setting of account-like optimization #4579

@lutter

Description

@lutter

The account-like optimization is turned on with graphman stats account-like <sgdNNN> <table> and changes query generation for that table by adding to the clause block_range @> $block that all our queries contain the redundant clauses lower(block_range <= $block and coalesce(upper(block_range), 2147483647) > $block. That makes the BRIN index on these expressions that we have on all tables usable by the query planner; it can bring about a dramatic improvement in query speed when it works, but unfortunately, it can also slow down queries a lot in some cases. That's why we don't use it by default.

The general rule of thumb is that if the ratio of entities to versions (count(distinct id)/count(*)) is very low, say < 0.1%, that the account-like optimization can be useful. Because of the BRIN index, it is very dependent on the exact layout of data on disk. In Uniswap-like subgraphs, it usually is very effective for tables like token and pair, since in those, versions that are visible at a certain block tend to cluster. It's usually not effective for tables like token_day_data or pair_hour_data where versions with an unlimited upper bound on their block range are interspersed with very old versions throughout the heap. It's also not effective for entities that are immutable, as they all have an unlimited upper bound on their block range - if subgraph authors declare these entities as immutable, this isn't an issue as we don't store a block range in that case, just a start block and can use BTree indexes to search. But for entities that are immutable but not declared as such it can become an issue.

It would be great if we could have a reliable indication of when the account-like optimization will be effective. For example, we could look at how selective the BRIN index is at higher block numbers and adapt query generation accordingly. It's not clear to me how best we would gauge the selectiveness of the BRIN index; we might have to resort to page_inspect or some such.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions