Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create OLAP cube function on top of activity schema #86

Open
matthieu-carmeille opened this issue Jan 28, 2025 · 5 comments
Open

Create OLAP cube function on top of activity schema #86

matthieu-carmeille opened this issue Jan 28, 2025 · 5 comments

Comments

@matthieu-carmeille
Copy link
Contributor

matthieu-carmeille commented Jan 28, 2025

Hi @bcodell ,

Inspired by this article, it would be great to create a macro on top of activity schema to build OLAP cubes, wdyt ?

Started to write the macro (cf attached), we would need to adapt it to activity schema.

generate_olap_model.txt

How to use the macro:

{{ generate_olap_model(
    table_name='"dw"."share_point_finance"."revenue_data_report"',
    observation_date='revenue_date',
    start_date='2024-01-01',
    time_interval='quarter',
    cube_dimensions=['customer_type', 'new_expansion', 'country'],
    metrics=[
        {"dimension": "total_mrc", "operation": "SUM", "alias": "mrr"},
        {"dimension": "total_nrc", "operation": "SUM", "alias": "nrr"},
        {"dimension": "total_net", "operation": "SUM", "alias": "revenue"}
    ]
) }}```


Possible improvements to be done:
- make it incremental
- creating directly multiple time intervals within aggregation level (day, month, quarter) instead of only one currently. 



@bcodell
Copy link
Owner

bcodell commented Jan 28, 2025

@matthieu-carmeille interesting! I have no problem including this in the package. Feel free to open a PR!

Curious about your vision for the use case - Is the idea here that you'll pre-compute metric calculations for every single combination of time and dimensions, so that segmentation in BI tools effectively just looking up a set of rows, rather than recalculating all metrics every time a user wants to change their dimensions of interest? Or am I misunderstanding the use case?

@matthieu-carmeille
Copy link
Contributor Author

yes exactly, idea would be to have all possible combinations of dimensions precomputed in one data model to have everything in one place and improve performance in the BI tool.

@bcodell
Copy link
Owner

bcodell commented Jan 28, 2025

Makes sense! Admittedly I think slow in-app loading should be solved by the BI tools that experience the slow loading (since they're the ones writing and issuing the queries), but as we know that doesn't always happen on the timeline we practitioners want, so mass precomputing seems like a reasonable workaround.

Feel free to take a stab at this - let me know if I can be helpful at all!

@matthieu-carmeille
Copy link
Contributor Author

That's correct. However what would make this use case different from the macros in this package to create datasets on top of the activity schema ?

@bcodell
Copy link
Owner

bcodell commented Jan 28, 2025

I don't think it is! I think it's just a wrapper on top of the existing functionality - at least how I'd approach it. Namely, build one aql time spine statement for each pre-computed time aggregation (day, week, month, etc) with the appropriate dimensions and metrics added, then union them all together. This should all be possible with jinja.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants