improve flattening performance for dbt_columns #681

Merged
haritamar merged 3 commits into master from ele-2730-only-flatten-columns-with-description
Mar 26, 2024

ofek1weiss

No description provided.

linear bot commented Mar 19, 2024


👋 @ofek1weiss
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@@ -35,8 +35,8 @@

{% set flattened_columns = [] %}
{% for column_node in column_nodes.values() %}
{% set flat_column = elementary.flatten_column(table_node, column_node) %}
- {% if not elementary.get_config_var('upload_only_columns_with_descriptions') or flat_column['description'] %}
+ {% if not elementary.get_config_var('upload_only_columns_with_descriptions') or column_node.get('description') %}

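Changed the description filter to run before flattening, to improve performance: when upload_only_columns_with_descriptions is set, columns without a description are now skipped before flatten_column is called for them.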
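
For readers less familiar with the macro, the same filter-before-flatten pattern can be sketched in plain Python. This is only an illustration of the idea, not the package's code: `collect_flattened_columns`, `fake_flatten`, and the sample nodes are hypothetical names, and the flatten step is a stand-in that records which columns it was called for.

```python
from typing import Any, Callable, Dict, List

Node = Dict[str, Any]


def collect_flattened_columns(
    table_node: Node,
    column_nodes: Dict[str, Node],
    flatten: Callable[[Node, Node], Node],
    only_with_descriptions: bool,
) -> List[Node]:
    """Flatten column nodes, checking the description on the raw node first."""
    flattened: List[Node] = []
    for column_node in column_nodes.values():
        # Filter on the unflattened node, so dropped columns are never flattened.
        if only_with_descriptions and not column_node.get("description"):
            continue
        flattened.append(flatten(table_node, column_node))
    return flattened


# Hypothetical stand-in for the flatten step; it records every call it receives.
flatten_calls: List[str] = []


def fake_flatten(table_node: Node, column_node: Node) -> Node:
    flatten_calls.append(column_node["name"])
    return {
        "table": table_node["name"],
        "name": column_node["name"],
        "description": column_node.get("description"),
    }


# Made-up sample data: one described column, one without a description key,
# and one with an empty description.
table = {"name": "orders"}
columns = {
    "id": {"name": "id", "description": "Primary key"},
    "tmp": {"name": "tmp"},
    "note": {"name": "note", "description": ""},
}

kept = collect_flattened_columns(
    table, columns, fake_flatten, only_with_descriptions=True
)
print([column["name"] for column in kept], flatten_calls)
```

With the flag enabled, the stand-in flatten function is called only for the column that has a non-empty description, which is where the saving would come from on models with many undocumented columns. With the flag disabled, every column is still flattened, so the change should not affect that path.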

@haritamar haritamar merged commit a97e794 into master Mar 26, 2024
9 of 11 checks passed
@haritamar haritamar deleted the ele-2730-only-flatten-columns-with-description branch March 26, 2024 13:26
konstantin-baidin-y42 added a commit to goes-funky/dbt-data-reliability that referenced this pull request Apr 9, 2024
* support count_true and count_false for boolean columns in BigQuery

* support count_true and count_false for boolean columns in BigQuery

* fix: change generate profile args macro name for Athena

* Add unique_combination_of_columns to common_tests_configs_mapping

* missing comma

* fix drop_failure_percent_threshold failing a non anomalous test

* Support all anomaly vars on all configuration levels

* ELE-2470 temp tables are not being deleted

* Add empty line at the end of a file

* fix typo

* Removed default detection/training_period

* Make sure we delete temp tables last

* removed unused import

* Not collecting metrics by default.

* release 0.14.0

* fix bug when no temp tables exist

* Update macros/edr/tests/test_utils/clean_elementary_test_tables.sql

Co-authored-by: IDoneShaveIt <[email protected]>

* 1. add tags to all elementary monitors 2. only run tests if elementary is enabled

* rename tag

* Create clean_dbt_columns_temp_tables macro

* Add empty line at the end of a file

* clean logs

* add arg chunk_size for all insert_rows() (elementary-data#669)

* release 0.14.1

* override primary_test_model_id (elementary-data#671)

* added ignore_small_changes to freshness and event_freshness

* improvement: bigquery specific for query_table_metrics (elementary-data#674)

* improvement: bigquery specific for query_table_metrics

Using information schema to get row count is much more performant than doing a full table scan

* use TABLE_STORAGE and add database & schema

* add empty case

* add set

* Add index on created_at test_result_rows and remove backfill post hook

* Ele 2606 package version with caching and extra logs (elementary-data#673)

* artifacts: use cache also for model post-hook

* add performance logs to artifacts logic

* duration monitoring - bugfix - handle the case the duration stack is not initialized

* Change the aggregate of failed_row_count_calc to count(*)

* Readme updates (elementary-data#684)

* changes to readme

* changes to readme

* changes

* changes

* image url

* image url

* changes

* formatting

* formatting

* changes

* link

* pre commit

* improve flattening performance for dbt_columns (elementary-data#681)

* improve flattening performance for dbt_columns

* removed unused const

* black

* Add get_requires_permissions and validate_required_permissions macros

* Improved messages

* Fixed default__get_required_permissions + add target.database to get_relevant_databases

---------

Co-authored-by: suelai <[email protected]>
Co-authored-by: Roman Korsun <[email protected]>
Co-authored-by: Yasuhisa Yoshida <[email protected]>
Co-authored-by: Ofek Weiss <[email protected]>
Co-authored-by: Ofek Weiss <[email protected]>
Co-authored-by: IDoneShaveIt <[email protected]>
Co-authored-by: IDoneShaveIt <[email protected]>
Co-authored-by: Elon Gliksberg <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Ella Katz <[email protected]>
Co-authored-by: J.C <[email protected]>
Co-authored-by: Noy Arie <[email protected]>
Co-authored-by: Chris Dong <[email protected]>
Co-authored-by: noakurman <[email protected]>
Co-authored-by: Itamar Hartstein <[email protected]>
Co-authored-by: Maayan Salom <[email protected]>