Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when using incremental models on Synapse REPLICATE table distribution #107

Open
ThomasCarterSuzy opened this issue Aug 16, 2022 · 5 comments

Comments

@ThomasCarterSuzy
Copy link

When running incremental models on Synpase Replicated tables an error is thrown:
[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Option 'REPLICATE User Temp Table' is not supported in this version of SQL Server. (104458) (SQLExecDirectW)

It appears the fix is to override the sqlserver__make_temp_relation macro and remove the # ~ (temp table identifier) from {% set tmp_identifier = '#' ~ base_relation.identifier ~ suffix %}

@AdamG30
Copy link

AdamG30 commented Sep 8, 2022

Hi. First, I want to say thanks to all the contributors here for making it possible to use dbt with Azure Synapse. I've been running through a series of tests, and bumped into this error. Here's a bit more detail on the problem for clarity.

The test case was - Distribution of the model was changed to REPLICATE

Model

{{ config(materialized='incremental'
, index='clustered index(OLDACNUM)'
, dist='REPLICATE'
, on_schema_change='sync_all_columns' )    }}

SELECT TOP (1000) 
      [RecordCreateRunId]
      ,[DEBTORNUM]
      ,[OLDACNUM]
      ,[CREDCOLACTION]
      ,[LASTSTMTDATE]
      ,[CUST_TYPE]
      ,[LETTERDATE]
      ,[CHARGE_CODE]
      ,[NUM_REMINDERS]
      ,[PAYMENT_CODE]
      ,[CYCLE_GROUP]
      ,ODS_START_DATE	  
FROM [DBO].[AR_DEBTOR]
{% if is_incremental() %}
  -- this filter will only be applied on an incremental run
  WHERE ODS_START_DATE > (select max(ODS_START_DATE) from {{ this }})
{% endif %}

Incremental Run

  • The model was run as incremental
  • This caused the error "'REPLICATE User Temp Table' is not supported in this version of SQL Server"
  • Offending code:
  CREATE TABLE "DBT"."#STG_DEBTORS__dbt_tmp"
    WITH(
      DISTRIBUTION = REPLICATE,
      clustered index(OLDACNUM)
      )
    AS (SELECT * FROM DBT.STG_DEBTORS__dbt_tmp_temp_view)
  • Temporary tables don't support REPLICATE distribution.
  • The recommended fix above means the generated code will never use temporary tables in any circumstance (I think). This may be problematic. In the dedicated SQL pool resource, temporary tables offer a performance benefit because their results are written to local rather than remote storage. A better approach would be only not to use temporary tables when DISTRIBUTION=REPLICATE.

Versions - dbt =1.1.2, sqlserver=1.1.1, synapse = 1.1.0

@AdamG30
Copy link

AdamG30 commented Nov 29, 2022

Hi

I've been able to fix this issue by overriding the sqlserver__make_temp_relation with a synapse version of this macro that does not use a temp table when the distribution type is REPLICATE.

Create a new macro in the macros folder of your project and use this code.

{# Fix for Temporary tables don't support REPLICATE distribution. Don't use temp table if Dist=REPLICATE #}
{% macro synapse__make_temp_relation(base_relation, suffix) %}
    {% if config.get('dist')|upper == 'REPLICATE' -%}
        {% set tmp_identifier = base_relation.identifier ~ suffix %}
    {%- else -%}
        {% set tmp_identifier = '#' ~  base_relation.identifier ~ suffix %}                        
   {% endif %}
    {% set tmp_relation = base_relation.incorporate(path={"identifier": tmp_identifier}) -%}
    {% do return(tmp_relation) %}
{% endmacro %}

To permanently fix this in this synapse adapter, this logic should be added to the relation.sql file.

@prdpsvs
Copy link
Contributor

prdpsvs commented Jan 20, 2023

PR #138

You can now create seed tables with different distribution and index strategy by providing required confiuration in dbt_project.yml file. The default choice is REPLICATE disttribution and HEAP (no indexing). If you want to override this configuration, the following sample should help.

seeds:
  jaffle_shop:
    index: HEAP
    dist: ROUND_ROBIN
    raw_customers:
      index: HEAP
      dist: REPLICATE
    raw_payments:
      dist: HASH(payment_method)
      index: CLUSTERED INDEX(id,order_id)  

Create a new context "seeds:" at the root followed by project name and seed name. In this case the project name is jaffle_shop and seeds are raw_customers and raw_payments. Provide index and distribution values using index and dist keys. Use replicate, round_robin, hash({column name}) as a value. Example: dist: replicate. The raw_customers seed table will be replicated a table. For hash distribution, the user need to provide the vaule HASH(payment_method). Example: dist: hash(payment_method)

To specific index, index as a key and CLUSTERED INDEX({Column1, Column2}), HEAP, CLUSTERED COLUMNSTORE INDEX as a value. Example: index: HEAP. The raw_customers seed table will use heap index strategy. For clustered index, the user need to provide one or more columns to create clustered index on. Example: index: CLUSTERED INDEX(id,order_id). The default value of index and distribution can also be set for all seeds under project name.

@jgilfillan
Copy link

Hi

I've been able to fix this issue by overriding the sqlserver__make_temp_relation with a synapse version of this macro that does not use a temp table when the distribution type is REPLICATE.

Create a new macro in the macros folder of your project and use this code.

{# Fix for Temporary tables don't support REPLICATE distribution. Don't use temp table if Dist=REPLICATE #}
{% macro synapse__make_temp_relation(base_relation, suffix) %}
    {% if config.get('dist')|upper == 'REPLICATE' -%}
        {% set tmp_identifier = base_relation.identifier ~ suffix %}
    {%- else -%}
        {% set tmp_identifier = '#' ~  base_relation.identifier ~ suffix %}                        
   {% endif %}
    {% set tmp_relation = base_relation.incorporate(path={"identifier": tmp_identifier}) -%}
    {% do return(tmp_relation) %}
{% endmacro %}

To permanently fix this in this synapse adapter, this logic should be added to the relation.sql file.

Is this going to be incorporated into the adapter, and if so any thoughts on when?

@alittlesliceoftom
Copy link

Hey,

I just tried this on dbt synapse 1.8 and I think the issue is closed as it no longer occurs on latest version (I deleted the local macro I've been running for ages for this and it still worked).

However this is because the intermediate relation is now a view which does cause it's own problems (specifically that a table can be missing for a while as there's now a CTAS from a view which can be slow.).

I will leave the intermediate macro in our instance as I suspect it may still come in handy, but techincally I think this can be closed.

{{
  config(
    tags = ['nobuild','NoCI','demo'],
    materialized = 'table',
    dist = 'REPLICATE'
    )
}}

{# In our models we use snake_case for models, CAPS for keywords, and PascalCase for columns. #}

SELECT 'Hello, World!' AS MessageToWorld

Example Table tested ^^

Behaviour:

 
  
  
    if object_id ('"testing_tom_oneill"."hello_world_temp_view"','V') is not null
    begin
      drop view "testing_tom_oneill"."hello_world_temp_view"
      end
    
  


   
  
    
    if object_id ('"testing_tom_oneill"."hello_world"','U') is not null
    
      begin
      drop table "testing_tom_oneill"."hello_world"
      end
    
  


   EXEC('create view [testing_tom_oneill].[hello_world_temp_view] as
    



SELECT ''Hello, World!'' AS MessageToWorld



    


    ');

  CREATE TABLE "testing_tom_oneill"."hello_world"
    WITH(
      DISTRIBUTION = REPLICATE,
      CLUSTERED COLUMNSTORE INDEX
      
      )
    AS (SELECT * FROM [testing_tom_oneill].[hello_world_temp_view])

   
  
  
    if object_id ('"testing_tom_oneill"."hello_world_temp_view"','V') is not null
    begin
      drop view "testing_tom_oneill"."hello_world_temp_view"
      end
    
  

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants