Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1990241: Reserved words break DataFrameWriter.save_as_table mode "append" column_order "name" #3168

Open
Tim-Kracht opened this issue Mar 17, 2025 · 0 comments
Assignees
Labels
bug Something isn't working needs triage Initial RCA is required

Comments

@Tim-Kracht
Copy link

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

    Python 3.11.11 (main, Dec 11 2024, 16:19:35) [GCC 11.2.0]

  2. What operating system and processor architecture are you using?

    Linux-4.4.0-aarch64-with-glibc2.34

  3. What are the component versions in the environment (pip freeze)?

numpy == 2.2.2
pyarrow == 19.0.0
pandas == 2.2.3
setuptools == 72.1.0
snowflake-connector-python == 0.34.0
python-dateutil == 2.8.3+snowflake1
pytz == 2024.1
numexpr == 2.10.1
PyYAML == 6.0.2
Bottleneck == 1.4.2
tzdata == 2023.3
typing_extensions == 4.12.2
six == 1.16.0
tzlocal == 5.2
cloudpickle == 3.0.0
protobuf == 4.25.3
snowflake-snowpark-python == 1.28.0
autocommand == 2.2.2
backports.tarfile == 1.2.0
importlib_metadata == 8.0.0
importlib_resources == 6.4.0
inflect == 7.3.1
jaraco.context == 5.3.0
jaraco.functools == 4.0.1
jaraco.text == 3.12.1
more-itertools == 10.3.0
ordered-set == 4.1.0
packaging == 24.1
platformdirs == 4.2.2
tomli == 2.0.1
typeguard == 4.3.0
typing_extensions == 4.12.2
wheel == 0.43.0
zipp == 3.19.2

  1. What did you do?

Ran the following in a Snowflake Python Worksheet.

import platform
import subprocess
import importlib.metadata

import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import col


def main(session: snowpark.Session):
    """
    version = subprocess.run(["python", "--version", "--version"], capture_output=True, text=True)
    print(version.stdout)

    print(platform.platform())
    
    for pkg in importlib.metadata.distributions():
        print(pkg.metadata["Name"], "==", pkg.version)
    """
    
    table_1 = "TABLE_1"
    table_2 = "TABLE_2"

    print(f"making sure {table_1} does not exist...")
    session.sql(f"DROP TABLE IF EXISTS {table_1}").collect()
    
    print(f"making sure {table_2} does not exist...")
    session.sql(f"DROP TABLE IF EXISTS {table_2}").collect()

    df_1 = session.create_dataframe([{"col": 1, "Col 2": 2, "AS": 3}])

    # creates table_1 with column named `AS`
    try:
        df_1.write.mode("overwrite").save_as_table(table_1, column_order="name")
        print(df_1.columns)
        print(f"table {table_1} created")
    except Exception as ex:
        print(f"failed to create the table {table_1}: {ex}")
        print(ex.query)
        raise


    # unable to append into existing table since column `AS` is not quoted as a reserved word in the INSERT INTO column list
    try:
        df_1.write.mode("append").save_as_table(table_1, column_order="name")
        print(f"table {table_1} appended")
    except Exception as ex:
        print(f"failed to append to the table {table_1}: {ex}")
        print(ex.query)


    # creates the table correctly
    # unable to insert into newly created table since column `AS` is not quoted as a reserved word in the INSERT INTO column list
    try:
        df_2 = session.table(table_1)
        print(df_2.columns)
        df_2.write.mode("append").save_as_table(table_2, column_order="name")
        print(f"table {table_2} created")
    except Exception as ex:
        print(f"failed to create to the table {table_2}: {ex}")
        print(ex.query)

    df_3 = session.sql(f"describe table {table_2}")

    print(f"{table_2} was actually created and looks like this:")
    df_3.show()

    return df_3
  1. What did you expect to see?

    I expected the DataFrame to be appended to the tables. Instead I received an error "(1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions."

making sure TABLE_1 does not exist...
making sure TABLE_2 does not exist...
['COL', '"Col 2"', 'AS']
table TABLE_1 created
failed to append to the table TABLE_1: (1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions.
INSERT  INTO TABLE_1(COL, "Col 2", AS)  SELECT "COL", "Col 2", "AS" FROM ( SELECT $1 AS "COL", $2 AS "Col 2", $3 AS "AS" FROM  VALUES (1 :: INT, 2 :: INT, 3 :: INT))
['COL', '"Col 2"', 'AS']
failed to create to the table TABLE_2: (1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions.
INSERT  INTO TABLE_2(COL, "Col 2", AS)  SELECT  *  FROM TABLE_1
TABLE_2 was actually created and looks like this:
---------------------------------------------------------------------------------------------------------------------------------------------------------------
|"name"  |"type"        |"kind"  |"null?"  |"default"  |"primary key"  |"unique key"  |"check"  |"expression"  |"comment"  |"policy name"  |"privacy domain"  |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
|COL     |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
|Col 2   |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
|AS      |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Closing a session in a stored procedure is a no-op.
  1. Can you set logging to DEBUG and collect the logs?
2025-03-17 15:33:39,529 - Dummy-1 cursor.py:603 - _execute_helper() - DEBUG - is_file_transfer: False
2025-03-17 15:33:39,529 - Dummy-1 cursor.py:643 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,665 - Dummy-1 cursor.py:665 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,665 - Dummy-1 cursor.py:1042 - execute() - DEBUG - sfqid: 01bb13c9-0306-17f4-0001-9ceefb6b2c0e
2025-03-17 15:33:39,665 - Dummy-1 cursor.py:1048 - execute() - INFO - query execution done
2025-03-17 15:33:39,665 - Dummy-1 cursor.py:1086 - execute() - DEBUG - {'data': {'internalError': False, 'unredactedFromSecureObject': False, 'errorCode': '001003', 'age': 0, 'sqlState': '42000', 'queryId': '01bb13c9-0306-17f4-0001-9ceefb6b2c0e', 'line': -1, 'pos': -1, 'type': 'COMPILATION'}, 'code': '001003', 'message': "SQL compilation error:\nsyntax error line 1 at position 35 unexpected 'AS'.", 'success': False, 'headers': None}
2025-03-17 15:33:39,666 - Dummy-1 errors.py:145 - exception_telemetry() - DEBUG - Sending exception telemetry failed
failed to create to the table TABLE_2: (1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions.
INSERT  INTO TABLE_2(COL, "Col 2", AS)  SELECT  *  FROM TABLE_1
TABLE_2 was actually created and looks like this:
2025-03-17 15:33:39,667 - Dummy-1 cursor.py:962 - execute() - DEBUG - executing SQL/command
2025-03-17 15:33:39,667 - Dummy-1 cursor.py:1035 - execute() - INFO - query: [describe table TABLE_2]
2025-03-17 15:33:39,667 - Dummy-1 connection.py:540 - _next_sequence_counter() - DEBUG - sequence counter: 13
2025-03-17 15:33:39,667 - Dummy-1 cursor.py:591 - _execute_helper() - DEBUG - Request id: f18d7363-fe29-4bea-92dc-c9fd726c7a17
2025-03-17 15:33:39,667 - Dummy-1 cursor.py:594 - _execute_helper() - DEBUG - running query [describe table TABLE_2]
2025-03-17 15:33:39,667 - Dummy-1 cursor.py:603 - _execute_helper() - DEBUG - is_file_transfer: False
2025-03-17 15:33:39,668 - Dummy-1 cursor.py:643 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,732 - Dummy-1 cursor.py:665 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,733 - Dummy-1 cursor.py:1042 - execute() - DEBUG - sfqid: 01bb13c9-0306-17f4-0001-9ceefb6b2c26
2025-03-17 15:33:39,733 - Dummy-1 cursor.py:1048 - execute() - INFO - query execution done
2025-03-17 15:33:39,733 - Dummy-1 cursor.py:1053 - execute() - DEBUG - SUCCESS
2025-03-17 15:33:39,733 - Dummy-1 cursor.py:1158 - _init_result_and_meta() - DEBUG - Query result format: json
2025-03-17 15:33:39,733 - Dummy-1 result_batch.py:454 - _parse() - DEBUG - parsing for result batch id: 3
2025-03-17 15:33:39,733 - Dummy-1 server_connection.py:508 - run_query() - DEBUG - Execute query [queryID: 01bb13c9-0306-17f4-0001-9ceefb6b2c26] describe table TABLE_2
2025-03-17 15:33:39,733 - Dummy-1 result_set.py:65 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2025-03-17 15:33:39,733 - Dummy-1 result_set.py:128 - _report_metrics() - DEBUG - Reporting metrics not supported
---------------------------------------------------------------------------------------------------------------------------------------------------------------
|"name"  |"type"        |"kind"  |"null?"  |"default"  |"primary key"  |"unique key"  |"check"  |"expression"  |"comment"  |"policy name"  |"privacy domain"  |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
|COL     |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
|Col 2   |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
|AS      |NUMBER(38,0)  |COLUMN  |N        |NULL       |N              |N             |NULL     |NULL          |NULL       |NULL           |NULL              |
---------------------------------------------------------------------------------------------------------------------------------------------------------------
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:962 - execute() - DEBUG - executing SQL/command
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:1035 - execute() - INFO - query: [describe table TABLE_2]
2025-03-17 15:33:39,734 - Dummy-1 connection.py:540 - _next_sequence_counter() - DEBUG - sequence counter: 14
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:591 - _execute_helper() - DEBUG - Request id: a2bae63d-9559-43a9-980e-f3d28b2f3eba
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:594 - _execute_helper() - DEBUG - running query [describe table TABLE_2]
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:603 - _execute_helper() - DEBUG - is_file_transfer: False
2025-03-17 15:33:39,734 - Dummy-1 cursor.py:643 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,791 - Dummy-1 cursor.py:665 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2025-03-17 15:33:39,791 - Dummy-1 cursor.py:1042 - execute() - DEBUG - sfqid: 01bb13c9-0306-17f4-0001-9ceefb6b2c2a
2025-03-17 15:33:39,791 - Dummy-1 cursor.py:1048 - execute() - INFO - query execution done
2025-03-17 15:33:39,791 - Dummy-1 cursor.py:1053 - execute() - DEBUG - SUCCESS
2025-03-17 15:33:39,791 - Dummy-1 cursor.py:1158 - _init_result_and_meta() - DEBUG - Query result format: json
2025-03-17 15:33:39,791 - Dummy-1 result_batch.py:454 - _parse() - DEBUG - parsing for result batch id: 3
2025-03-17 15:33:39,791 - Dummy-1 server_connection.py:508 - run_query() - DEBUG - Execute query [queryID: 01bb13c9-0306-17f4-0001-9ceefb6b2c2a] describe table TABLE_2
2025-03-17 15:33:39,800 - MainThread session.py:778 - close() - WARNING - Closing a session in a stored procedure is a no-op.
@Tim-Kracht Tim-Kracht added bug Something isn't working needs triage Initial RCA is required labels Mar 17, 2025
@github-actions github-actions bot changed the title Reserved words break DataFrameWriter.save_as_table mode "append" column_order "name" SNOW-1990241: Reserved words break DataFrameWriter.save_as_table mode "append" column_order "name" Mar 17, 2025
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Mar 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage Initial RCA is required
Projects
None yet
Development

No branches or pull requests

2 participants