
teradata DDL support #1260

Open · wants to merge 668 commits into base: main
Conversation

sriram251-code (Contributor)

This PR adds support for Teradata DDL.

aman-db and others added 30 commits August 23, 2024 12:39
Changed the String datatype to varchar for the validation dataset;
varchar is preferable because the datatype exists across multiple databases.
closes #782

---------

Co-authored-by: SundarShankar89 <[email protected]>
Co-authored-by: Serge Smertin <[email protected]>
…rs` (#771)

closes #165 

- In the Snowflake code, the setting option `!set exit_on_error=true`
caused remorph to return an error. Remorph now treats this setting as a
command and comments it out: instead of producing an error, the output
prepends `--` to the command.
Example:
Snowflake input SQL: `!set exit_on_error=true`
Remorph output: `-- !set exit_on_error=true`
 
- In Snowflake, passing a parameter such as `select emp_id from abc.emp
where emp_id = &ids;` caused remorph to raise an error on the unexpected
`&` symbol. Remorph now converts `&` to its Databricks equivalent `$`,
which denotes parameters, resolving this issue.
Example:
Snowflake input SQL: `select emp_id from abc.emp where emp_id = &ids;`
Remorph output: `select emp_id from abc.emp where emp_id = $ids;`

---------

Co-authored-by: Vijay Pavan Nissankararao <[email protected]>
* Added Translation Support for `!` as `commands` and `&` for
`Parameters`
([#771](#771)). This
commit adds translation support for using "!" as commands and "&" as
parameters in Snowflake code within the remorph tool, enhancing
compatibility with Snowflake syntax. The "!set exit_on_error=true"
command, which previously caused an error, is now treated as a comment
and prepended with `--` in the output. The "&" symbol, previously
unrecognized, is converted to its Databricks equivalent "$", which
represents parameters, allowing for proper handling of Snowflake SQL
code containing "!" commands and "&" parameters. These changes improve
the compatibility and robustness of remorph with Snowflake code and
enable more efficient processing of Snowflake SQL statements.
Additionally, the commit introduces a new test suite for Snowflake
commands, enhancing code coverage and ensuring proper functionality of
the transpiler.
* Added `LET` and `DECLARE` statements parsing in Snowflake PL/SQL
procedures
([#548](#548)). This
commit introduces support for parsing `DECLARE` and `LET` statements in
Snowflake PL/SQL procedures, enabling variable declaration and
assignment. It adds new grammar rules, refactors code using
ScalaSubquery, and implements IR visitors for `DECLARE` and `LET`
statements with Variable Assignment and ResultSet Assignment. The
`RETURN` statement and parameterized expressions are also now supported.
Note that `CURSOR` is not yet covered. These changes allow for improved
processing and handling of Snowflake PL/SQL code, enhancing the overall
functionality of the library.
* Added logger statements in get_schema function
([#756](#756)). In this
release, enhanced logging has been implemented in the Metadata (Schema)
fetch functions, specifically in the `get_schema` function and other
metadata fetch functions within Oracle, SnowflakeDataSource modules. The
changes include logger statements that log the schema query, start time,
and end time, providing better visibility into the performance and
behavior of these functions during debugging or monitoring. The logging
functionality is implemented using the built-in `logging` module and
timestamps are obtained using the `datetime` module. In the
SnowflakeDataSource class, RuntimeError or PySparkException will be
raised if the user's current role lacks the necessary privileges to
access the specified Information Schema object. The INFORMATION_SCHEMA
table in Snowflake is used to fetch the schema, with the query modified
to handle unquoted and quoted identifiers and the ordinal position of
columns. The `get_schema_query` function has also been updated for
better formatting for the SQL query used to fetch schema information.
The schema fetching method remains unchanged, but these enhancements
provide more detailed logging for debugging and monitoring purposes.
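
The logging pattern described above looks roughly like this minimal
sketch (the `data_source` object and its methods are placeholders, not
remorph's actual classes):

```python
import datetime
import logging

logger = logging.getLogger(__name__)

def get_schema(data_source, catalog: str, schema: str, table: str):
    schema_query = data_source.get_schema_query(catalog, schema, table)
    # Log the query itself plus start/end timestamps so slow metadata
    # fetches are visible when debugging or monitoring.
    logger.info("Fetching schema using query:\n%s", schema_query)
    start = datetime.datetime.now()
    result = data_source.execute(schema_query)
    end = datetime.datetime.now()
    logger.info("Schema fetched in %s seconds", (end - start).total_seconds())
    return result
```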
* Aggregates Reconcile CLI Implementation
([#770](#770)). The
`Aggregates Reconcile CLI Implementation` commit introduces a new
command-line interface (CLI) for reconcile jobs, specifically for
aggregated data. This change adds a new parameter, "operation_name", to
the run method in the runner.py file, which determines the type of
reconcile operation to perform. A new function,
_trigger_reconcile_aggregates, has been implemented to reconcile
aggregate data based on provided configurations and log the
reconciliation process outcome. Additionally, new methods for defining
job parameters and settings, such as `max_concurrent_runs` and
"parameters", have been included. This CLI implementation enhances the
customizability and control of the reconciliation process for users,
allowing them to focus on specific use cases and data aggregations. The
changes also include new test cases in test_runner.py to ensure the
proper behavior of the ReconcileRunner class when the
`aggregates-reconcile` operation_name is set.
* Aggregates Reconcile Updates
([#784](#784)). This
commit introduces significant updates to the `Table Deployment` feature,
enabling it to support `Aggregate Tables` deployment and modifying the
persistence logic for tables. Notable changes include the addition of a
new `aggregates` attribute to the `Table` class in the configuration,
which allows users to specify aggregate functions and optionally group
by specific columns. The reconcile process now captures mismatch data,
missing rows in the source, and missing rows in the target in the recon
metrics tables. Furthermore, the aggregates reconcile process supports
various aggregate functions like min, max, count, sum, avg, median,
mode, percentile, stddev, and variance. The documentation has been
updated to reflect these improvements. The commit also removes the
`percentile` function from the reconciliation configuration and modifies
the `aggregate_metrics` SQL query, enhancing the flexibility of the
`Table Deployment` feature for `Aggregate Tables`. Users should note
that the `percentile` function is no longer a valid option and should
update their code accordingly.
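
As an illustration of the new `aggregates` attribute, a table entry in
the reconcile JSON config might look like the following sketch (field
names follow the documented samples but are assumptions, not copied from
this PR):

```json
{
  "source_name": "orders",
  "target_name": "orders",
  "join_columns": ["order_id"],
  "aggregates": [
    {"type": "MIN", "agg_columns": ["discount"], "group_by_columns": ["creation_date"]},
    {"type": "COUNT", "agg_columns": ["order_id"]}
  ]
}
```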
* Aggregates Reconcile documentation
([#779](#779)). In this
commit, the Aggregates Reconcile utility has been enhanced with new
documentation and visualizations for improved understanding and
usability. The utility now includes a flow diagram, visualization, and
README file illustrating how it compares specific aggregate metrics
between source and target data residing on Databricks. A new
configuration sample is added, showcasing the reconciliation of two
tables using various aggregate functions, join columns, transformations,
filters, and JDBC ReaderOptions configurations. The commit also
introduces two Mermaid flowchart diagrams, depicting the reconciliation
process with and without a `group by` operation. Additionally, new flow
diagram visualizations in PNG and GIF formats have been added, aiding in
understanding the process flow of the Aggregates Reconcile feature. The
reconcile configuration samples in the documentation have also been
updated with a spelling correction for clarity.
* Bump sqlglot from 25.6.1 to 25.8.1
([#749](#749)). In this
version update, the `sqlglot` dependency has been bumped from 25.6.1 to
25.8.1, bringing several bug fixes and new features related to various
SQL dialects such as BigQuery, DuckDB, and T-SQL. Notable changes
include support for BYTEINT in BigQuery, improved parsing and
transpilation of StrToDate in ClickHouse, and support for SUMMARIZE in
DuckDB. Additionally, there are bug fixes for DuckDB and T-SQL,
including wrapping left IN clause json extract arrow operand and
handling JSON_QUERY with a single argument. The update also includes
refactors and changes to the ANNOTATORS and PARSER modules to improve
dialect-aware annotation and consistency. This pull request is
compatible with `sqlglot` version 25.6.1 and below and includes a
detailed list of commits and their corresponding changes.
* Generate window functions
([#772](#772)). In this
release, we have added support for generating SQL `WINDOW` and
`SortOrder` expressions in the `ExpressionGenerator` class. This
enhancement includes the ability to generate a `WINDOW` expression with
a window function, partitioning and ordering clauses, and an optional
window frame, using the `window` and `frameBoundary` methods. The
`sortOrder` method now generates the SQL `SortOrder` expression, which
includes the expression to sort by, sort direction, and null ordering.
Additional methods `orNull` and `doubleQuote` return a string
representing a NULL value and a string enclosed in double quotes,
respectively. These changes provide increased flexibility for handling
more complex expressions in SQL. Additionally, new test cases have been
added to the `ExpressionGeneratorTest` to ensure the correct generation
of SQL window functions, specifically the `ROW_NUMBER()` function with
various partitioning, ordering, and framing specifications. These
updates improve the robustness and functionality of the
`ExpressionGenerator` class for generating SQL window functions.
* Implement TSQL specific function call mapper
([#765](#765)). This
commit introduces several new features to enhance compatibility between
TSQL and Databricks SQL. A new method, `interval`, has been added to
generate a Databricks SQL compatible string for intervals in a TSQL
expression. The `expression` method has been updated to handle certain
functions directly, improving translation efficiency. Specifically, the
DATEADD function is now translated to Databricks SQL's DATE_ADD,
ADD_MONTHS, and xxx + INTERVAL n {days|months|etc} constructs. The
changes also include a new sealed trait `KnownIntervalType`, a new case
class `KnownInterval`, and a new class `TSqlCallMapper` for mapping TSQL
functions to Databricks SQL equivalents. Furthermore, the commit
introduces new tests for TSQL specific function call mappers, ensuring
proper translation of TSQL functions to Databricks SQL compatible
constructs. These improvements collectively facilitate better
integration and compatibility between TSQL and Databricks SQL.
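
A hedged sketch of the DATEADD dispatch described above, in Python for
illustration (the real mapper is the Scala `TSqlCallMapper`, and the
unit spellings here are assumptions):

```python
def translate_dateadd(unit: str, amount: str, expr: str) -> str:
    unit = unit.lower()
    if unit in ("day", "dd", "d"):
        # Day arithmetic maps directly onto Databricks DATE_ADD.
        return f"DATE_ADD({expr}, {amount})"
    if unit in ("month", "mm", "m"):
        # Month arithmetic maps onto ADD_MONTHS.
        return f"ADD_MONTHS({expr}, {amount})"
    # Everything else becomes `xxx + INTERVAL n <unit>` arithmetic.
    return f"{expr} + INTERVAL {amount} {unit.upper()}"

print(translate_dateadd("hour", "2", "col1"))  # col1 + INTERVAL 2 HOUR
```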
* Improve TSQL and Snowflake parser and lexer
([#757](#757)). In this
release, the open-source library's Snowflake and TSQL lexers and parsers
have been improved for better functionality and robustness. For the
Snowflake lexer, unnecessary escape sequence processing has been
removed, and various options have been corrected to be simple strings.
The lexer now accepts a question mark as a placeholder for prepared
statements in Snowflake statements. The TSQL lexer has undergone minor
improvements, such as aligning the catch-all rule name with Snowflake.
The Snowflake parser now accepts the question mark as a `PARAM`
placeholder and simplifies the `typeFileformat` rule to accept a single
`STRING` token. Additionally, several new keywords have been added to
the TSQL lexer, improving consistency and clarity. These changes aim to
simplify lexer and parser rules, enhance option handling and
placeholders, and ensure consistency between Snowflake and TSQL.
* Patch Information Schema Predicate Pushdown for Snowflake
([#764](#764)). In this
release, we have implemented Information Schema Predicate Pushdown for
Snowflake, resolving issue
[#7](#7).
* TSQL: Implement correct grammar for CREATE TABLE in all forms
([#796](#796)). In this
release, the TSqlLexer's CREATE TABLE statement grammar has been updated
and expanded to support new keywords and improve accuracy. The newly
added keywords 'EDGE', 'FILETABLE', 'NODE', and `NODES` enable correct
parsing of CREATE TABLE statements using graph nodes and FILETABLE
functionality. Existing keywords such as 'DROP_EXISTING', 'DYNAMIC',
'FILENAME', and `FILTER` have been refined for better precision.
Furthermore, the introduction of the `tableIndices` rule standardizes
the order of columns in the table. These enhancements improve the T-SQL
parser's robustness and consistency, benefiting users in creating and
managing tables in their databases.
* TSQL: Implement grammar for CREATE DATABASE and CREATE DATABASE SCOPED
OPTION ([#788](#788)).
In this release, we have implemented the TSQL grammar for `CREATE
DATABASE` and `CREATE DATABASE SCOPED OPTION` statements, addressing
inconsistencies with TSQL documentation. The implementation was
initially intended to cover the entire process from grammar to code
generation. However, to simplify other DDL statements, the work was
split into separate grammar-only pull requests. The diff introduces new
methods such as `createDatabaseScopedCredential`,
`createDatabaseOption`, and `databaseFilestreamOption`, while modifying
the existing `createDatabase` method. The
`createDatabaseScopedCredential` method handles the creation of a
database scoped credential, which was previously part of
`createDatabaseOption`. The `createDatabaseOption` method now focuses on
handling individual options, while `databaseFilestreamOption` deals with
filesystem specifications. Note that certain options, like
`DEFAULT_LANGUAGE`, `DEFAULT_FULLTEXT_LANGUAGE`, and more, have been
marked as TODO and will be addressed in future updates.
* TSQL: Improve transpilation coverage
([#766](#766)). In this
update, various enhancements have been made to improve the coverage of
TSQL transpilation and address bugs in code generation, particularly for
the `ExpressionGenerator` class in the
`com/databricks/labs/remorph/generators/sql` package, and the
`TSqlExpressionBuilder`, `TSqlFunctionBuilder`, `TSqlCallMapper`, and
`QueryRunner` classes. Changes include adding support for new cases,
modifying code generation behavior, improving test coverage, and
updating existing tests for better TSQL code generation. Specific
additions include new methods for handling bitwise operations,
converting CHECKSUM_AGG calls to a sequence of MD5 function calls, and
handling Fn instances. The `QueryRunner` class has been updated to
include both the actual and expected outputs in error messages for
better debugging purposes. Additionally, the test file for the `DATEADD`
function has been updated to ensure proper syntax and consistency. All
these modifications aim to improve the reliability, accuracy, and
compatibility of TSQL transpilation, ensuring better functionality and
coverage for the Remorph library's transformation capabilities.
* [chore] speedup build process by not running unit tests twice
([#842](#842)). In this
commit, the build process for the open-source library has been optimized
by removing the execution of unit tests during the build phase in the
Maven build process. A new plugin for the Apache Maven Surefire Plugin
has been added, with the group ID set to "org.apache.maven.plugins",
artifact ID set to "maven-surefire-plugin", and version set to "3.1.2".
The configuration for this plugin includes a `skipTests` attribute set
to "true", ensuring that tests are not run twice, thereby improving the
build process speed. The existing ScalaTest Maven plugin configuration
remains unchanged, allowing Scala tests to still be executed during the
test phase. Additionally, the Maven Compiler Plugin has been upgraded to
version 3.11.0, and the release parameter has been set to 8, ensuring
that the Java compiler used during the build process is compatible with
Java 8. The version numbers for several libraries, including os-lib,
mainargs, ujson, scalatest, and exec-maven-plugin, are now being defined
using properties, allowing Maven to manage and cache these libraries
more efficiently. These changes improve the build process's performance
and reliability without affecting the existing functionality.
* [internal] better errors for call mapper
([#816](#816)). In this
release, the `ExpressionGenerator` class in the
`com.databricks.labs.remorph.generators.sql` package has been updated to
handle exceptions during the conversion of input functions to Databricks
expressions. A try-catch block has been added to catch
`IndexOutOfBoundsException` and provide a more descriptive error
message, including the name of the problematic function and the error
message associated with the exception. A `TranspileException` with the
message `not implemented` is now thrown when encountering a function for
which a translation to Databricks expressions is not available. The
`IsTranspiledFromSnowflakeQueryRunner` class in the
`com.databricks.labs.remorph.coverage` package has also been updated to
include the name of the exception class in the error message for better
error identification when a non-fatal error occurs during parsing.
Additionally, the import statement for `Formatter` has been moved to
ensure alphabetical order. These changes improve error handling and
readability, thereby enhancing the overall user experience for
developers interacting with the codebase.
* [snowflake] map more functions to Databricks SQL
([#826](#826)). This
commit introduces new private methods `andPredicate` and `orPredicate`
to the ExpressionGenerator class in the
`com.databricks.labs.remorph.generators.sql` package, enhancing the
generation of SQL expressions for AND and OR logical operators, and
improving readability and correctness of complex logical expressions.
The LogicalPlanGenerator class in the `sql` package now supports more
flexibility in inserting data into a target relation, enabling users to
choose between overwriting the existing data or appending to it. The
`FROM_JSON` function in the CallMapper class has been updated to
accommodate an optional third argument, providing more flexibility in
handling JSON-related transformations. A new class,
`CastParseJsonToFromJson`, has been introduced to improve the
performance of data processing pipelines that involve parsing JSON data
in Snowflake using the `PARSE_JSON` function. Additional Snowflake SQL
functions have been mapped to Databricks SQL IR, enhancing compatibility
and functionality. The ExpressionGeneratorTest class now generates
predicates without parentheses, simplifying and improving readability.
Mappings for several Snowflake functions to Databricks SQL have been
added, enhancing compatibility with Databricks SQL. The `sqlFiles`
sequence in the `NestedFiles` class is now sorted before being mapped to
`AcceptanceTest` objects, ensuring consistent order for testing or
debugging purposes. A semicolon has been added to the end of a SQL query
in a test file for Snowflake DML insert functionality, ensuring proper
query termination.
* [sql] generate `INSERT INTO ...`
([#823](#823)). In this
release, we have made significant updates to our open-source library.
The ExpressionGenerator.scala file has been updated to convert boolean
values to lowercase instead of uppercase when generating INSERT INTO
statements, ensuring SQL code consistency. A new method `insert` has
been added to the `LogicalPlanGenerator` class to generate INSERT INTO
SQL statements based on the `InsertIntoTable` input. We have introduced
a new case class `InsertIntoTable` that extends `Modification` to
simplify the API for DML operations other than SELECT. The SQL
ExpressionGenerator now generates boolean literals in lowercase, and new
test cases have been added to ensure the correct generation of INSERT
and JOIN statements. Lastly, we have added support for generating INSERT
INTO statements in SQL for specified database tables, improving
cross-platform compatibility. These changes aim to enhance the library's
functionality and ease of use for software engineers.
* [sql] generate basic JSON access
([#835](#835)). In this
release, we have added several new features and improvements to our
open-source library. The `ExpressionGenerator` class now includes a new
method, `jsonAccess`, which generates SQL code to access a JSON object's
properties, handling different types of elements in the path. The
`TO_JSON` function in the `StructsToJson` class has been updated to
accept an optional expression as an argument, enhancing its flexibility.
The `SnowflakeCallMapper` class now includes a new method, `lift`, and a
new feature to generate basic JSON access, with corresponding updates to
test cases and methods. The SQL logical plan generator has been refined
to generate star projections with escaped identifiers, handling complex
table and database names. We have also added new methods and test cases
to the `SnowflakeCallMapper` class to convert Snowflake structs into
JSON strings and cast Snowflake values to specific data types. These
changes improve the library's ability to handle complex JSON data
structures, enhance functionality, and ensure the quality of generated
SQL code.
* [sql] generate basic `CREATE TABLE` definition
([#829](#829)). In this
release, the open-source library's SQL generation capabilities have been
enhanced with the addition of a new `createTable` method to the
`LogicalPlanGenerator` class. This method generates a `CREATE TABLE`
definition for a given `ir.CreateTableCommand`, producing a SQL
statement with a comma-separated list of column definitions. Each column
definition includes the column name, data type, and any applicable
constraints, generated using the `DataTypeGenerator.generateDataType`
method and the newly-introduced `constraint` method. Additionally, the
`project` method has been updated to incorporate a `FROM` clause in the
generated SQL statement when the input of the project node is not
`ir.NoTable()`. These improvements extend the functionality of the
`LogicalPlanGenerator` class, allowing it to generate `CREATE TABLE`
statements for input catalog ASTs, thereby better supporting data
transformation use cases. A new test for the `CreateTableCommand` has
been added to the `LogicalPlanGeneratorTest` class to validate the
correct transpilation of the `CreateTableCommand` to a `CREATE TABLE`
SQL statement.
* [sql] generate basic `TABLESAMPLE`
([#830](#830)). In this
commit, the open-source library's `LogicalPlanGenerator` class has been
updated to include a new method, `tableSample`, which generates SQL
representations of table sampling operations. Previously, the class only
handled `INSERT`, `DELETE`, and `CREATE TABLE` commands. With this
enhancement, the generator can now produce SQL statements using the
`TABLESAMPLE` clause, allowing for the selection of a sample of data
from a table based on various sampling methods and a seed value for
repeatable sampling. The newly supported sampling methods include
row-based probabilistic, row-based fixed amount, and block-based
sampling. Additionally, a new test case has been added for the
`LogicalPlanGenerator` related to the `TableSample` class, validating
the correct transpilation of named tables and fixed row sampling into
the `TABLESAMPLE` clause with specified parameters. This improvement
ensures that the generated SQL code accurately represents the desired
table sampling settings.

Dependency updates:

* Bump sqlglot from 25.6.1 to 25.8.1
([#749](#749)).
* Added proper `StructType`/`StructField`/`StructExpr` implementations
for parsing, data type inference, and code generation
* Mapped `OBJECT_CONSTRUCT` from Snowflake
(#854)

This PR makes it possible to use `databricks labs remorph debug-script
--name tests/resources/functional/snowflake/nested_query_with_json_1.sql
--dialect snowflake` command and sets up basic Dependency Injection
infrastructure via `ApplicationContext` trait.

Co-authored-by: Serge Smertin <[email protected]>
Co-authored-by: SundarShankar89 <[email protected]>
* Formatted `aggregates JSON` configs and added `aggregates` in
`reconcile configs`

Added support for `ALTER TABLE`: `ADD COLUMNS`, `DROP COLUMNS`, `RENAME COLUMNS`, and `DROP CONSTRAINTS` (#861)

Implement Alter Table.
Added a Table Header with some notes.
NB: the previous PR was merged before I was totally finished.

Normalizes parameter generation to always use `${}` for clarity,
conforming to Databricks notebook examples for widgets.

Adds an additional coverage test for variable references in strings.
Adds a starting point for useful ANTLR utilities:

- Given an ANTLR ParserRuleContext, retrieve the original text from the
input source.
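
Using the antlr4 Python runtime for illustration, that utility can be
sketched like this (remorph's version lives on the JVM side; the idea is
that `ctx.getText()` drops whitespace, while slicing the input stream
preserves the original text):

```python
from antlr4 import ParserRuleContext

def original_text(ctx: ParserRuleContext) -> str:
    # Slice the underlying character stream between the context's first
    # and last tokens to recover the text exactly as written.
    stream = ctx.start.getInputStream()
    return stream.getText(ctx.start.start, ctx.stop.stop)
```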
1. Updated the Spark setup script to check whether the `spark gzip file` exists.

1. Refactored the script to remove the warnings:
    * `Double quote to prevent globbing and word splitting`
    * `Use 'cd ... || exit' or 'cd ... || return' in case cd fails.`
    * `Consider using 'grep -c' instead of 'grep|wc -l'.`
    * `Usage: sleep seconds`, converted 2m to 120 seconds

1. Tested the following scenarios (see the sketch after this list):
    1. Scenario: the extracted Spark folder is already present.
       Outcome: directly starts the Spark server using `sbin/start-connect-server.sh`.
    1. Scenario: the extracted Spark folder **is not present**, and the tarball (`spark-<VERSION>.tgz`) is present.
       Outcome: extract the tarball and start the Spark server.
    1. Scenario: the extracted Spark folder **is not present**, and the tarball **is not present**.
       Outcome: download, extract, and start the Spark server.
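
The script's decision flow, rendered in Python for illustration (the
real script is Bash; the version, URL, and paths here are assumptions):

```python
import os
import subprocess
import tarfile
import urllib.request

SPARK_VERSION = "3.5.1"  # assumption: whatever version the script pins
SPARK_DIR = f"spark-{SPARK_VERSION}-bin-hadoop3"
SPARK_TGZ = f"{SPARK_DIR}.tgz"
SPARK_URL = f"https://archive.apache.org/dist/spark/spark-{SPARK_VERSION}/{SPARK_TGZ}"

if not os.path.isdir(SPARK_DIR):          # scenario 1: folder present -> skip to start
    if not os.path.isfile(SPARK_TGZ):     # scenario 3: no tarball -> download first
        urllib.request.urlretrieve(SPARK_URL, SPARK_TGZ)
    with tarfile.open(SPARK_TGZ) as tgz:  # scenario 2: tarball present -> extract
        tgz.extractall()
subprocess.run([os.path.join(SPARK_DIR, "sbin", "start-connect-server.sh")], check=True)
```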
…te task` (#864)

`alter session | stream...`, `create stream`, `create task`, and
`execute task`.
Now it generates either Project or Deduplicate nodes.

---------

Co-authored-by: Serge Smertin <[email protected]>
Right now remorph-reconcile supports only physical tables, but some
clients use temporary views for reconciliation. To support temporary
views we had to add handling for `global_temp`.
* Added query history retrieval from Snowflake
([#874](#874)). This
release introduces query history retrieval from Snowflake, enabling
expanded compatibility and data source options for the system. The
update includes adding the Snowflake JDBC driver and its dependencies to
the `pom.xml` file, and the implementation of a new
`SnowflakeQueryHistory` class to retrieve query history from Snowflake.
The `Anonymizer` object is also added to anonymize query histories by
fingerprinting queries based on their structure. Additionally, several
case classes are added to represent various types of data related to
query execution and table definitions in a Snowflake database. A new
`EnvGetter` class is also included to retrieve environment variables for
use in testing. Test files for the `Anonymizer` and
`SnowflakeQueryHistory` classes are added to ensure proper
functionality.
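
The fingerprinting idea behind the `Anonymizer` can be sketched in
Python with sqlglot for illustration (the real implementation is Scala;
the point is that queries differing only in literal values share a
fingerprint):

```python
import hashlib
import sqlglot
from sqlglot import exp

def fingerprint(sql: str) -> str:
    tree = sqlglot.parse_one(sql, read="snowflake")
    # Blank out every literal so the hash depends only on query structure.
    for literal in tree.find_all(exp.Literal):
        literal.set("this", "?")
    return hashlib.sha256(tree.sql().encode()).hexdigest()[:16]

assert fingerprint("SELECT * FROM t WHERE id = 1") == fingerprint("SELECT * FROM t WHERE id = 42")
```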
* Added support for `ALTER TABLE`: `ADD COLUMNS`, `DROP COLUMNS`,
`RENAME COLUMNS`, and `DROP CONSTRAINTS`
([#861](#861)). In this
release, support for various `ALTER TABLE` SQL commands has been added
to our open-source library, including `ADD COLUMNS`, `DROP COLUMNS`,
`RENAME COLUMNS`, and `DROP CONSTRAINTS`. These features have been
implemented in the `LogicalPlanGenerator` class, which now includes a
new private method `alterTable` that takes a context and an
`AlterTableCommand` object and returns an `ALTER TABLE` SQL statement.
Additionally, a new sealed trait `TableAlteration` has been introduced,
with four case classes extending it to handle specific table alteration
operations. The `SnowflakeTypeBuilder` class has also been updated to
parse and build Snowflake-specific SQL types for these commands. These
changes provide improved functionality for managing and manipulating
tables in Snowflake, making it easier for users to work with and modify
their data. The new functionality has been tested using the
`SnowflakeToDatabricksTranspilerTest` class, which specifies Snowflake
`ALTER TABLE` commands and the expected transpiled results.
* Added support for `STRUCT` types and conversions
([#852](#852)). This
change adds support for `STRUCT` types and conversions in the system by
implementing new `StructType`, `StructField`, and `StructExpr` classes
for parsing, data type inference, and code generation. It also maps the
`OBJECT_CONSTRUCT` from Snowflake and introduces updates to various case
classes such as `JsonExpr`, `Struct`, and `Star`. These improvements
enhance the system's capability to handle complex data structures,
ensuring better compatibility with external data sources and expanding
the range of transformations available for users. Additionally, the
changes include the addition of test cases to verify the functionality
of generating SQL data types for `STRUCT` expressions and handling JSON
literals more accurately.
* Minor upgrades to Snowflake parameter processing
([#871](#871)). This
commit includes minor upgrades to Snowflake parameter processing,
enhancing the consistency and readability of the code. The changes
normalize parameter generation to use `${}` syntax for clarity and to
align with Databricks notebook examples. An extra coverage test for
variable references within strings has been added. The specific changes
include updating a SELECT statement in a Snowflake SQL query to use ${}
for parameter processing. The commit also introduces a new SQL file for
functional tests related to Snowflake's parameter processing, which
includes commented out and alternate syntax versions of a query. This
commit is part of continuous efforts to improve the functionality,
reliability, and usability of the Snowflake parameter processing
feature.
* Patch/reconcile support temp views
([#901](#901)). The
latest update to the remorph-reconcile library adds support for
temporary views, a new feature that was not previously available. With
this change, the system can now handle `global_temp` for temporary views
by modifying the `_get_schema_query` method to return a query for the
`global_temp` schema if the schema name is set as such. Additionally,
the `read_data` method was updated to correctly handle the namespace and
catalog for temporary views. The new variable `namespace_catalog` has
been introduced, which is set to `hive_metastore` if the catalog is not
set, and to the original catalog with the added schema otherwise. The
`table_with_namespace` variable is then updated to use the
`namespace_catalog` and table name, allowing for correct querying of
temporary views. These modifications enable remorph-reconcile to work
seamlessly with temporary views, enhancing its flexibility and
functionality. The updated unit tests reflect these changes, with
assertions to ensure that the correct SQL statements are being generated
and executed for temporary views.
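
A minimal sketch of the naming logic described above (function and
variable names are assumptions based on the prose):

```python
def qualified_table_name(catalog: str | None, schema: str, table: str) -> str:
    # Temporary views are addressed through the global_temp schema.
    if schema == "global_temp":
        return f"global_temp.{table}"
    # Fall back to hive_metastore when no catalog is configured,
    # otherwise append the schema to the original catalog.
    namespace_catalog = f"{catalog}.{schema}" if catalog else f"hive_metastore.{schema}"
    return f"{namespace_catalog}.{table}"
```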
* Reconcile Table Recon JSON filename updates
([#866](#866)). The
Remorph project has implemented a change to the naming convention and
placement of the configuration file for the table reconciliation
process. The configuration file, previously named according to
individual preference, must now follow the pattern
`recon_config_<DATA_SOURCE>_<SOURCE_CATALOG_OR_SCHEMA>_<REPORT_TYPE>.json`
and be placed in the `.remorph` directory within the Databricks
Workspace. Examples of Table Recon filenames for Snowflake, Oracle, and
Databricks source systems have been provided for reference.
Additionally, the `data_source` field in the config file has been
updated to accurately reflect the data source. The case of the filename
should now match the case of `SOURCE_CATALOG_OR_SCHEMA` as defined in
the config. Compliance with this new naming convention and placement is
required for the successful execution of the table reconciliation
process.
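For instance, hypothetical filenames following the pattern might be
`recon_config_snowflake_sample_data_all.json`,
`recon_config_oracle_hr_row.json`, or
`recon_config_databricks_sales_schema.json` (illustrative names, not
taken from the documentation).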
* [snowflake] parse parameters
([#855](#855)). The
open-source library has undergone changes related to Scalafmt
configuration, Snowflake SQL parsing, and the introduction of a new
`ExpressionGenerator` class method. The Scalafmt configuration change
introduces a new `docstrings.wrap` option set to `false`, disabling
docstring wrapping at the specified column limit. The
`danglingParentheses.preset` option is also set to `false`, disabling
the formatting rule for unnecessary parentheses. In Snowflake SQL
parsing, new token types, lexer modes, and parser rules have been added
to improve the parsing of string literals and other elements. A new
`variable` method in the `ExpressionGenerator` class generates SQL
expressions for `ir.Variable` objects. A new `Variable` case class has
been added to represent a variable in an expression, and the
`SchemaReference` case class now takes a single child expression. The
`SnowflakeDDLBuilder` class has a new method, `extractString`, to safely
extract strings from ANTLR4 context objects. The
`SnowflakeErrorStrategy` object now includes new parameters for parsing
Snowflake syntax, and the Snowflake LexerSpec test class has new methods
for filling tokens from an input string and dumping the token list.
Tests have been added for various string literal scenarios, and the
SnowflakeAstBuilderSpec includes a new test case for handling the
`translate amps` functionality. The Snowflake SQL queries in the test
file have been updated to standardize parameter referencing syntax,
improving consistency and readability.
* fixed current_date() generation
([#890](#890)). This
release includes a fix for an issue with the generation of the
`current_date()` function in SQL queries, specifically for the Snowflake
dialect. A test case in the `sqlglot-incorrect` category has been
updated to use the correct syntax for the `CURRENT_DATE` function, which
includes parentheses (`SELECT CURRENT_DATE() FROM tabl;`). Additionally,
the `current_date()` function is now called consistently throughout the
tests, either as `CURRENT_DATE` or `CURRENT_DATE()`, depending on the
syntax required by Snowflake. No new methods were added, and the
existing functionality was changed only to correct the `current_date()`
generation. This improvement ensures accurate and consistent generation
of the `current_date()` function across different SQL dialects,
enhancing the reliability and accuracy of the tests.
Here, we implement transpilation of the TSQL CREATE TABLE command with
its many options and forms, including CTAS and graph node syntax, as
well as covering syntactical differences for the analytics variants of
SQL Server.

We also move the DDL and DML visitors from the AST and Relation
visitors, to give better separation of responsibility.
vil1 and others added 3 commits December 13, 2024 17:03
Bumps org.slf4j:slf4j-api from 2.0.9 to 2.0.16.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[org.codehaus.mojo:exec-maven-plugin](https://github.com/mojohaus/exec-maven-plugin)
from 3.4.1 to 3.5.0.
Release notes (sourced from org.codehaus.mojo:exec-maven-plugin's releases, 3.5.0):

New features and improvements:
- Add toolchain java path to environment variables in ExecMojo (#455) @michalm2000

Bug fixes:
- #322, enable to control the exec:java interaction with JVM classloader more finely (#337) @rmannibucau

Dependency updates:
- Bump org.codehaus.mojo:mojo-parent from 85 to 86 (#445) @dependabot
- Bump commons-io:commons-io from 2.7 to 2.14.0 in /src/it/projects/project6/project5lib (#451) @dependabot
- Bump asm.version from 9.7 to 9.7.1 (#452) @dependabot
- Bump commons-io:commons-io from 2.7 to 2.14.0 in /src/it/projects/setup-parent (#450) @dependabot
- Bump commons-io:commons-io from 2.7 to 2.14.0 in /src/test/projects/project13 (#449) @dependabot
- Bump org.codehaus.plexus:plexus-utils from 4.0.1 to 4.0.2 (#447) @dependabot

Maintenance:
- Update site descriptor to 2.0.0 (#457) @slawekjaranowski
- Toolchains manual improvements (#456) @slawekjaranowski
- Manage version of maven-toolchains-plugin (#454) @slawekjaranowski

Commits: [maven-release-plugin] prepare release 3.5.0, plus the changes
above; additional commits are viewable in the compare view at
https://github.com/mojohaus/exec-maven-plugin/compare/3.4.1...3.5.0

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Valentin Kasas <[email protected]>
if expression.this in [exp.DataType.Type.TIMESTAMP, exp.DataType.Type.TIMESTAMPLTZ]:
    return "TIMESTAMP"
if expression.this in [exp.DataType.Type.TIMESTAMP]:
    return "TIMESTAMP_NTZ"
Collaborator:

We need to introduce a new DataType; we can't map the existing Timestamp to TIMESTAMP_NTZ.

Contributor Author:

done

@@ -0,0 +1,219 @@
from sqlglot.dialects.teradata import Teradata as org_Teradata
Collaborator:

For consistency with other PRs:

Suggested change:
- from sqlglot.dialects.teradata import Teradata as org_Teradata
+ from sqlglot.dialects.teradata import Teradata as SqlglotTeradata

Contributor Author:

done

"CASE_N": lambda self: self._parse_case_n(),
}

def match_pair_and_advance(self):
sundarshankar89 (Collaborator), Dec 16, 2024:

Suggested change:
- def match_pair_and_advance(self):
+ def _match_pair_and_advance(self):

Contributor Author:

done

Collaborator:

I still see this as a public method.

end = None
clone = None

def extend_props(temp_props: exp.Properties | None) -> None:
Collaborator:

Move these as static private methods outside the class, which will help with the pylint complexity checker.

Contributor Author:

done

@@ -0,0 +1,19 @@
-- Snowflake sql:
Collaborator:

Please add functional tests for these; just adding SQL examples doesn't work. Also add a test_teradata.py similar to test_databricks.py or test_presto.py.

Contributor Author:

done

@@ -0,0 +1,18 @@
CREATE SET TABLE db_ods_plus.SPH_CNTRY_CRDNTR ,FALLBACK ,
Collaborator:

All functional tests need to follow:

-- <Source> SQL
<Actual SQL>

-- Databricks SQL
<Expected SQL>

Contributor Author:

done
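
A minimal pytest-style sketch of what such a test_teradata.py could look
like (helper names are assumptions, not remorph's actual test harness):

```python
import sqlglot

def transpile_teradata(sql: str) -> str:
    # Transpile a Teradata statement to Databricks SQL via sqlglot.
    return sqlglot.transpile(sql, read="teradata", write="databricks")[0]

def test_identity_select():
    # Trivial case: a plain projection should survive transpilation as-is;
    # real tests would assert the dialect-specific rewrites instead.
    assert transpile_teradata("SELECT col FROM tbl") == "SELECT col FROM tbl"
```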

ericvergnaud and others added 22 commits December 16, 2024 16:02
Our current code uses sqlglot data structures in places where we should
not depend on sqlglot.
This PR fixes the issue.

Progresses #1298 

Requires #1321 to be merged -> Ready now
Rename for clarity

Progresses #1298 

Requires #1320 to be merged (could have been done independently but not
worth extra work)
Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 25.34.1 to
26.0.0.
Changelog (sourced from sqlglot's CHANGELOG.md):

v26.0.0 - 2024-12-10

Breaking changes:
- Transpile support for bitor/bit_or snowflake function (PR #4486 by @ankur334)
- Preserve roundtrips of DATETIME/DATETIME2 (PR #4491 by @VaggelisD)

New features:
- snowflake: Transpile support for bitor/bit_or snowflake function (PR #4486 by @ankur334)
- snowflake: Support for inline FOREIGN KEY (PR #4493 by @VaggelisD; addresses issue #4489 opened by @kylekarpack)

Bug fixes:
- tsql: Preserve roundtrips of DATETIME/DATETIME2 (PR #4491 by @VaggelisD)
- duckdb: Allow escape strings similar to Postgres (PR #4497 by @VaggelisD; fixes issue #4496 opened by @LennartH)

Commits: Refactor!!: bundle multiple WHEN [NOT] MATCHED into a
exp.WhenSequence (#4495), plus the changes above; see the full diff in
the compare view at https://github.com/tobymao/sqlglot/compare/v25.34.1...v26.0.0

Most recent ignore conditions applied to this pull request:

| Dependency Name | Ignore Conditions |
| --- | --- |
| sqlglot | [>= 24.a, < 25] |
| sqlglot | [>= 25.31.dev0, < 25.32] |



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#1373)

Bumps org.apache.logging.log4j:log4j-slf4j2-impl from 2.24.2 to 2.24.3.



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Our current CI blocks merging of PRs whose added Python code contains comments such as:

`# pylint: disable=cyclic-import, import-outside-toplevel`

Local imports are discouraged, but are the most effective way of solving
cyclic imports.

This PR solves the issue.
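
For context, a minimal sketch of the pattern these comments guard, assuming two hypothetical modules `a.py` and `b.py` that import each other (the module and function names are illustrative, not from the codebase):

```python
# a.py -- one half of a hypothetical a.py <-> b.py import cycle.
# pylint: disable=cyclic-import, import-outside-toplevel

def render(node):
    # Deferring the import to call time breaks the cycle: a.py finishes
    # loading before b is imported, so neither module is seen half-initialized.
    from b import helper
    return helper(node)
```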

---------

Co-authored-by: Valentin Kasas <[email protected]>
Bumps [org.junit:junit-bom](https://github.com/junit-team/junit5) from
5.11.3 to 5.11.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/junit-team/junit5/releases">org.junit:junit-bom's
releases</a>.</em></p>
<blockquote>
<p>JUnit 5.11.4 = Platform 1.11.4 + Jupiter 5.11.4 + Vintage 5.11.4</p>
<p>See <a
href="http://junit.org/junit5/docs/5.11.4/release-notes/">Release
Notes</a>.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/junit-team/junit5/compare/r5.11.3...r5.11.4">https://github.com/junit-team/junit5/compare/r5.11.3...r5.11.4</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/junit-team/junit5/commit/6430ba4f653f6ae42f326cd0731b259ee699c719"><code>6430ba4</code></a>
Release 5.11.4</li>
<li><a
href="https://github.com/junit-team/junit5/commit/d09312174e38e36ec7ac9b35f7033d6a2b693125"><code>d093121</code></a>
Finalize 5.11.4 release notes</li>
<li><a
href="https://github.com/junit-team/junit5/commit/0444353084f7c47bb29e785b10cf3e835454c2da"><code>0444353</code></a>
Fix Maven integration tests on JDK 24</li>
<li><a
href="https://github.com/junit-team/junit5/commit/b5c7f4eeaff5b8a654e9ea6b78227cf90345b0ae"><code>b5c7f4e</code></a>
Move <a
href="https://redirect.github.com/junit-team/junit5/issues/4153">#4153</a>
to 5.11.4 release notes</li>
<li><a
href="https://github.com/junit-team/junit5/commit/b20c4e2eaed8a97536d48f7bb084a4bd828a56a9"><code>b20c4e2</code></a>
Ensure the XMLStreamWriter is closed after use</li>
<li><a
href="https://github.com/junit-team/junit5/commit/6376f0ab367f1ac17ce75b5410e68090b03b9d9b"><code>6376f0a</code></a>
Configure Git username and email</li>
<li><a
href="https://github.com/junit-team/junit5/commit/2b485c4286531fe7f3aa70367a27cf141c669a12"><code>2b485c4</code></a>
Set reference repo URI</li>
<li><a
href="https://github.com/junit-team/junit5/commit/500b5a06b5964a477e65719877653bae0e2496fc"><code>500b5a0</code></a>
Inject username and password via new DSL</li>
<li><a
href="https://github.com/junit-team/junit5/commit/d67196188fb63fa5a35f63caf168dc42cecfaca8"><code>d671961</code></a>
Update plugin gitPublish to v5</li>
<li><a
href="https://github.com/junit-team/junit5/commit/3d11279dbaae5aac0ab5f28d8283272bdbca924f"><code>3d11279</code></a>
Add <code>JAVA_25</code> to <code>JRE</code> enum</li>
<li>Additional commits viewable in <a
href="https://github.com/junit-team/junit5/compare/r5.11.3...r5.11.4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.junit:junit-bom&package-manager=maven&previous-version=5.11.3&new-version=5.11.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SundarShankar89 <[email protected]>
)

The Python unit-test suite currently includes integration tests that are slow or impossible to run out of the box on a desktop.
This PR solves the issue by:
- moving integration tests from the `units` hierarchy to the
`integration` one
- adding a `python-integration` job to the CI that runs tests under `integration` (see the sketch after this list)
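
A hedged sketch of what the new job effectively runs, assuming the relocated tests live under a `tests/integration` directory (the path and flags are illustrative):

```python
# run_integration.py -- select only the relocated integration tests.
import sys

import pytest

if __name__ == "__main__":
    # pytest.main takes the same arguments as the pytest CLI; pointing it at
    # the integration directory leaves the fast unit suite untouched.
    sys.exit(pytest.main(["tests/integration", "-v"]))
```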
Running unit tests locally creates files in the source hierarchy that need to be removed each time before pushing to git.
This PR fixes the issue.
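
A minimal sketch of the fixture-based pattern that avoids this, assuming pytest (the test name and payload are hypothetical):

```python
def test_report_is_written_to_tmp_path(tmp_path):
    # pytest's tmp_path fixture provides a per-test temporary directory that
    # is cleaned up automatically, so no files land in the source hierarchy.
    out = tmp_path / "report.json"
    out.write_text('{"ok": true}')
    assert out.read_text() == '{"ok": true}'
```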
# Conflicts:
#	src/databricks/labs/remorph/config.py
clone = None
return exists, this, expression, indexes, no_schema_binding, begin, end, clone

def _parse_function_body(self, extend_props):
Collaborator:

nit: @sriram251-code Add comments or docs to each function
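
A hypothetical sketch of the kind of docstring being asked for; the described behavior is inferred from the method name, not confirmed by the PR:

```python
def _parse_function_body(self, extend_props):
    """Parse the body of a CREATE FUNCTION/PROCEDURE statement.

    Hypothetical docstring sketch; the real parsing logic is elided, and
    `extend_props` is assumed to collect properties for the caller.
    """
    ...
```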

"CASE_N": lambda self: self._parse_case_n(),
}

def match_pair_and_advance(self):
Collaborator:

I still see this as a public method.
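
A minimal sketch of the rename being requested; a leading underscore marks the helper as internal by Python convention (the body is elided):

```python
def _match_pair_and_advance(self):
    # Renamed from match_pair_and_advance so the helper is not exposed as
    # part of the parser's public surface.
    ...
```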

NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO,
MAP = TD_MAP1
Collaborator:

@sriram251-code in the code, add links to the documentation that defines what MAP and Compress do, so it is easy to navigate to an explanation of why we have ignored them.
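
A hypothetical sketch of the in-code pointer being requested; the constant name is a placeholder, not taken from the PR:

```python
# MAP and COMPRESS are Teradata physical-storage options with no Databricks
# equivalent, which is why the transpiler ignores them; see the Teradata SQL
# DDL reference for CREATE TABLE (docs.teradata.com) for their semantics.
IGNORED_STORAGE_OPTIONS = ("MAP", "COMPRESS")
```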


--databricks sql:
CREATE TABLE TBL1 (
COL1 DECIMAL(14, 0) NOT NULL
Collaborator:

Is this because of a formatting issue?

Suggested change:
- COL1 DECIMAL(14, 0) NOT NULL
+ COL1 DECIMAL(14, 0) NOT NULL


from ..conftest import FunctionalTestFile, get_functional_test_files_from_directory

path = Path(__file__).parent / Path('../../resources/functional/teradata/ddl/')
Collaborator:

Suggested change:
- path = Path(__file__).parent / Path('../../resources/functional/teradata/ddl/')
+ path = Path(__file__).parent / Path('../../resources/functional/teradata/')
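
A hedged sketch of how the widened path would be consumed, assuming the conftest helper takes the directory and returns the discovered test files (the call signature is an assumption):

```python
from pathlib import Path

from ..conftest import FunctionalTestFile, get_functional_test_files_from_directory

path = Path(__file__).parent / Path('../../resources/functional/teradata/')
# Assumed usage: collect every functional test under the broader directory.
test_files: list[FunctionalTestFile] = get_functional_test_files_from_directory(path)
```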

@sundarshankar89 (Collaborator):

@sriram251-code can you recreate the PRs?
