v0.1.5
- Added Pylint Checker (#149). This change adds a Pylint checker to the project to enforce a consistent code style and to flag potential bugs and errors in the Python code. The Pylint configuration sets limits such as the maximum line length, the maximum number of arguments per function, and the maximum number of lines per module. It also loads several plugins that provide extra checks, customizes the naming-convention checks, and configures how exceptions, logging statements, and import statements are handled. Several files, including cli.py, morph_status.py, validate.py, and a number of SQL-related files, were updated to comply with the new configuration.
- Fixed edge case where column name is same as alias name (#164). This change fixes edge cases caused by conflicts between column names and alias names in SQL queries, addressing issues #164 and #130. The `check_for_unsupported_lca` function now uses two helper functions, `_find_aliases_in_select` and `_find_invalid_lca_in_window`, to detect aliases that share a name with a column in a SELECT expression and to identify invalid lateral column aliases (LCAs) in window functions, respectively. The `find_windows_in_select` function has been refactored and renamed to `_find_windows_in_select` for improved readability. The `transpile` and `parse` functions in `sql_transpiler.py` now use try-except blocks that handle `ParseError`, `TokenError`, and `UnsupportedError`, so a column name that matches an alias name no longer causes an unhandled exception. A new unit test, `test_query_with_same_alias_and_column_name`, verifies the fix with a SQL query whose subquery defines a column alias `ca_zip` that is also used as a column name in the same query.
- `TO_NUMBER` without `format` edge case (#172). This commit addresses an unsupported usage of the `TO_NUMBER` function in the Databricks SQL dialect: calling it without the `format` parameter. The `_to_number` method now raises an `UnsupportedError` when `TO_NUMBER` is called without a `format` parameter, so users are told explicitly that `format` is required. The commit also introduces the constants `PRECISION_CONST` and `SCALE_CONST` (set to 38 and 0 respectively), which `_to_number` uses as default values for the `precision` and `scale` parameters. Test cases have been added for the `TO_DECIMAL`, `TO_NUMERIC`, and `TO_NUMBER` functions with format strings, including cases where the format is taken from table columns, and for the error raised when `TO_DECIMAL` is called without a format parameter.
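The `TO_NUMBER` behaviour described in #172 can be pictured with a minimal, self-contained sketch. Only the names `PRECISION_CONST`, `SCALE_CONST`, and `UnsupportedError` come from the release note; the function `to_number_sql`, its signature, the local exception class, and the `CAST(... AS DECIMAL(...))` output shape are assumptions made for illustration and are not the project's actual `_to_number` implementation.

```python
# Illustrative sketch only: the function body is an assumption, not the
# project's actual _to_number implementation.

PRECISION_CONST = 38  # default precision named in the release note
SCALE_CONST = 0       # default scale named in the release note


class UnsupportedError(Exception):
    """Local stand-in for the UnsupportedError raised by the transpiler."""


def to_number_sql(value, fmt=None, precision=None, scale=None):
    """Render a hypothetical Databricks TO_NUMBER call as a SQL string.

    Raises UnsupportedError when no format is supplied.
    """
    if fmt is None:
        raise UnsupportedError(
            "TO_NUMBER requires a `format` argument in Databricks SQL"
        )
    # Fall back to the defaults when precision or scale is not supplied.
    precision = PRECISION_CONST if precision is None else precision
    scale = SCALE_CONST if scale is None else scale
    # Assumption: the defaults surface as a DECIMAL cast around the call.
    return f"CAST(TO_NUMBER({value}, {fmt}) AS DECIMAL({precision}, {scale}))"
```

With this sketch, `to_number_sql("col1", "'99999'")` falls back to precision 38 and scale 0, while `to_number_sql("col1")` raises `UnsupportedError` because no format was given.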
Dependency updates:
- Bump sqlglot from 21.2.1 to 22.0.1 (#152).
- Bump sqlglot from 22.0.1 to 22.1.1 (#159).
- Updated databricks-labs-blueprint[yaml] requirement from ~=0.2.3 to >=0.2.3,<0.4.0 (#162).
- Bump sqlglot from 22.1.1 to 22.2.0 (#161).
- Bump sqlglot from 22.2.0 to 22.2.1 (#163).
- Updated databricks-sdk requirement from <0.21,>=0.18 to >=0.18,<0.22 (#168).
- Bump sqlglot from 22.2.1 to 22.3.1 (#170).
- Updated databricks-labs-blueprint[yaml] requirement from <0.4.0,>=0.2.3 to >=0.2.3,<0.5.0 (#171).
- Bump sqlglot from 22.3.1 to 22.4.0 (#173).
Contributors: @dependabot[bot], @sundarshankar89, @bishwajit-db