Conversation

Contributor

@sfc-gh-aalam sfc-gh-aalam commented May 7, 2025

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-2084165

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  3. Please describe how your code solves the related issue.

    see doc.


sfc-gh-snowflakedb-snyk-sa commented May 7, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found.

license/snyk check is complete. No issues have been found.

@sfc-gh-aalam sfc-gh-aalam marked this pull request as ready for review May 13, 2025 18:02
@sfc-gh-aalam sfc-gh-aalam requested review from a team as code owners May 13, 2025 18:03
@sfc-gh-aalam sfc-gh-aalam requested a review from sfc-gh-jkew May 13, 2025 18:03
@sfc-gh-aalam sfc-gh-aalam requested a review from a team May 23, 2025 18:21
"""Returns the batch_ids of the children of this node."""
return get_dependent_bind_ids(self.stmt_cache[self.batch_id])

def get_src(self) -> Optional[proto.SrcPosition]:

Contributor

In the hybrid client prototype we are using a slightly different method to get the source location: we just use inspect to walk the stack to the appropriate frame. We have to do this because modin is not using any of the AST machinery, but it's also relatively straightforward.

I sort of want to use your debugging tool for snowpandas as well, but we may want to refactor this so we don't require any of the protobuf work.
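
For illustration, a minimal sketch of that inspect-based approach (the helper name and the frame-filtering heuristic are assumptions, not the prototype's actual code):

import inspect

def _find_user_frame() -> tuple[str, int]:
    """Walk the call stack and return (filename, lineno) of the first frame
    that does not live inside the snowflake.snowpark package."""
    for frame_info in inspect.stack():
        path = frame_info.filename.replace("\\", "/")
        # Heuristic: skip frames that belong to the library itself.
        if "snowflake/snowpark" not in path:
            return frame_info.filename, frame_info.lineno
    # Fall back to the outermost frame if every frame matched.
    outermost = inspect.stack()[-1]
    return outermost.filename, outermost.lineno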

def get_user_source_location(group: str) -> dict[str, str]:

Contributor

What about using this function?

def context_manager_code_location(frame_info, func) -> Tuple[str, int]:

Essentially we seem to have three approaches to this problem. I'm less of a fan of the AST approach because it doesn't help pandas for this type of debugging, but it seems like we might be able to consolidate with the OpenTelemetry approach.
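
For reference, a rough sketch of what a helper with that signature could do (only the signature comes from the line above; the body is an assumption):

import inspect
from typing import Tuple

def context_manager_code_location(frame_info: inspect.FrameInfo, func) -> Tuple[str, int]:
    # Prefer the user's frame captured when the context manager was entered;
    # fall back to where the wrapped function was defined.
    if frame_info is not None:
        return frame_info.filename, frame_info.lineno
    code = func.__code__
    return code.co_filename, code.co_firstlineno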

@sfc-gh-aalam sfc-gh-aalam requested a review from a team May 29, 2025 00:22
_enable_dataframe_trace_on_error = False


def configure_development_features(

Contributor

@sfc-gh-aling sfc-gh-aling May 29, 2025

This is similar to what I would expect; see my comment on #3380 (comment).

I'm thinking of the following to provide a unified debug config experience.

@experimental(version="1.33.0")
def debug_config(
    *,
    enable_eager_schema_validation=False,
    enable_dataframe_trace_on_error=False,
):
    ...

When users want to enable it, they do:

import snowflake.snowpark.context
snowflake.snowpark.context.debug_config(enable_eager_schema_validation=True)
# or
snowflake.snowpark.context.debug_config(
    enable_eager_schema_validation=True,
    enable_dataframe_trace_on_error=True
)

Contributor Author

Good idea. @sfc-gh-jrose and I are aligned on the name. Let me add the @experimental decorator as well.

Contributor Author

Could not import the decorator due to circular import issues, but I added a warning there.
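
A minimal sketch of what replacing the decorator with an inline warning can look like (the message text and warning mechanics are assumptions, not the merged code):

import warnings

def configure_development_features(
    *,
    enable_dataframe_trace_on_error: bool = False,
) -> None:
    # Emitted inline because importing the @experimental decorator here
    # would create a circular import.
    warnings.warn(
        "configure_development_features() is experimental. Do not use it in production.",
        stacklevel=2,
    )
    global _enable_dataframe_trace_on_error
    _enable_dataframe_trace_on_error = enable_dataframe_trace_on_error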


def configure_development_features(
    *,
    enable_dataframe_trace_on_error: bool = False,

Contributor

I think we should default to True. That way users who want a basic development mode can call this function without any parameters.
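
For instance (a sketch of the suggested default; assumes no other parameters are added):

def configure_development_features(*, enable_dataframe_trace_on_error: bool = True) -> None:
    ...

# A bare call then enables the basic development mode:
configure_development_features()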

@sfc-gh-aalam sfc-gh-aalam changed the base branch from main to aalam-SNOW-2110972-allow-local-override-for-ast-collection June 4, 2025 23:21
Base automatically changed from aalam-SNOW-2110972-allow-local-override-for-ast-collection to main June 6, 2025 22:53
"""A node representing a dataframe operation in the DAG that represents the lineage of a DataFrame."""

def __init__(self, batch_id: int, stmt_cache: Dict[int, proto.Stmt]) -> None:
    self.batch_id = batch_id

Collaborator

Nit: I would argue that this isn't meant to be a batch ID anymore. Within each Python session that imports the Snowpark module, each AST ID for a Table or DataFrame will be a UID.
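
Pulling the fragments quoted in this review together, the node roughly looks like the sketch below (assembled for illustration only; the class name is made up, and get_dependent_bind_ids is a helper introduced by this PR):

from typing import Dict, Optional, Set

class DataFrameLineageNode:
    """A node representing a dataframe operation in the DAG that represents the lineage of a DataFrame."""

    def __init__(self, batch_id: int, stmt_cache: Dict[int, "proto.Stmt"]) -> None:
        # Per the note above, this is effectively a per-session unique AST id
        # for a Table/DataFrame rather than a "batch" id.
        self.batch_id = batch_id
        self.stmt_cache = stmt_cache

    def get_children(self) -> Set[int]:
        """Returns the batch_ids of the children of this node."""
        return get_dependent_bind_ids(self.stmt_cache[self.batch_id])

    def get_src(self) -> Optional["proto.SrcPosition"]:
        ...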

@sfc-gh-aalam sfc-gh-aalam merged commit b0ebd5e into main Jun 10, 2025
36 of 39 checks passed
@sfc-gh-aalam sfc-gh-aalam deleted the aalam-SNOW-2084165-add-error-trace branch June 10, 2025 20:43
@github-actions github-actions bot locked and limited conversation to collaborators Jun 10, 2025