Contributor

@tarak271 tarak271 commented Oct 17, 2025

…t command

What changes were proposed in this pull request?

Display the TxnId for each query listed in the show processlist output.

Why are the changes needed?

To correlate each TxnId with its query.

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Added a JUnit test case and tested on a local cluster:

0: jdbc:hive2://ip-10-17-78-194.support.fuse.> show processlist;
+------------+---------------+-------------------+---------------------------------------+--------------------------+------------------------+----------------------------------------------------+----------+---------+--------------------------+-------------------+---------------+
| User Name | Ip Addr | Execution Engine | Session Id | Session Active Time (s) | Session Idle Time (s) | Query ID | State | Txn ID | Opened Timestamp (s) | Elapsed Time (s) | Runtime (s) |
+------------+---------------+-------------------+---------------------------------------+--------------------------+------------------------+----------------------------------------------------+----------+---------+--------------------------+-------------------+---------------+
| hive | 10.17.78.194 | tez | b6a7a9da-a01e-4e4e-b5bd-38c0edd54867 | 119 | 6 | hive_20251017043354_b66fb35a-0e8c-476a-a48c-0805276cfd40 | RUNNING | 610 | 2025-10-17 04:33:54.639 | 9 | Not finished |
+------------+---------------+-------------------+---------------------------------------+--------------------------+------------------------+----------------------------------------------------+----------+---------+--------------------------+-------------------+---------------+

@Aggarwal-Raghav
Contributor

@tarak271, code changes LGTM, but I'm not able to see the Txn ID column after cherry-picking your changes in my setup.

[Screenshot 2025-10-17 at 9 17 03 PM]

Do we need to make changes in org.apache.hadoop.hive.ql.processors.ShowProcessListProcessor#getSchema()?

private final String runtime; // tracks only running portion of the query.
private final long elapsedTime;
private final String state;
private final long txnId;

Contributor

I might be wrong. Don't we need to read this variable in ShowProcessListProcessor.java?

Contributor Author

Yes, you are correct. I added a getter so that it can be used in that class.
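As a rough illustration of the fix discussed here, a private final field only becomes usable from another class once an accessor exposes it. The class name ProcessListEntry and its constructor below are hypothetical stand-ins, not the actual Hive class; only the state and txnId field names come from the snippet under review.

```java
// Hypothetical stand-in for the class under review; only two of its fields are shown.
final class ProcessListEntry {
  private final String state;
  private final long txnId;

  ProcessListEntry(String state, long txnId) {
    this.state = state;
    this.txnId = txnId;
  }

  String getState() {
    return state;
  }

  // Read-only accessor so the class that builds the output rows can read the ID.
  long getTxnId() {
    return txnId;
  }
}
```

Without such a getter, the field would be set but never reach the show processlist output, which matches the behavior reported above.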

@tarak271
Contributor Author

> @tarak271, code changes LGTM, but I'm not able to see the Txn ID column after cherry-picking your changes in my setup.
>
> [Screenshot 2025-10-17 at 9 17 03 PM]
>
> Do we need to make changes in org.apache.hadoop.hive.ql.processors.ShowProcessListProcessor#getSchema()?

@Aggarwal-Raghav Yes, I made those changes. Please try now.
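The missing column comes down to keeping the declared column list and the emitted row values in step. The sketch below is not the Hive getSchema() implementation; ProcessListRow and toRow are hypothetical names, and the header order is simply copied from the show processlist output in the PR description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Hive, used only to illustrate the idea.
final class ProcessListRow {
  // Column order copied from the show processlist output in the PR description.
  static final List<String> HEADERS = List.of(
      "User Name", "Ip Addr", "Execution Engine", "Session Id",
      "Session Active Time (s)", "Session Idle Time (s)", "Query ID", "State",
      "Txn ID", "Opened Timestamp (s)", "Elapsed Time (s)", "Runtime (s)");

  // Builds one row in header order and fails fast when a declared column has no value,
  // so a header without matching row data is caught immediately.
  static List<String> toRow(Map<String, String> values) {
    List<String> row = new ArrayList<>();
    for (String header : HEADERS) {
      String value = values.get(header);
      if (value == null) {
        throw new IllegalArgumentException("No value for column: " + header);
      }
      row.add(value);
    }
    return row;
  }
}
```

A check like this only guards one direction (a header without a value); a value that is never declared as a column, as in the situation reported above, still has to be caught by a test that inspects the output.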

@Aggarwal-Raghav
Contributor

@tarak271, after the changes the TxnID is visible in the show processlist output. Thanks for adding a UT for the same.

executor.submit(() -> {
  try (Connection con = DriverManager.getConnection(miniHS2.getJdbcURL(), user, "bar");
       Statement stmt = con.createStatement()) {
    stmt.execute("drop database if exists DB_" + Thread.currentThread().threadId() + " cascade");

Contributor

The following is my personal opinion:

  1. The thread pool reuses its 5 threads to execute the 20 tasks, so the same thread ID will be used multiple times. That doesn't align with the intent of having 20 distinct operations, as several DROP DATABASE statements might use the same threadId.
  2. How was the count of 20 iterations for the for loop decided? Is there any way to make it more deterministic?
  3. If executor.getActiveCount() is 0, the Assert statement won't get invoked. The core logic might not be tested and the test will still pass.
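The first point can be checked in isolation. The following is a minimal sketch, independent of the Hive test, that submits 20 tasks to a fixed pool of 5 threads and records which thread IDs ran them. The class and method names are made up for the illustration, and getId() is used instead of threadId() only so the sketch also compiles on older JDKs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, unrelated to the Hive test code.
final class PoolThreadIds {
  // Runs `tasks` trivial jobs on a fixed pool and returns the IDs of the threads that ran them.
  static Set<Long> distinctThreadIds(int poolSize, int tasks) {
    Set<Long> ids = ConcurrentHashMap.newKeySet();
    ExecutorService executor = Executors.newFixedThreadPool(poolSize);
    CountDownLatch done = new CountDownLatch(tasks);
    try {
      for (int i = 0; i < tasks; i++) {
        executor.submit(() -> {
          ids.add(Thread.currentThread().getId());
          done.countDown();
        });
      }
      if (!done.await(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for tasks", e);
    } finally {
      executor.shutdownNow();
    }
    return ids;
  }

  public static void main(String[] args) {
    Set<Long> ids = distinctThreadIds(5, 20);
    // The set can never hold more IDs than the pool has threads.
    System.out.println(ids.size() + " distinct thread IDs for 20 tasks");
  }
}
```

Since the set can hold at most 5 IDs, any name derived from the thread ID repeats across tasks. If distinct names per task are wanted, deriving them from the loop index would be one option.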

Contributor Author

@Aggarwal-Raghav Please find my responses below:

  1. The idea is to simulate 10 to 20 connections/queries that can open transactions. A new txnId is opened for every query, irrespective of the connection or thread.
  2. We can use any number of threads; for now I am keeping it at 10, i.e. 10 queries will be run, and the txnIds for those queries can be captured using show processlist to cross-verify whether a txnId was opened.
  3. Moved the assert out of that loop. The loop keeps executing until all the queries have executed and all the transactions have been captured using show processlist. At the end, the test checks whether 10 transactions were opened for those 10 queries.
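The pattern described in the third response (poll until the work is done, accumulate what was seen, assert once afterwards) might look roughly like the sketch below. It is not the test from this PR: TxnPolling, collectTxnIds, and the two supplier parameters are hypothetical, with the snapshot supplier standing in for a show processlist call and the completion check standing in for the executor state. A timeout is added as an assumption so the loop cannot spin forever.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical helper illustrating "poll, accumulate, assert after the loop".
final class TxnPolling {
  // Repeatedly takes a snapshot of currently visible transaction IDs and accumulates them
  // until `allDone` reports true or the timeout expires. The caller asserts on the result.
  static Set<Long> collectTxnIds(Supplier<List<Long>> snapshot, BooleanSupplier allDone,
      long timeoutMillis, long pollMillis) {
    Set<Long> seen = new HashSet<>();
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      seen.addAll(snapshot.get());
      if (allDone.getAsBoolean() || System.currentTimeMillis() > deadline) {
        break;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return seen;
  }
}
```

Because the assertion runs on the returned set after the loop, it is always executed, which addresses the concern that a test could pass without ever checking anything. A query that starts and finishes between two polls would still be missed, so the poll interval matters.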
