Native query EntityManager.createNativeQuery(…, EntityType.class) returns different result when run via EntityManager.createNativeQuery(…) #3805

Closed
filpano opened this issue Mar 7, 2025 · 6 comments
Assignees
Labels
for: external-project For an external project and not something we can fix

Comments

@filpano

filpano commented Mar 7, 2025

I have the following PostgreSQL table definition (psql \d+ output):

                                   Partitioned table "service.my_values"
       Column       |            Type             | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
--------------------+-----------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 hour               | timestamp(6) with time zone |           | not null |         | plain   |             |              | 
 client_id          | uuid                        |           | not null |         | plain   |             |              | 
 number_of_invoices | bigint                      |           | not null |         | plain   |             |              | 
 gross_revenue      | numeric(18,3)               |           | not null |         | main    |             |              | 
Partition key: RANGE (hour)
Indexes:
    "my_values_pkey" PRIMARY KEY, btree (client_id, hour)
Partitions: ... (monthly partitions for a given time frame)

And the following data (note: uuid is random):

INSERT INTO service.my_values (hour, client_id, number_of_invoices, gross_revenue)
VALUES
('2024-03-19T00:00:00Z', 'b5aeae94-5e69-4775-a8ea-b5265619cda7', 1, 1),
('2024-03-19T11:00:00Z', 'b5aeae94-5e69-4775-a8ea-b5265619cda7', 1, 1),
('2024-03-20T01:00:00Z', 'b5aeae94-5e69-4775-a8ea-b5265619cda7', 1, 1),
('2024-03-20T04:00:00Z', 'b5aeae94-5e69-4775-a8ea-b5265619cda7', 1, 1);

When I run the following native query repository method:

@Repository
public interface MyRepository extends JpaRepository<MyValue, UUID> {

    /**
     * @implNote JPA behaviour seems broken when grouping by the {@code date_trunc} function,
     * hence we are grouping on the second select "column".
     */
    @Query(value = """
            SELECT
                client_id,
                date_trunc(:#{#unitOfTime.name()}, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * :offset) AS hour,
                SUM(gross_revenue) AS gross_revenue,
                SUM(number_of_invoices) AS number_of_invoices
            FROM
                service.my_values
            WHERE
                client_id = :clientId
                AND hour >= :from
                AND hour < :until
            GROUP BY
                client_id,
                2
            ORDER BY
                2 ASC;
            """, nativeQuery = true)
    Stream<MyValue> findMyValues(
            @Param("clientId") UUID clientId,
            @Param("from") Instant from,
            @Param("until") Instant until,
            @Param("offset") int offset,
            @Param("unitOfTime") UnitOfTime unitOfTime);
}

I expect to get the value: MyValue(clientId=b5aeae94-5e69-4775-a8ea-b5265619cda7, hour=2024-03-19T00:00:00Z, grossRevenue=2, numberOfInvoices=2).

The actual value I get is: MyValue(clientId=b5aeae94-5e69-4775-a8ea-b5265619cda7, hour=2024-03-19T00:00:00Z, grossRevenue=1, numberOfInvoices=1).
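The expected value can be re-derived by hand. The following is a minimal sketch in plain Java, not code from the project: it applies the query's WHERE window and day truncation to the four inserted rows, using the bound values that appear in the Hibernate trace log further down (offset 3, window 2024-03-19T03:00:00Z to 2024-03-20T02:59:59.999999999Z). The class and method names are hypothetical, and it assumes the SQL expression is evaluated as "(hour in UTC) minus offset hours, truncated to the day".

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Hypothetical helper (not part of the reported project): recomputes the aggregate by hand.
final class ExpectedAggregate {

    // One stored row; only the columns used by the aggregate are modelled.
    record Row(Instant hour, BigDecimal grossRevenue, long numberOfInvoices) {}

    // The single day bucket the query is expected to return.
    record Bucket(Instant day, BigDecimal grossRevenue, long numberOfInvoices) {}

    static Bucket aggregate(List<Row> rows, Instant from, Instant until, int offsetHours) {
        Instant day = null;
        BigDecimal revenue = BigDecimal.ZERO;
        long invoices = 0;
        for (Row row : rows) {
            // WHERE hour >= :from AND hour < :until
            if (row.hour().isBefore(from) || !row.hour().isBefore(until)) {
                continue;
            }
            // Assumed reading of: date_trunc('DAY', hour AT TIME ZONE 'UTC' - INTERVAL '1 hour' * :offset)
            Instant bucket = row.hour().minus(Duration.ofHours(offsetHours)).truncatedTo(ChronoUnit.DAYS);
            if (day != null && !day.equals(bucket)) {
                throw new IllegalStateException("rows span more than one day bucket");
            }
            day = bucket;
            revenue = revenue.add(row.grossRevenue());
            invoices += row.numberOfInvoices();
        }
        return new Bucket(day, revenue, invoices);
    }

    // The four rows from the INSERT statement and the parameters from the trace log.
    static Bucket reportedCase() {
        List<Row> rows = List.of(
                new Row(Instant.parse("2024-03-19T00:00:00Z"), BigDecimal.ONE, 1),
                new Row(Instant.parse("2024-03-19T11:00:00Z"), BigDecimal.ONE, 1),
                new Row(Instant.parse("2024-03-20T01:00:00Z"), BigDecimal.ONE, 1),
                new Row(Instant.parse("2024-03-20T04:00:00Z"), BigDecimal.ONE, 1));
        return aggregate(rows,
                Instant.parse("2024-03-19T03:00:00Z"),
                Instant.parse("2024-03-20T02:59:59.999999999Z"),
                3);
    }

    public static void main(String[] args) {
        System.out.println(reportedCase());
    }
}
```

Only the 11:00Z and 01:00Z rows fall inside the window, and both land in the 2024-03-19 bucket after the three-hour shift, which gives the sums of 2 stated above.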

If I run the same query using entityManager.createNativeQuery() (note: the SpEL expression is replaced by a plain string parameter, as createNativeQuery does not seem to support SpEL):

        var query = entityManager.createNativeQuery("""
            SELECT
                client_id,
                date_trunc(:unitOfTimeString, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * :offset) AS hour,
                SUM(gross_revenue) AS gross_revenue,
                SUM(number_of_invoices) AS number_of_invoices
            FROM
                service.my_values
            WHERE
                client_id = :clientId
                AND hour >= :from
                AND hour < :until
            GROUP BY
                client_id,
                2
            ORDER BY
                2 ASC;
            """)
                .setParameter("clientId", clientId)
                .setParameter("from", timeSpan.from())
                .setParameter("until", timeSpan.until())
                .setParameter("offset", timeUtils.getCurrentTZOffset(dt_2024_03_19T10_00))
                .setParameter("unitOfTimeString", DAY.name());

        var emResult = query.getResultList();

        assertThat(emResult).hasSize(1);

        Object[] row = (Object[]) emResult.get(0);
        assertThat(row[0]).isEqualTo(clientId);
        assertThat(row[1]).isEqualTo(Timestamp.from(expectedDayBoundary));
        assertThat(row[2]).isEqualTo(BigDecimal.valueOf(2));
        assertThat(row[3]).isEqualTo(BigDecimal.valueOf(2));

Then I get the expected result. If I run the same query in the console (using e.g. psql), I also get the expected result.

Using spring.jpa.show-sql=true and spring.jpa.properties.hibernate.format_sql=true, I see the following SQL call logged:

Hibernate: 
    /* dynamic native SQL query */ SELECT
        client_id,
        date_trunc(?, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * ?) AS hour,
        SUM(gross_revenue) AS gross_revenue,
        SUM(number_of_invoices) AS number_of_invoices 
    FROM
        service.my_values 
    WHERE
        client_id = ?     
        AND hour >= ?     
        AND hour < ? 
    GROUP BY
        client_id,
        2 
    ORDER BY
        2 ASC; 
2025-03-07 17:05:26.311 TRACE 1007964 --- [    Test worker] org.hibernate.orm.jdbc.bind              : binding parameter (1:VARCHAR) <- [DAY]
2025-03-07 17:05:26.311 TRACE 1007964 --- [    Test worker] org.hibernate.orm.jdbc.bind              : binding parameter (2:INTEGER) <- [3]
2025-03-07 17:05:26.311 TRACE 1007964 --- [    Test worker] org.hibernate.orm.jdbc.bind              : binding parameter (3:UUID) <- [b5aeae94-5e69-4775-a8ea-b5265619cda7]
2025-03-07 17:05:26.311 TRACE 1007964 --- [    Test worker] org.hibernate.orm.jdbc.bind              : binding parameter (4:TIMESTAMP_UTC) <- [2024-03-19T03:00:00Z]
2025-03-07 17:05:26.312 TRACE 1007964 --- [    Test worker] org.hibernate.orm.jdbc.bind              : binding parameter (5:TIMESTAMP_UTC) <- [2024-03-20T02:59:59.999999999Z]

which is exactly the query I was expecting.

Here are my entity definitions:

@Data
@NoArgsConstructor
@AllArgsConstructor
@Entity
@IdClass(MyKey.class)
@Table(name = "my_values")
public class MyValue {

    @Id
    @JsonIgnore
    private UUID clientId;
    @Id
    private Instant hour;
    @Column(precision = BigDecimalColumnDef.PRECISION, scale = BigDecimalColumnDef.SCALE)
    private BigDecimal grossRevenue;
    private long numberOfInvoices;
}

Id class:

@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class MyKey {
    private UUID clientId;
    private Instant hour;
}

And this is how the failing test which contains this issue creates the database entries:

List<MyValue> entries = List.of(
        /* not in day */ new MyValue(clientId, dt_2024_03_19T01_00, BigDecimal.ONE, 1L),
        new MyValue(clientId, dt_2024_03_19T12_00, BigDecimal.ONE, 1L),
        new MyValue(clientId, dt_2024_03_20T02_00, BigDecimal.ONE, 1L),
        /* not in day */ new MyValue(clientId, dt_2024_03_20T05_00, BigDecimal.ONE, 1L)
);
repository.saveAll(entries);
repository.flush();

(note to avoid confusion: the datetime variables represent times in a +01:00 time zone, but are Instants that match the above UTC times in the INSERT statements)
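To make that note concrete, here is a small sketch of how a wall-clock time at +01:00 maps to the UTC Instant that is stored. How the dt_... test variables are actually built is not shown in this report, so the helper below (atPlusOne) is purely an assumption for illustration.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical illustration: the real test fixtures may be constructed differently.
final class OffsetInstants {

    // Converts a wall-clock hour at offset +01:00 into the corresponding Instant.
    static Instant atPlusOne(int year, int month, int day, int hour) {
        return OffsetDateTime.of(year, month, day, hour, 0, 0, 0, ZoneOffset.ofHours(1)).toInstant();
    }

    public static void main(String[] args) {
        // Counterparts of dt_2024_03_19T01_00, dt_2024_03_19T12_00,
        // dt_2024_03_20T02_00 and dt_2024_03_20T05_00.
        System.out.println(atPlusOne(2024, 3, 19, 1));
        System.out.println(atPlusOne(2024, 3, 19, 12));
        System.out.println(atPlusOne(2024, 3, 20, 2));
        System.out.println(atPlusOne(2024, 3, 20, 5));
    }
}
```

Each value is one hour earlier in UTC than its variable name suggests, which lines up with the timestamps in the INSERT statement.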


Seeing how:

  1. JPA generates the query I was expecting, and
  2. The native query via entityManager.createNativeQuery() returns the correct results

this seems to be a Spring Data JPA issue.

I have almost exactly the same query (against different tables with a very similar structure) in other repositories, and those work, which is why I'm a bit baffled by this behaviour. One such query, which looks practically identical, is as follows:

    @Query(value = """
            SELECT
                client_id,
                date_trunc(:#{#unitOfTime.name()}, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * :offset) AS hour,
                SUM(gross_revenue) AS gross_revenue,
                SUM(net_revenue) AS net_revenue
            FROM
                service.my_other_values
            WHERE
                client_id = :clientId
                AND hour >= :from
                AND hour < :until
            GROUP BY
                client_id,
                2
            ORDER BY
                2 ASC;
            """, nativeQuery = true)
    Stream<MyOtherValue> findOtherValues(
        @Param("clientId") UUID clientId,
        @Param("from") Instant from,
        @Param("until") Instant until,
        @Param("offset") int offset,
        @Param("unitOfTime") UnitOfTime unitOfTime);

with the following entity:

@Data
@NoArgsConstructor // Required by JPA
@AllArgsConstructor
@Entity
@IdClass(MyOtherKey.class)
@Table(name = "my_other_values")
@NamedNativeQueries({ /* named native queries used for other methods */ })
public class MyOtherValue {

    @Id
    @JsonIgnore
    private UUID clientId;
    @Id
    private Instant hour;
    @Column(precision = BigDecimalColumnDef.PRECISION, scale = BigDecimalColumnDef.SCALE)
    private BigDecimal grossRevenue;
    @Column(precision = BigDecimalColumnDef.PRECISION, scale = BigDecimalColumnDef.SCALE)
    private BigDecimal netRevenue;

}

and Id class:

@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class MyOtherKey {
    private UUID clientId;
    private Instant hour;
}

Unfortunately, I'm not sure how to debug this as JPA/Hibernate stacks are very deep and difficult (IMO) to debug unless you have a lot of experience with it.

I unfortunately cannot provide a reproducible sample at this time, but if that's the only way to move forward, I can spend some time next week preparing a minimal reproducible example. In the meantime, if there's anything I could try, I would appreciate suggestions.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Mar 7, 2025
@filpano filpano changed the title Native query returns different result when run via JPA vs. console Native query returns different result when run via JPA vs. console/entityManager Mar 7, 2025
@filpano filpano changed the title Native query returns different result when run via JPA vs. console/entityManager Native query returns different result when run via JPA native query vs. console/entityManager Mar 7, 2025
@filpano filpano changed the title Native query returns different result when run via JPA native query vs. console/entityManager Native query returns different result when run via JPA vs. console/entityManager Mar 7, 2025
@mp911de
Member

mp911de commented Mar 10, 2025

If you would like us to spend some time helping you to diagnose the problem, please spend some time describing it and, ideally, providing a minimal yet complete sample that reproduces the problem.
You can share it with us by pushing it to a separate repository on GitHub or by zipping it up and attaching it to this issue.

@mp911de mp911de added the status: waiting-for-feedback We need additional information before we can continue label Mar 10, 2025
@spring-projects-issues

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

@spring-projects-issues spring-projects-issues added the status: feedback-reminder We've sent a reminder that we need additional information before we can continue label Mar 17, 2025
@filpano
Author

filpano commented Mar 19, 2025

Sorry for the delay.

spring_data_jpa_3805.zip

I've attached a demo project that shows this issue in the provided test. This should be readily importable into IntelliJ. Should you run into issues with other IDEs, please let me know.

The issue can be seen by running the com.example.demo.TestRepositoryTest#testRepositoryMethod test method.

Note: this demo project uses Spring Boot 3.2.5. Unfortunately, neither 3.3.9 nor 3.4.3 fixes the issue. I was not able to try 3.5.0-SNAPSHOT, as the initializr Gradle config gave me an error with the snapshot version.

Should it be necessary, the same query can be run in the shell using the following steps (note: assumes that no other containers using the postgres:15-alpine image are running):

  1. Put a breakpoint anywhere in the above test method
  2. Run the following commands:
# Exec into container
CONTAINER_ID=$(docker ps | grep "postgres:15-alpine" | cut -d' ' -f1); docker exec -it ${CONTAINER_ID} bash
# Switch to postgres user and then run psql as default user "test"
su postgres
export PGPASSWORD=test
psql -U test

Afterwards, run the following query in the psql shell (the Flyway migration files already insert all required data):

SELECT
    client_id,
    date_trunc('DAY', hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * 4) AS hour,
    SUM(decimal_column) AS decimal_column,
    SUM(counter) AS counter
FROM
    test_service.hourly_test
WHERE
    client_id = 'b5aeae94-5e69-4775-a8ea-b5265619cda7'
    AND hour >= '2024-03-19T03:00:00Z'
    AND hour < '2024-03-20T02:59:59.999Z'
GROUP BY
    client_id,
    2
ORDER BY
    2 ASC;

For me, this results in the following output:

              client_id               |        hour         | decimal_column | counter 
--------------------------------------+---------------------+----------------+---------
 b5aeae94-5e69-4775-a8ea-b5265619cda7 | 2024-03-19 00:00:00 |          2.000 |       2
(1 row)

which is also what I expected.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue status: feedback-reminder We've sent a reminder that we need additional information before we can continue labels Mar 19, 2025
@mp911de
Member

mp911de commented Mar 20, 2025

Postgres query logs: The first query is from the repository, the second one via EntityManager

2025-03-20 10:26:24.946 CET [58814] LOG:  execute <unnamed>: /* dynamic native SQL query */ SELECT
	    client_id,
	    date_trunc($1, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * $2) AS hour,
	    SUM(decimal_column) AS decimal_column,
	    SUM(counter) AS counter
	FROM
	    test_service.hourly_test
	WHERE
	    client_id = $3
	    AND hour >= $4
	    AND hour < $5
	GROUP BY
	    client_id,
	    2
	ORDER BY
	    2 ASC
2025-03-20 10:26:24.946 CET [58814] DETAIL:  parameters: $1 = 'DAY', $2 = '4', $3 = 'b5aeae94-5e69-4775-a8ea-b5265619cda7', $4 = '2024-03-19 04:00:00+01', $5 = '2024-03-20 03:59:59.999+01'
2025-03-20 10:26:24.953 CET [58814] LOG:  execute <unnamed>: /* dynamic native SQL query */ SELECT
	    client_id,
	    date_trunc($1, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * $2) AS hour,
	    SUM(decimal_column) AS decimal_column,
	    SUM(counter) AS counter
	FROM
	    test_service.hourly_test
	WHERE
	    client_id = $3
	    AND hour >= $4
	    AND hour < $5
	GROUP BY
	    client_id,
	    2
	ORDER BY
	    2 ASC
2025-03-20 10:26:24.953 CET [58814] DETAIL:  parameters: $1 = 'DAY', $2 = '4', $3 = 'b5aeae94-5e69-4775-a8ea-b5265619cda7', $4 = '2024-03-19 04:00:00+01', $5 = '2024-03-20 03:59:59.999+01'

Interestingly, the mismatch only happens when data is inserted or updated in the same transaction. Running the query against existing data doesn't cause the problem to show up.

@mp911de
Member

mp911de commented Mar 20, 2025

Alright, so the difference comes from cached entities. Spring Data JPA calls EntityManager.createNativeQuery(sql, TestEntity.class), while your EntityManager query returns Object[]. Because entities were saved earlier in the same session, Hibernate reuses them for the query result: it sees that a row carries an identifier for which the session cache already contains an entity instance, so it returns that instance instead of creating a new one from the row.
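The effect can be pictured with a toy identity map. The sketch below is plain Java and not Hibernate code; every name in it (Session, materialize, and so on) is hypothetical, and it only models the idea that a row whose id is already managed resolves to the existing instance. It also assumes, as an inference from the data in this issue, that the aggregated row's (client_id, hour) pair coincides with the id of the first saved entry (2024-03-19T00:00:00Z), which would explain why sums of 1 come back instead of 2.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model of a session-level identity map. NOT Hibernate code; all names are hypothetical.
final class IdentityMapSketch {

    record Key(UUID clientId, Instant hour) {}

    record Value(Key key, BigDecimal grossRevenue, long numberOfInvoices) {}

    static final class Session {
        private final Map<Key, Value> managed = new HashMap<>();

        // Models saving an entity: the instance becomes managed under its id.
        void persist(Value value) {
            managed.put(value.key(), value);
        }

        // Models turning a result row into an entity: if the id is already
        // managed, the row's remaining columns are ignored.
        Value materialize(Key key, BigDecimal grossRevenue, long numberOfInvoices) {
            Value existing = managed.get(key);
            if (existing != null) {
                return existing;
            }
            Value fresh = new Value(key, grossRevenue, numberOfInvoices);
            managed.put(key, fresh);
            return fresh;
        }
    }

    static final Key KEY = new Key(
            UUID.fromString("b5aeae94-5e69-4775-a8ea-b5265619cda7"),
            Instant.parse("2024-03-19T00:00:00Z"));

    // Save first, then query in the same session: the saved instance wins.
    static Value queryAfterSaveInSameSession() {
        Session session = new Session();
        session.persist(new Value(KEY, BigDecimal.ONE, 1));
        return session.materialize(KEY, BigDecimal.valueOf(2), 2);
    }

    // Query against an empty session: the aggregated row is used as-is.
    static Value queryInFreshSession() {
        return new Session().materialize(KEY, BigDecimal.valueOf(2), 2);
    }

    public static void main(String[] args) {
        System.out.println(queryAfterSaveInSameSession());
        System.out.println(queryInFreshSession());
    }
}
```

In this model the same-session call returns the previously saved values, while the fresh-session call returns the aggregated ones, matching the observation that the mismatch only appears when data was written in the same transaction.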

You can verify this by capturing the result of repository.saveAll(…) and comparing object identity with the instances returned by the query (or by disabling Lombok's hashCode generation, so that the default identity hash codes can be compared).

This is rather surprising to me. You might want to get in touch with the Hibernate team to see whether there's a good explanation for this behavior.

That being said, there's nothing we can do here.

@mp911de mp911de closed this as not planned Won't fix, can't repro, duplicate, stale Mar 20, 2025
@mp911de mp911de added for: external-project For an external project and not something we can fix and removed status: waiting-for-triage An issue we've not yet triaged status: feedback-provided Feedback has been provided labels Mar 20, 2025
@mp911de mp911de self-assigned this Mar 20, 2025
@mp911de mp911de changed the title Native query returns different result when run via JPA vs. console/entityManager Native query EntityManager.createNativeQuery(…, EntityType.class) returns different result when run via EntityManager.createNativeQuery(…) Mar 20, 2025
@filpano
Author

filpano commented Mar 20, 2025

Thank you for the quick and insightful analysis. I'll raise an issue in Hibernate.

Edit: https://hibernate.atlassian.net/browse/HHH-19277
