Refactored task manifest persistence to use JPA #3614

mminella · 2019-11-04T22:45:02Z

This commit refactors the persistence mechanism of the task execution
manifest to use JPA instead of raw JDBC. This was done to be consistent
with the rest of the Spring Cloud Data Flow entities.

Resolves #3560

ilayaperumalg · 2019-11-05T09:17:58Z

...taflow-core/src/main/java/org/springframework/cloud/dataflow/core/TaskExecutionManifest.java

+	public String toString() {
+		return new ToStringCreator(this)
+				.append("manifest", this.manifest)
+				.append("platformName", this.manifest.getPlatformName())


I think we need the taskDeploymentRequest as well here

I thought we were trying to return the String representation of the TaskExecutionManifest here. Now that I think about it again, the manifest would simply return the default toString() of Manifest, as it doesn't have an explicit one. Also, it would be nice if we called this.manifest.toString() and had the toString() implementation inside the Manifest subclass.

If we decide to have a toString() representation of the entire Manifest, we need to invoke the sanitization methods to mask the underlying properties as well.

Either way, even if we use the taskDeploymentRequest, we'd still need to sanitize. If you are ok with it, I'd rather just use the whole manifest and sanitize the values.

"I'd rather just use the whole manifest and sanitize the values."

Yes, this is a better option.

I actually just removed the toString method altogether. I wasn't using it. The reason it was there was that the PoC that created that class just used it to do the serialization. Since this class didn't have visibility to the TaskSerializer class (and I didn't want to move it), I just removed the method.
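For readers following the sanitization point above, here is a minimal sketch of masking sensitive values before they reach a string representation. Everything in it is hypothetical: the class name, the list of sensitive key fragments, and the mask string are assumptions made for illustration and are not taken from TaskSanitizer or any other class in the project.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the project's TaskSanitizer.
final class ManifestPropertySanitizerSketch {

	// Assumed key fragments that mark a property as sensitive.
	private static final List<String> SENSITIVE_KEY_PARTS = List.of("password", "secret", "token", "credentials");

	// Assumed replacement value for masked properties.
	private static final String MASK = "******";

	private ManifestPropertySanitizerSketch() {
	}

	/** Returns a copy of the properties with sensitive values replaced by the mask. */
	static Map<String, String> sanitize(Map<String, String> properties) {
		Map<String, String> sanitized = new LinkedHashMap<>();
		for (Map.Entry<String, String> entry : properties.entrySet()) {
			String key = entry.getKey().toLowerCase(Locale.ROOT);
			boolean sensitive = SENSITIVE_KEY_PARTS.stream().anyMatch(key::contains);
			sanitized.put(entry.getKey(), sensitive ? MASK : entry.getValue());
		}
		return sanitized;
	}

	/** Builds a toString-style description from already sanitized values. */
	static String describe(String platformName, Map<String, String> deploymentProperties) {
		return "manifest[platformName=" + platformName
				+ ", deploymentProperties=" + sanitize(deploymentProperties) + "]";
	}

	public static void main(String[] args) {
		Map<String, String> properties = new LinkedHashMap<>();
		properties.put("app.timestamp.format", "yyyy");
		properties.put("spring.datasource.password", "s3cr3t");
		System.out.println(describe("default", properties));
	}
}
```

Because the helper is stateless, its methods can be static, so callers do not need to hold an instance just to mask values.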

ilayaperumalg · 2019-11-05T09:18:58Z

...taflow-core/src/main/java/org/springframework/cloud/dataflow/core/TaskExecutionManifest.java

+		return new ToStringCreator(this)
+				.append("manifest", this.manifest)
+				.append("platformName", this.manifest.getPlatformName())
+				.append("subTaskDeploymentRequests", this.manifest.getSubTaskDeploymentRequests())


Can we add a check if the list is empty before appending?

@ilayaperumalg In your CTR testing, did you see if this field was even used? If it wasn't, we can remove it completely.

I missed noticing that the subTaskDeploymentRequests are not used at all :-) All we need here is to use the original appDeploymentRequest as TaskDeploymentRequest, and the CTR will take care of launching the corresponding apps. This means the entire subTaskDeploymentRequests can be removed from the TaskExecutionManifest.

I believe this was done in another commit that I'll pick up when I rebase...

jvalkeal · 2019-11-05T09:36:40Z

spring-cloud-dataflow-core/pom.xml

@@ -28,10 +28,26 @@
 			<groupId>org.springframework.cloud</groupId>
 			<artifactId>spring-cloud-deployer-spi</artifactId>
 		</dependency>
+		<dependency>
+			<groupId>org.springframework.cloud</groupId>


We should try not to move things from the registry package to the core package, as we've been trying to keep deps tidy in the core package.

I didn't want to move that stuff around...but in order for the JSON serialization to work I didn't have a choice. I'm open to other ideas.

Can you explain why serialization wouldn't work?

The ObjectMapper needs to have visibility to the classes that are being serialized/deserialized. In this case, embedded within the manifest is the Resource for the app. That means we need access to those classes.

Now that the objectMapper is customized in spring-cloud-dataflow-server-core, ResourceDeserializer could also live there, where it sees everything, instead of in spring-cloud-dataflow-core. Then you don't need to move this stuff around. Unless I missed something else, wouldn't that work?

ilayaperumalg · 2019-11-05T09:38:55Z

...-rest-resource/src/main/java/org/springframework/cloud/dataflow/rest/util/TaskSanitizer.java

@@ -42,13 +42,15 @@ public TaskExecution sanitizeTaskExecutionArguments(TaskExecution taskExecution)
 		return taskExecution;
 	}

-	public TaskManifest sanitizeTaskManifest(TaskManifest taskManifest) {
+	public TaskExecutionManifest sanitizeTaskManifest(TaskExecutionManifest taskManifest) {


We can rename this method as well (to have TaskExecutionManifest)

Is there a reason the methods in this class are not static?

Done (although still wondering why these methods are not static)

I don't see any reason why this can't be static. I think we are ok both ways.

Well, it would be nice to not have instances of this class floating around everywhere it's needed if we can make them static. If you're ok with it, I'll make the change.

I agree. Though it is only used by the task configuration beans, it makes sense to have a static method for this. Please go ahead!

jvalkeal · 2019-11-05T09:40:00Z

...taflow-core/src/main/java/org/springframework/cloud/dataflow/core/TaskManifestConverter.java

+	private ObjectMapper objectMapper;
+
+	public TaskManifestConverter() {
+		this.objectMapper = new ObjectMapper();


We've been trying to use a single shared object mapper which gets configured the Boot way and is given to us by Boot. This way, the configuration around it is not scattered in different places.

Can you make a JPA converter a bean so that you can use DI? If so, I'd be interested in learning how.

Shouldn't it just work as a bean, per github.com/spring-projects/spring-framework/issues/20852?

Thanks for the pointer (never knew that)! I'll update.

ilayaperumalg · 2019-11-05T09:41:44Z

...c/main/java/org/springframework/cloud/dataflow/server/repository/TaskManifestRepository.java

+ *
+ * @since 2.3
+ */
+public interface TaskManifestRepository extends PagingAndSortingRepository<TaskExecutionManifest, Long> {


TaskManifestRepository -> TaskExecutionManifestRepository ?

jvalkeal · 2019-11-05T09:43:44Z

There are a lot of merge conflicts in DefaultTaskExecutionService due to changes that just got merged into master.

ilayaperumalg · 2019-11-05T09:44:07Z

...in/java/org/springframework/cloud/dataflow/server/service/impl/DefaultTaskDeleteService.java

+			numberOfDeletedTaskManifestRows += this.taskManifestRepository.deleteTaskExecutionManifestByTaskExecutionId(taskExecutionId);
+		}
+
+		this.entityManager.flush();


Could you elaborate on why this is necessary?

The delete below, done via dataflowTaskExecutionDao, would not even be in the same transaction, afaik, as JdbcTemplate and Hibernate are probably working via different JDBC connections. I don't think you can do delete operations like this.

@jvalkeal The JpaTransactionManager (which SCDF is using) supports the same transaction across the two. You can read more about it in the documentation for it here: https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/orm/jpa/JpaTransactionManager.html

@ilayaperumalg The reason it is required is because the EntityManager won't execute the query until the flush happens. Because of this, without the flush, the following call to delete the task execution records fails due to the constraint between the manifest table and the task execution table.

Apart from TaskExecutionsDocumentation.taskExecutionRemoveAndTaskDataRemove(), which seems to fail every time when I comment out that flush, there are no test failures. It might work better if you try to delete the actual entity instead of trying to create a query for it. There has to be a better way than hacking into the EntityManager.

In Skipper's Release entity we used these on a field referencing another table:

@OneToOne(cascade = { CascadeType.ALL })
@JoinColumn(foreignKey = @ForeignKey(name = "fk_release_manifest"))
private Manifest manifest;

Not sure if it helps but we need something to tell hibernate that it should not preoptimize db calls.

I'm not sure I follow. What do you mean by "delete the actual entity"? That's what I'm doing. Unless you're proposing I do a find for the entity and then a delete, but that will result in the same issue. For the Skipper example you are sharing, those are all entities, correct? Which would mean that JPA would handle it all. The TaskExecution that the TaskExecutionManifest references is not a JPA entity, which is why this issue arises in the first place.

With deleteTaskExecutionManifestByTaskExecutionId you're essentially creating a query, which works differently than if you get access to a full entity object and then delete it by passing it to the taskExecutionManifestRepository delete method. Just a theory about when things might work better. It's difficult to try different ways to make this work better, as no tests fail if that flush is removed. We should not have this kind of workaround in place if we cannot test it.

Looks like I was able to make it work without an explicit flush by adding explicit new propagation to TaskExecutionManifestRepository:

@Transactional(propagation = Propagation.REQUIRES_NEW)
int deleteTaskExecutionManifestByTaskExecutionId(long taskExecutionid);

See if that works for you as well!

There were some notes at https://docs.jboss.org/hibernate/core/3.3/reference/en/html/objectstate.html#objectstate-flushing so I had the idea that if we wrap this in a new transaction, Hibernate then sees that boundary and flushes. It's still a hack, but we'd be working with the system, not around it.

I can add a test to validate the transaction rollback for the method (We don't do this anywhere else that I'm aware of and this PR should be functionally the same as the code before which is why I didn't add it).

Changing the transaction propagation would be incorrect since the entire service method needs to be in a single transaction, committed or rolled back as one. The current transaction as configured should work that way as coded. Changing the propagation would prevent that I think.

I don't see how adding an explicit call to flush is working around it. That documentation you point out explicitly states that a call to flush is a way to make this work.

All of this being said, this is one line of code. I brought this up in the slack channel before adding it and @markpollack seemed ok with it at the time. Any of these other methods seem like overkill for such a simple thing IMHO.
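To make the ordering problem discussed in this thread concrete, here is a toy model of it, written under stated assumptions. It is not Hibernate, Spring Data, or SCDF code: ToyDatabase, ToyPersistenceContext, and every method on them are hypothetical stand-ins. The sketch only models the behavior described above, where the manifest delete is queued until a flush while the task execution delete runs immediately against a table protected by a constraint.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Hibernate, Spring Data, or SCDF types.
final class FlushOrderingSketch {

	/** Stand-in for the database, with a constraint from manifest rows to task execution rows. */
	static final class ToyDatabase {
		final Map<Long, String> taskExecutions = new HashMap<>();
		// manifest id -> referenced task execution id
		final Map<Long, Long> manifests = new HashMap<>();

		/** Stand-in for the direct delete: fails while a manifest row still references the execution. */
		void deleteTaskExecution(long executionId) {
			if (manifests.containsValue(executionId)) {
				throw new IllegalStateException("constraint violation: manifest references execution " + executionId);
			}
			taskExecutions.remove(executionId);
		}
	}

	/** Stand-in for a persistence context that queues deletes until flush() is called. */
	static final class ToyPersistenceContext {
		private final ToyDatabase database;
		private final List<Long> pendingManifestDeletes = new ArrayList<>();

		ToyPersistenceContext(ToyDatabase database) {
			this.database = database;
		}

		void deleteManifestsByTaskExecutionId(long executionId) {
			// Queued only; nothing reaches the toy database yet.
			pendingManifestDeletes.add(executionId);
		}

		void flush() {
			for (long executionId : pendingManifestDeletes) {
				database.manifests.values().removeIf(referenced -> referenced == executionId);
			}
			pendingManifestDeletes.clear();
		}
	}

	/** Returns true if the task execution row could be deleted. */
	static boolean deleteTaskExecution(boolean flushBeforeDirectDelete) {
		ToyDatabase database = new ToyDatabase();
		database.taskExecutions.put(1L, "timestamp-task");
		database.manifests.put(10L, 1L);

		ToyPersistenceContext context = new ToyPersistenceContext(database);
		context.deleteManifestsByTaskExecutionId(1L);
		if (flushBeforeDirectDelete) {
			context.flush();
		}
		try {
			database.deleteTaskExecution(1L);
			return true;
		}
		catch (IllegalStateException ex) {
			return false;
		}
	}

	public static void main(String[] args) {
		System.out.println("with flush: " + deleteTaskExecution(true));
		System.out.println("without flush: " + deleteTaskExecution(false));
	}
}
```

In this model the second delete succeeds only when the queued manifest delete has been pushed to the database first, which is the argument made for the explicit flush. Whether a real persistence provider queues a given delete depends on how the delete is issued and on the flush mode, so the sketch should not be read as a statement about Hibernate's actual behavior.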

ilayaperumalg · 2019-11-05T12:06:18Z

For consistency, we can rename all the places of TaskManifest to TaskExecutionManifest.

ilayaperumalg · 2019-11-05T12:09:08Z

...taflow-core/src/main/java/org/springframework/cloud/dataflow/core/TaskExecutionManifest.java

+ * @since 2.3
+ */
+@Entity
+@Table(name = "TaskExecutionMetadata")


I think this is a good time to change the name of the table to TaskExecutionManifest.

ilayaperumalg · 2019-11-05T12:10:15Z

...taflow-core/src/main/java/org/springframework/cloud/dataflow/core/TaskExecutionManifest.java

+ */
+@Entity
+@Table(name = "TaskExecutionMetadata")
+@EntityListeners(AuditingEntityListener.class)


I don't think we use any of the auditing listeners. If we don't use it, we can remove @EntityListeners(AuditingEntityListener.class)

ilayaperumalg · 2019-11-07T10:44:02Z

LGTM

sabbyanandan · 2020-01-21T15:47:16Z

We will issue a new PR in the future.

Refactored task manifest persistence to use JPA

4c0b5d2

ilayaperumalg reviewed Nov 5, 2019

jvalkeal reviewed Nov 5, 2019

ilayaperumalg reviewed Nov 5, 2019

jvalkeal reviewed Nov 5, 2019

ilayaperumalg reviewed Nov 5, 2019

sabbyanandan assigned jvalkeal and ilayaperumalg Nov 5, 2019

mminella added 2 commits November 5, 2019 16:59

Renamed taskManifest to taskExecutionManifest through out

5fa2377

Refinements based on comments

269ead3

mminella added 4 commits November 7, 2019 09:44

Updated to put some things back where they were

3241c35

Removed commented out code and a sysout

dd46f70

Moving things back to where they were

8cd1021

Moved classes back to where they were

a3a335f

sabbyanandan closed this Jan 21, 2020
