-
Notifications
You must be signed in to change notification settings - Fork 590
Refactored task manifest persistence to use JPA #3614
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit refactors the persistence mechanism of the task execution manifest to use JPA instead of raw JDBC. This was done to be consistent with the rest of the Spring Cloud Data Flow entities. Resolves spring-cloud#3560
public String toString() { | ||
return new ToStringCreator(this) | ||
.append("manifest", this.manifest) | ||
.append("platformName", this.manifest.getPlatformName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need the taskDeploymentRequest
as well here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we are trying to return the String
representation of the TaskExecutionManifest
here. Now that I think of it again, the manifest
would simply return the toString() of Manifest
as it doesn't have an explicit one. Also, it would be nice we call this.manifest.toString()
and have the toString()
implementation inside the Manifest
subclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we decide to have a toString()
representation of the entire Manifest
, we need to have invoked the sanitization methods to mask the underlying properties as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way, even if we use the taskDeploymentRequest
, we'd still need to sanitize. If you are ok with it, I'd rather just use the whole manifest and sanitize the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather just use the whole manifest and sanitize the values.
Yes, this is a better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually just removed the toString method all together. I wasn't using it. The reason it was there was the PoC that created that class just used that to do the serialization. Since this class didn't have visibility to the TaskSerializer class (and I didn't want to move it), I just removed the method.
return new ToStringCreator(this) | ||
.append("manifest", this.manifest) | ||
.append("platformName", this.manifest.getPlatformName()) | ||
.append("subTaskDeploymentRequests", this.manifest.getSubTaskDeploymentRequests()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a check if the list is empty before appending?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ilayaperumalg In your CTR testing, did you see if this field we even used? If it wasn't, we can remove it completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed noticing that the subTaskDeploymentRequests
are not used at all :-) All we need here is to use the original appDeploymentRequest
as TaskDeploymentRequest and the CTR
will take care launching the corresponding apps. This means, the entire subTaskDeploymentRequests
can be removed from the TaskExecutionManifest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was done in another commit that I'll pick up when I rebase...
spring-cloud-dataflow-core/pom.xml
Outdated
@@ -28,10 +28,26 @@ | |||
<groupId>org.springframework.cloud</groupId> | |||
<artifactId>spring-cloud-deployer-spi</artifactId> | |||
</dependency> | |||
<dependency> | |||
<groupId>org.springframework.cloud</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try not to move things from registry to core package as we're been trying to keep deps tidy in core package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to move that stuff around...but in order for the JSON serialization to work I didn't have a choice. I'm open to other ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why serialization wouldn't work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ObjectMapper needs to have visibility to the classes that are being serialzied/deserialized. In this case, embedded within the manifest are the Resource
for the app. That means we need access to those classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the objectMapper
is customized in spring-cloud-dataflow-server-core
, ResourceDeserializer
could also be there instead in spring-cloud-dataflow-core
where it sees everything. Thus you don't need to move this stuff around. Unless I missed something else wouldn't that work then.
@@ -42,13 +42,15 @@ public TaskExecution sanitizeTaskExecutionArguments(TaskExecution taskExecution) | |||
return taskExecution; | |||
} | |||
|
|||
public TaskManifest sanitizeTaskManifest(TaskManifest taskManifest) { | |||
public TaskExecutionManifest sanitizeTaskManifest(TaskExecutionManifest taskManifest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rename this method as well (to have TaskExecutionManifest)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason the methods in this class are not static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (although still wondering why these methods are not static)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any reason why this can't be static. I think we are ok both ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it would be nice to not have instances of this class floating around everywhere it's needed if we can make them static. If you're ok with it, I'll make the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Though it is only used by the task configuration beans, it makes sense to have a static method for this. Please go ahead!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
private ObjectMapper objectMapper; | ||
|
||
public TaskManifestConverter() { | ||
this.objectMapper = new ObjectMapper(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been trying to use a shared single object mapper which gets configured in a boot way and given to us from a boot. This way things around it are not scattered in different places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make a JPA converter a bean so that you can use DI? If so, I'd be interested in learning how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it just work as bean per github.com/spring-projects/spring-framework/issues/20852
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer (never knew that)! I'll update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* | ||
* @since 2.3 | ||
*/ | ||
public interface TaskManifestRepository extends PagingAndSortingRepository<TaskExecutionManifest, Long> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TaskManifestRepository -> TaskExecutionManifestRepository ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There's a lot of merge conflicts in |
numberOfDeletedTaskManifestRows += this.taskManifestRepository.deleteTaskExecutionManifestByTaskExecutionId(taskExecutionId); | ||
} | ||
|
||
this.entityManager.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on why this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete below done via dataflowTaskExecutionDao
would not even be in a same transaction, afaik, as JdbcTemplate
and hibernate are probably working via different jdbc connections. I don't think you can do delete operations like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jvalkeal The JpaTransactionManager
(which SCDF is using) supports the same transaction across the two. You can read more about it in the documentation for it here: https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/orm/jpa/JpaTransactionManager.html
@ilayaperumalg The reason it is required is because the EntityManager
won't execute the query until the flush happens. Because of this, without the flush, the following call to delete the task execution records fails due to the constraint between the manifest table and the task execution table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from TaskExecutionsDocumentation.taskExecutionRemoveAndTaskDataRemove()
which seem to fail all the time when I commented out that flush there's no test failures. It might work better if you try to delete the actual entity instead of trying to create a query for it. There has to be better way that hacking into EntityManager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a skipper's Release
entity we used these in a field referencing to other table:
@OneToOne(cascade = { CascadeType.ALL })
@JoinColumn(foreignKey = @ForeignKey(name = "fk_release_manifest"))
private Manifest manifest;
Not sure if it helps but we need something to tell hibernate that it should not preoptimize db calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. What do you mean by "delete the actual entity". That's what I'm doing. Unless you're proposing I do a find for the entity, then a delete but that will result in the same issue. For the skipper example you are sharing, those are all entitites, correct? Which would mean that JPA would handle it all. The TaskExecution
that the TaskExecutionManifest
is not a JPA entity which is why this issue arises in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With deleteTaskExecutionManifestByTaskExecutionId
you're essentially creating a query which work differently if you get access to a full entity object and then try to delete by passing that into taskExecutionManifestRepository
delete method. Just a theory when things might work better. It'd difficult to try different ways to make this work better as no tests fail if that flush
is removed. We should not have these kind of workaround in place which we cannot test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like I was able to work without explicit flush by adding explicit new propagation to TaskExecutionManifestRepository
:
@Transactional(propagation = Propagation.REQUIRES_NEW)
int deleteTaskExecutionManifestByTaskExecutionId(long taskExecutionid);
See if that work for you as well!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some notes at https://docs.jboss.org/hibernate/core/3.3/reference/en/html/objectstate.html#objectstate-flushing so had an idea that if we wrap this in a new transaction hibernate then sees that boundary and flushes. It's still a hack but we'd be working with a system, not around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I can add a test to validate the transaction rollback for the method (We don't do this anywhere else that I'm aware of and this PR should be functionally the same as the code before which is why I didn't add it).
- Changing the transaction propagation would be incorrect since the entire service method needs to be in a single transaction, committed or rolled back as one. The current transaction as configured should work that way as coded. Changing the propagation would prevent that I think.
- I don't see how adding an explicit call to flush is working around it. That documentation you point out explicitly states that a call to flush is a way to make this work.
All of this being said, this is one line of code. I brought this up in the slack channel before adding it and @markpollack seemed ok with it at the time. Any of these other methods seem like overkill for such a simple thing IMHO.
For consistency, we can rename all the places of |
* @since 2.3 | ||
*/ | ||
@Entity | ||
@Table(name = "TaskExecutionMetadata") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good time to change the name of the table to TaskExecutionManifest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
*/ | ||
@Entity | ||
@Table(name = "TaskExecutionMetadata") | ||
@EntityListeners(AuditingEntityListener.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we use any of the auditing listeners. If we don't use it, we can remove @EntityListeners(AuditingEntityListener.class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
LGTM |
We will issue a new PR in the future. |
This commit refactors the persistence mechanism of the task execution
manifest to use JPA instead of raw JDBC. This was done to be consistent
with the rest of the Spring Cloud Data Flow entities.
Resolves #3560