Make use of URLConnection cache configurable in PlexusIoURLResource #12
Comments
Please read my comment at #2. |
Hi, how does enabling the cache speed up your build? Does that mean that without the cache a jar is downloaded multiple times (during the build, not necessarily by the Assembly plugin)? My point is that if a jar is downloaded multiple times when it is perfectly safe to use a cache, then the local .m2 cache should be used instead. So maybe that is the root cause of the issue. |
@plamentotev My current understanding is that the file is kept in memory and each class file is read from the cached jar. When the cache isn't used the 9.6MB eclipse-collections jar is read from disk once per class file which takes unnecessary time. The build does not download the file multiple times, the build works and takes the same amount of time with the network cable unplugged. |
@rbjorklin Did you check your assumption with VisualVM or similar? |
@michael-o With the help of YourKit we could see that the Maven build spent the majority of its time in this class, so we decided to try turning the cache on. We're currently using a snapshot build of plexus-io with the cache enabled and seeing good build performance with maven-assembly 3.1.0. |
@rbjorklin Thanks for the update. Willing to review a PR. |
@rbjorklin thanks for the information. Plexus Archiver uses If Plexus Archiver uses
diff --git a/src/main/resources/META-INF/plexus/components.xml b/src/main/resources/META-INF/plexus/components.xml
index 71d47ac..1c613b7 100644
--- a/src/main/resources/META-INF/plexus/components.xml
+++ b/src/main/resources/META-INF/plexus/components.xml
@@ -321,7 +321,7 @@
<role>org.codehaus.plexus.components.io.resources.PlexusIoResourceCollection</role>
<role-hint>jar</role-hint>
<!-- there is no implementation of PlexusIoJarFileResourceCollection, but PlexusIoZipFileResourceCollection will do the job -->
- <implementation>org.codehaus.plexus.archiver.zip.PlexusIoZipFileResourceCollection</implementation>
+ <implementation>org.codehaus.plexus.archiver.zip.PlexusArchiverZipFileResourceCollection</implementation>
<instantiation-strategy>per-lookup</instantiation-strategy>
</component>
<component>
As far as I know |
@plamentotev What is the difference between |
@michael-o if you ask why there are two implementations - all I know is this comment. If you ask what is the difference between the implementations:
At first sight it looks like they should be interchangeable, but maybe there is some subtle difference in behavior. |
That's really confusing because they are all ZIP files after all. |
Yes, that is why I think it is better to replace That being said, making the cache configurable may still make sense if |
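@plamentotev Just want to make sure I understood you correctly. The change you proposed was in the maven-assembly-plugin itself, right? If so then changing to PlexusArchiverZipFileResourceCollection made no performance difference unfortunately. So for now we're sticking with our patched plexus-io. Any suggestions on a sane way forward? |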
Excuse me, I didn't make clear which project I had in mind. The change is in Plexus Archiver (https://github.com/codehaus-plexus/plexus-archiver). The components.xml in the Assembly plugin controls only the sar files and has no effect on jars.
On Mon, May 28, 2018, 16:33 Robin wrote:
@plamentotev Just want to make sure I understood you correctly. The change you proposed was in the maven-assembly-plugin itself, right? If so then changing to PlexusArchiverZipFileResourceCollection made no performance difference unfortunately. So for now we're sticking with our patched plexus-io.
Any suggestions on a sane way forward?
|
Gotcha! Making the suggested changes to plexus-archiver did make the build performance on-par with our hack to turn the cache on in |
@plamentotev Is there anything required on my part for your suggested changes to be implemented and merged into master? |
@rbjorklin I suppose you mean using [1] I think those are Maven Assembly Plugin, Maven Dependency Plugin, Maven Shade Plugin, Maven JAR Plugin, Maven WAR Plugin, Spring Boot Maven Plugin. |
With the cache disabled, every close() on sun.net.www.protocol.jar.JarURLConnection.JarURLInputStream also closes the jar file. For jars with signatures inside, such as Bouncy Castle, reading every entry (i.e. reopening the jar) involves signature verification, which is why we see slow performance. See java.util.jar.JarFile#getInputStream. |
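The access pattern described in the comment above can be sketched as follows. This is a minimal illustration, not plexus-io code: the class name `JarUrlCacheDemo` and its helper methods are invented for the example, and the temporary jar is small and unsigned, so it shows only the per-entry `jar:` URL reads with `setUseCaches` on or off, not the signature verification cost. Timings will vary by machine and JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; not part of plexus-io or plexus-archiver.
public class JarUrlCacheDemo {

    /** Writes a small unsigned jar containing the given number of text entries. */
    public static Path createJar(int entries) throws IOException {
        Path jar = Files.createTempFile("cache-demo", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (int i = 0; i < entries; i++) {
                jos.putNextEntry(new JarEntry("entry" + i + ".txt"));
                jos.write(("content " + i).getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
        }
        return jar;
    }

    /** Reads every entry through its own jar: URL and returns the total bytes read. */
    public static long readAll(Path jar, int entries, boolean useCaches) throws IOException {
        long total = 0;
        for (int i = 0; i < entries; i++) {
            URL url = new URL("jar:" + jar.toUri() + "!/entry" + i + ".txt");
            URLConnection connection = url.openConnection();
            // false: the jar is opened and closed again for each entry.
            // true: the JVM may reuse one cached JarFile for all entries.
            connection.setUseCaches(useCaches);
            try (InputStream in = connection.getInputStream()) {
                total += in.readAllBytes().length;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int entries = 2000;
        Path jar = createJar(entries);
        for (boolean useCaches : new boolean[] {false, true}) {
            long start = System.nanoTime();
            long bytes = readAll(jar, entries, useCaches);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("useCaches=" + useCaches + ": " + bytes + " bytes in " + millis + " ms");
        }
    }
}
```

With a large or signed jar the gap between the two modes would be expected to grow, since each uncached read repeats the work of opening the archive.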
@slachiewicz thanks for the update - that makes sense. I wonder if PlexusArchiverZipFileResourceCollection verifies the signatures. I should check that - that may explain why it is not used for Jar based files. |
We have the same problem. Creating a jar with dependencies using Removing the following line speeds up our build from 6 minutes to 30 seconds: plexus-io/src/main/java/org/codehaus/plexus/components/io/resources/PlexusIoURLResource.java Line 42 in 8504fa1
I understand that caching should not be enabled by default. So how should this be made configurable? Do you think a system property ( I can submit a PR if there is agreement on how to solve it. |
@plamentotev. I know system properties are a bit ugly, but making it configurable through the entire Maven chain will take way more time. |
I investigated how this flag could be propagated from Maybe some shortcut would be a better solution in this case. One possibility would be to control the caching using a system property (as I mentioned before); another would be to add a public static flag to PlexusIoURLResource which could be set directly from AbstractAssemblyMojo. A significant disadvantage of not allowing users to set this flag from the plugin (i.e. offering only a system property) would be the really bad discoverability/documentation of this feature. |
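A rough sketch of the system-property option described in the comment above, for illustration only. The class `UrlCacheToggle` and the property name `example.url.useCaches` are made up for this example and are not plexus-io settings; the sketch only shows how a JVM-wide property could decide the value passed to `URLConnection.setUseCaches`, defaulting to no caching.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper; not the actual PlexusIoURLResource implementation.
public class UrlCacheToggle {

    // Hypothetical property name, invented for this sketch.
    public static final String PROPERTY = "example.url.useCaches";

    /** Opens a connection whose cache usage follows the system property (default: off). */
    public static URLConnection open(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        // Boolean.getBoolean returns true only if the property is set to "true".
        connection.setUseCaches(Boolean.getBoolean(PROPERTY));
        return connection;
    }

    public static void main(String[] args) throws IOException {
        // The file does not need to exist: openConnection() does not connect yet.
        URL url = new File("example.jar").toURI().toURL();
        System.out.println("useCaches = " + open(url).getUseCaches());
    }
}
```

Running with `-Dexample.url.useCaches=true` would switch caching on for every caller in the JVM, which illustrates the discoverability concern raised above: nothing in the plugin configuration reveals that the switch exists.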
In general I think if we're going to add the caching flag all the way to the plugins we should add it to the interface ( About the issue discussed here. The root cause for the performance problems is the Having a system property is really ugly IMHO, especially when we have a better option (like fixing the root cause of the performance issues). The only advantage is that it is quick. Correct me if I'm wrong, but I don't think this issue is critical enough to justify making the code ugly or hard to maintain only to fix it quickly. |
I did some more investigation. If I think that is the better solution and if you agree I can implement it (ETA this or next week). |
This issue is important for us, so I created our own version of this module (with caching enabled) which we can use until the performance problem is solved. Having the "right" long-term solution would be ideal; I just don't know exactly how it should be done. If you could do it, that would be really good. |
I've just verified that when using maven-assembly-plugin 3.1.1 we don't need the locally modified snapshot build of plexus-io that rbjorklin did for performance anymore. Thanks @plamentotev! @rbjorklin This issue can be closed. |
I've also verified that maven-assembly-plugin 3.1.1 (with default dependencies) works well. Thanks. |
I'll close the issue as the performance issue is solved. |
See #2 on why the cache was initially disabled.
Maven-assembly-plugin depends on plexus-io, and having the cache enabled does wonders for build performance when building a jar-with-dependencies and the dependencies are fairly large, such as eclipse-collections.
EDIT: https://github.com/codehaus-plexus/plexus-io/blob/master/src/main/java/org/codehaus/plexus/components/io/resources/PlexusIoURLResource.java#L42