Files from classpath are not properly resolved when classpath JAR contains META-INF with references to other dependencies. #538
Description
Hello,
IntelliJ has a feature called dynamic.classpath, which is used when the classpath required to launch the JVM and program exceeds the maximum command-line length allowed by the OS's terminal. When this happens, IntelliJ prompts the user to activate the feature.
When it is active, a single JAR is put on the classpath. Its META-INF/MANIFEST.MF file references all the other required JARs (through the Class-Path attribute), instead of having them all laid out in 'java -classpath ...'.
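To make the mechanism concrete, here is a minimal sketch of how such a manifest-only JAR can be produced with the standard java.util.jar API. The names PathingJarSketch and writePathingJar are made up for illustration, and the JAR that IntelliJ actually generates may differ in detail.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.stream.Collectors;

// Hypothetical helper, not part of IntelliJ or the Dataflow SDK.
public class PathingJarSketch {

    /** Writes a JAR that holds no classes, only a manifest whose Class-Path lists the given JARs. */
    public static void writePathingJar(Path target, List<Path> dependencies) throws IOException {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        // Class-Path is a space-separated list of URLs.
        String classPath = dependencies.stream()
                .map(p -> p.toUri().toString())
                .collect(Collectors.joining(" "));
        main.put(Attributes.Name.CLASS_PATH, classPath);

        try (OutputStream out = Files.newOutputStream(target);
             JarOutputStream jar = new JarOutputStream(out, manifest)) {
            // Nothing else to add: the constructor already wrote META-INF/MANIFEST.MF.
        }
    }
}
```

Launching with only this JAR on 'java -classpath' still lets the JVM load classes from every referenced JAR, but code that only lists the entries of the classpath sees a single file.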
However, when it is used with Dataflow, the SDK finds only a single dependency that needs to be uploaded:
26 Jan 2017 13:39:24,496 [main]: com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.fromOptions(DataflowPipelineRunner.java:302) INFO {} - PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 1 files. Enable logging at DEBUG level to see which files will be staged.
As a result, the workflow shows up correctly in the Dataflow dashboard, with the pipeline and the nodes of the graph all laid out correctly, but nothing flows through the source. The logs show "Workers started correctly"; however, the worker logs accessible through Stackdriver show plenty of errors about starting up the containers, which keep being retried over and over again.
Steps to reproduce:
- Generate a new project using the Maven archetype in the Dataflow tutorial.
- Import the project into IntelliJ using the pom.xml that was generated.
- Then edit the .idea/workspace.xml file to contain
<component name="PropertiesComponent">
...
<property name="dynamic.classpath" value="true" />
...
</component>
- Run any of the WordCount samples - MinimalWordCount would be the simplest.
- The pipeline is now uploaded to Dataflow (only one file is uploaded to the staging area...) and nothing flows from the source.
Only setting it back to 'false' fixes it. My development environment is Windows, which I believe has a low limit on command-line length, which is why I had to use this feature at some point.
I am not aware how prevalent this issue might be with more complex JAR dependencies, but I have seen it happen because of the IntelliJ dynamic.classpath feature.
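One possible direction for a fix or workaround is to expand manifest references when collecting the files from the classpath. The following is only a sketch under stated assumptions: ClasspathExpander and expand are hypothetical names, the Class-Path entries are assumed to be file: URLs or paths relative to the JAR, and only one level of references is followed. It has not been verified against the Dataflow SDK.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the Dataflow SDK.
public class ClasspathExpander {

    /** Returns each classpath entry followed by the files its manifest Class-Path refers to. */
    public static List<String> expand(String classpath) throws IOException {
        List<String> files = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            files.add(entry);
            File file = new File(entry);
            // Directories and non-JAR files have no manifest to inspect.
            if (!file.isFile() || !entry.toLowerCase().endsWith(".jar")) {
                continue;
            }
            try (JarFile jar = new JarFile(file)) {
                Manifest manifest = jar.getManifest();
                if (manifest == null) {
                    continue;
                }
                String refs = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
                if (refs == null) {
                    continue;
                }
                // Relative references are resolved against the location of the JAR itself.
                URI base = file.toURI();
                for (String ref : refs.trim().split("\\s+")) {
                    if (ref.isEmpty()) {
                        continue;
                    }
                    // Assumption: the resolved URI uses the file: scheme; others would throw here.
                    files.add(Paths.get(base.resolve(ref)).toString());
                }
            }
        }
        return files;
    }
}
```

Calling ClasspathExpander.expand(System.getProperty("java.class.path")) would then yield the full list of JARs rather than the single wrapper JAR, and that list could presumably be supplied through the filesToStage option mentioned in the log message above.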
Cheers,
Dan