Passing custom command to ShellSpoutSpec #299

Closed
macheins opened this issue Aug 9, 2016 · 2 comments

@macheins
Contributor

macheins commented Aug 9, 2016

Hello,

I tried to pass a custom command with spaces to ShellSpoutSpec, e.g. /usr/local/bin/python -m streamparse.run, but it seems that command does not support values containing spaces. I changed the code, but the supervisor exits with the following error:

java.io.IOException: Cannot run program "/usr/local/bin/python -m streamparse.run" (in directory "/opt/storm/storm-local/supervisor/stormdist/test-2-1470753751/resources"): error=2, No such file or directory

My intention for doing so is to distribute streamparse as part of my JAR with pip install -t src. Since all our code is self-contained and dockerized we can not and do not want to rely on virtualenv and ssh. Storm seems to execute streamparse_run in the root of the extracted JAR and PYTHONPATH is globally set to always contain ., so I think it might work.

Do you have any suggestions on how to make Storm use a copy of streamparse contained in the JAR?

Thank you very much.

FYI adding a custom start script to the root dir does not work because JARs do not preserve executable permissions.

@dan-blanchard
Member

I would really like to add support for using PEX files instead of virtualenvs to make supporting exactly this use case simpler (see #212), but I haven't had the time to do that yet. Also, since you pointed out that JARs don't preserve executable permissions, that is going to be tricky to do.

You are correct that command cannot contain spaces, which is why we use streamparse_run instead of python -m streamparse.run. You can see my complaints about that in the commit message for 3d2760a.
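The error=2 (No such file or directory) in the report above is consistent with the whole string being looked up as a single program name. The sketch below is only an analogy in Python, not Storm or streamparse code: it launches a process without a shell, first with the full string as one token and then split into program and arguments. The launch helper and the stand-in command are hypothetical.

```python
import shlex
import subprocess
import sys


def launch(argv):
    """Run argv without a shell; return the exit code, or the OS errno if
    the program named by argv[0] cannot be found."""
    try:
        return subprocess.run(argv, check=False).returncode
    except FileNotFoundError as exc:
        # No shell is involved, so argv[0] must name an existing executable.
        return exc.errno


# Stand-in for "/usr/local/bin/python -m streamparse.run": the current
# interpreter running a no-op, so the sketch does not need streamparse.
command = f"{shlex.quote(sys.executable)} -c pass"

# Whole string used as the program name: no such file exists.
as_one_token = launch([command])

# Split into program + arguments first: the interpreter starts normally.
as_argv = launch(shlex.split(command))

print(as_one_token, as_argv)
```

A single-word entry point such as streamparse_run avoids the problem because there is nothing to split.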

> Since all our code is self-contained and dockerized we can not and do not want to rely on virtualenv and ssh

What other people in your situation usually do is set up the virtualenv somewhere inside their container as part of their Dockerfile, and then set both install_virtualenv and use_ssh_for_nimbus to false, as described here.

An alternative is to also set use_virtualenv to false and just install all your requirements using /usr/local/bin/pip so that everything is available without activating a virtualenv at all. That is only recommended if you have a single container per topology, so that you never share the same Python executable across topologies.
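As a rough sketch of the two options above, the following Python snippet builds an environment entry with the three flags mentioned and prints it as JSON. The flag names come from this thread; their placement inside an envs entry of config.json, the environment name prod, and the host names are assumptions made for illustration, so check them against your own project's config.json.

```python
import json


def container_env(use_virtualenv):
    """Build a hypothetical environment entry for a containerized cluster."""
    return {
        # Placeholder hosts, not real machines.
        "nimbus": "nimbus.example.com",
        "workers": ["worker1.example.com"],
        # The virtualenv (if any) is baked into the image by the Dockerfile,
        # so streamparse should not try to create it.
        "install_virtualenv": False,
        # Talk to Nimbus directly instead of tunnelling over SSH.
        "use_ssh_for_nimbus": False,
        # False means requirements were installed globally with pip.
        "use_virtualenv": use_virtualenv,
    }


# Option 1: virtualenv prepared inside the container image.
with_venv = {"envs": {"prod": container_env(use_virtualenv=True)}}

# Option 2: no virtualenv at all, one container per topology.
without_venv = {"envs": {"prod": container_env(use_virtualenv=False)}}

config_json = json.dumps(without_venv, indent=4)
print(config_json)
```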

@macheins
Contributor Author

macheins commented Aug 9, 2016

Hey @dan-blanchard

We currently build Docker images containing all dependencies and spawn a dockerized cluster of exactly this image for each topology. So it's working OK, but if we were able to distribute streamparse with the JAR, this would enable us to submit all jobs to our global dockerized Storm cluster and avoid a lot of overhead.

Since changing command is not going to work for now, we will go half-way and spawn a global streamparse cluster where we will submit all self-contained topologies. Thank you very much for your answer.
