hadoop - How to package a python script with dependencies into a zip/tar?


I've got a Hadoop cluster that I'm doing data analytics on using numpy, scipy, and pandas. I'd like to be able to submit Hadoop jobs as a zip/tar file using the '--file' argument on the command line. The zip file should contain everything my Python program needs to execute, so that no matter which node in the cluster the script runs on, it won't hit an ImportError at runtime.

Due to company policies, installing these libraries on every node isn't feasible, especially for exploratory/agile development. I do have pip and virtualenv installed and can create sandboxes as needed, though.
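
Since pip is available, the bundling step I have in mind is roughly the sketch below: stage pure-Python dependencies into a directory with 'pip install -t deps <package>', then zip that directory's contents so each package sits at the root of the archive. The names (deps/, deps.zip) are placeholders, and this only helps for pure-Python packages:

    # build_deps_zip.py - rough sketch of the bundling step
    import os
    import zipfile

    def zip_dir_contents(src_dir, out_path):
        with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for root, _dirs, files in os.walk(src_dir):
                for name in files:
                    full = os.path.join(root, name)
                    # Archive path relative to src_dir, so 'import <pkg>'
                    # resolves once the zip itself is on sys.path.
                    zf.write(full, os.path.relpath(full, src_dir))

    if __name__ == "__main__":
        zip_dir_contents("deps", "deps.zip")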

I've looked at zipimport and Python packaging, but none of them seems to fulfill my needs, or rather I'm having difficulty using these tools.
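
Concretely, the zipimport route I've been attempting looks roughly like the mapper sketch below, which prepends the shipped zip to sys.path before importing from it. The file names are placeholders, and as far as I can tell this only works for pure-Python packages, which is exactly where numpy/scipy/pandas fall down:

    #!/usr/bin/env python
    # mapper.py - sketch of a Hadoop streaming mapper that imports its
    # dependencies from a zip shipped with the job (e.g. via '-file deps.zip'),
    # which lands in the task's working directory.
    import os
    import sys

    # Prepend the shipped archive so zipimport can resolve pure-Python
    # packages bundled inside it. Compiled extension modules (numpy,
    # scipy, pandas) cannot be loaded from a zip this way.
    sys.path.insert(0, os.path.join(os.getcwd(), "deps.zip"))

    # e.g. 'import dateutil' would now come from deps.zip if bundled there.

    for line in sys.stdin:
        sys.stdout.write(line)  # identity mapper, just to keep the sketch runnable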

Has anyone had luck doing this? I can't seem to find any success stories online.

Thanks!

I have solved a similar problem in an Apache Spark and Python context by creating a Docker image that has all the needed Python libraries and the Spark slave script installed. The image is distributed to the other machines, and when a container is started it auto-joins the cluster; we have one such image per host machine.

Our ever-changing Python project is submitted as a zip file along with the job, and imports work transparently from there. Luckily we rarely need to re-create the slave images, and we don't run jobs with conflicting requirements.
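
Roughly, the submit side looks like the sketch below; the application name, zip path, module name, and HDFS paths are placeholders for ours. addPyFile is the stock PySpark call that ships a zip to the executors and puts it on their sys.path:

    from pyspark import SparkContext

    sc = SparkContext(appName="analytics-job")
    sc.addPyFile("project.zip")  # pure-Python project code only

    def process(record):
        # Import inside the function so it happens on the executors,
        # where project.zip is already on sys.path.
        import myproject  # placeholder for the package inside project.zip
        return myproject.transform(record)

    sc.textFile("hdfs:///data/input").map(process).saveAsTextFile("hdfs:///data/output")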

I'm not sure how applicable this is in your scenario, though, since (in my understanding) your Python libraries must be compiled for the target machines, which a plain zip can't provide.

