python - How to schedule the execution of spark-submit at a specific time
I have Spark batch processing code (basically, model training) that I execute with spark-submit on an AWS EMR cluster. I want to be able to launch this job each day at a specific time. What is the standard way to do it? Should I change the code and add scheduling inside the code? Or is there a way to schedule a spark-submit job? Or maybe I should make it a Spark Streaming job that is executed every 24 hours? (Though I am interested in a specific time slot, i.e. between 11:00pm and 12pm.)
If you are using Linux, you can set up a cron job that calls your spark-submit script: http://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/
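To make the cron suggestion concrete, here is a minimal sketch in Python that builds the crontab line for a daily run. The script path `/home/hadoop/train_model.py`, the helper names, and the `--master yarn --deploy-mode cluster` options are assumptions for illustration, not details from the question; adjust them to your cluster.

```python
import shlex

# Hypothetical location of the training script; replace with your own path.
TRAINING_SCRIPT = "/home/hadoop/train_model.py"


def build_spark_submit_command(script_path, master="yarn", deploy_mode="cluster"):
    """Return the spark-submit invocation as a list of arguments."""
    return ["spark-submit", "--master", master, "--deploy-mode", deploy_mode, script_path]


def daily_cron_entry(hour, minute, command):
    """Return a crontab line that runs `command` every day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # crontab field order: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


if __name__ == "__main__":
    # 23:00 corresponds to the 11:00pm slot mentioned in the question.
    command = shlex.join(build_spark_submit_command(TRAINING_SCRIPT))
    print(daily_cron_entry(23, 0, command))
```

You would paste the printed line into the file opened by `crontab -e` on the machine that runs spark-submit. Cron normally uses that machine's local time zone and a minimal PATH, so you may need to use the absolute path to spark-submit.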