Distributed Workload Management, Continued
Following up on the previous Distributed Workload Management blog post, this one is mainly about the prototype implementation of the final design.
So far, we have been able to implement all the tasks. The two tasks I worked on both require remote system interaction, so we decided to bundle them in a single jar.
We have implemented SCP and SFTP support in both tasks.
I also contributed to the messaging infrastructure.
We started with the RabbitMQ publish-subscribe model, but as we moved further we realized that the work queue model better suits our requirements. With acknowledgements enabled, the work queue model guarantees that
a message is consumed by a single worker at a time; if that worker fails to process it, the message is queued again for the other waiting workers.
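To make the work queue behaviour concrete, here is a minimal in-memory sketch in Python. It is not the RabbitMQ client API and not our prototype code; the names WorkQueue, deliver, ack, nack and run_worker are made up for illustration. It only mimics the two guarantees mentioned above: one worker per message, and requeue on failure.

```python
from collections import deque


class WorkQueue:
    """In-memory stand-in for a work queue (hypothetical, not RabbitMQ)."""

    def __init__(self):
        self._ready = deque()   # messages waiting for a worker
        self._unacked = {}      # delivery tag -> message held by one worker
        self._next_tag = 0

    def publish(self, message):
        self._ready.append(message)

    def deliver(self):
        """Hand the next message to exactly one worker, or None if empty."""
        if not self._ready:
            return None
        self._next_tag += 1
        message = self._ready.popleft()
        self._unacked[self._next_tag] = message
        return self._next_tag, message

    def ack(self, tag):
        """The worker finished, so the message is dropped for good."""
        del self._unacked[tag]

    def nack(self, tag):
        """The worker failed, so the message goes back for another worker."""
        self._ready.append(self._unacked.pop(tag))


def run_worker(queue, handler):
    """Process one message; requeue it if the handler raises."""
    delivery = queue.deliver()
    if delivery is None:
        return False
    tag, message = delivery
    try:
        handler(message)
    except Exception:
        queue.nack(tag)
    else:
        queue.ack(tag)
    return True


def failing_worker(message):
    raise RuntimeError("worker crashed")


queue = WorkQueue()
queue.publish("stage-input-files")
processed = []
run_worker(queue, failing_worker)    # fails, so the message is requeued
run_worker(queue, processed.append)  # a second worker picks it up
```

While a message sits in the unacknowledged map, no other worker can receive it, which is the "single worker at a time" part.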
I also implemented priority support for RabbitMQ messages, which lets us change a message's priority based on user need.
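The idea behind message priorities can be sketched with a small heap-based queue in Python. This is an in-memory illustration only, not the RabbitMQ API; the class name PriorityMessageQueue, the max_priority default and the clamping rule are assumptions of the sketch. Higher numbers are consumed first, and messages with equal priority keep their publish order.

```python
import heapq
import itertools


class PriorityMessageQueue:
    """Hypothetical in-memory priority queue, for illustration only."""

    def __init__(self, max_priority=10):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order
        self._max_priority = max_priority

    def publish(self, body, priority=0):
        # Clamp the priority into [0, max_priority] (assumption of this sketch).
        priority = max(0, min(priority, self._max_priority))
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), body))

    def consume(self):
        """Return the highest-priority message, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, body = heapq.heappop(self._heap)
        return body


queue = PriorityMessageQueue()
queue.publish("routine-job", priority=1)
queue.publish("urgent-job", priority=9)
first = queue.consume()  # the urgent job overtakes the routine one
```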
Gourav and I worked on the scheduler. We have yet to reach consensus on the scheduler's responsibilities: do we need a database? How can we support delayed job submission?
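On the delayed submission question, one option we could explore (nothing is decided yet) is to keep pending jobs in a heap ordered by their due time and release whatever is due on each scheduler tick. The Python sketch below is purely illustrative: DelayedJobQueue, submit and pop_due are hypothetical names, and the current time is passed in explicitly to keep the example deterministic.

```python
import heapq
import itertools


class DelayedJobQueue:
    """Hypothetical holding area for jobs that should run at a later time."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for equal due times

    def submit(self, job_id, run_at):
        """Register a job that becomes eligible at timestamp run_at."""
        heapq.heappush(self._heap, (run_at, next(self._order), job_id))

    def pop_due(self, now):
        """Remove and return every job whose due time is at or before now."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, job_id = heapq.heappop(self._heap)
            due.append(job_id)
        return due


pending = DelayedJobQueue()
pending.submit("job-a", run_at=100.0)
pending.submit("job-b", run_at=50.0)
ready = pending.pop_due(now=60.0)  # only job-b is due at this point
```

A structure like this lives in memory, so it does not by itself answer whether the scheduler needs a database to survive restarts.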
At HackIllinois, when we presented the workflow issue to students, they came up with an interesting matchmaking scheduling algorithm, which we might consider using.
I shall explain this algorithm in the next blog post after reviewing its correctness.
As far as the orchestrator is concerned, we have yet to make significant progress; for now we have written a simple stub that mocks the orchestrator's functionality. Amruta is working on the graph DB, but we are still exploring some grey areas: what does each node represent? How can different nodes be mapped to each other? What role does the orchestrator play in task context creation and DAG manipulation?
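As one possible reading of those open questions, assume each node is a task and each edge is a "depends on" relationship. This is an assumption for illustration, not our agreed design. Under that reading, the orchestrator could derive an execution order from the DAG with a topological sort. The Python sketch below uses the standard graphlib module (Python 3.9+); the task names and the execution_order helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key is a task node; its set holds the tasks it depends on.
# The task names are placeholders, not identifiers from our code.
workflow = {
    "environment_setup": set(),
    "data_staging": {"environment_setup"},
    "job_submission": {"data_staging"},
}


def execution_order(dag):
    """Return the tasks in an order that respects every dependency edge."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
```

If the graph contains a cycle, static_order raises graphlib.CycleError, which would be a natural place for the orchestrator to reject an invalid workflow.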
GitHub Issues
Below are the issues I created:
- Environment Setup Task
- Implement Data Staging task
- Comment on “Implement a database service for DAG”
GitHub Commits
My git commits are tracked here