Distributed Workload Management

The motivation for this work came from possible enhancements to Apache Airavata. We are trying to address distributed workload management and, if possible, come up with a solution for it. This is a hot-button concern in distributed environments, and there isn't any one solution; we need to come up with our own solution based on the application's needs. The crux is how microservices communicate and work with each other. It sounds easy, but it gets convoluted as we think through the different building blocks and their boundaries. (reference)

Potential Solutions

As I said, there isn't any one standard solution; each has its own pros and cons. Even though this design was made with Apache Airavata in mind, we have tried to keep it as generic as possible. We had to go through quite a few iterations to arrive at this solution. There is a huge chunk of possible solutions, and we tried the ones that best suit our requirements. First, we started with stateful vs. stateless design, and the obvious inclination was always towards stateless. This design decision motivated us to think through centralized vs. decentralized architecture. Still, there are quite a few implementation concerns we have yet to address, such as which messaging infrastructure will best serve the need.

Here are the pages that explain everything in detail:

  • A stateful design : LINK
  • A stateless design : LINK
  • Final Design : LINK
  • Messaging infrastructures : LINK

Solution Evaluation

The wiki links above best explain each solution and its pros and cons. Thanks to dev@airavata.apache.org and class discussions, we have been able to reach an architectural consensus.


Conclusion

As we have decided to move ahead with the final design, we are building a POC to uncover possible corner cases. During this, we will need to evaluate and make critical implementation decisions; any suggestions regarding messaging infrastructure and communication models would be very helpful.


GitHub Commits

We are still working on the prototype. We have started implementation, have completed most of the individual tasks, and are planning to integrate them soon. My contribution towards the POC is tracked here


Airavata Dev List Discussion

Below are the links to the dev@airavata.apache.org discussions:

  • Clearing doubts regarding the term "workload distribution" and evaluating designs based on CAP : LINK
  • Answering implementation and CD/CI concerns : LINK