Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities

  • Conference paper
Data Management Technologies and Applications (DATA 2020)

Abstract

Distributed scientific workflow management systems that process large data sets in the Cloud face the following challenges: (a) workflow tasks require different capabilities from the machines on which they run, while the infrastructure itself is highly heterogeneous, (b) the environment is dynamic, and resources can be added and removed at any time, (c) scientific workflows can become very large, with hundreds of thousands of tasks, and (d) faults can happen at any time in a distributed system. In this paper, we present a software architecture and a capability-based scheduling algorithm that address all of these challenges in one design. Our architecture consists of loosely coupled components that can run on separate virtual machines and communicate with each other over an event bus and through a database. The scheduling algorithm matches the capabilities required by the tasks (e.g., software, CPU power, main memory, graphics processing unit) with those offered by the available virtual machines and assigns the tasks accordingly for processing. Our approach utilises heuristics to distribute the tasks evenly in the Cloud. This reduces the overall run time of workflows and makes efficient use of available resources. Our scheduling algorithm also implements optimisations to achieve high scalability. We perform a thorough evaluation based on four experiments and test whether our approach meets the challenges mentioned above. The paper finishes with a discussion, conclusions, and future research opportunities. An implementation of our algorithm and software architecture is publicly available with the open-source workflow management system “Steep”.
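
The matching step described in the abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper or the Steep implementation: the names `Task`, `Machine`, and `schedule` are hypothetical, capabilities are modelled as plain string sets, and "least-loaded machine first" is only an assumed stand-in for the heuristics the paper uses to distribute tasks evenly.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical model: a task and the capabilities it requires.
    name: str
    required: frozenset


@dataclass
class Machine:
    # Hypothetical model: a virtual machine and the capabilities it offers.
    name: str
    offered: frozenset
    assigned: list = field(default_factory=list)


def schedule(tasks, machines):
    """Assign each task to the least-loaded machine offering all required
    capabilities. Returns the tasks that no machine can currently run."""
    unassigned = []
    for task in tasks:
        # A machine qualifies if the required set is a subset of the offered set.
        candidates = [m for m in machines if task.required <= m.offered]
        if not candidates:
            unassigned.append(task)
            continue
        # Assumed balancing heuristic: pick the machine with the fewest tasks.
        target = min(candidates, key=lambda m: len(m.assigned))
        target.assigned.append(task)
    return unassigned


machines = [
    Machine("vm-a", frozenset({"docker"})),
    Machine("vm-b", frozenset({"docker", "gpu"})),
]
tasks = [
    Task("t1", frozenset({"docker"})),
    Task("t2", frozenset({"docker", "gpu"})),
    Task("t3", frozenset({"docker"})),
    Task("t4", frozenset({"fpga"})),
]
leftover = schedule(tasks, machines)
```

In this sketch, tasks whose requirements cannot be met are returned rather than dropped, so a caller could retry them once a machine with the missing capability joins, which loosely mirrors the dynamic environment described in challenge (b).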

Author information

Correspondence to Michel Krämer.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Krämer, M. (2021). Efficient Scheduling of Scientific Workflow Actions in the Cloud Based on Required Capabilities. In: Hammoudi, S., Quix, C., Bernardino, J. (eds) Data Management Technologies and Applications. DATA 2020. Communications in Computer and Information Science, vol 1446. Springer, Cham. https://doi.org/10.1007/978-3-030-83014-4_2

  • DOI: https://doi.org/10.1007/978-3-030-83014-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-83013-7

  • Online ISBN: 978-3-030-83014-4

  • eBook Packages: Computer Science, Computer Science (R0)
