This section of the Web site lists the PhD theses currently available. General information on Doctoral studies at Politecnico is available at the DEIB Web site. Interested candidates should contact me by e-mail.
Model Driven Design of Big Data Applications
Recent years have seen rapid growth of interest in enterprise applications built on top of data-intensive technologies such as MapReduce/Hadoop, NoSQL databases, and stream processing systems fed by mobile and sensor data. Moreover, Cloud platform services for Big Data (e.g., Amazon Elastic MapReduce, S3, Kinesis; Microsoft HDInsight) are now creating massive growth opportunities for software vendors to develop and sell novel data-intensive cloud applications in various market segments, from predictive analytics to environmental monitoring, from e-government to smart cities. Since the software development market is expected to be dominated by data-intensive cloud applications in the coming years, there is an urgent need for novel, highly productive software engineering methodologies capable of supporting the design of data-intensive applications.
The focus of the PhD work is to define a quality-driven framework for developing data-intensive applications that leverage Big Data technologies hosted in private or public clouds. The thesis will develop a methodology and tools for data-aware quality-driven development. The work will focus on quality assessment, architecture enhancement, agile delivery and continuous monitoring of data-intensive applications.
Expertise in Model Driven Methodologies, the Hadoop/Spark technology stack, performance evaluation, optimization and operations research would be highly regarded.
The PhD work will be supported by the DICE (Developing Data-Intensive Cloud Applications with Iterative Quality Enhancements) H2020 European project.
- M. Malekimajd, A. M. Rizzi, D. Ardagna, M. Ciavotta, M. Passacantando, A. Movaghar. Optimal Capacity Allocation for executing Map Reduce Jobs in Cloud Systems. MICAS-SYNASC 2014 Workshops Proceedings. To Appear.
Resource management in Cloud systems
Cloud computing is an emerging paradigm which allows the on-demand delivery of software, hardware and data as services, providing end-users with flexible and scalable services accessible through the Internet. Providers use three different models to deliver these services: (1) infrastructure as a service (IaaS), (2) platform as a service (PaaS) and (3) software as a service (SaaS). The IaaS model is used to provide disk storage, database and/or computation time on demand from Internet-based Data Centers. PaaS allows developers to build and deploy cloud applications exploiting advanced middleware solutions, while the SaaS model provides the opportunity to use/integrate full software applications.
Several issues emerge in this framework due to the changing environment in which cloud-based services live. First of all, at any time instant resources have to be allocated to handle workload fluctuations, since continuous changes occur autonomously and unpredictably. Furthermore, end-users must be guaranteed the Quality of Service (QoS) levels stipulated in Service Level Agreement (SLA) contracts, usually expressed in terms of performance metrics (e.g., response time and throughput) and availability.
The Ph.D. work will take the perspective of SaaS providers, which deploy their applications on multiple Clouds and want to maximize their profit while minimizing the cost of using the underlying resources. Indeed, since (i) Cloud performance can vary at any point in time, (ii) elasticity may not ramp at desired speeds, and (iii) unavailability problems exist even when 99.9% up-time is advertised (see, e.g., Amazon EC2 and Microsoft Office 365 outages in 2011), the use of multiple Clouds offered by different providers is needed to support the execution of business critical applications.
The PhD project will first investigate the possibility of distributing the workload among multiple IaaS/PaaS Data Centers, allocating resources at the Data Centers that are least expensive at the considered time instant. Secondly, the interaction among multiple SaaS providers sharing a common PaaS/IaaS infrastructure will be analyzed. New optimization algorithms based on Lagrangian decomposition and distributed techniques will be developed, considering the possibility of redirecting workload to the most suitable Data Center and of determining application resource allocation at different time scales.
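As a rough illustration of the workload redirection idea, the following Python sketch assigns a given demand to the cheapest Data Centers first. It is not the Lagrangian decomposition approach mentioned above: it is a plain greedy rule that is only adequate under the simplifying assumptions of a fixed per-request cost and a hard capacity at each site. The function name, the site names and all numbers are hypothetical.

```python
from typing import Dict, Tuple


def redirect_workload(
    demand: float, sites: Dict[str, Tuple[float, float]]
) -> Dict[str, float]:
    """Greedily split `demand` across sites, cheapest first.

    `sites` maps a site name to (cost_per_request, capacity).
    Assumes linear costs and hard capacities (illustrative only).
    """
    allocation = {name: 0.0 for name in sites}
    remaining = demand
    # Visit sites in order of increasing per-request cost.
    for name, (_cost, capacity) in sorted(sites.items(), key=lambda kv: kv[1][0]):
        if remaining <= 0:
            break
        share = min(capacity, remaining)
        allocation[name] = share
        remaining -= share
    if remaining > 1e-9:
        raise ValueError("demand exceeds the total capacity of all sites")
    return allocation


# Hypothetical prices and capacities for three Data Centers.
example_sites = {
    "dc_a": (0.05, 100.0),
    "dc_b": (0.03, 80.0),
    "dc_c": (0.08, 200.0),
}
example_allocation = redirect_workload(150.0, example_sites)
```

Since prices and performance change over time, such a rule would have to be re-run at every considered time instant with the current costs; non-linear costs or QoS constraints would call for the optimization techniques named above.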
The interaction among SaaS/PaaS/IaaS will be grounded on game theoretic methods and approaches. In Cloud systems each SaaS behaves selfishly and competes with the other SaaSs for the use of the infrastructural resources supplied by the PaaS/IaaS, so the best choice for one depends on the choices of the others. The Generalized Nash Equilibrium concept will be used to capture the behavior of SaaSs in this conflicting situation.
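To make the equilibrium notion concrete, here is a minimal Python sketch of a toy game with a shared capacity constraint, solved by sequential best responses. The model is an assumption made purely for illustration and is not the formulation used in the publications listed below: SaaS i picks a resource amount x_i to maximize a_i * log(1 + x_i) - price * x_i, and the total amount cannot exceed the infrastructure capacity. All names and numbers are hypothetical.

```python
from typing import List


def best_response(i: int, x: List[float], a: List[float],
                  price: float, capacity: float) -> float:
    """Best choice of SaaS i given what the others currently hold."""
    others = sum(x) - x[i]
    # Maximizer of a_i*log(1+x) - price*x when the capacity is ignored.
    unconstrained = a[i] / price - 1.0
    # The utility is concave, so clipping to the feasible interval is optimal.
    return max(0.0, min(unconstrained, capacity - others))


def find_equilibrium(a: List[float], price: float, capacity: float,
                     tol: float = 1e-9, max_rounds: int = 100) -> List[float]:
    """Repeat best responses, one player at a time, until nobody changes."""
    x = [0.0] * len(a)
    for _ in range(max_rounds):
        largest_change = 0.0
        for i in range(len(a)):
            new_value = best_response(i, x, a, price, capacity)
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            return x
    raise RuntimeError("best-response iteration did not converge")


# Three hypothetical SaaS providers competing for 12 resource units.
equilibrium = find_equilibrium(a=[10.0, 6.0, 4.0], price=1.0, capacity=12.0)
```

At the returned point no SaaS can improve its own utility given the others' choices and the shared constraint, which is the defining property of a Generalized Nash Equilibrium. Such equilibria are generally not unique, and in this toy model the outcome depends on the order in which the players move.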
The effectiveness of the solutions proposed in the Ph.D. work will be evaluated through extensive experiments on realistic scenarios, covering a variety of system and workload configurations, both through simulation and by running tests in real Cloud environments.
Expertise in performance evaluation, optimization and operations research would be highly regarded.
- D. Ardagna, E. Di Nitto, D. Petcu, P. Mohagheghi, S. Mosser, P. Matthews, A. Gericke, C. Ballagny, F. D'Andria, C. Nechifor, C. Sheridan. MODACLOUDS: A Model-Driven Approach for the Design and Execution of Applications on Multiple Clouds. MiSE 2012 Workshops Proceedings.
- D. Ardagna, S. Casolari, M. Colajanni, B. Panicucci. Dual Time-scale Distributed Capacity Allocation and Load Redirect Algorithms for Cloud Systems. Journal of Parallel and Distributed Computing, Elsevier. 72(6), 796-808, 2012.
- D. Ardagna, B. Panicucci, M. Passacantando. A Game Theoretic Formulation of the Service Provisioning Problem in Cloud Systems. WWW 2011 Proceedings. 177-186. Hyderabad, India.
- D. Ardagna, B. Panicucci, M. Passacantando. Generalized Nash Equilibria for the Service Provisioning Problem in Cloud Systems. IEEE Transactions on Services Computing. 6(4), 429-442, 2013.