<p> To meet the demands of high availability and optimal performance in dynamic environments, modern systems deploy autonomic or self-adaptation mechanisms. Increasingly, however, today’s enterprise systems are compositions of many subsystems, each of which is itself an adaptive system. Currently, each autonomic manager operates to maintain locally defined quality-of-service (QoS) objectives, but their independent actions often lead to globally sub-optimal results. Commonly, human administrators handle situations in which the collection of autonomic systems is behaving sub-optimally. Generating a plan to change the configuration of an autonomic manager is already a complex and challenging task in the management of a single autonomic system; the challenge is exacerbated across a collection of autonomic subsystems, where there may be tradeoffs in how to balance configuration options. </p>
<p>These challenges can be addressed by introducing an automated approach, referred to as meta-management, that provides a formal basis for reasoning about changes to the configurations of autonomic subsystems. The automated approach to meta-management is then established as part of a framework that can be used to instantiate a higher-level autonomic manager, referred to as a meta-manager, that provides assurance about, and improves the performance of, a collection of autonomic systems. This approach and framework include a MAPE-K control loop specialized to the needs of meta-management; a domain-specific language, SEAM, that enables the practical specification of adaptation policies; and a taxonomy of strategy synthesis techniques. The practicality, effectiveness, and applicability of the approach are then evaluated against three case studies. </p>
<p>The first is an AWS Shopping Cart system in which a meta-manager is established to manage a collection of autonomic systems represented by a front-end user interface, a middleware services tier, and a database services tier. This case study was selected to evaluate the ability of the meta-manager to improve the homeostatic operations of the collection of autonomic systems on a popular architectural pattern, code base, and operations platform that are in wide industrial use. </p>
<p>The second is the Google control plane, in which a meta-manager was established to manage a collection of autonomic systems that suffered a significant outage. This case study was selected because it presented a well-documented and specific failure scenario that occurred during the research period of this thesis, the cause of which was, in part, human-centric management of a collection of autonomic systems. </p>
<p>Finally, the third is a simulation of an electrical grid cascade failure that represents the Northeast Blackout of 2003. This case study was selected because it presents an exhaustively documented example of a failure of human-centric management of a collection of autonomic systems, one that occurred in a context outside of information technology and cloud-based providers. This lends credibility to the applicability claim of the thesis. </p>