UCSM Overall Architecture

The architecture of UCSM consists of multiple layers with well-defined boundaries. External interfaces provide communication with the outside world. The Data Management Engine (DME) is the central service that manages the components of a UCS. Application gateways act as a hardware abstraction layer between the DME and the managed endpoints (EPs). The EPs are the actual devices or entities managed by UCSM, but they are not considered part of UCSM itself.

External Interface Layer

This layer comprises the external interfaces used to communicate with the outside world, for example SMASH-CLP, CIM-XML, and the UCSM-CLI.

Data Management Engine (DME)

The Data Management Engine (DME) is the central component in UCSM and consists of multiple internal services. The DME is the only component in a UCS that stores and maintains state for the managed devices and elements, which makes it the authoritative source of configuration information. It is responsible for propagating configuration changes to the devices and endpoints in the UCS. The DME manages all devices and elements and represents their state in the form of managed objects (MOs). An MO contains the desired configuration and the current state of its corresponding endpoint. Administrators make changes to MOs; the DME validates these changes and propagates them to the specific endpoint. For example, suppose an operator initiates a server “power on” request through the GUI. When the DME receives the request, it validates it and, if the request is valid, makes the corresponding state change on the server object in the Management Information Tree (MIT). This state change is then propagated to the server via the appropriate Application Gateway (AG).
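The “power on” flow described above can be sketched in a few lines of Python. This is only a toy model of the validate, update, propagate sequence: the class names (`ToyDME`, `RecordingGateway`, `ManagedObject`), the validation rule, and the object name `sys/chassis-1/blade-1` are all assumptions made for illustration and do not reflect how the real DME is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedObject:
    """Hypothetical MO holding desired configuration and last known state."""
    dn: str
    desired: dict = field(default_factory=dict)
    current: dict = field(default_factory=dict)


class RecordingGateway:
    """Stand-in for an Application Gateway; it only records what it is asked to apply."""

    def __init__(self):
        self.applied = []

    def apply(self, dn, prop, value):
        self.applied.append((dn, prop, value))


class ToyDME:
    """Toy model of the validate-then-propagate flow; not the real DME."""

    ALLOWED = {"power": {"up", "down"}}  # illustrative validation rule

    def __init__(self, gateway):
        self.mit = {}  # maps an object name to its ManagedObject
        self.gateway = gateway

    def add(self, mo):
        self.mit[mo.dn] = mo

    def request(self, dn, prop, value):
        # 1. Validate the request against the model.
        if dn not in self.mit:
            raise KeyError(f"unknown object: {dn}")
        if value not in self.ALLOWED.get(prop, set()):
            raise ValueError(f"invalid value {value!r} for {prop!r}")
        # 2. Record the desired state on the object in the tree.
        self.mit[dn].desired[prop] = value
        # 3. Propagate the change to the endpoint through the gateway.
        self.gateway.apply(dn, prop, value)


gateway = RecordingGateway()
dme = ToyDME(gateway)
dme.add(ManagedObject(dn="sys/chassis-1/blade-1"))
dme.request("sys/chassis-1/blade-1", "power", "up")
```

An invalid request (for example an unknown power state) is rejected in step 1, so it never reaches the tree or the gateway.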

The DME consists of these services:

  • Object Services – all components (servers, NICs, ports, …) in UCS are represented as Managed Objects (MOs). Object services hold these objects in a structured manner: a hierarchically organized structure called the MIT, or Management Information Tree.
  • Behavior and Orchestration Services – they handle the behavior and rules that apply to the objects. Managed objects have their own Finite State Machine (FSM), in the form of a child object, that is responsible for scheduling and performing tasks for the object. For example, a hardware inventory of a server is orchestrated by the FSM on the corresponding server object.
  • Transaction Services – they are responsible for the actual data mutations (configuration and state changes) of the objects, as performed by the “transactor” thread, and for replicating state changes to the secondary UCSM. Changes are made in an asynchronous and transactional fashion.
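To make the idea of an FSM attached to a managed object more concrete, here is a minimal sketch of a state machine that steps a server through an inventory task. The stage names and the `InventoryFsm` class are invented for this example; the real FSM stages in UCSM are different and considerably more detailed.

```python
class InventoryFsm:
    """Hypothetical FSM child object that steps a server through an inventory task."""

    STAGES = ["begin", "collect", "store", "done"]  # illustrative stage names

    def __init__(self, owner_dn):
        self.owner_dn = owner_dn  # the server object this FSM belongs to
        self.stage = "begin"
        self.history = ["begin"]

    def step(self):
        """Advance to the next stage; a finished FSM stays in 'done'."""
        index = self.STAGES.index(self.stage)
        if index < len(self.STAGES) - 1:
            self.stage = self.STAGES[index + 1]
            self.history.append(self.stage)
        return self.stage

    def run(self):
        """Step until the task is complete and return the stages visited."""
        while self.stage != "done":
            self.step()
        return self.history


fsm = InventoryFsm("sys/chassis-1/blade-1")
history = fsm.run()
```

Keeping the FSM as a separate object that belongs to the server mirrors the "child object" relationship described in the list above.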

Application Gateways

Application gateways (AGs) are stateless agents that the DME uses to propagate changes to the endpoints. They also report system state from the endpoints back to the DME. An AG is a module that converts management information (e.g., configuration, statistics, and faults) from its native representation into the form of a managed object (MO).
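The conversion role of an AG can be illustrated with a small, stateless function. The native format below (a `key=value;...` status line), the property names, and the object naming are all made up for the example; real endpoints report their state in their own device-specific formats.

```python
def to_managed_object(chassis_dn, native_line):
    """Convert a made-up 'key=value;...' status line into an MO-like dict.

    The function keeps no state between calls, in the spirit of a
    stateless gateway: the same input always gives the same output.
    """
    items = [item for item in native_line.strip().split(";") if item]
    fields = dict(item.split("=", 1) for item in items)
    return {
        "dn": f"{chassis_dn}/fan-module-{fields['id']}",  # illustrative naming
        "speed_rpm": int(fields["rpm"]),
        "oper_state": "operable" if fields["status"] == "ok" else "inoperable",
    }


mo = to_managed_object("sys/chassis-1", "id=2;rpm=4200;status=ok")
```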

Managed Endpoints

Managed endpoints are the end resources managed by UCSM (servers, chassis, NICs, …).

UCSM HA

UCSM can run in a highly available configuration. This is achieved by connecting two Fabric Interconnect devices together via two cluster ports on each Fabric Interconnect. While the UCS management plane runs in an active-standby configuration, the data plane is active-active. Both Fabric Interconnects are actively sending and receiving LAN and SAN traffic, even though only one of them is running the active UCSM instance.

An election determines which UCSM instance is promoted to “primary”; the other instance is demoted to “subordinate”. The primary UCSM instance owns a virtual IP address to which all external management connections are made. The primary instance handles all requests from all interfaces and application gateways, and performs model transformations (such as configuration and state changes) in the system. The primary instance also replicates all changes in the system to the subordinate instance.

Both instances of UCSM monitor each other via the cluster links. If the cluster links fail, a classical cluster quorum algorithm is used to avoid a possible split-brain situation. An odd number of chassis is used as quorum resources to determine the primary instance. For example, in a UCS with four chassis, three are used as quorum devices. Upon detecting a possible split-brain scenario, both instances of UCSM demote themselves to subordinate and at the same time try to claim ownership of all the quorum resources by writing their identity, through the Fabric Extenders, to the serial EEPROM of each chassis. Next, both UCSM instances read the EEPROM content of all quorum chassis, and the instance that succeeded in claiming the most quorum resources becomes the primary.
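The quorum logic can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, the choice of which chassis to drop when the count is even is arbitrary here, and the EEPROM contents are represented as a plain dictionary instead of being read through the Fabric Extenders.

```python
from collections import Counter


def pick_quorum_chassis(chassis_ids):
    """Return an odd number of chassis to use as quorum resources.

    Assumption: when the count is even, the last chassis is simply left out.
    """
    ids = list(chassis_ids)
    if len(ids) % 2 == 0:
        ids = ids[:-1]
    return ids


def elect_primary(claims):
    """Pick the instance that claimed the most quorum chassis.

    `claims` maps a chassis id to the identity read back from its EEPROM,
    or None if nothing readable was written. Returns None on a tie.
    """
    tally = Counter(owner for owner in claims.values() if owner is not None)
    ranked = tally.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


quorum = pick_quorum_chassis([1, 2, 3, 4])
primary = elect_primary({1: "A", 2: "B", 3: "A"})
```

With an odd number of quorum chassis and two contenders, a tie is only possible when some chassis could not be claimed at all, which is why an odd count is used.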

MIM (Management Information Model)

As previously discussed, the Information Model or the UCS Management Information Model (MIM) is a tree structure where each node in the tree is a managed object (MO). Managed objects are abstractions of real-world resources: they represent the physical and logical components of the UCS, such as Fabric Interconnects, chassis, servers, adapters, etc. Certain MOs, such as power supplies and fan modules, are implicit and cannot be created by users; UCSM creates them automatically when a new device is discovered.
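A tree of managed objects, including the distinction between user-created and implicit objects, might be modeled like this. The `Node` class, the `user_create` helper, and the object names are assumptions for the sake of the example, not the actual UCSM object model.

```python
class Node:
    """Hypothetical MO node; each node knows its parent and its children."""

    def __init__(self, name, parent=None, implicit=False):
        self.name = name
        self.parent = parent
        self.implicit = implicit  # True for objects only the system may create
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    @property
    def path(self):
        """Full name built by walking up to the root of the tree."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path}/{self.name}"


def user_create(parent, name, implicit=False):
    """Create a node on behalf of a user; implicit objects are refused."""
    if implicit:
        raise PermissionError(f"{name} can only be created by discovery")
    return Node(name, parent)


def walk(node):
    """Yield the full name of every node in the subtree, parents first."""
    yield node.path
    for child in node.children.values():
        yield from walk(child)


root = Node("sys")
chassis = Node("chassis-1", root)
blade = Node("blade-1", chassis)
psu = Node("psu-1", chassis, implicit=True)  # created by the system on discovery
paths = list(walk(root))
```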

Available Integration Points

Standard (Cut-Through) Interfaces in a UCS

The standard protocols that connect directly (cut-through) to a CIMC (Cisco Integrated Management Controller) via a unique external IP address bypass UCSM. A cut-through interface provides direct access to a single server’s CIMC. The advantage of a cut-through interface such as IPMI is that existing software that already uses such interfaces can interact with the servers without modifications or updates. The disadvantage is that it bypasses the DME. To alleviate this, the DME is always in discovery mode and detects any change made through a cut-through interface, such as a reboot. However, if one operator performs a task on a server through the DME (GUI, CLI, XML API, etc.) while a different operator uses a cut-through interface to perform a conflicting action on the same server at the same time, the DME-initiated task always wins and completes the request (due to the transactional nature of the DME). UCS provides each CIMC with a unique external management IP address, which must be within the same network as the Fabric Interconnect management ports and the UCSM virtual IP.

Native Interfaces in UCS

The native interfaces are the UCS CLI, the UCS GUI, and the XML API. The XML API is the most powerful interface for integrating or interacting with a UCSM. It is generic, content-driven, and hierarchical. It is the native language of the DME, and therefore there are no restrictions on what can be done through this interface, within the framework of the DME.
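As a rough illustration of what "content-driven" means, the sketch below builds two XML API request documents and parses a login response using only the Python standard library. The method names `aaaLogin` and `configResolveDn` and their attributes are written as I understand them from the published XML API, so treat them as assumptions and check the API reference before relying on them. The helper functions, the sample response, and the object name are invented for the example, and no network call is made.

```python
import xml.etree.ElementTree as ET


def login_request(name, password):
    """Build a login document (method and attribute names are assumptions)."""
    element = ET.Element("aaaLogin", {"inName": name, "inPassword": password})
    return ET.tostring(element, encoding="unicode")


def resolve_dn_request(cookie, dn, hierarchical=False):
    """Build a query for a single object identified by its name in the tree."""
    attrs = {
        "cookie": cookie,
        "dn": dn,
        "inHierarchical": "true" if hierarchical else "false",
    }
    return ET.tostring(ET.Element("configResolveDn", attrs), encoding="unicode")


def cookie_from_login_response(xml_text):
    """Extract the session cookie from a login response document."""
    return ET.fromstring(xml_text).get("outCookie")


# Hand-written sample response, not captured from a real system.
sample_response = '<aaaLogin cookie="" response="yes" outCookie="example-cookie" />'
cookie = cookie_from_login_response(sample_response)
request = resolve_dn_request(cookie, "sys/chassis-1/blade-1")
```

In a real integration these documents would be sent to the UCSM virtual IP over HTTP or HTTPS; that part is deliberately left out here.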

Standard Interfaces in UCS

The standard interfaces are SNMP, SMASH-CLP, and CIM-XML.
