The High Availability Service Framework is a common component for ActiveSpaces Transactions (AST) and Business Events Extreme (BE-X). It provides a generic mechanism for managing highly available applications and for defining how objects are partitioned.

Please see the documentation at https://devzone.tibco.com/display/DOC/Home for more information on ActiveSpaces Transactions.

The following defines the terms used in the High Availability Service Framework component.

High Availability Terms

Node

A node is an AST/BE-X application that is running the highavailability component provided by AST. Multiple nodes can run on the same physical machine. Nodes on the same machine can share a network interface and TCP/IP address, but each node uses a unique TCP/IP port number. Each node is identified by a unique node name. A numeric location code is computed from a hash of the name so that the runtime can quickly find the node when dispatching requests.
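As an illustration of the location code idea only, the sketch below derives a numeric code from a node name and uses it as a lookup key. The hash function (CRC32), the node names, and the helper names are all assumptions made for this example; the actual hash used by the runtime is not described here.

```python
import zlib


def location_code(node_name: str) -> int:
    """Derive a stable numeric code from a node name.

    CRC32 is an assumption for illustration; the runtime's real hash
    function is not specified in this document.
    """
    return zlib.crc32(node_name.encode("utf-8"))


# Hypothetical node names, used only for this example.
node_names = ["A.cluster", "B.cluster", "C.cluster"]

# Directory keyed by the numeric code, so a lookup compares integers
# rather than strings.
directory = {location_code(name): name for name in node_names}


def find_node(node_name: str) -> str:
    """Look a node up through its location code."""
    return directory[location_code(node_name)]
```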

Node Group

A node group is a collection of nodes. A node can be a member of multiple node groups. Node groups are used to define clusters, and manage quorum for the nodes in the node groups.

Cluster

A cluster is a collection of node groups. It is used when defining a dynamic partition group that distributes partitioned data within the cluster. Since a node can span node groups, a node can be in multiple clusters, though this won't be the case for typical deployments.

Partition

A partition is used to manage the partitioning of distributed objects across nodes. It consists of an active node, where object behaviour will be executed, a set of replica nodes, which contain copies of the instances, and a set of properties that define partition behaviour.

A partition uses distributed transactions to ensure data integrity when replicating objects. The AST highavailability component provides methods to migrate partitions to a new set of nodes, or move instances to different partitions. The High Availability Service Framework component is built on top of the AST highavailability component and allows partitions to be defined using configuration files.

Partition Mapper

A mapper is a runtime implementation that maps distributed instances to partitions. The High Availability Service Framework component provides builtin mappers that can be defined via configuration.

Consistent Hash

A consistent hash is a hashing technique that reduces the number of keys that need to be remapped as the hash table size changes. See http://en.wikipedia.org/wiki/Consistent_hashing for an overview of consistent hashing.
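
The following is a minimal sketch of a consistent hash ring, included to make the remapping property concrete. It is a generic textbook construction, not the implementation used by the High Availability Service Framework; the class name, the MD5-based hash, and the number of points per member are assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit integer (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent hash ring (illustrative, hypothetical class)."""

    def __init__(self, members, points_per_member=64):
        self._points_per_member = points_per_member
        self._ring = []  # sorted list of (hash, member) points
        for member in members:
            self.add(member)

    def add(self, member):
        # Each member owns several points on the ring to even out the load.
        for i in range(self._points_per_member):
            bisect.insort(self._ring, (_hash(f"{member}#{i}"), member))

    def remove(self, member):
        self._ring = [(h, m) for h, m in self._ring if m != member]

    def lookup(self, key):
        """Return the member owning the first point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        if index == len(self._ring):
            index = 0  # wrap around to the start of the ring
        return self._ring[index][1]
```

When a member is added to such a ring, the only keys that change owner are those that move to the new member; keys owned by the other members stay where they were.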

Dynamic Partition Group

A dynamic partition group uses a consistent hash to distribute a set of keyed distributed objects across a set of partitions. As nodes are added and removed, the High Availability Service Framework component distributes these partitions across the active set of nodes in order to load balance data. A builtin mapper is provided that accesses the key data for an instance, then maps the instance to a partition in the dynamic partition group.

Static Partition Group

A static partition group maps instances to one of a set of configured partitions. Builtin mappers are provided that map an instance to a given static partition. Currently, round robin (modulo the number of configured partitions) and hash by field are the supported mappers.
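
The two mapping strategies can be sketched as follows. This is not the builtin mappers' API: the class names, the `map` method, the use of CRC32, and the representation of an instance as a dictionary are all assumptions made for illustration.

```python
import itertools
import zlib


class RoundRobinMapper:
    """Assign instances to partitions in turn, modulo the partition count."""

    def __init__(self, partitions):
        self._partitions = list(partitions)
        self._counter = itertools.count()

    def map(self, instance):
        index = next(self._counter) % len(self._partitions)
        return self._partitions[index]


class HashByFieldMapper:
    """Assign an instance to a partition based on a hash of one field."""

    def __init__(self, partitions, field_name):
        self._partitions = list(partitions)
        self._field_name = field_name

    def map(self, instance):
        # The instance is modelled as a dict here (an assumption).
        value = str(instance[self._field_name]).encode("utf-8")
        return self._partitions[zlib.crc32(value) % len(self._partitions)]
```

Round robin spreads instances evenly regardless of their content, while hash by field keeps all instances with the same field value in the same partition.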

Multi-master cluster

A multi-master cluster, also known as a split-brain cluster, occurs when partitioned object instances have multiple active nodes within a cluster, or across clusters. When this happens, object updates can happen on each active node, resulting in conflicts. Object lifecycle is also affected: one active node may remove an instance while the other node updates it. An active node can also delete an instance by key and create a new instance with the same key, resulting in a conflict.

Node Quorum

Node quorum is a technique used to avoid multi-master situations. It uses a node count or percentage to determine if the local node should be taken offline, avoiding the case where multiple active nodes exist for a given partition. Note that using node quorum may result in service outages if all nodes detect loss of quorum.
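
A minimal sketch of the quorum decision is shown below. The function name and parameters are hypothetical, and the sketch assumes that the visible node count includes the local node; it only illustrates the count and percentage checks described above.

```python
def has_quorum(visible_nodes, total_nodes, min_count=None, min_percent=None):
    """Return True if the local node should stay online.

    Exactly one of min_count or min_percent must be given.
    visible_nodes is assumed to include the local node.
    """
    if (min_count is None) == (min_percent is None):
        raise ValueError("specify exactly one of min_count or min_percent")
    if min_count is not None:
        return visible_nodes >= min_count
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return visible_nodes * 100 >= min_percent * total_nodes
```

With five nodes and a 51 percent threshold, a network split into groups of three and two leaves only the three-node side online. A split into groups of two, two, and one leaves no side with quorum, which is the service outage case noted above.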

Routing

Routing is the dispatching of work to the active node of a partitioned object. The AST highavailability component supports routing by always dispatching work to the active node for a partitioned object. The AST runtime also supports the transparent failover of a dispatch request from an active node to a replica node if the active node fails during the dispatch.
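
The routing and failover behaviour can be sketched as below. This is not the AST runtime's API: the `Partition` class, the `NodeUnavailable` exception, the `send` callback, and the order in which replicas are tried are assumptions for illustration.

```python
from dataclasses import dataclass, field


class NodeUnavailable(Exception):
    """Raised by the (hypothetical) transport when a node cannot be reached."""


@dataclass
class Partition:
    name: str
    active: str
    replicas: list = field(default_factory=list)


def dispatch(partition, request, send):
    """Send a request to the active node, failing over to replicas in order."""
    for node in [partition.active, *partition.replicas]:
        try:
            return send(node, request)
        except NodeUnavailable:
            continue  # try the next replica
    raise NodeUnavailable(f"no reachable node for partition {partition.name}")


def demo_send(node, request):
    """Stand-in transport in which node "A" is down."""
    if node == "A":
        raise NodeUnavailable(node)
    return f"{node} handled {request}"


result = dispatch(Partition("P1", active="A", replicas=["B", "C"]), "update", demo_send)
```

The caller never names a node; it only supplies the partition, and the dispatch logic decides where the work runs.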

Explicit routing to a specific node by name is not supported by the High Availability Service Framework at this time.