For applications that require Apache ZooKeeper (e.g. for consensus or leadership election), the Service Fabric provides the `Ensemble Manager`. The `Ensemble Manager` enables the simple and rapid creation of one or more fully secured ZooKeeper Ensembles.
This section first introduces the `Ensemble Manager`, then provides a ZooKeeper/Curator based example, and concludes with Ensemble Best Practices.
The Fabric Ensemble Manager
The Ensemble Manager greatly reduces the operational complexity involved in creating, scaling, repairing and retiring ZooKeeper Ensembles.
Ensembles currently hosted by a Service Fabric are listed under Entire's Ensembles view.
Each Service Fabric has a dedicated Ensemble (e.g. `myFabric`) for private use by the Fabric's internal infrastructure services. By default the Service Fabric's own infrastructure Fibres are the members of this Fabric Ensemble.
Click on one of the listed Ensembles to view the ensemble's members.
Ensemble member information includes:
- Service Id: The Ensemble member's ID.
- Current Role [leader|follower|observer]: Each member's status.
- Hosting Fibre: The name of the host Fibre.

Each member is also reported in one of the following states:
- managed: The Ensemble member is both expected and active.
- unmanaged: The Ensemble member has been discovered, but the Fabric Ensemble Manager did not expect it. This state can sometimes occur after a failure, or when adding fibres that have residual state.
- missing: The expected Ensemble member is missing; this most likely indicates that the hosting fibre has failed.
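The three states above can be read as a comparison between two sets: the members the Ensemble Manager expects, and the members that are actually active. The following is a minimal Python sketch of that reading. The function name and the member names are hypothetical and are not part of any Fabric API.

```python
def classify_members(expected, active):
    """Map each member id to 'managed', 'unmanaged' or 'missing'.

    expected: ids the manager expects to be in the ensemble (assumed input)
    active:   ids that have actually been discovered running (assumed input)
    """
    expected, active = set(expected), set(active)
    states = {}
    for member in expected | active:
        if member in expected and member in active:
            states[member] = "managed"      # expected and active
        elif member in active:
            states[member] = "unmanaged"    # discovered, but not expected
        else:
            states[member] = "missing"      # expected, but not active
    return states


if __name__ == "__main__":
    # Hypothetical member ids, for illustration only.
    print(classify_members(expected={"zk-1", "zk-2"}, active={"zk-2", "zk-3"}))
```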
- Make Observer: A `leader` is transformed to an `Observer`. An `Observer` is the same as a `follower`, except that it does not participate in the ZooKeeper election / voting process (see https://zookeeper.apache.org/doc/trunk/zookeeperObservers.html). This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
- Make Participant: Transform an `Observer` into a full Ensemble participant (i.e. a `follower`), so that it is able to participate in the Ensemble's voting protocol. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
- Manage: Take an unmanaged ensemble member and add it to the expected set, so that it becomes managed. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
- Delete: Delete this Ensemble member from the Ensemble. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
- Force Delete: This action will proceed even if no quorum is available, and may leave the ensemble in an unusable state.
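The quorum precondition shared by these actions can be summarised as a small rule. The sketch below only restates the text above in Python. It is not how the Ensemble Manager is implemented, and the function and constant names are hypothetical.

```python
# Actions that, per the descriptions above, need quorum in BOTH ensembles.
QUORUM_REQUIRED = {"Make Observer", "Make Participant", "Manage", "Delete"}


def action_permitted(action, fabric_has_quorum, target_has_quorum):
    """Return True if the action may proceed under the stated quorum rule."""
    if action == "Force Delete":
        # Proceeds regardless of quorum; may leave the ensemble unusable.
        return True
    if action in QUORUM_REQUIRED:
        return fabric_has_quorum and target_has_quorum
    raise ValueError(f"unknown action: {action!r}")
```

For example, `action_permitted("Delete", True, False)` is `False`, while `action_permitted("Force Delete", False, False)` is `True`.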
ZooKeeper specific configuration for each Ensemble member may be seen by clicking on the Ensemble member's `Server Id` URI.
Creating an Ensemble
To create a new Ensemble:
- Specify the Ensemble name in the dialogue box; e.g.
- From the list of available fibres select the initial member; e.g.
- Press the
In addition to name, the fibre list also displays each fibre's `Labels`. This information is useful when deciding the Ensemble's runtime topology: see Ensemble Best Practices.
As shown, the `ZKExample` Ensemble is created with one member (
Each fibre selected to participate in an ensemble has a corresponding label set, of the form `ensemble-$Ensemble=member`: e.g. the `spark-infra-1.1` fibre is labelled
Membership of the `ZKExample` Ensemble may be expanded at any time by selecting and then adding more fibres.
The status of each `ZKExample` ensemble member is shown in the `ZKExample` ensemble view.
The `ZKExample` ensemble may be selected for removal, and then removed using the `Delete` button. However, do not do this, as the `ZKExample` ensemble is required in the following example.
The Fabric Ensemble (e.g. `myFabric`) may be expanded, or contracted down to one member, but cannot be removed.
A ZK Curator based Example
This example demonstrates lock-coordination using the Curator locking example (see https://curator.apache.org/curator-examples/). In addition to the ZooKeeper clients demonstrating the use of the Curator library, the `system.part` also demonstrates the use of Web Sockets and the new OSGi Alliance HTTP Whiteboard specification.
ZooKeeper Example Source Code
The source code for the example is available at https://github.com/paremus/zookeeper-examples.
If you've already worked through the Hello Tutorial then the structure of the `zookeeper-system` will be familiar.
- The system is composed of two
- `Resource Contracts` are defined such that:
  - instances are deployed to fibres labelled
  - `zookeeper-web` instances are deployed to fibres labelled
- ZooKeeper connection information is passed to both

`zookeeper-system` document is shown.
To run the example:
- The ensemble `ZKExample` created in the previous section will be required. If you deleted it, then recreate it. A `ZKExample` with one ensemble member is sufficient for running the example.
- `zookeeper-system` from the URL - https://nexus.paremus.com/content/repositories/public/com/example/zookeeper-1/13/zookeeper-system/1.0.0/zookeeper-system-1.0.0.xml.
- Set the `WebLayer=true` labels on the appropriate fibres in your Fabric.
- The `zookeeper-system` URL endpoint is now visible via Entire's `App` view. Click on the URL to access the example.
As shown, each of the `AppLayer` fibres competes to acquire a lock from the `ZKExample` ensemble. The fibre that acquires the lock decrements a shared counter and then releases the lock; the process is then repeated. All users of the application share a single countdown, and receive notifications of changes in state.
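The acquire, decrement, release cycle described above can be illustrated without a ZooKeeper ensemble. The sketch below is a local, single-process stand-in written in Python: threads play the role of fibres and a `threading.Lock` plays the role of the distributed lock. It is not the example's actual source and does not use Curator. All names are hypothetical.

```python
import threading


def run_countdown(start, workers):
    """Let several workers compete for one lock to decrement a shared counter.

    Returns the final counter value and the number of decrements per worker.
    """
    lock = threading.Lock()          # stand-in for the ensemble-backed lock
    state = {"counter": start}       # stand-in for the shared countdown
    decrements = {name: 0 for name in workers}

    def worker(name):
        while True:
            with lock:               # acquire; released on leaving the block
                if state["counter"] == 0:
                    return           # countdown finished
                state["counter"] -= 1
                decrements[name] += 1

    threads = [threading.Thread(target=worker, args=(n,)) for n in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"], decrements


if __name__ == "__main__":
    final, per_worker = run_countdown(100, ["fibre-1", "fibre-2", "fibre-3"])
    print(final, per_worker)
```

Because every decrement happens while the lock is held, the counter always reaches exactly zero and the per-worker counts always sum to the starting value, although how the work is split between workers varies from run to run.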
Ensemble Best Practices
When creating ZooKeeper ensembles with the Ensemble Manager, the usual best practices still apply:
- Ensure that each Ensemble is formed from an odd number of members. This is required to ensure correct behaviour in simple partition events: i.e. after the environment has been partitioned into two regions, the region with the majority of ensemble members (`>N/2`) continues to be active. The other partitioned region, with `<N/2` ensemble members, drops to read-only behaviour.
- Keep ensemble size small: i.e. 3 (`1 leader, 2 followers`) or 5 (`1 leader, 4 followers`).
- From a resilience perspective there is no reason to build large Ensembles. If an Ensemble member fails, it is simply replaced using the Fabric Ensemble Manager.
- Also note that ZooKeeper performance decreases rapidly as the number of ensemble members increases.
- If the solution requires many ZooKeeper ensemble participants, then make the majority of these `observers`. `Observers` are the same as `Followers`, with the one difference that `observers` do not participate in the ZooKeeper voting protocol.
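The arithmetic behind the odd-number recommendation can be checked with a few lines of Python. This is an illustrative sketch of majority-quorum counting only. The function names are hypothetical, and observers are left out of the counts because they do not vote.

```python
def majority(voters):
    """Smallest number of voting members that is strictly more than half."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """How many voting members can fail while a majority remains."""
    return voters - majority(voters)


def active_partition(side_a, side_b):
    """Given voter counts on each side of a two-way split, return which side
    keeps a majority of the whole ensemble ('a', 'b'), or None if neither."""
    needed = majority(side_a + side_b)
    if side_a >= needed:
        return "a"
    if side_b >= needed:
        return "b"
    return None


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, "voters: majority", majority(n),
              "tolerates", tolerated_failures(n), "failure(s)")
```

A 4-member ensemble tolerates one failure, the same as a 3-member ensemble, and an even split such as 2/2 leaves neither side with a majority. An odd number of voting members avoids both problems.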
Real-world failures can be significantly more complex than the simple partitioning scenario presented above. For this reason, Paremus makes the following additional recommendations:
Avoid tightly coupling data-centre applications to a single central consensus service
Create a dedicated Ensemble for each business service that requires a consensus / lock-manager service. No matter how robust, a single centralised lock-manager used by all applications introduces a significant single point of failure within your data-centre environment. Also, remember that the operational overhead and complexities associated with the creation, management and repair of ZooKeeper Ensembles are removed by the Fabric Ensemble Manager.
First avoid, then if not possible, physically colocate tightly coupled components
Try to physically colocate the business service's software components that are tightly coupled to an Ensemble service with the dedicated Ensemble service. This is simply achieved with the Service Fabric via the use of resource contracts and fibre labels.
Map Fabrics to your Organisational Structures
Service Fabrics are simple to create and have very low operational overhead. Hence, rather than implementing one (or a few) very large Service Fabrics, each hosting many applications and associated Ensembles, consider implementing many smaller business-aligned Fabrics, each hosting a small number of inter-dependent applications and associated Ensembles.
Following these recommendations will help minimise unnecessary runtime dependencies, help ensure consistency and predictability, and minimise the impact of environmental failure: see Fundamentals for further discussion.