For applications that require Apache ZooKeeper (e.g. for consensus or leader election), the Service Fabric provides the `Ensemble Manager`. The `Ensemble Manager` enables the simple and rapid creation of one or more fully secured ZooKeeper Ensembles.

This section first introduces the `Ensemble Manager`, then provides a ZooKeeper/Curator-based example, and concludes with Ensemble Best Practices.

The Fabric Ensemble Manager

The Ensemble Manager dramatically reduces the operational complexity involved in creating, scaling, repairing and retiring ZooKeeper Ensembles.

Ensemble Views 

Ensembles currently hosted by a Service Fabric are listed under Entire's Ensembles view. 

Each Service Fabric has a dedicated Ensemble (e.g. myFabric) for private use by the Fabric's internal infrastructure services. By default, the Service Fabric's own infrastructure Fibres are the members of this Fabric Ensemble.

Click on one of the listed Ensembles to view the ensemble's members.

Ensemble member information includes:

  • Server Id: The Ensemble member's ID.
  • Current Role [leader|follower|observer]: Each member's current role.
  • Hosting Fibre: The name of the host Fibre.
  • Status: 
    • managed - The Ensemble member is both Expected and Active.
    • unmanaged - The Ensemble member has been discovered, but the Fabric Ensemble Manager did not expect it. This state can sometimes occur after a failure, or when adding fibres that have residual state.
    • missing - The expected Ensemble member is missing; this most likely indicates that the hosting fibre has failed.
  • Actions:
    • Make Observer - Transforms a follower or leader into an observer. An Observer is the same as a follower, except that it does not participate in the ZooKeeper election / voting process (see https://zookeeper.apache.org/doc/trunk/zookeeperObservers.html). This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
    • Make Participant - Transforms an Observer into a full Ensemble participant (i.e. a follower), able to participate in the Ensemble's voting protocol. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
    • Manage - Takes an unmanaged ensemble member and adds it to the expected set so that it becomes managed. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
    • Delete - Deletes this Ensemble member from the Ensemble. This action requires both the fabric ensemble and the target ensemble to have quorum (i.e. an active leader).
    • Force Delete - This action will proceed even if no quorum is available, and may leave the ensemble in an unusable state.

ZooKeeper specific configuration for each Ensemble member may be seen by clicking on the Ensemble member's Server Id URI. 
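The exact per-member configuration is generated by the Ensemble Manager, but it follows standard ZooKeeper conventions. As a hedged illustration only (the host names, ports and ids below are invented, not Fabric defaults), it includes entries along these lines:

	# Illustrative ZooKeeper member configuration (values are invented examples)
	dataDir=/var/lib/zookeeper
	clientPort=2181
	# one server.N line per ensemble member; N is the member's Server Id
	server.1=fibre-host-1:2888:3888
	server.2=fibre-host-2:2888:3888
	# a trailing :observer marks a member transformed via Make Observer
	server.3=fibre-host-3:2888:3888:observer

On the observer member itself, ZooKeeper additionally expects peerType=observer in its configuration.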

Creating an Ensemble

To create a new Ensemble:

  1. Specify the Ensemble name in the dialogue box; e.g. ZKExample.
  2. From the list of available fibres select the initial member; e.g. spark-infra-1.1.
  3. Press the Create button.

In addition to the name, the fibre list also displays each fibre's `Labels`. This information is useful when deciding the Ensemble's runtime topology: see Ensemble Best Practices.

 

As shown, the `ZKExample` Ensemble is created with one member (spark-infra-1.1).

Each fibre selected to participate in an ensemble has a corresponding label of the form `ensemble-$Ensemble=member`: e.g. the spark-infra-1.1 fibre is labelled `ensemble-ZKExample=member`.

Membership of the `ZKExample` Ensemble may be expanded at any time by simply selecting and then adding more fibres.

The status of each ZKExample ensemble member is shown in the `ZKExample` ensemble view. 

Finally, the `ZKExample` ensemble may be selected for removal, and then removed using the Delete button. However, don't do this yet, as the ZKExample ensemble is required in the following example.

The Fabric Ensemble (e.g. myFabric) may be expanded, or contracted down to one member, but cannot be removed.

 

A ZooKeeper/Curator-based Example

This example demonstrates lock-coordination using the Curator locking example (see https://curator.apache.org/curator-examples/). In addition to the ZooKeeper clients demonstrating the use of the Curator library, the web system.part also demonstrates the use of WebSockets and the OSGi Alliance's new HTTP Whiteboard specification.
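For orientation, the following is a minimal, self-contained sketch of the Curator locking pattern the example is built around; the connect string, lock path and timeout are illustrative assumptions, and in the real example the connection string is injected via the zk.connection property described below.

	// Minimal Curator locking sketch; paths and addresses are illustrative
	import java.util.concurrent.TimeUnit;

	import org.apache.curator.framework.CuratorFramework;
	import org.apache.curator.framework.CuratorFrameworkFactory;
	import org.apache.curator.framework.recipes.locks.InterProcessMutex;
	import org.apache.curator.retry.ExponentialBackoffRetry;

	public class LockSketch {
		public static void main(String[] args) throws Exception {
			// In the Fabric example the connect string arrives via zk.connection
			CuratorFramework client = CuratorFrameworkFactory.newClient(
					"localhost:2181", new ExponentialBackoffRetry(1000, 3));
			client.start();

			InterProcessMutex lock = new InterProcessMutex(client, "/examples/lock");
			if (lock.acquire(10, TimeUnit.SECONDS)) {
				try {
					// critical section: only one client holds the lock at a time
					System.out.println("lock acquired");
				} finally {
					lock.release();
				}
			}
			client.close();
		}
	}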

ZooKeeper Example Source Code

The source code for the example is available at https://github.com/paremus/zookeeper-examples.

 

If you've already worked through the Hello Tutorial, then the structure of the zookeeper-system will be familiar.

  • The system is composed of two system.parts:  zookeeper-worker and zookeeper-web.
  • The Resource Contracts are defined such that:
    • zookeeper-worker instances are deployed to fibres labelled `AppLayer=true`.

    • zookeeper-web instances are deployed to fibres labelled `WebLayer=true`.
  • ZooKeeper connection information is passed to both system.parts via the zk.connection property (a sketch of how a component might consume it follows):
<property name="zk.connection" value="${zkClient#ZKExample}"/> 
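As an aside, the following hedged sketch shows how a Declarative Services component could consume this configuration; the class and method names are hypothetical, with only the PID and the zk.connection key taken from the system document.

	// Hypothetical DS component receiving the zk.connection property
	import java.util.Map;

	import org.osgi.service.component.annotations.Activate;
	import org.osgi.service.component.annotations.Component;

	@Component(configurationPid = "com.example.zookeeper.worker")
	public class WorkerComponent {
		@Activate
		void activate(Map<String, Object> config) {
			// The Fabric resolves ${zkClient#ZKExample} to the ensemble's
			// connection string before the configuration is delivered
			String zkConnect = (String) config.get("zk.connection");
			// ... create a Curator client using zkConnect ...
		}
	}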

 

The full zookeeper-system document is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<system xmlns="http://schema.paremus.com/sf/1.2" boundary="fibre" name="com.example.zookeeper-1.13.zookeeper-system" version="1.0.0.SNAPSHOT" repopath="https://nexus.paremus.com/content/repositories/snapshots/com/example/zookeeper-1/13/zookeeper-index/1.0.0-SNAPSHOT/zookeeper-index-1.0.0-20160721.151054-3.xml">
	<description>
		A simple websocket application with a set of 
		distributed workers using ZooKeeper locking.
	</description>
	<admin group="demo"/>

	<!-- Declarative Services ZooKeeper Worker -->
	<system.part name="zookeeper-worker">
        
		<config pid="com.example.zookeeper.worker">
			<property name="zk.connection" value="${zkClient#ZKExample}"/>
		</config>
		<contract features="(AppLayer=true)"/>
		<replication.handler name="scale" type="scalable">
			<property name="scaleFactor" type="float" value="1"/>
			<property name="minimum" type="integer" value="1"/>
		</replication.handler>
	    <system.part.element name="zookeeper-worker" category="osgi.bundle"/>
	</system.part>

	<!-- Remote Web viewer -->
	<system.part name="zookeeper-web">
		<config pid="com.example.zookeeper.web">
			<property name="zk.connection" value="${zkClient#ZKExample}"/>
		</config>
		
		<config pid="org.apache.felix.http">
            <property name="org.osgi.service.http.port" value="8192"/>
        </config>
		<contract features="(WebLayer=true)"/>
		<system.part.element name="org.apache.felix.http.jetty" category="osgi.bundle"/>
		<system.part.element name="zookeeper-web" category="osgi.bundle"/>
	</system.part>
</system>

 

To run the zookeeper-system:

  1. The ensemble ZKExample created in the previous section is required; if you deleted it, recreate it. A ZKExample with one ensemble member is sufficient for running the example.
  2. Import zookeeper-system from the URL - https://nexus.paremus.com/content/repositories/public/com/example/zookeeper-1/13/zookeeper-system/1.0.0/zookeeper-system-1.0.0.xml.
  3. Set the `AppLayer=true` and `WebLayer=true` labels on the appropriate fibres in your Fabric.
  4. Deploy zookeeper-system.
  5. The zookeeper-system URL endpoint is now visible via Entire's App view. Click on the URL to access the example.

 

As shown, each of the AppLayer fibres competes to acquire a lock from the ZKExample ensemble. The fibre that acquires the lock decrements a shared counter and then releases the lock; the process then repeats. All users of the application share a single countdown, and receive notifications of changes in state.
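The worker behaviour just described corresponds to a loop along the following lines; this is a hedged sketch, not the example's actual source, and the znode paths are assumptions.

	// Illustrative worker loop: acquire lock, decrement shared counter, release, repeat.
	// Assumes the /example/count znode has been seeded with an integer value elsewhere.
	import java.nio.charset.StandardCharsets;

	import org.apache.curator.framework.CuratorFramework;
	import org.apache.curator.framework.recipes.locks.InterProcessMutex;

	public class CountdownWorker {
		static void run(CuratorFramework client) throws Exception {
			InterProcessMutex lock = new InterProcessMutex(client, "/example/lock");
			while (true) {
				lock.acquire();
				try {
					byte[] data = client.getData().forPath("/example/count");
					int count = Integer.parseInt(new String(data, StandardCharsets.UTF_8));
					if (count <= 0) {
						break; // countdown finished
					}
					// holding the lock makes this read-modify-write safe across fibres
					client.setData().forPath("/example/count",
							Integer.toString(count - 1).getBytes(StandardCharsets.UTF_8));
				} finally {
					lock.release();
				}
			}
		}
	}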

Ensemble Best Practices

When creating ZooKeeper ensembles with the Ensemble Manager, the usual best practices still apply:

  1. Ensure that each Ensemble is formed from an odd number of members. This is required to ensure correct behaviour in simple partition events: i.e. after the environment has been partitioned into two regions, the region with the majority of ensemble members (`>N/2`) continues to be active, while the other region, with `<N/2` ensemble members, drops to read-only behaviour. For example, if a 5-member ensemble is split 3/2, the 3-member region retains quorum while the 2-member region becomes read-only.
  2. Keep ensemble size small: i.e. 3 (1 leader, 2 followers) or 5 (1 leader, 4 followers).
    1. From a resilience perspective there is no reason to build large Ensembles. If an Ensemble member fails, it is simply replaced using the Fabric Ensemble Manager.
    2. Also note that ZooKeeper write performance rapidly decreases as the number of ensemble members increases.
  3. If the solution requires many ZooKeeper ensemble participants, then make the majority of these `observers`. `Observers` are the same as `followers`, with the one difference that observers do not participate in the ZooKeeper voting protocol.


Real world failures can be significantly more complex than the simple partitioning scenario presented above. For this reason, Paremus makes the following additional recommendations:

Avoid tightly coupling data-centre applications to a single central consensus service

Create a dedicated Ensemble for each business service that requires a consensus / lock-manager service. No matter how robust, a single centralised lock-manager used by all applications introduces a significant single point of failure within your data-centre environment. Also, remember that the operational overhead and complexities associated with the creation, management and repair of ZooKeeper Ensembles are removed by the Fabric Ensemble Manager.

First avoid, then if not possible, physically co-locate tightly coupled components

Try to physically co-locate the business service's software components that are tightly coupled to an Ensemble service with that dedicated Ensemble service. This is simply achieved with the Service Fabric via the use of system.part resource contracts and fibre labels, as sketched below.
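For example (an illustrative fragment, not taken from the example source), a system.part can be pinned to the fibres hosting the ZKExample ensemble by reusing the ensemble membership label that the Ensemble Manager sets automatically:

	<!-- Illustrative fragment: deploy this system.part only to fibres that are
	     members of the ZKExample ensemble (label set by the Ensemble Manager) -->
	<system.part name="zookeeper-worker">
		<contract features="(ensemble-ZKExample=member)"/>
		...
	</system.part>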

Map Fabrics to your Organisational Structures

Service Fabrics are simple to create and have very low operational overhead. Hence, rather than implementing one (or a few) very large Service Fabrics, each hosting many applications and associated Ensembles, consider implementing many smaller, business-aligned Fabrics, each hosting a small number of inter-dependent applications and associated Ensembles.

Following these recommendations will help minimise unnecessary runtime dependencies, help ensure consistency and predictability, and minimise the impact of environmental failure: see Fundamentals for further discussion.

