
JVM & Virtual Machines - Performance Considerations

When running Java in conjunction with virtual machines, consider the following.

  • As the number of JVM instances per VM increases, additional overhead is incurred as each JVM is initialised at start-up.
  • Memory in Java is managed inside the JVM, so attempts by the host virtual machine to 'optimize memory usage' by removing pages will degrade Java application performance across the population of co-located JVMs.

If JVMs are co-located, the required heap size for each JVM should be specified explicitly, and sufficient physical memory allocated to the underlying virtual machine to meet the sum of the JVM heap and operating system memory requirements.
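For example, a sketch of the sizing arithmetic, assuming four co-located JVMs each given a fixed 2 GB heap (all figures and the launch command are illustrative):

# Fix each JVM's heap so the hypervisor cannot usefully reclaim pages
# from under the Java process (values are illustrative):
java -Xms2048m -Xmx2048m -jar fibre.jar

# Sizing the underlying virtual machine:
#   4 JVMs x 2 GB heap                       = 8 GB
#   + per-JVM native overhead (~0.5 GB each) = 2 GB
#   + operating system                       = 1 GB
#   => provision at least 11 GB of physical memory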

To sidestep these considerations, virtual machine vendors usually suggest that, instead of stacking multiple JVMs within a virtual machine, one should host more application instances per JVM by increasing the size of the JVM: adding more threads and enlarging the heap. However, this approach simply shifts the problem from the VM to the JVM. Large heap sizes (beyond 4GB) may be required, and these can be problematic as they degrade JVM performance through the increasing cost of garbage collection. Custom JVMs with more efficient garbage collection may be purchased to address this degradation, but this does not address the issue that multiple applications are now packed within a single failure domain.

Paremus recommends that the number of Service Fabric Fibre instances (JVM instances) per Atlas Agent (physical/virtual resource) should ideally be limited to 1, and should be no more than 4. If this upper limit needs to be exceeded, your requirements should be discussed with Paremus.

Linux

'GConf ...' D-BUS Library (RedHat 6)

The Linux package dbus-1.2.24-7.el6_3.x86_64, used by desktop applications such as GNOME and Emacs, is known to cause the JVM to crash on RedHat 6 versions of Linux. This package should be uninstalled on servers being used as Service Fabric Fibres.
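A minimal sketch of checking for and removing the package (review the dependency list yum reports before confirming, as desktop environments such as GNOME depend on it):

# Is the problematic package installed?
rpm -q dbus

# Remove it from headless servers acting as Service Fabric Fibres
yum remove dbus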

Note: This no longer appears to be a problem when using Java 8.

Use of 127.0.1.1 Loopback (Debian/Ubuntu)

As reported in http://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution

The IP address 127.0.1.1 in the second line of this example may not be found on some other Unix-like systems. The Debian Installer creates this entry for a system without a permanent IP address as a workaround for some software (e.g., GNOME) as documented in the bug #719621.
The <host_name> matches the hostname defined in the "/etc/hostname".

Using the 127.0.1.1 loopback address bound to the hostname in /etc/hostname will cause Service Fabric Fibres to fail to communicate with each other.

The hostname in /etc/hostname must be bound to the public network interface over which you expect the Service Fabric Fibre to communicate.
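A sketch of the change, assuming a host named fabric1 with public address 192.168.1.10 (both illustrative):

# /etc/hosts as created by the Debian Installer (problematic):
127.0.0.1     localhost
127.0.1.1     fabric1

# Bind the hostname to the public interface address instead:
127.0.0.1     localhost
192.168.1.10  fabric1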

 

Atlas

Service Fabric start-up failure - opt/container/bin/posh: line 65: exec: java: not found

Ensure Java is installed on the nodes and JAVA_HOME is set.

 

% atlas -f RAN -update=simple
ACTION          HOSTID          CONTAINER       REASON        
start           test1[1]        simple-infra    [1..1] (!(group=*))
 
[test1:install simple-infra.1] Running
[test1:start simple-infra.1] Error: start failed. check -host-status.
 
%
%
% atlas  --host-status=test1
filterAttrs {os.name=Linux, os.arch=x86, host=test1, os.version=2.6.25-14.fc9.i686}
definition null
fabric null
group null
lease null
max 1
uri atlas://test1:4433?192.168.0.221
simple-infra.1 EXIT 127
simple-infra.1.log -----------------------------------------------------
opt/container/bin/posh: line 65: exec: java: not found

 

The Service Fabric expects Java to be either on the PATH, or specified by the JAVA_HOME environment variable. Typically these are either set system-wide in /etc/environment or /etc/bashrc, or set by an individual user in ~/.bashrc.

Atlas is invoked by /etc/init.d/atlas using 'su -c paremus ....', so it should inherit the environment of the 'paremus' user. Alternatively, you can set JAVA_HOME in /opt/atlas-1.1.5/etc/atlas_env.sh.

If a change is made to bashrc or atlas_env.sh, Atlas must be restarted for the new environment to take effect. 
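For example (the JDK path is illustrative, and the atlas_env.sh location varies with the installed Atlas version):

# Append JAVA_HOME to the Atlas environment script
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk' >> /opt/atlas-1.1.5/etc/atlas_env.sh

# Restart Atlas so the new environment takes effect
/etc/init.d/atlas restart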

'libgcc_s.so.1 => not found' – or equivalent errors relating to missing 32-bit libraries

Check for missing Atlas dependencies on the Unix platform:

# ldd linux/atlas-agent
linux-gate.so.1 =>  (0x00838000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00b31000)
libpthread.so.0 => /lib/libpthread.so.0 (0x005b6000)
libdl.so.2 => /lib/libdl.so.2 (0x0053e000)
libgcc_s.so.1 => not found
libc.so.6 => /lib/libc.so.6 (0x00653000)
/lib/ld-linux.so.2 (0x00fc3000)

 

Install the missing library:

 

# yum provides /lib/libgcc_s.so.1
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.linux.duke.edu
* extras: mirrors.gigenet.com
* updates: mirror.steadfast.net
libgcc-4.4.7-3.el6.i686 : GCC version 4.4 shared support library
Repo        : base
Matched from:
Filename    : /lib/libgcc_s.so.1
 
# yum install libgcc.i686
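
Re-running ldd should then show the library resolved (the load address shown is illustrative):

# ldd linux/atlas-agent | grep libgcc
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00a31000)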

 

Installation

I do not have remote display access to the node on which I intend to install the Service Fabric. Given that I cannot run IzPack on the node, how should I install the Service Fabric software?

IzPack has the option of creating an install script which may be used to install the Service Fabric on a compute node from the command line; see headless installation.

Troubleshooting Multicast 

If no initial peers are set, the Paremus Service Fabric will attempt to use IP multicast to discover fibres. 

By default the discovery process uses the following IP address and ports:

      224.0.1.84 jini-announcement - port 4160
      224.0.1.85 jini-request - port 4160

If you intend to use Multicast and find that a subset of your fibre population is unable to join a Service Fabric, check that multicast is configured appropriately.

Also note: when using a single machine, there are configurations in which multicast packets are not propagated between processes.
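Before testing further, it can be useful to confirm which multicast groups each interface has joined; on Linux, for example:

# List multicast group memberships per interface
ip maddr show

# or, on older systems:
netstat -gn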

Testing Multicast with the nicnack tool

nicnack.zip (click to download) is a lightweight command-line tool that can be used to provide initial diagnostics of network and host support for IP multicast. nicnack is network-interface aware, i.e. it provides separate information for each network interface.

nicnack may run in one of three modes:

List mode

This mode lists all network interfaces on the machine, along with IP and host name information. For example:

./nicnack.sh list
NICs:

display name: vmnet8
name: vmnet8
ipv6: fe80:0:0:0:250:56ff:fec0:8
ip: 172.16.206.1

display name: vmnet1
name: vmnet1
ipv6: fe80:0:0:0:250:56ff:fec0:1
ip: 192.168.32.1

display name: wlan0
name: wlan0
ipv6: fe80:0:0:0:211:50ff:fe1b:d845
ip: 192.168.2.2

display name: lo
name: lo
ipv6: 0:0:0:0:0:0:0:1
ip: 127.0.0.1 -> localhost

 

Send mode

In this mode nicnack multicasts a custom message every two seconds to a specified multicast group and port. This can be read by other nicnack instances in receive mode.

> ./nicnack.sh send 225.123.123.123 12345 hello
vmnet8: preparing
vmnet1: preparing
lo: preparing
wlan0: preparing
vmnet8: configured socket
vmnet1: configured socket
wlan0: configured socket
vmnet8: sending
vmnet1: sending
vmnet1: sent hello
wlan0: sending
wlan0: sent hello
vmnet8: sent hello
lo: configured socket
lo: sending
lo: sent hello
vmnet1: sent hello
vmnet8: sent hello
lo: sent hello
wlan0: sent hello
vmnet1: sent hello
wlan0: sent hello
vmnet8: sent hello
lo: sent hello

 

Receive mode

In this mode nicnack listens for and displays messages sent to a specified multicast group and port by nicnack instances in send mode. Used in conjunction with send mode, this can establish whether or not multicast traffic is successfully propagated between two hosts. Note that multicast visibility is not symmetric, i.e. host A's ability to send multicast packets to host B does not imply host B's ability to send packets to host A. A sample receive-mode nicnack session follows.

> ./nicnack.sh receive 225.123.123.123 12345
vmnet8: preparing
lo: preparing
wlan0: preparing
vmnet1: preparing
wlan0: configued socket
wlan0: receiving
vmnet1: configued socket
vmnet1: receiving
vmnet8: configued socket
vmnet8: receiving
lo: configued socket
lo: receiving
wlan0: received 'hello'
vmnet1: received 'hello'
vmnet8: received 'hello'
lo: received 'hello'

 

If, having run the nicnack utility, you suspect that IP multicast may be the issue, the following two areas should be examined in more detail.

Firewalls

The security firewall on one, a subset, or all of the machines running the Paremus Service Fabric environment may be configured by default to block IP multicast traffic.

  • Linux - To enable multicast send/receive capability on Linux systems, insert the following rule into the operating system's iptables configuration (see the sketch at the end of this section):
-A INPUT -d 224.0.0.0/4 -j ACCEPT
  • Windows - In the case of Windows XP, by default, the Group Policy settings for the Windows Firewall are "Not Configured" for all objects. This allows the Windows Firewall to use its default settings, which are quite restrictive. With respect to multicast, the default settings prohibit unicast responses to multicast or broadcast requests.

On the relevant machines, edit Network > Network Connections > Firewall and set the 'Prohibit unicast response to multicast or broadcast requests' policy to Disabled. The available settings are:

  • Not Configured - The incoming unicast response is accepted if received within 3 seconds. The setting can be overridden by a local administrator.
  • Enabled - The incoming unicast response is dropped. This cannot be overridden by a local administrator.
  • Disabled - The incoming unicast response is accepted if received within 3 seconds. This cannot be overridden by a local administrator.

For other firewall products or other Microsoft operating system versions, check the relevant documentation.
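For the Linux case above, a minimal sketch of applying and persisting the rule, assuming a RHEL-style system with the iptables service:

# Insert the multicast accept rule into the running firewall
iptables -I INPUT -d 224.0.0.0/4 -j ACCEPT

# Persist the rule across reboots
service iptables save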

Network Configuration

Simple layer II network switches treat multicast traffic in the same manner as broadcast traffic; that is, they forward multicast packets to all active switch ports. If your Paremus Service Fabric test machines connect to a layer II network comprising one or more simple layer II Ethernet switches (interconnected without intervening layer III routers), then the network is unlikely to be the cause of IP multicast connectivity issues.

In more sophisticated environments, the network infrastructure supports a multicast protocol known as IGMP. Within an IGMP-enabled network environment, traffic associated with a multicast group is only forwarded to ports that have members participating in that group. A layer II switch supporting IGMP snooping can passively snoop on IGMP Query, Report and Leave (IGMP version 2) packets transferred between IP multicast routers/switches and IP multicast hosts (on each switch port) to learn the IP multicast group membership required by each port. The advantage of IGMP snooping is that it generates no additional network traffic, whilst significantly reducing the multicast traffic passing through the switch, as multicast packets are only delivered to hosts that have registered interest in the multicast group.

If the Paremus Service Fabric functions correctly when run on a single machine, and also when run in a distributed environment with machines connected via a simple layer II network, but fails in a more complex network environment, then the multicast configuration of the network is the most likely cause.

In such circumstances, politely explain the problem to your network administrators. They will be able to help you diagnose the issue in greater detail, and may be willing and able to disable IGMP snooping on the relevant switches to verify whether or not IGMP is a contributing factor.

 

Service Fabric License Management

How do I install / update a Service Fabric license?

Usually the Service Fabric license is installed during the IzPack installation process. If for some reason the appropriate license.ini file is not available at installation time, it can subsequently be copied into the $install_root/etc directory. The Fibre image should then be re-built as explained here. The same process is used to update an expired license.
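A minimal sketch of the copy step (the install root shown is illustrative):

# Copy the license into the Service Fabric etc directory,
# then re-build the Fibre image as described in the linked documentation
cp license.ini /opt/servicefabric/etc/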

System Documents

How do I request multiple instances of a Managed Service Factory part?

Managed Service Factory (MSF) parts can be created as follows:

<system.part category="msf" name="com.example.hello">
    <property name="language" value="en"/>
</system.part>

In this example, the name attribute specifies both the name of the element within the document and the Persistent Identity (PID) of the configuration record.

In order to create multiple configuration records for the same PID, it is necessary to create two parts with different names and to override the name-to-PID mapping. This can be done by setting the part attribute as follows:

<system.part category="msf" name="hello-english" part="com.example.hello">
    <property name="language" value="en"/>
</system.part>

<system.part category="msf" name="hello-german" part="com.example.hello">
    <property name="language" value="de"/>
</system.part>

Note that the name attribute differs, in order to create two distinct records, but both records have the same part value and therefore both map to the configuration PID com.example.hello.

Cannot connect to Fabric Management

If the redirector fails to locate Entire, it will print an error message: "Unable to redirect". This could happen if:

  • There is no Infra Fibre currently running in the Fabric.
  • A network issue prevents the selected Fibre from seeing the active Infra Fibre(s).

 

Repositories blocked by Proxy

When using Atlas, place the equivalent of the following in the $FABRICHOME/var/atlas start scripts:

System.setProperty("http.proxyHost", "localhost");
System.setProperty("http.proxyPort", "8080");

 

Alternatively, if you are using an environment configuration file, use the following JVM start flags:

-Dhttp.proxyHost=XX.XX.XX.XX -Dhttp.proxyPort=8080
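
If local repositories or fibre-to-fibre HTTP traffic should bypass the proxy, the standard http.nonProxyHosts JVM property can be added (host patterns are illustrative):

-Dhttp.proxyHost=XX.XX.XX.XX -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts="localhost|127.0.0.1|*.internal.example.com"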

 

 

Viewing Port Usage

View the ports currently in use by a Fibre using the Unix lsof command:

$ lsof -P -p <PID> | grep LISTEN
java    14523 derek   12u  IPv6 0x12fd65a8       0t0       TCP *:55087 (LISTEN)
java    14523 derek   15u  IPv6 0x13266720       0t0       TCP *:55088 (LISTEN)
...
...
...
java    14523 derek   41u  IPv6 0x13329664       0t0       TCP *:55094 (LISTEN)
java    14523 derek   53u  IPv6 0x1331eff4       0t0       TCP [::10.0.1.9]:9012 (LISTEN)
java    14523 derek   54u  IPv6 0x131bce4c       0t0       TCP *:9013 (LISTEN)
java    14523 derek   56u  IPv6 0x12fd7ff4       0t0       TCP *:55114 (LISTEN)
java    14523 derek   59u  IPv6 0x1304c720       0t0       TCP *:55119 (LISTEN)
...
...
...
java    14523 derek  151u  IPv6 0x12fd6a70       0t0       TCP *:9021 (LISTEN)
java    14523 derek  154u  IPv6 0x1326480c       0t0       TCP *:55203 (LISTEN)
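
On Linux hosts without lsof, the same information can be obtained with ss (modern distributions):

$ ss -lntp | grep "pid=<PID>"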

 

Setting Port Ranges

The base port value is defined in /etc/posh/environment. Each Fibre uses a default port range starting at the base port, which is incremented for each additional Fibre started on the same host: 9000 + (100 * fibre instance-id).
system:setProperty posh.basePort (expr 9000 + 100 * $INSTANCE) 

 

This value is then used as follows in etc/init.d/00-config:

####################################################################
# PORT & NETWORK SETTINGS
####################################################################

setcfg basePort (system:getproperty posh.basePort)
basePort = (getcfg basePort)

# http server port
setsys org.osgi.service.http.port (expr $basePort + 0)

# jmx server port
setcfg jmxPort (expr $basePort + 1)

# port used by clients outside the fabric to discover service locations
# locationPort must also be changed in FabricAdmin/config.ini
setcfg locationPort 49150

# limit the port ranges used to export remote objects
setcfg minPort (expr $basePort + 10)
setcfg maxPort (expr $basePort + 99)
#setcfg maxPort 65535
 
  • The locationPort parameter is used by the manage:detect command; locationPort is not offset by instance-id because, when multiple Nimble nodes are running on the same host, only one of them is required to use the locationPort.
  • minPort and maxPort control the range of dynamically allocated ports for each Fibre.
  • org.osgi.service.http.port is the default port used by the HTTP service if not overridden by OSGi Configuration Admin (usually in etc/nimble/cm.policy).

With a base port of 9000, the default is to assign dynamically allocated ports in the range 9010 to 9099. However, this can be limited to a smaller range, as in the sketch below.
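For example, restricting each Fibre to 20 dynamically allocated ports would look as follows in etc/init.d/00-config (a sketch only; see the caution below):

# limit the port ranges used to export remote objects
setcfg minPort (expr $basePort + 10)
setcfg maxPort (expr $basePort + 29)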

You risk port starvation if too restrictive a port range is used and too many services are installed into the same Fibre. Before modifying these values, discuss the changes with your Paremus Support Engineer.
