
Building a Connected Enterprise for Ingram Content Group | Pragma Edge
Building a Connected Enterprise for Ingram Content Group What Our Clients Say “The Pragma Edge team’s collaboration and expertise were key to
High availability of Global Mailbox is limited during some situations, for example when a data center goes down or when one or more nodes in a data center are not operational.
A protocol action (also called a trading partner action) is the upload or download of a message or file to or from Global Mailbox.
The availability of some Global Mailbox components affects the availability of protocol actions. The following sections explain how:
How Cassandra impacts availability of protocol actions
With delayed (asynchronous) replication (default setting), 50% + 1 of the nodes in the data center where the action is performed must be available. For example, if your data center has 3 Cassandra nodes, then 50% + 1 would be 1.5 + 1 = 2.5, which is rounded down to 2. Therefore, 2 Cassandra nodes must be available for the protocol action to complete. The nodes in other data centers are not needed for the request to be successful.
As another example, if your data center has 4 Cassandra nodes, then 50% + 1 would be 2 + 1 = 3. Therefore, 3 Cassandra nodes must be available for the protocol action to complete.
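The 50% + 1 rule in the two examples above can be sketched as a small calculation. This is only an illustration; the function name local_quorum is hypothetical and is not part of Global Mailbox or Cassandra.

```python
def local_quorum(node_count: int) -> int:
    """Return 50% + 1 of the nodes, rounded down, as described in the text."""
    if node_count < 1:
        raise ValueError("a data center needs at least one Cassandra node")
    # Integer division halves and rounds down in a single step.
    return node_count // 2 + 1


if __name__ == "__main__":
    for nodes in (3, 4):
        print(f"{nodes} nodes -> {local_quorum(nodes)} must be available")
```

With 3 nodes the result is 2, and with 4 nodes the result is 3, matching the examples in the text.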
With immediate (synchronous) replication, 50% + 1 of the nodes across the data centers must be available, and two of the available nodes must be in the data center where a user is doing an upload or download. Consider an environment with 3 nodes in data center 1 and 3 nodes in data center 2. In total, there are 6 nodes across the two data centers. To meet the 50% + 1 requirement, 4 nodes are required. Out of the 4 nodes, 2 nodes must be in the data center where the user is uploading or downloading a file.
Immediate replication requires that the file be replicated to at least one other data center. To achieve this, the other data center must have 50% + 1 of its nodes up and running. If the nodes are not available, the upload fails.
In both scenarios, 2 nodes must be available in the data center where a user is performing an action. To ensure that Global Mailbox can survive the loss of a node in the data center, we must add an additional node. This means we must have a minimum of 3 nodes in each data center.
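The immediate replication rule can be sketched as a check over per-data-center node counts. This is a simplified model under stated assumptions: the function name and the dictionary layout are hypothetical, and the local minimum of 2 nodes is taken literally from the 3-node example in the text.

```python
from typing import Dict


def immediate_action_possible(nodes_up: Dict[str, int],
                              nodes_total: Dict[str, int],
                              local_dc: str) -> bool:
    """Check the Cassandra conditions stated in the text for immediate replication."""
    # 50% + 1 of all nodes across the data centers, rounded down.
    required_overall = sum(nodes_total.values()) // 2 + 1
    enough_overall = sum(nodes_up.values()) >= required_overall
    # Assumption: the local minimum of 2 comes from the 3-node example.
    enough_local = nodes_up.get(local_dc, 0) >= 2
    return enough_overall and enough_local


if __name__ == "__main__":
    total = {"dc1": 3, "dc2": 3}
    print(immediate_action_possible({"dc1": 2, "dc2": 2}, total, "dc1"))  # 4 of 6 up
    print(immediate_action_possible({"dc1": 3, "dc2": 0}, total, "dc1"))  # 3 of 6 up
```

In the first call, 4 of 6 nodes are up with 2 of them local, so the conditions are met. In the second call, only 3 of 6 nodes are up, so the check fails even though the local data center is fully available.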
How ZooKeeper nodes impact availability of protocol actions
During normal functioning of the Global Mailbox, a minimum of 2 ZooKeeper servers must be operational across the data centers for trading partner action or protocol actions to complete. The ZooKeeper watchdog process must also be running on all ZooKeeper nodes so that ZooKeeper failures can be detected, and recovery with a reduced ensemble can be initiated. While the watchdog is setting up the reduced ensemble, there is a window of time where trading partner action fails. The action proceeds after the reduced ensemble is started. If the number of running ZooKeeper servers reduces to 1 or 0, all trading partner actions fail until the number of ZooKeeper servers comes back to 2 or more.
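The ZooKeeper condition reduces to a threshold on the number of running servers. A minimal sketch follows; the names are hypothetical, and the short failure window while the watchdog sets up the reduced ensemble is not modeled.

```python
MIN_ZOOKEEPER_SERVERS = 2  # minimum stated in the text, counted across all data centers


def zookeeper_allows_actions(running_servers: int) -> bool:
    """Return True when enough ZooKeeper servers run for actions to complete."""
    return running_servers >= MIN_ZOOKEEPER_SERVERS


if __name__ == "__main__":
    for count in (3, 2, 1, 0):
        state = "actions can complete" if zookeeper_allows_actions(count) else "actions fail"
        print(f"{count} ZooKeeper servers running: {state}")
```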
How shared disk impacts availability of protocol actions
When the Sterling B2B Integrator Global Mailbox Client Adapter starts, it needs to load the configuration files on the shared disk. If the shared disk fails before the configuration file is loaded, protocol actions cannot be completed. If the shared disk fails after the configuration file is loaded, transfers of files of any size fail unless a session was already active before the shared disk became unavailable.
How replication server and payload replication type impact the availability of protocol actions
If your replication type is delayed (asynchronous), you can upload files of any size to the data center when the replication server is down in the local or other data centers. Files are replicated later when the replication server becomes operational.
If your replication type is immediate (synchronous), non-inline payloads must be replicated to at least one other data center for the upload to be successful. For example, if you have uploaded a file in data center 1, the Global Mailbox management node in data center 2 must be able to pull the file into the shared disk in data center 2 (the local shared disk). The Global Mailbox management node pulls the file by using the replication server in data center 1. Therefore, the management node and shared disk in data center 2 and the replication server in data center 1 must all be operational.
Mailboxes can be created by using a protocol because those tasks are independent of payload replication.
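The way the replication type changes the upload outcome can be sketched as a decision function. This is an illustration only: the function and parameter names are hypothetical, and the check covers just the replication-related components named in this section, not Cassandra, ZooKeeper, or WebSphere MQ.

```python
def upload_can_succeed(replication_type: str,
                       inline_payload: bool,
                       remote_management_node_up: bool,
                       remote_shared_disk_up: bool,
                       local_replication_server_up: bool) -> bool:
    """Model the replication-related upload conditions described in the text."""
    if replication_type == "delayed":
        # Files are replicated later, once the replication server is back.
        return True
    if replication_type != "immediate":
        raise ValueError("replication_type must be 'delayed' or 'immediate'")
    if inline_payload:
        # The section's requirement applies to non-inline payloads only.
        return True
    # The remote management node pulls the file into its local shared disk
    # by using the replication server in the data center that took the upload.
    return (remote_management_node_up
            and remote_shared_disk_up
            and local_replication_server_up)


if __name__ == "__main__":
    print(upload_can_succeed("delayed", False, False, False, False))
    print(upload_can_succeed("immediate", False, True, True, False))
```

The first call succeeds because delayed replication defers the copy. The second fails because the replication server in the uploading data center is down.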
How WebSphere MQ impacts availability of protocol actions
When files are uploaded, a message is placed on WebSphere® MQ to signal that an event should be created. If WebSphere MQ is down, the message is not created and the upload fails.
File downloads are not impacted by WebSphere MQ availability.
Messages are processed in the data center in which they are received. A message or file need not be replicated before the event is raised.
Processing of messages is performed by the Event Rule Adapter in Sterling B2B Integrator. The following sections outline the availability aspects of Event Rule Adapter in relation with the availability or non-availability of other Global Mailbox components:
How Cassandra impacts message processing
To process messages, the data center where a file is uploaded must have a majority of Cassandra nodes available. For example, if you have 3 nodes in data center 1, for messages to be processed in data center 1, you must have 2 nodes up in data center 1.
How ZooKeeper impacts message processing
During normal functioning of Global Mailbox, a minimum of 2 ZooKeeper servers must be operational across data centers for message processing to complete. The ZooKeeper watchdog process must also be running on all ZooKeeper nodes so that ZooKeeper failures can be detected, and recovery with a reduced ensemble can be initiated. While the watchdog is setting up the reduced ensemble, there is a window of time where message processing fails. The processing proceeds after the reduced ensemble is started. If the number of running ZooKeeper servers reduces to 1 or 0, all trading partner actions fail until the number of ZooKeeper servers comes back to 2 or more.
How shared disk impacts message processing
A file is processed in the data center in which it is received. The shared disk in which the file is received must be available to process the file in that data center.
How replication server and payload replication type impact message processing
Message processing is not dependent on the replication server.
How WebSphere MQ impacts message processing
A multi-instance (active, passive) queue manager is required to ensure that events can be sent from Global Mailbox to Sterling B2B Integrator for processing.
WebSphere® MQ must be available in the data center where a file is uploaded for Global Mailbox to publish events (producer side) to it and for Sterling B2B Integrator to consume events from it (consumer side). For a Sterling B2B Integrator node to process a message, it must connect to the local WebSphere MQ queue to get those messages.
Global Mailbox management node provides the Global Mailbox administration user interface. At least one Global Mailbox management node must be available in a data center for an administrator to use the administration user interface.
Global Mailbox management node also performs non-inline payload replication. If no Global Mailbox management node is operational in a data center, then that data center cannot replicate non-inline payloads for files that are uploaded to other data centers. Therefore, not having a Global Mailbox management node operational in a data center is equivalent to not having any replication server operational in the data center.
Global Mailbox management node also runs the scheduler. Some scheduled jobs operate on resources that belong to the local data center, for example, PayloadPurgeJob. If there are no Global Mailbox management nodes running in the data center, then the jobs that operate on the local data center resources do not run until a node is started on that data center.
How Cassandra impacts availability of Global Mailbox management node
For Global Mailbox management node to be available, the local data center must have a majority of Cassandra nodes operational.
How ZooKeeper impacts availability of Global Mailbox management node
For Global Mailbox management node to be available, a minimum of 2 ZooKeeper servers must be operational across all data centers.
Payload replication is the creation, in all other data centers, of a copy of the message payload that is uploaded to one data center.
The availability of some Global Mailbox components affects the availability of payload replication. The following sections explain how:
How Cassandra nodes impact availability of payload replication
For inline payloads, replication is done by Cassandra. Payload does not appear in another data center unless the Cassandra nodes are up in that data center. If the Cassandra nodes were down when the file was uploaded, you must run a repair to replicate the payload to the nodes that were down.
For non-inline payloads, after payload is received by a data center, the Global Mailbox management nodes in other data centers detect the new payload and pull it into their resident data centers. The Global Mailbox management nodes query the Cassandra database looking for new messages that have non-inline payload. 50% + 1 of the nodes in the local data center must be up for the Global Mailbox management node to replicate payload to the local data center.
How ZooKeeper nodes impact availability of payload replication
During normal functioning of the Global Mailbox, a minimum of 2 ZooKeeper servers must be operational across the data centers for payload replication to complete. The ZooKeeper watchdog process must also be running on all ZooKeeper nodes so that ZooKeeper failures can be detected, and recovery with a reduced ensemble can be initiated. While the watchdog is setting up the reduced ensemble, there is a window of time when replication fails. Payload replication proceeds after the reduced ensemble is started. If the number of running ZooKeeper servers reduces to 1 or 0, payload replication fails until the number of ZooKeeper servers comes back to 2 or more.
How shared disk impacts availability of payload replication
The shared disk is not used to store inline payload, therefore the replication of these payloads is not dependent on the shared disk.
Non-inline payloads are stored in shared disk. The shared disk must be available for the Global Mailbox management node to pull the payload from another data center to the local data center.
How replication server impacts payload replication
For inline payloads, there is no dependency on replication server, because inline payloads are replicated by Cassandra.
For non-inline payloads, the Global Mailbox management node pulls payloads from another data center to the local data center. To do so, at least one replication server must be operational in the local and the other data center.
How WebSphere MQ impacts payload replication
Payload replication is not dependent on WebSphere® MQ.
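The payload replication dependencies described in the sections above can be collected into one check that reports what is missing. This is a sketch under stated assumptions: the class, field, and function names are hypothetical, and for inline payloads the Cassandra condition is simplified to "at least one node up" in the receiving data center.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicationState:
    cassandra_up: int              # Cassandra nodes up in the receiving data center
    cassandra_total: int           # Cassandra nodes configured there
    zookeeper_up: int              # ZooKeeper servers up across all data centers
    shared_disk_up: bool           # shared disk in the receiving data center
    local_replication_server_up: bool
    remote_replication_server_up: bool


def missing_for_replication(inline: bool, state: ReplicationState) -> List[str]:
    """List the components that block payload replication, per the text."""
    missing = []
    if state.zookeeper_up < 2:
        missing.append("zookeeper")
    if inline:
        # Inline payloads are replicated by Cassandra itself; shared disk
        # and replication servers are not involved. Simplified assumption.
        if state.cassandra_up < 1:
            missing.append("cassandra")
        return missing
    if state.cassandra_up < state.cassandra_total // 2 + 1:
        missing.append("cassandra")
    if not state.shared_disk_up:
        missing.append("shared disk")
    if not (state.local_replication_server_up and state.remote_replication_server_up):
        missing.append("replication server")
    return missing


if __name__ == "__main__":
    state = ReplicationState(2, 3, 3, False, True, True)
    print(missing_for_replication(inline=False, state=state))
    print(missing_for_replication(inline=True, state=state))
```

WebSphere MQ does not appear in the check because payload replication does not depend on it.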
Global Mailbox offers command line utilities to modify default configuration with Sterling B2B Integrator, configure Cassandra database, configure data centers, scheduler, and storage passphrase.
appConfigUtility, schedulerConfigUtility, storagePassphrase, dbConfigUtility, dcConfigUtility, masterPassphrase
Availability or unavailability of some Global Mailbox components impact the availability of command line utilities. The following sections explain the same in detail:
How Cassandra impacts availability of command line utilities
For command line utilities to function, 50% + 1 of the Cassandra nodes must be available in the data center where you run the command.
For masterPassphrase, 50% + 1 of the Cassandra nodes must be available in each data center in the Global Mailbox topology.
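The difference between masterPassphrase and the other utilities can be sketched as follows. The function name and the dictionary layout are hypothetical, and the check covers only the Cassandra condition, not ZooKeeper or the shared disk.

```python
from typing import Dict


def utility_has_cassandra_quorum(utility: str,
                                 nodes_up: Dict[str, int],
                                 nodes_total: Dict[str, int],
                                 local_dc: str) -> bool:
    """Check the Cassandra condition for a command line utility, per the text."""

    def has_quorum(dc: str) -> bool:
        # 50% + 1 of the nodes in the data center, rounded down.
        return nodes_up.get(dc, 0) >= nodes_total[dc] // 2 + 1

    if utility == "masterPassphrase":
        # masterPassphrase needs a quorum in every data center in the topology.
        return all(has_quorum(dc) for dc in nodes_total)
    # Other utilities need a quorum only where the command is run.
    return has_quorum(local_dc)


if __name__ == "__main__":
    total = {"dc1": 3, "dc2": 3}
    up = {"dc1": 2, "dc2": 1}
    print(utility_has_cassandra_quorum("dbConfigUtility", up, total, "dc1"))
    print(utility_has_cassandra_quorum("masterPassphrase", up, total, "dc1"))
```

Here dbConfigUtility passes the check in dc1, while masterPassphrase does not, because dc2 has only 1 of 3 nodes up.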
How ZooKeeper nodes impact availability of command line utilities
During normal functioning of the Global Mailbox, to use the command line utilities, a minimum of 2 ZooKeeper servers must be operational across the data centers. The ZooKeeper watchdog process must also be running on all ZooKeeper nodes so that ZooKeeper failures can be detected, and recovery with a reduced ensemble can be initiated. While the watchdog is setting up the reduced ensemble, there is a window of time when the utilities cannot be used. However, the utilities can be used after the reduced ensemble is started. If the number of running ZooKeeper servers reduces to 1 or 0, the command line utilities cannot be used until the number of ZooKeeper servers comes back to 2 or more.
How shared disk impacts availability of command line utilities
The command line utilities use the configuration files that are stored in the shared disk. Therefore, the shared disk must be available in the data center where you run the command.
How replication server impacts the availability of command line utilities
Command line utilities are not dependent on the replication server.
How WebSphere MQ impacts command line utilities
Command line utilities are not dependent on WebSphere® MQ.
In general terms, split-brain arises when two or more data centers cannot connect and communicate with each other. Split-brain is also called a network partition.
In most cases, split-brain can lead to lack of availability and data inconsistency issues. In Global Mailbox, split-brain happens when Cassandra nodes in one data center cannot communicate with the Cassandra nodes in another data center. The communication is lost primarily because of network issues.
Cassandra servers in all data centers work together to provide a consistent view of data. They do this by communicating over the network to replicate data as quickly as possible, ensuring the data is kept as up to date as possible on all servers. If the servers in a data center cannot communicate with servers in the other data centers, this might lead to operations and updates done on out-of-date data. Conflicting operations might be performed on each side of the partition. For example, a user might remove a user mailbox permission in data center 1, but someone else might add permissions for that user in data center 2. Since there is a network partition, the data is out of sync.
After the partition is resolved, Cassandra attempts to resolve any conflicts on the objects that are created, updated, and deleted during the partition. It chooses the change that has the latest time stamp.
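The latest-time-stamp rule can be illustrated with the permission example above. This is a toy model of the idea, not Cassandra's implementation; the Change class, the resolve function, and the time stamp values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Change:
    data_center: str
    operation: str
    timestamp: float  # time at which the change was written


def resolve(changes: Iterable[Change]) -> Change:
    """Pick the conflicting change that has the latest time stamp."""
    return max(changes, key=lambda change: change.timestamp)


if __name__ == "__main__":
    conflicting = [
        Change("dc1", "remove mailbox permission", timestamp=100.0),
        Change("dc2", "add mailbox permission", timestamp=105.0),
    ]
    winner = resolve(conflicting)
    print(f"{winner.operation} (from {winner.data_center}) is kept")
```

In this example the change made in data center 2 is kept because it was written later, so the earlier removal made in data center 1 is lost.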
What you cannot do during split-brain
During a split-brain, you can only disable or enable an event rule.
Global Mailbox behavior during split-brain
During a split-brain, a message that duplicates a message in another data center can be uploaded even if allowDuplicates=false. This is because the system cannot reach the other data center to verify the message. However, upload of duplicate messages in the local data center is not allowed if allowDuplicates=false.
A quorum of data centers is used to ensure that synchronous replication is achieved even when replication is not completed across all data centers. A quorum is a majority of the number of data centers in a setup. The quorum is automatically set when you install or upgrade your setup.
The quorum is calculated by halving the number of data centers, rounding down, and adding 1.
quorum = (number of data centers)/2, rounded down, + 1
Example
If your setup has 5 data centers, your quorum would be 3.
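The quorum calculation can be sketched in a few lines. The function name is hypothetical; the quorum itself is set automatically by the product, so this only illustrates the arithmetic.

```python
def data_center_quorum(data_center_count: int) -> int:
    """Return the majority of data centers: half, rounded down, plus 1."""
    if data_center_count < 1:
        raise ValueError("a setup needs at least one data center")
    return data_center_count // 2 + 1


if __name__ == "__main__":
    for count in (2, 3, 5):
        print(f"{count} data centers -> quorum of {data_center_quorum(count)}")
```

For 5 data centers, 5 / 2 is 2.5, which rounds down to 2, and adding 1 gives the quorum of 3 from the example.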