Frequently Asked Questions
-
Does JChannel have a limit on the size
of messages sent over it ?
-
Is it time consuming to port all of our
code from JChannel to EnsChannel?
-
I receive an error message when starting
EnsChannel 'Some IO garbage at ensemble outboard startup'. What does it
mean ?
-
When 2 EnsChannels start up, they do not
seem to 'find' each other (they don't form a process group).
-
How can I tunnel a firewall ?
-
What happens if I pull the plug on 2 members at
the same time ?
Q: Does JChannel have a limit on the size of messages sent over it ?
-
We are using JChannel. When reading images (*.gif), JChannel seems to have
a size limit. We have no problem showing tiny images, but we cannot show
large images. The error message says that UDP has a limited size.
-
I have a question concerning Java Groups. I am using JChannel to send
packets, using channel.Receive(0), etc. Have you ever experienced massive
loss of packets? From what I understand, Java Groups runs on
top of UDP, which is really unreliable. That means we will experience packet
loss. Thanks.
A:
Currently (Nov 98) JChannel uses UDP to send/receive messages. UDP limits
the size of a single datagram, which is the cause of the problem encountered.
I intend to write a FRAG layer which fragments larger messages into smaller
ones and reassembles them at the receiver side. That layer can then simply be
used on top of the UDP layer. There are 2 ways to overcome this problem:
-
Use EnsChannel or IbusChannel instead of JChannel when using packets larger
than 8KB in size. Both have fragmentation layers.
-
Write a FRAG layer yourself. This should not be very difficult. I'm currently
writing a paper that describes how to write protocol layers and use them
with JavaGroups. We welcome useful layers and will integrate and distribute
them together with JavaGroups.
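The core of what such a FRAG layer has to do can be sketched as follows. This is only an illustration of the split/reassemble idea, not JavaGroups code: the class `FragSketch`, its method names and the use of 8KB as the fragment size are assumptions. A real layer would also have to tag each fragment with a message id and a sequence number so that the receiver can match and order fragments.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of JavaGroups.
final class FragSketch {
    // Assumed fragment size, taken from the 8KB figure mentioned above.
    static final int MAX_FRAG_SIZE = 8 * 1024;

    // Splits a message into chunks of at most fragSize bytes, in order.
    static List<byte[]> fragment(byte[] msg, int fragSize) {
        if (fragSize <= 0) {
            throw new IllegalArgumentException("fragSize must be > 0");
        }
        List<byte[]> frags = new ArrayList<>();
        for (int off = 0; off < msg.length; off += fragSize) {
            int end = Math.min(msg.length, off + fragSize);
            frags.add(Arrays.copyOfRange(msg, off, end));
        }
        return frags;
    }

    // Concatenates fragments; assumes all of them arrived, in order.
    static byte[] reassemble(List<byte[]> frags) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] f : frags) {
            out.write(f, 0, f.length);
        }
        return out.toByteArray();
    }
}
```

The sender would call `fragment` before handing data to the UDP layer, and the receiver would call `reassemble` once all fragments of a message are present.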
Q: Is it time consuming to port all of our code from JChannel to EnsChannel?
A:
Absolutely not. Applications should be written against the Channel
abstract class. An actual implementation might for example be JChannel,
EnsChannel or IbusChannel. You can parameterize an application to choose
the desired Channel subclass at startup. An application may also use
instances of several channel types at the same time.
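As a rough illustration of choosing the implementation at startup, the sketch below selects a subclass by name. The `Demo*` classes are stand-ins invented for this example so that it compiles on its own; they are not the JavaGroups Channel, JChannel, EnsChannel or IbusChannel classes, and the selector strings are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Stand-ins for illustration only; these are NOT the JavaGroups classes.
abstract class DemoChannel {
    abstract String transport();
}

final class DemoJChannel extends DemoChannel {
    String transport() { return "JChannel"; }
}

final class DemoEnsChannel extends DemoChannel {
    String transport() { return "EnsChannel"; }
}

final class DemoIbusChannel extends DemoChannel {
    String transport() { return "IbusChannel"; }
}

final class ChannelChooser {
    // Maps a (hypothetical) startup parameter to a channel factory.
    private static final Map<String, Supplier<DemoChannel>> FACTORIES = new LinkedHashMap<>();
    static {
        FACTORIES.put("j", DemoJChannel::new);
        FACTORIES.put("ens", DemoEnsChannel::new);
        FACTORIES.put("ibus", DemoIbusChannel::new);
    }

    // The rest of the application only sees the abstract type.
    static DemoChannel create(String kind) {
        Supplier<DemoChannel> factory = FACTORIES.get(kind.toLowerCase(Locale.ROOT));
        if (factory == null) {
            throw new IllegalArgumentException("unknown channel type: " + kind);
        }
        return factory.get();
    }
}
```

Because application code depends only on the abstract type, switching implementations means changing the startup parameter, not the application.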
Q: I receive an error message when starting EnsChannel 'Some IO garbage
at ensemble outboard startup'. What does it mean ?
The exact error message is:
Waiting for the outboard process to start
java.net.ConnectException: Connection refused
Some IO garbage at ensemble outboard startup.
A:
When EnsChannel starts up, it spawns the outboard executable
and then tries to connect to it via a socket. To ensure that outboard has
enough time to start and initialize, EnsChannel waits 2.5 seconds before
it tries to connect. There are 2 main problems that cause the above error
message:
-
The outboard executable cannot be found. Make sure it is in the
PATH.
-
2.5 seconds may be too short for outboard to start up. Therefore EnsChannel
cannot connect correctly to outboard via socket. The timeout can be increased
by changing file JavaGroups/Ensemble/Hot_Ensemble.java (look for
sleep(2500)).
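An alternative to lengthening a fixed sleep is to retry the connection a few times until outboard accepts it. The sketch below shows that idea in isolation. It is not what Hot_Ensemble.java does; the class `StartupRetrySketch`, its methods, and the attempt count and delay are assumptions chosen for illustration.

```java
import java.net.Socket;
import java.util.concurrent.Callable;

// Hypothetical helper; a swapped-in alternative to a fixed sleep(2500).
final class StartupRetrySketch {
    // Runs 'attempt' up to maxAttempts times, sleeping delayMillis after
    // each failure. Rethrows the last failure if no attempt succeeds.
    static <T> T retry(Callable<T> attempt, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e; // e.g. java.net.ConnectException: Connection refused
                if (i < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    // Example use: keep trying to reach a process that is still starting.
    // Host and port are supplied by the caller; 10 tries, 500 ms apart.
    static Socket connect(String host, int port) throws Exception {
        return retry(() -> new Socket(host, port), 10, 500);
    }
}
```

With this approach a slow machine simply needs more attempts, and a fast one connects as soon as outboard is ready.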
Q: When 2 EnsChannels start up, they do not seem to 'find' each other (they
don't form a process group).
A:
There is probably no gossip daemon running. Refer to the Ensemble documentation
on how to start it. Also, check that the ENS_* environment variables
have been set correctly.
Q: How can I tunnel a firewall ?
A:
Okay, there are 2 things: a gossip daemon and a router.
-
The gossip daemon is used to register channels, and keep track of channels
and groups. Channels periodically register with the gossip daemon. When
a registration from a channel hasn't been received for a certain period
of time (10 secs), the channel is dropped. New channels query the gossip
daemon for initial membership. The gossip daemon is used when IP multicast
is disabled. Otherwise, a new channel would ping a well-known IP multicast
address to find the initial membership.
-
The router is used to tunnel traffic through a firewall using TCP. Your
stack has to contain a TUNNEL layer at the bottom, instead of a UDP layer.
TUNNEL establishes a TCP connection with JRouter, and sends outgoing packets
over that connection, and receives incoming packets.
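The register/expire/query behaviour described for the gossip daemon can be modelled in a few lines. This is an in-memory sketch for illustration, not the GossipServer implementation; the class `GossipRegistrySketch`, the string addresses and the explicit `nowMillis` parameter (used to keep the example deterministic) are assumptions. Only the 10 second expiry comes from the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical model of the gossip daemon's bookkeeping.
final class GossipRegistrySketch {
    static final long EXPIRY_MILLIS = 10_000; // "10 secs" in the text above

    // group name -> (member address -> time of last registration, in ms)
    private final Map<String, Map<String, Long>> groups = new HashMap<>();

    // Called periodically by every channel to (re)register itself.
    synchronized void register(String group, String member, long nowMillis) {
        groups.computeIfAbsent(group, g -> new HashMap<>()).put(member, nowMillis);
    }

    // Called by a new channel to find the initial membership.
    // Members that have not registered within EXPIRY_MILLIS are dropped.
    synchronized List<String> getMembers(String group, long nowMillis) {
        List<String> result = new ArrayList<>();
        Map<String, Long> members = groups.get(group);
        if (members == null) {
            return result;
        }
        Iterator<Map.Entry<String, Long>> it = members.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (nowMillis - e.getValue() > EXPIRY_MILLIS) {
                it.remove();
            } else {
                result.add(e.getKey());
            }
        }
        Collections.sort(result); // stable output order for callers
        return result;
    }
}
```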
In your case, I would use both the gossip daemon and the router. You would
start the components in the following order:
1. Start gossip daemon: JavaGroups/JavaStack/GossipServer (starts on port 12001 by default)
2. Start JRouter: JavaGroups/JavaStack/JRouter (starts on port 12002 by default)
3. Create your channel: new JChannel
The channel properties in this case have to be defined as follows:
"TUNNEL(router_host=janet.cs.cornell.edu;router_port=12002):"
+ "PING(gossip_host=janet.cs.cornell.edu;gossip_port=12001):FD:GMS";
'janet.cs.cornell.edu' would have to be replaced by the hostname on which you
run gossip and JRouter.
When starting a new channel, you would see messages in both the gossip
server's window, and the JRouter. These messages would tell you what happens.
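To avoid hard-coding the hostname, the property string can be assembled from parameters. The sketch below only builds the string shown above; the class `TunnelPropsSketch` and its method are hypothetical, and how the result is handed to the channel is not shown because it depends on the JavaGroups version in use.

```java
// Hypothetical helper that builds the channel property string shown above.
final class TunnelPropsSketch {
    static final int DEFAULT_GOSSIP_PORT = 12001; // GossipServer default
    static final int DEFAULT_ROUTER_PORT = 12002; // JRouter default

    // Assumes the gossip daemon and JRouter run on the same host.
    static String tunnelProps(String host, int routerPort, int gossipPort) {
        return "TUNNEL(router_host=" + host + ";router_port=" + routerPort + "):"
             + "PING(gossip_host=" + host + ";gossip_port=" + gossipPort + "):FD:GMS";
    }

    // Convenience overload using the default ports.
    static String tunnelProps(String host) {
        return tunnelProps(host, DEFAULT_ROUTER_PORT, DEFAULT_GOSSIP_PORT);
    }
}
```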
Q: What happens if I pull the plug on 2 members at the same time ?
A:
This should not be a problem as long as there are group members around.
If you pull the plug on 2 participants P1 and P2 (neither of them is the
coordinator), then the coordinator will mcast 2 new views: V1 excludes P1
and V2 excludes P2 (or P2 and then P1, depending on which member is
suspected first). In any case, there will not be a view which excludes P1
and P2 at the same time. A view that excludes several members at once can
only happen when you do the following: create a group with 7 members
(P0 - P6). Kill P4 and P5 and immediately afterwards make P6 leave
regularly (e.g. press 'leave' on Draw). The leave protocol tries to flush
all pending mcasts and therefore sends a FLUSH to all members including P6.
However, while doing so, it detects that both P4 and P5 have failed.
Therefore it excludes them dynamically from the FLUSH destinations, so the
FLUSH is only sent to P0-P3 and P6. This results in a view that excludes 3
members at the same time (P4, P5, P6).
If you pull the plug on both the coordinator and a participant, the
following happens (depending on who is suspected first): if it is the
coordinator, another member will take over. Then the participant is
suspected. The new coordinator will then exclude the participant and mcast
a new view. If the participant is suspected first, since there is no
coordinator to handle this, the SUSPECT events regarding the participant
will go unheard. However, the SUSPECT events regarding the coordinator will
be handled and a new coordinator will be elected. Only then will the
SUSPECT events regarding the participant be handled (by the new
coordinator) and the participant will be excluded.
It gets a bit trickier if the failed participant is the one who would take
over the coordinator role. Essentially, this just takes a bit longer, but
still works correctly.
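The ordering described above can be checked with a toy model. This is not the JavaGroups membership protocol, only a simulation of the rules as stated here. It assumes that the coordinator is the first member of the current view and that the next member in view order takes over (the text only says "another member"); the class `ViewSketch` and its method are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Toy model of SUSPECT handling; not the JavaGroups implementation.
final class ViewSketch {
    // view:         initial membership, coordinator first (assumption)
    // crashed:      members whose plug was pulled
    // suspectOrder: order in which SUSPECT events are raised
    // Returns every new view that gets installed, in order.
    static List<List<String>> simulate(List<String> view, Set<String> crashed,
                                       List<String> suspectOrder) {
        List<String> current = new ArrayList<>(view);
        List<List<String>> views = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>(suspectOrder);
        List<String> unheard = new ArrayList<>();
        while (!pending.isEmpty()) {
            String suspect = pending.poll();
            if (!current.contains(suspect)) {
                continue; // already excluded
            }
            String coord = current.get(0);
            if (crashed.contains(coord) && !suspect.equals(coord)) {
                unheard.add(suspect); // no live coordinator to handle it yet
                continue;
            }
            // One suspected member is excluded per view.
            current.remove(suspect);
            views.add(new ArrayList<>(current));
            // Events that went unheard are handled next, in original order.
            for (int i = unheard.size() - 1; i >= 0; i--) {
                pending.addFirst(unheard.get(i));
            }
            unheard.clear();
        }
        return views;
    }
}
```

For a view P0-P3 with P1 and P2 crashed, the model yields two views, each excluding one member. With P0 (coordinator) and P2 crashed and P2 suspected first, the first view installed excludes P0 and only the second excludes P2.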