The Lag And Netsplit FAQ

This FAQ is aimed at answering questions that are asked about lag and netsplits as part of the Undernet User Committee's Document Project. If you have any queries or comments, please email documents@undernet.org.


Contents

1.   LAG
2.   NETSPLITS
3.   DCC (Direct Client To Client)

1.   LAG


The first thing to look at is server structure. Consider a basic series of IRC servers:
A - B - C - D

This network is extremely basic. A large network such as the Undernet would look like the diagram below.

[Diagram: a branching tree of 23 servers, A through W, built around the backbone A - B - C - D - E - F, with the remaining servers attached in branches above and below it.]

For the purposes of this explanation, however, we will remain with the original four servers in the series to make things easier. By looking at the diagram, we can see that for data to get anywhere, there is only one route of travel. So if someone sends data from server A to server D, it can only go from A to B, to C, and then on to its destination at D. The point is that it has to travel and travel takes time. If there is a bad connection between B and C, it would slow the data down, and as a result, the whole journey would take longer.

Now, let's look at lag itself. Say you ping someone (/ctcp nickname PING) and it returns with the following:

[SweetNes PING reply]: 37 secs....

We would assume that SweetNes is lagging by 37 seconds; that is, it takes 37 seconds for data to reach SweetNes and return. Let's also assume that on a day where lag is particularly bad, the time difference between the servers may look like this:

A ------- B -------- C -------- D

   9 sec     6 sec      10 sec

Since lag is relative, someone on server A pinging someone on server D would get a ping reply of 50 seconds. This is the total of all the time differences (9 + 6 + 10 = 25 seconds) multiplied by 2, because the ping has to reach its destination at server D and then return. Likewise, someone on server D pinging someone on server B would get a reply of 32 seconds (10 + 6 = 16 seconds each way). The point is that lag is relative, and therefore no one really lags, except to each other.

Let's look at this in an example:

A ------- B -------- C -------- D

1         2          3          4

If we have one person on each of our servers, and you, on server A, ping them all, the ping replies would look like this:

[1 PING reply]: 0 seconds
[2 PING reply]: 18 seconds
[3 PING reply]: 30 seconds
[4 PING reply]: 50 seconds
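
The arithmetic behind these replies can be sketched in a few lines of Python. This is only an illustration of the example above, not how any IRC server or client works internally: the names SERVERS, LINK_DELAYS, one_way_delay, and ping_reply are made up for this sketch, and it assumes the delays are the same in both directions.

```python
# Illustrative model of the four-server chain A - B - C - D.
# The delays are the one-way values, in seconds, from the example above.
SERVERS = ["A", "B", "C", "D"]
LINK_DELAYS = [9, 6, 10]  # A-B, B-C, C-D


def one_way_delay(src, dst):
    """Add up the delay of every link that lies between two servers."""
    i, j = sorted((SERVERS.index(src), SERVERS.index(dst)))
    return sum(LINK_DELAYS[i:j])


def ping_reply(src, dst):
    """A ping travels there and back, so it takes twice the one-way delay."""
    return 2 * one_way_delay(src, dst)


if __name__ == "__main__":
    # You are on server A and ping users 1 to 4, one per server.
    for user, server in enumerate(SERVERS, start=1):
        print(f"[{user} PING reply]: {ping_reply('A', server)} seconds")
    # The same link delays give a different answer from another viewpoint.
    print("From D to B:", ping_reply("D", "B"), "seconds")
```

Calling ping_reply from a different starting server shows why lag is relative: the same three link delays produce a different set of replies for every viewpoint.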

Now, let's look at where the problems arise. We have two channel ops, one on server A (op 1) and one on server D (op 2).

A ----- B ------ C ------ D
  9 sec   6 sec    10 sec
op 1                   op 2

A user joins the channel from Server B. 9 seconds after the user joins, channel op 1 on server A sees the user and ops him. Then 7 seconds later, channel op 2 on server D sees the user join and ops him. So all the users on server A think that channel op 1 opped the user, and all the users on server D think that channel op 2 opped the user. The user himself would, of course, think that channel op 1 opped him because that channel op is closer time-wise.
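
The timing of this race can be worked through with a small Python sketch. It is a simplified model, not real server behaviour: the function names are hypothetical, the delays are the ones from the diagram, and it assumes that each op reacts the instant the join reaches their server.

```python
# One-way delays (seconds) on the chain A - B - C - D, as in the diagram.
SERVERS = ["A", "B", "C", "D"]
LINK_DELAYS = [9, 6, 10]  # A-B, B-C, C-D


def delay(src, dst):
    """One-way delay between two servers on the chain."""
    i, j = sorted((SERVERS.index(src), SERVERS.index(dst)))
    return sum(LINK_DELAYS[i:j])


def op_race(join_server, op_servers):
    """Times are seconds after the join; each key is the server an op is on.

    Assumption: every op gives ops the moment they see the join.
    """
    timeline = {}
    for op_server in op_servers:
        seen_at = delay(join_server, op_server)
        timeline[op_server] = {
            "sees_join": seen_at,
            # When this op's mode change arrives at each server.
            "op_arrives": {s: seen_at + delay(op_server, s) for s in SERVERS},
        }
    return timeline


def first_op_seen(timeline, server):
    """Which op's mode change reaches the given server first."""
    return min(timeline, key=lambda op: timeline[op]["op_arrives"][server])


if __name__ == "__main__":
    race = op_race("B", ["A", "D"])
    for server in SERVERS:
        print(f"Server {server} first sees the op from server {first_op_seen(race, server)}")
```

With these numbers, op 1 sees the join after 9 seconds and op 2 after 16 seconds, which is the 7 second gap described above. Server D sees op 2's mode change first, while servers A, B, and C see op 1's first.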
This means that when lag is particularly high, conversations can quickly become confused and intolerable. Users can avoid the problem by simply moving to a common server. Ideally, the lag from server to server is measured in milliseconds, in which case it isn't usually a problem. However, sometimes, due to problems with the network as a whole, it can rise to 20 or more seconds from server to server. There are several reasons for these delays, such as overloaded servers, too many users on a server, server abuse (e.g., flooding and wasting bandwidth), congested links (usually during peak times), or poor routing.
Sometimes, the problem can be with parts (hops) of the Internet itself. This can affect servers and can also cause a user to be lagged to all the servers. If you find that you are lagged to everyone on a channel, you can try reconnecting to your ISP (Internet Service Provider). Sometimes simply changing servers will change the path you take to Undernet in a way that bypasses an Internet bottleneck and therefore eliminates your lag.

2.   NETSPLITS


A netsplit occurs when two servers lose contact with each other. Usually the servers have a noticeable time difference between them, which grows to a point where they can't exchange data quickly enough, so the link between them is dropped and the network splits in two. But other than that, very little happens. Using the four simple servers as an example, let's assume that servers A and B split.

A -------- B ------- C ------- D before

A --- --- B ------- C ------- D after

From the point of view of the people on servers B, C, and D, the people on server A have left IRC, and vice versa for the people on server A. A typical quit message during a netsplit will look something like this:

[18:40] *** Quits: WHIZZARD *netsplit

This illustrates that the two servers have split apart. But remember: it's all relative. To the users on servers B, C, and D, it looks as if everyone on server A has quit, and to the users on server A, it looks as if everyone else has quit.
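
One way to picture this is to treat the network as a set of links and ask which servers can still be reached from yours once a link is gone. The Python sketch below does that for the example. It is an illustration only: the nicknames and function names are hypothetical, and real servers do not work out quits this way.

```python
from collections import deque


def reachable(links, start):
    """Return the set of servers that can still be reached from `start`."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first walk over the remaining links
        server = queue.popleft()
        for nxt in neighbours.get(server, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def apparent_quits(users, links_after_split, my_server):
    """Nicknames that appear to quit, as seen from `my_server`."""
    still_visible = reachable(links_after_split, my_server)
    return sorted(nick for nick, server in users.items()
                  if server not in still_visible)


if __name__ == "__main__":
    # Hypothetical users, one per server.
    users = {"alice": "A", "bob": "B", "carol": "C", "dave": "D"}
    # The A-B link has been lost; B-C and C-D are still up.
    after_split = [("B", "C"), ("C", "D")]
    print("Seen from A:", apparent_quits(users, after_split, "A"))
    print("Seen from C:", apparent_quits(users, after_split, "C"))
```

From server A, the users on B, C, and D all appear to quit. From server C, only the user on A does. Nobody has actually left IRC; each side has simply lost sight of the other.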
The IRC Operators (IRCops), whose function is to deal with routing and server administration, will usually attempt to reconnect the server or servers to the other side of the network; that is, if the cause of the split has been solved or if it is worth making the reconnection. It may be that the lag times are so poor that it is not worth reconnecting until the problem is completely solved.

Many of the "less intelligent" users look for netsplits when they occur. They move to the side with the fewest people and join a channel that is empty on their side of the netsplit but very busy on the other. This way they think they will automatically gain ops.
However, the current server software makes this pointless: they will be de-opped as soon as the servers rejoin.

When the servers do rejoin after a split, the servers they connect to will not necessarily be the ones they were connected to before the split. Looking at the example again, they might now look like this:

B ---- C ---- D ---- A

Now the tables are turned, because the people who were lagging to you may no longer be, and vice versa. That's the nature of lag: sometimes it's bad, but most of the time you don't have to worry about it.
On the Undernet, lag times are for the most part very good, except when there are global problems, such as routers crashing or segments of the Internet experiencing problems, which no one can do anything about.

3.   DCC (Direct Client To Client)


Since lag involves servers, to eliminate lag you must eliminate the server, and DCC does exactly that. When you start a DCC Chat session with someone, you send them a bit of data saying that you want to chat. The IRC clients select a port and deal with connecting, and all the other person has to do is acknowledge the DCC Chat request.
Then all the data during the conversation will be passed from client to client directly. Usually when you request a session, it takes time to set up, since the initial correspondence goes through the server. However, once a DCC Chat session has been initiated, you can immediately see the benefits.
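
As a rough illustration of that "bit of data", the sketch below builds and reads a DCC Chat offer in Python. It assumes the message layout that IRC clients conventionally use for DCC Chat (a CTCP message carrying an IPv4 address written as a single number, plus a port); this FAQ does not specify the format, so treat it as an assumption. The function names, the nickname, the address, and the port are made up for the example.

```python
import ipaddress

CTCP = "\x01"  # CTCP messages are wrapped in this control character


def dcc_chat_offer(target_nick, my_ip, my_port):
    """Build the line a client would send through the server to offer a chat.

    Assumed layout: DCC CHAT chat <IPv4 address as an integer> <port>.
    """
    ip_as_int = int(ipaddress.IPv4Address(my_ip))
    return f"PRIVMSG {target_nick} :{CTCP}DCC CHAT chat {ip_as_int} {my_port}{CTCP}"


def parse_dcc_chat_offer(ctcp_body):
    """Extract (ip, port) from the CTCP part of an offer."""
    parts = ctcp_body.strip(CTCP).split()
    if len(parts) != 5 or parts[:3] != ["DCC", "CHAT", "chat"]:
        raise ValueError("not a DCC CHAT offer")
    return str(ipaddress.IPv4Address(int(parts[3]))), int(parts[4])


if __name__ == "__main__":
    line = dcc_chat_offer("SweetNes", "192.0.2.10", 5000)
    print(repr(line))
    # The receiving client reads the address and port, then connects to
    # them directly, so the conversation itself never touches a server.
    body = line.split(" :", 1)[1]
    print(parse_dcc_chat_offer(body))
```

Only this one offer travels across the servers, which is why setting up the session can still be slow on a lagged network. Everything after it goes straight from one client to the other.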

It is not very wise, however, to start up DCCs with people you don't know. Most people will reject them, since on some clients they are hard to manage. When you send a file by DCC, it follows the same method as above, which reduces server workload and makes the whole operation more efficient.


Copyright © 2002 Undernet User Committee