networks

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



From: Chris Watt <cnww@hfx.andara.com>
Date: Monday, September 13, 1999 7:43 AM
Subject: RE: OT?: Future of Linux

At 08:14 PM 9/13/99 +1200, you wrote:

> I agree with most of what you've said, but this piqued my curiosity:
>
> How do you get your Linux box to do 6MBps ftp transfers? On a switched
> 100Mbps non-congested LAN, the best I could get was 4.2MBps with ftp.

<shrug> I hate to say it, but I simply plugged it in and it worked. If it
helps at all here is a quick list of the stuff involved:
Server:
Pentium 133 on Gigabyte GA586-HX mainboard, 64mb of ram, 6.4gb Western
Digital Caviar hard disk (IDE; it has a couple of other IDE drives as
well, but this is the one usually in use as it contains /var and /home),
LinkSys 10/100 PNIC-based NIC. System is running Red Hat 6.0 with kernel
2.2.5-22 compiled from the kernel-source package.

Physical network:
Cat 5 cable connected to a LinkSys EtherFast dual-speed-per-port 10/100 hub.

Client:
Pentium 2 300 on an Asus P2B-S mainboard, 256mb of ram (windows needs it),
Fujitsu 8.4gb IDE hard disk, LinkSys NIC (identical to one in server).
Running Windows 95 r2.

Can send you screenshots if you like ;)

>> Internet, and supports Java bytecode just like native binaries (Windows
>> cannot do this, and thus has inferior Java support imho).

>Well, Windows has a Java Virtual Machine that appears to be
>much faster than anything you get under Linux...

As a virtual machine, yes. If you simply use a recent version
of egcs/gcc to compile the Java code to a native binary, then
you get execution much faster than any JVM. . . It depends
on how much effort you want to put into it (i.e. running
stuff slowly through the kernel module is easier than
running a Java program in Windows, and running it after doing
a little compilation work is faster than running a Java
program in Windows. Thus you have a choice. But that's nearly
always the difference between M$ and Linux, no?).

===

Subject: Re: OT?: Future of Linux
From: "Henry Ngai" <hngai@neec-usa.com>
Date: Mon, 13 Sep 1999 10:37:33 -0700

As a networking person, I would like to shed some light on the subject.
First, a switched network is not faster than a hub-based network!!!

Let's say there is a small network with only two hosts doing FTP over a hub
vs. the same two hosts doing FTP over a switch, both running at 100Mbps.
The hub-based FTP will be much faster, and this is why.

Most 10/100 switches are store-and-forward. Thus for FTPs, the transfer
has to complete on the first leg of the network from the host to the
switch. The switch stores the data, does a lookup on the MAC address, and
determines that it has to be forwarded. It then sends the packet out to
the receiving host. The poor packet takes twice as much time to reach the
destination compared to the hub-based solution.
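
To put rough numbers on that extra hop (a back-of-the-envelope sketch;
the 1518-byte maximum frame and the few-bit repeater delay are
illustrative assumptions, not measurements):

/* Store-and-forward must receive the whole frame before it can start
 * re-transmitting, so each frame pays one extra full serialization
 * delay compared to a repeater, which re-sends after a few bit times. */
#include <stdio.h>

int main(void)
{
    const double bits_per_sec = 100e6;      /* 100Mbps Ethernet */
    const int max_frame_bits  = 1518 * 8;   /* maximum Ethernet frame */

    double hub_delay_us = 3 / bits_per_sec * 1e6;              /* ~0.03 */
    double saf_delay_us = max_frame_bits / bits_per_sec * 1e6; /* ~121  */

    printf("repeater (hub) delay:          %.2f us\n", hub_delay_us);
    printf("store-and-forward extra delay: %.1f us\n", saf_delay_us);
    return 0;
}

For a full-size frame the store-and-forward switch adds roughly one whole
frame time, which is where the "twice as long" comes from.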

So, only use a switch with a fairly loaded network. Otherwise, use a
reliable hub such as the one mentioned. A bad hub will cause CRC errors and
result in lost packets, which slows down the transfer while it waits for
software timeouts to recover from the lost packets.

===

Subject: Re: OT?: Future of Linux
From: Graham Hemmings <gh-work@netcomuk.co.uk>
Date: Mon, 13 Sep 1999 22:53:13 +0000

At 10:37 13/09/99 -0700, you wrote:

>As a networking person, I would like to shed some light on the subject.
>First, a switched network is not faster than a hub-based network!!!

This depends on the switch. Most switches (except the cheapies) are not 
just store & forward; however, more to the point is that you can run 
full-duplex on a switch, which you can't on a hub - this eliminates all 
collisions.  On a shared half-duplex connection such as a hub, throughput 
is effectively limited to around 40% (at best) of actual bandwidth by 
collisions - this equates to just under 500KB/Sec for a 10Mbps HD 
connection, and this is shared between transmit and receive.  A full-duplex 
connection, on the other hand, can run at 90%+ of bandwidth, giving around 
1,100KB/Sec for a 10Mbps connection - this figure is not shared between 
tx/rx but is available to both at the same time as it is full-duplex.  So 
you can see that a switched full-duplex connection has the potential to run 
over 4 times faster than a non-switched half-duplex one, assuming the 
device is transmitting and receiving as hard as it can.
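
For the record, the arithmetic behind those figures (a quick sketch; the
40% and 90% utilization numbers are the assumptions above, not
measurements from any particular network):

/* 10Mbps in KB/sec, then the half- and full-duplex ceilings. */
#include <stdio.h>

int main(void)
{
    const double kb_per_sec = 10e6 / 8.0 / 1024.0;   /* ~1221 KB/sec */

    printf("half duplex at 40%%: ~%.0f KB/sec, shared tx+rx\n",
           kb_per_sec * 0.40);                       /* ~488  */
    printf("full duplex at 90%%: ~%.0f KB/sec, each way\n",
           kb_per_sec * 0.90);                       /* ~1099 */
    return 0;
}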

===

Subject: RE: OT?: Future of Linux
From: "Juha Saarinen" <juha_saarinen@email.msn.com>
Date: Tue, 14 Sep 1999 11:43:59 +1200

> As a networking person, I would like to shed some light on the subject.
> First, a switched network is not faster than a hub-based network!!!

I'm a bit baffled by this statement, because as far as I know, switches are
very low-latency devices that avoid the collisions seen on hubs. Collisions
incur millisecond delays, whereas switches have nanosecond forwarding
delays.

> Most 10/100 switches are store-and-forward. Thus for FTPs, the transfer
> has to complete on the first leg of the network from the host to the
> switch. The switch stores the data, does a lookup on the MAC address, and
> determines that it has to be forwarded. It then sends the packet out to
> the receiving host. The poor packet takes twice as much time to reach the
> destination compared to the hub-based solution.

Hmmm... not sure about this. Dual-mode switches (cut-through at low loads,
store-and-forward at higher loads) are quite common, aren't they? The MAC
address is usually cached, so the switch doesn't have to do a lookup for
each packet anyway.

I'd be interested to hear some more on this topic.

===

Subject: Re: OT?: Future of Linux
From: "Henry Ngai" <hngai@neec-usa.com>
Date: Mon, 13 Sep 1999 17:22:54 -0700

Almost all modern 10Mbps-only switches are cut-through. But 10/100? I
believe a lot of them are cut-through for 10 to 10; it is impossible to do
cut-through from 10 to 100; quite a few new switches will not do 100-to-100
cut-through; and some will do 100-to-10 cut-through.

Even if they are cut-through 100 to 100, a packet still has to be partially
received and analyzed before it can be forwarded.

Networking in general is a ping-pong protocol. If you trace a network
session, you can see what I am saying. Therefore, in a small network, where
there is no chance of constant activity on the network, there is almost no
chance of collision, even at half duplex.

As for the 40% utilization, where does that figure come from? Many studies
concluded that with a 1024-node network running half duplex, the maximum
utilization achievable is about 30%. Since nobody I know of runs 1024 nodes
in a single collision domain, the utilization of the wire is way above 30%.

As for the real issue, the perception of how fast it feels to the user, and
how much throughput can be achieved by a single computer? There the maximum
utilization of the cable is of no relevance. If you can run the cable at
100%, but 100 hosts are using it, you get 1% of the bandwidth each.

Since small LANs are inherently lightly loaded, there is next to no chance
of collision. In the case I cited, with only two hosts and one
application? No collision with half duplex! That is, unless you are doing
mget and mput at the same time to test both up and down links of the
Ethernet wire.

Want to get a file across in the fastest possible way, or the best network
response on a lightly loaded network? Get a hub. A switch does not help
here. Too many hosts on a LAN and it is slowing you down? Get a switch with
the lowest propagation delay, and make sure there is enough buffering so no
packets will be dropped, to avoid software timeouts.

===

Subject: Re: OT?: Future of Linux
From: Chris Watt <cnww@hfx.andara.com>
Date: Mon, 13 Sep 1999 21:42:34 -0300

At 10:53 PM 9/13/99 +0000, Graham Hemmings wrote:

>collisions.  On a shared half-duplex connection such as a hub, throughput 
>is effectively limited to around 40% (at best) of actual bandwidth by 
>collisions - this equates to just under 500KB/Sec for a 10Mbps HD 

But I can get well over 500KB/Sec on a 10Mbps connection over my hub. . .
Usually between 700 and 900KB/sec, and (as I mentioned) I get 6MB/sec on a
100Mbps connection. That is also over 40% of its theoretical maximum of
about 12.5MB per second (in fact I think the limiting factor here may be
the speed at which the client is able to acknowledge and store data to its
hdd. This is based on the fact that the first 10 megabytes or so of a file
usually go much faster.) All the documentation I have read suggests that
using full duplex just lets transfers go at 100Mbps in both directions at
once. It appears that all my NICs support full duplex, and to quote
directly from the manual for the card in the machine I'm sending this from,
"Boasting an incredible maximum data throughput of 200 megabits per second
in full duplex mode (100Mbps in half duplex)", it would appear that I can
theoretically get 100Mbps in half duplex.
Anyhow, I'm going to go write a cheesy "how much data can we cram through
in UDP if we don't do anything with it" program and see what it comes up
with. Will post results if anyone is still interested.
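
Something along these lines, probably (an untested sketch; the packet
count and payload size are arbitrary, the receiver can be anything that
discards what arrives, and this only measures how fast the sender can
push):

/* Blast UDP packets at a host and time how long it takes. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define PACKETS 10000
#define PAYLOAD 1400            /* stay under the Ethernet MTU */

int main(int argc, char **argv)
{
    int s, i;
    struct sockaddr_in dst;
    struct timeval t0, t1;
    char buf[PAYLOAD];
    double secs, mb;

    if (argc != 3) {
        fprintf(stderr, "usage: %s dest-ip port\n", argv[0]);
        return 1;
    }

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(atoi(argv[2]));
    dst.sin_addr.s_addr = inet_addr(argv[1]);

    memset(buf, 'x', sizeof buf);

    gettimeofday(&t0, NULL);
    for (i = 0; i < PACKETS; i++)
        sendto(s, buf, sizeof buf, 0, (struct sockaddr *) &dst, sizeof dst);
    gettimeofday(&t1, NULL);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    mb = (double) PACKETS * PAYLOAD / (1024.0 * 1024.0);
    printf("pushed %.1f MB in %.2f sec (%.2f MB/sec)\n", mb, secs, mb / secs);
    close(s);
    return 0;
}

Of course UDP will happily drop anything the wire or the receiver can't
keep up with, so the real test is how much actually arrives at the far end.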

===

Subject: Re: OT?: Future of Linux
From: "Henry Ngai" <hngai@neec-usa.com>
Date: Mon, 13 Sep 1999 18:06:27 -0700

It is all relative, but there is room for gain in a particular setup.

To fully understand the subject, you start with the format of the packet.

The beginning of a packet is called the preamble, a total of 64 bits. Then
there is the destination MAC address, a total of 6 bytes. Then there is the
rest of the packet, consisting of source MAC address, Type/Length, data,
and CRC.
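
If it helps to see it laid out, here is that format as a C struct (a
sketch: the preamble is generated by the MAC hardware and never appears in
the buffer a driver hands you, so only the 14-byte header is shown):

#include <stdint.h>

/* Preceded on the wire by 64 bits of preamble (hardware framing). */
struct ether_frame {
    uint8_t  dst_mac[6];   /* destination MAC address, 6 bytes  */
    uint8_t  src_mac[6];   /* source MAC address, 6 bytes       */
    uint16_t type_len;     /* Type/Length, network byte order   */
    /* then 46..1500 bytes of data, then a 4-byte CRC trailer   */
};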

In a hub, or repeater in 802.3 terms, packets are re-transmitted as soon as
the preamble is detected. This is because a repeater does not check data
integrity, so the packet can go out right away. The delay from in to out
may be only a few bits, say 3.

In a switch, or bridge in 802.3 terms, packets are received and CRC-checked;
the switch then does a lookup, and then forwards the packet. See the delay
here?

Early switch implementations used cut-through. Basically the switch takes
in the preamble, then the destination MAC address. At that point there is
enough information, so a lookup is performed, and then the packet is
forwarded. Even if all packets have cache hits, each will be delayed for at
least 64 bits of preamble plus 48 bits of destination MAC address.

112 bits is not too bad a delay, except that it does not work. That is due
to collisions. In Ethernet, there is an official collision window within
which a collision can happen: the first 64 bytes of the packet, not
including the preamble. This is why Ethernet (10/100) has a minimum packet
size of 64 bytes. If a collision happens in such a system, fragments will
appear in the switch, and they cost additional delay by shutting off the
medium. (An IPG, or inter-packet gap, is specified as silence between
packets, so any fragment costs additional delay.)

To clean up fragments, all modern switches (or bridges) use modified
cut-through when they do cut-through. That is, a minimum of 64 bytes must
be received without collision before forwarding is allowed.

As such, in a modern switch, each packet, even in cut-through mode, will
have a delay of 64 bits of preamble plus 512 bits of data before it can be
forwarded.

The sliding window nature of FTP helps by not requiring an acknowledgement
for every packet sent. Therefore, during a file transfer, a few packets are
sent back to back before an ack is required. Since the delay created by the
forwarding process is not cumulative, there is only a single delay of 576
bits per block of packets. In 100Mbps Ethernet, 576 bits equal 5760 ns, or
5.76 us.

If you have 6 MB of data and your window size in TCP/IP is 32KB, you are
talking about 192 blocks with a delay of 192 x 5.76us. Since there is an
equal number of acks, each delayed 576 bits, the total wasted time comes
out to be about 2.2ms.
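
Spelling out that arithmetic (the figures are the assumptions from above,
not measurements):

#include <stdio.h>

int main(void)
{
    const double ns_per_bit = 10.0;           /* 100Mbps = 10ns per bit  */
    const int    delay_bits = 64 + 512;       /* preamble + 64-byte min  */
    const int    blocks     = 6 * 1024 / 32;  /* 6MB / 32KB window = 192 */

    double per_block_us = delay_bits * ns_per_bit / 1000.0;
    double total_ms = blocks * per_block_us * 2 / 1000.0;  /* data + acks */

    printf("delay per block: %.2f us\n", per_block_us);    /* 5.76 */
    printf("total wasted:    %.1f ms\n", total_ms);        /* ~2.2 */
    return 0;
}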

All this wasted time does not account for the time to do the lookup; even
cached lookups need additional time. And if your window size is 8KB
instead, your wasted time becomes 8.8ms.

The theoretical speed for 100Mbps Ethernet is about 12.5MB/sec. For a 6MB
file, which takes about 500ms, you have used up a minimum of 8.8ms.

In reality, the protocol uses more packets to handshake than what I have
described, so there is more time spent on the delay. I would say a
performance hit between 2 and 5% is normal. (5% being a guess; it may be
higher or lower depending on other things happening at your site.)

So far we haven't considered the lookup, be it cached or not. Let's say each
block causes a cumulative delay of a few hundred ns?

I use a hub in my own small network. It is less expensive and gives more
performance.

As for dual-mode switches, all cut-through switches are dual mode. If a
destination port is in use, any packet destined for that port is stored
until that port is free.

The parameters to increase performance, given a fixed network topology, be
it hub- or switch-based, are: window size (the larger the better for large
file transfers); CPU (the faster the better, as it reduces turnaround time
such as ack timing); and the NIC driver (does it support cut-through? Early
interrupt? 3COM has it, but it results in increased CPU utilization in a
Windows environment. I am not sure of their current status).

The parameters that increase one host's performance often degrade others',
as the link to the server is shared among many hosts. Early interrupt hogs
CPU cycles. A large sliding window uses more memory.

Of all of these, I would always go for the larger sliding window size.
Maybe it is called buffer size in Linux.
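
If it is, the knob would look something like this (a sketch; I am assuming
the setsockopt() socket buffer options are what Linux uses to size the
window, and the 64KB figure is just an example):

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    int bufsize = 64 * 1024;            /* ask for 64KB each way */
    socklen_t len = sizeof bufsize;

    /* SO_SNDBUF/SO_RCVBUF bound the window the TCP stack will use. */
    setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof bufsize);
    setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof bufsize);

    /* The kernel may grant (and report back) a different size. */
    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, &len);
    printf("receive buffer is now %d bytes\n", bufsize);
    return 0;
}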

I hope the answer is clear enough for you. But I am afraid that this may be
using up bandwidth for other people not interested in networking, so unless
someone else wants to know more, I would suggest moving this offline to a
private chat.

===

Subject: Re: pump broken?
From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Wed, 15 Sep 1999 01:19:09 +0100 (BST)


> So, in closing, no, I haven't gotten an answer and am entering this into
> bugzilla as I type.  I'd like to see this resolved as it's quite annoying
> to think that if the hurricane knocks power out for >8 hours, my IP will
> likely change :(

In theory it shouldn't matter. DHCP servers are supposed to use free
addresses for new MAC addresses and only reuse existing ones when the pool
is exhausted. That is, they are intended to minimise address reuse.

If you tell it to get you a 30 day lease and it doesn't, that however is
still a bug ..

===


the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu