Tuesday, January 26, 2010

Ethernet errors

6.2.7 This page will define common Ethernet errors.


Knowledge of typical errors is invaluable for understanding both the operation and troubleshooting of Ethernet networks.

The following are the sources of Ethernet error:

• Collision or runt – Simultaneous transmission occurring before slot time has elapsed
• Late collision – Simultaneous transmission occurring after slot time has elapsed
• Jabber, long frame and range errors – Excessively or illegally long transmission
• Short frame, collision fragment or runt – Illegally short transmission
• FCS error – Corrupted transmission
• Alignment error – Insufficient or excessive number of bits transmitted
• Range error – Actual and reported number of octets in frame do not match
• Ghost or jabber – Unusually long Preamble or Jam event

While local and remote collisions are considered to be a normal part of Ethernet operation, late collisions are considered to be an error. The presence of errors on a network always suggests that further investigation is warranted. The severity of the problem indicates the troubleshooting urgency related to the detected errors. A handful of errors detected over many minutes or over hours would be a low priority. Thousands detected over a few minutes suggest that urgent attention is warranted.

Jabber is defined in several places in the 802.3 standard as being a transmission of at least 20,000 to 50,000 bit times in duration. However, most diagnostic tools report jabber whenever a detected transmission exceeds the maximum legal frame size, which is considerably smaller than 20,000 to 50,000 bit times. Most references to jabber are more properly called long frames.

A long frame is one that is longer than the maximum legal size, and takes into consideration whether or not the frame was tagged. It does not consider whether or not the frame had a valid FCS checksum. This error usually means that jabber was detected on the network.

A short frame is a frame smaller than the minimum legal size of 64 octets, with a good frame check sequence. Some protocol analyzers and network monitors call these frames “runts”. In general, the presence of short frames is not a guarantee that the network is failing.

The term runt is generally an imprecise slang term that means something less than a legal frame size. It may refer to short frames with a valid FCS checksum although it usually refers to collision fragments.
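The size and FCS rules above can be sketched as a small classifier. This is only an illustration of the definitions on this page, not how any particular analyzer works. The function name, the labels it returns, and the 4-octet allowance for a tagged frame (assumed here to be an 802.1Q tag) are assumptions; the 64 and 1518 octet limits come from the text.

```python
MIN_FRAME = 64    # minimum legal frame size in octets (from the text)
MAX_FRAME = 1518  # maximum legal untagged frame size in octets (from the text)
TAG_OCTETS = 4    # assumption: a tagged frame may carry 4 extra octets (802.1Q)


def classify_frame(length: int, fcs_ok: bool, tagged: bool = False) -> str:
    """Label a received frame using the definitions on this page."""
    max_len = MAX_FRAME + (TAG_OCTETS if tagged else 0)
    if length > max_len:
        # A long frame is judged on size alone; the FCS is not considered.
        return "long frame"
    if length < MIN_FRAME:
        # Short frame: too small but good FCS. Otherwise a collision fragment.
        return "short frame" if fcs_ok else "collision fragment"
    return "ok" if fcs_ok else "FCS error"


print(classify_frame(1519, fcs_ok=True))               # long frame
print(classify_frame(1522, fcs_ok=True, tagged=True))  # ok
print(classify_frame(60, fcs_ok=True))                 # short frame
print(classify_frame(40, fcs_ok=False))                # collision fragment
```

Both of the last two results would often be reported as "runts", which is why the term is imprecise.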

The next page will continue the discussion of Ethernet frame errors.

Types of collisions

6.2.6 This page covers the different types of collisions and their characteristics.


Collisions typically take place when two or more Ethernet stations transmit simultaneously within a collision domain. A single collision is a collision that was detected while trying to transmit a frame, but on the next attempt the frame was transmitted successfully. Multiple collisions indicate that the same frame collided repeatedly before being successfully transmitted. The results of collisions, collision fragments, are partial or corrupted frames that are less than 64 octets and have an invalid FCS. Three types of collisions are:

• Local
• Remote
• Late

To create a local collision on coax cable (10BASE2 and 10BASE5), the signal travels down the cable until it encounters a signal from the other station. The waveforms then overlap, canceling some parts of the signal out and reinforcing or doubling other parts. The doubling of the signal pushes the voltage level of the signal beyond the allowed maximum. This over-voltage condition is then sensed by all of the stations on the local cable segment as a collision.

At the beginning of the sample, the waveform in the figure represents normal Manchester encoded data. A few cycles into the sample the amplitude of the wave doubles. That is the beginning of the collision, where the two waveforms are overlapping. Just prior to the end of the sample the amplitude returns to normal. This happens when the first station to detect the collision quits transmitting, and the jam signal from the second colliding station is still observed.

On UTP cable, such as 10BASE-T, 100BASE-TX and 1000BASE-T, a collision is detected on the local segment only when a station detects a signal on the RX pair at the same time it is sending on the TX pair. Since the two signals are on different pairs there is no characteristic change in the signal. Collisions are only recognized on UTP when the station is operating in half duplex. The only functional difference between half and full duplex operation in this regard is whether or not the transmit and receive pairs are permitted to be used simultaneously. If the station is not engaged in transmitting it cannot detect a local collision. Conversely, a cable fault such as excessive crosstalk can cause a station to perceive its own transmission as a local collision.

The characteristics of a remote collision are a frame that is less than the minimum length, has an invalid FCS checksum, but does not exhibit the local collision symptom of over-voltage or simultaneous RX/TX activity. This sort of collision usually results from collisions occurring on the far side of a repeated connection. A repeater will not forward an over-voltage state, and cannot cause a station to have both the TX and RX pairs active at the same time. The station would have to be transmitting to have both pairs active, and that would constitute a local collision. On UTP networks this is the most common sort of collision observed.

There is no possibility remaining for a normal or legal collision after the first 64 octets of data have been transmitted by the sending stations. Collisions occurring after the first 64 octets are called “late collisions”. The most significant difference between late collisions and collisions occurring before the first 64 octets is that the Ethernet NIC will retransmit a normally collided frame automatically, but will not automatically retransmit a frame that was collided late. As far as the NIC is concerned everything went out fine, and the upper layers of the protocol stack must determine that the frame was lost. Other than retransmission, a station detecting a late collision handles it in exactly the same way as a normal collision.
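The three collision types and the retransmission rule can be summarized in a short sketch. The function names and the `local_symptom` flag (standing for over-voltage on coax, or simultaneous RX/TX activity on UTP) are hypothetical, and treating a collision after exactly 64 transmitted octets as late is an assumption about the boundary.

```python
SLOT_OCTETS = 64  # collisions after the first 64 octets are late (from the text)


def collision_kind(octets_before_collision: int, local_symptom: bool) -> str:
    """Classify a collision as local, remote, or late."""
    if octets_before_collision >= SLOT_OCTETS:
        return "late"
    # Within the first 64 octets the local symptom decides local vs. remote.
    return "local" if local_symptom else "remote"


def nic_retransmits(octets_before_collision: int) -> bool:
    """A NIC retries a normally collided frame, but not a late collision."""
    return octets_before_collision < SLOT_OCTETS


print(collision_kind(20, local_symptom=True))   # local
print(collision_kind(20, local_symptom=False))  # remote
print(collision_kind(200, local_symptom=True))  # late
print(nic_retransmits(200))                     # False
```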

The next page will discuss the sources of Ethernet errors.

Error handling

6.2.5 This page will describe collisions and how they are handled on a network.


Collisions are the most common error condition on Ethernet networks. Collisions are the mechanism for resolving contention for network access. A few collisions provide a smooth, simple, low overhead way for network nodes to arbitrate contention for the network resource. When network contention becomes too great, collisions can become a significant impediment to useful network operation.

Collisions result in network bandwidth loss that is equal to the initial transmission and the collision jam signal. This is consumption delay. It affects all network nodes and can cause a significant reduction in network throughput.

The considerable majority of collisions occur very early in the frame, often before the SFD. Collisions occurring before the SFD are usually not reported to the higher layers, as if the collision did not occur. As soon as a collision is detected, the sending stations transmit a 32-bit “jam” signal that will enforce the collision. This is done so that any data being transmitted is thoroughly corrupted and all stations have a chance to detect the collision.

In the figure, two stations listen to ensure that the cable is idle, then transmit. Station 1 was able to transmit a significant percentage of the frame before the signal even reached the last cable segment. Station 2 had not received the first bit of the transmission prior to beginning its own transmission and was only able to send several bits before the NIC sensed the collision. Station 2 immediately truncated the current transmission, substituted the 32-bit jam signal and ceased all transmissions. During the collision and jam event that Station 2 was experiencing, the collision fragments were working their way back through the repeated collision domain toward Station 1. Station 2 completed transmission of the 32-bit jam signal and became silent before the collision propagated back to Station 1, which was still unaware of the collision and continued to transmit. When the collision fragments finally reached Station 1, it also truncated the current transmission and substituted a 32-bit jam signal in place of the remainder of the frame it was transmitting. Upon sending the 32-bit jam signal Station 1 ceased all transmissions.

A jam signal may be composed of any binary data so long as it does not form a proper checksum for the portion of the frame already transmitted. The most commonly observed data pattern for a jam signal is simply a repeating one, zero, one, zero pattern, the same as Preamble. When viewed by a protocol analyzer this pattern appears as either a repeating hexadecimal 5 or A sequence. The corrupted, partially transmitted messages are often referred to as collision fragments or runts. Normal collisions are less than 64 octets in length and therefore fail both the minimum length test and the FCS checksum test.
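To see why the alternating pattern shows up as hexadecimal 5 or A, the following sketch builds a 32-bit alternating bit sequence and prints it in hex. The function name `jam_pattern` is hypothetical, and this only illustrates the commonly observed pattern described above, since a jam signal may carry other data.

```python
def jam_pattern(first_bit: int = 1, bits: int = 32) -> int:
    """Build an alternating one/zero pattern as an integer."""
    value = 0
    bit = first_bit
    for _ in range(bits):
        value = (value << 1) | bit  # append the next bit
        bit ^= 1                    # alternate between 1 and 0
    return value


print(f"{jam_pattern(first_bit=1):08X}")  # AAAAAAAA
print(f"{jam_pattern(first_bit=0):08X}")  # 55555555
```

Which of the two an analyzer displays depends only on whether the captured sequence happens to start on a one or a zero.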

The next page will define different types of collisions.

Interframe spacing and backoff

6.2.4 This page explains how spacing is used in an Ethernet network for data transmission.


The minimum spacing between two non-colliding frames is called the interframe spacing. This is measured from the last bit of the FCS field of the first frame to the first bit of the preamble of the second frame.

After a frame has been sent, all stations on a 10-Mbps Ethernet are required to wait a minimum of 96 bit-times (9.6 microseconds) before any station may legally transmit the next frame. On faster versions of Ethernet the spacing remains the same, 96 bit-times, but the time required for that interval grows correspondingly shorter. This interval is referred to as the spacing gap. The gap is intended to allow slow stations time to process the previous frame and prepare for the next frame.
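The relationship between the fixed 96 bit-time gap and the shrinking wall-clock interval is simple arithmetic, sketched below. The function name is hypothetical; the 96 bit-time figure is from the text.

```python
IFG_BIT_TIMES = 96  # interframe spacing in bit-times (from the text)


def interframe_gap_ns(speed_mbps: int) -> float:
    """Duration of the interframe gap in nanoseconds at a given speed."""
    bit_time_ns = 1000 / speed_mbps  # one bit lasts 1000/speed ns
    return IFG_BIT_TIMES * bit_time_ns


for speed in (10, 100, 1000):
    print(speed, "Mbps:", interframe_gap_ns(speed), "ns")
# 10 Mbps: 9600.0 ns (9.6 microseconds)
# 100 Mbps: 960.0 ns
# 1000 Mbps: 96.0 ns
```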

A repeater is expected to regenerate the full 64 bits of timing information, which is the preamble and SFD, at the start of any frame. This is despite the potential loss of some of the beginning preamble bits because of slow synchronization. Because of this forced reintroduction of timing bits, some minor reduction of the interframe gap is not only possible but expected. Some Ethernet chipsets are sensitive to a shortening of the interframe spacing, and will begin failing to see frames as the gap is reduced. With the increase in processing power at the desktop, it would be very easy for a personal computer to saturate an Ethernet segment with traffic and to begin transmitting again before the interframe spacing delay time is satisfied.

After a collision occurs and all stations allow the cable to become idle (each waits the full interframe spacing), then the stations that collided must wait an additional and potentially progressively longer period of time before attempting to retransmit the collided frame. The waiting period is intentionally designed to be random so that two stations do not delay for the same amount of time before retransmitting, which would result in more collisions. This is accomplished in part by expanding the interval from which the random retransmission time is selected on each retransmission attempt. The waiting period is measured in increments of the parameter slot time.

If the MAC layer is unable to send the frame after sixteen attempts, it gives up and generates an error to the network layer. Such an occurrence is fairly rare and would happen only under extremely heavy network loads, or when a physical problem exists on the network.
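The expanding random interval can be sketched as follows. The text does not give the exact rule, so this assumes the truncated binary exponential backoff defined in IEEE 802.3, where the wait after the nth collision is a random number of slot times between 0 and 2^k - 1, with k = min(n, 10). The function name and the choice of exception are hypothetical.

```python
import random

MAX_ATTEMPTS = 16   # the MAC layer gives up after sixteen attempts (from the text)
BACKOFF_LIMIT = 10  # assumption: exponent cap taken from IEEE 802.3, not this page


def backoff_slots(collision_count: int, rng=random) -> int:
    """Number of slot times to wait after the given consecutive collision."""
    if collision_count >= MAX_ATTEMPTS:
        # Sixteenth failed attempt: report an error instead of retrying.
        raise RuntimeError("excessive collisions, frame dropped")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)  # uniform in 0 .. 2**k - 1


for n in (1, 2, 3, 10, 15):
    print(f"collision {n}: wait {backoff_slots(n)} slot times")
```

Because the range doubles with each collision, two stations that collided once are increasingly unlikely to pick the same delay on later attempts.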

The next page will discuss collisions.

Ethernet timing

6.2.3 This page explains the importance of slot times in an Ethernet network.


The basic rules and specifications for proper operation of Ethernet are not particularly complicated, though some of the faster physical layer implementations are becoming so. Despite the basic simplicity, when a problem occurs in Ethernet it is often quite difficult to isolate the source. Because of the common bus architecture of Ethernet, also described as a distributed single point of failure, the scope of the problem usually encompasses all devices within the collision domain. In situations where repeaters are used, this can include devices up to four segments away.

Any station on an Ethernet network wishing to transmit a message first “listens” to ensure that no other station is currently transmitting. If the cable is quiet, the station will begin transmitting immediately. The electrical signal takes time to travel down the cable (delay), and each subsequent repeater introduces a small amount of latency in forwarding the frame from one port to the next. Because of the delay and latency, it is possible for more than one station to begin transmitting at or near the same time. This results in a collision.

If the attached station is operating in full duplex then the station may send and receive simultaneously and collisions should not occur. Full-duplex operation also changes the timing considerations and eliminates the concept of slot time. Full-duplex operation allows for larger network architecture designs since the timing restriction for collision detection is removed.

In half duplex, assuming that a collision does not occur, the sending station will transmit 64 bits of timing synchronization information that is known as the preamble. The sending station will then transmit the following information:

• Destination and source MAC addressing information

• Certain other header information

• The actual data payload

• Checksum (FCS) used to ensure that the message was not corrupted along the way

Stations receiving the frame recalculate the FCS to determine if the incoming message is valid and then pass valid messages to the next higher layer in the protocol stack.

10 Mbps and slower versions of Ethernet are asynchronous. Asynchronous means that each receiving station will use the eight octets of timing information to synchronize the receive circuit to the incoming data, and then discard it. 100 Mbps and higher speed implementations of Ethernet are synchronous. Synchronous means the timing information is not required. However, for compatibility reasons the Preamble and Start Frame Delimiter (SFD) are still present.

For all speeds of Ethernet transmission at or below 1000 Mbps, the standard describes how a transmission may be no smaller than the slot time. Slot time for 10 and 100-Mbps Ethernet is 512 bit-times, or 64 octets. Slot time for 1000-Mbps Ethernet is 4096 bit-times, or 512 octets. Slot time is calculated assuming maximum cable lengths on the largest legal network architecture. All hardware propagation delay times are at the legal maximum and the 32-bit jam signal is used when collisions are detected.
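Converting these slot times into octets and wall-clock time is straightforward, as the sketch below shows. The table and function names are hypothetical; the bit-time values are the ones quoted above.

```python
# Slot time in bit-times for each speed in Mbps (values from the text).
SLOT_BIT_TIMES = {10: 512, 100: 512, 1000: 4096}


def slot_time_octets(speed_mbps: int) -> int:
    """Slot time expressed in octets."""
    return SLOT_BIT_TIMES[speed_mbps] // 8


def slot_time_us(speed_mbps: int) -> float:
    """Slot time in microseconds: bit-times divided by megabits per second."""
    return SLOT_BIT_TIMES[speed_mbps] / speed_mbps


for speed in (10, 100, 1000):
    print(speed, "Mbps:", slot_time_octets(speed), "octets,", slot_time_us(speed), "us")
# 10 Mbps: 64 octets, 51.2 us
# 100 Mbps: 64 octets, 5.12 us
# 1000 Mbps: 512 octets, 4.096 us
```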

The actual calculated slot time is just longer than the theoretical amount of time required to travel between the furthest points of the collision domain, collide with another transmission at the last possible instant, and then have the collision fragments return to the sending station and be detected. For the system to work the first station must learn about the collision before it finishes sending the smallest legal frame size. To allow 1000-Mbps Ethernet to operate in half duplex the extension field was added when sending small frames purely to keep the transmitter busy long enough for a collision fragment to return. This field is present only on 1000-Mbps, half-duplex links and allows minimum-sized frames to be long enough to meet slot time requirements. Extension bits are discarded by the receiving station.
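As a rough illustration of the extension field, the sketch below computes how many extension octets a frame would need to fill the 512-octet slot time on a 1000-Mbps half-duplex link. The function name is hypothetical, and the calculation assumes the extension simply makes up the difference between the frame size and the slot time.

```python
GIGABIT_SLOT_OCTETS = 512  # slot time for 1000-Mbps Ethernet (from the text)


def extension_octets(frame_octets: int) -> int:
    """Extension octets needed so the transmission lasts a full slot time."""
    return max(0, GIGABIT_SLOT_OCTETS - frame_octets)


print(extension_octets(64))    # 448
print(extension_octets(512))   # 0
print(extension_octets(1518))  # 0
```

A minimum-sized 64-octet frame therefore occupies the medium for eight times its own length, which is part of why small frames are inefficient on half-duplex Gigabit links.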

On 10-Mbps Ethernet one bit at the MAC layer requires 100 nanoseconds (ns) to transmit. At 100 Mbps that same bit requires 10 ns to transmit and at 1000 Mbps only takes 1 ns. As a rough estimate, 20.3 cm (8 in) per nanosecond is often used for calculating propagation delay down a UTP cable. For 100 meters of UTP, this means that it takes just under 5 bit-times for a 10BASE-T signal to travel the length of the cable.
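The "just under 5 bit-times" figure can be reproduced with the rough estimate above. The function name is hypothetical, and the 20.3 cm per nanosecond figure is the approximation quoted in the text, not a measured value for any particular cable.

```python
CM_PER_NS = 20.3  # rough propagation estimate for UTP (from the text)


def propagation_bit_times(length_m: float, speed_mbps: int) -> float:
    """One-way cable delay expressed in bit-times at the given speed."""
    delay_ns = length_m * 100 / CM_PER_NS  # metres -> centimetres -> ns
    bit_time_ns = 1000 / speed_mbps
    return delay_ns / bit_time_ns


for speed in (10, 100, 1000):
    print(f"{speed} Mbps: {propagation_bit_times(100, speed):.1f} bit-times")
# 10 Mbps: 4.9 bit-times
# 100 Mbps: 49.3 bit-times
# 1000 Mbps: 492.6 bit-times
```

At 1000 Mbps the one-way delay over 100 meters is close to the 512 bits of a minimum-sized frame, which is why the faster speeds need special treatment in half duplex.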

For CSMA/CD Ethernet to operate, the sending station must become aware of a collision before it has completed transmission of a minimum-sized frame. At 100 Mbps the system timing is barely able to accommodate 100 meter cables. At 1000 Mbps special adjustments are required as nearly an entire minimum-sized frame would be transmitted before the first bit reached the end of the first 100 meters of UTP cable. For this reason half duplex is not permitted in 10-Gigabit Ethernet.

The next page defines interframe spacing and backoff.

MAC / MAC rules and collision detection/backoff

MAC
6.2.1 This page will define MAC and provide examples of deterministic and non-deterministic MAC protocols.


Media Access Control (MAC) refers to protocols that determine which computer in a shared-media environment, or collision domain, is allowed to transmit data. MAC and Logical Link Control (LLC) are the two sublayers that make up the IEEE version of OSI Layer 2. The two broad categories of MAC are deterministic and non-deterministic.

Examples of deterministic protocols include Token Ring and FDDI. In a Token Ring network, hosts are arranged in a ring and a special data token travels around the ring to each host in sequence. When a host wants to transmit, it seizes the token, transmits the data for a limited time, and then forwards the token to the next host in the ring. Token Ring is a collisionless environment since only one host can transmit at a time.

Non-deterministic MAC protocols use a first-come, first-served approach. CSMA/CD is a simple system of this kind. The NIC listens for the absence of a signal on the media and begins to transmit. If two nodes transmit at the same time, a collision occurs and neither transmission succeeds.

Three common Layer 2 technologies are Token Ring, FDDI, and Ethernet. All three specify Layer 2 issues, LLC, naming, framing, and MAC, as well as Layer 1 signaling components and media issues. The specific technologies for each are as follows:

• Ethernet – uses a logical bus topology to control information flow on a linear bus and a physical star or extended star topology for the cables
• Token Ring – uses a logical ring topology to control information flow and a physical star topology
• FDDI – uses a logical ring topology to control information flow and a physical dual-ring topology

The next page explains how collisions are avoided in an Ethernet network.

MAC rules and collision detection/backoff
6.2.2 This page describes collision detection and avoidance in a CSMA/CD network.


Ethernet is a shared-media broadcast technology. The access method CSMA/CD used in Ethernet performs three functions:

• Transmitting and receiving data packets

• Decoding data packets and checking them for valid addresses before passing them to the upper layers of the OSI model

• Detecting errors within data packets or on the network

In the CSMA/CD access method, networking devices with data to transmit work in a listen-before-transmit mode. This means when a node wants to send data, it must first check to see whether the networking media is busy. If the node determines the network is busy, the node will wait a random amount of time before retrying. If the node determines the networking media is not busy, the node will begin transmitting and listening. The node listens to ensure no other stations are transmitting at the same time. After completing data transmission the device will return to listening mode.

Networking devices detect a collision has occurred when the amplitude of the signal on the networking media increases. When a collision occurs, each node that is transmitting will continue to transmit for a short time to ensure that all nodes detect the collision. When all nodes have detected the collision, the backoff algorithm is invoked and transmission stops. The nodes stop transmitting for a random period of time, determined by the backoff algorithm. When the delay periods expire, each node can attempt to access the networking media. The devices that were involved in the collision do not have transmission priority.
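The listen, transmit, detect, and back off sequence described above can be traced with a small sketch. It does not model a real medium or real timing. The function name and both inputs are hypothetical: `busy_polls` is how many times the node finds the media busy before its first attempt, and `collisions` is how many attempts collide before one succeeds.

```python
def csma_cd_events(busy_polls: int, collisions: int) -> list:
    """Return the ordered steps one node takes to send a single frame."""
    events = []
    # Listen before transmit: wait while the media is busy.
    for _ in range(busy_polls):
        events.append("media busy: wait")
    for attempt in range(collisions + 1):
        events.append("media idle: transmit and listen")
        if attempt < collisions:
            # Keep transmitting briefly so every node sees the collision.
            events.append("collision detected: send jam")
            events.append("stop and back off for a random time")
        else:
            events.append("frame sent: return to listening")
    return events


for step in csma_cd_events(busy_polls=1, collisions=2):
    print(step)
```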

The Interactive Media Activity shows the procedure for collision detection in an Ethernet network.

The next page will discuss Ethernet timing.

Saturday, January 16, 2010

Ethernet frame structure / Ethernet frame fields

Ethernet frame structure
6.1.6 This page will describe the frame structure of Ethernet networks.


At the data link layer the frame structure is nearly identical for all speeds of Ethernet from 10 Mbps to 10,000 Mbps. However, at the physical layer almost all versions of Ethernet are very different. Each speed has a distinct set of architecture design rules.

In the version of Ethernet that was developed by DIX prior to the adoption of the IEEE 802.3 version of Ethernet, the Preamble and Start-of-Frame (SOF) Delimiter were combined into a single field. The binary pattern was identical. The field labeled Length/Type was only listed as Length in the early IEEE versions and only as Type in the DIX version. These two uses of the field were officially combined in a later IEEE version since both uses were common.

The Ethernet II Type field is incorporated into the current 802.3 frame definition. When a node receives a frame it must examine the Length/Type field to determine which higher-layer protocol is present. If the two-octet value is equal to or greater than 0x0600 hexadecimal, 1536 decimal, then the contents of the Data Field are decoded according to the protocol indicated.

The next page will discuss the information included in a frame.

Ethernet frame fields
6.1.7 This page defines the fields that are used in a frame.


Some of the fields permitted or required in an 802.3 Ethernet frame are as follows:

• Preamble
• SOF Delimiter
• Destination Address
• Source Address
• Length/Type
• Header and Data
• FCS
• Extension

The preamble is an alternating pattern of ones and zeros used for timing synchronization in 10 Mbps and slower implementations of Ethernet. Faster versions of Ethernet are synchronous so this timing information is unnecessary but retained for compatibility.

A SOF delimiter consists of a one-octet field that marks the end of the timing information and contains the bit sequence 10101011.

The destination address can be unicast, multicast, or broadcast.

The Source Address field contains the MAC source address. The source address is generally the unicast address of the Ethernet node that transmitted the frame. However, many virtual protocols use and sometimes share a specific source MAC address to identify the virtual entity.

The Length/Type field supports two different uses. If the value is less than 1536 decimal, 0x0600 hexadecimal, then the value indicates length. The length indicates the number of bytes of data that follows this field, and the length interpretation is used when the LLC layer provides the protocol identification. If the value is equal to or greater than 1536, it indicates type. The type value indicates which upper-layer protocol will receive the data after the Ethernet process is complete.
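The two interpretations can be sketched as a simple threshold test. The function names are hypothetical. The sketch assumes the frame bytes start at the Destination Address (no preamble or SOF delimiter), so the Length/Type field sits at octets 12 and 13 after the two 6-octet addresses. The example value 0x0800 is the well-known type for IPv4.

```python
TYPE_THRESHOLD = 0x0600  # 1536 decimal (from the text)


def interpret_length_type(value: int) -> tuple:
    """Return ("type", value) or ("length", value) for a Length/Type value."""
    if value >= TYPE_THRESHOLD:
        return ("type", value)
    return ("length", value)


def length_type_of(frame: bytes) -> tuple:
    """Read the two-octet field that follows the two 6-octet addresses."""
    value = int.from_bytes(frame[12:14], "big")
    return interpret_length_type(value)


header = bytes(6) + bytes(6) + bytes([0x08, 0x00])  # dummy addresses + 0x0800
print(length_type_of(header))       # ('type', 2048)
print(interpret_length_type(46))    # ('length', 46)
```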

The Data field, together with padding if necessary, may be of any length that does not cause the frame to exceed the maximum frame size. The maximum transmission unit (MTU) for Ethernet is 1500 octets, so the data should not exceed that size. The content of this field is unspecified. An unspecified amount of data is inserted immediately after the user data when there is not enough user data for the frame to meet the minimum frame length. This extra data is called a pad. Ethernet requires each frame to be between 64 and 1518 octets.
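The amount of padding follows from the 64-octet minimum, as sketched below. The function name is hypothetical. The sketch assumes an untagged frame with 14 octets of header (two 6-octet addresses plus the 2-octet Length/Type field) and a 4-octet FCS, which leaves a minimum of 46 octets for data and pad.

```python
HEADER_OCTETS = 14  # destination (6) + source (6) + Length/Type (2)
FCS_OCTETS = 4
MIN_FRAME = 64      # minimum frame size in octets (from the text)
MTU = 1500          # maximum data size in octets (from the text)


def pad_octets(data_octets: int) -> int:
    """Pad needed so the whole frame reaches the minimum frame size."""
    if data_octets > MTU:
        raise ValueError("data exceeds the Ethernet MTU")
    min_data = MIN_FRAME - HEADER_OCTETS - FCS_OCTETS  # 46 octets
    return max(0, min_data - data_octets)


print(pad_octets(10))    # 36
print(pad_octets(46))    # 0
print(pad_octets(1500))  # 0
```

With 1500 octets of data the same arithmetic gives 14 + 1500 + 4 = 1518 octets, the maximum frame size.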

The FCS field contains a 4-byte CRC value that is created by the device that sends data and is recalculated by the destination device to check for damaged frames. The corruption of a single bit anywhere from the start of the Destination Address through the end of the FCS field will cause the checksum to be different. Therefore, the coverage of the FCS includes itself. It is not possible to distinguish between corruption of the FCS and corruption of any other field used in the calculation.
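The effect of a single corrupted bit can be demonstrated with Python's `zlib.crc32`, which implements the same CRC-32 polynomial that Ethernet uses. This is a sketch only: it does not model the bit ordering used on the wire, the helper names are hypothetical, and the sample bytes are a made-up stand-in for the fields from Destination Address through the end of the data.

```python
import zlib


def fcs(frame_without_fcs: bytes) -> int:
    """CRC-32 over the frame contents, as a 32-bit unsigned value."""
    return zlib.crc32(frame_without_fcs) & 0xFFFFFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


frame = bytes(range(60))       # made-up 60 octets standing in for a frame
sent_fcs = fcs(frame)          # value the sender would attach
damaged = flip_bit(frame, 137) # corrupt a single bit in transit

print(fcs(frame) == sent_fcs)    # True: intact frame passes the check
print(fcs(damaged) == sent_fcs)  # False: the receiver detects the damage
```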

This page concludes this lesson. The next lesson will discuss the functions of an Ethernet network. The first page will introduce the concept of MAC.