Thursday, December 8, 2011

Capture the burst in traffic

Here is a story about a customer with a symmetric 30 Mbps MPLS link, over which he transmits all kinds of traffic, including voice and video.
After a few days the customer started to complain about packet loss in his voice/video applications in the receive direction, with no specific time or event triggering it.
A look at the customer's router interface showed almost nothing unusual:

GigabitEthernet0/1 is up, line protocol is up
  Hardware is iGbE, address is 4022.22ff.2220 (bia 4022.22ff.2220)
  Internet address is
  MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
     reliability 255/255, txload 65/255, rxload 18/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full Duplex, 1Gbps, media type is SX
  output flow-control is unsupported, input flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 1w0d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 25981268
  Queueing strategy: Class-based queueing
  Output queue: 0/1000/0 (size/max total/drops)
  30 second input rate 7158000 bits/sec, 2225 packets/sec
  30 second output rate 25707000 bits/sec, 2524 packets/sec
     926151472 packets input, 4244873795 bytes, 0 no buffer
     Received 101822 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 100176 multicast, 0 pause input
     1387944109 packets output, 363334323 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     89711 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

Nothing, that is, apart from the output drops (Total output drops: 25981268).
But if the input/output rate never exceeds the CIR, where are these output drops coming from?

The router lets us set the interface statistics interval to a minimum of 30 seconds, and we think about the CIR in megabits per second, while the router itself works on a millisecond timescale. So we need a way to measure and "see" the traffic the same way the router sees it.
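To make the averaging effect concrete, here is a small arithmetic sketch. The numbers are hypothetical and not taken from the customer's link: steady traffic of 10 Mbps with a single 50 msec burst at 70 Mbps on top of it. The function name `windowed_average_bps` is made up for this illustration.

```python
def windowed_average_bps(base_bps, burst_bps, burst_seconds, window_seconds):
    """Average rate over one window that contains a single burst.

    Outside the burst the traffic runs at base_bps; during the burst
    it runs at burst_bps. All values are hypothetical.
    """
    burst_bits = burst_bps * burst_seconds
    base_bits = base_bps * (window_seconds - burst_seconds)
    return (burst_bits + base_bits) / window_seconds


# The same 50 msec burst, averaged over shorter and shorter windows.
for window in (30.0, 1.0, 0.05):
    avg = windowed_average_bps(10e6, 70e6, 0.05, window)
    print(f"{window:>5} s window: {avg / 1e6:.2f} Mbps")
```

Over a 30 second window the burst adds only about 0.1 Mbps to the average, which is why the interface counters look perfectly healthy.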

Using Wireshark and SPAN, I captured packets for a few minutes in the receive direction on this interface. I then ran Statistics -> IO Graphs to analyze the traffic, and at first glance it seemed that no packet loss should occur, since the traffic never exceeded 12 Mbps:

Note that the tick interval is set to 1 second. Changing it to 0.001 sec (1 msec) shows a completely different picture...

Note how the Y-axis scale has automatically changed to a maximum of 100 Mbps. The spike in the middle clearly shows a traffic burst of more than 70 Mbps! Searching further along the graph reveals a few more spikes like that.
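The effect of the tick interval can be reproduced outside Wireshark with a few lines of Python. This is only a sketch of the idea, not how Wireshark computes its graphs: it bins packets by timestamp and reports the busiest bin. The trace is synthetic (one 1250-byte packet every millisecond, plus six extra packets squeezed into a single millisecond), and `peak_rate_bps` is a name invented for the example.

```python
from collections import Counter


def peak_rate_bps(packets, interval_us):
    """Peak rate in bits/sec when packets are binned into interval_us buckets.

    packets is an iterable of (timestamp_in_microseconds, length_in_bytes).
    """
    bytes_per_bucket = Counter()
    for ts_us, length in packets:
        bytes_per_bucket[ts_us // interval_us] += length
    if not bytes_per_bucket:
        return 0.0
    return max(bytes_per_bucket.values()) * 8 * 1_000_000 / interval_us


# Synthetic trace: steady 10 Mbps for one second...
trace = [(t_us, 1250) for t_us in range(0, 1_000_000, 1000)]
# ...plus a micro-burst of six extra packets inside one millisecond.
trace += [(500_000 + i * 10, 1250) for i in range(1, 7)]

print(f"1 s buckets:  {peak_rate_bps(trace, 1_000_000) / 1e6:.2f} Mbps")
print(f"1 ms buckets: {peak_rate_bps(trace, 1_000) / 1e6:.2f} Mbps")
```

With one-second buckets the synthetic trace peaks at about 10 Mbps; with one-millisecond buckets the same packets peak at 70 Mbps.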

Network devices that use policing, shaping or rate limiting can't handle these kinds of spikes over such a very short period of time, and that's the reason I was able to see these spikes in my capture. For a policer/shaper/rate limiter to work, the burst should last for more than 100 msec.
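One way to picture why a short spike causes drops even when the average rate is far below the CIR is a generic single-rate token bucket. This is a textbook model under stated assumptions, not the actual implementation of any router: the 30 Mbps CIR comes from the story above, while the 15000-byte bucket size, the packet sizes and the traces are hypothetical, and `police` is a made-up function name.

```python
def police(packets, cir_bps, burst_bytes):
    """Count packets dropped by a simple single-rate token bucket.

    packets is a list of (timestamp_in_microseconds, length_in_bytes),
    sorted by time. The bucket starts full and refills at cir_bps.
    """
    tokens = float(burst_bytes)
    last_us = None
    dropped = 0
    for ts_us, length in packets:
        if last_us is not None:
            # Refill for the elapsed time, capped at the bucket size.
            refill = (ts_us - last_us) * cir_bps / 8 / 1_000_000
            tokens = min(burst_bytes, tokens + refill)
        last_us = ts_us
        if length <= tokens:
            tokens -= length  # conforming packet
        else:
            dropped += 1      # exceeding packet is dropped
    return dropped


# Smooth: one 1250-byte packet per millisecond (10 Mbps) for one second.
smooth = [(t_us, 1250) for t_us in range(0, 1_000_000, 1000)]
# Bursty: 100 packets back to back, 100 usec apart (10 msec in total).
bursty = [(i * 100, 1250) for i in range(100)]

print("smooth drops:", police(smooth, 30_000_000, 15_000))
print("bursty drops:", police(bursty, 30_000_000, 15_000))
```

The bursty trace carries only 1 Mbit in total, so any per-second graph shows it far below 30 Mbps, yet the bucket empties within the first couple of milliseconds and packets start to drop. The smooth trace, carrying ten times more data, loses nothing.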

Anyway, this was the cause of the packet loss, and implementing MQC on the customer's routers, with a reservation for voice and video traffic, pretty much ended this problem.
