Introduction to QoS

What is QoS? The best definition I have heard is that QoS is "managed unfairness": different types of traffic are deliberately managed differently – makes sense 😉

There are lots of QoS mechanisms out there, but all of them can be grouped into three categories:

  • Best effort – not strict at all
  • DiffServ – less strict
  • IntServ – strict

Best effort is not really a QoS mechanism at all; it is just FIFO. IntServ (Integrated Services) is very strict, and it uses RSVP (Resource Reservation Protocol). With IntServ you reserve bandwidth for a single application, and no other application can touch that bandwidth, even when it sits unused. As I said, it is very strict.
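
On a Cisco router, enabling that kind of reservation could look roughly like this. The interface name and the 768/64 kb/s figures are just placeholder values for this sketch:

  interface Serial0/0
   ! allow RSVP to reserve up to 768 kb/s on this link, at most 64 kb/s per flow
   ! (Serial0/0 and both numbers are arbitrary examples)
   ip rsvp bandwidth 768 64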

Today's QoS implementations usually take the middle approach, DiffServ (Differentiated Services). It really does differentiate flows and applies different policies to them. For example, we can say that a voice call will get at least 768 kb/s. But it is not as strict as IntServ: if voice is not using that bandwidth at the moment, some other application can take it. So we kind of share the bandwidth.
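
As a rough illustration in Cisco's Modular QoS CLI (MQC), that guarantee could be expressed with a bandwidth statement. The class name VOICE, the policy name WAN-OUT and the interface Serial0/0 are made-up placeholders, not a recommended design:

  class-map match-all VOICE
   match ip dscp ef
  !
  policy-map WAN-OUT
   class VOICE
    ! guarantee at least 768 kb/s to voice; when voice is idle,
    ! other classes can borrow this bandwidth (the DiffServ "sharing")
    bandwidth 768
   class class-default
    fair-queue
  !
  interface Serial0/0
   service-policy output WAN-OUT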

Common QoS Mechanisms:

  • Classification and marking – you should classify and mark the packet as early as possible, ideally at the network edge. When a packet is marked early, every subsequent router or switch can apply a specific policy based on that marking, such as a forwarding or dropping decision. Classification and marking don't alter the behavior of traffic by themselves; they just give it a label so it can be treated differently later. Classification organizes packets into different traffic types, to which different policies can be applied, and classification can happen without marking. Marking writes a value into the packet and establishes a trust boundary at the network edge. We can classify packets based on many attributes: L1 (ingress interface), L2 (802.1Q CoS), L3 (IP DSCP), L7 (NBAR) and so on. When it comes to marking we are more limited: you mark at L2 (CoS), L2.5 (MPLS experimental bits) or L3 (DSCP/IP Precedence). A marking sketch follows this list.
  • Queuing – imagine aggregating 1 Gbit/s of LAN traffic on a router and sending it out a 4 Mbit/s WAN interface. What is going to happen? The router allocates some memory called a buffer, or a queue, and temporarily stores the packets there, hoping the oversubscription will clear and it will be able to send the packets on their way. But what if the buffer fills up? We start dropping packets – all packets, the good and the bad – which is obviously not a good situation. What if we take this one big queue and subdivide it into sub-queues? For example, VoIP goes into one queue and all other traffic into a second queue. Then, during oversubscription, the VoIP traffic has its own queue, and we can influence which queue gets precedence and priority, so the VoIP traffic will not be dropped. Queuing is all about guaranteeing some amount of bandwidth to certain applications. A queuing sketch follows this list.
  • Congestion avoidance – what if we don't want to face a full queue at all? We can use something called congestion avoidance. Cisco's implementation is WRED (Weighted Random Early Detection). WRED discards some packets before the queue is full, sacrificing a few flows for the benefit of all the other flows. A WRED sketch follows this list.
  • Policing and shaping – these are also called traffic conditioners. While queuing is about guaranteeing a minimum amount of bandwidth to an application, policing and shaping are the precise opposite: a speed limit, an amount of bandwidth an application cannot exceed, so that no single application can swallow the whole link. The difference between them is that a policer discards traffic that exceeds the limit, while a shaper does not discard it but delays it in a queue instead. A policing and shaping sketch follows this list.
  • Link efficiency – interactive traffic such as Telnet and Voice over IP (VoIP) is susceptible to increased latency when the network processes large packets, such as LAN-to-LAN FTP transfers traversing a WAN. Packet delay is especially significant when the FTP packets are queued on slower links within the WAN. To solve delay problems on slow links, we need a way to fragment larger packets and queue the smaller packets between the fragments of the large ones. The LFI (link fragmentation and interleaving) scheme is relatively simple: large datagrams are multilink encapsulated and fragmented into packets small enough to satisfy the delay requirements of the delay-sensitive traffic, while small delay-sensitive packets are not multilink encapsulated but are interleaved between the fragments of the large datagram. An LFI sketch follows this list.
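
To make the classification and marking bullet concrete, here is a rough MQC sketch that classifies voice with NBAR and marks it DSCP EF on ingress. The class and policy names and the interface are assumptions for this sketch:

  class-map match-any VOICE-TRAFFIC
   match protocol rtp                    ! L7 classification via NBAR
  !
  policy-map EDGE-MARKING
   class VOICE-TRAFFIC
    set ip dscp ef                       ! L3 marking: write DSCP EF into the IP header
  !
  interface GigabitEthernet0/1
   ! mark on ingress, as close to the source as possible (the trust boundary)
   service-policy input EDGE-MARKING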
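For the queuing bullet, here is a sketch of LLQ (Low Latency Queuing), which gives voice its own strict-priority sub-queue; again, names and numbers are placeholders:

  policy-map WAN-QUEUING
   class VOICE-TRAFFIC
    ! strict-priority queue: voice is always served first, but during
    ! congestion it is policed to 768 kb/s so it cannot starve the rest
    priority 768
   class class-default
    fair-queue                           ! everything else shares a fair queue
  !
  interface Serial0/0
   service-policy output WAN-QUEUING

Note the difference from the earlier bandwidth statement: bandwidth only guarantees a minimum, while priority also gives the class strict precedence over the other queues.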
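For congestion avoidance, WRED is enabled per class in MQC. The BULK-DATA class and the 512 kb/s figure are invented for the sketch:

  policy-map WAN-QUEUING
   class BULK-DATA
    bandwidth 512                        ! CBWFQ class with a 512 kb/s guarantee
    ! WRED: as the queue grows, randomly drop some packets early,
    ! weighted by DSCP, instead of tail-dropping everything once it is full
    random-detect dscp-based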
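To contrast policing and shaping, here are one policer and one shaper side by side; the class names and rates are placeholders:

  policy-map POLICE-P2P
   class P2P
    ! policer: traffic above 1 Mbit/s is discarded outright
    police 1000000 conform-action transmit exceed-action drop
  !
  policy-map SHAPE-WAN
   class class-default
    ! shaper: traffic above 4 Mbit/s is buffered and delayed, not dropped
    shape average 4000000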
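Finally, a sketch of LFI using Multilink PPP. Exact command syntax varies a bit between IOS versions, and the addressing and group number here are placeholders:

  interface Multilink1
   ip address 10.1.1.1 255.255.255.252
   ppp multilink
   ppp multilink fragment delay 10       ! fragment large packets so no fragment blocks the link for more than ~10 ms
   ppp multilink interleave              ! slip small delay-sensitive packets in between the fragments
   ppp multilink group 1
  !
  interface Serial0/0
   encapsulation ppp
   ppp multilink
   ppp multilink group 1                 ! bind the physical link to Multilink1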