What is Quality of Service (QoS)?

Quality of Service (QoS) is a set of techniques used to manage network traffic and ensure that critical or latency-sensitive traffic (like voice and video) receives preferential treatment over less time-sensitive traffic (like email or file downloads). QoS works by classifying and marking traffic, then managing bandwidth, delay, jitter, and packet loss through mechanisms like queuing, traffic shaping, and traffic policing.

What is the difference between traffic shaping and traffic policing?

Traffic shaping smooths out bursts by buffering excess traffic in a queue and releasing it at a controlled rate. Traffic that exceeds the configured rate is delayed rather than dropped, so shaping adds delay but causes loss only if the buffer itself overflows. Traffic policing enforces a hard rate limit: traffic that exceeds the limit is immediately dropped (or re-marked to a lower priority class). Shaping is gentler and preferred for TCP applications; policing is harsher but provides stricter enforcement.

What is DSCP?

DSCP (Differentiated Services Code Point) is a 6-bit field in the IP header's ToS (Type of Service) byte, occupying its six most significant bits, used to mark packets with a QoS priority class. Routers and switches read the DSCP value to decide how to handle each packet: which queue it goes into and whether it gets preferential forwarding. Key values: EF (Expedited Forwarding, DSCP 46) for voice; AF41 (DSCP 34) for video; CS0 (DSCP 0) for best-effort traffic.

Quality of Service (QoS) — DSCP, Shaping & Queuing

⚡ Quick Answer

QoS (Quality of Service) manages network traffic to give priority to latency-sensitive applications. Key concepts: Classification = identify and mark traffic (DSCP field in IP header). Traffic shaping = buffer excess traffic and release it smoothly (causes delay, not drops). Traffic policing = enforce hard rate limits and drop excess traffic immediately. DSCP EF (46) = voice (highest priority). DSCP AF41 (34) = video. DSCP CS0 (0) = best-effort. CoS = Layer 2 equivalent using 802.1p bits in the VLAN tag.

Why QoS Matters

Networks have finite bandwidth. When a link becomes congested, packets queue up and routers must decide which ones to forward first and which ones to drop. Without QoS, that decision ignores traffic type: packets are forwarded in arrival order and dropped when the queue is full, so a large file download can crowd out a VoIP call even though the call is far more sensitive to delay.

QoS solves this by providing a framework to classify, mark, and manage traffic based on its type and business importance. Voice calls get forwarded immediately with guaranteed bandwidth. Video gets a large dedicated queue. Bulk file transfers get whatever bandwidth remains. The result is consistent call quality even on a busy network.

Three properties of traffic quality matter most for real-time applications: latency (one-way delay — voice needs under 150ms to sound natural), jitter (variation in latency — causes choppy audio/video), and packet loss (dropped packets — TCP retransmits them but UDP doesn't, causing gaps in voice/video).

QoS Models

🏗️

IntServ (Integrated Services)

Applications reserve bandwidth end-to-end across the network using the RSVP (Resource Reservation Protocol) signalling protocol. Each router along the path must be aware of and maintain the reservation. Provides guaranteed, hard QoS. Drawback: doesn't scale — every router must maintain state for every flow. Rarely used on large networks today.

RSVP · Hard guarantees · Per-flow state

📊

DiffServ (Differentiated Services)

Traffic is classified and marked at the network edge using the DSCP field. Core routers apply forwarding behaviors based on DSCP markings without tracking individual flows — highly scalable. The dominant QoS model in modern enterprise and service provider networks. Per-Hop Behavior (PHB) is the forwarding behavior applied at each router based on the DSCP value.

DSCP · Scalable · Per-Hop Behavior

DSCP Markings — Traffic Classes

DSCP (Differentiated Services Code Point) uses 6 bits in the IP header's ToS field, allowing 64 possible values. In practice, a handful of well-known values are used for specific traffic types.

DSCP Name	Binary	Decimal	Typical Use	Drop Probability
EF — Expedited Forwarding	101110	46	VoIP / real-time voice	None — highest priority queue
AF41 — Assured Forwarding 4,1	100010	34	Video conferencing	Low
AF31 — Assured Forwarding 3,1	011010	26	Business-critical apps (ERP, CRM)	Low
AF21 — Assured Forwarding 2,1	010010	18	Transactional data	Low
AF11 — Assured Forwarding 1,1	001010	10	Bulk data, background transfers	Low
CS0 — Best Effort (BE)	000000	0	Default — unclassified traffic	Highest (dropped first)
CS6 / CS7	110000 / 111000	48 / 56	Network control traffic (routing protocols, OSPF, BGP)	None — reserved for network use
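The binary and decimal columns above follow a regular pattern, which can be easier to remember as arithmetic than as a list. The sketch below is a minimal illustration in Python: Assured Forwarding class x with drop precedence y works out to 8x + 2y, Class Selector n works out to 8n, and the 6-bit DSCP sits in the upper bits of the ToS byte. The function names are illustrative and not part of any library.

```python
EF = 46  # Expedited Forwarding, as listed in the table


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Decimal DSCP for Assured Forwarding AFxy (x = class, y = drop precedence)."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF classes are 1-4 and drop precedences are 1-3")
    return 8 * af_class + 2 * drop_precedence


def cs_dscp(n: int) -> int:
    """Decimal DSCP for Class Selector CSn."""
    if not 0 <= n <= 7:
        raise ValueError("class selectors are CS0-CS7")
    return 8 * n


def dscp_bits(dscp: int) -> str:
    """Six-bit binary string for a DSCP value."""
    return format(dscp, "06b")


def tos_byte(dscp: int) -> int:
    """Value of the whole ToS byte when the two low-order bits are zero."""
    return dscp << 2


if __name__ == "__main__":
    print("AF41:", af_dscp(4, 1), dscp_bits(af_dscp(4, 1)))
    print("EF:  ", EF, dscp_bits(EF), "ToS byte", hex(tos_byte(EF)))
```

The ToS byte value is worth knowing because packet captures and some tools show the full byte rather than the DSCP: EF appears there as 184 (0xB8), not 46.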

🎯 Exam Tip — DSCP Numbers to Know

EF = 46 = VoIP. This is the most tested value. If the question mentions voice traffic and a DSCP number, it's 46.

AF4x = Video. AF41 (34) is standard for video conferencing.

CS0 = 0 = Best Effort. Default marking for traffic that hasn't been classified. It gets whatever's left over.

CS6/CS7 are reserved for network infrastructure (routing protocol traffic). Never use these for user applications.

CoS — Class of Service (Layer 2)

DSCP operates at Layer 3 (IP header). CoS (Class of Service), defined in IEEE 802.1p, is the Layer 2 equivalent — it uses 3 bits in the 802.1Q VLAN tag to mark traffic with a priority from 0 (lowest) to 7 (highest). This allows QoS to be applied on Ethernet switches before traffic is routed.

📌 CoS Values in Practice

CoS 5 = VoIP (maps to DSCP EF). CoS 4 = Video. CoS 3 = Business-critical data. CoS 0 = Best effort (default).

When traffic moves from Layer 2 to Layer 3, CoS values are typically mapped to corresponding DSCP values. Network devices re-mark traffic at trust boundaries — where traffic enters from an untrusted source, markings are reset or re-applied according to policy.
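A common convention for the mapping described above is to take the three most significant bits of the DSCP as the CoS value. The Python sketch below assumes that convention; real switches use configurable mapping tables, so treat the function names and the single voice override as illustrative assumptions rather than any vendor's defaults.

```python
def dscp_to_cos(dscp: int) -> int:
    """Derive a 3-bit CoS from the top three bits of a 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be 0-63")
    return dscp >> 3


def cos_to_dscp(cos: int, overrides=None) -> int:
    """Map CoS to DSCP. Defaults to the class selector (CoS * 8),
    with an assumed policy override that maps CoS 5 to EF (46)."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be 0-7")
    if overrides is None:
        overrides = {5: 46}
    return overrides.get(cos, cos * 8)


if __name__ == "__main__":
    for name, dscp in [("EF", 46), ("AF41", 34), ("AF31", 26), ("CS0", 0)]:
        print(name, "-> CoS", dscp_to_cos(dscp))
```

Under this convention EF, AF41, AF31, and CS0 land on CoS 5, 4, 3, and 0, which matches the values listed in the callout above.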

Traffic Shaping vs Traffic Policing

Both mechanisms enforce a rate limit, but they handle traffic that exceeds the limit differently. Understanding this distinction is frequently tested.

🔄

Traffic Shaping

Buffers excess traffic and releases it at a controlled rate. Bursts are absorbed into a queue and transmitted smoothly, with the release rate typically metered by a token bucket. No packet loss from the shaping mechanism itself (only if the buffer fills completely). Causes additional delay. Best for outbound traffic where you want smooth transmission without drops. TCP handles this well: the flow stays smooth and avoids retransmits.

Buffers excess · Adds delay · No drops
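To make the buffering behaviour concrete, here is a small Python simulation of a shaper over discrete time ticks. It is a deliberately simplified sketch: it releases a fixed number of packets per tick with no burst allowance, so it is not a full token-bucket implementation, and the function name and parameters are assumptions made for illustration.

```python
def shape(arrivals, rate, queue_limit):
    """Simulate a fixed-rate shaper.

    arrivals:    packets arriving in each tick
    rate:        maximum packets released per tick
    queue_limit: maximum packets the shaping buffer can hold
    Returns (sent_per_tick, dropped), where dropped counts only
    packets lost because the buffer overflowed.
    """
    backlog = 0
    dropped = 0
    sent = []
    for arriving in arrivals:
        backlog += arriving
        if backlog > queue_limit:
            # Only a full buffer causes loss.
            dropped += backlog - queue_limit
            backlog = queue_limit
        out = min(backlog, rate)
        backlog -= out
        sent.append(out)
    # Keep draining the buffer after arrivals stop.
    while backlog > 0:
        out = min(backlog, rate)
        backlog -= out
        sent.append(out)
    return sent, dropped


if __name__ == "__main__":
    print(shape([10, 0, 0, 0, 0], rate=2, queue_limit=100))
```

A burst of 10 packets at a rate of 2 per tick leaves as five ticks of 2 packets each with nothing dropped: the burst is traded for delay. Shrinking `queue_limit` below the burst size is the only way this sketch loses packets.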

🚔

Traffic Policing

Drops or re-marks excess traffic immediately — no buffering. Traffic that exceeds the configured rate is either discarded or remarked to a lower DSCP class (so it gets dropped later during congestion). Causes packet loss for TCP applications (triggers retransmits, reducing effective throughput). Used on ingress interfaces and on service provider networks to enforce contracted rates (CIR — Committed Information Rate).

Drops excess · Hard limit · CIR enforcement
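For contrast with shaping, the following Python sketch models a single-rate policer using a token bucket over discrete ticks. It assumes one token per packet and a drop action for excess traffic (re-marking is not modelled); the function name and parameters are illustrative only.

```python
def police(arrivals, rate, burst):
    """Simulate a single-rate token-bucket policer.

    arrivals: packets arriving in each tick
    rate:     tokens added to the bucket per tick
    burst:    bucket depth (maximum tokens that can accumulate)
    Returns (forwarded_per_tick, dropped_per_tick).
    """
    tokens = burst  # start with a full bucket
    forwarded, dropped = [], []
    for arriving in arrivals:
        tokens = min(burst, tokens + rate)
        conforming = min(arriving, tokens)
        tokens -= conforming
        forwarded.append(conforming)
        # Excess packets are discarded immediately, never queued.
        dropped.append(arriving - conforming)
    return forwarded, dropped


if __name__ == "__main__":
    print(police([10, 0, 0, 0, 0], rate=2, burst=4))
```

Fed a 10-packet burst with a rate of 2 and a bucket depth of 4, the policer forwards 4 packets and drops 6 in the same tick. Nothing is delayed, which is the key difference from a shaper given the same input.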

Queuing Mechanisms

When congestion occurs, packets queue before being transmitted. The queuing mechanism determines which packets are served next — this is where priority enforcement actually happens.

FIFO

First In, First Out

Packets are transmitted in the order they arrived — no priority. Simple but provides no QoS. All traffic is treated equally. Default on unconfigured interfaces. A large burst from one flow can delay all other traffic (head-of-line blocking).

Priority Queuing

Traffic is assigned to one of 4 queues (high, medium, normal, low). High-priority queue is always served first — lower queues only get bandwidth when the high queue is empty. Risk of starvation: low-priority traffic may never be served if high-priority traffic never stops. Used for strict QoS enforcement.
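The starvation risk is easy to see in a toy scheduler. The Python sketch below serves one packet per transmission slot and always picks from the highest-priority non-empty queue. It uses two queues for brevity (the mechanism described above has four), and the names are illustrative.

```python
from collections import deque


def strict_priority(queues, slots):
    """Serve one packet per slot from the first non-empty queue.

    queues: list of deques ordered from highest to lowest priority
    slots:  number of transmission opportunities
    Returns the packets in the order they were sent.
    """
    served = []
    for _ in range(slots):
        for queue in queues:
            if queue:
                served.append(queue.popleft())
                break
    return served


if __name__ == "__main__":
    high = deque(["H1", "H2", "H3"])
    low = deque(["L1"])
    print(strict_priority([high, low], slots=3))
    print("still waiting in low queue:", list(low))
```

With three slots and three high-priority packets, the low-priority packet is never sent. If high-priority traffic keeps arriving, it never will be.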

WFQ

Weighted Fair Queuing

Automatically identifies flows and weights each flow's share of bandwidth by its IP precedence. Every flow gets some bandwidth, which prevents starvation while still favouring marked traffic. Good for mixed environments without manual queue configuration. Historically the default on Cisco serial interfaces running at E1 speed (2.048 Mbps) or below.

CBWFQ

Class-Based Weighted Fair Queuing

Extends WFQ — traffic classes are defined by the administrator (matching ACLs, DSCP values, protocols) and each class is guaranteed a minimum bandwidth percentage. Most commonly used queuing method in enterprise QoS policies. Often combined with LLQ for voice.

LLQ

Low Latency Queuing

Adds a strict priority queue to CBWFQ for voice and real-time traffic. Voice traffic always gets forwarded immediately from the priority queue; all other classes share remaining bandwidth via CBWFQ. The recommended queuing mechanism for VoIP networks.
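The way LLQ combines a priority queue with class-based sharing can be sketched as a single scheduling round. This Python example is an approximation under stated assumptions: the priority queue is served first but capped per round (standing in for the policing that keeps it from starving other classes), and the remaining classes are served by simple weighted round robin rather than true CBWFQ. All names and weights are hypothetical.

```python
from collections import deque


def llq_round(priority_queue, class_queues, weights, priority_limit):
    """Run one scheduling round and return the packets sent, in order.

    priority_queue: deque of real-time packets, always served first
    class_queues:   dict of class name -> deque of packets
    weights:        dict of class name -> packets allowed per round
    priority_limit: cap on priority packets per round
    """
    sent = []
    for _ in range(priority_limit):
        if not priority_queue:
            break
        sent.append(priority_queue.popleft())
    # Remaining classes share what is left, in proportion to their weights.
    for name, queue in class_queues.items():
        for _ in range(weights[name]):
            if not queue:
                break
            sent.append(queue.popleft())
    return sent


if __name__ == "__main__":
    voice = deque(["v1", "v2"])
    classes = {"video": deque(["d1", "d2", "d3"]), "data": deque(["b1", "b2"])}
    print(llq_round(voice, classes, {"video": 2, "data": 1}, priority_limit=2))
```

Voice leaves first in every round, yet the other classes still get their weighted turn, which is the behaviour that separates LLQ from plain strict priority queuing.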

RED / WRED

Weighted Random Early Detection

Proactively drops packets randomly before queues fill completely, signalling TCP senders to slow down before congestion gets critical. WRED respects DSCP markings — packets with lower priority are dropped at lower queue fill thresholds than high-priority packets. Prevents global synchronisation of TCP flows.
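The drop decision in RED-style algorithms is usually described as a probability that rises linearly between a minimum and a maximum queue threshold. The Python sketch below assumes that textbook shape; the threshold numbers in the profiles are made up for illustration and are not defaults from any platform.

```python
def wred_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for a given average queue depth.

    Below min_th nothing is dropped; at or above max_th everything is
    dropped; in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical per-marking profiles: (min_th, max_th, max_p).
# The lower-priority marking starts dropping at a shallower queue depth.
PROFILES = {
    "CS0": (20, 40, 0.10),
    "AF41": (30, 40, 0.10),
}

if __name__ == "__main__":
    for depth in (10, 25, 35, 40):
        row = {name: round(wred_drop_probability(depth, *p), 3)
               for name, p in PROFILES.items()}
        print(depth, row)
```

At an average depth of 25 packets, the best-effort profile has already started dropping while the AF41 profile has not, which is how the weighting protects higher-priority traffic.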

Bandwidth Guarantees — CIR and CBR

📌 Key Rate Terms

CIR (Committed Information Rate): the guaranteed minimum bandwidth contracted with a service provider. Traffic up to the CIR is always forwarded. Traffic above the CIR (burst) may be forwarded if bandwidth is available, or dropped/re-marked.

CBR (Constant Bit Rate): a fixed, continuous rate with no variation. Used for voice circuits and legacy TDM connections. A CBR flow sends data at the same rate at all times, with no bursts and no idle periods.

PIR (Peak Information Rate): the maximum rate allowed during burst periods above the CIR. Traffic between CIR and PIR is marked "excess" and may be dropped during congestion.
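The relationship between CIR and PIR can be shown with simple arithmetic on an offered rate. The Python sketch below splits a load into the part within the CIR, the excess between CIR and PIR, and the part above the PIR. It is a rate-level simplification: real meters such as the two-rate three-colour marker in RFC 2698 make this decision per packet using token buckets. The function name is illustrative.

```python
def split_by_rate(offered, cir, pir):
    """Split an offered rate into (conforming, excess, violating).

    All values use the same unit, for example Mbps.
    conforming: within the CIR, always forwarded
    excess:     between CIR and PIR, forwarded if capacity allows
    violating:  above the PIR
    """
    if pir < cir:
        raise ValueError("PIR must be at least the CIR")
    conforming = min(offered, cir)
    excess = min(max(offered - cir, 0), pir - cir)
    violating = max(offered - pir, 0)
    return conforming, excess, violating


if __name__ == "__main__":
    print(split_by_rate(offered=120, cir=50, pir=100))
```

An offered load of 120 Mbps against a CIR of 50 and a PIR of 100 splits into 50 conforming, 50 excess, and 20 violating.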

QoS Trust Boundaries

A trust boundary is the point in the network where you stop trusting incoming QoS markings and enforce your own policy. Endpoints (PCs, phones) can mark their own traffic with any DSCP value — including spoofing high-priority markings to get preferential treatment. The trust boundary is where you re-classify and re-mark traffic according to network policy.

🎯 Exam Tip — Trust Boundary Placement

IP phones: Trust DSCP markings from IP phones (they correctly mark voice at EF=46) but re-mark PC traffic behind the phone to CS0 regardless of what the PC claims.

Access layer switches: The most common trust boundary. Markings from untrusted endpoints are reset here; the switch re-marks based on interface or ACL-based policies.

Never trust markings from the internet — ISPs strip or ignore DSCP markings at their boundaries. QoS works within your network, not across arbitrary internet paths.
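The trust decision at an access port reduces to a small rule: keep the marking if the attached device is trusted, otherwise reset it to best effort. The Python sketch below illustrates that rule only. The device-type labels are hypothetical, and a real switch would identify the device through its own mechanisms rather than a string.

```python
BEST_EFFORT = 0  # CS0
TRUSTED_DEVICE_TYPES = {"ip_phone"}  # assumed policy, for illustration


def apply_trust_boundary(device_type: str, dscp: int) -> int:
    """Return the DSCP a packet carries after crossing the trust boundary."""
    if device_type in TRUSTED_DEVICE_TYPES:
        return dscp  # trusted: keep the device's own marking
    return BEST_EFFORT  # untrusted: ignore whatever the device claimed


if __name__ == "__main__":
    print("phone marking 46 ->", apply_trust_boundary("ip_phone", 46))
    print("pc marking 46    ->", apply_trust_boundary("pc", 46))
```

A PC that marks its own traffic as EF gains nothing, because the marking is rewritten to 0 before any queuing decision is made.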

Exam Scenarios

Scenario 1: Users report poor voice call quality when the network is heavily loaded with large file transfers. What QoS technique should be implemented? Answer: Configure LLQ (Low Latency Queuing) with a strict priority queue for voice traffic (DSCP EF = 46). Limit file transfer traffic to a separate CBWFQ class with capped bandwidth. Voice will always be forwarded immediately from the priority queue.

Scenario 2: A company's internet connection is contracted for 100 Mbps. An ISP wants to enforce this limit. Which technique drops packets that exceed 100 Mbps immediately rather than buffering them? Answer: Traffic policing. Traffic above the CIR is dropped (or re-marked) immediately. Traffic shaping would buffer excess traffic, which is not what strict CIR enforcement requires.

Scenario 3: An administrator needs to ensure that VoIP traffic always receives the highest DSCP marking. A user is attempting to mark their web browser traffic as DSCP 46 to gain priority. How is this prevented? Answer: Define a QoS trust boundary at the access layer switch. Reset all DSCP markings from untrusted user devices to CS0. Apply a policy that re-marks only known VoIP sources (identified by IP address or RTP port range) to DSCP EF. Note that port 5060 carries SIP signalling, not the voice media itself.

Scenario 4: What DSCP value should be assigned to VoIP traffic, and what is the equivalent CoS value? Answer: VoIP traffic should be marked DSCP EF = 46 at Layer 3. The equivalent Layer 2 CoS value is CoS 5 in the 802.1p field of the VLAN tag (802.1Q).

Scenario 5: A network team wants to implement QoS without tracking individual flows or reserving per-connection bandwidth across every router. Which QoS model should they use? Answer: DiffServ (Differentiated Services). Traffic is classified and marked at the edge, and core routers apply Per-Hop Behaviors based on DSCP values — no per-flow state is maintained, making it scalable. IntServ/RSVP would require per-flow reservations at every hop and doesn't scale.

QoS and Application Performance — Real-World Impact

Understanding why QoS matters requires connecting the technical mechanisms to their real-world impact on application performance and user experience. The applications most sensitive to network conditions are the ones that drive QoS deployment decisions in enterprises.

Unified Communications (UC) platforms — Microsoft Teams, Cisco Webex, Zoom — combine voice, video, screen sharing, and messaging in a single platform. Each media type has different QoS requirements: voice (real-time, low bitrate, extremely sensitive to jitter and loss), video (higher bitrate, sensitive to jitter and loss but tolerates slightly more than voice), and screen sharing/content sharing (less real-time, more tolerant of minor buffering). Teams and Webex both use DSCP markings to differentiate their traffic types and expect network infrastructure to honor these markings. Microsoft publishes specific QoS recommendations for Teams: voice at DSCP 46 (EF), video at DSCP 34 (AF41), and application sharing at DSCP 18 (AF21). Networks without QoS treat all Teams traffic equally — and during congestion, voice calls suffer the same drops as web browsing.

ERP and financial applications (SAP, Oracle, financial trading platforms) often have latency-sensitive database transactions where 50ms of additional latency is the difference between a usable system and a frustrating one. While not real-time like voice, these applications benefit from QoS policies that prioritize their traffic over bulk transfers and general web browsing during peak hours.

Cloud applications present a QoS challenge because traffic leaves the enterprise network — and DSCP markings are typically stripped or ignored at the internet boundary. QoS is effective within the enterprise LAN and WAN (MPLS), but once traffic hits the public internet, there's no mechanism to enforce priority. This is one reason why SD-WAN with multiple WAN paths is increasingly important — rather than trying to enforce QoS across the internet, SD-WAN routes latency-sensitive traffic over the best-performing available path, achieving QoS-like outcomes through intelligent path selection.

Real-World QoS Implementation Example

Understanding QoS in the abstract is one thing — seeing how a real enterprise policy is structured helps connect the concepts. A typical enterprise QoS policy for a branch office with VoIP, video conferencing, and general data traffic follows a consistent pattern.

At the edge of the network (where packets enter from outside), traffic is classified and marked. The classification uses ACLs or NBAR (Network-Based Application Recognition) to identify application types: voice media (RTP) gets marked as DSCP EF (46), video conferencing traffic (Webex, Teams) gets marked as DSCP AF41 (34), and everything else defaults to DSCP CS0 (0, best effort). Trust boundaries are set at the access layer: IP phones are trusted to mark their own traffic correctly, but PC workstations' self-applied DSCP markings are overridden at the switch port.

In the core of the network, routers honor the DSCP markings and apply queuing policies — LLQ with a strict priority queue for EF-marked voice traffic (allocated ~20% of WAN bandwidth), CBWFQ classes for video (20%), business applications (30%), and best effort (30%). The LLQ strict priority queue ensures voice packets are never delayed by other traffic, while the CBWFQ classes guarantee minimum bandwidth to each traffic category without completely starving any class.
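The percentage split above translates directly into per-class bandwidth once a link speed is chosen. The Python sketch below applies the 20/20/30/30 split to an assumed 10 Mbps WAN link and estimates how many G.711 calls fit in the voice share. The call size is computed at the IP layer only (64 kbps codec, 20 ms packets, 40 bytes of IP/UDP/RTP headers); Layer 2 overhead is ignored, so real capacity would be somewhat lower. All names and the link speed are assumptions for illustration.

```python
# Class shares in percent, taken from the policy described above.
SHARES_PERCENT = {"voice": 20, "video": 20, "business": 30, "best_effort": 30}


def class_bandwidth_kbps(link_kbps, shares=SHARES_PERCENT):
    """Bandwidth per class in kbps for a given link speed."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {name: link_kbps * pct / 100 for name, pct in shares.items()}


def g711_call_kbps(packet_ms=20, header_bytes=40):
    """IP-layer bandwidth of one G.711 call direction, in kbps."""
    payload_bytes = 64_000 * packet_ms // 1000 // 8  # codec bits per packet, in bytes
    packets_per_second = 1000 // packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


def max_calls(voice_kbps, per_call_kbps):
    """Whole number of calls that fit in the voice allocation."""
    return int(voice_kbps // per_call_kbps)


if __name__ == "__main__":
    alloc = class_bandwidth_kbps(10_000)
    per_call = g711_call_kbps()
    print(alloc)
    print("kbps per call:", per_call, "calls:", max_calls(alloc["voice"], per_call))
```

Under these assumptions each call needs 80 kbps, so a 2 Mbps voice allocation carries about 25 simultaneous calls. This kind of estimate is a useful sanity check on whether 20% is enough for a given site.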

At the WAN edge, traffic shaping is applied outbound to smooth traffic to the contracted CIR, preventing bursty traffic from triggering policing at the ISP's edge. The QoS policy is applied consistently at all branch sites — managed centrally via SD-WAN policies that push consistent configurations to all sites simultaneously rather than requiring per-device CLI configuration.

QoS for Voice and Video — Why It Matters

VoIP (Voice over IP) and video conferencing have fundamentally different network requirements from traditional data applications, and understanding why drives the entire QoS discipline.

A file download doesn't care if packets are delayed by 50ms or arrive slightly out of order — TCP retransmits missing packets, and the file eventually arrives complete. The user notices a slightly longer download time, not a failed transfer. Real-time voice and video are the opposite: they cannot wait for retransmits, they cannot tolerate excessive delay, and they cannot absorb variable delay (jitter) without audible or visible artifacts.

The ITU-T G.114 standard recommends a maximum one-way delay of 150ms for voice calls to sound natural. Beyond 150ms, conversational interactions become awkward: speakers talk over each other because they don't hear each other quickly enough. At 300ms+ latency, calls become difficult to hold at all. Jitter (variation in packet arrival time) causes audio to break up and voice to become choppy. Even if average latency is acceptable, bursts of 50ms variation can cause audible gaps. Packet loss of even 1–2% causes noticeable voice degradation because lost RTP packets are not retransmitted and leave gaps in the audio stream.
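Delay and jitter can both be computed from packet timestamps. The Python sketch below uses the running interarrival jitter estimate from RFC 3550 (each new sample moves the estimate by one sixteenth of the difference) and reports the worst one-way delay for comparison with the 150ms target. It assumes sender and receiver clocks are synchronised, which real endpoints cannot take for granted, and the function names are illustrative.

```python
def interarrival_jitter(send_ms, recv_ms):
    """Running jitter estimate in ms, following the RFC 3550 smoothing rule."""
    if len(send_ms) != len(recv_ms):
        raise ValueError("need one receive time per send time")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # Change in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


def worst_one_way_delay(send_ms, recv_ms):
    """Largest send-to-receive delay in ms (assumes synchronised clocks)."""
    return max(r - s for s, r in zip(send_ms, recv_ms))


if __name__ == "__main__":
    send = [0, 20, 40]    # packets sent every 20 ms
    recv = [50, 70, 106]  # third packet arrives 16 ms late
    print("jitter estimate (ms):", interarrival_jitter(send, recv))
    print("worst delay (ms):", worst_one_way_delay(send, recv))
```

Because the estimate is smoothed, a single late packet moves it only slightly. Sustained variation is what drives the figure up.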

For video conferencing, the requirements are similarly strict but with higher bandwidth demands. HD video conferencing codecs typically require 1–4 Mbps of guaranteed bandwidth per session with tight jitter requirements. When this traffic competes with bulk transfers or web browsing on an unmanaged network, call quality degrades unpredictably.

IntServ — Integrated Services (RSVP)

IntServ (Integrated Services) is a QoS model where individual flows reserve bandwidth end-to-end through the network before sending data. The reservation mechanism is RSVP (Resource Reservation Protocol). A sender issues an RSVP PATH message that travels to the receiver; the receiver sends an RSVP RESV message back, causing each router along the path to reserve bandwidth for that specific flow.

IntServ provides hard guarantees: once a reservation is admitted, every router on the path holds that bandwidth for the flow regardless of other traffic. Of the two models, only IntServ can offer this kind of per-flow, deterministic guarantee. However, this comes at a massive scalability cost: every router along the path must maintain per-flow state for every active reservation. In a network carrying millions of simultaneous flows, holding and refreshing that much state is impractical.
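The scalability cost is visible even in a toy model. The Python sketch below treats a reservation as all-or-nothing admission control across a path, with each router storing one entry per admitted flow. It does not model the actual RSVP PATH/RESV message exchange or soft-state refresh; the class and function names are hypothetical.

```python
class Router:
    """A router that tracks per-flow bandwidth reservations."""

    def __init__(self, name, capacity_kbps):
        self.name = name
        self.capacity_kbps = capacity_kbps
        self.reservations = {}  # flow_id -> reserved kbps (per-flow state)

    def available_kbps(self):
        return self.capacity_kbps - sum(self.reservations.values())


def reserve(path, flow_id, kbps):
    """Admit the flow only if every router on the path has room."""
    if any(router.available_kbps() < kbps for router in path):
        return False
    for router in path:
        router.reservations[flow_id] = kbps
    return True


if __name__ == "__main__":
    r1, r2 = Router("r1", 100), Router("r2", 50)
    print(reserve([r1, r2], "call-a", 40))  # fits on both hops
    print(reserve([r1, r2], "call-b", 40))  # r2 has only 10 kbps left
    print("state entries on r1:", len(r1.reservations))
```

Every admitted flow adds an entry to every router on its path, so the state grows with the number of flows. A DiffServ router, by contrast, only needs configuration per traffic class.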

For exam purposes: IntServ/RSVP is the older, per-flow reservation model that doesn't scale to large networks. It appears in exam questions as the contrast to DiffServ — when you need to identify which model uses per-flow reservations, the answer is IntServ. DiffServ (DSCP-based) replaced IntServ as the practical enterprise QoS model because it marks traffic at the edges and applies class-based treatment in the core without per-flow state.

QoS in Wireless Networks

Implementing QoS on Wi-Fi networks introduces additional complexity because wireless is a shared, half-duplex medium — only one device transmits at a time on a given channel. Standard 802.11 uses CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance), where every device competes equally for channel access using random backoff timers.

WMM (Wi-Fi Multimedia), a Wi-Fi Alliance certification based on a subset of the IEEE 802.11e amendment, extends QoS to wireless by defining four access categories with different channel access priorities: Voice (AC_VO), Video (AC_VI), Best Effort (AC_BE), and Background (AC_BK). Devices and APs that support WMM use shorter backoff intervals for higher-priority access categories, giving voice and video traffic a better chance of accessing the channel before less important traffic. Most modern Wi-Fi devices and APs support WMM, and it should always be enabled for networks carrying VoIP or video conferencing.
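WMM places each frame in an access category based on its 3-bit user priority, the same 0 to 7 scale used by 802.1p. The Python sketch below encodes the standard priority-to-category table as a lookup; the function name is illustrative.

```python
# 802.11 user priority (0-7) -> WMM access category.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",  # background
    0: "AC_BE", 3: "AC_BE",  # best effort
    4: "AC_VI", 5: "AC_VI",  # video
    6: "AC_VO", 7: "AC_VO",  # voice
}

# Access categories from best to worst channel-access priority.
AC_ORDER = ["AC_VO", "AC_VI", "AC_BE", "AC_BK"]


def access_category(user_priority: int) -> str:
    """Return the WMM access category for a user priority value."""
    if user_priority not in UP_TO_AC:
        raise ValueError("user priority must be 0-7")
    return UP_TO_AC[user_priority]


if __name__ == "__main__":
    for up in range(8):
        print(up, access_category(up))
```

One consequence of this table is that a frame carrying priority 5 lands in AC_VI rather than AC_VO, so voice needs to be carried as priority 6 on the wireless side to reach the voice category.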

The limitation of wireless QoS is that it only controls how devices compete for channel access — it cannot prevent interference, signal degradation, or the fundamental half-duplex nature of Wi-Fi from introducing latency and jitter. Wireless QoS supplements, but does not replace, the need for adequate signal strength, proper channel planning, and sufficient AP density for the number of devices and traffic volume.
