How to diagnose streaming performance issues?

My ISP delivers both internet and IPTV over a single fiber, and each of those is in a separate VLAN. I have configured my router so that the WAN interface uses the internet VLAN, and I have created a separate interface for IPTV on the IPTV VLAN. This interface receives its own IP via DHCP. The STBs are within my LAN, so I have igmpproxy running with the IPTV interface as upstream and my LAN as downstream.
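For reference, the igmpproxy part of such a setup looks roughly like this in /etc/config/igmpproxy (a sketch; "iptv" and "lan" are assumed interface names from /etc/config/network):

config igmpproxy
	option quickleave 1

config phyint
	option network iptv
	option direction upstream
	list altnet 0.0.0.0/0

config phyint
	option network lan
	option direction downstream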

Everything works, but there is a performance issue that I haven't figured out. The multicast streams received by the STBs are typically somewhere around 7 Mbit/s each, and in my case there is a maximum of 4 simultaneous streams, so an aggregate bandwidth of around 28 Mbit/s. In my testing, however, I'm using just 1 stream, so less than 10 Mbit/s on average.

Now, the problem is that other activity on the router causes disruptions in the multicast stream, so the TV picture sometimes stutters and the sound breaks up, which is unacceptable.

For example, if I run a Speedtest from my computer (which uses the WAN interface), I can provoke these issues. The WAN connection is capped at 300 Mbit/s (symmetric), and the problem seems worse during upload than during download.

The router is a Netgear R7800 running 17.01 stable (a build based on @hnyman's community build), so it should have plenty of power to handle this. I have tried building a firmware with the shortcut-fe patch included, but that didn't make much of a difference.

How can I diagnose what's causing this? And if anyone has suggestions I can try, I'd be very grateful.

I've now done the following test:

On a computer I assigned a static IP (192.168.10.66) in the IPTV network, and made this computer part of that VLAN on the switch. Then I started an iperf3 server on this computer, and on a different computer in my LAN (192.168.12.101) I ran an iperf3 client in UDP mode. This simulates the traffic flow of the TV, except that it is unicast instead of multicast. That probably doesn't matter, since it is still UDP.
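The server side is simply the following (it is the client that selects UDP with -u):

iperf3 -s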

To simulate data coming from the IPTV side, I need to use reverse mode in the client (iperf3 for some reason defaults to the server being the receiver, not the transmitter).

This is what I get if I simulate a 10 Mbit/s stream:

C:\Programs\nettools\iperf-3.1.3-win64>iperf3 -c 192.168.10.66 -t 10 -u -b 10M -R
Connecting to host 192.168.10.66, port 5201
Reverse mode, remote host 192.168.10.66 is sending
[  4] local 192.168.12.101 port 64678 connected to 192.168.10.66 port 5201
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-1.00   sec  1.31 MBytes  11.0 Mbits/sec  0.330 ms  0/168 (0%)
[  4]   1.00-2.00   sec  1.20 MBytes  10.0 Mbits/sec  0.119 ms  0/153 (0%)
[  4]   2.00-3.00   sec  1.20 MBytes  10.0 Mbits/sec  0.271 ms  0/153 (0%)
[  4]   3.00-4.00   sec  1.19 MBytes  9.97 Mbits/sec  0.494 ms  0/152 (0%)
[  4]   4.00-5.00   sec  1.17 MBytes  9.83 Mbits/sec  0.430 ms  3/153 (2%)
[  4]   5.00-6.00   sec  1.17 MBytes  9.83 Mbits/sec  0.301 ms  2/152 (1.3%)
[  4]   6.00-7.00   sec  1.20 MBytes  10.0 Mbits/sec  0.114 ms  0/153 (0%)
[  4]   7.00-8.00   sec  1.20 MBytes  10.0 Mbits/sec  0.307 ms  0/153 (0%)
[  4]   8.00-9.00   sec  1.19 MBytes  9.97 Mbits/sec  0.246 ms  0/152 (0%)
[  4]   9.00-10.00  sec  1.20 MBytes  10.0 Mbits/sec  0.352 ms  0/153 (0%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec  12.0 MBytes  10.1 Mbits/sec  0.352 ms  5/1542 (0.32%)
[  4] Sent 1542 datagrams

iperf Done.

As you can see, even at 10 Mbit/s, I start seeing packet loss, and this is with nothing special going on at the router. With 30 Mbit/s, it gets worse:

C:\Programs\nettools\iperf-3.1.3-win64>iperf3 -c 192.168.10.66 -t 10 -u -b 30M -R
Connecting to host 192.168.10.66, port 5201
Reverse mode, remote host 192.168.10.66 is sending
[  4] local 192.168.12.101 port 63741 connected to 192.168.10.66 port 5201
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-1.00   sec  3.77 MBytes  31.6 Mbits/sec  0.200 ms  63/545 (12%)
[  4]   1.00-2.00   sec  3.58 MBytes  30.0 Mbits/sec  0.293 ms  0/458 (0%)
[  4]   2.00-3.00   sec  3.62 MBytes  30.3 Mbits/sec  0.133 ms  0/463 (0%)
[  4]   3.00-4.00   sec  3.58 MBytes  30.0 Mbits/sec  0.240 ms  0/458 (0%)
[  4]   4.00-5.00   sec  3.52 MBytes  29.5 Mbits/sec  0.305 ms  0/450 (0%)
[  4]   5.00-6.00   sec  3.58 MBytes  30.0 Mbits/sec  0.262 ms  4/462 (0.87%)
[  4]   6.00-7.00   sec  3.56 MBytes  29.9 Mbits/sec  0.282 ms  0/456 (0%)
[  4]   7.00-8.00   sec  3.62 MBytes  30.4 Mbits/sec  0.130 ms  0/463 (0%)
[  4]   8.00-9.00   sec  3.54 MBytes  29.7 Mbits/sec  0.309 ms  0/453 (0%)
[  4]   9.00-10.00  sec  3.52 MBytes  29.6 Mbits/sec  0.280 ms  8/459 (1.7%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec  36.5 MBytes  30.6 Mbits/sec  0.254 ms  75/4671 (1.6%)
[  4] Sent 4671 datagrams

iperf Done.

If I do the same with TCP instead (maximizing bandwidth):

C:\Programs\nettools\iperf-3.1.3-win64>iperf3 -c 192.168.10.66 -t 10 -R
Connecting to host 192.168.10.66, port 5201
Reverse mode, remote host 192.168.10.66 is sending
[  4] local 192.168.12.101 port 54189 connected to 192.168.10.66 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  64.2 MBytes   539 Mbits/sec
[  4]   1.00-2.00   sec  77.7 MBytes   652 Mbits/sec
[  4]   2.00-3.00   sec  83.8 MBytes   703 Mbits/sec
[  4]   3.00-4.00   sec  70.7 MBytes   593 Mbits/sec
[  4]   4.00-5.00   sec  76.7 MBytes   643 Mbits/sec
[  4]   5.00-6.00   sec  73.6 MBytes   618 Mbits/sec
[  4]   6.00-7.00   sec  77.7 MBytes   652 Mbits/sec
[  4]   7.00-8.00   sec  64.0 MBytes   537 Mbits/sec
[  4]   8.00-9.00   sec  71.2 MBytes   598 Mbits/sec
[  4]   9.00-10.00  sec  60.7 MBytes   509 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   721 MBytes   605 Mbits/sec                  sender
[  4]   0.00-10.00  sec   721 MBytes   605 Mbits/sec                  receiver

iperf Done.

As you can see, that gives me a bandwidth of around 600 Mbit/s, and the IPTV stream is not disrupted during this test. However, if I run a heavy data transfer (using Speedtest) on the WAN connection, I get disruptions in the IPTV stream. The disruptions seem more frequent and severe during upload than during download.

What can I do to mitigate this issue? I need to protect the traffic on the IPTV interface as much as I can; I just have no clue how, or why this is happening in the first place.


@mroek I've got the exact same setup with my ISP (2 VLANs, one for internet, one for IPTV) and the exact same issue: multicast IPTV streams are disrupted while doing data transfers.
My understanding is that the Linux kernel cannot classify and prioritize real-time flows the way dedicated hardware does. I'm sure some software optimization can be found (as in the ISP's home gateway), but I haven't found anything so far. I also looked at the SQM scripts for a while, since they can classify/queue flows, but their main purpose is minimizing bufferbloat on the uplink. That's different.

I can give you one trick that helped me a little on the R7800: changing the CPU affinity for eth0 and eth1. This was discussed earlier in the main R7800 thread. For example, this works better for me:

echo 2 >/proc/irq/30/smp_affinity   # pin the eth0 IRQ to the second core (CPU mask 0x2)
echo 2 >/proc/irq/31/smp_affinity   # pin the eth1 IRQ to the second core
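The IRQ numbers vary between builds, so check which IRQs the two Ethernet MACs actually use, and add the echoes to /etc/rc.local if you want them applied at boot (a sketch):

grep eth /proc/interrupts   # identify the eth0/eth1 IRQ numbers on your build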

To be honest, I'm not routing multicast IPTV with the R7800 anymore: the IPTV VLAN is switched from WAN to LAN (using the R7800's switch), and a dedicated OpenWrt VM on Xeon-based hardware is in charge of IGMP proxying and routing multicast in my LAN. On the LAN side, I'm also using a Netgear switch with hardware QoS.

I'm still very interested in finding a way to route multicast correctly through the R7800.

@Nague: Good to hear that I'm not alone with this issue, but I find it hard to believe that this is unsolvable on the R7800. A dual-core 1.7 GHz CPU should have enough power to handle this quite easily. The ISP's home router handles this without any problems, and that's also a Linux-based router (Zyxel FMG 3542). I have too little control over it, though, which is why I need to bypass it.

I'll test the trick you mention, but I have already been testing irqbalance, which I believe does something similar, just automatically.

@mroek There is no doubt that it's possible to enhance real-time routing with the R7800 and its dual-core CPU. As you say, most ISP home routers are Linux-based.

The big question remains: how to achieve this? If the issue stems from the eth0 interface becoming saturated by the high traffic and thus dropping some of the UDP packets, how can that be avoided? I suspect this is what's happening; I don't think it's the CPU that can't cope.

However, the physical speed of the eth0 interface is 1 Gbit/s, so there should be plenty of headroom for a 10-40 Mbit/s UDP stream on top of the other internet traffic, which tops out at 300 Mbit/s in my case.
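One way to check that suspicion would be to watch the interface statistics while reproducing the problem (a sketch; which counters actually increment depends on the driver):

cat /sys/class/net/eth0/statistics/rx_dropped   # read before and after a speedtest run
cat /sys/class/net/eth0/statistics/rx_errors
tc -s qdisc show dev eth0                       # also shows qdisc-level drops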

I found what looks like a workaround for this problem. I experimented a little with SQM, and when I enabled SQM on my internet VLAN (102) and set the speed to 200 Mbit/s on both ingress and egress, I could run the speedtest without observing drops on the TV. However, that means sacrificing 100 Mbit/s of internet speed, so it isn't a desirable workaround.

I also tried enabling SQM on the parent device (i.e. eth0), but that did not help. I still believe this may be the way to go, provided it is possible to prioritize the multicast (or perhaps just UDP). The multicast traffic from my ISP arrives with DSCP set like so:

Differentiated Services Field: 0x80 (DSCP: CS4, ECN: Not-ECT)
    1000 00.. = Differentiated Services Codepoint: Class Selector 4 (32)
    .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
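For completeness, the marking can also be checked on the router itself with the tcpdump package (a sketch; eth0.101 is my IPTV VLAN, and 0x80 is CS4 shifted into the ToS byte):

tcpdump -v -n -i eth0.101 '(ip[1] & 0xfc) == 0x80'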

I believe (well, hope) it should be possible to use this for something, as SQM has a setting for DSCP under "Advanced configuration". I tried fiddling with those settings (setting DSCP to "DO NOT SQUASH" and "allow"), but it didn't seem to do anything. Then I looked in the debug log and found some errors that may be significant:

iptables -t mangle -D POSTROUTING -o eth0.102 -m mark --mark 0x00/0xff -g QOS_MARK_eth0.102
iptables v1.4.21: goto 'QOS_MARK_eth0.102' is not a chain

However, these errors appeared in the section for stopping the SQM instance, not in the starting section. Nevertheless, it indicates that SQM expected to find some QoS-related rules there (which is why it tried to delete them).

If anyone could chime in with some info on this, that would be great. Are there settings in SQM to help prioritize multicast/UDP on ingress?

eth0 is your LAN interface, right? And VLAN 102 is on the WAN side, so basically eth1.102?

No, eth0 is the WAN interface (as is the default in LEDE for this router), so eth0.102 is my internet WAN, and eth0.101 is the upstream IPTV interface from my ISP. There are no other VLANs on eth0.

I did initially experiment with having the IPTV interface on eth1, but I've rearranged this a bit after I got a new upstream switch. I had the same issues before, though.

I'm struggling to understand SQM and how I can prioritize ingress traffic marked with CS4. I've looked into the scripts, but I can't figure out how to do it. For instance, say I use cake with the layer_cake.qos script (and I also set DO NOT SQUASH and allow for DSCP). I set my interface to eth0, since this is the parent interface for both my internet and my IPTV. If I then show the stats for eth0, I see something like this:

root@R7800:/usr/lib/sqm# tc -s qdisc show dev eth0
qdisc cake 8054: root refcnt 2 bandwidth 200Mbit diffserv3 triple-isolate rtt 100.0ms raw
 Sent 15874122 bytes 180943 pkt (dropped 0, overlimits 608 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 7040b of 10000000b
 capacity estimate: 200Mbit
                 Bulk   Best Effort      Voice
  thresh     12500Kbit     200Mbit      50Mbit
  target         5.0ms       5.0ms       5.0ms
  interval     100.0ms     100.0ms      10.0ms
  pk_delay         0us        11us        10us
  av_delay         0us         5us         3us
  sp_delay         0us         3us         3us
  pkts               0      180761         182
  bytes              0    15860350       13772
  way_inds           0        1443           0
  way_miss           0        3452          34
  way_cols           0           0           0
  drops              0           0           0
  marks              0           0           0
  sp_flows           0           1           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0        1514         278

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 3193293031 bytes 2409395 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Almost all of the traffic seems to end up in the "Best Effort" category. There's also a Voice category that sees some traffic, but very little (and I don't know what traffic it is). How can I set up a category for the multicast video stream, to give it priority? Is it even possible?

@moeller0: May I ask for your assistance here? Is it even possible to use SQM the way I want? In reality the IPTV interface isn't bandwidth-limited from the ISP side (only the WAN is), but my issue is that heavy internet traffic disturbs the IPTV traffic, and I guess that is because both use eth0 as the parent interface.

Have a look at https://github.com/dtaht/sch_cake/blob/master/sch_cake.c, especially from line 1390. You will see that, of the available diffserv profiles, diffserv8 and diffserv4 treat CS4-marked packets as low latency. But beware: cake will only give a small fraction of the available bandwidth to the low-latency tier (there needs to be a cost/trade-off for low latency, otherwise every application would request it, effectively resulting in normal or even high latency). I would recommend you try diffserv4, based on the following:
static int cake_config_diffserv4(struct Qdisc *sch)
{
/*	Further pruned list of traffic classes for four-class system:
 *
 *	    Latency Sensitive  (CS7, CS6, EF, VA, CS5, CS4)
 *	    Streaming Media    (AF4x, AF3x, CS3, AF2x, TOS4, CS2, TOS1)
 *	    Best Effort        (CS0, AF1x, TOS2, and those not specified)
 *	    Background Traffic (CS1)
 *
 *	Total 4 traffic classes.
 */

You should use layer_cake.qos and put the diffserv4 keyword into the advanced option strings for both ingress and egress.
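In /etc/config/sqm terms that would look roughly like this (a sketch; option names as used by sqm-scripts, with the interface and rates merely taken from this thread, so adjust both to your link):

config queue
	option enabled '1'
	option interface 'eth0.102'
	option download '250000'
	option upload '250000'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option qdisc_advanced '1'
	option squash_dscp '0'
	option squash_ingress '0'
	option qdisc_really_really_advanced '1'
	option iqdisc_opts 'diffserv4'
	option eqdisc_opts 'diffserv4'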

Correct, though that is only sane after you have checked that your ISP sends sane DSCP markings. There have been reports of ISPs remarking lots of packets to CS1 (Background), and that might not be exactly what you want; better to measure first and then test whether the result is acceptable.

This becomes clear if you look at the diffserv3 mappings:
static int cake_config_diffserv3(struct Qdisc *sch)
{
/*	Simplified Diffserv structure with 3 tins.
 *
 *	    Low Priority      (CS1)
 *	    Best Effort
 *	    Latency Sensitive (TOS4, VA, EF, CS6, CS7)
 */
Anything not CS1, TOS4, VA, EF, CS6, CS7 will be in best effort.

In that case it should be sufficient to instantiate the shaper on the internet VLAN. But what is your link technology and what is the network topology?

In that case I would expect that setting a shaper on the internet VLAN with a rate of, say, 250 Mbit/s might work (but make sure to instantiate a shaper on both ingress and egress). Could you try starting at, say, 150 Mbit/s and successively increasing the rate until your problems reappear?
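From the command line, stepping the rate could look like this (a sketch, assuming a single queue section in /etc/config/sqm):

uci set sqm.@queue[0].download='150000'   # in kbit/s; step up 150 -> 200 -> 250 ...
uci set sqm.@queue[0].upload='150000'
uci commit sqm
/etc/init.d/sqm restart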

Best Regards

Yesterday I thought I had this solved, but that turned out not to be true after all.
My setup was cake with layer_cake.qos, and "Ignore DSCP on ingress" set to allow. I also edited defaults.sh and set INGRESS_CAKE_OPTS, first to diffserv4, then to diffserv8.
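For reference, that edit is just a one-line assignment (path and variable name as found on this build):

# /usr/lib/sqm/defaults.sh
INGRESS_CAKE_OPTS="diffserv8"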

When it was set to diffserv4, I still got no packets in the "Video" category, but with diffserv8 I saw that the video stream ended up in "Tin 6":

qdisc cake 80ad: dev ifb4eth0 root refcnt 2 bandwidth 400Mbit diffserv8 triple-isolate wash rtt 100.0ms raw
 Sent 105729827 bytes 80361 pkt (dropped 0, overlimits 1446 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 76Kb of 15140Kb
 capacity estimate: 400Mbit
                 Tin 0       Tin 1       Tin 2       Tin 3       Tin 4       Tin 5       Tin 6       Tin 7
  thresh       400Mbit     350Mbit  306250Kbit  267968Kbit  234472Kbit  205163Kbit  179518Kbit  157078Kbit
  target         5.0ms       5.0ms       5.0ms       5.0ms       5.0ms       5.0ms       5.0ms       5.0ms
  interval     100.0ms     100.0ms     100.0ms     100.0ms     100.0ms     100.0ms     100.0ms     100.0ms
  pk_delay         0us         0us       309us         4us         0us         0us        68us       208us
  av_delay         0us         0us        78us         0us         0us         0us         6us         5us
  sp_delay         0us         0us         8us         0us         0us         0us         5us         5us
  pkts               0           0        3205           2           0           0       77136          18
  bytes              0           0     2319108         190           0           0   103409435        1094
  way_inds           0           0           0           0           0           0           0           0
  way_miss           0           0          90           2           0           0          11          13
  way_cols           0           0           0           0           0           0           0           0
  drops              0           0           0           0           0           0           0           0
  marks              0           0           0           0           0           0           0           0
  sp_flows           0           0           0           0           0           0           0           0
  bk_flows           0           0           0           0           0           0           1           0
  un_flows           0           0           0           0           0           0           0           0
  max_len            0           0        6056          96           0           0        1358          86

With diffserv4, my video did not end up in the "Video" tin, so cake obviously does not regard CS4-marked packets (which my ISP sets for the video) as video. At first it seemed that diffserv8 solved my problem, but for some reason, after turning it off and then back on (and trying other combinations), the problem returned.

It is also somewhat annoying that I really don't understand why this issue appears in the first place. The maximum ingress on eth0 is 300 Mbit/s (capped by the ISP) plus the video stream (which is uncapped, but always under 40 Mbit/s), so the total aggregate bandwidth is only around 340 Mbit/s. That should not be a problem for this hardware, even without any SQM.

@moeller0: Thank you very much for chiming in! My answer just above was posted at the same time as your reply, so I hadn't read your answer when I wrote it.

Setting diffserv4 puts my video in the "Voice" category, which I guess is "Latency Sensitive". And that's probably OK; the name of the tin doesn't matter.

The link from the ISP is gigabit fiber, which I currently terminate in a Cisco SG300 switch. As mentioned, the video and internet are in different VLANs, so the switch sends those two VLANs to my router's WAN port, where I split them into two virtual interfaces: the internet is my WAN, and the video has its own interface that I have called IPTV.

Setting a shaper only on the internet VLAN did seem to work, but it also meant sacrificing a lot of bandwidth, and I have trouble understanding why that should be necessary. As mentioned in my previous post, the total ingress on eth0 will never exceed around 340 Mbit/s, which I would have thought it could handle.

I think I'll have to just admit defeat on this one. It seems impossible to stop the router from disturbing the multicast while the internet connection is loaded, which is extremely annoying. The hardware should be capable enough, but it just doesn't work properly.

The Mikrotik RB3011 is based on the IPQ8064, which is slower than the IPQ8065 in the R7800, and has the same switch chips (just two of them, for twice the number of ports), and that router handles the exact same setup with ease (with a 500 Mbit/s WAN line, no less). So I am quite sure this is down to the software (i.e. LEDE), but I am unfortunately not knowledgeable enough to figure out what's wrong. SQM shouldn't really be necessary, I think, since the ISP caps the WAN bandwidth anyway, so the total is well below the raw capacity of the interface.

This means I'll be forced to do the same as @Nague did, and use different hardware to handle the IPTV stream. I've set up my old WDR4300 (running LEDE 17.01.2) for this task (and this task only), and it seems to work just fine. But I'm really unhappy that I couldn't get the R7800 to do it all, as I think it is powerful enough if set up right.

If I get any new input on how to solve this problem, I will of course test it, but for others reading this thread, my advice is clear: don't bother trying to get IPTV/multicast working on this device if you also have a high-speed internet connection in a setup similar to the one described here.


Is this a replacement test? That is, did you simply replace the R7800 with the RB3011, or is the latter running on a different link?

Best Regards

@moeller0: No, not a replacement test. What I am referring to is a user I know from a different forum, who has the same ISP and a setup comparable to mine, and who bought the RB3011 just a few days ago. He has a 500 Mbit/s WAN (I'm on 300 Mbit/s), and with that router there are no multicast dropouts or disturbances when he runs the same tests as I do (even though his WAN speed is higher). I asked him explicitly to load the WAN heavily while watching the TV, and he reported no problems. The Mikrotik router showed around 40% CPU load (though I don't know how Mikrotik calculates its load figure) when he maxed everything out.