Some sqm performance reference. EA4500 and WRT1900acv1

While I was troubleshooting low wireless throughput for EA4500 in station mode, I got bored and decided to test out the cpus in EA4500 and WRT1900acv1.

First off, some observations:

  1. nat+sqm definitely hurts performance. You might want a dedicated sqm device with wan and lan bridged together, acting as a transparent shaper.
  2. lan <-> lan doesn't seem to use the CPU at all, as expected.
  3. Even when wan and lan are bridged together (no NAT), forwarding still consumes cpu. But in my case both ea4500 and wrt1900acv1 can handle full gigabit speed with no problem (of course).
  4. For wrt1900acv1, I found that sometimes the firmware "glitches out" and uses only one core, i.e. cpu0 has 100% sirq and cpu1 is idle. But iperf throughput is most consistent at (100, 0). With the same sqm setting, if the router utilizes both cpus, the throughput jumps around quite a lot, with roughly (70, 90) cpu consumption.

(x,y) => (cpu 0 sirq, cpu 1 sirq)
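For anyone who wants to log the (cpu 0 sirq, cpu 1 sirq) pair instead of eyeballing it, here is a minimal Python sketch. It assumes the standard Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal, ...) and takes two snapshots of that file as text. The function names are my own, and this is not necessarily how the numbers in this post were collected.

```python
def parse_cpu_lines(stat_text):
    """Return {cpu_name: [jiffy counters]} for the per-core lines of /proc/stat."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; the aggregate line is just "cpu".
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(v) for v in fields[1:]]
    return counters


def sirq_percent(before_text, after_text):
    """Softirq share (in %) of each core's time between two snapshots."""
    before = parse_cpu_lines(before_text)
    after = parse_cpu_lines(after_text)
    result = {}
    for cpu, new in after.items():
        old = before[cpu]
        # Only the first 8 fields (user..steal) are summed; guest time is
        # already counted inside user/nice.
        deltas = [n - o for n, o in zip(new[:8], old[:8])]
        total = sum(deltas)
        # softirq is the 7th counter, i.e. index 6.
        result[cpu] = 100.0 * deltas[6] / total if total else 0.0
    return result


# Made-up snapshots that mimic the "glitched" (100, 0) case.
snapshot_a = "cpu0 100 0 50 800 0 0 50 0 0 0\ncpu1 100 0 50 850 0 0 0 0 0 0\n"
snapshot_b = "cpu0 100 0 50 800 0 0 150 0 0 0\ncpu1 110 0 50 940 0 0 0 0 0 0\n"
print(sirq_percent(snapshot_a, snapshot_b))
```

On a real router you would read /proc/stat twice, a second or so apart, while iperf3 is running, and pass both reads to sirq_percent.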

All test results for wrt1900acv1 were recorded while the router was glitched out and stuck at (100,0).
wrt1900acv1, MV78230 1.2ghz dual core, bridge over wan and lan, no nat. For this test, I used a very rough binary search (only a couple of iterations) on the interval [300 Mbps, 600 Mbps].
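The rough binary search over [300 Mbps, 600 Mbps] can be sketched in Python roughly like this. The `sustains` callback is hypothetical: it stands for "set the sqm limit to this value, run iperf3, and decide whether the router keeps up". The exact pass/fail criterion is an assumption on my part, as is the iteration count.

```python
def find_max_limit(sustains, low=300.0, high=600.0, iterations=4):
    """Bisect the shaper limit (Mbps).

    Returns the highest tested limit for which sustains(limit) was True,
    or None if no tested limit passed.
    """
    best = None
    for _ in range(iterations):
        mid = (low + high) / 2
        if sustains(mid):
            best = mid   # router kept up: try a higher limit next
            low = mid
        else:
            high = mid   # router fell behind: try a lower limit next
    return best


# Demonstration with a made-up threshold (not a measured value):
# pretend the router keeps up with anything up to 480 Mbps.
print(find_max_limit(lambda limit: limit <= 480))
```

With only a handful of iterations the answer is coarse (each step halves a 300 Mbps wide interval), which matches the "very rough" nature of the search.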

fq_codel + simple.qos

.\iperf3.exe -c 192.168.1.9
Connecting to host 192.168.1.9, port 5201
[ 4] local 192.168.1.203 port 58087 connected to 192.168.1.9 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 45.5 MBytes 382 Mbits/sec
[ 4] 1.00-2.00 sec 51.6 MBytes 433 Mbits/sec
[ 4] 2.00-3.00 sec 56.8 MBytes 476 Mbits/sec
[ 4] 3.00-4.00 sec 56.6 MBytes 475 Mbits/sec
[ 4] 4.00-5.00 sec 56.5 MBytes 474 Mbits/sec
[ 4] 5.00-6.00 sec 56.2 MBytes 472 Mbits/sec
[ 4] 6.00-7.00 sec 55.8 MBytes 468 Mbits/sec
[ 4] 7.00-8.00 sec 56.1 MBytes 471 Mbits/sec
[ 4] 8.00-9.00 sec 56.4 MBytes 473 Mbits/sec
[ 4] 9.00-10.00 sec 56.5 MBytes 474 Mbits/sec


[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 548 MBytes 460 Mbits/sec sender
[ 4] 0.00-10.00 sec 548 MBytes 460 Mbits/sec receiver

iperf Done.

fq_codel + simplest

.\iperf3.exe -c 192.168.1.9
Connecting to host 192.168.1.9, port 5201
[ 4] local 192.168.1.203 port 58125 connected to 192.168.1.9 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 54.8 MBytes 459 Mbits/sec
[ 4] 1.00-2.00 sec 69.2 MBytes 581 Mbits/sec
[ 4] 2.00-3.00 sec 68.4 MBytes 573 Mbits/sec
[ 4] 3.00-4.00 sec 68.1 MBytes 572 Mbits/sec
[ 4] 4.00-5.00 sec 68.8 MBytes 577 Mbits/sec
[ 4] 5.00-6.00 sec 67.4 MBytes 565 Mbits/sec
[ 4] 6.00-7.00 sec 68.2 MBytes 573 Mbits/sec
[ 4] 7.00-8.00 sec 67.9 MBytes 569 Mbits/sec
[ 4] 8.00-9.00 sec 68.2 MBytes 572 Mbits/sec
[ 4] 9.00-10.00 sec 68.2 MBytes 573 Mbits/sec


[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 669 MBytes 561 Mbits/sec sender
[ 4] 0.00-10.00 sec 669 MBytes 561 Mbits/sec receiver

iperf Done.

cake + piece of cake

.\iperf3.exe -c 192.168.1.9
Connecting to host 192.168.1.9, port 5201
[ 4] local 192.168.1.203 port 58097 connected to 192.168.1.9 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 56.6 MBytes 474 Mbits/sec
[ 4] 1.00-2.00 sec 68.4 MBytes 575 Mbits/sec
[ 4] 2.00-3.00 sec 69.2 MBytes 581 Mbits/sec
[ 4] 3.00-4.00 sec 68.5 MBytes 574 Mbits/sec
[ 4] 4.00-5.00 sec 68.8 MBytes 577 Mbits/sec
[ 4] 5.00-6.00 sec 69.2 MBytes 581 Mbits/sec
[ 4] 6.00-7.00 sec 68.6 MBytes 576 Mbits/sec
[ 4] 7.00-8.00 sec 68.5 MBytes 575 Mbits/sec
[ 4] 8.00-9.00 sec 69.1 MBytes 580 Mbits/sec
[ 4] 9.00-10.00 sec 68.6 MBytes 576 Mbits/sec


[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 676 MBytes 567 Mbits/sec sender
[ 4] 0.00-10.00 sec 676 MBytes 567 Mbits/sec receiver

iperf Done.
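A quick sanity check on the summary lines of the three runs above: the Transfer and Bandwidth columns agree if iperf3 counts "MBytes" as 2^20 bytes and "Mbits" as 10^6 bits. That unit convention is my assumption here, but it is consistent with the numbers. A small Python sketch:

```python
MIB = 1024 * 1024  # assumed size of one iperf3 "MByte" in bytes


def mbits_per_sec(mbytes, seconds):
    """Convert a transfer of `mbytes` (2**20 bytes each) over `seconds` to Mbit/s (10**6 bits)."""
    return mbytes * MIB * 8 / seconds / 1e6


# (label, total MBytes transferred in 10 s) taken from the summary lines above
runs = [
    ("fq_codel + simple.qos", 548),
    ("fq_codel + simplest", 669),
    ("piece of cake", 676),
]

for label, mbytes in runs:
    print(f"{label}: {mbits_per_sec(mbytes, 10):.1f} Mbits/sec")
```

Rounded to whole numbers this gives 460, 561 and 567 Mbits/sec, the same as the reported totals.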

EA4500 88F6282 1.2ghz single core
I changed the measurement method for this one (yeah... I didn't want to bother retesting wrt1900acv1).

fq_codel + simple

52,78,99
100,200,400
ultimate 419

fq_codel + simplest

48,75,99
100,200,400
ultimate 443

cake + piece of cake

42,62,88
100,200,400
ultimate 453

The first two rows of each block are a table:
A1,A2,A3
B1,B2,B3

Ax are the sirq%, Bx are the corresponding sqm settings (in Mbps). The last row is just the fastest throughput I could get for the given qdisc and qos script.

Note, though, that the measured throughput is around 3~5% less than the setting, e.g. a 400 Mbps limit gives ~380 Mbps measured throughput.
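To make the 3~5% figure concrete, here is a tiny Python sketch of the arithmetic. The function names are just illustrative, and the 3% and 5% bounds are the rough range quoted above, not precise measurements.

```python
def shortfall_percent(limit_mbps, measured_mbps):
    """How far (in %) the measured throughput falls below the configured limit."""
    return 100.0 * (limit_mbps - measured_mbps) / limit_mbps


def expected_range(limit_mbps, low_pct=3, high_pct=5):
    """Throughput range to expect if the shortfall is between low_pct and high_pct."""
    worst = limit_mbps * (100 - high_pct) / 100
    best = limit_mbps * (100 - low_pct) / 100
    return worst, best


print(shortfall_percent(400, 380))  # the 400 -> ~380 example
print(expected_range(400))
```

For a 400 Mbps setting, ~380 Mbps measured is a 5% shortfall, and a 3~5% shortfall corresponds to roughly 380 to 388 Mbps.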

Interestingly enough, cake + piece of cake doesn't do much better than fq_codel + simplest.qos in terms of the highest throughput it can reach, though at the lower throughput settings it has a lower sirq.
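The EA4500 comparison can be put into numbers with a short Python sketch. The data is copied from the three blocks above. The dictionary layout is my own, and I'm assuming the "ultimate" figures are in Mbps like the settings.

```python
# sirq% at each sqm setting (Mbps), plus the best throughput reached
ea4500 = {
    "fq_codel + simple":    {"sirq": {100: 52, 200: 78, 400: 99}, "ultimate": 419},
    "fq_codel + simplest":  {"sirq": {100: 48, 200: 75, 400: 99}, "ultimate": 443},
    "cake + piece of cake": {"sirq": {100: 42, 200: 62, 400: 88}, "ultimate": 453},
}

baseline = ea4500["fq_codel + simplest"]
cake = ea4500["cake + piece of cake"]

# Relative gain in peak throughput of cake over fq_codel + simplest
gain = 100.0 * (cake["ultimate"] - baseline["ultimate"]) / baseline["ultimate"]

# sirq percentage points saved by cake at each setting
sirq_saved = {
    setting: baseline["sirq"][setting] - cake["sirq"][setting]
    for setting in baseline["sirq"]
}

print(f"peak throughput gain: {gain:.1f}%")
print("sirq points saved per setting:", sirq_saved)
```

So the peak throughput gain is only a couple of percent, while the sirq saving is 6 to 13 percentage points depending on the setting.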

It would be interesting to see the same test done on ipq8064, ipq8065 and the newer Marvell SoC used in wrt1900acv2, wrt1900acs, and wrt3200acm (88F6820 at 1.33ghz, 1.6ghz, 1.8ghz).