Build for Netgear R7800

Have you enabled offloading with option flow_offloading '1' in the defaults section of the firewall config?

I have indeed:

config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'REJECT'
	option flow_offloading '1'
	option flow_offloading_hw '1'

Are you using pppoe for your internet access?

option flow_offloading_hw '1' only works on mt7621 SoCs...

I'm not using PPPOE, I've just enabled both options under firewall settings:
image

Does this help at all (for whatever reason, htop doesn't show the IRQ load):
image

As mentioned by juppin, ipq8065 doesn't have hardware flow-offload (yet), so disable it.

I've since disabled hardware offloading, I've left the setting above it enabled. However, I am still seeing extremely high ksoftirqd/1 and suspect it's affecting my throughput. Not sure where the issue lies. Not much has been changed from the base image other than basic wireless settings, which are fine at 1.3Gbps... Any assistance would be greatly appreciated (unless this is a known 'issue' on this firmware, in which case I'll wait it out):

Apologies if these images are a bit much, but it's all I can provide with my limited knowledge (here's a top report where it shows):
image

@philjohn
This is wireless, flow offload is enabled (I've since disabled hardware offload). As far as affinity settings, I'm not familiar with what you mean unfortunately. As mentioned above, this is a pretty bone-stock flash with barely anything tweaked past wireless settings to attain the fastest throughput, etc.

Check the kernel version of your firmware image. 4.14.50 is claimed to have introduced performance regressions, which are supposed to be fixed in 4.14.51.
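A quick way to check this on the router could look like the sketch below. It compares the running kernel against 4.14.51 field by field; version_ge is a made-up helper name for illustration, not something shipped with OpenWrt, and it deliberately avoids sort -V since that may be missing from a BusyBox build.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A is >= dotted version B.
# Illustrative helper only; anything after the numeric part ("-rc1", "+") is ignored.
version_ge() {
	a="${1%%[!0-9.]*}"
	b="${2%%[!0-9.]*}"
	while [ -n "$a" ] || [ -n "$b" ]; do
		x="${a%%.*}"
		y="${b%%.*}"
		# Missing fields count as 0, so 4.14 compares like 4.14.0
		[ "${x:-0}" -gt "${y:-0}" ] && return 0
		[ "${x:-0}" -lt "${y:-0}" ] && return 1
		# Drop the field just compared
		case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
		case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
	done
	return 0
}

if version_ge "$(uname -r)" 4.14.51; then
	echo "kernel $(uname -r) is at or past 4.14.51"
else
	echo "kernel $(uname -r) is older than 4.14.51"
fi
```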

I seem to be on 4.14.52, so I would imagine this has been fixed:

OpenWrt 18.06-SNAPSHOT, r7111-ab7cabd09d
root@OpenWrt:~# uname -r
4.14.52

I guess I'll wait for a newer build and see if this resolves any of my issues...

OK, first of all then: 2.4GHz or 5GHz wireless? If 2.4, that's about the max you'll get; if 5, make sure you're close to the router, as the signal attenuates very quickly.

Here's the set_cpu_affinity script, it needs to be dropped in /etc/init.d and chmod'd to 755

#!/bin/sh /etc/rc.common
# First start irqbalance with the --oneshot option
# Try to balance manually both eth to core2 and wifi0 to core2 if they are not balanced correctly
# System -> startup -> Local Startup
#  /etc/init.d/set_cpu_affinity

START=99

set_irq_affinity() {
	local name="$1"
	local val="$2"
	local irq

	# Find the IRQ number for the named device in /proc/interrupts
	case "$name" in
	wifi0)
		irq=`grep -E -m1 'ath10k_ahb|qcom-pcie-msi' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' '`
		;;
	wifi1)
		irq=`grep -E -m2 'ath10k_ahb|qcom-pcie-msi' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' '`
		;;
	eth0)
		irq=`grep -E -m3 'eth0' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' '`
		;;
	eth1)
		irq=`grep -E -m3 'eth1' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' '`
		;;
	*)
		irq=`grep -m 1 "$name" /proc/interrupts | cut -d: -f1 | sed 's, *,,'`
		;;
	esac

	# Bail out instead of writing to a non-existent /proc/irq//smp_affinity
	if [ -z "$irq" ]; then
		echo "$name irq not found."
		return 1
	fi
	echo "$val" > "/proc/irq/$irq/smp_affinity"
}

start() {
	. /lib/functions.sh

	/usr/sbin/irqbalance --oneshot --debug > /var/log/irqbalance.log

	# WAN and 802.11AC wifi on separate cores
	set_irq_affinity eth0 1
	set_irq_affinity wifi0 1

	# LAN and 802.11b/g/n on separate cores (and balanced - one wifi chip per core, one network chip per core)
	set_irq_affinity eth1 1
	set_irq_affinity wifi1 2
}
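
To see which IRQs the script is picking up, and what their affinity is afterwards, a small sketch like the one below may help. irqs_for is a hypothetical helper name used only for this example; it prints the IRQ number of every /proc/interrupts line that matches a device name.

```shell
#!/bin/sh
# irqs_for NAME [FILE]: print the IRQ number of every line in FILE
# (default /proc/interrupts) that matches NAME.
# Illustrative helper, not an OpenWrt tool.
irqs_for() {
	awk -F: -v pat="$1" '$0 ~ pat { gsub(/[ \t]/, "", $1); print $1 }' \
		"${2:-/proc/interrupts}"
}

# Example, to be run on the router: show the affinity mask of each eth0 IRQ
# for irq in $(irqs_for eth0); do
# 	echo "IRQ $irq: $(cat /proc/irq/$irq/smp_affinity)"
# done
```

If a name matches several lines, all of them are listed, which makes it easier to see which one the -m3 | tail -n1 combination in the script ends up selecting.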

OMG! The script is removed on every sysupgrade!
Is there a way to make it permanent?
And if it is useful, why isn't it included by default?

Add the script's full path to /etc/sysupgrade.conf.
It will then be preserved across upgrades.
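
As a rough sketch, the path can be appended like this without creating duplicate entries if it is run more than once. keep_on_sysupgrade is a name invented for this example, and the second argument exists only so the function can be tried against a scratch file first.

```shell
#!/bin/sh
# keep_on_sysupgrade PATH [CONF]: add PATH to the sysupgrade keep list
# (default /etc/sysupgrade.conf) unless that exact line is already there.
# Illustrative helper, not part of OpenWrt.
keep_on_sysupgrade() {
	file="$1"
	conf="${2:-/etc/sysupgrade.conf}"
	grep -qxF "$file" "$conf" 2>/dev/null || echo "$file" >> "$conf"
}

# On the router:
# keep_on_sysupgrade /etc/init.d/set_cpu_affinity
```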

Let me just thank you a million times for your great effort and for pushing this awesome hardware, @hnyman. (I just registered here so I can thank you :) )
Yesterday I flashed stock master onto my R7800 and was pretty disappointed with its performance (still huge ping spikes and suboptimal wifi performance). But with your build and the IRQs distributed, it's so much better now!

Are those patches (like v8-toimii-970-ath10k-QCA-LED-support.patch) going into master any time soon? Any links to track the merge process?

Also, are you building this kernel with CONFIG_PREEMPT? I read somewhere that this might get latency down, but it looked like someone fixed the root cause, so is it actually not needed anymore?

Thanks again!

Actually, that v8 is the old wifi LED patch, which is no longer included in the build, as a modified v13 is already in the master sources. The patch is there just for my archiving purposes (and that's why it does not show up in the detailed diffs that I provide for each build). All patches actually used in the build are visible in the diffs.

Regarding latency, there shouldn't actually be that much difference from the master snapshot, as I build with the default kernel options. So, no CONFIG_PREEMPT. (I am not quite sure what you mean by "stock master": the actual master (buildbot snapshots), or the ancient 17.01 release?)

Most of my own changes have already been incorporated into the OpenWrt sources, so currently my build is mostly about selecting key packages for the build.

Sorry for the weird naming. I only noticed it after sending. I meant the snapshot build from here:
https://downloads.openwrt.org/snapshots/targets/ipq806x/generic/

So WIFI LEDs and latency should be okay with "official" snapshots? That's great to hear!

@philjohn Thanks for providing the script. I've just placed it inside init.d and 755'd it. Unfortunately, I'm still seeing high IRQ load when running a speed test over 5GHz, and can see it's affecting the performance. Is there something else I'm missing, some sort of setting, which could be contributing to this?

image

Wifi driver is run by IRQs, so what is the surprise with the load?

@hnyman Sorry, I should probably clarify. It's not a surprise that there's any load, it's just that throughput seems to die the moment load hits 100%. There are reports of others hitting 400Mbps, whereas I can barely hit 300 at times on 5GHz (I'm maybe two meters from the router, strong signal, stable ping, etc). I've hard-wired into the router and run tests and can hit 350+ easily (regularly hitting 450), but when connected wirelessly those speeds seem truly difficult to hit (especially when I'm connected at 1.3Gbps).

I'm not certain the IRQ load is the 'issue' either. I have limited knowledge of this, I've exhausted what I know, and the IRQ load is the only indication I have of why the throughput would be low. I'm all ears if there are other things I need to check, i.e., should I be using a specific cipher, or any other setting.

I appreciate all the help and the firmware, so thank you!

Each arriving packet from wan, lan or wlan basically triggers the respective driver via IRQs, so high software IRQ load during speed test does not sound strange. CPU is not consumed for user-space computing, but is needed for network drivers (and firewall, routing etc.)
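
To see where that softirq work lands, the per-CPU counters in /proc/softirqs can be sampled before and after a speed test. The sketch below is only an illustration (softirq_per_cpu is a made-up helper name) and assumes the data columns line up with the CPU0, CPU1, ... header, as they normally do.

```shell
#!/bin/sh
# softirq_per_cpu NAME [FILE]: print the per-CPU counters for one softirq
# type (e.g. NET_RX) from FILE (default /proc/softirqs).
# Illustrative helper, not an OpenWrt tool.
softirq_per_cpu() {
	awk -v row="$1:" '$1 == row {
		for (i = 2; i <= NF; i++) printf "CPU%d %s\n", i - 2, $i
	}' "${2:-/proc/softirqs}"
}

# On the router, run before and after a speed test and compare:
# softirq_per_cpu NET_RX
```

If nearly all of the NET_RX growth is on one CPU, that core is doing most of the receive processing.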

I just tested with flent using my 100/12 connection and I get

  • 20% sirq load with wired
  • 42% sirq load with 2.4GHz wifi
  • 60% sirq load with 5GHz wifi

This is without any balancing the irqs to different CPU cores, as my "slow" connection does not usually cause any need for that.

There is useful IRQ balancing discussion in the R7800 exploration thread from early 2017. Read from here onward: Netgear R7800 exploration (IPQ8065, QCA9984) - #45 by Nague
IRQ balancing is no magic wand, but it may help in getting some additional throughput.

Ps. the script linked above by @philjohn looks a bit strange, as its actions do not match the comments inside the script. (both wan and wlan0 are assigned to the same core, despite the comment saying otherwise)
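
For anyone reading the values in that script: smp_affinity takes a hexadecimal CPU bitmask, so 1 means CPU0, 2 means CPU1 and 3 means either. That is why eth0, wifi0 and eth1 (all set to 1) end up on the first core and only wifi1 (set to 2) goes to the second. A small sketch to decode a mask, with mask_to_cpus being a name made up for this example and masks assumed to be plain hex without the comma groups used on systems with many CPUs:

```shell
#!/bin/sh
# mask_to_cpus HEXMASK: print the CPU numbers selected by an smp_affinity
# mask, separated by spaces. Illustrative helper only.
mask_to_cpus() {
	mask=$((0x$1))
	cpu=0
	out=""
	while [ "$mask" -gt 0 ]; do
		# Bit N set means CPU N is allowed to handle the IRQ
		if [ $((mask & 1)) -eq 1 ]; then
			out="$out${out:+ }$cpu"
		fi
		mask=$((mask >> 1))
		cpu=$((cpu + 1))
	done
	echo "$out"
}

# mask_to_cpus 1   -> CPU0 only
# mask_to_cpus 2   -> CPU1 only
# mask_to_cpus 3   -> CPU0 and CPU1
```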

@hnyman Is there any way to get nginx SSL working on your build?

Good spot. I found this gave better throughput for 5GHz and LAN to WAN, keeping them local to a single core. I need to update the comments.