Netgear R7800 exploration (IPQ8065, QCA9984)

why v8 and not v12?

Because v8 was already known to apply successfully on 17.01.

Have you tried v11 or v12 yourself? Versions v9 and later create new config symbols that need to be handled (they allow building without LED functionality for boards without LEDs). That is an additional complication.

Feel free to refine the patch with v12.

I'll have a go at v12 sometime this week, if I can find some time.

On a separate note, has anyone attempted to use the NSS cores on the ipq806x? There's a separate thread discussing increased latency on the R7800. Using NSS for the GMAC interface should help. I'm able to compile and even load the nss-gmac kernel module on my R7800, but it doesn't seem to do anything, as the stmmac driver is still loaded for the ethernet interface.
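One way to confirm which driver actually owns the ethernet interface is to follow the driver symlink in sysfs. A minimal sketch, assuming a busybox-style shell with `readlink -f`; `net_driver` is a helper name made up for illustration and `eth0` is only an example interface:

```shell
# Print the name of the kernel driver bound to a network interface.
# $1: interface name, $2: sysfs root (defaults to /sys)
net_driver() {
    link="${2:-/sys}/class/net/$1/device/driver"
    [ -e "$link" ] || return 1
    # The symlink points at the driver's directory; its last component is the name.
    basename "$(readlink -f "$link")"
}

# Example call; eth0 is an assumption, use whatever interface you are checking.
if drv=$(net_driver eth0); then
    echo "eth0 is bound to: $drv"
else
    echo "eth0: no driver link found"
fi
```

If the name printed is still the stmmac one after loading nss-gmac, the new module has not taken over the interface.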

Has anyone had any luck hacking the NSS of the ipq806x and managed to get it to work?

Upgraded to @hnyman build r6323 and so far the wifi LEDs have worked perfectly even after several soft reboots and power cycles.

Decided to do a small experiment with the wifi LEDs:

root@R7800RT1:~# echo none >/sys/devices/virtual/leds/ath10k-phy0/trigger
root@R7800RT1:~# echo none >/sys/devices/virtual/leds/ath10k-phy1/trigger

Both LEDs are now off.
Reinitialize led service for testing.

root@R7800RT1:~# service led restart
setting up led USB 1
setting up led USB 2
setting up led WAN
setting up led eSATA
Skipping trigger 'ide-disk' for led 'eSATA' due to missing kernel module
setting up led WLAN 2G
setting up led WLAN 5G

Both LEDs are now working.
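To see which trigger is selected before and after an experiment like this, the trigger file can simply be read back: the kernel lists every available trigger and marks the active one with square brackets. A small sketch, assuming the same ath10k-phy* LED names as above; `active_trigger` is a made-up helper name:

```shell
# Print the currently selected trigger for one LED.
# $1: path to the LED's "trigger" file
active_trigger() {
    # Put each trigger name on its own line, then keep the one in [brackets].
    tr ' ' '\n' < "$1" | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# Report the active trigger of every ath10k LED that exists on this system.
for f in /sys/devices/virtual/leds/ath10k-phy*/trigger; do
    if [ -r "$f" ]; then
        echo "$f: $(active_trigger "$f")"
    fi
done
```

After `echo none > .../trigger` this should report `none`, and after the led service restart it should report whatever trigger the LED configuration assigns.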

To get a list of available services, enter the following:

root@R7800RT1:~# service

service "" not found, the following services are available:
adblock           dnsmasq           fstab             luci_statistics   rpcd              system            vsftpd
boot              done              gpio_switch       miniupnpd         sqm               uhttpd
collectd          dropbear          led               network           sysctl            umount
cron              etherwake         linksys_recovery  nlbwmon           sysfixtime        urandom_seed
ddns              firewall          log               odhcpd            sysntpd           usbmode

To get a list of "service" commands, enter the following:

root@R7800RT1:~# service led
Syntax: /etc/init.d/led [command]

Available commands:
	start	Start the service
	stop	Stop the service
	restart	Restart the service
	reload	Reload configuration files (or restart if service does not implement reload)
	enable	Enable service autostart
	disable	Disable service autostart

Not all of these commands really work. Example:

root@R7800RT1:~# service led stop
Doesn't do anything. "start", "restart", "reload", "enable", and "disable" work.

If you enter service led disable, nothing appears to happen, but on the next reboot many LEDs will not work except for the Power and LAN port LEDs.
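The reason nothing appears to happen is that enable and disable only add or remove the start symlink in /etc/rc.d, which is read at boot. A rough sketch for checking that state without rebooting; `service_enabled` is a hypothetical helper, and the two-digit `S??name` link pattern is an assumption based on the usual OpenWrt layout:

```shell
# Report whether an init script is enabled, by looking for its start symlink.
# $1: service name, $2: rc.d directory (defaults to /etc/rc.d)
service_enabled() {
    for link in "${2:-/etc/rc.d}"/S[0-9][0-9]"$1"; do
        # An unmatched glob stays literal, so test that the entry really exists.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

if service_enabled led; then
    echo "led: enabled at boot"
else
    echo "led: disabled at boot"
fi
```

Running this before and after `service led disable` should show the state flip, even though the LEDs themselves keep working until the next reboot.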


Just read about the NETGEAR XR500. It is basically the R7800, but the maximum firmware size is increased: it was 25 MB for the R7800 and is 32 MB for the XR500.

Is DumaOS based on OpenWrt?

Does this new device help us with the R7800 software development?

Also see https://www.smallnetbuilder.com/other/ces/ces-2018/33178-netgear-s-ces-2018-announcements

Other features include QoS for gaming device prioritization, OpenVPN support and network monitor.

wait... no... if you don't give a fu*k about the netgear partition.... you can increase it to 128 mb...

How? Does it mean we can flash it with XR500 firmware?

No (and not just for the reason that the XR500 is supposed to ship with 256 MB flash instead of 128 MB flash).

No you can't, since a last-minute change put 256 MB of flash in the XR500. I was a tester for the XR500's OS/firmware; we tested it on the R7800s that we were given. At the end we received special firmware builds as a thank you (not available to regular consumers), because the 128->256 MB change made it incompatible.

FYI even stock R7800 firmware is OpenWRT based, though a rather old one with proprietary blobs. I'm assuming (don't quote me on this though) that the Duma UI/QoS stuff is overlaid on stock firmware, which is why clicking "Netgear Settings" in the Duma UI gives you the standard Netgear UI/layout for the rest of the settings, minus Streamboost QoS of course, which is removed.

Has anybody got the ath10k wifi LED patches v9-v12 working?

Those versions move the gpio stuff to another source file and introduce new config symbols like CONFIG_ATH10K_LEDS, and I guess that building in-kernel versus as a separate "backports" build makes a difference, e.g. config symbols being defined as CPTCFG_...
I have tried several approaches (config settings, Makefile dependency changes, forcing module selections) to make things work, and I have got the stuff compiled, but no life in the wifi LEDs.

v8 patch seems to work ok, but it would be nice to move to the "final" patch version v12.

So, if anybody has got the ath10k wifi LEDs in R7800, NBG6817, whatever... working, please share the sources. @slh ?

It may be that getting v9-v12 to work requires help from somebody like @nbd who really understands the mac80211 internals and Makefile & Kconfig dependencies.

It won't work, since that config symbol does not exist in current backports.
But it could be fixed with a patch backporting them.

I haven't gotten it to work so far (last test was with v9, which didn't work at all, although /sys/class/leds/ & friends were present). Even in the best of all cases the results are unreliable and erratic on my nbg6817, therefore I'm currently not using those patches (and they will easily need another 10 iterations before they might be acceptable for upstream, as even the most basic checkpatch issues like trailing whitespace and using spaces instead of tabs are still ignored).

@luaraneda, by the way, it seems as if the ipq806x target would be split into independent ones for ipq40xx and ipq806x (see blogic's staging tree), which could help getting the kernel just a tad smaller again.

Yes, I was just noticing that. If I remember correctly, in 2017 there was an attempt to do that by a member of Codeaurora, but it was dropped/rejected (I don't remember which).
Maybe the kernel size issue triggered the change. I couldn't find any information on the mailing list...

In other news, yesterday I tried to compile kernel 4.14 using GCC 7.3, but the resulting image is approximately 4 kB bigger than the one generated by GCC 5, which is not good for the affected devices.
Some notes about GCC 7.3:

  • uboot-fritz4040 is not compiling. I deselected it and continued my tests.
  • I only compiled the images. This time I didn't test them as usual on my Asus RT-AC58U.

I've been testing gcc 7.3 for arm with kernel 4.9 for a while now (nbg6817, so 4 MB kernel partition), I haven't noticed any problems so far (but obviously I have no size restrictions yet).

--- a/toolchain/gcc/Config.in
+++ b/toolchain/gcc/Config.in
@@ -3,7 +3,7 @@
 choice
 	prompt "GCC compiler Version" if TOOLCHAINOPTS
 	default GCC_USE_VERSION_7_1_ARC if arc
-	default GCC_USE_VERSION_7 if x86_64 || i386
+	default GCC_USE_VERSION_7 if x86_64 || i386 || arm
 	default GCC_USE_VERSION_5
 	help
 	  Select the version of gcc you wish to use.
--- a/toolchain/gcc/Config.version
+++ b/toolchain/gcc/Config.version
@@ -6,6 +6,7 @@ config GCC_VERSION_7
 	default y if GCC_USE_VERSION_7
 	default y if (!TOOLCHAINOPTS && x86_64)
 	default y if (!TOOLCHAINOPTS && i386)
+	default y if (!TOOLCHAINOPTS && arm)
 	bool
 
 config GCC_VERSION

gcc 7.3 will become particularly interesting for spectre mitigation (retpoline), but the required ARMv7 specific patches haven't been merged mainline yet.

@luaraneda all pcie patches that are missing in my tree are upstreamed
Btw ipq target is being split apart into ipq40xx and ipq806x. Check blogic's and mkresin's staging trees. This should decrease the size of the kernel image by removing ipq40xx specific config options.


Sounds similar to my own results with v9 or later.

So far v8 is the best.

A question to the experts here. For a while now I have been experiencing significant latency spikes that affect VoIP and page loading. I tested by pinging 8.8.8.8 when the router (Netgear R7800) is pretty much idle. I would start 6..8 concurrent ping sessions and every minute I would see two or three spikes to 50..100 ms like below and they would most of the time happen synchronously in multiple sessions.

2018-03-03 15:26:07 64 bytes from 8.8.8.8: icmp_seq=525 ttl=60 time=58.013 ms
2018-03-03 15:26:35 64 bytes from 8.8.8.8: icmp_seq=553 ttl=60 time=37.198 ms
2018-03-03 15:27:07 64 bytes from 8.8.8.8: icmp_seq=585 ttl=60 time=76.856 ms
2018-03-03 15:28:06 64 bytes from 8.8.8.8: icmp_seq=643 ttl=60 time=60.067 ms
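Spikes like these can be filtered out of a longer ping log with a few lines of awk rather than by eye. A minimal sketch, assuming the usual `time=... ms` field in each reply; `ping_spikes` is a made-up helper name and the 30 ms limit is arbitrary:

```shell
# Print only the ping replies whose round-trip time exceeds a limit.
# $1: limit in milliseconds; ping output is read from stdin.
ping_spikes() {
    awk -v limit="$1" '
        match($0, /time=[0-9.]+/) {
            # Skip the 5 characters of "time=" to get the number itself.
            rtt = substr($0, RSTART + 5, RLENGTH - 5)
            if (rtt + 0 > limit + 0) print
        }'
}

# Two replies taken from the logs in this thread; only the first is above 30 ms.
ping_spikes 30 <<'EOF'
2018-03-03 15:26:07 64 bytes from 8.8.8.8: icmp_seq=525 ttl=60 time=58.013 ms
2018-03-03 15:37:49 64 bytes from 8.8.8.8: icmp_seq=226 ttl=60 time=21.615 ms
EOF
```

For a live test it can sit at the end of a pipe, e.g. `ping 8.8.8.8 | ping_spikes 30`, so only the outliers are shown.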

As suggested in this thread, I tried moving the IRQs to CPU0/CPU1 in different permutations, tried both 17.01 and master, etc., all with no success. Then I noticed that while the IRQs run exclusively on their assigned CPU, all the other processes (kernel workers, hostapd, dnsmasq, etc.) are constantly jumping back and forth between the CPUs.

I used @hnyman's build env and built a firmware with (and without) isolcpus=1 (based on master code): the only differences from the original here are an additional boot param and a few more recent commits. The rest remained the same.

Then I tested the four permutations below (using a wired connection and the same source code):

  1. No isolcpus=1 and IRQs on CPU0
  2. No isolcpus=1 and IRQs on CPU1
  3. isolcpus=1 and IRQs on CPU0
  4. isolcpus=1 and IRQs on CPU1

The first three yielded no difference, but the last one dropped the size of the spikes to ~20ms and they are now 10+ minutes apart vs several each minute.

2018-03-03 15:34:03 PING 8.8.8.8 (8.8.8.8): 56 data bytes
2018-03-03 15:37:49 64 bytes from 8.8.8.8: icmp_seq=226 ttl=60 time=21.615 ms
2018-03-03 15:50:47 
2018-03-03 15:50:47 --- 8.8.8.8 ping statistics ---
2018-03-03 15:50:47 1000 packets transmitted, 1000 packets received, 0.0% packet loss
2018-03-03 15:50:47 round-trip min/avg/max/stddev = 11.082/11.963/21.615/0.475 ms
2018-03-03 15:50:47 PING 8.8.8.8 (8.8.8.8): 56 data bytes
2018-03-03 15:57:12 64 bytes from 8.8.8.8: icmp_seq=382 ttl=60 time=22.734 ms
2018-03-03 16:07:33 
2018-03-03 16:07:33 --- 8.8.8.8 ping statistics ---
2018-03-03 16:07:33 1000 packets transmitted, 1000 packets received, 0.0% packet loss
2018-03-03 16:07:33 round-trip min/avg/max/stddev = 10.913/11.839/22.734/0.492 ms
2018-03-03 16:07:33 PING 8.8.8.8 (8.8.8.8): 56 data bytes
2018-03-03 16:24:15 
2018-03-03 16:24:15 --- 8.8.8.8 ping statistics ---
2018-03-03 16:24:15 1000 packets transmitted, 1000 packets received, 0.0% packet loss
2018-03-03 16:24:15 round-trip min/avg/max/stddev = 11.192/11.921/15.246/0.326 ms

So CPU1 now only services the IRQs for eth0, eth1, wifi0, and wifi1, while everything else runs on CPU0.
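The exact commands used to move the IRQs are not shown in the thread, so the following is only a sketch of the usual approach: look up the IRQ numbers in /proc/interrupts and write a CPU bitmask to /proc/irq/N/smp_affinity. The helper names and the `eth|ath10k` name pattern are assumptions for illustration; check /proc/interrupts on your own device for the real names.

```shell
# Hex affinity mask that selects a single CPU (CPU0 -> 1, CPU1 -> 2).
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# List IRQ numbers whose last column (the device name) matches a regex.
# $1: regex, $2: interrupts table (defaults to /proc/interrupts)
irqs_matching() {
    awk -v pat="$1" '$1 ~ /^[0-9]+:$/ && $NF ~ pat { sub(/:/, "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Pin every matching IRQ to one CPU. Needs root on the router.
# $1: regex for the device name, $2: CPU number
pin_irqs() {
    mask=$(cpu_mask "$2")
    for irq in $(irqs_matching "$1"); do
        echo "$mask" > "/proc/irq/$irq/smp_affinity"
    done
}

# Hypothetical usage for permutation 4 (not executed here):
#   pin_irqs 'eth|ath10k' 1
echo "affinity mask for CPU1: $(cpu_mask 1)"
```

Note that this only covers the IRQ half of the setup. Keeping ordinary processes off CPU1 is done separately by the `isolcpus=1` kernel boot parameter, which has to be built into the firmware as described above.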

Does this make sense, or am I seeing things? I am not quite sure I can explain why there is such a difference.


I've suffered ping spikes as well when running OpenWRT/LEDE. Using stock firmware, they go away. My original post is located here: Build for Netgear R7800

Though my testing is nowhere near as thorough as yours.

For now, I had to move back to stock firmware, since nlbwmon has shown itself to be unstable and unpredictable in behavior. Constant issues with displaying the data from the GUI; it randomly consumed 100% CPU for a period of at least 16 hours. That, and the noticeable increase in latency, as you mentioned.

There have been some PCIe fixes recently for this platform, and they do mitigate some of the issues with the relationship between OpenWRT/LEDE<->ipq806x.

Yes, stock firmware runs with no spikes and introduces almost unnoticeable additional latency overhead, while LEDE adds around 1..2 ms. That I could live with, but the spikes were atrocious. I am running the latest and greatest, so I should have all the patches in.