Netgear R7800 exploration (IPQ8065, QCA9984)

Hi All--

Is it possibly significant that there isn't an ID listed for the QCA9984 chips in pci.ids.gz? I mean, other previous/recent QCAs are listed in there.

Thanks for any comments/insight on this.

Ben--

Never heard about that file in Openwrt/LEDE context.
How is it related to the build?

It's part of the pciutils package (provides 'lspci') ... I ask because of the ath10k_pci error I see in my R7800 and like one of the last posts above.

Error=errors

It's kind of a stab in the dark, but I'm wondering whether the missing PCI ID for QCA9984 is insignificant or perhaps contributes to some of the ath10k_pci problems I've seen on my router (corrupt rx/tx ring buffers, "received unexpected tx_fetch_ind event", the error the guy above reported, etc.) ...

Not sure if it's related or just adds to it, but whenever I've built my own images (snapshots, LEDE 17.01 and 17.01.1) I always get this error towards the end of the build, from the pciutils package:

/usr/share/hwdata/pci.ids is read-only, exiting.
rm: cannot remove 'pci.ids.gz.old': No such file or directory

When I circled back on that, I went looking into the file and noticed that there's no QCA9984 subdevice listed.

lspci shows the following on my R7800:

Network controller [0280]: Qualcomm Atheros Device [168c:0046]
Subsystem: Qualcomm Atheros Device [168c:cafe]

In pci.ids.gz there is no Atheros device "0046":

168c Qualcomm Atheros

0040 QCA9980/9990 802.11ac Wireless Network Adapter
0041 QCA6164 802.11ac Wireless Network Adapter
0042 QCA9377 802.11ac Wireless Network Adapter
0050 QCA9887 802.11ac Wireless Network Adapter

Where does the 168c:cafe subdevice come from?

gunzip -c /usr/share/pci.ids.gz | grep -i cafe
4101 OLPC Cafe Controller Secure Digital Controller
cafe VirtualBox Guest Service
cafe Chrysalis-ITS
cafe Kona SD
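For anyone who wants to repeat the check without eyeballing grep output, here is a minimal sketch in Python. It assumes the usual pci.ids layout (vendor lines at column 0, device lines indented with one tab, subsystem lines with two tabs); the helper names are my own, and the sample text is just the four Atheros lines quoted above with the tabs put back in.

```python
import gzip

def parse_pci_ids(text):
    """Return {vendor_id: {"name": str, "devices": {device_id: name}}}."""
    vendors = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        if line.startswith("C "):
            break  # device-class section starts here; not needed for this check
        if line.startswith("\t\t"):
            continue  # subsystem entries are ignored in this sketch
        if line.startswith("\t"):
            if current is not None:
                dev_id, _, name = line.strip().partition(" ")
                current["devices"][dev_id.lower()] = name.strip()
            continue
        ven_id, _, name = line.partition(" ")
        current = {"name": name.strip(), "devices": {}}
        vendors[ven_id.lower()] = current
    return vendors

def lookup(vendors, vendor_id, device_id):
    """Return the device name, or None if the database has no entry."""
    vendor = vendors.get(vendor_id.lower())
    if vendor is None:
        return None
    return vendor["devices"].get(device_id.lower())

def load_pci_ids(path="/usr/share/pci.ids.gz"):
    """Read the gzip-compressed database installed by pciutils."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return parse_pci_ids(fh.read())

# Sample data: only the entries quoted in this post (tabs reconstructed).
SAMPLE = (
    "168c  Qualcomm Atheros\n"
    "\t0040  QCA9980/9990 802.11ac Wireless Network Adapter\n"
    "\t0041  QCA6164 802.11ac Wireless Network Adapter\n"
    "\t0042  QCA9377 802.11ac Wireless Network Adapter\n"
    "\t0050  QCA9887 802.11ac Wireless Network Adapter\n"
)

if __name__ == "__main__":
    db = parse_pci_ids(SAMPLE)
    for dev in ("0040", "0046"):
        print(dev, "->", lookup(db, "168c", dev))
```

On the router itself you would call load_pci_ids() instead of using SAMPLE. Either way it only tells you whether the database has a name for the ID, nothing more.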

I've looked in the installed pci.ids.gz file and on both https://pci-ids.ucw.cz/v2.2/pci.ids and pcidatabase.com

Any idea who the proper person is to add the info into the IDs database? I emailed the admin address at pci-ids.ucw.cz to see what they say.

The PCI ID's absence from that external PCI database package is likely insignificant. Package "pciutils" is not part of the default package set for R7800. I have never used that package. It should have no impact on the normal operation of ath10k.

I think that the ath10k driver (or the firmware blob) has a small database of the supported PCI IDs needed for identifying the radio hardware.

The package Makefile here ( https://github.com/openwrt/packages/blob/master/utils/pciutils/Makefile ) gives the URL http://mj.ucw.cz/sw/pciutils/
That page says

If lspci doesn't recognize some device in your machine and you know what the device is, please submit an update to the database.

That link leads to a page with more details on submitting: http://pci-ids.ucw.cz/

PS: why have you installed pciutils?

That's just it, the "missing board data" messages have left me feeling unsettled. Perhaps I'm making too much of it and going down a rabbit hole. Figured I'd ask you all since you're deep into the builds.

I've been trying the different builds, either my own compiled image or occasionally one of yours, and always seem to end up with some version of this (from 17.01-SNAPSHOT r3356-8b9f7bd7bd):

ath10k_pci 0001:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe from ath10k/QCA9984/hw1.0/board-2.bin

(sometimes it's garbled text or has code points in it)
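To make it easier to compare that message across builds, here is a rough Python sketch that pulls the lookup key and the file name out of the log line. The regular expression and the function name are my own, and I'm only assuming the message always has the "failed to fetch board data for ... from ..." shape shown above. Items without an "=" are skipped, so a garbled key doesn't crash it.

```python
import re

# The log line quoted above, split only for readability.
LOG_LINE = (
    "ath10k_pci 0001:01:00.0: failed to fetch board data for "
    "bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,"
    "subsystem-device=cafe from ath10k/QCA9984/hw1.0/board-2.bin"
)

PATTERN = re.compile(
    r"failed to fetch board data for (?P<key>\S+) from (?P<file>\S+)"
)

def parse_board_request(line):
    """Split the board-data message into its key=value fields and file name."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = dict(
        item.split("=", 1)
        for item in match.group("key").split(",")
        if "=" in item  # tolerate garbled items
    )
    return {
        "key": match.group("key"),
        "fields": fields,
        "file": match.group("file"),
    }

if __name__ == "__main__":
    info = parse_board_request(LOG_LINE)
    print(info["file"])
    for name, value in info["fields"].items():
        print(f"  {name} = {value}")
```

The fields that come out are the same vendor/device/subsystem values that lspci reports, which is why I started comparing the two.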

This is my lack of knowledge here but somehow the chip is being identified with unpublished data that doesn't line up with one or two of the qca9984 related firmware files. I started poking around to see if I could track down what was going on.

So I installed pciutils to use lspci to interrogate the device tree to see if there was anything interesting to glean.

What originally got me onto this is that I've been trying to figure out why the calibration files are seemingly useless, even though they seem important to this chip and other QCAs:

ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0001:01:00.0.bin failed with error -2
ath10k_pci 0001:01:00.0: Falling back to user helper
firmware ath10k!pre-cal-pci-0001:01:00.0.bin: firmware_loading_store: map pages failed
ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/cal-pci-0001:01:00.0.bin failed with error -2

Most of the firmware versions look for just cal-pci files, other versions look for pre-cal-pci, and this version is looking for both. So those seem like somewhat moving targets.
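As a side note on reading those lines: the number after "failed with error" is a negated kernel errno, and -2 is ENOENT, i.e. the file simply isn't there. A small Python sketch that lists which files were requested and decodes the code; the regex and function name are my own, and the sample log is just the lines quoted above.

```python
import errno
import re

LOG = """\
ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0001:01:00.0.bin failed with error -2
ath10k_pci 0001:01:00.0: Falling back to user helper
firmware ath10k!pre-cal-pci-0001:01:00.0.bin: firmware_loading_store: map pages failed
ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/cal-pci-0001:01:00.0.bin failed with error -2
"""

PATTERN = re.compile(
    r"Direct firmware load for (?P<file>\S+) failed with error (?P<code>-?\d+)"
)

def failed_loads(log_text):
    """Return (file, errno name) pairs in the order the driver tried them."""
    results = []
    for match in PATTERN.finditer(log_text):
        code = abs(int(match.group("code")))  # the kernel logs -errno
        results.append((match.group("file"), errno.errorcode.get(code, "UNKNOWN")))
    return results

if __name__ == "__main__":
    for filename, name in failed_loads(LOG):
        print(f"{filename}: {name}")
```

So in this log the pre-cal file is tried first and the cal file second, and both are reported as missing rather than as corrupt.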

Is firmware-5.bin the only important firmware blob for the QCA9984?

You should read the discussion in this thread starting around 22 March, which contains e.g. links to the ath10k development mailing list. Those pretty much cover what information there is about those warnings. The driver/firmware for 9984 does not support OTP identification, and ath10k lists these errors/warnings related to the various methods until it finds a suitable method to get the board data.

But that has nothing to do with external PCI device ID databases.

Thanks, and also thanks for the other patient responses. Please trust me when I say that I have read through the part of this thread you're talking about, and pretty much any other thread that mentions the R7800, ath10k, or QCA9984. That's not to imply that I fully understood everything either. I got tired of reading, then trying something I'd read, over and over with no results. I'm not a big fan of any errors, even supposedly benign ones, and figured I'd chime in about the odd PCI ID thing.

Going briefly OT I'm really digging the project overall, can you or anyone else point me to where I could get into helping/contributing?

@blogic is there any update on qca8k implementation in lede? Really looking forward to testing it.

@hnyman have you noticed that the default network configuration is now based on VLAN interfaces ethX.Y, i.e. eth0.2 for WAN (vlan2) and eth1.1 for LAN (vlan1)?
The problem is that when you try to configure VLANs at the switch level, it totally breaks connectivity until the interfaces are adjusted as well.
I don't think this is expected behavior, and it seems to have been introduced with these commits:
https://git.lede-project.org/?p=source.git;a=commit;h=5e0441aaf0531e18222093e4084f4795fcba2343
https://git.lede-project.org/?p=source.git;a=commit;h=73d923ed6baabe3f8844f13216c50a6383a79a46

Setting wan/LAN interfaces back to eth0/eth1 brings back the former behavior and fixes the issue.

@jow

This has been the normal behaviour for years on most swconfig supported devices.

Is there any advantage to this approach? It messes up the configuration for an ordinary user and can leave the device undiscoverable. Maybe there is a way to adjust the interfaces accordingly when the VLANs are changed on the 'switch' tab in LuCI?

The problem that I have run into:
My ISP provides internet tagged on vlan3. The router has 2 interfaces: eth0 vlan2 - WAN, eth1 vlan1 - LAN.
Default configuration for lede now is eth0.2 - WAN, eth1.1 - LAN.
When I renumber the vlan2 on switch tab to vlan3, wan interface remains on eth0.2 thus breaking the connectivity. This may lead to a situation of completely undiscoverable router when adjusting LAN vlans. Also adding vlans and deleting vlans and restoring the previous state on switch tab breaks interfaces totally, so you have to reconfigure it from scratch.
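To illustrate the mismatch, here is a minimal Python sketch. It is not LuCI or netifd code, just a simplified model with made-up helper names: interface names carry the VLAN ID after the dot, and an interface is "stale" when that ID no longer exists on the switch.

```python
def stale_interfaces(switch_vlans, interfaces):
    """Return logical interfaces whose VLAN ID is not defined on the switch.

    switch_vlans: set of VLAN IDs configured on the switch tab
    interfaces:   mapping of logical name -> device name such as "eth0.2"
    """
    stale = {}
    for logical, ifname in interfaces.items():
        _base, dot, vid = ifname.partition(".")
        if dot and vid.isdigit() and int(vid) not in switch_vlans:
            stale[logical] = ifname
    return stale

def renumber(interfaces, base, old_vid, new_vid):
    """Return a copy where base.old_vid is replaced by base.new_vid."""
    old_name = f"{base}.{old_vid}"
    new_name = f"{base}.{new_vid}"
    return {
        logical: (new_name if ifname == old_name else ifname)
        for logical, ifname in interfaces.items()
    }

if __name__ == "__main__":
    interfaces = {"wan": "eth0.2", "lan": "eth1.1"}  # current LEDE default
    switch_vlans = {1, 3}  # vlan2 renumbered to vlan3 on the switch tab

    print("stale before fix:", stale_interfaces(switch_vlans, interfaces))
    fixed = renumber(interfaces, "eth0", 2, 3)
    print("after renaming:", fixed)
    print("stale after fix:", stale_interfaces(switch_vlans, fixed))
```

In other words, after the switch change the WAN interface has to be moved to eth0.3 by hand, and that second step is what currently isn't obvious.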

I understand that you should have some knowledge when using lede, but this case is not evident.

Is there support for the e-sata port on this router with LEDE?

I have no devices to check, but it should work.

I took a stab at it and made the following PR: https://github.com/openwrt/luci/pull/1175
I'd appreciate if you could help testing this change.

@hnyman we've got dsa driver working, here's the commit if you are interested


After this you'll have interface per port, so you'll have to adjust your setup

Interesting. Have you been able to see any performance benefits yet? Or is this just the initial implementation of the driver that still needs practical tweaking to really work?

Actually I don't know yet. I don't have wired devices powerful enough to pass the 500 Mbit/s mark; I mostly use wireless.
But maybe someone in the thread can test that?