Netgear R7800 exploration (IPQ8065, QCA9984)

zreladdr was still enabled in the config. i just pushed a patch to disable it, please retest the tree. once working i will also look at the nand issue and then the usb issue

I already tested that zreladdr change yesterday. I made that same change manually. I tested the (master + your k49 fixes patch) + (zreladdr y/disabled) x (dissent's patches on/off). Altogether I flashed four variations of the firmware and all failed, but each slightly differently.
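For anyone repeating this, the "x" above is a 2 x 2 cross of two build options. A minimal sketch of enumerating the four builds follows; the option labels are only illustrative names for this thread, not real kernel config symbols or build flags.

```python
from itertools import product

# Illustrative labels only (assumptions), not actual config symbols.
ZRELADDR_STATES = ("enabled", "disabled")
DISSENT_PATCH_STATES = ("applied", "not applied")


def build_matrix():
    """Return every combination of the two options as a list of dicts."""
    return [
        {"zreladdr": zreladdr, "dissent_patches": patches}
        for zreladdr, patches in product(ZRELADDR_STATES, DISSENT_PATCH_STATES)
    ]


if __name__ == "__main__":
    for number, variant in enumerate(build_matrix(), start=1):
        print(number, variant)
```

Keeping a note of which serial log belongs to which combination makes it easier to compare how the failures differ.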

I will attach a serial cable in the next few days to get a better understanding of how the failures differ from each other.

ok, keep me posted. my ipq4019 hw arrived today so i'll start adding support for that as well to the v4.9 series

I've been trying to get LEDE running on my R7800, and I have been running into some issues. I can get TFTP to work, and upload a build that I built myself, but it doesn't seem to run. So I figured I'd connect to the serial console and see if there's any error being reported, but that got me bupkes: either I'm not actually connecting to the serial console, or the boot prom isn't outputting anything to it.

Here are some pictures I took of the R7800: https://plus.google.com/+TedLemon/posts/gVFU3952Ft7

There's a four-prong header that you can see on the middle left of the board as pictured that I'm assuming is the console, although I suppose it could be anything. It ohms out as a console connector—the pin on the left is ground, the pin on the right is +3.3, the center pin to the left appears to be floating low, and the center pin on the right is just under 3.3v. But no output.

I notice that people are including console output here. Are you guys using these pins, or something else?

[quote="antikythera, post:204, topic:285"]
the pin on the left is ground, the pin on the right is +3.3, the center pin to the left appears to be floating low, and the center pin on the right is just under 3.3v.
[/quote]
How do you define left and right? When the "J1" text is readable, the white printed dot on the circuit board is at the left pin (toward the center of the board). Is that the ground, or +3.3?

I have seen earlier with some USB-TTL converters on Netgear routers (WNDR3700) that the serial console did not work if the cable was already attached when the router was powered on. I needed to attach the cable after I turned the power on. Others have reported similar experiences (with a similar USB-TTL converter: Silicon Labs CP210x USB). See e.g. the discussion at https://forum.openwrt.org/viewtopic.php?pid=352348#p352348

[quote="antikythera, post:204, topic:285"]
I notice that people are including console output here. Are you guys using these pins, or something else?
[/quote]
I think that all R7800 output here so far has been from the kernel log of a successfully booted router. The only u-boot boot loader extracts have been from a TP-Link C2600 router that uses the same processor family.

Thanks for the reply! To be honest, looking back at what I wrote, I'm not sure I was reporting accurately. The pin with the white dot is +3.3. I played around with it some more and managed to get it to spew gibberish, which is always a good sign. Unfortunately it doesn't work reliably. I don't know if my serial cable is garbage or if the console is garbage, but the output is sometimes clear at 115200,n,8,1, but sometimes it's total gibberish. Very frustrating. At least this confirms that it is a console. I don't know whether the problem is the console port or the serial cable. Could be that the clock speed isn't quite accurate or something.
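On the clock-speed guess: a rough, idealized estimate of how much baud-rate mismatch an 8N1 link can tolerate is easy to work out. The sketch below assumes a receiver that synchronizes on the start-bit edge and samples each bit at its center, so the last sample (the stop bit) must not drift by more than half a bit. Real UARTs have less margin than this, and the function names are made up for illustration.

```python
def max_clock_mismatch(data_bits=8, parity_bits=0):
    """Idealized upper bound on relative baud mismatch for one frame.

    The stop bit is sampled 1 (start) + data + parity + 0.5 bit times
    after the start edge; the accumulated drift there must stay below
    half a bit time.
    """
    last_sample_bit_times = 1 + data_bits + parity_bits + 0.5
    return 0.5 / last_sample_bit_times


def is_within_tolerance(actual_baud, nominal_baud=115200):
    """True if the relative mismatch is under the idealized 8N1 bound."""
    mismatch = abs(actual_baud - nominal_baud) / nominal_baud
    return mismatch < max_clock_mismatch()


if __name__ == "__main__":
    print(f"idealized 8N1 limit: {max_clock_mismatch():.2%}")
    print(is_within_tolerance(117000))  # about 1.6% off
    print(is_within_tolerance(125000))  # about 8.5% off
```

Under these assumptions the limit is a little over 5%, so a small clock error alone would normally not produce garbage. Output that is clean only some of the time points more toward wiring, grounding or the adapter.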

@reiffert has experience setting up a serial console on the Netgear R7500, which I suppose has the same schematics. Maybe he could help.

Btw I've adjusted dts for all ipq806x nand devices on top of blogic's staging tree if someone with console could provide a bootlog.
https://github.com/dissent1/r7800/tree/blogic1

Thanks. Poking around online yielded the suggestion that the USB cable may not be properly grounded, and that may be creating sufficient noise to render the output unintelligible. I will try to find a better serial port. Not the easiest thing to do these days, unfortunately.

Likely it is something related to your USB serial cable, as I got my serial cable attached without any problems. I bent the three needed pins (ground, rx, tx), which are luckily next to the circuit board's edge, so that I could attach the serial cable connector without separating the circuit board, heatsink and cover from each other:

Using the markings in my USB-ttl converter, the pin order starting from the edge is: ground, rx, tx

For me the serial connection was successful on the first try. No garbage.

I saved the kernel 4.4 bootlog starting from u-boot boot loader as a baseline:
https://gist.github.com/hnyman/8a68f6280ef5d267bba2ee80601fe8f0

EDIT:
I documented the case opening process in message #2 of this thread:

@blogic @dissent1

bootlogs from R7800 for you:

Nice. I was thinking of taking a dremel to the heat sink. :slight_smile:

Nice one, thanks!
Seems my changes are correct but nand still fails, here is the relevant part

[ 0.673183] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xa1
[ 0.673216] nand: Micron MT29F1G08ABBEAH4
[ 0.679900] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[ 0.683807] 10 ofpart partitions found on MTD device qcom_nand.0
[ 0.691274] Creating 10 MTD partitions on "qcom_nand.0":
[ 0.697337] 0x000000000000-0x000000c80000 : "qcadata"
[ 0.708380] random: fast init done
[ 0.710098] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 0.710878] pgd = c0204000
[ 0.719188] [00000004] *pgd=00000000
[ 0.725022] Internal error: Oops: 5 [#1] SMP ARM
[ 0.725280] Modules linked in:
[ 0.732829] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.13 #0
[ 0.732920] Hardware name: Generic DT based system
[ 0.738651] task: dd44c000 task.stack: dd446000
[ 0.743520] PC is at submit_descs+0x20/0x4c
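When comparing several failing bootlogs, it can help to pull out just the fault summary from each. The following is a minimal sketch that assumes the log is available as plain text in the format shown above; `summarize_oops` and the dictionary keys are hypothetical names, and the regular expressions only cover the lines seen in this excerpt.

```python
import re

# A few lines copied from the excerpt above.
SAMPLE_LOG = """\
[ 0.697337] 0x000000000000-0x000000c80000 : "qcadata"
[ 0.710098] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 0.725022] Internal error: Oops: 5 [#1] SMP ARM
[ 0.732829] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.13 #0
[ 0.743520] PC is at submit_descs+0x20/0x4c
"""

FAULT_RE = re.compile(
    r"Unable to handle kernel (.+?) at virtual address ([0-9a-fA-F]+)"
)
PC_RE = re.compile(r"PC is at ([\w.]+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")
KERNEL_RE = re.compile(r"Not tainted (\S+)")


def summarize_oops(log_text):
    """Collect fault type, faulting address, PC symbol and kernel version."""
    summary = {}
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            summary["fault"] = fault.group(1)
            summary["address"] = int(fault.group(2), 16)
        pc = PC_RE.search(line)
        if pc:
            summary["pc_symbol"] = pc.group(1)
        kernel = KERNEL_RE.search(line)
        if kernel:
            summary["kernel"] = kernel.group(1)
    return summary


if __name__ == "__main__":
    print(summarize_oops(SAMPLE_LOG))
```

Running this over each variant's log shows at a glance whether the faulting function differs between builds.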

I am not so sure that the nand driver is the actual problem, as the error is virtually the same with your patches included and without them. In both cases the router stumbles into a memory addressing problem, but the call stack leading to the error is different.

With your patches the nand is recognised and the first partition is correctly named and its size is ok. But at that point the router runs into a memory addressing problem :frowning:
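The size check can be done straight from the partition line in the log. A small sketch follows, assuming the range is printed as start and end offsets with the end exclusive, and using the 128 KiB erase size reported by the nand driver above; `describe_partition` is a hypothetical helper name.

```python
import re

PART_RE = re.compile(r'0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) : "([^"]+)"')
ERASE_SIZE = 128 * 1024  # "erase size: 128 KiB" from the log above


def describe_partition(line, erase_size=ERASE_SIZE):
    """Parse one MTD partition line into name, size and alignment info."""
    match = PART_RE.search(line)
    if match is None:
        raise ValueError("not an MTD partition line")
    start = int(match.group(1), 16)
    end = int(match.group(2), 16)  # assumed exclusive end offset
    size = end - start
    return {
        "name": match.group(3),
        "size_bytes": size,
        "size_mib": size / (1024 * 1024),
        "erase_blocks": size // erase_size,
        "aligned": start % erase_size == 0 and end % erase_size == 0,
    }


if __name__ == "__main__":
    line = '[ 0.697337] 0x000000000000-0x000000c80000 : "qcadata"'
    print(describe_partition(line))
```

For the "qcadata" line this gives 12.5 MiB, which is 100 erase blocks, aligned on both ends.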

The point is that I experienced similar behavior while backporting the nand driver to k4.4, so these errors should be connected somehow

the scm boot loop is caused by the firmware not being loaded properly. i fixed this on ipq8064 by removing the ram clock. i am wondering why it fails on ipq8065. @hnyman i'll send you a test/debug patch tomorrow to try and narrow this down.

Guys, I can only offer you my gratitude on working so hard on this issue.
Rest assured dozens of people are very grateful for the hard work you are putting into this.

@hnyman @dissent1

please retest, felix just pushed a cleaned up version of my tree. i think it should all be functional now. the SCM crash is fixed as well as nand. we are working on ethernet and IPQ4019 support now

Thanks.
Looks like a partial success.
R7800 boots ok and seems stable. Also e.g. CPU temperature reading works. There is ok-looking fluctuation in CPU frequency, so probably also CPU frequency scaling works, although I see some errors in the boot log.

But I seem to have lost wireless completely. At first glance I see no trace of wireless in the bootlog. Something wrong in the DTS ???

Bootlog from R7800 with kernel 4.9 at r3738:
https://gist.github.com/hnyman/89d4a144b37edd112969112f8f2d8f39

hooray ... pci is not getting probed, let me check ...