Overclocking router devices

I haven't opened mine yet, but there are some board photos available here. Do you see anything that looks like JTAG here?

There seems to be no JTAG header on the board...
But the NAND is not a BGA package, so desoldering and resoldering it is not that hard...

I think there is JTAG near the CPU:
[image]

That one looks like the wireless CPU, no?

Looks like the QCA9882 wireless chip...
I don't think these pads labeled tpXX are a JTAG header...

QCA9531 board (TL-MR22U) at CPU: 1000MHz, RAM: 600MHz, AHB: 400MHz, SPI: 25MHz. It seems stable, but more thorough tests should be done. With an AWUS036NHA hooked to its USB port, an iperf test reached 93Mbps.

I tried to set 1100MHz: U-Boot starts but doesn't load the image. This is based on @pepe2k's u-boot_mod: http://bit.do/eoEVn

About JTAG for the R6100 (AR9344), the datasheet lists:
Pin   | Signal | Direction | Description
GPIO0 | TCK    | Input     | JTAG clock
GPIO1 | TDI    | Input     | JTAG data input
GPIO2 | TDO    | Output    | JTAG data output
GPIO3 | TMS    | Input     | JTAG test mode select

  • LED_WLAN has GPIO0
  • BTN_WIRELESS has GPIO1
  • you still need to find GPIO2; maybe it is pulled to ground/power via a resistor
  • BTN_WPS has GPIO3 for this board.
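
To keep track of which JTAG signals are already located on this board, the mapping above can be written as a small lookup table. This is only a sketch in Python: the dictionary and function names are made up for illustration, and the data just repeats what is stated above (GPIO2/TDO has not been found yet).

```python
# JTAG signal per GPIO, as quoted from the AR9344 datasheet above.
JTAG_PINS = {
    "GPIO0": ("TCK", "input"),
    "GPIO1": ("TDI", "input"),
    "GPIO2": ("TDO", "output"),
    "GPIO3": ("TMS", "input"),
}

# Where each GPIO ends up on the R6100 board according to the list above.
# None means "not located yet".
BOARD_NETS = {
    "GPIO0": "LED_WLAN",
    "GPIO1": "BTN_WIRELESS",
    "GPIO2": None,
    "GPIO3": "BTN_WPS",
}


def jtag_wiring():
    """Return (signal, gpio, board net) rows, sorted by GPIO name."""
    rows = []
    for gpio in sorted(JTAG_PINS):
        signal, _direction = JTAG_PINS[gpio]
        rows.append((signal, gpio, BOARD_NETS[gpio]))
    return rows


def missing_signals():
    """JTAG signals whose location on the board is still unknown."""
    return [signal for signal, _gpio, net in jtag_wiring() if net is None]


if __name__ == "__main__":
    for signal, gpio, net in jtag_wiring():
        print(f"{signal:4} {gpio:6} {net or '??? (still to be found)'}")
```

Running it prints one line per signal and marks TDO as still to be found.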

For how long have you been running your QCA9531 at those clocks? I mean, how feasible would it be to keep it like that for long periods?

I have an AR9344 (WDR4320) stable at CPU: 760MHz RAM: 450MHz AHB: 250MHz. I had the CPU clocked at 800MHz before, stable as well, but my common sense kicked in. Anything over 450MHz for RAM meant heavy instability.

Until you confirm that with a scope (you can use the CLK_OBS feature) or some other way, I strongly doubt that your clocks actually run at those frequencies.


OpenSSL benchmarks!

I would also like to see some benchmarks...

Here I have some simple benchmarks and also a simple memtest.
All three have a directory named precompiled with statically linked precompiled binaries for mips32r2.
The two benchmarks also have a directory with results from some of my devices.

Please show us some comparisons between default clocks and oc clocks...




A few times for several hours without rebooting, changing the config or doing anything else with the device, but I only started experimenting with this a few days ago. Maybe I will do a long-period test...

In my case, at a 400MHz RAM clock the USB port could do 65Mbps at best, and at 600MHz it does 95Mbps. But that throughput drops when the onboard WiFi is active and the USB device receives data for more than 20 seconds. Probably an overheating issue. The system stays stable though...
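
As a quick sanity check on those two figures, the relative gains can be compared. A minimal Python sketch using only the numbers from this post; the helper name is made up, and whether USB throughput is really limited by the RAM clock is not confirmed here.

```python
def percent_gain(before, after):
    """Relative increase from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


ram_gain = percent_gain(400, 600)  # RAM clock in MHz
usb_gain = percent_gain(65, 95)    # best-case USB throughput in Mbps

if __name__ == "__main__":
    print(f"RAM clock:      +{ram_gain:.1f}%")
    print(f"USB throughput: +{usb_gain:.1f}%")
```

The throughput gain comes out a bit below the clock gain, so the scaling is close to linear but not quite.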

I can see the boot time reduced by several seconds: right after I log in to the device, its uptime is 17 sec. Most LuCI pages load almost instantly (except the processes and startup script pages).
I will try to do some OpenSSL benchmarks, as the guys here suggest.

BTW, you should see what's left of this 2A Samsung charger (which in reality delivered 1.6A) https://ae01.alicdn.com/kf/UT8BlvNXoBbXXagOFbXN.jpg the morning after I used it to charge the router battery and left the router powered on overnight at 1000MHz with WiFi on at the default 19dBm...

Use a scope, or check whether the time base on the router is correct.

I see. I've made a ghetto mod on the WDR4320: the SoC is actively cooled by a 55mm copper heatsink with a fan, and there are smaller heatsinks on top of every other chip. The case is half open so hot air doesn't get trapped. Looks funny but it works.

Tonight I'll try running at 900MHz CPU and stress it/benchmark for some minutes to see if it throttles.

The precompiled binaries are very easy to use.
Copy only the binary to /tmp and run it from there... They have no dependencies because they are statically linked.
The memtester is very good for making sure no bit errors appear.

If you can easily open your device, check the SoC temperature with an infrared thermometer...

quick test result with clocks 1000/600/200:

root@OpenWrt:/tmp# ./coremark_mips32r2
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 13122
Total time (secs): 13.122000
Iterations/Sec   : 2286.236854
Iterations       : 30000
Compiler version : GCC5.4.0
Compiler flags   : -O2 -s -static   -lrt
Memory location  : Please put data memory location here
			(e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x5275
Correct operation validated. See readme.txt for run and reporting rules.
CoreMark 1.0 : 2286.236854 / GCC5.4.0 -O2 -s -static   -lrt / Heap
root@OpenWrt:/tmp# ./membench_4.6MB
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 200000 (elements), Offset = 0 (elements)
Memory per array = 1.5 MiB (= 0.0 GiB).
Total memory required = 4.6 MiB (= 0.0 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 2 microseconds.
Each test below will take on the order of 21856 microseconds.
   (= 10928 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             393.4     0.008231     0.008134     0.008478
Scale:            101.5     0.032716     0.031537     0.034919
Add:              197.0     0.025135     0.024365     0.026570
Triad:             85.5     0.056325     0.056163     0.056508
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------

root@OpenWrt:/tmp# ./membench_11.4MB
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 500000 (elements), Offset = 0 (elements)
Memory per array = 3.8 MiB (= 0.0 GiB).
Total memory required = 11.4 MiB (= 0.0 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 2 microseconds.
Each test below will take on the order of 55570 microseconds.
   (= 27785 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             390.8     0.020517     0.020473     0.020750
Scale:             97.3     0.084234     0.082253     0.085347
Add:              182.8     0.067994     0.065634     0.071910
Triad:             85.2     0.141840     0.140804     0.144222
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
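
For anyone who wants to cross-check the output above: CoreMark's Iterations/Sec is iterations divided by total time, and STREAM's "Best Rate" is the bytes moved per kernel divided by the min time (Copy and Scale touch two arrays, Add and Triad three, with 8 bytes per element and MB meaning 10^6 bytes). The Python sketch below recomputes the values of the first membench run. The function names are only for illustration, and note that this checks internal consistency only, since all times come from the router's own timer.

```python
BYTES_PER_ELEMENT = 8  # "This system uses 8 bytes per array element."

# Number of arrays each STREAM kernel reads or writes per element.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def coremark_score(iterations, total_seconds):
    """CoreMark score: iterations per second."""
    return iterations / total_seconds


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """STREAM best rate in MB/s (1 MB = 10^6 bytes)."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * array_size
    return bytes_moved / min_time_s / 1e6


if __name__ == "__main__":
    print(f"CoreMark: {coremark_score(30000, 13.122):.6f} iterations/sec")
    # Min times copied from the membench_4.6MB run (200000 elements).
    min_times = {
        "Copy": 0.008134,
        "Scale": 0.031537,
        "Add": 0.024365,
        "Triad": 0.056163,
    }
    for kernel, seconds in min_times.items():
        rate = stream_rate_mb_s(kernel, 200000, seconds)
        print(f"{kernel:6} {rate:6.1f} MB/s")
```

The recomputed rates match the reported table, so the tools are at least self-consistent; that still says nothing about whether the timer itself is right.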

It looks fairly realistic that you have 1000 MHz on your CPU and 600 MHz on RAM if I compare the values with an AR9344@750MHz.

But your system timer probably isn't accurate, and in that case the benchmark results are not valid.

@juppin what will happen if your CPU clock runs @ 500 MHz but the kernel thinks that it runs @ 1000 MHz (in the case of ath79/mips, the time base is derived from the CPU clock)? Small hint: all results from time-based tests are wrong.
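
To put numbers on that hint, here is a small Python sketch. It assumes a simple model in which the timer counts at a rate proportional to the real CPU clock while the kernel converts ticks to seconds using the clock it believes in. The 500 MHz vs 1000 MHz figures are the hypothetical from the question above, not a measurement, and the function names are made up.

```python
def reported_seconds(real_seconds, actual_hz, assumed_hz):
    """Seconds the kernel reports for an interval of real_seconds.

    The counter advances with the actual clock, but the kernel divides
    the tick count by the clock it assumes.
    """
    ticks = real_seconds * actual_hz
    return ticks / assumed_hz


def real_rate(reported_rate, actual_hz, assumed_hz):
    """Correct a per-second rate (e.g. iterations/sec) for a wrong time base."""
    return reported_rate * actual_hz / assumed_hz


if __name__ == "__main__":
    actual, assumed = 500e6, 1000e6  # hypothetical clocks from the post
    print(f"10 real seconds are reported as {reported_seconds(10, actual, assumed):.1f} s")
    corrected = real_rate(2286.236854, actual, assumed)
    print(f"CoreMark 2286.24 reported would really be {corrected:.2f}")
```

In that hypothetical case every reported time is halved and every rate is doubled, which is why a benchmark timed on the device cannot prove the clock setting.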


Yes, that was also my conclusion at the same moment. :)

We can only be sure if someone measures it with an oscilloscope...

A benchmark that always executes the same code, with the time between the start and end prints measured on the PC side (e.g. with PuTTY), could be much better than these benchmarks that rely on the systick timer...
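
A rough sketch of that idea in Python, running on the PC: timestamp the console lines with the PC's clock as they arrive and measure the time between a start and an end marker. The marker strings and function names are placeholders I made up, and serial or SSH buffering adds some error, so the run should be long compared to that latency.

```python
import time


def elapsed_between_markers(lines, start_marker, end_marker, clock=time.monotonic):
    """Host-side seconds between the first line containing start_marker
    and the next line containing end_marker. Returns None if either
    marker never shows up."""
    started_at = None
    for line in lines:
        now = clock()  # timestamp taken on the PC, not on the router
        if started_at is None:
            if start_marker in line:
                started_at = now
        elif end_marker in line:
            return now - started_at
    return None


def clock_ratio(device_seconds, host_seconds):
    """Device-reported time divided by host-measured time.

    About 1.0 means the router's time base looks right; about 0.5 would
    mean the router counts time half as fast as it should.
    """
    return device_seconds / host_seconds


if __name__ == "__main__":
    # Simulated console output with a fake clock, just to show the flow.
    fake_times = iter([0.0, 1.0, 2.5, 9.0])
    console = ["booting", "BENCH START", "working...", "BENCH END"]
    host_s = elapsed_between_markers(
        console, "BENCH START", "BENCH END", clock=lambda: next(fake_times)
    )
    print(f"host-side elapsed: {host_s:.1f} s")
    print(f"ratio vs. a device-reported 4.0 s: {clock_ratio(4.0, host_s):.2f}")
```

For a real run, any iterable of lines works as `lines`, for example `sys.stdin` with the console output piped in, and the markers could come from wrapping the benchmark like `echo BENCH START; ./coremark_mips32r2; echo BENCH END` on the router.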