Shaping performance

Naively, I would think that if we package the whole thing as an opkg we could use opkg's automatic dependency resolution (but I have a feeling I am getting ahead of myself here). Personally, I am not fluent in Lua, but I think the script is still readable enough for sufficiently advanced users to convince themselves that it is benign.

One thing though: if we go and parse /proc/stat, we might as well bite the bullet and extract %idle for each CPU individually, so we can also deal with multicore SoCs gracefully. Busybox top does not do that, which is actually a rationale for doing this programmatically rather than by visual inspection of top output, even though the latter has the advantage of requiring no additional tools.

Regarding top, since I am on a tangent already: I wonder whether the screen updates would cause a noticeable ssh load that eats into the shapeable bandwidth? (Well, I am sure in principle it will; I am just not sure how big the cost is, and whether that is not actually a reasonable "simulation" of a router doing a few other chores in addition to traffic shaping.)
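
To illustrate what I mean by per-CPU %idle, here is a rough, untested sketch; it assumes the standard /proc/stat layout, where the fifth column of each cpuN line is the cumulative idle counter, and computes idle% between two snapshots as delta(idle) / delta(total):

## per-CPU %idle between two /proc/stat snapshots (one snapshot per file)
awk '/^cpu[0-9]/ {
   total = 0
   for (i = 2; i <= NF; i++) total += $i
   if ($1 in prev_total) {
      didle = $5 - prev_idle[$1]
      dtotal = total - prev_total[$1]
      if (dtotal > 0) printf "%s idle%%: %.1f\n", $1, 100 * didle / dtotal
   }
   prev_idle[$1] = $5
   prev_total[$1] = total
}' first_sample second_sample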

I know basically nothing about creating opkg-type packages; @richb-hanover, on the other hand, seems to have some background here. Here's what I'll try to do: I'll create some kind of skeleton and post it here as preformatted code, we can discuss it, and go from there. It might stretch over several days at this point, though. I'll try to get something basic up today.

@moeller0, I like your suggestion of doing per-CPU stats to some extent, but I'm not sure what we're actually after when we do it that way. Are we looking for the minimum idle across all CPUs throughout the duration of the test? Does that accurately represent the "bottleneck point" for SQM?

As a first approximation, yes, that would be my approach. Thinking about it, it might make sense to store not only the minimum but also a time series of all samples for all CPUs, to allow occasional confirmation that assumptions like "take the minimum over all CPUs" still hold true (the time series might be best kept for all values, maybe expressed as deltas).

Good question. I would guess that the SQM CPU under load should show low idle and high sirq (just like the wlan CPU would too, if these tasks actually ended up being bound to different CPUs).

@dlakelan, how about we start in a hackish fashion and simply append* "date +%s%N >> stat_output.txt ; cat /proc/stat >> stat_output.txt" and then parse this with a lua script after the measurement is done? That gives us the freedom to play with the different variables under much less time pressure. Note that this will require a real date binary, as busybox date seems to insist upon reporting just seconds...
We can combine this with your idea of using GNU sleep to run this, say, every 100 ms, to get reasonably high resolution?

*) This is the gist of the recording; it might still make sense to do it from a script so that other data, like interface traffic counters and wlan rates, can also be collected, timestamped, and saved out.

As a first pass this seems reasonable; let's also collect /proc/net/dev so we can figure out bandwidths.
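
Bandwidths could then be derived in the analysis step from consecutive /proc/net/dev samples. A rough, untested awk sketch (it assumes the standard layout where, after the colon, field 1 is rx_bytes and field 9 is tx_bytes, and it hard-codes the proposed 0.1 s sample interval):

## per-interface throughput between two /proc/net/dev snapshots (one per file)
awk -F: 'NF == 2 {
   iface = $1; gsub(/ /, "", iface)
   split($2, f, " ")
   if (iface in rx)
      printf "%s rx: %.0f bit/s tx: %.0f bit/s\n", iface, (f[1] - rx[iface]) * 8 / 0.1, (f[9] - tx[iface]) * 8 / 0.1
   rx[iface] = f[1]; tx[iface] = f[9]
}' first_sample second_sample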

You guys are amazing! So glad you're figuring it out. I'm just going to watch from the sidelines for now, but I think this will be a really useful feature in the wiki once it's sorted out.

Here is basic shell code to get started:

#!/bin/sh

## truncate the output file
echo "" > /tmp/stat_output.txt
i=1

echo "Beginning data collection. Run speed test now."

while [ "$i" -lt 200 ]; do ## ~20 seconds of data collection
   date +%s%N >> /tmp/stat_output.txt
   cat /proc/stat >> /tmp/stat_output.txt
   echo "" >> /tmp/stat_output.txt ## blank line for separating two files
   cat /proc/net/dev >> /tmp/stat_output.txt
   sleep 0.1 ## fractional sleep needs GNU sleep (coreutils-sleep), see below
   i=$((i+1))
done

echo "Data collection done. See data file in /tmp/stat_output.txt"

## run analysis script here

@moeller0, can you run this on your router during the speed test and then send me the collected data file? I will start by analyzing it in R and seeing what info it really shows us; perhaps there is a methodology problem that it would be useful to know about before coding up a lua script.

@moeller0

What if we just collected this kind of file and uploaded it to an "analyzer"? At the end of the script we basically ask "Do you want to upload the statistics file to contribute to the OpenWRT performance estimator?"; if the user says "Y", we wget --post-file= it up to some website that runs the parser/analyzer and adds the results to a database. The user is free to decide not to contribute, but we don't support analysis of the data on the router. A major reason for that split is that analyzing the file on the router requires running parser/analysis code that is much less auditable. It's much easier for people to convince themselves "gee, the data collected is innocuous, I could contribute it no problem" than "gee, this complicated lua parser/analyzer is OK to run as root on my router".

I'd say it's no problem to make the whole database be public information (say a sqlite file that you can download), and of course the results of our analysis would be used to predict max bandwidth for each router in the ToH.

To do it this way, I'd basically just add a header on top of the file... something that has, say, the kernel version, SQM package version, router HW specifications, and a sha256sum hash of some unique identifier, such as the concatenation of /proc/version, /etc/openwrt_release, and the MAC address of eth0.
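
For concreteness, the header could be generated with something like this (an untested sketch; the field markers, file names, and the use of the sqm-scripts package name are my assumptions, not a settled format):

## untested sketch: write an identifying header, then append the sampled data
{
   echo "###kernel"
   cat /proc/version
   echo "###release"
   cat /etc/openwrt_release
   echo "###sqm-version"
   opkg status sqm-scripts | grep '^Version'
   echo "###anon-id"
   ## stable per-router id that does not expose the raw MAC address
   cat /proc/version /etc/openwrt_release /sys/class/net/eth0/address | sha256sum
} > /tmp/upload.txt
cat /tmp/stat_output.txt >> /tmp/upload.txt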

Okay, on my Netgear WNDR3700v2, after:
opkg update ; opkg install coreutils-date coreutils-sleep

I ran:

#!/bin/sh

## truncate the output file
echo "" > /tmp/stat_output.txt
i=1

echo "Beginning data collection. Run speed test now."

while [ "$i" -lt 600 ]; do ## 60 seconds of data collection
   echo "###date" >> /tmp/stat_output.txt
   date +%s%N >> /tmp/stat_output.txt
   echo "###/proc/stat" >> /tmp/stat_output.txt
   cat /proc/stat >> /tmp/stat_output.txt
   echo "" >> /tmp/stat_output.txt ## blank line for separating two files
   echo "###/proc/net/dev" >> /tmp/stat_output.txt
   cat /proc/net/dev >> /tmp/stat_output.txt
   sleep 0.1
   i=$(( ${i} + 1 ))
done

echo "Data collection done. See data file in /tmp/stat_output.txt"

## run analysis script here

exit 0

The speedtest result is here:
http://www.dslreports.com/speedtest/32640248

SQM info:

root@nacktmulle:~# cat /etc/config/sqm

config queue
	option debug_logging '0'
	option verbosity '5'
	option upload '9545'
	option linklayer 'ethernet'
	option overhead '34'
	option linklayer_advanced '1'
	option tcMTU '2047'
	option tcTSIZE '128'
	option tcMPU '64'
	option qdisc_advanced '1'
	option ingress_ecn 'ECN'
	option egress_ecn 'NOECN'
	option qdisc_really_really_advanced '1'
	option squash_dscp '0'
	option squash_ingress '0'
	option download '46246'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option interface 'pppoe-wan'
	option linklayer_adaptation_mechanism 'default'
	option iqdisc_opts 'nat dual-dsthost ingress mpu 64'
	option eqdisc_opts 'nat dual-srchost mpu 64'
	option enabled '1'

root@router:~# uname -a
Linux router 4.9.91 #0 Tue Apr 24 15:31:14 2018 mips GNU/Linux

root@router:~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r6755-d089a5d773'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r6755-d089a5d773'
DISTRIB_TAINTS='no-all busybox'

root@router:~# cat /proc/version
Linux version 4.9.91 (perus@ub1710) (gcc version 7.3.0 (OpenWrt GCC 7.3.0 r6687-d13c7acd9e) ) #0 Tue Apr 24 15:31:14 2018

Script and result file can be found under http://workupload.com/archive/YPzFKTt (please note that this is the first time I have tried a free file hoster, so let me know if this thing is malicious); here are the hash values for the uploaded files, to make modifications detectable:

bash-3.2$ openssl sha1 proc_sampler.sh 
SHA1(proc_sampler.sh)= beff294dba4b573f140950b5756af63d09509aad
bash-3.2$ openssl sha1 stat_output.txt 
SHA1(stat_output.txt)= 16cd13ea610861005cd6f8b4c973244322d1fff0

I started the speed test after around 8 seconds, but the speed test itself has a rather long preparation phase. The measurement was done over the 5 GHz radio. The upload test shows some issues, but the goal here is not to get a perfect measurement, but rather an initial, reasonably well-documented data set for testing whether our discussed approach has merit :wink:

Many Thanks

This sounds great. As a first step, I would say let's get an analyzer coded and tested, and then we can just distribute that to interested users. That will allow anything under the sun as analysis software, be it R, Octave, or, for the so inclined, even Forth :wink:

What, no LOLCODE? https://en.wikipedia.org/wiki/LOLCODE

Quick thought - does this code account for multi-core processors? Will a dual-core always look 50% idle when only one CPU is working? I just don't know enough about how CPU utilization is calculated here.

The /proc/stat output gives per-cpu usage, so we should be able to account for multi-core at least somewhat. We'll see how things go. All of that is down to the analysis code rather than the data collection code.

Probably nothing malicious, but it's hard to see the files (there's too much clicking required).

I'm partial to pastebin.com (and there are tons of others...). You can post files anonymously (write-only), or sign up for an account, which allows you to edit your posted files.

Re the workupload link: I downloaded the zip file and uncompressed it without any issues. So at least for the purposes of putting together a parser/data-extractor program I'm set; now I just need a little block of time, which probably won't happen until next week given all the various issues I'm handling right now.

I haven't forgotten this project, but I've been extremely busy with personal and family activities. So here are some thoughts I've had during the downtime:

The format of these various files is not particularly uniform or easy to parse; hand-writing a parser will involve a lot of ad-hoc if-then logic: read n bytes from here, skip q bytes, read to the end of the line, etc.

So I had the thought that perhaps an ANTLR grammar with a JavaScript target would make sense: let a machine write the parser code, then just walk the parse tree, extract the key bits of information, and slam them into a sqlite file. Once the sqlite file is created, I'd proceed with analyzing the data in R.

So, any thoughts? Does anyone have specific experience with ANTLR?

To give an idea how this might look, here's the skeleton of the grammar:

STATFILE : STATENTRY* EOF ;

STATENTRY: DATE STAT PROCDEV ;

DATE : COMMENT* INT ;

STAT : COMMENT* CPUENT* IGNORABLE ;

PROCDEV : COMMENT* PDHEADER INTFC ;

INTFC : WS* INAME ":" (WS* INT)* ;

etc etc

then ANTLR does its magic, you run it all, and then extract just the juicy bits you care about: you look at every STATENTRY, grab its associated date, the CPUENTs, and the INTFC info, do a calculation, and then store a row in the SQL table.
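
On the database side, the table could be as simple as something like this (a hypothetical layout; the file and column names are placeholders, and it assumes the sqlite3 command-line tool is available):

## hypothetical schema for the extracted per-sample rows
sqlite3 perf.db "CREATE TABLE IF NOT EXISTS samples (ts_ns INTEGER, cpu TEXT, idle_pct REAL, iface TEXT, rx_bytes INTEGER, tx_bytes INTEGER);"
## one row per timestamped sample, e.g.:
sqlite3 perf.db "INSERT INTO samples VALUES (1525000000000000000, 'cpu0', 87.5, 'eth0', 123456789, 23456789);"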

An alternative idea I had was to use something like m4 to text-process the raw data file into something much simpler to parse, and then manually parse it in R.

For example, it could strip out all the irrelevant lines and irrelevant numbers on given lines, and leave you with a file that looked like:

{
time : 123456,
cpuIdle : {193331,19300,19331},
intface : {{"lo",...},{"eth0",...}...},
...
}

It seems like it would not be too hard to make the output compliant JSON and then parse it using a JSON parser. Most of the m4 macros would be responsible for just reading lines and deleting them :wink: the other macros would be responsible for transforming things like

lo : 1999 1999 1999 0 0 0 0 0 ...

into

{'lo',1999,1999,1999,0,0,0,0,0,...}

or some such thing
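
For comparison, here is roughly the same transformation prototyped in awk instead of m4 (an untested sketch; note that in the real /proc/net/dev the colon is glued to the interface name, which is why the name needs trimming):

## turn each interface counter line into a braced tuple, drop everything else
awk -F: -v q="'" 'NF == 2 {
   iface = $1; gsub(/ /, "", iface)
   n = split($2, f, " ")
   printf "{%s%s%s", q, iface, q
   for (i = 1; i <= n; i++) printf ",%s", f[i]
   print "}"
}' /proc/net/dev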

The text-preprocessing route has considerable appeal in that it probably involves a lot less overhead; on the other hand, m4 macros aren't exactly the easiest thing to write, debug, or understand. Still, considering the complexity of getting ANTLR to do this relatively simple task, I'm kinda in favor of this approach.

+1 to that; I know that using an intermediary format like JSON is not going to save any work in writing the initial parser, but it will make it much easier to switch between different consumers of the parsed data. Plus, unlike sqlite, those files remain easy to read without any additional tools. But I have no knowledge whatsoever of ANTLR or m4, so all I can do is watch and be duly impressed.