Could someone tell me the status of using OpenSSL 1.1? I read that the API has changed in such a way that it breaks a lot of packages used by OpenWRT. The last posts on this are almost a year old. Depending on what is still broken, I would like to give the newer version a try.
If someone could provide me with a Makefile, that would be excellent.
Ah, I was specifically interested in the AF_ALG engine additions. I found some old sources to add that as a plugin for v1.0.2, so I will have a go at those.
I have the whole tree compiling with OpenSSL 1.1.0. I haven't tested everything, but it runs everything I need. If you're interested, let me know and I can send you the patches.
It's a lot of patches, and I don't have them online on GitHub. I've done a quick git diff from my local branch. See what you can do. There will be changes unrelated to OpenSSL, but it compiles 100% on mips (brcm47xx and ramips).
Here they are:
They are all based on current development master branch. I'll post the next diffs in a new post. As a newly registered user, I can only do two links, but you can figure out the filenames for the routing and telephony feeds, just in case.
Here are the other two feeds: routing feed (commit fdaa4cde3b2c105dcad7a874826910e4309748a2) telephony feed (commit 3dd2e9183c4844902738fe89643c685c74aab83b)
Thanks @cotequeiroz,
Testing with OpenSSL 1.1.0h on mipsel yields a 5-10% performance increase across the board, and a negligible increase on mips. The biggest gain comes from the "new" chacha20 suites, which run more than twice as fast as AES. I really hope the move to 1.1 happens, but looking at the incompatible packages, it may take some time yet. For now, since I have the free space needed on my devices, I run a combination of the two libraries.
Just out of curiosity, has anyone tried compiling with libressl?
I finally got around to having a look at @cotequeiroz's patches. I was most interested in the AFALG engine, to get more native hardware crypto support via the standard API.
It turns out I still need to compile with the cryptodev flags enabled, and the cryptodev module needs to be loaded.
This defeats the purpose of having the AFALG engine. I was hoping for a slightly smaller overall footprint, with less stuff included and no more need for cryptodev.
I can't really test AFALG since I don't have any hardware that supports it, but, at least the build system doesn't force AFALG/cryptodev dependency. You should be able to build them independently of each other. Just to make sure, you're building it with the latest versions of the patch, from https://github.com/openwrt/openwrt/pull/965, right?
Cheers,
I did some more testing. It seems that OpenSSL needs to be compiled with "HAVE_CRYPTODEV", otherwise no offloading to the hardware occurs. Even with this flag set during compilation, no hardware offloading occurs unless the cryptodev module is loaded, even when specifying AFALG as the engine.
root@OpenWrt:~# time -v openssl speed -elapsed -evp aes-256-cbc -engine afalg
engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 897941 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 264831 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 68595 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 17492 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 2150 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 1036 aes-256-cbc's in 3.00s
OpenSSL 1.1.0h 27 Mar 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DOPENSSL_API_COMPAT=0x10100000L -DOPENSSL_SMALL_FOOTPRINT -DOPENSSL_NO_ASYNC -DHAVE_CRYPTODEV -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\"" -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/usr/include -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/usr/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include/fortify -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type                   16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc            4789.02k     5649.73k     5853.44k     5970.60k     5851.43k     5657.94k
Command being timed: "openssl speed -elapsed -evp aes-256-cbc -engine afalg"
User time (seconds): 17.53
System time (seconds): 0.12
Percent of CPU this job got: 97%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.04s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 11648
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 117
Voluntary context switches: 1
Involuntary context switches: 2029
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
root@OpenWrt:~# cat /proc/interrupts
CPU0
4: 10394 MIPS 4 mt76x2e
5: 150 MIPS 5 10100000.ethernet
6: 33320 MIPS 6 mt7603e
7: 28042 MIPS 7 timer
21: 0 INTC 13 10004000.crypto
25: 2 INTC 17 esw
28: 14 INTC 20 ttyS0
40: 0 GPIO 38 gpio-keys
41: 0 GPIO 37 gpio-keys
ERR: 62
root@OpenWrt:~# opkg install /tmp/kmod-cryptodev_4.14.44\+1.9.git-2017-10-04-ramips-1_mipsel_24kc.ipk
Installing kmod-cryptodev (4.14.44+1.9.git-2017-10-04-ramips-1) to root...
Configuring kmod-cryptodev.
root@OpenWrt:~# time -v openssl speed -elapsed -evp aes-256-cbc -engine afalg
engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 87648 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 68664 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 85626 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 60228 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 28735 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 17696 aes-256-cbc's in 3.00s
OpenSSL 1.1.0h 27 Mar 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DOPENSSL_API_COMPAT=0x10100000L -DOPENSSL_SMALL_FOOTPRINT -DOPENSSL_NO_ASYNC -DHAVE_CRYPTODEV -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\"" -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/usr/include -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/usr/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include/fortify -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type                   16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc             467.46k     1464.83k     7306.75k    20557.82k    78465.71k    96643.75k
Command being timed: "openssl speed -elapsed -evp aes-256-cbc -engine afalg"
User time (seconds): 0.60
System time (seconds): 4.83
Percent of CPU this job got: 28%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.75s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 11872
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 118
Voluntary context switches: 348622
Involuntary context switches: 363
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
root@OpenWrt:~#
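For anyone who wants to compare the two runs above without eyeballing the tables, here is a small Python sketch. It only re-derives the throughput from the "Doing aes-256-cbc for 3s" counts pasted above and divides the second run by the first; the dictionary and function names are my own, nothing from OpenSSL.

```python
# Operation counts copied from the two "Doing aes-256-cbc for 3s" listings above.
# Each entry: block size in bytes -> (operations completed, elapsed seconds).
WITHOUT_CRYPTODEV = {16: (897941, 3.00), 64: (264831, 3.00), 256: (68595, 3.00),
                     1024: (17492, 3.00), 8192: (2150, 3.01), 16384: (1036, 3.00)}
WITH_CRYPTODEV = {16: (87648, 3.00), 64: (68664, 3.00), 256: (85626, 3.00),
                  1024: (60228, 3.00), 8192: (28735, 3.00), 16384: (17696, 3.00)}


def throughput_k(ops, block_size, seconds):
    """Throughput in 1000s of bytes per second, the unit openssl speed prints."""
    return ops * block_size / seconds / 1000.0


def speedups(before, after):
    """Ratio after/before for every block size present in 'before'."""
    return {
        size: throughput_k(after[size][0], size, after[size][1])
        / throughput_k(before[size][0], size, before[size][1])
        for size in before
    }


if __name__ == "__main__":
    for size, ratio in speedups(WITHOUT_CRYPTODEV, WITH_CRYPTODEV).items():
        print(f"{size:>6} bytes: {ratio:6.2f}x")
```

With these numbers the second run is slower than the first for 16- and 64-byte blocks and faster from 256 bytes upward, reaching roughly 17x at 16384 bytes, which fits the drop in CPU usage from 97% to 28%.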
If the conclusion is that cryptodev needs to be present anyway, then there is no need to have the AFALG engine at all. Better to keep using cryptodev directly and save a few bytes in flash.
With "just" cryptodev loaded, a benchmark with "speed -evp aes-256-cbc" uses cryptodev by default. Adding "-engine cryptodev" makes no difference.
I compiled a few versions with different settings. I even removed the "have cryptodev" flag from the Makefile, which is selected automatically when hardware is enabled.
It needs "have hardware", "have engine" and "have cryptodev"; otherwise it falls back to software. To use AFALG it obviously needs afalg.so plus everything else; otherwise it is back to software again.
As for testing: it doesn't specifically need hardware. Using AFALG should make it use "a kernel-provided cipher", which could still be a software implementation. At least that's the theory, which is why it's confusing that it needs cryptodev loaded.
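One way to see whether a kernel-provided cipher is a software or a hardware implementation is to look at /proc/crypto, which lists every registered algorithm with its driver name and priority (the kernel prefers the highest priority for a given name). Below is a minimal Python sketch that parses that format. The sample text is made up for illustration, and the driver name "cbc-aes-hypothetical-hw" is not a real driver; on a router you would feed it the contents of /proc/crypto instead.

```python
def parse_proc_crypto(text):
    """Parse /proc/crypto-style text into a list of dicts, one per entry."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def implementations(entries, name):
    """Return (driver, priority) pairs for one algorithm, highest priority first."""
    found = [
        (e["driver"], int(e["priority"]))
        for e in entries
        if e.get("name") == name and "driver" in e and "priority" in e
    ]
    return sorted(found, key=lambda pair: pair[1], reverse=True)


# Made-up sample in the /proc/crypto layout; the first driver name is hypothetical.
SAMPLE = """\
name         : cbc(aes)
driver       : cbc-aes-hypothetical-hw
module       : kernel
priority     : 300
type         : skcipher

name         : cbc(aes)
driver       : cbc(aes-generic)
module       : kernel
priority     : 100
type         : skcipher

name         : sha256
driver       : sha256-generic
module       : kernel
priority     : 100
type         : shash
"""

if __name__ == "__main__":
    for driver, priority in implementations(parse_proc_crypto(SAMPLE), "cbc(aes)"):
        print(f"{priority:>5}  {driver}")
```

If only a "-generic" driver shows up for cbc(aes), requests through the kernel API would be served in software.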
Using "-multi x" creates multiple threads. This should test how it handles multiple concurrent requests (to my understanding).
What I expect in the driver is a queue of requests building up, since given enough concurrent requests (multiple threads / different apps) they can only be handled on a FIFO basis. However, I added a temporary simple counter that increments at the point where my driver queues a request and decrements when a request has been handled, and it shows the queue never gets above 1??? The hardware is surely not fast enough to handle requests so quickly that they never have to queue; if it were, its throughput would be limited only by RAM speed.
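To make the counter idea concrete, here is a toy Python model of a single engine serving a FIFO. It is not the driver code, just a sketch under simple assumptions: every request takes the same service time, and the arrival times are supplied by hand. The function name is my own.

```python
def max_queue_depth(arrival_times, service_time):
    """High-water mark of a FIFO served by one engine.

    Mirrors the counter described above: +1 when a request is queued,
    -1 when the engine has finished one.
    """
    depth = 0
    high_water = 0
    engine_free_at = 0.0
    finish_times = []  # finish times of requests still counted in depth
    for t in sorted(arrival_times):
        # Retire every request that has finished by the time this one arrives.
        while finish_times and finish_times[0] <= t:
            finish_times.pop(0)
            depth -= 1
        start = max(t, engine_free_at)
        engine_free_at = start + service_time
        finish_times.append(engine_free_at)
        depth += 1
        high_water = max(high_water, depth)
    return high_water


if __name__ == "__main__":
    # Four requests arriving at once: the counter climbs to 4.
    print(max_queue_depth([0, 0, 0, 0], 1.0))
    # Each request arrives only when the previous one is done: it stays at 1.
    print(max_queue_depth([0, 1, 2, 3], 1.0))
```

In this model the counter only stays at 1 when requests arrive no faster than they are served, for example when each caller waits for its previous request before submitting the next. Whether that is what actually happens between the benchmark and the driver is not established here.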
A few other assumptions I made turned out to be wrong. I assumed that the "official" af_alg engine now in OpenSSL 1.1 was the same as (or an improved version of) the plugin previously available for the older OpenSSL versions.
For example this implementation: (fork) https://github.com/sarnold/af_alg
The documentation clearly states that it can be used for "any" cipher or digest, but was intended for hardware offload. The engine in OpenSSL 1.1.0h, as pointed out above, only has aes-128-cbc, and only includes 192/256-cbc with a recent patch. No digests, it seems.
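The kernel's AF_ALG socket interface itself does handle digests, independent of what the OpenSSL engine exposes. Here is a minimal Python sketch using the standard socket module. It assumes a Linux kernel with the user-space hash API enabled (CONFIG_CRYPTO_USER_API_HASH) and a Python 3.6+ interpreter, which a stock router image may not have; the function name is my own, and it returns None when the interface is not available.

```python
import hashlib
import socket


def afalg_sha256(data):
    """SHA-256 via the kernel's AF_ALG 'hash' interface, or None if unavailable."""
    if not hasattr(socket, "AF_ALG"):
        return None  # not Linux, or Python built without AF_ALG support
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))
            op, _ = alg.accept()  # operation socket for one digest
            with op:
                op.sendall(data)
                return op.recv(32)  # SHA-256 digests are 32 bytes
    except OSError:
        return None  # kernel lacks AF_ALG, the algorithm, or permission


if __name__ == "__main__":
    digest = afalg_sha256(b"abc")
    if digest is None:
        print("AF_ALG hash interface not available here")
    else:
        # Cross-check against the userspace implementation.
        print(digest.hex(), digest == hashlib.sha256(b"abc").digest())
```

Which driver serves the request (software or hardware) is the kernel's choice, so this only shows that the interface accepts digests, not that they are offloaded.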
Basically that makes it "useless" in my opinion until "most" ciphers and digests are supported. Even then, given the performance difference, I am sticking with the cryptodev implementation.
On top of that, AIO has to be compiled into the kernel to make it work, so it cannot be added after the basic firmware has been flashed, which is a "no-no" for a big portion of the user base.
It still makes sense to try to get version 1.1.0 running because the software implementation has additional benefits and performance enhancements.
Thanks to this commit by cotequeiroz, I was able to run libressl-2.7.4 on mipsel24kc (mt7621) and mips24kc (qca9533). Curiously, not only was libressl pretty much head-to-head with openssl-1.1.0, but at block sizes >= 256 bytes it does chacha20-poly1305 much faster.
I am curious because it is well known that on architectures like x86 and arm, libressl is significantly slower.
Can these results be right?
LibreSSL 2.7.4
built on: date not available
options:bn(64,32) rc4(ptr,int) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type                   16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5                    2908.87k    10534.74k    30538.59k    58616.69k    79886.97k
sha1                   3922.78k    11975.47k    28001.47k    42478.99k    49350.68k
sha256                 3158.07k     7166.33k    12452.35k    15132.41k    16326.85k
sha512                  889.88k     3556.19k     5073.66k     6928.84k     7674.90k
aes-128 gcm            2714.52k     4949.03k     6374.31k     6887.42k     7018.99k
aes-256 gcm            2407.97k     4368.61k     5478.66k     5885.43k     5991.08k
chacha20 poly1305      3169.64k    11687.55k    22240.36k    28411.41k    30909.15k
OpenSSL 1.1.0i 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\"" -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type                   16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5                    9405.69k    25718.90k    53897.81k    74557.44k    83601.95k    83754.35k
sha1                   7593.95k    19747.29k    37288.65k    47507.09k    51465.36k    51877.15k
sha256                 4135.46k     9457.83k    16539.81k    20801.24k    22312.93k    22386.01k
sha512                  892.65k     3463.54k     4811.66k     6923.95k     7688.50k     7713.00k
aes-128-gcm            6040.78k     6886.25k     7097.69k     7159.50k     7198.62k     7163.24k
aes-256-gcm            5220.86k     5829.95k     6010.30k     5988.69k     6066.43k     6031.05k
chacha20-poly1305     11508.99k    18238.93k    21107.77k    22023.85k    22937.60k    22926.68k
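To put a number on "much faster", here is a short Python sketch that divides the LibreSSL figures by the OpenSSL figures for the block sizes both tables share. The values are copied from the two listings above (in 1000s of bytes per second); the variable and function names are my own.

```python
BLOCK_SIZES = [16, 64, 256, 1024, 8192]  # sizes present in both listings

# Throughput in 1000s of bytes per second, copied from the tables above.
LIBRESSL = {
    "aes-128-gcm": [2714.52, 4949.03, 6374.31, 6887.42, 7018.99],
    "chacha20-poly1305": [3169.64, 11687.55, 22240.36, 28411.41, 30909.15],
}
OPENSSL = {
    "aes-128-gcm": [6040.78, 6886.25, 7097.69, 7159.50, 7198.62],
    "chacha20-poly1305": [11508.99, 18238.93, 21107.77, 22023.85, 22937.60],
}


def libressl_over_openssl(algorithm):
    """Map block size -> LibreSSL throughput divided by OpenSSL throughput."""
    return {
        size: lib / ossl
        for size, lib, ossl in zip(BLOCK_SIZES, LIBRESSL[algorithm], OPENSSL[algorithm])
    }


if __name__ == "__main__":
    for algorithm in LIBRESSL:
        ratios = libressl_over_openssl(algorithm)
        line = "  ".join(f"{size}B={ratio:.2f}" for size, ratio in ratios.items())
        print(f"{algorithm:<18} {line}")
```

On these numbers LibreSSL is well behind for chacha20-poly1305 at 16 and 64 bytes, about 5% ahead at 256 bytes, and roughly 30-35% ahead at 1024 and 8192 bytes, while aes-128-gcm stays below OpenSSL at every size.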