Status of OpenSSL 1.1 in LEDE/OpenWrt?

Could someone tell me the status of using OpenSSL 1.1? I read that the API has changed in such a way that it breaks a lot of packages used by OpenWrt. The last posts on this are almost a year old. Depending on what is still broken, I would like to give the newer version a try.

If someone could provide me with a makefile, that would be excellent :smiley:

Ah, I was specifically interested in the AF_ALG engine additions. I found some old sources to add that as a plugin for v1.0.2, so I will have a go at those.

Similar to what happened with PolarSSL, OpenSSL 1.0.2 will be used until it stops being supported.

I have the whole tree compiling with OpenSSL 1.1.0. I haven't tested everything, but it runs everything I need. If you're interested, let me know and I can send you the patches.

Edit: I finally have the repos uploaded to GitHub under https://github.com/cotequeiroz

Please!

I build several things that depend on OpenSSL and am curious if I can drop "legacy API" support. I'd rather patch against a forward-looking version.

It's a lot of patches, and I don't have them online on GitHub. I've done a quick git diff from my local branch; see what you can do with it. There will be changes unrelated to OpenSSL, but everything compiles on mips (brcm47xx and ramips).
Here they are:

openwrt (commit 044e84fa8a914eb48f517c2f2905a9a806c2ad30)
packages feed (commit 68a3d3d6eea90232f5a5c0413eac178da3668ac3)

They are all based on the current development master branch. I'll post the remaining diffs in a new post. As a newly registered user, I can only post two links, but you can figure out the filenames for the routing and telephony feeds, just in case.

Here are the other two feeds:
routing feed (commit fdaa4cde3b2c105dcad7a874826910e4309748a2)
telephony feed (commit 3dd2e9183c4844902738fe89643c685c74aab83b)

Thanks @cotequeiroz,
testing with OpenSSL 1.1.0h on mipsel yields a 5-10% performance increase across the board, and a negligible increase on mips. The biggest gain comes from the addition of the "new" chacha20 suites, which are more than twice as fast as aes. I really hope the move to 1.1 happens, but looking at the incompatible packages, it may take some time yet. For now, since I have the free space needed on my devices, I run a combination of the two libraries.
Just out of curiosity, has anyone tried compiling with LibreSSL?

I finally got around to having a look at @cotequeiroz's patches. I was most interested in the AFALG engine to get more native hardware crypto support via the standard API.

It turns out I still need to compile with the cryptodev flags enabled, and the cryptodev module needs to be loaded.

This defeats the purpose of having the AFALG engine. I was hoping for a slightly smaller overall footprint, with less code included and no more need for cryptodev.

I can't really test AFALG since I don't have any hardware that supports it, but at least the build system doesn't force an AFALG/cryptodev dependency. You should be able to build them independently of each other. Just to make sure: you're building it with the latest version of the patch, from https://github.com/openwrt/openwrt/pull/965, right?
Cheers,

Can you tell me what we need to do to use OpenSSL 1.1.0?

Apply that patch and then... patch the other packages that depend on OpenSSL?

Yes.

Disabling deprecated APIs will break a lot more. That's still a work in progress.

I did some more testing. It seems that OpenSSL needs to be compiled with "HAVE_CRYPTODEV"; otherwise no offloading to the hardware occurs. Even with this flag set at compile time, no hardware offloading occurs unless the cryptodev module is loaded, even when specifying AFALG as the engine.

root@OpenWrt:~# time -v openssl speed -elapsed -evp aes-256-cbc -engine afalg
engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 897941 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 264831 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 68595 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 17492 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 2150 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 1036 aes-256-cbc's in 3.00s
OpenSSL 1.1.0h  27 Mar 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr) 
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DOPENSSL_API_COMPAT=0x10100000L -DOPENSSL_SMALL_FOOTPRINT -DOPENSSL_NO_ASYNC -DHAVE_CRYPTODEV -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\""  -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/usr/include -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/usr/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include/fortify -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc       4789.02k     5649.73k     5853.44k     5970.60k     5851.43k     5657.94k
	Command being timed: "openssl speed -elapsed -evp aes-256-cbc -engine afalg"
	User time (seconds): 17.53
	System time (seconds): 0.12
	Percent of CPU this job got: 97%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.04s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 11648
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 117
	Voluntary context switches: 1
	Involuntary context switches: 2029
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
root@OpenWrt:~# cat /proc/interrupts 
           CPU0       
  4:      10394      MIPS   4  mt76x2e
  5:        150      MIPS   5  10100000.ethernet
  6:      33320      MIPS   6  mt7603e
  7:      28042      MIPS   7  timer
 21:          0      INTC  13  10004000.crypto
 25:          2      INTC  17  esw
 28:         14      INTC  20  ttyS0
 40:          0      GPIO  38  gpio-keys
 41:          0      GPIO  37  gpio-keys
ERR:         62
root@OpenWrt:~# opkg install /tmp/kmod-cryptodev_4.14.44\+1.9.git-2017-10-04-ramips-1_mipsel_24kc.ipk
Installing kmod-cryptodev (4.14.44+1.9.git-2017-10-04-ramips-1) to root...
Configuring kmod-cryptodev.
root@OpenWrt:~# time -v openssl speed -elapsed -evp aes-256-cbc -engine afalg
engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 87648 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 68664 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 85626 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 60228 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 28735 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 17696 aes-256-cbc's in 3.00s
OpenSSL 1.1.0h  27 Mar 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr) 
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DOPENSSL_API_COMPAT=0x10100000L -DOPENSSL_SMALL_FOOTPRINT -DOPENSSL_NO_ASYNC -DHAVE_CRYPTODEV -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\""  -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/usr/include -I/home/drbrains/source/staging_dir/target-mipsel_24kc_musl/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/usr/include -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include/fortify -I/home/drbrains/source/staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/include -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc        467.46k     1464.83k     7306.75k    20557.82k    78465.71k    96643.75k
	Command being timed: "openssl speed -elapsed -evp aes-256-cbc -engine afalg"
	User time (seconds): 0.60
	System time (seconds): 4.83
	Percent of CPU this job got: 28%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.75s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 11872
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 118
	Voluntary context switches: 348622
	Involuntary context switches: 363
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
root@OpenWrt:~# 

If the conclusion is that cryptodev needs to be present anyway, then there is no need to have the AFALG engine at all. Better to keep using cryptodev directly and save a few bytes of flash.
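To make the two runs above easier to compare, here is a small sketch that recomputes the throughput figures from the raw operation counts and divides one run by the other. The counts and times are copied from the output above; the function names are mine and purely illustrative. On these numbers the run with cryptodev loaded reaches only about a tenth of the software throughput at 16-byte blocks, overtakes it at 256 bytes, and is roughly 17 times faster at 16384 bytes.

```python
# Raw counts copied from the two "openssl speed" runs above:
# (block size in bytes, operations completed, elapsed seconds)
WITHOUT_CRYPTODEV = [
    (16, 897941, 3.00), (64, 264831, 3.00), (256, 68595, 3.00),
    (1024, 17492, 3.00), (8192, 2150, 3.01), (16384, 1036, 3.00),
]
WITH_CRYPTODEV = [
    (16, 87648, 3.00), (64, 68664, 3.00), (256, 85626, 3.00),
    (1024, 60228, 3.00), (8192, 28735, 3.00), (16384, 17696, 3.00),
]


def throughput_k(block_size, ops, seconds):
    """Throughput in 1000s of bytes per second, as in the summary table."""
    return block_size * ops / seconds / 1000.0


def speedups(baseline, candidate):
    """Map block size -> candidate throughput divided by baseline throughput."""
    result = {}
    for (size, b_ops, b_sec), (_, c_ops, c_sec) in zip(baseline, candidate):
        result[size] = (throughput_k(size, c_ops, c_sec)
                        / throughput_k(size, b_ops, b_sec))
    return result


def crossover(ratios):
    """Smallest block size at which the candidate is faster, or None."""
    faster = [size for size, ratio in sorted(ratios.items()) if ratio > 1.0]
    return faster[0] if faster else None


if __name__ == "__main__":
    ratios = speedups(WITHOUT_CRYPTODEV, WITH_CRYPTODEV)
    for size, ratio in sorted(ratios.items()):
        print(f"{size:>6} bytes: {ratio:6.2f}x")
    print("cryptodev run is faster from", crossover(ratios), "bytes upward")
```

The drop at small block sizes fits the timing data above: system time and voluntary context switches go up sharply once requests are handed to the kernel, so the per-request overhead dominates for tiny blocks.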

This seems to mirror my experience with mvebu. OpenSSL generates no IRQs to the crypto driver.

With “just” cryptodev loaded and doing a benchmark with “speed -evp aes-256-cbc” it uses cryptodev by default. Adding “-engine cryptodev” makes no difference.

I compiled a few versions with different settings. I even removed the “have cryptodev” from the makefile which is selected automatically when hardware is enabled.

It needs “have hardware”, “have engine” and “have cryptodev”; otherwise it falls back to software. To use AFALG it obviously needs afalg.so plus everything else; otherwise it is back to software again.

As for testing: it doesn’t specifically need hardware to test. Using AFALG should force it to use “a kernel provided cipher”, which could still be a software implementation. At least that’s the theory, which is why it’s confusing that it needs cryptodev loaded.
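One way to check that theory outside of OpenSSL is to talk to AF_ALG directly. The sketch below uses Python's standard `socket` module, which exposes `AF_ALG` on Linux, and asks the kernel for a plain sha256. It uses a digest only because that is the shortest possible example; it says nothing about what the OpenSSL afalg engine itself supports. The function name is made up, and the function returns `None` if the kernel or the Python build lacks AF_ALG.

```python
import hashlib
import socket


def kernel_sha256(data):
    """Hash data with the kernel's sha256 via AF_ALG.

    Returns None when AF_ALG or the algorithm is unavailable.
    """
    af_alg = getattr(socket, "AF_ALG", None)  # only defined on Linux builds
    if af_alg is None:
        return None
    try:
        with socket.socket(af_alg, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))  # type and name as in /proc/crypto
            op, _ = alg.accept()          # operation socket for this request
            with op:
                op.sendall(data)
                return op.recv(32)        # a SHA-256 digest is 32 bytes
    except OSError:
        return None


if __name__ == "__main__":
    message = b"abc"
    digest = kernel_sha256(message)
    if digest is None:
        print("AF_ALG sha256 not available on this system")
    else:
        print("kernel  :", digest.hex())
        print("hashlib :", hashlib.sha256(message).hexdigest())
```

If this works on a box without any crypto hardware and without cryptodev loaded, the request is being served by a software implementation inside the kernel, which is the case described above.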

Another observation:

With “-multi x” it will create multiple parallel processes. This should test how it handles multiple requests concurrently (to my understanding).

What I expect to see in the driver is a queue of requests building up, since with enough concurrent requests (multiple threads / different apps) they can only be handled on a FIFO basis. However, I added a temporary counter that is incremented where my driver enqueues a request and decremented when a request has been handled, and it shows that the queue never gets above 1??? The hardware is surely not fast enough to handle requests so quickly that they never have to queue. If it were, throughput would be limited only by RAM speed.
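For what it's worth, here is a small user-space sketch of the counter idea described above. It is not the driver code (which isn't shown in this thread), and `InFlightCounter`, `run_sequential` and `run_overlapping` are made-up names. It only illustrates one property of such a counter: if each request completes before the next one is submitted, the high-water mark stays at 1 no matter how many requests are sent, and it only rises when submissions actually overlap. Whether that is what happens with the real driver is not something this sketch can establish.

```python
import threading


class InFlightCounter:
    """Tracks how many requests are outstanding and the highest value seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.high_water = 0

    def enqueue(self):
        with self._lock:
            self.current += 1
            self.high_water = max(self.high_water, self.current)

    def complete(self):
        with self._lock:
            self.current -= 1


def run_sequential(requests):
    """One caller that waits for each request before sending the next."""
    counter = InFlightCounter()
    for _ in range(requests):
        counter.enqueue()
        counter.complete()
    return counter.high_water


def run_overlapping(callers):
    """Several callers whose requests are all outstanding at the same time."""
    counter = InFlightCounter()
    all_submitted = threading.Barrier(callers)

    def caller():
        counter.enqueue()
        all_submitted.wait()  # nobody completes until everyone has submitted
        counter.complete()

    threads = [threading.Thread(target=caller) for _ in range(callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.high_water


if __name__ == "__main__":
    print("sequential high-water mark :", run_sequential(1000))
    print("overlapping high-water mark:", run_overlapping(4))
```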

A few other assumptions I made turned out to be wrong. I assumed that the "official" af_alg engine now in OpenSSL 1.1 was the same as (or an improved version of) the plugin previously available for the older OpenSSL versions.

For example this implementation: (fork) https://github.com/sarnold/af_alg
The documentation clearly states that it can be used for "any" cipher or digest, but was intended for hardware offload. OpenSSL 1.1.0h, as pointed out above, only has aes-128-cbc, and only with a recent patch includes aes-192-cbc and aes-256-cbc. No digests, it seems.

Basically that makes it "useless" in my opinion until "most" ciphers and digests are supported. Even then, given the performance difference I am sticking with the cryptodev implementation.

Combined with the fact that AIO has to be compiled into the kernel for it to work, this makes it impossible to load it after the basic firmware has been flashed, which is a "no-no" for a big portion of the user base.

It still makes sense to try to get version 1.1.0 running because the software implementation has additional benefits and performance enhancements.

Thanks to this commit by cotequeiroz, I was able to run libressl-2.7.4 on mipsel24kc (mt7621) and mips24kc (qca9533). Curiously, not only was LibreSSL pretty much head-to-head with openssl-1.1.0, but at block sizes >= 256 bytes it does chacha20-poly1305 much faster.
I am curious because it is well known that on architectures like x86 and arm, LibreSSL is significantly slower.
Can these results be right?

LibreSSL 2.7.4
built on: date not available
options:bn(64,32) rc4(ptr,int) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               2908.87k    10534.74k    30538.59k    58616.69k    79886.97k
sha1              3922.78k    11975.47k    28001.47k    42478.99k    49350.68k
sha256            3158.07k     7166.33k    12452.35k    15132.41k    16326.85k
sha512             889.88k     3556.19k     5073.66k     6928.84k     7674.90k
aes-128 gcm       2714.52k     4949.03k     6374.31k     6887.42k     7018.99k
aes-256 gcm       2407.97k     4368.61k     5478.66k     5885.43k     5991.08k
chacha20 poly1305     3169.64k    11687.55k    22240.36k    28411.41k    30909.15k


OpenSSL 1.1.0i  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: mipsel-openwrt-linux-musl-gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DAES_ASM -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSLDIR="\"/etc/ssl\"" -DENGINESDIR="\"/usr/lib/engines-1.1\"" -znow -zrelro
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5               9405.69k    25718.90k    53897.81k    74557.44k    83601.95k    83754.35k
sha1              7593.95k    19747.29k    37288.65k    47507.09k    51465.36k    51877.15k
sha256            4135.46k     9457.83k    16539.81k    20801.24k    22312.93k    22386.01k
sha512             892.65k     3463.54k     4811.66k     6923.95k     7688.50k     7713.00k
aes-128-gcm       6040.78k     6886.25k     7097.69k     7159.50k     7198.62k     7163.24k
aes-256-gcm       5220.86k     5829.95k     6010.30k     5988.69k     6066.43k     6031.05k
chacha20-poly1305    11508.99k    18238.93k    21107.77k    22023.85k    22937.60k    22926.68k

Your numbers show OpenSSL as faster.

And it generally is, except for chacha20-poly1305 at block sizes of 256 bytes and larger.
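To put numbers on that, here is a quick sketch that divides the LibreSSL chacha20-poly1305 row by the OpenSSL one for the block sizes both tables share (the OpenSSL table has an extra 16384-byte column that is left out). The values are copied from the tables above; the function names are just for illustration.

```python
# chacha20-poly1305 rows copied from the two tables above,
# in 1000s of bytes per second, for the block sizes both tables share.
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
LIBRESSL_CHACHA = [3169.64, 11687.55, 22240.36, 28411.41, 30909.15]
OPENSSL_CHACHA = [11508.99, 18238.93, 21107.77, 22023.85, 22937.60]


def libressl_over_openssl():
    """Map block size -> LibreSSL throughput divided by OpenSSL throughput."""
    return {
        size: libre / openssl
        for size, libre, openssl in zip(BLOCK_SIZES, LIBRESSL_CHACHA, OPENSSL_CHACHA)
    }


def sizes_where_libressl_wins(ratios):
    """Block sizes at which LibreSSL is the faster of the two."""
    return [size for size in sorted(ratios) if ratios[size] > 1.0]


if __name__ == "__main__":
    ratios = libressl_over_openssl()
    for size in sorted(ratios):
        print(f"{size:>5} bytes: LibreSSL/OpenSSL = {ratios[size]:.2f}")
    print("LibreSSL faster at:", sizes_where_libressl_wins(ratios))
```

On these figures OpenSSL is well ahead at 16 and 64 bytes, the two are within a few percent at 256 bytes, and LibreSSL leads by roughly a third at 8192 bytes.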