
Disclaimer: I'm neither an experienced programmer nor proficient in reverse engineering, but I at least like to try to figure out how things work. Sometimes the solution is so easy that even I manage to find it; still, take this with a grain of salt.

I lately witnessed the setup of an Envertech EnverBridge ENB-202, which is kind of a classic Chinese IoT device: buy it, plug it in, use some strange setup software, and it will report your PV statistics to a web portal. The setup involved downloading a PE32 Windows executable with a UI that basically has two input boxes and a send button. You have to input the serial number(s) of your inverter boxes and the ID of your EnverBridge. That made me interested in what this setup process really looks like.

The EnverBridge device itself has a power plug on one end, which is also used to communicate with the inverters via some powerline protocol, and a network port with a classic RJ45 connector you plug into your network. If you power it up it will request an IPv4 address via DHCP. That brings us to the first oddity: the MAC address uses the BC:20:90 prefix, which I could not find in the IEEE OUI lists.

Setting Up the SetID Software

You can download the Windows software as a Zip file; once you unpack it you end up with a Nullsoft installer .exe. Since this is a PE32 executable we have to add i386 as a foreign architecture so we can install the wine32 package.

dpkg --add-architecture i386
apt update
apt install wine32:i386
wget http://www.envertec.com/uploads/bigfiles/Set%20ID.zip
unzip Set\ ID.zip
wine Set\ ID.exe

The end result is an installation in ~/.wine/drive_c/Program Files/SetID, which reveals that this software is built with Qt5, judging by the shipped DLLs. The tool itself is udpCilentNow.exe and looks like this:

[Screenshot: Envertech SetID software]

The Network Communication

To my own surprise, the communication is straightforward. A single UDP packet is sent to the broadcast address (255.255.255.255) on port 8765.

[Screenshot: Envertech SetID UDP packet in Wireshark]

I expected some strange binary protocol, but the payload is just plain numbers: a combination of the inverter serial numbers and the ID of the EnverBridge device. One thing I'm not 100% sure about are the inverter serial numbers: there are two of them per inverter, but on the inverters I've seen both serial numbers are always the same. The payload is assembled like this:

  • the ID of the EnverBridge
  • characters 9 to 16 of inverter serial 1
  • characters 9 to 16 of inverter serial 2

If you have more inverters, their serials are just appended in the same way. Another strange thing is that the software does close to no input validation; it only checks that the inverter serials start with CN and then just extracts characters 9 to 16.

The response from the EnverBridge is also a single UDP packet to the broadcast address, on port 8764, with exactly the same content we sent.
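Putting that together, the whole exchange can be replicated in a few lines. The following is only a minimal sketch based on the captured traffic as described above; the bridge ID and serial are example values, and the duplicated serial chunk reflects my assumption that both serial numbers per inverter are identical.

#!/usr/bin/env python3
# Minimal sketch of the SetID exchange described above -- example values,
# my reading of the protocol, not an official spec.
import socket

bridge_id = "90087654"                 # ID of the EnverBridge
serials = ["CN19100912345678"]         # inverter serial number(s)

# payload: bridge ID, then characters 9-16 of each inverter serial, twice
payload = bridge_id + "".join(s[8:16] * 2 for s in serials)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.bind(("", 8764))                  # the device answers via broadcast on 8764
sock.settimeout(10)
sock.sendto(payload.encode("ascii"), ("255.255.255.255", 8765))

data, addr = sock.recvfrom(1024)
print("reply from %s: %s" % (addr[0], data.decode("ascii", "replace")))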

Writing a Replacement

My result is probably an insult to all proficient Python coders, but based on a bit of research and some cut&paste programming I could assemble a small script to replicate the function of the Windows binary. I guess the usefulness of this exercise was mostly my personal entertainment, though it might help some non-Windows users to set up this device. Usage is also very simple:

./enverbridge.py -h
Usage: enverbridge.py [options] MIIDs

Options:
  -h, --help         show this help message and exit
  -b BID, --bid=BID  Serial Number of your EnverBridge

./enverbridge.py -b 90087654 CN19100912345678 CN19100912345679

This is basically a 1:1 copy of the behaviour of the Windows binary, though I tried to add a bit more validation than the original binary and some more error messages. I also assume the two serial numbers are always the same, so I take only one as input and duplicate it for the packet data.

Posted Sun Mar 29 18:58:48 2020

I lately moved my Archer C7 v2 running OpenWrt from 18.06 using the ar71xx target to 19.07-rc2 using the new ath79 target. Since the release notes are a bit sparse regarding how to achieve the target move, I asked on the IRC channel. Basically you can use sysupgrade as usual, but you have to

  • force the upgrade with -F and
  • drop the config with -n

That implies you have to manually back up your config (basically /etc). I decided to drop everything, since I wanted to repurpose the device slightly.

As a first step I did a reset:

firstboot && reboot now

After setting the password again I scp'ed the ath79 sysupgrade image for the device, and installed it with

sysupgrade -F -n /tmp/tplink_archer-c7-v2-squashfs-sysupgrade.bin

After several minutes the "sun" LED to the right of the power LED was just blinking, but the device was still unresponsive, so I did a full power cycle, and a minute later it was back on.

Thanks to PaulFertser from #openwrt on Freenode.

Upgrade sidenote for dnsmasq and DNSSEC

Usually I install dnsmasq-full to have DNSSEC support. So every time I upgrade I have to remember to disable DNSSEC before upgrading, because after the upgrade only the default dnsmasq package is available, which will fail to start due to a config option it cannot understand. :(
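If you want to keep your config during the upgrade, it should be enough to toggle that option off beforehand and re-enable it after reinstalling dnsmasq-full. Assuming the option lives in the usual dnsmasq section of /etc/config/dhcp, something like:

uci set dhcp.@dnsmasq[0].dnssec='0'
uci commit dhcp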

Posted Thu Dec 26 13:43:47 2019

I guess that is just one of the things you have to know, so maybe it helps someone else.

We saw some warnings in our playbook rollouts like

[WARNING]: sftp transfer mechanism failed on [192.168.23.42]. Use
ANSIBLE_DEBUG=1 to see detailed information

They were actually reported for both sftp and scp usage. If you look at the debug output it's not very helpful for the average user, and the same goes for verbose mode with -vvv. The latter at least helps to see the parameters passed to sftp and scp, but you still see no error message. But if you set

scp_if_ssh: True

or

scp_if_ssh: False

you will suddenly see the real error message

fatal: [docker-023]: FAILED! => {"msg": "failed to transfer file to /home/sven/testme.txt /home/sven/
.ansible/tmp/ansible-tmp-1568643306.1439135-27483534812631/source:\n\nunknown option -- A\r\nusage: scp [-346BCpqrv]
[-c cipher] [-F ssh_config] [-i identity_file]\n           [-l limit] [-o ssh_option] [-P port] [-S program] source
... target\n"}

Lesson learned: as long as ansible is running in "smart" mode it will hide all error messages from the user. Now we could figure out that the culprit is the -A for agent forwarding, which for obvious reasons is not available in sftp and scp. One can move it to ansible_ssh_extra_args in group_vars. The best documentation regarding this, besides the --help output, seems to be the commit message of 3ad9b4cba62707777c3a144677e12ccd913c79a8.
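For reference, a rough sketch of where those settings can live; the file layout is just an example. scp_if_ssh belongs in the [ssh_connection] section of ansible.cfg, and the forwarding flag moves into a group variable so it is only applied to the ssh transport, not to sftp/scp:

# ansible.cfg
[ssh_connection]
scp_if_ssh = True

# group_vars/all.yml
ansible_ssh_extra_args: "-A"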

Posted Mon Sep 16 18:16:25 2019

Update 2019-09-06: we experienced the name issue on Debian/stretch hosts using the latest stretch-backports kernel based on the same buster kernel release. Still no idea what's going on. Created https://bugs.debian.org/939328 for further tracking.

For yet unknown reasons some recently installed HPE DL360G10 servers running buster changed the interface names back from the expected "onboard" based names eno5 and eno6 to ethX after a reboot.

My current workaround is a link file which kind of enforces the onboard scheme.

$ cat /etc/systemd/network/101-onboard-rd.link 
[Link]
NamePolicy=onboard kernel database slot path
MACAddressPolicy=none
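To verify that the file is picked up you can re-run the same udevadm test used for debugging below; with the file in place it should show up among the parsed configuration files:

SYSTEMD_LOG_LEVEL=debug udevadm test-builtin net_setup_link /sys/class/net/eth0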

The hosts are running the latest buster kernel

Linux foobar 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u2 (2019-08-08) x86_64 GNU/Linux

A downgrade of the kernel did not change anything. So I currently like to believe this is not related to a kernel change.

I tried to collect some information on one of the broken systems while it was in the broken state:

root@foobar:~# SYSTEMD_LOG_LEVEL=debug udevadm test-builtin net_setup_link /sys/class/net/eth0
=== trie on-disk ===
tool version:          241
file size:         9492053 bytes
header size             80 bytes
strings            2069269 bytes
nodes              7422704 bytes
Load module index
Found container virtualization none.
timestamp of '/etc/systemd/network' changed
Skipping overridden file '/usr/lib/systemd/network/99-default.link'.
Parsed configuration file /etc/systemd/network/99-default.link
Created link configuration context.
ID_NET_DRIVER=i40e
eth0: No matching link configuration found.
Builtin command 'net_setup_link' fails: No such file or directory
Unload module index
Unloaded link configuration context.

root@foobar:~# udevadm test-builtin net_id /sys/class/net/eth0 2>/dev/null
ID_NET_NAMING_SCHEME=v240
ID_NET_NAME_MAC=enx48df37944ab0
ID_OUI_FROM_DATABASE=Hewlett Packard Enterprise
ID_NET_NAME_ONBOARD=eno5
ID_NET_NAME_PATH=enp93s0f0

The most interesting hint right now seems to be that /sys/class/net/eth0/name_assign_type is invalid, while on systems before the reboot that breaks it, and after applying the .link file fix, it contains a 4.

Since those hosts were initially installed with buster there are no remains of any ethX related configuration present. If someone has an idea what is going on, write a mail (sven at stormbind dot net), or blog on planet.d.o.

I found a vaguely similar bug report for a Dell PE server in #929622, though that was a change from 4.9 (stretch) to the 4.19 stretch-bpo kernel, the device names were not changed back to the ethX scheme, and Ben found a reason for it inside the kernel. Also the hardware is different, using bnxt_en, while I have tg3 and i40e in use.

Posted Tue Aug 13 15:24:17 2019

I could not find much information on the interwebs about how many containers you can run per host. So here are my numbers and the issues we ran into along the way.

The Beginning

In the beginning there were virtual machines running with 8 vCPUs and 60GB of RAM. They started to serve around 30 containers per VM. Later on we managed to squeeze around 50 containers per VM.

Initial orchestration was done with swarm, later on we moved to nomad. Access was initially fronted by nginx with consul-template generating the config. When it did not scale anymore nginx was replaced by Traefik. Service discovery is managed by consul. Log shipping was initially handled by logspout in a container, later on we switched to filebeat. Log transformation is handled by logstash. All of this is running on Debian GNU/Linux with docker-ce.

At some point it did not make sense anymore to use VMs. We have no state inside the containerized applications anyway. So we decided to move to dedicated hardware for our production setup. We settled on HPE DL360G10 servers with 24 physical cores and 128GB of RAM.

THP and Defragmentation

When we moved to the dedicated bare metal hosts we were running Debian/stretch + Linux from stretch-backports, at that time Linux 4.17. These machines were sized to run 95+ containers. Once we were above 55 containers we started to see occasional hiccups. First occurrences lasted only for 20s, then 2min, and suddenly some lasted for around 20min. Our system metrics, as collected by prometheus-node-exporter, could only provide vague hints. The metric export did work, so processes were executed. But the CPU usage and subsequently the network throughput went down to close to zero.

I've seen similar hiccups in the past with PostgreSQL running on a host with THP (Transparent Huge Pages) enabled. So a good bet was to look into that area. By default /sys/kernel/mm/transparent_hugepage/enabled is set to always, so THP are enabled. We stuck with that, but changed the defrag mode /sys/kernel/mm/transparent_hugepage/defrag (since Linux 4.12) from the default madvise to defer+madvise.

This moves page reclaim and compaction for pages which were not allocated with madvise to the background, which was enough to get rid of those hiccups. See also the upstream documentation. Since there is no sysctl-like facility to adjust sysfs values, we're using the sysfsutils package to apply this setting after every reboot.
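For sysfsutils that boils down to a single line; the attribute path is relative to /sys, and whether you put it into /etc/sysfs.conf or a file below /etc/sysfs.d/ is up to you:

# /etc/sysfs.conf -- applied by sysfsutils at boot
kernel/mm/transparent_hugepage/defrag = defer+madvise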

Conntrack Table

Since the default docker networking setup involves a shitload of NAT, it shouldn't be surprising that nf_conntrack will start to drop packets at some point. We're currently fine with setting the sysctl tunable

net.netfilter.nf_conntrack_max = 524288

but that's very much up to your network setup and traffic characteristics.

Inotify Watches and Cadvisor

Along the way cadvisor refused to start at one point. Turned out that the default settings (again sysctl tunables) for

fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 8192

are too low. We increased to

fs.inotify.max_user_instances = 4096
fs.inotify.max_user_watches = 32768

Ephemeral Ports

We didn't run into an issue with running out of ephemeral ports directly, but dockerd has a constant issue with keeping track of ports in use and we already see collisions appear regularly. Very unscientifically we set the sysctl

net.ipv4.ip_local_port_range = 11000 60999
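To make the tunables mentioned so far survive a reboot, they can simply be collected in a drop-in below /etc/sysctl.d/; the file name is an arbitrary choice:

# /etc/sysctl.d/90-container-host.conf
net.netfilter.nf_conntrack_max = 524288
fs.inotify.max_user_instances = 4096
fs.inotify.max_user_watches = 32768
net.ipv4.ip_local_port_range = 11000 60999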

NOFILE limits and Nomad

Initially we restricted nomad (via systemd) with

LimitNOFILE=65536

which apparently is not enough for our setup once we crossed the 100 containers per host mark. The error message we saw was hard to understand, though:

[ERROR] client.alloc_runner.task_runner: prestart failed: alloc_id=93c6b94b-e122-30ba-7250-1050e0107f4d task=mycontainer error="prestart hook "logmon" failed: Unrecognized remote plugin message:

This was solved by following the official recommendation and setting

LimitNOFILE=infinity
LimitNPROC=infinity
TasksMax=infinity
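In case it saves someone a lookup: these directives go into a systemd drop-in for the nomad unit (path assumed), followed by a daemon-reload and a restart of the service.

# /etc/systemd/system/nomad.service.d/limits.conf
[Service]
LimitNOFILE=infinity
LimitNPROC=infinity
TasksMax=infinity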

The main lead here was looking into the "hashicorp/go-plugin" library source and understanding that it tries to read the stdout of some other process, which sounded roughly like someone would have to open a file at some point.

Running out of PIDs

Once we were close to 200 containers per host (test environment with 256GB RAM per host), we started to experience failures of all kinds because processes could no longer be forked. Since that was also true for completely fresh user sessions, it was clear that we were hitting some global limitation and not something bound to the session via a PAM module.

It's important to understand that most of our workloads are written in Java, and a lot of the other software we use is written in Go. So we have a lot of threads, which on Linux are represented as "Lightweight Processes" (LWP), and every LWP still takes a distinct PID out of the global PID space.

With /proc/sys/kernel/pid_max defaulting to 32768 we actually ran out of PIDs. We increased that limit vastly, probably way beyond what we currently need, to 500000. The actual limit on 64-bit systems is 2^22 (4194304) according to man 5 proc.
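The change itself is again just a sysctl; a sketch of applying it at runtime and persisting it like the other tunables above:

# apply immediately
sysctl -w kernel.pid_max=500000
# and persist it, e.g. in the same /etc/sysctl.d/ drop-in as above
kernel.pid_max = 500000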

Posted Fri Aug 2 16:44:25 2019

While you can find a lot of information regarding the Java "Project Jigsaw", I could not really find a good example of "assembling" your own JVM. So I took a few minutes to figure that out. My use case here is that someone would like to use Instana (a non-free tracing solution), which requires the java.instrument and jdk.attach modules to be available. From an operations perspective we do not want to ship the whole JDK in our production Docker images, so we have to ship a modified JVM. Currently we base our images on the builds provided by AdoptOpenJDK.net, so my examples are based on those builds. You can just download and untar them to any directory to follow along.

You can check the available modules of your JVM by running:

$ jdk-11.0.3+7-jre/bin/java --list-modules | grep -E '(instrument|attach)'
java.instrument@11.0.3

As you can see only the java.instrument module is available. So let's assemble a custom JVM which includes all the modules provided by the default AdoptOpenJDK.net JRE builds and the missing jdk.attach module:

$ jdk-11.0.3+7/bin/jlink --module-path jdk-11.0.3+7 --add-modules $(jdk-11.0.3+7-jre/bin/java --list-modules|cut -d'@' -f 1|tr '\n' ',')jdk.attach --output myjvm

$ ./myjvm/bin/java --list-modules | grep -E '(instrument|attach)'
java.instrument@11.0.3
jdk.attach@11.0.3

Size-wise the increase is, as expected, rather minimal:

$ du -hs myjvm jdk-11.0.3+7-jre jdk-11.0.3+7
141M    myjvm
121M    jdk-11.0.3+7-jre
310M    jdk-11.0.3+7
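For the Docker use case mentioned at the beginning, shipping the result is then just a matter of copying the assembled directory into the image. A rough, untested sketch; the base image and paths are arbitrary choices:

FROM debian:buster-slim
COPY myjvm /opt/myjvm
ENV PATH="/opt/myjvm/bin:${PATH}"
ENTRYPOINT ["/opt/myjvm/bin/java"]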

For the fun of it you could also add the compiler so you can execute source files directly:

$ jdk-11.0.3+7/bin/jlink --module-path jdk-11.0.3+7 --add-modules $(jdk-11.0.3+7-jre/bin/java --list-modules|cut -d'@' -f 1|tr '\n' ',')jdk.compiler --output myjvm2
$ ./myjvm2/bin/java HelloWorld.java
Hello World!
Posted Wed Jul 10 16:31:27 2019

If you have a logstash filter that contains a json filter/decoding step like this

filter {
    json {
      source => "log"
    }
}

and you end up with an error message like the following:

[2019-06-21T09:47:58,243][WARN ][logstash.filters.json    ] Error parsing json {:source=>"log", :raw=>{"file"=>{"path"=>"/var/lib/docker/containers/abdf3db21fca8e1dc17c888d4aa661fe16ae4371355215157cf7c4fc91b8ea4b/abdf3db21fca8e1dc17c888d4aa661fe16ae4371355215157cf7c4fc91b8ea4b-json.log"}}, :exception=>java.lang.ClassCastException}

It might just be telling you that the field log already contains valid (i.e. already decoded) JSON, and no decoding is required.

Posted Fri Jun 21 10:08:59 2019

tl;dr If you want to run Kafka 2.x use 2.1.1rc1 or later.

So someone started to update from Kafka 1.1.1 to 2.1.0 yesterday and it kept crashing every other hour. It pretty much looks like https://issues.apache.org/jira/browse/KAFKA-7697, so we're now trying out 2.1.1rc1 because we missed rc2 at http://home.apache.org/~cmccabe/kafka-2.1.1-rc2/. Ideally you go with rc2, which has a few more fixes for unrelated issues.

Besides that, be aware that the update to 2.1.0 is a one-way street! Read http://kafka.apache.org/21/documentation.html#upgrade carefully. There is a schema change in the consumer offsets topic, which is used internally to track your consumer offsets since those moved out of ZooKeeper some time ago.

For us the primary lesson is that we have to put way more synthetic traffic on our testing environments, because 2.1.0 was running in the non-production environments for several days without an issue, while the relevant team hit the deadlock in production within hours.

Posted Tue Feb 12 16:15:04 2019

I guess in the past everyone used CGIs to achieve something similar; it just seemed like a nice detour to use the nginx Lua module instead. Don't expect to read something magic. I'm currently looking into different CDN providers: how they behave regarding the cache-control header, what additional headers they send by default, and what changes when you activate certain features. So I set up two locations inside the nginx configuration using a content_by_lua_block {} for testing purposes.

location /header {
  default_type 'text/plain';
  content_by_lua_block {
    local myheads = ngx.req.get_headers()
    for key in pairs(myheads) do
      local outp = "Header '" .. key .. "': " .. myheads[key]
      ngx.say(outp)
    end
  }
}

location /cc {
  default_type 'text/plain';
  content_by_lua_block {
    local cc = ngx.req.get_headers()["cc"]
    if cc ~= nil then
      ngx.header["cache-control"] = cc
      ngx.say(cc)
    else
      ngx.say("moep - no cc header found")
    end
  }
}

The first one is rather boring, it just returns the request headers my origin server received, like this:

$ curl -is https://nocigar.shx0.cf/header
HTTP/2 200 
date: Sun, 02 Dec 2018 13:20:14 GMT
content-type: text/plain
set-cookie: __cfduid=d503ed2d3148923514e3fe86b4e26f5bf1543756814; expires=Mon, 02-Dec-19 13:20:14 GMT; path=/; domain=.shx0.cf; HttpOnly; Secure
strict-transport-security: max-age=2592000
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 482e16f7ae1bc2f1-FRA

Header 'x-forwarded-for': 93.131.190.59
Header 'cf-ipcountry': DE
Header 'connection': Keep-Alive
Header 'accept': */*
Header 'accept-encoding': gzip
Header 'host': nocigar.shx0.cf
Header 'x-forwarded-proto': https
Header 'cf-visitor': {"scheme":"https"}
Header 'cf-ray': 482e16f7ae1bc2f1-FRA
Header 'cf-connecting-ip': 93.131.190.59
Header 'user-agent': curl/7.62.0

The second one is more interesting: it copies the content of the "cc" HTTP request header to the "cache-control" response header, to allow convenient evaluation of how different cache-control header settings are handled.

$ curl -H'cc: no-store,no-cache' -is https://nocigar.shx0.cf/cc/foobar42.jpg
HTTP/2 200 
date: Sun, 02 Dec 2018 13:27:46 GMT
content-type: image/jpeg
set-cookie: __cfduid=d971badd257b7c2be831a31d13ccec77f1543757265; expires=Mon, 02-Dec-19 13:27:45 GMT; path=/; domain=.shx0.cf; HttpOnly; Secure
cache-control: no-store,no-cache
cf-cache-status: MISS
strict-transport-security: max-age=2592000
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 482e22001f35c26f-FRA

no-store,no-cache

$ curl -H'cc: public' -is https://nocigar.shx0.cf/cc/foobar42.jpg
HTTP/2 200 
date: Sun, 02 Dec 2018 13:28:18 GMT
content-type: image/jpeg
set-cookie: __cfduid=d48a4b571af6374c759c430c91c3223d71543757298; expires=Mon, 02-Dec-19 13:28:18 GMT; path=/; domain=.shx0.cf; HttpOnly; Secure
cache-control: public, max-age=14400
cf-cache-status: MISS
expires: Sun, 02 Dec 2018 17:28:18 GMT
strict-transport-security: max-age=2592000
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 482e22c8886627aa-FRA

public

$ curl -H'cc: no-cache,no-store' -is https://nocigar.shx0.cf/cc/foobar42.jpg
HTTP/2 200 
date: Sun, 02 Dec 2018 13:30:33 GMT
content-type: image/jpeg
set-cookie: __cfduid=dbc4758b7bb98d556173a89aa2a8c2d3a1543757433; expires=Mon, 02-Dec-19 13:30:33 GMT; path=/; domain=.shx0.cf; HttpOnly; Secure
cache-control: public, max-age=14400
cf-cache-status: HIT
expires: Sun, 02 Dec 2018 17:30:33 GMT
strict-transport-security: max-age=2592000
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 482e26185d36c29c-FRA

public

As you can see this endpoint is currently fronted by Cloudflare using a default configuration. If you have burned one request path below "/cc/" and it's now cached for a long time, you can just use a different random one to continue your tests, without any need to flush the CDN caches.

Posted Sun Dec 2 14:40:07 2018
# docker version|grep Version
Version:      18.03.1-ce
Version:      18.03.1-ce

# cat Dockerfile
FROM alpine
RUN addgroup service && adduser -S service -G service
COPY --chown=root:root debug.sh /opt/debug.sh
RUN chmod 544 /opt/debug.sh
USER service
ENTRYPOINT ["/opt/debug.sh"]

# cat debug.sh
#!/bin/sh
ls -l /opt/debug.sh
whoami

# docker build -t foobar:latest .; docker run foobar
Sending build context to Docker daemon   5.12kB
[...]
Successfully built 41c8b99a6371
Successfully tagged foobar:latest
-r-xr--r--    1 root     root            37 Nov 14 22:42 /opt/debug.sh
service

# docker version|grep Version
Version:           18.09.0
Version:          18.09.0

# docker run foobar
standard_init_linux.go:190: exec user process caused "permission denied"

That behaviour changed with 18.06 and just uncovered some issues on our side. I was, well, let's say "surprised" that this ever worked at all: with mode 544 only the owner (root) has the execute bit, so the unprivileged service user should never have been able to run the script in the first place. Other sets of perms like 0700 or 644 already failed with a different error message on docker 18.03.1.
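For completeness, the straightforward way to make the image above work on both docker versions is presumably to also grant the execute bit to the user actually running the script, e.g.:

# 0555 instead of 544: group and others get the execute bit as well
RUN chmod 0555 /opt/debug.sh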

Posted Wed Nov 14 23:53:28 2018