After emailing three of the committers to the original Polycube project, I received short replies from each. They basically said, "polycube was never tested on an arm-based system will likely not work without significant efforts" as well as "I believe the [polycube] project is no longer active." I wanted to follow through on the former statement and see how much effort it would really take to get a compiled binary of polycubed running on an Arm-based system.
With my previous Work In Progress, I appeared to be able to build an executable successfully, but when run, the program did nothing but consume 100% of one core of the Raspberry Pi's processor.
What does this mean? A hung process consuming 100% of one core feels to me like a loop whose exit/break condition is never met. I started by doing what any ham-handed developer would do: I started at main() in polycubed.cpp and began putting std::cerr << "Code gets to this spot #1" << std::endl; into the code.
I narrowed this initial issue of the process hang to the following:
try {
  if (!config.load(argc, argv)) {
    exit(EXIT_SUCCESS);
  }
  std::cerr << "Configs loaded..." << std::endl;
} catch (const std::exception &e) {
  // The problem of the error in loading the config file may be due to
  // polycubed executed as normal user
  if (getuid())
    logger->critical("polycubed should be executed with root privileges");

  logger->critical("Error loading config: {}", e.what());
  exit(EXIT_FAILURE);
}
Neither of the cerr statements that I added was ever reached. This narrowed the issue down to config.load(argc, argv).
Looking at config.cpp, and specifically at the method load(int argc, char *argv[]), you will find the following:
bool Config::load(int argc, char *argv[]) {
  logger = spdlog::get("polycubed");

  int option_index = 0;
  char ch;

  // do a first pass looking for "configfile", "-h", "-v"
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) != -1) {
    switch (ch) {
    case 'v':
      show_version();
      return false;
    case 'h':
      show_usage(argv[0]);
      return false;
    case 4:
      configfile = optarg;
      break;
    }
  }

  load_from_file(configfile);
  load_from_cli(argc, argv);
  check();

  if (cubes_dump_clean_init) {
    std::ofstream output(cubes_dump_file);
    if (output.is_open()) {
      output << "{}";
      output.close();
    }
  }

  return true;
}
Through some amateur debugging statements, I determined that while ((ch = getopt_long(..)) != -1) never terminated. The while loop never exited. Why would this statement work flawlessly on Intel amd64-based systems and not on Arm64 systems? I am still stumped as to why it would matter. However, implementing the while loop as follows got me slightly further in the start-up process:
while(true) {
  const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);

  switch (ch) {
  case 'v':
    show_version();
    return false;
  case 'h':
    show_usage(argv[0]);
    return false;
  case 4:
    configfile = optarg;
    break;
  }

  if(-1 == ch) {
    break;
  }
}
Maybe someone with more systems experience and C++ knowledge might have an idea as to why these two blocks of code behave differently when run on different architectures.
Anyway, getting a little farther into the start-up process was a sign I should keep looking into the issue. Using my bush-league debugging skills (i.e., liberal use of std::cerr), I determined that things were getting bound up on:
load_from_cli(argc, argv);
A look at that method reveals another, similar while statement:
void Config::load_from_cli(int argc, char *argv[]) {
  int option_index = 0;
  char ch;
  optind = 0;

  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) != -1) {
    switch (ch) {
    case 'l':
      setLogLevel(optarg);
      break;
    case 'p':
      setServerPort(optarg);
      break;
    case 'd':
      setDaemon(optarg ? std::string(optarg) : "true");
      break;
    case 'a':
      setServerIP(optarg);
      break;
    case 'c':
      setCertPath(optarg);
      break;
    case 'k':
      setKeyPath(optarg);
      break;
    case '?':
      throw std::runtime_error("Missing argument, see stderr");
    case 1:
      setLogFile(optarg);
      break;
    case 2:
      setPidFile(optarg);
      break;
    case 5:
      setCACertPath(optarg);
      break;
    case 6:
      setCertWhitelistPath(optarg);
      break;
    case 7:
      setCertBlacklistPath(optarg);
      break;
    case 8:
      setCubesDumpFile(optarg);
      break;
    case 9:
      setCubesDumpCleanInit();
      break;
    case 10:
      //setCubesNoDump();
      setCubesDumpEnabled();
      break;
    }
  }
}
Again, I determined that while ((ch = getopt_long(..)) != -1) was never breaking out of the while loop. I changed it to:
while(true) {
  const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);

  ...

  if(-1 == ch) {
    break;
  }
}
This did the trick, as it had with the previous while loop. I was able to execute polycubed but ran into a new error:
[2023-01-26 15:25:19.131] [polycubed] [info] configuration parameters:
[2023-01-26 15:25:19.131] [polycubed] [info] loglevel: info
[2023-01-26 15:25:19.131] [polycubed] [info] daemon: false
[2023-01-26 15:25:19.131] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 15:25:19.131] [polycubed] [info] port: 9000
[2023-01-26 15:25:19.131] [polycubed] [info] addr: localhost
[2023-01-26 15:25:19.131] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:25:19.131] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:25:19.132] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 15:25:19.132] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 15:25:19.132] [polycubed] [info] polycubed starting...
[2023-01-26 15:25:19.132] [polycubed] [info] version v0.9.0
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.15.84-v8+
Unable to find kernel headers. Try rebuilding kernel with CONFIG_IKHEADERS=m (module)
chdir(/lib/modules/5.15.84-v8+/build): No such file or directory
[2023-01-26 15:25:19.180] [polycubed] [error] error creating patch panel: Unable to initialize BPF program
[2023-01-26 15:25:19.188] [polycubed] [critical] Error starting polycube: Error creating patch panel
Next, I grabbed the Linux kernel source from Raspberry Pi's GitHub and set up a symlink so that polycubed could find the kernel headers:
git clone --depth=1 https://github.com/raspberrypi/linux.git
mv linux linux-upstream-5.15.89-v8+
sudo ln -s /usr/src/linux-upstream-5.15.89-v8+ /lib/modules/5.15.84-v8+/build
sudo ~/polycube/build/src/polycubed/src/polycubed
This results in:
[2023-01-26 15:40:19.035] [polycubed] [info] configuration parameters:
[2023-01-26 15:40:19.035] [polycubed] [info] loglevel: trace
[2023-01-26 15:40:19.035] [polycubed] [info] daemon: false
[2023-01-26 15:40:19.036] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 15:40:19.036] [polycubed] [info] port: 9000
[2023-01-26 15:40:19.036] [polycubed] [info] addr: localhost
[2023-01-26 15:40:19.036] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 15:40:19.036] [polycubed] [info] polycubed starting...
[2023-01-26 15:40:19.036] [polycubed] [info] version v0.9.0
bpf: Failed to load program: Invalid argument
jump out of range from insn 9 to 37
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
[2023-01-26 15:40:46.751] [polycubed] [error] cannot load ctrl_rx: Failed to load controller_module_rx: -1
[2023-01-26 15:40:46.800] [polycubed] [critical] Error starting polycube: cannot load controller_module_rx
It is entirely possible that I am including the wrong version of bcc.
BPF Compiler Collection (BCC)
BCC is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of extended BPF (Berkeley Packet Filters), formally known as eBPF, a new feature that was first added to Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.
I decided to step back and grab a clean copy of polycubed from GitHub.
pi@raspberrypi:~/polycube $ git submodule update --init --recursive
pi@raspberrypi:~/polycube/build $ cmake .. -DENABLE_PCN_IPTABLES=ON \
  -DENABLE_SERVICE_BRIDGE=ON \
  -DENABLE_SERVICE_DDOSMITIGATOR=OFF \
  -DENABLE_SERVICE_FIREWALL=ON \
  -DENABLE_SERVICE_HELLOWORLD=OFF \
  -DENABLE_SERVICE_IPTABLES=ON \
  -DENABLE_SERVICE_K8SFILTER=OFF \
  -DENABLE_SERVICE_K8SWITCH=OFF \
  -DENABLE_SERVICE_LBDSR=OFF \
  -DENABLE_SERVICE_LBRP=OFF \
  -DENABLE_SERVICE_NAT=ON \
  -DENABLE_SERVICE_PBFORWARDER=ON \
  -DENABLE_SERVICE_ROUTER=ON \
  -DENABLE_SERVICE_SIMPLEBRIDGE=ON \
  -DENABLE_SERVICE_SIMPLEFORWARDER=ON \
  -DENABLE_SERVICE_TRANSPARENTHELLOWORLD=OFF \
  -DENABLE_SERVICE_SYNFLOOD=OFF \
  -DENABLE_SERVICE_PACKETCAPTURE=OFF -DENABLE_SERVICE_K8SDISPATCHER=OFF
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Version is v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
-- Latest recognized Git tag is v0.9.0
-- Git HEAD is a143e3c0325400dad7b9ff3406848f5a953ed3d1
-- Revision is 0.9.0-a143e3c0
-- Performing Test HAVE_NO_PIE_FLAG
-- Performing Test HAVE_NO_PIE_FLAG - Success
-- Performing Test HAVE_REALLOCARRAY_SUPPORT
-- Performing Test HAVE_REALLOCARRAY_SUPPORT - Success
-- Found LLVM: /usr/lib/llvm-9/include 9.0.1 (Use LLVM_ROOT envronment variable for another version of LLVM)
-- Found BISON: /usr/bin/bison (found version "3.7.5")
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
-- Found LibElf: /usr/lib/aarch64-linux-gnu/libelf.so
-- Performing Test ELF_GETSHDRSTRNDX
-- Performing Test ELF_GETSHDRSTRNDX - Success
-- Could NOT find LibDebuginfod (missing: LIBDEBUGINFOD_LIBRARIES LIBDEBUGINFOD_INCLUDE_DIRS)
-- Using static-libstdc++
-- Could NOT find LuaJIT (missing: LUAJIT_LIBRARIES LUAJIT_INCLUDE_DIR)
-- jsoncons v0.142.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- The following OPTIONAL packages have been found:
 * BISON
 * FLEX
 * Threads
-- The following REQUIRED packages have been found:
 * LibYANG
 * LLVM
 * LibElf
-- The following OPTIONAL packages have not been found:
 * LibDebuginfod
 * LuaJIT
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
-- Found OpenSSL: /usr/lib/aarch64-linux-gnu/libcrypto.so (found version "1.1.1n")
-- Checking for module 'libnl-3.0'
-- Found libnl-3.0, version 3.4.0
-- Checking for module 'libnl-genl-3.0'
-- Found libnl-genl-3.0, version 3.4.0
-- Checking for module 'libnl-route-3.0'
-- Found libnl-route-3.0, version 3.4.0
-- Checking for module 'libtins'
-- Found libtins, version 3.5
-- Found nlohmann_json: /home/pi/polycube/cmake/nlohmann_json/Findnlohmann_json.cmake (Required is at least version "3.5.0")
-- Checking for module 'systemd'
-- Found systemd, version 247
-- systemd services install dir: /lib/systemd/system
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/polycube/build
cd ../src/libs/prometheus-cpp
mkdir build; cd build
cmake .. -DBUILD_SHARED_LIBS=ON
make
sudo make install
I made changes to config.cpp to deal with the issue with getopt_long and the while loop. The changes are in my polycube clone.
I also did not have to add any of the #include lines that I had added during my first attempt on a SOQuartz module.
sudo src/polycubed/src/polycubed
[2023-01-26 20:58:06.453] [polycubed] [info] loading configuration from /etc/polycube/polycubed.conf
[2023-01-26 20:58:06.456] [polycubed] [info] configuration parameters:
[2023-01-26 20:58:06.456] [polycubed] [info] loglevel: info
[2023-01-26 20:58:06.456] [polycubed] [info] daemon: false
[2023-01-26 20:58:06.456] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 20:58:06.456] [polycubed] [info] port: 9000
[2023-01-26 20:58:06.456] [polycubed] [info] addr: localhost
[2023-01-26 20:58:06.456] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 20:58:06.456] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 20:58:06.456] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 20:58:06.457] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 20:58:06.457] [polycubed] [info] polycubed starting...
[2023-01-26 20:58:06.457] [polycubed] [info] version v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
prog tag mismatch 3e70ec38a5f6710 1
WARNING: cannot get prog tag, ignore saving source with program tag
prog tag mismatch 1e2ac42799daebd8 1
WARNING: cannot get prog tag, ignore saving source with program tag
[2023-01-26 20:58:23.636] [polycubed] [info] rest server listening on '127.0.0.1:9000'
[2023-01-26 20:58:23.637] [polycubed] [info] rest server starting ...
[2023-01-26 20:58:23.740] [polycubed] [info] service bridge loaded using libpcn-bridge.so
[2023-01-26 20:58:23.779] [polycubed] [info] service firewall loaded using libpcn-firewall.so
[2023-01-26 20:58:23.882] [polycubed] [info] service nat loaded using libpcn-nat.so
[2023-01-26 20:58:24.012] [polycubed] [info] service pbforwarder loaded using libpcn-pbforwarder.so
[2023-01-26 20:58:24.145] [polycubed] [info] service router loaded using libpcn-router.so
[2023-01-26 20:58:24.210] [polycubed] [info] service simplebridge loaded using libpcn-simplebridge.so
[2023-01-26 20:58:24.239] [polycubed] [info] service simpleforwarder loaded using libpcn-simpleforwarder.so
[2023-01-26 20:58:24.282] [polycubed] [info] service iptables loaded using libpcn-iptables.so
[2023-01-26 20:58:24.412] [polycubed] [info] service dynmon loaded using libpcn-dynmon.so
[2023-01-26 20:58:24.412] [polycubed] [info] loading metrics from yang files
The daemon runs successfully. I do, however, need to capture the work I did to put the Linux kernel source headers in place, where the daemon can find them when it compiles the eBPF code into bytecode.
- Clone the Linux repository from Raspberry Pi, https://github.com/raspberrypi/linux, into /usr/src on the Raspberry Pi
- In /lib/modules/5.15.84-v8+/, make a symlink named build and point it to /usr/src/linux-upstream-5.15.89-v8+
That will be it for the Work In Progress posts on polycube; I could attempt to recreate the steps taken, but I feel my notes across three posts should be enough. It also isn't as if polycube deployments are in hot demand. There is a strong likelihood that I am the first and only person who has run it on Arm-based hardware. The next post on polycube will be about actually using it and, in particular, the drop-in replacement for iptables; that is what I am most interested in.