After emailing three of the committers to the original Polycube project, I received short replies from each. They basically said that polycube "was never tested on an Arm-based system and will likely not work without significant effort," and that "I believe the [polycube] project is no longer active." I wanted to follow through and test the former statement, and really see how much effort it would take to get a compiled binary of polycubed running on an Arm-based system.
With my previous Work In Progress, I appeared to be able to successfully build an executable, but when run, the program did nothing but consume 100% of one core of the Raspberry Pi's processor.
What does this mean? A hung process consuming 100% of one core feels to me like it is stuck in a loop whose exit/break condition is never met. I started by doing what any ham-handed developer would do: I started at main() in polycubed.cpp and began putting std::cerr << "Code gets to this spot #1" << std::endl; into the code.
I narrowed this initial issue of the process hang to the following:
try {
  if (!config.load(argc, argv)) {
    exit(EXIT_SUCCESS);
  }
  std::cerr << "Configs loaded..." << std::endl;
} catch (const std::exception &e) {
  // The problem of the error in loading the config file may be due to
  // polycubed executed as normal user
  if (getuid())
    logger->critical("polycubed should be executed with root privileges");
  logger->critical("Error loading config: {}", e.what());
  exit(EXIT_FAILURE);
}
Neither of the cerr statements that I added was ever reached. This narrowed down the issue to config.load(argc, argv).
Looking at config.cpp, and specifically at the method load(int argc, char *argv[]), you will find the following:
bool Config::load(int argc, char *argv[]) {
  logger = spdlog::get("polycubed");
  int option_index = 0;
  char ch;
  // do a first pass looking for "configfile", "-h", "-v"
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) !=
         -1) {
    switch (ch) {
    case 'v':
      show_version();
      return false;
    case 'h':
      show_usage(argv[0]);
      return false;
    case 4:
      configfile = optarg;
      break;
    }
  }
  load_from_file(configfile);
  load_from_cli(argc, argv);
  check();
  if (cubes_dump_clean_init) {
    std::ofstream output(cubes_dump_file);
    if (output.is_open()) {
      output << "{}";
      output.close();
    }
  }
  return true;
}
Through some amateur debugging statements, I determined that while ((ch = getopt_long(...)) != -1) never terminated. The while loop never exited. Why would this statement work flawlessly on Intel amd64-based systems and not on Arm64 systems? I am still stumped as to why it would matter. However, rewriting the while loop as follows got me slightly further in the start-up process:
while (true) {
  const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);
  switch (ch) {
  case 'v':
    show_version();
    return false;
  case 'h':
    show_usage(argv[0]);
    return false;
  case 4:
    configfile = optarg;
    break;
  }
  if (-1 == ch) {
    break;
  }
}
Maybe someone with more systems experience and C++ knowledge might have an idea as to why these two blocks of code behave differently when run on different architectures.
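A likely explanation, offered as a sketch and not confirmed against the polycube build itself: getopt_long returns an int, but the original loop stores the result in a plain char. C++ leaves the signedness of plain char to the implementation. GCC on x86-64 Linux treats it as signed, while on AArch64 Linux it is unsigned. With an unsigned char, the end-of-options value -1 is stored as 255, and 255 promoted back to int never equals -1, so the loop cannot exit. The rewritten loop works because const auto deduces int. The function names below are hypothetical, and the two cases are simulated with explicit signed char and unsigned char so the effect shows up on any machine.

```cpp
#include <climits>

// Simulates `ch = getopt_long(...)` returning -1 (no more options) and then
// evaluates the loop's exit test `ch == -1` for a given storage type.
// Returns true if a loop written that way would be able to terminate.
template <typename StorageT>
bool loop_would_exit() {
  const int returned = -1;                      // getopt_long's end-of-options value
  const StorageT ch = static_cast<StorageT>(returned);  // narrowing store into ch
  return static_cast<int>(ch) == -1;            // promotion back to int for the compare
}

// Reports how plain `char` behaves for the compiler and target in use.
inline bool plain_char_is_signed() {
  return CHAR_MIN < 0;
}
```

Here loop_would_exit<signed char>() and loop_would_exit<int>() are true, while loop_would_exit<unsigned char>() is false. If this is indeed the cause, declaring ch as int in config.cpp would be the smallest fix; GCC's -fsigned-char flag would hide the symptom but leave the mismatch in place.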
Anyway, being able to get a little further into the start-up process was a sign I should keep looking into the issue. Using my bush-league debugging skills (i.e. liberal use of std::cerr), I determined that things were getting bound up on:
load_from_cli(argc, argv);
A look at that method reveals another, similar while statement:
void Config::load_from_cli(int argc, char *argv[]) {
  int option_index = 0;
  char ch;
  optind = 0;
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) !=
         -1) {
    switch (ch) {
    case 'l':
      setLogLevel(optarg);
      break;
    case 'p':
      setServerPort(optarg);
      break;
    case 'd':
      setDaemon(optarg ? std::string(optarg) : "true");
      break;
    case 'a':
      setServerIP(optarg);
      break;
    case 'c':
      setCertPath(optarg);
      break;
    case 'k':
      setKeyPath(optarg);
      break;
    case '?':
      throw std::runtime_error("Missing argument, see stderr");
    case 1:
      setLogFile(optarg);
      break;
    case 2:
      setPidFile(optarg);
      break;
    case 5:
      setCACertPath(optarg);
      break;
    case 6:
      setCertWhitelistPath(optarg);
      break;
    case 7:
      setCertBlacklistPath(optarg);
      break;
    case 8:
      setCubesDumpFile(optarg);
      break;
    case 9:
      setCubesDumpCleanInit();
      break;
    case 10:
      //setCubesNoDump();
      setCubesDumpEnabled();
      break;
    }
  }
}
Again, I determined that while ((ch = getopt_long(...)) != -1) was never breaking out of the while loop. I changed it to:
while (true) {
  const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);
  ...
  if (-1 == ch) {
    break;
  }
}
This did the trick, as it had done with the previous while loop. I was able to execute polycubed but ran into a new error:
[2023-01-26 15:25:19.131] [polycubed] [info] configuration parameters:
[2023-01-26 15:25:19.131] [polycubed] [info] loglevel: info
[2023-01-26 15:25:19.131] [polycubed] [info] daemon: false
[2023-01-26 15:25:19.131] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 15:25:19.131] [polycubed] [info] port: 9000
[2023-01-26 15:25:19.131] [polycubed] [info] addr: localhost
[2023-01-26 15:25:19.131] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:25:19.131] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:25:19.132] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 15:25:19.132] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 15:25:19.132] [polycubed] [info] polycubed starting...
[2023-01-26 15:25:19.132] [polycubed] [info] version v0.9.0
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.15.84-v8+
Unable to find kernel headers. Try rebuilding kernel with CONFIG_IKHEADERS=m (module)
chdir(/lib/modules/5.15.84-v8+/build): No such file or directory
[2023-01-26 15:25:19.180] [polycubed] [error] error creating patch panel: Unable to initialize BPF program
[2023-01-26 15:25:19.188] [polycubed] [critical] Error starting polycube: Error creating patch panel
Next, I grabbed the Linux kernel source from Raspberry Pi's GitHub and set up a symlink so that polycubed could find the kernel headers:
git clone --depth=1 https://github.com/raspberrypi/linux.git
mv linux linux-upstream-5.15.89-v8+
sudo ln -s /usr/src/linux-upstream-5.15.89-v8+ /lib/modules/5.15.89-v8+/build
sudo ~/polycube/build/src/polycubed/src/polycubed
This results in:
[2023-01-26 15:40:19.035] [polycubed] [info] configuration parameters:
[2023-01-26 15:40:19.035] [polycubed] [info] loglevel: trace
[2023-01-26 15:40:19.035] [polycubed] [info] daemon: false
[2023-01-26 15:40:19.036] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 15:40:19.036] [polycubed] [info] port: 9000
[2023-01-26 15:40:19.036] [polycubed] [info] addr: localhost
[2023-01-26 15:40:19.036] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 15:40:19.036] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 15:40:19.036] [polycubed] [info] polycubed starting...
[2023-01-26 15:40:19.036] [polycubed] [info] version v0.9.0
bpf: Failed to load program: Invalid argument
jump out of range from insn 9 to 37
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
[2023-01-26 15:40:46.751] [polycubed] [error] cannot load ctrl_rx: Failed to load controller_module_rx: -1
[2023-01-26 15:40:46.800] [polycubed] [critical] Error starting polycube: cannot load controller_module_rx
It is entirely possible that I am including the wrong version of bcc, which the project describes as follows:
BPF Compiler Collection (BCC)
BCC is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of extended BPF (Berkeley Packet Filters), formally known as eBPF, a new feature that was first added to Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.
I decided to step back and grab a clean copy of polycubed from GitHub.
pi@raspberrypi:~/polycube $ git submodule update --init --recursive
pi@raspberrypi:~/polycube/build $ cmake .. -DENABLE_PCN_IPTABLES=ON \
-DENABLE_SERVICE_BRIDGE=ON \
-DENABLE_SERVICE_DDOSMITIGATOR=OFF \
-DENABLE_SERVICE_FIREWALL=ON \
-DENABLE_SERVICE_HELLOWORLD=OFF \
-DENABLE_SERVICE_IPTABLES=ON \
-DENABLE_SERVICE_K8SFILTER=OFF \
-DENABLE_SERVICE_K8SWITCH=OFF \
-DENABLE_SERVICE_LBDSR=OFF \
-DENABLE_SERVICE_LBRP=OFF \
-DENABLE_SERVICE_NAT=ON \
-DENABLE_SERVICE_PBFORWARDER=ON \
-DENABLE_SERVICE_ROUTER=ON \
-DENABLE_SERVICE_SIMPLEBRIDGE=ON \
-DENABLE_SERVICE_SIMPLEFORWARDER=ON \
-DENABLE_SERVICE_TRANSPARENTHELLOWORLD=OFF \
-DENABLE_SERVICE_SYNFLOOD=OFF \
-DENABLE_SERVICE_PACKETCAPTURE=OFF -DENABLE_SERVICE_K8SDISPATCHER=OFF
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Version is v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
-- Latest recognized Git tag is v0.9.0
-- Git HEAD is a143e3c0325400dad7b9ff3406848f5a953ed3d1
-- Revision is 0.9.0-a143e3c0
-- Performing Test HAVE_NO_PIE_FLAG
-- Performing Test HAVE_NO_PIE_FLAG - Success
-- Performing Test HAVE_REALLOCARRAY_SUPPORT
-- Performing Test HAVE_REALLOCARRAY_SUPPORT - Success
-- Found LLVM: /usr/lib/llvm-9/include 9.0.1 (Use LLVM_ROOT envronment variable for another version of LLVM)
-- Found BISON: /usr/bin/bison (found version "3.7.5")
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
-- Found LibElf: /usr/lib/aarch64-linux-gnu/libelf.so
-- Performing Test ELF_GETSHDRSTRNDX
-- Performing Test ELF_GETSHDRSTRNDX - Success
-- Could NOT find LibDebuginfod (missing: LIBDEBUGINFOD_LIBRARIES LIBDEBUGINFOD_INCLUDE_DIRS)
-- Using static-libstdc++
-- Could NOT find LuaJIT (missing: LUAJIT_LIBRARIES LUAJIT_INCLUDE_DIR)
-- jsoncons v0.142.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- The following OPTIONAL packages have been found:
* BISON
* FLEX
* Threads
-- The following REQUIRED packages have been found:
* LibYANG
* LLVM
* LibElf
-- The following OPTIONAL packages have not been found:
* LibDebuginfod
* LuaJIT
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
-- Found OpenSSL: /usr/lib/aarch64-linux-gnu/libcrypto.so (found version "1.1.1n")
-- Checking for module 'libnl-3.0'
-- Found libnl-3.0, version 3.4.0
-- Checking for module 'libnl-genl-3.0'
-- Found libnl-genl-3.0, version 3.4.0
-- Checking for module 'libnl-route-3.0'
-- Found libnl-route-3.0, version 3.4.0
-- Checking for module 'libtins'
-- Found libtins, version 3.5
-- Found nlohmann_json: /home/pi/polycube/cmake/nlohmann_json/Findnlohmann_json.cmake (Required is at least version "3.5.0")
-- Checking for module 'systemd'
-- Found systemd, version 247
-- systemd services install dir: /lib/systemd/system
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/polycube/build
cd ../src/libs/prometheus-cpp
mkdir build; cd build
cmake .. -DBUILD_SHARED_LIBS=ON
make
sudo make install
I made changes to config.cpp to deal with our issue with getopt_long and the while loop. The changes are in my polycube clone.
I also did not have to add any of the #include lines that I had added during my first attempt on a SOQuartz module.
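For reference, here is a minimal sketch of an option-scanning loop that keeps the original while form but stores the result of getopt_long in an int, which is the type the function returns. This is not the code in my clone. The option table demo_options and the function find_configfile are hypothetical stand-ins for illustration, and the value 4 for the config-file option simply mirrors the case 4 seen in config.cpp.

```cpp
#include <getopt.h>
#include <string>

// Hypothetical option table for illustration only; polycube defines its own.
static const struct option demo_options[] = {
    {"loglevel", required_argument, nullptr, 'l'},
    {"configfile", required_argument, nullptr, 4},
    {nullptr, 0, nullptr, 0},
};

// Returns the value given to --configfile, or an empty string if absent.
std::string find_configfile(int argc, char *argv[]) {
  std::string configfile;
  int option_index = 0;
  int ch;      // int, matching getopt_long's return type, so -1 survives intact
  optind = 0;  // restart scanning, as polycube's load_from_cli does (glibc behaviour)
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", demo_options,
                           &option_index)) != -1) {
    switch (ch) {
    case 4:
      configfile = optarg;
      break;
    default:
      break;  // other options are ignored in this sketch
    }
  }
  return configfile;
}
```

Because ch is an int, the comparison against -1 behaves the same whether plain char is signed or unsigned on the target.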
sudo src/polycubed/src/polycubed
[2023-01-26 20:58:06.453] [polycubed] [info] loading configuration from /etc/polycube/polycubed.conf
[2023-01-26 20:58:06.456] [polycubed] [info] configuration parameters:
[2023-01-26 20:58:06.456] [polycubed] [info] loglevel: info
[2023-01-26 20:58:06.456] [polycubed] [info] daemon: false
[2023-01-26 20:58:06.456] [polycubed] [info] pidfile: /var/run/polycube.pid
[2023-01-26 20:58:06.456] [polycubed] [info] port: 9000
[2023-01-26 20:58:06.456] [polycubed] [info] addr: localhost
[2023-01-26 20:58:06.456] [polycubed] [info] logfile: /var/log/polycube/polycubed.log
[2023-01-26 20:58:06.456] [polycubed] [info] cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 20:58:06.456] [polycubed] [info] cubes-dump-clean-init: false
[2023-01-26 20:58:06.457] [polycubed] [info] cubes-dump-enable: false
[2023-01-26 20:58:06.457] [polycubed] [info] polycubed starting...
[2023-01-26 20:58:06.457] [polycubed] [info] version v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
prog tag mismatch 3e70ec38a5f6710 1
WARNING: cannot get prog tag, ignore saving source with program tag
prog tag mismatch 1e2ac42799daebd8 1
WARNING: cannot get prog tag, ignore saving source with program tag
[2023-01-26 20:58:23.636] [polycubed] [info] rest server listening on '127.0.0.1:9000'
[2023-01-26 20:58:23.637] [polycubed] [info] rest server starting ...
[2023-01-26 20:58:23.740] [polycubed] [info] service bridge loaded using libpcn-bridge.so
[2023-01-26 20:58:23.779] [polycubed] [info] service firewall loaded using libpcn-firewall.so
[2023-01-26 20:58:23.882] [polycubed] [info] service nat loaded using libpcn-nat.so
[2023-01-26 20:58:24.012] [polycubed] [info] service pbforwarder loaded using libpcn-pbforwarder.so
[2023-01-26 20:58:24.145] [polycubed] [info] service router loaded using libpcn-router.so
[2023-01-26 20:58:24.210] [polycubed] [info] service simplebridge loaded using libpcn-simplebridge.so
[2023-01-26 20:58:24.239] [polycubed] [info] service simpleforwarder loaded using libpcn-simpleforwarder.so
[2023-01-26 20:58:24.282] [polycubed] [info] service iptables loaded using libpcn-iptables.so
[2023-01-26 20:58:24.412] [polycubed] [info] service dynmon loaded using libpcn-dynmon.so
[2023-01-26 20:58:24.412] [polycubed] [info] loading metrics from yang files
The daemon runs successfully. I do, however, need to capture the work I did to put the Linux kernel source headers in place, where the daemon can find them when it compiles the eBPF code into bytecode.
- Clone the Linux repository from Raspberry Pi, https://github.com/raspberrypi/linux, into /usr/src on the Raspberry Pi
- In /lib/modules/5.15.84-v8+/ make a symlink named build and point it to /usr/src/linux-upstream-5.15.89-v8+
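The steps above can be checked before starting the daemon. The sketch below builds the path that the chdir line in the earlier error log pointed at, /lib/modules/<release>/build, from the running kernel's release string, and tests whether it resolves to a directory. Both function names are hypothetical and this is not something polycube itself provides.

```cpp
#include <string>
#include <sys/stat.h>
#include <sys/utsname.h>

// Builds "/lib/modules/<running kernel release>/build".
// Returns an empty string if uname() fails.
std::string kernel_build_path() {
  struct utsname info {};
  if (uname(&info) != 0) {
    return "";
  }
  return std::string("/lib/modules/") + info.release + "/build";
}

// True if the path exists and resolves to a directory. stat() follows
// symlinks, so a dangling "build" link is reported as missing.
bool kernel_build_dir_present() {
  const std::string path = kernel_build_path();
  struct stat st {};
  return !path.empty() && stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}
```

Using the release string from uname avoids hard-coding a kernel version, which matters if the running kernel and the cloned source tree were updated at different times.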
That will be it for the Work In Progress posts on polycube; I could attempt to recreate the steps taken, but I feel my notes across three posts should be enough. It also isn't as if polycube deployments are in hot demand. There is a strong likelihood that I am the first and only person who has run it on Arm-based hardware. The next post on polycube will be about actually using it, in particular the drop-in replacement for iptables; that is what I am most interested in.