I emailed three of the committers to the original Polycube project and received short replies from each. They basically said that polycube was never tested on an Arm-based system and will likely not work without significant effort, and that they believe the [polycube] project is no longer active. I wanted to follow through and test the former statement, to see how much effort it would really take to get a compiled binary of polycubed running on an Arm-based system.

In my previous Work In Progress post, I appeared to build an executable successfully, but when run, the program did nothing but consume 100% of one core of the Raspberry Pi's processor.

What does this mean? A hung process consuming 100% of one core feels to me like it is stuck in a loop whose exit/break condition is never met. I did what any ham-handed developer would do: I started at main() in polycubed.cpp and began putting std::cerr << "Code gets to this spot #1" << std::endl; into the code.

I narrowed this initial issue of the process hang to the following:

try {

    if (!config.load(argc, argv)) {
        exit(EXIT_SUCCESS);
    }

    std::cerr << "Configs loaded..." << std::endl;

} catch (const std::exception &e) {

    // The problem of the error in loading the config file may be due to
    // polycubed executed as normal user
    if (getuid())
        logger->critical("polycubed should be executed with root privileges");

    logger->critical("Error loading config: {}", e.what());
    exit(EXIT_FAILURE);
}

Neither of the cerr statements that I added was ever reached. This narrowed the issue down to config.load(argc, argv).

Looking at config.cpp and specifically at the method, load(int argc, char *argv[]), you will find the following:

bool Config::load(int argc, char *argv[]) {
  logger = spdlog::get("polycubed");

  int option_index = 0;
  char ch;

  // do a first pass looking for "configfile", "-h", "-v"
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) !=
         -1) {
    switch (ch) {
    case 'v':
      show_version();
      return false;
    case 'h':
      show_usage(argv[0]);
      return false;
    case 4:
      configfile = optarg;
      break;
    }
  }

  load_from_file(configfile);
  load_from_cli(argc, argv);
  check();

  if (cubes_dump_clean_init) {
    std::ofstream output(cubes_dump_file);
    if (output.is_open()) {
      output << "{}";
      output.close();
    }
  }

  return true;
}

Through some amateur debugging statements, I determined that while ((ch = getopt_long(...)) != -1) never exited. Why would this statement work flawlessly on amd64-based systems and not on Arm64 systems? I am still stumped as to why it would matter. However, rewriting the while loop as follows got me slightly further in the start-up process:

  while(true) {
    const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);

    switch (ch) {
    case 'v':
      show_version();
      return false;
    case 'h':
      show_usage(argv[0]);
      return false;
    case 4:
      configfile = optarg;
      break;
    }

    if(-1 == ch) {
      break;
    }
  }

Maybe someone with more systems experience and C++ knowledge might have an idea as to why these two blocks of code behave differently when run on different architectures.

Anyway, getting a little farther into the start-up process was a sign that I should keep looking into the issue. Using my bush-league debugging skills (i.e., liberal use of std::cerr), I determined that things were getting bound up on:

load_from_cli(argc, argv);

A look at that method reveals another, similar, while statement:

void Config::load_from_cli(int argc, char *argv[]) {
  int option_index = 0;
  char ch;
  optind = 0;
  while ((ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index)) !=
         -1) {
    switch (ch) {
    case 'l':
      setLogLevel(optarg);
      break;
    case 'p':
      setServerPort(optarg);
      break;
    case 'd':
      setDaemon(optarg ? std::string(optarg) : "true");
      break;
    case 'a':
      setServerIP(optarg);
      break;
    case 'c':
      setCertPath(optarg);
      break;
    case 'k':
      setKeyPath(optarg);
      break;
    case '?':
      throw std::runtime_error("Missing argument, see stderr");
    case 1:
      setLogFile(optarg);
      break;
    case 2:
      setPidFile(optarg);
      break;
    case 5:
      setCACertPath(optarg);
      break;
    case 6:
      setCertWhitelistPath(optarg);
      break;
    case 7:
      setCertBlacklistPath(optarg);
      break;
    case 8:
      setCubesDumpFile(optarg);
      break;
    case 9:
      setCubesDumpCleanInit();
      break;
    case 10:
      //setCubesNoDump();
      setCubesDumpEnabled();
      break;
    }
  }
}

Again, I determined that while ((ch = getopt_long(...)) != -1) never exited the loop. I changed it to:

  while(true) {

    const auto ch = getopt_long(argc, argv, "l:p:a:dhv", options, &option_index);

    ...

    if(-1 == ch) {
      break;
    }

  }

This did the trick, as it had done with the previous while loop. I was able to execute polycubed but ran into a new error:

[2023-01-26 15:25:19.131] [polycubed] [info] configuration parameters:
[2023-01-26 15:25:19.131] [polycubed] [info]  loglevel: info
[2023-01-26 15:25:19.131] [polycubed] [info]  daemon: false
[2023-01-26 15:25:19.131] [polycubed] [info]  pidfile: /var/run/polycube.pid
[2023-01-26 15:25:19.131] [polycubed] [info]  port: 9000
[2023-01-26 15:25:19.131] [polycubed] [info]  addr: localhost
[2023-01-26 15:25:19.131] [polycubed] [info]  logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:25:19.131] [polycubed] [info]  cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:25:19.132] [polycubed] [info]  cubes-dump-clean-init: false
[2023-01-26 15:25:19.132] [polycubed] [info]  cubes-dump-enable: false
[2023-01-26 15:25:19.132] [polycubed] [info] polycubed starting...
[2023-01-26 15:25:19.132] [polycubed] [info] version v0.9.0
modprobe: FATAL: Module kheaders not found in directory /lib/modules/5.15.84-v8+
Unable to find kernel headers. Try rebuilding kernel with CONFIG_IKHEADERS=m (module)
chdir(/lib/modules/5.15.84-v8+/build): No such file or directory
[2023-01-26 15:25:19.180] [polycubed] [error] error creating patch panel: Unable to initialize BPF program
[2023-01-26 15:25:19.188] [polycubed] [critical] Error starting polycube: Error creating patch panel

Next, I grabbed the Linux kernel source from Raspberry Pi's GitHub and set up a symlink so that polycubed could find the kernel headers:

git clone --depth=1 https://github.com/raspberrypi/linux.git
mv linux linux-upstream-5.15.89-v8+
sudo ln -s /usr/src/linux-upstream-5.15.89-v8+ /lib/modules/5.15.84-v8+/build
sudo ~/polycube/build/src/polycubed/src/polycubed

This results in:

[2023-01-26 15:40:19.035] [polycubed] [info] configuration parameters:
[2023-01-26 15:40:19.035] [polycubed] [info]  loglevel: trace
[2023-01-26 15:40:19.035] [polycubed] [info]  daemon: false
[2023-01-26 15:40:19.036] [polycubed] [info]  pidfile: /var/run/polycube.pid
[2023-01-26 15:40:19.036] [polycubed] [info]  port: 9000
[2023-01-26 15:40:19.036] [polycubed] [info]  addr: localhost
[2023-01-26 15:40:19.036] [polycubed] [info]  logfile: /var/log/polycube/polycubed.log
[2023-01-26 15:40:19.036] [polycubed] [info]  cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 15:40:19.036] [polycubed] [info]  cubes-dump-clean-init: false
[2023-01-26 15:40:19.036] [polycubed] [info]  cubes-dump-enable: false
[2023-01-26 15:40:19.036] [polycubed] [info] polycubed starting...
[2023-01-26 15:40:19.036] [polycubed] [info] version v0.9.0
bpf: Failed to load program: Invalid argument
jump out of range from insn 9 to 37
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

[2023-01-26 15:40:46.751] [polycubed] [error] cannot load ctrl_rx: Failed to load controller_module_rx: -1
[2023-01-26 15:40:46.800] [polycubed] [critical] Error starting polycube: cannot load controller_module_rx

It is entirely possible that I am including the wrong version of bcc, which its project describes as follows:

BPF Compiler Collection (BCC)

BCC is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of extended BPF (Berkeley Packet Filters), formally known as eBPF, a new feature that was first added to Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.


I decided to step back and grab a clean copy of polycube from GitHub.

pi@raspberrypi:~/polycube $ git submodule update --init --recursive
pi@raspberrypi:~/polycube/build $ cmake .. -DENABLE_PCN_IPTABLES=ON \
    -DENABLE_SERVICE_BRIDGE=ON \
    -DENABLE_SERVICE_DDOSMITIGATOR=OFF \
    -DENABLE_SERVICE_FIREWALL=ON \
    -DENABLE_SERVICE_HELLOWORLD=OFF \
    -DENABLE_SERVICE_IPTABLES=ON \
    -DENABLE_SERVICE_K8SFILTER=OFF \
    -DENABLE_SERVICE_K8SWITCH=OFF \
    -DENABLE_SERVICE_LBDSR=OFF \
    -DENABLE_SERVICE_LBRP=OFF \
    -DENABLE_SERVICE_NAT=ON \
    -DENABLE_SERVICE_PBFORWARDER=ON \
    -DENABLE_SERVICE_ROUTER=ON \
    -DENABLE_SERVICE_SIMPLEBRIDGE=ON \
    -DENABLE_SERVICE_SIMPLEFORWARDER=ON \
    -DENABLE_SERVICE_TRANSPARENTHELLOWORLD=OFF \
    -DENABLE_SERVICE_SYNFLOOD=OFF \
    -DENABLE_SERVICE_PACKETCAPTURE=OFF \
    -DENABLE_SERVICE_K8SDISPATCHER=OFF
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Version is v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
-- Latest recognized Git tag is v0.9.0
-- Git HEAD is a143e3c0325400dad7b9ff3406848f5a953ed3d1
-- Revision is 0.9.0-a143e3c0
-- Performing Test HAVE_NO_PIE_FLAG
-- Performing Test HAVE_NO_PIE_FLAG - Success
-- Performing Test HAVE_REALLOCARRAY_SUPPORT
-- Performing Test HAVE_REALLOCARRAY_SUPPORT - Success
-- Found LLVM: /usr/lib/llvm-9/include 9.0.1 (Use LLVM_ROOT envronment variable for another version of LLVM)
-- Found BISON: /usr/bin/bison (found version "3.7.5")
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
-- Found LibElf: /usr/lib/aarch64-linux-gnu/libelf.so  
-- Performing Test ELF_GETSHDRSTRNDX
-- Performing Test ELF_GETSHDRSTRNDX - Success
-- Could NOT find LibDebuginfod (missing: LIBDEBUGINFOD_LIBRARIES LIBDEBUGINFOD_INCLUDE_DIRS)
-- Using static-libstdc++
-- Could NOT find LuaJIT (missing: LUAJIT_LIBRARIES LUAJIT_INCLUDE_DIR)
-- jsoncons v0.142.0
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- The following OPTIONAL packages have been found:

 * BISON
 * FLEX
 * Threads

-- The following REQUIRED packages have been found:

 * LibYANG
 * LLVM
 * LibElf

-- The following OPTIONAL packages have not been found:

 * LibDebuginfod
 * LuaJIT

-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2")
-- Found OpenSSL: /usr/lib/aarch64-linux-gnu/libcrypto.so (found version "1.1.1n")  
-- Checking for module 'libnl-3.0'
--   Found libnl-3.0, version 3.4.0
-- Checking for module 'libnl-genl-3.0'
--   Found libnl-genl-3.0, version 3.4.0
-- Checking for module 'libnl-route-3.0'
--   Found libnl-route-3.0, version 3.4.0
-- Checking for module 'libtins'
--   Found libtins, version 3.5
-- Found nlohmann_json: /home/pi/polycube/cmake/nlohmann_json/Findnlohmann_json.cmake (Required is at least version "3.5.0")
-- Checking for module 'systemd'
--   Found systemd, version 247
-- systemd services install dir: /lib/systemd/system
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/polycube/build
cd ../src/libs/prometheus-cpp
mkdir build; cd build
cmake .. -DBUILD_SHARED_LIBS=ON
make
sudo make install

I made changes to config.cpp to deal with our issue with getopt_long and the while loop. The changes are in my polycube clone.

I also did not have to add any of the #include lines that I had added during my first attempt on a SOQuartz module.

sudo src/polycubed/src/polycubed
[2023-01-26 20:58:06.453] [polycubed] [info] loading configuration from /etc/polycube/polycubed.conf
[2023-01-26 20:58:06.456] [polycubed] [info] configuration parameters:
[2023-01-26 20:58:06.456] [polycubed] [info]  loglevel: info
[2023-01-26 20:58:06.456] [polycubed] [info]  daemon: false
[2023-01-26 20:58:06.456] [polycubed] [info]  pidfile: /var/run/polycube.pid
[2023-01-26 20:58:06.456] [polycubed] [info]  port: 9000
[2023-01-26 20:58:06.456] [polycubed] [info]  addr: localhost
[2023-01-26 20:58:06.456] [polycubed] [info]  logfile: /var/log/polycube/polycubed.log
[2023-01-26 20:58:06.456] [polycubed] [info]  cubes-dump-file: /etc/polycube/cubes.yaml
[2023-01-26 20:58:06.456] [polycubed] [info]  cubes-dump-clean-init: false
[2023-01-26 20:58:06.457] [polycubed] [info]  cubes-dump-enable: false
[2023-01-26 20:58:06.457] [polycubed] [info] polycubed starting...
[2023-01-26 20:58:06.457] [polycubed] [info] version v0.9.0+ [git: (branch/commit): master/a143e3c0-dirty]
prog tag mismatch 3e70ec38a5f6710 1
WARNING: cannot get prog tag, ignore saving source with program tag
prog tag mismatch 1e2ac42799daebd8 1
WARNING: cannot get prog tag, ignore saving source with program tag
[2023-01-26 20:58:23.636] [polycubed] [info] rest server listening on '127.0.0.1:9000'
[2023-01-26 20:58:23.637] [polycubed] [info] rest server starting ...
[2023-01-26 20:58:23.740] [polycubed] [info] service bridge loaded using libpcn-bridge.so
[2023-01-26 20:58:23.779] [polycubed] [info] service firewall loaded using libpcn-firewall.so
[2023-01-26 20:58:23.882] [polycubed] [info] service nat loaded using libpcn-nat.so
[2023-01-26 20:58:24.012] [polycubed] [info] service pbforwarder loaded using libpcn-pbforwarder.so
[2023-01-26 20:58:24.145] [polycubed] [info] service router loaded using libpcn-router.so
[2023-01-26 20:58:24.210] [polycubed] [info] service simplebridge loaded using libpcn-simplebridge.so
[2023-01-26 20:58:24.239] [polycubed] [info] service simpleforwarder loaded using libpcn-simpleforwarder.so
[2023-01-26 20:58:24.282] [polycubed] [info] service iptables loaded using libpcn-iptables.so
[2023-01-26 20:58:24.412] [polycubed] [info] service dynmon loaded using libpcn-dynmon.so
[2023-01-26 20:58:24.412] [polycubed] [info] loading metrics from yang files

The daemon runs successfully. I do, however, need to capture the work I did to put the Linux kernel source headers where the daemon can find them, since it needs them to compile the eBPF code into bytecode.

  1. Clone the Linux repository from Raspberry Pi, https://github.com/raspberrypi/linux, into /usr/src on the Raspberry Pi
  2. In /lib/modules/5.15.84-v8+/ make a symlink named build and point it to /usr/src/linux-upstream-5.15.89-v8+

That will be it for the Work In Progress posts on polycube. I could attempt to recreate the steps taken, but I feel my notes across three posts should be enough. It also isn't as if polycube deployments are in hot demand. There is a strong likelihood that I am the first and only person who has run it on Arm-based hardware. The next post on polycube will be about actually using it, in particular the drop-in replacement for iptables; that is what I am most interested in.