Results Summary Tool

KernelCI results summary tool, monitor and all that. DISCLAIMER: I typed this really fast, so expect some mistakes.

1 - result_summary tool

Code: https://github.com/kernelci/kernelci-pipeline/blob/main/src/result_summary.py

NOTE: This is currently implemented as a pipeline stage, although it's basically a standalone tool and it doesn't communicate with any other pipeline stage. Potential action item (low priority): move it out of the pipeline repo and make it a standalone client.

1.1 - What does it do

This tool gets results from a KernelCI API instance and builds output reports based on templates. It can be used to query the API instance for results based on a set of query parameters (summary mode) or to listen for events in real time (monitor mode).

  • summary mode: generates a single report file with the list of results found.
  • monitor mode: generates an individual report file for each received event that matches the filter parameters.

Since query parameters can get complicated and calling a command-line tool with hundreds of characters gets messy, parameters are organized as presets for user convenience. The presets file is self-documented.

The results go through a little bit of processing: currently, logs are fetched for every node. If a node doesn't specify any logs, they are searched for up the node hierarchy until found, then downloaded, and a log snippet is included in the report. If the log points to an empty file, it's treated as null.
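
The hierarchy lookup can be sketched like this (a hedged illustration only: the node layout and the "artifacts", "parent" and "log" field names are assumptions, not the actual KernelCI schema):

```python
def find_log_url(node, nodes_by_id):
    """Walk up the parent chain until a node that specifies a log is found.

    Returns the log URL, or None if no ancestor has one.
    """
    while node is not None:
        artifacts = node.get("artifacts") or {}
        if artifacts.get("log"):          # assumed artifact key
            return artifacts["log"]
        node = nodes_by_id.get(node.get("parent"))
    return None
```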

1.2 - Usage

1 - Fetch the kernelci-pipeline repo:

git clone https://github.com/kernelci/kernelci-pipeline.git

2 - Point the configuration to the target KernelCI API instance: edit config/kernelci.toml and change the setting of the "api_config" parameter. To use the staging instance, set it to "staging".

3 - Create a .env file with additional parameters (for summary mode it may be empty).

4 - Run the tool:

docker-compose run result_summary --preset="mypreset"

1.2.1 - Getting result summaries

Presets for summaries should define a "summary" action in the preset metadata (see the presets file for examples).
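
Purely for illustration, a preset entry might look roughly like this; every field name below is a guess at the shape, not the actual schema, so check the presets file in the repo for the real format:

```yaml
# Hypothetical sketch, not the real schema
mainline-next-test-failures:
  metadata:
    action: summary            # "summary" or "monitor"
    output_dir: "mainline-next"
  preset:
    test:
      result: fail
```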

Date ranges for results are typical command-line parameters for summary mode. For instance:

docker-compose run result_summary --preset="mainline-next-test-failures" --created-from=2024-04-26

If no dates are specified either on the command line or in the preset, it retrieves nodes created from yesterday until now.

Additional query parameters can be specified in the command line together with a preset specification. For example:

docker-compose run result_summary --preset="mainline-next-test-failures" --query-params="data.arch=arm64"
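
Conceptually, the extra parameters are overlaid on top of the preset's own query. A sketch under assumptions (merge_query_params is a made-up helper, and the "&"-separated key=value format is a guess, not the tool's documented syntax):

```python
def merge_query_params(preset_params, extra):
    """Overlay extra 'key=value&key=value' pairs on a preset's query dict."""
    params = dict(preset_params)
    if extra:
        for pair in extra.split("&"):
            key, _, value = pair.partition("=")
            params[key] = value
    return params
```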

The result file will be generated in the data/output directory. The directory name inside data/output and the file name can be specified in the preset or as command-line options.

1.2.2 - Running a monitor

Presets for monitors should define a "monitor" action in the preset metadata.

NOTE: In order to run in monitor mode you’ll need a KernelCI API authorization token with read/write permissions. If you have one, set it up as an env variable (KCI_API_TOKEN) in your .env file.
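
For instance, a minimal .env for monitor mode could contain just the token (placeholder value):

```
KCI_API_TOKEN=<your-token-here>
```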

Using the same kind of query parameters, a monitor preset defines which node events to listen to. The monitor connects to the KernelCI API and subscribes to node events of a certain type. Then it listens for events in a continuous loop. For each event received, it checks if it matches the parameters specified in the preset. If it does, it generates a report in data/output (the subdirectory name and file name can be specified as well).

Example:

docker-compose run result_summary --preset="monitor-all-build-failures" --output-dir="monitor-all-build-failures"

Note that it's convenient to run this in a loop, since the KernelCI API instance may be restarted occasionally.

2 - Collecting the results

As a proof of concept, I wrote a few scripts that help me run configurable instances of monitor processes, deploy the results to a web host and notify users in Mattermost/Slack. You may laugh at what you're about to witness, but this is pretty much how Amazon started:

2.1 - run_service.py

Code: https://gitlab.collabora.com/rcn/rcn/-/blob/master/tasks/KernelCI/001-kernelci_results_viewer/run_service.py?ref_type=heads

This takes a config file defining some general parameters and a list of monitor instances and:

  • With the '-s' command-line option: generates a dedicated output directory in data/output for each monitor instance, with a toml file describing certain parameters for each monitor (such as the start timestamp).

  • With the '-r' option: generates a shell script (runscreen.sh) and starts a terminal (good ol' xterm) that runs it. The script starts GNU screen with one tab per monitor and runs each monitor instance in its own tab in a loop. Each monitor runs a defined preset and writes its files to a dedicated directory in data/output (created in the step above). The loop is a crude way to recover from any possible KernelCI API instance reset. An additional tab then runs the run.sh script (see below).

  • With the '-c' option: clears all the output directories and saves the current reports in a "vault" directory.
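
The '-r' step could be sketched roughly like this (a simplified illustration: generate_runscreen and the monitor dict layout are made up, and the real script handles far more details):

```python
def generate_runscreen(monitors):
    """Build a shell script that opens one GNU screen tab per monitor,
    each running its preset in a restart loop."""
    lines = ["#!/bin/sh", "screen -dmS kci-monitors"]
    for mon in monitors:
        loop = ('while true; do docker-compose run result_summary '
                '--preset="{}" --output-dir="{}"; done'
                .format(mon["preset"], mon["output_dir"]))
        # Add one screen tab per monitor, named after the monitor
        lines.append("screen -S kci-monitors -X screen -t {} sh -c '{}'"
                     .format(mon["name"], loop))
    return "\n".join(lines) + "\n"
```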

2.2 - run.sh

Code: https://gitlab.collabora.com/rcn/rcn/-/blob/master/tasks/KernelCI/001-kernelci_results_viewer/run.sh?ref_type=heads

The most originally-named script ever runs a loop once per minute: it collects all the monitor results into a "reports" directory, calls the generate_pages.py script (see below) to sanitize them and generate index pages, and then pushes everything to a web space.

2.3 - generate_pages.py

Code: https://gitlab.collabora.com/rcn/rcn/-/blob/master/tasks/KernelCI/001-kernelci_results_viewer/generate_pages.py?ref_type=heads

This script goes through every monitor subdirectory in "reports", generates an index page for each directory and then a global index. It counts the number of reports for each monitor, removes duplicates (KernelCI nodes can be updated, so you may get more than one notification per node) and extracts some info from them (title, etc.).
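
The deduplication step can be sketched as keeping only the newest report per node (the field names here are illustrative, not the actual report format):

```python
def deduplicate_reports(reports):
    """Keep only the most recently updated report for each node id,
    since updated nodes can produce several notifications."""
    latest = {}
    for report in reports:
        node_id = report["node_id"]
        kept = latest.get(node_id)
        if kept is None or report["updated"] > kept["updated"]:
            latest[node_id] = report
    return list(latest.values())
```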

2.4 - notifier.py

Code: https://gitlab.collabora.com/rcn/rcn/-/blob/master/tasks/KernelCI/001-kernelci_results_viewer/notifier.py?ref_type=heads

Finally, this reads the config file and a previous status file (if present) and generates a notification for all monitors that have "notify: true". If there are new reports since the last one and there are subscribers listed in the monitor definition, it'll also create a notification for them. Currently, notifications are pushed to Mattermost; Slack notifications are pending.
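
The decision logic could be sketched like this (a hedged illustration; the config and status-file layouts are assumptions):

```python
def monitors_to_notify(config, previous_counts, current_counts):
    """Return the monitors that want notifications and have new reports
    since the last recorded status."""
    pending = []
    for name, monitor in config.items():
        if not monitor.get("notify"):
            continue
        # Notify only when the report count grew since the last run
        if current_counts.get(name, 0) > previous_counts.get(name, 0):
            pending.append(name)
    return pending
```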

3 - Conclusion

So, basically, I'm taking a bunch of generated files from one tool and piping them through other tools in pure UNIX fashion to process them, add info and transform the result. It's nothing more than a proof of concept to have something running on my machine, and, as pure as it is from a design perspective (like UNIX), it also sucks (like UNIX). Turning this into a properly planned tool is something we should consider at some point if we want to become better people. At least it can serve as a prototype for part of the functionality we'd like from the definitive front-end.

Thanks for reading, have a great day.