Commit Graph

674 Commits

Author SHA1 Message Date
Kir Kolyshkin 3e0829d195 tests/rootless.sh: fix skipping idmap tests for systemd
When RUNC_USE_SYSTEMD is set, tests/rootless.sh is using

	ssh -tt rootless@localhost

to run tests as rootless user. In this case, local environment is not
passed to the user's ssh session (unless explicitly specified), and so
the tests do not get ROOTLESS_FEATURES.

As a result, idmap-related tests are skipped when running as rootless
using systemd cgroup driver:

	integration test (systemd driver)
	...
	[02] run rootless tests ... (idmap)
	...
	ok 286 runc run detached ({u,g}id != 0) # skip test requires rootless_idmap
	...

Fix this by creating a list of environment variables needed by the
tests, and adding those to ssh command line (in case of ssh) or
exporting (in case of sudo) so both cases work similarly.

Also, modify disable_idmap to unset variables set in enable_idmap so
they are not exported at all if idmap is not in features.
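
The variable-list approach described above can be sketched as follows. This is a minimal illustration, not the code from tests/rootless.sh: the list contents and the helper name env_assignments are assumptions.

```shell
# Assumed list of variables the tests need; the real list in
# tests/rootless.sh may differ.
ENV_VARS="ROOTLESS_FEATURES ROOTLESS_AUX_DIR ROOTLESS_AUX_UID"

# Print NAME=value for each listed variable that is currently set.
# Unset variables are skipped, so they are not passed along at all.
env_assignments() {
	for name in $ENV_VARS; do
		eval "[ -n \"\${$name+set}\" ]" || continue
		eval "printf '%s=%s\n' \"$name\" \"\$$name\""
	done
}
```

In the ssh case the printed assignments could be placed in front of the remote command (values containing whitespace would need extra quoting); in the sudo case the same names could simply be exported.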

Fixes: bf15cc99 ("cgroup v2: support rootless systemd")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-04-07 11:27:48 -07:00
Kir Kolyshkin ac2a53be8e tests: rename AUX_{DIR,UID} to ROOTLESS_AUX_*
Also, fix the typo (AUX_DIX) in cleanup.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-04-07 10:53:37 -07:00
Kir Kolyshkin 9932ad19be tests/int: introduce the concept of unsafe tests
Some of runc integration tests may do something that I would not like
when running those on my development laptop. Examples include

 - changing the root mount propagation [1];
 - replacing /root/runc [2];
 - changing the file in /etc (see checkpoint.bats).

Yet it is totally fine to do all that in a throwaway CI environment,
or inside a Docker container.

Introduce a mechanism to skip specific "unsafe" tests unless an
environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it
from a specific checkpoint/restore test which modifies
/etc/criu/default.conf.
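
A guard of this kind might look roughly like the sketch below. Only the RUNC_ALLOW_UNSAFE_TESTS variable comes from the description above; the function names are hypothetical.

```shell
# Hypothetical helper: succeed only if unsafe tests were explicitly allowed.
unsafe_tests_allowed() {
	[ -n "${RUNC_ALLOW_UNSAFE_TESTS:-}" ]
}

# Hypothetical guard for the top of an unsafe test. In a bats test the
# message would be passed to bats' "skip" instead of being printed.
unsafe_test_guard() {
	if ! unsafe_tests_allowed; then
		echo "skip: unsafe test; set RUNC_ALLOW_UNSAFE_TESTS to run it"
		return 1
	fi
}
```

Making the opt-in explicit keeps the default safe for a development machine, while a throwaway CI environment can set the variable once.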

[1]: https://github.com/opencontainers/runc/pull/5200
[2]: https://github.com/opencontainers/runc/pull/5207

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-04-02 20:03:47 -07:00
Aleksa Sarai 47fba7e4b1 go fix: use (*sync.WaitGroup).Go
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2026-03-30 13:13:53 -07:00
Kir Kolyshkin f9a9a36fa8 tests/int: allow cpu quota cgroup v1 files fds
Since switching to Go 1.25 in go.mod, the "detect fd leaks" test fails
like this:

> not ok 57 runc create[detect fd leak as comprehensively as possible]
> # (in test file tests/integration/create.bats, line 76)
> #   `[ "$violation_found" -eq 0 ]' failed
> ...
> # Violation: FD 9 -> '/system.slice/runc-test_busybox.scope/cpu.cfs_quota_us'
> # Violation: FD 10 -> '/system.slice/runc-test_busybox.scope/cpu.cfs_period_us'
> ...

This happens because Go 1.25 adds a feature to dynamically set GOMAXPROCS
based on current CPU quota values. This feature can be disabled by setting

	GODEBUG=containermaxprocs=0,updatemaxprocs=0

but it is harmless to keep it (except for the above test failure).

Add an exception to the test case.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-03-30 13:13:50 -07:00
lifubang 7fdab1cb69 test: check mount source fds are cleaned up with idmapped mounts
Signed-off-by: lifubang <lifubang@acmcoder.com>
2026-03-20 01:17:08 +00:00
Kir Kolyshkin 0079bee17f Support specs.LinuxSeccompFlagWaitKillableRecv
This adds support for WaitKillableRecv seccomp flag
(also known as SCMP_FLTATR_CTL_WAITKILL in libseccomp and
as SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV in the kernel).

This requires:
 - libseccomp >= 2.6.0
 - libseccomp-golang >= 0.11.0
 - linux kernel >= 5.19

Note that this flag does not make sense without NEW_LISTENER, and
the kernel returns EINVAL when SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV
is set but SECCOMP_FILTER_FLAG_NEW_LISTENER is not set.

For runc this means that .linux.seccomp.listenerPath should also be set,
and some of the seccomp rules should have SCMP_ACT_NOTIFY action. This
is why the flag is tested separately in seccomp-notify.bats.

At the moment the only adequate CI environment for this functionality is
Fedora 43. On all other platforms (including CentOS 10 and Ubuntu 24.04)
the test is skipped with a message similar to this:

> ok 251 runc run [seccomp] (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV) # skip requires libseccomp >= 2.6.0 and API level >= 7 (current version: 2.5.6, API level: 6)

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-03-16 10:48:42 -07:00
Aleksa Sarai bb9ee2b0df integration: output debug information in fd leak test
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2026-03-12 17:58:09 +09:00
Ricardo Branco f18e97d312 tests/int: Disable coredumps for SCMP_ACT_KILL tests
SCMP_ACT_KILL terminates the process with a fatal signal, which may
produce a core dump depending on the host configuration.

While this is harmless on ephemeral CI instances, it can leave unwanted
core files on developer or customer systems. It also interferes with
test environments that detect unexpected core dumps.

Signed-off-by: Ricardo Branco <rbranco@suse.de>
2026-02-25 13:22:17 +01:00
Kir Kolyshkin 1fdbab8107 tests/int: add "runc exec [init changes cgroup]"
Add a test case to reproduce runc issue 5089.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2026-02-11 11:57:27 -08:00
lifubang 5560d55bfd libct/specconv: fix partial clear of atime mount flags
When parsing mount options into recAttrSet and recAttrClr,
the code sets attr_clr to individual atime flags (e.g.
MOUNT_ATTR_NOATIME or MOUNT_ATTR_STRICTATIME) when clearing
atime attributes. However, this violates the kernel's
requirement documented in mount_setattr(2)[1]:

> Note that, since the access-time values are an enumeration
> rather than bit values, a caller wanting to transition to a
> different access-time setting cannot simply specify the
> access-time setting in attr_set, but must also include
> MOUNT_ATTR__ATIME in the attr_clr field.  The kernel will
> verify that MOUNT_ATTR__ATIME isn't partially set in
> attr_clr (i.e., either all bits in the MOUNT_ATTR__ATIME
> bit field are either set or clear), and that attr_set
> doesn't have any access-time bits set if MOUNT_ATTR__ATIME
> isn't set in attr_clr.

Passing only a single atime flag (e.g. MOUNT_ATTR_RELATIME) in
attr_clr causes mount_setattr() to fail with EINVAL.

This change ensures that whenever an atime mode is updated,
attr_clr includes MOUNT_ATTR__ATIME to properly reset the
entire access-time attribute field before applying the new mode.
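
The rule quoted above comes down to simple bit arithmetic, sketched here. The constants are the values from linux/mount.h; the helper names are hypothetical and this is not the specconv code itself.

```shell
# Values from <linux/mount.h>.
MOUNT_ATTR__ATIME=$((0x70))       # mask covering the whole atime field
MOUNT_ATTR_NOATIME=$((0x10))
MOUNT_ATTR_STRICTATIME=$((0x20))

# Hypothetical helper: given an attr_clr value, print it with the full
# atime mask included, as needed whenever the atime mode changes.
atime_attr_clr() {
	echo $(( $1 | $MOUNT_ATTR__ATIME ))
}

# Succeeds if the atime field in attr_clr is either fully set or fully
# clear, which is the condition the kernel verifies.
atime_clr_is_valid() {
	field=$(( $1 & $MOUNT_ATTR__ATIME ))
	[ "$field" -eq 0 ] || [ "$field" -eq "$MOUNT_ATTR__ATIME" ]
}
```

For example, an attr_clr of MOUNT_ATTR_NOATIME alone fails the check, while the value printed by atime_attr_clr passes it.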

[1] https://man7.org/linux/man-pages/man2/mount_setattr.2.html

Signed-off-by: lifubang <lifubang@acmcoder.com>
2026-02-06 03:30:55 +00:00
lifubang 9632f1e198 integration: quote shell value to prevent word splitting
Signed-off-by: lifubang <lifubang@acmcoder.com>
2026-01-06 10:02:03 +00:00
Ricardo Branco c1ba275d88 integration: Skip test for new privileges if NoNewPrivs is set
Signed-off-by: Ricardo Branco <rbranco@suse.de>
2026-01-06 00:55:15 +01:00
lifubang 15d7c214cd integration: add some tests for bind mount through dangling symlinks
We intentionally broke this in commit d40b3439a9 ("rootfs: switch to
fd-based handling of mountpoint targets") under the assumption that most
users do not need this feature. Sadly it turns out they do, and so
commit 3f925525b4 ("rootfs: re-allow dangling symlinks in mount
targets") added a hotfix to re-add this functionality.

This patch adds some much-needed tests for this behaviour, since it
seems we are going to need to keep this for compatibility reasons (at
least until runc v2...).

Co-developed-by: lifubang <lifubang@acmcoder.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-26 21:04:05 +11:00
lifubang d8706501cf integration: verify syscall compatibility after seccomp enforcement
Signed-off-by: lifubang <lifubang@acmcoder.com>
2025-11-20 19:43:22 +08:00
lifubang b209358db3 ci: detect file descriptor leaks as comprehensively as possible
Co-authored-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: lifubang <lifubang@acmcoder.com>
2025-11-20 19:43:22 +08:00
lifubang bba7647d09 ci: ensure the cgroup(v1) parent always exists for rootless
On some systems (e.g., AlmaLinux 8), systemd automatically removes cgroup paths
when they become empty (i.e., contain no processes). To prevent this, we spawn
a dummy process to pin the cgroup in place.
Fix: https://github.com/opencontainers/runc/issues/5003
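
The pinning idea can be sketched as below. The helper name is hypothetical, and the sketch assumes the caller passes a cgroup directory it is allowed to write to.

```shell
# Hypothetical helper: move a long-running dummy process into the cgroup
# directory given as $1 so the cgroup never becomes empty.
# Prints the PID of the dummy process so it can be killed during cleanup.
pin_cgroup() {
	# Detach the output streams so callers using $(...) do not block.
	sleep 1000000 >/dev/null 2>&1 &
	echo "$!" >"$1/cgroup.procs"
	echo "$!"
}
```

The caller would keep the printed PID and kill it once the tests are done, at which point the cgroup may be removed again.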

Signed-off-by: lifubang <lifubang@acmcoder.com>
2025-11-18 13:58:46 +00:00
Aleksa Sarai 72421e0e25 tests: add pids.limit tests
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-11 15:16:50 +11:00
Aleksa Sarai 9a9719eeb4 rootfs: only set mode= for tmpfs mount if target already existed
This was always the intended behaviour but commit 72fbb34f50 ("rootfs:
switch to fd-based handling of mountpoint targets") regressed it when
adding a mechanism to create a file handle to the target if it didn't
already exist (causing the later stat to always succeed).

A lot of people depend on this functionality, so add some tests to make
sure we don't break it in the future.

Fixes: 72fbb34f50 ("rootfs: switch to fd-based handling of mountpoint targets")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-08 23:11:57 +11:00
Kir Kolyshkin 3c2683f52f tests/int/cgroups: use heredoc to break a long line
This is mostly to improve readability. While at it, make the script more
robust by adding -e option to shell. The exception is echo $pid which is
opportunistic and may fail depending on the order of pids in the file.

Also, remove the empty comment and a shellcheck annotation.

Fixes: c91fe9ae
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:35:51 -07:00
Kir Kolyshkin b82ae3afdc tests/int/delete: fix pause test for rootless case
The "runc delete --force [paused container]" test case does not check
the runc pause exit code, and if such a check is added, the test fails in
rootless tests, because:
 - not all rootless tests have access to cgroups;
 - rootless containers don't have a default cgroups path.

To fix, add:
  - setup for rootless case;
  - require cgroups_freezer;
  - runc pause exit code check.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:33:27 -07:00
Kir Kolyshkin ad72eab6c7 tests/int/checkpoint: fix using run twice
In our bats tests, runc itself is a wrapper which calls bats run helper,
so using "run runc" is wrong as it results in calling run helper twice.

Fixes: 8d180e965
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin 92f3d1b225 tests/int/cgroups.bats: fix a wrong comment
This misleading comment is obviously a copy/paste from the previous
test. Fix it.

Fixes: dd696235
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin b3a9f423b9 tests/int: remove bogus $status checks
Commands that are not run via "run" helper (cat, mkdir, __runc)
do not set $status, so it makes no sense to check it.
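
The effect can be reproduced outside bats with a simplified stand-in for the run helper. The run function below is an assumption made for this sketch, not bats' actual implementation.

```shell
# Simplified stand-in for bats' "run" helper: capture the command's
# output and exit status in $output and $status.
run() {
	status=0
	output=$("$@" 2>&1) || status=$?
}

run sh -c 'echo oops; exit 3'
echo "after run: status=$status"    # set by the helper

cat /dev/null
echo "after cat: status=$status"    # stale: cat was not run via "run"
```

The second check still sees the value left by the earlier run call, so testing $status after a plain command says nothing about that command.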

Fixes: 94505a04, ed548376
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin 693a471af8 tests/int: use run with a status check
...instead of an explicit or absent status check.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin 773a44cc1d tests/int/netdev: slight refactoring
Move the repetitive code and comment into setup_netns.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin 0eb03ef86f tests/int: remove useless/obvious comments
This is a bit opinionated, but some comments in integration tests do not
really help to understand the nature of the tests being performed by
stating something very obvious, like

	# run busybox detached
	runc run -d busybox

To make things worse, these not-so-helpful messages are being
copy/pasted over and over, and that is the main reason to remove them.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin 772e91062d tests/int/README: update
1. Remove the devicemapper driver mentions, as it is no longer
   supported by docker (or podman).

2. Remove the test example -- we have plenty of real ones.

3. Add a link to (well written and extensive) bats documentation.

4. Fix capitalization in a sentence.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-18 15:30:16 -07:00
Kir Kolyshkin ef61b7f0be tests/int: add check for hugetlb stats
As promised in

	https://github.com/opencontainers/cgroups/pull/24#pullrequestreview-3007872832

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-10-08 00:42:08 -07:00
Aleksa Sarai a672a5f36c merge #4726 into opencontainers/runc:main
Antti Kervinen (1):
  Add memory policy support

LGTMs: lifubang AkihiroSuda cyphar
2025-10-08 05:18:13 +11:00
Antti Kervinen eda7bdf80c Add memory policy support
Implement support for Linux memory policy in OCI spec PR:
https://github.com/opencontainers/runtime-spec/pull/1282

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-10-07 15:06:37 +03:00
Joshua Rogers 8c1b3f9608 fix(seccompagent): close received FDs, not loop index
Prevents accidentally closing 0/1/2 on error paths.

Signed-off-by: Joshua Rogers <MegaManSec@users.noreply.github.com>
2025-10-06 06:16:33 +08:00
Aleksa Sarai 627054d246 lint/revive: add package doc comments
This silences all of the "should have a package comment" lint warnings
from golangci-lint.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-10-03 15:17:43 +10:00
lfbzhm 00aec12c71 Merge pull request #4842 from tianon/busybox
Update `busybox:glibc` in integration tests to latest builds
2025-09-27 16:04:38 +08:00
Kir Kolyshkin 5af4dd4e64 runc exec: use CLONE_INTO_CGROUP when available
It makes sense to make runc exec benefit from clone3(CLONE_INTO_CGROUP),
if it is available. Since it requires a recent kernel and might not work,
implement a fallback to the older way of joining the cgroup.

Based on:
 - https://go-review.googlesource.com/c/go/+/417695
 - https://github.com/coreos/go-systemd/pull/458
 - https://github.com/opencontainers/cgroups/pull/26
 - https://github.com/opencontainers/runc/pull/4822

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-09-26 14:27:18 -07:00
Tianon Gravi ce5400da08 Update busybox:glibc in integration tests to latest (1.37.0) builds
This removes `mips64le` (no longer supported by the image / upstream in Debian Trixie+) and adds `riscv64`.

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
2025-09-25 17:06:20 -07:00
Kir Kolyshkin 7d81b21c1a Merge pull request #4900 from lifubang/fix-Personality-seccomp
libct: setup personality before initializing seccomp
2025-09-25 16:59:28 -07:00
lifubang 57f1bef422 test: runc run with personality syscall blocked by seccomp
Signed-off-by: lifubang <lifubang@acmcoder.com>
2025-09-25 09:54:08 +00:00
Kir Kolyshkin 77ead42c9f Merge pull request #4822 from kolyshkin/add-pid
runc exec: use manager.AddPid
2025-09-17 18:43:25 -07:00
Kir Kolyshkin 37b5acc2d7 libct: use manager.AddPid to add exec to cgroup
The main benefit here is when we are using a systemd cgroup driver,
we actually ask systemd to add a PID, rather than doing it ourselves.
This way, we can add rootless exec PID to a cgroup.

This requires newer opencontainers/cgroups and coreos/go-systemd.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-09-16 13:31:16 -07:00
donettom-1 830c479ae2 tests/int/cgroups: Use 64K aligned limits for memory.max
When a non–page-aligned value is written to memory.max, the kernel aligns it
down to the nearest page boundary. On systems with a page size greater
than 4K (e.g., 64K), this caused failures because the configured
memory.max value was not 64K aligned.

This patch fixes the issue by explicitly aligning the memory.max value
to 64K. Since 64K is also a multiple of 4K, the value is correctly
aligned on both 4K and 64K page size systems.

However, this approach will still fail on systems where the hardcoded
memory.max value is not aligned to the system page size.
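
The alignment arithmetic is shown below as a small sketch. The helper is hypothetical; the test itself uses hardcoded values.

```shell
# Round byte count $1 down to a multiple of alignment $2.
align_down() {
	echo $(( $1 / $2 * $2 ))
}

# A value aligned to 64 KiB is also aligned to 4 KiB.
limit=$(align_down 1000000 65536)
echo "$limit"
```

A value produced this way is unchanged by a further align_down to 4096, which is why a 64K-aligned limit works on both page sizes.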

Fixes: https://github.com/opencontainers/runc/issues/4841

Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
2025-09-16 17:31:35 +05:30
Kir Kolyshkin a38f42ab87 tests/int/help: simplify and fix
1. In case runc binary file name is not runc, the test fails like
   below. The fix is to get the binary name from $RUNC.

	 ✗ runc command -h
	   (in test file tests/integration/help.bats, line 27)
	     `[[ ${lines[1]} =~ runc\ checkpoint+ ]]' failed
	   runc-go1.25.0-main checkpoint -h (status=0):
	   NAME:
	      runc-go1.25.0-main checkpoint - checkpoint a running container

2. Simplify the test by adding a loop for all commands. While at it, add
   a loop for -h --help as well.

3. Add missing commands (create, ps, features).
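
The first two items can be sketched as follows. The function names are hypothetical and the command list is a short illustrative subset, not the full list used in help.bats.

```shell
# Derive the name shown in help output from the path in $RUNC.
runc_binary_name() {
	echo "${RUNC##*/}"
}

# Print every help invocation to check, one per line.
help_invocations() {
	for cmd in checkpoint create ps features; do
		for flag in -h --help; do
			echo "$(runc_binary_name) $cmd $flag"
		done
	done
}
```

With RUNC set to a path ending in runc-go1.25.0-main, the expected name in the help output becomes runc-go1.25.0-main rather than a hardcoded runc.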

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-27 18:08:51 -07:00
Kir Kolyshkin c5e7bc8710 tests/int/selinux: fix for non-standard binary name
The setup in selinux.bats assumes $RUNC binary name ends in runc, and
thus it fails when we run it like this:

	sudo -E RUNC=$(pwd)/runc.patched bats tests/integration/selinux.bats

Fix is easy.

Fixes: b39781b06 ("tests/int: add selinux test case")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-08-27 18:08:51 -07:00
Aleksa Sarai 121192ade6 libct: reset CPU affinity by default
In certain deployments, it's possible for runc to be spawned by a
process with a restrictive cpumask (such as from a systemd unit with
CPUAffinity=... configured) which will be inherited by runc and thus the
container process by default.

The cpuset cgroup used to reconfigure the cpumask automatically for
joining processes, but kernel commit da019032819a ("sched: Enforce user
requested affinity") changed this behaviour in Linux 6.2.

The solution is to try to emulate the expected behaviour by resetting
our cpumask to correspond with the configured cpuset (in the case of
"runc exec", if the user did not configure an alternative one). Normally
we would have to parse /proc/stat and /sys/fs/cgroup, but luckily
sched_setaffinity(2) will transparently convert an all-set cpumask (even
if it has more entries than the number of CPUs on the system) to the
correct value for our usecase.

For some reason, in our CI it seems that rootless --systemd-cgroup
results in the cpuset (presumably temporarily?) being configured such
that sched_setaffinity(2) will allow the full set of CPUs. For this
particular case, all we care about is that it is different to the
original set, so include some special-casing (but we should probably
investigate this further...).

Reported-by: ningmingxiao <ning.mingxiao@zte.com.cn>
Reported-by: Martin Sivak <msivak@redhat.com>
Reported-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 08:25:46 +10:00
Aleksa Sarai d1f6acfab0 tests: add RUNC_CMDLINE for tests incompatible with functions
Sometimes we need to run runc through some wrapper (like nohup), but
because "__runc" and "runc" are bash functions in our test suite this
doesn't work trivially -- and you cannot just pass "$RUNC" because
you need to set --root for rootless tests.

So create a setup_runc_cmdline helper which sets $RUNC_CMDLINE to the
beginning cmdline used by __runc (and switch __runc to use that).
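
A helper along these lines might look like the sketch below. The exact options the real helper adds, and the $ROOT/state layout, are assumptions; a plain string is used here for simplicity, so paths containing whitespace would not survive word splitting.

```shell
# Sketch of a setup_runc_cmdline-like helper: build the beginning of the
# runc command line once, so wrappers such as nohup can reuse it.
setup_runc_cmdline() {
	RUNC_CMDLINE="$RUNC"
	if [ -n "${ROOT:-}" ]; then
		# Assumed state directory layout for rootless tests.
		RUNC_CMDLINE="$RUNC_CMDLINE --root $ROOT/state"
	fi
}
```

A test could then run something like nohup $RUNC_CMDLINE followed by the runc subcommand, which is not possible with a bash function.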

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 08:23:15 +10:00
Aleksa Sarai ea385de40c tests: add sane_run helper
"runc" was a special wrapper around bats's "run" which output some very
useful diagnostic information to the bats log, but this was not usable
for other commands. So let's make it a more generic helper that we can
use for other commands.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-28 08:23:15 +10:00
Aleksa Sarai e6b4b5a128 tests: bfq: skip tests on misbehaving udev systems
openSUSE has an unfortunate default udev setup which forcefully sets all
loop devices to use the "none" scheduler, even if you manually set it.
As this is a property of the host configuration (and udev is monitoring
from the host) we cannot really change this behaviour from inside our
test container.

So we should just skip the test in this (hopefully unusual) case.
Ideally tools running the test suite should disable this behaviour when
running our test suite.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-02 20:01:24 +10:00
Aleksa Sarai ceef984fb3 tests: clean up loopback devices properly
If an error occurs during a test which sets up loopback devices, the
loopback device is not freed. Since most systems have very conservative
limits on the number of loopback devices, re-running a failing test
locally to debug it often ends up erroring out due to loopback device
exhaustion.

So let's just move the "losetup -d" to teardown, where it belongs.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-08-02 20:01:24 +10:00
Kir Kolyshkin 314dd812f5 tests/cmd: simplify getting net.UnixConn
The typecast can't fail, so it doesn't make sense checking for errors
here.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-29 14:07:29 -07:00
Kir Kolyshkin 66a533eb3e tests/int/events.bats: don't require root
These tests should work as rootless as long as cgroup access works.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-07-22 16:38:07 -07:00