When the Relabel field was deprecated, its JSON struct tags were
mistakenly removed, so now the field is:
- saved to JSON under "Relabel" (rather than "relabel");
- not omitted when empty.
Let's fix it before it's too late.
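
A minimal sketch of the difference, for illustration only. The two struct
types below are hypothetical stand-ins, and the exact tag string
(`json:"relabel,omitempty"`) is an assumption inferred from the description
above rather than copied from the code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mountUntagged mimics a field that lost its struct tag (hypothetical type).
type mountUntagged struct {
	Relabel string
}

// mountTagged carries a tag, so the key is lowercase and the field is
// omitted when empty (hypothetical type, assumed tag).
type mountTagged struct {
	Relabel string `json:"relabel,omitempty"`
}

// encode marshals v and returns the JSON text; it panics on error for brevity.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(mountUntagged{}))           // {"Relabel":""}
	fmt.Println(encode(mountTagged{}))             // {}
	fmt.Println(encode(mountTagged{Relabel: "z"})) // {"relabel":"z"}
}
```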
Fixes: 8b2b5e94 ("libct: remove relabeling dead code")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There is no way to set Mount.Relabel field via OCI spec (config.json),
and so the relabeling code is never used.
My guess is that it's a leftover from the times when runc used to be part of Docker.
Remove it, and mark Relabel field as deprecated.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
These were inadvertently added to our exported APIs by commit
eeda7bdf80cca ("Add memory policy support"). We couldn't remove them
from runc 1.4.x, but we deprecated them in commit 3741f9186d
("libct/configs: mark MPOL_* constants as deprecated") and marked them
for removal in runc 1.5. Users should never have used these in the first
place.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This was deprecated in commit e6a4870e4ac40 ("libct: better errors for
hooks"), and users have had ample time to migrate to Hooks.Run since.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
These were all marked deprecated in commit a75076b4a4 ("Switch to
opencontainers/cgroups") when we switched maintenance of our cgroup code
to opencontainers/cgroups.
Users have had ample time to switch to opencontainers/cgroups
themselves, so we can finally remove this.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The Config type is quite big (currently 554 bytes on 64-bit Linux),
and using non-pointer receivers in its methods results in copying, which
is totally unnecessary.
Change the methods to use pointer receivers.
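
To illustrate the receiver semantics involved, here is a small sketch.
The bigConfig type and its methods are hypothetical and are not part of
libcontainer; a value receiver operates on a copy of the whole struct,
while a pointer receiver only passes a pointer:

```go
package main

import "fmt"

// bigConfig is a hypothetical stand-in for a large configuration struct.
type bigConfig struct {
	payload [512]byte
	name    string
}

// renameByValue has a value receiver: the whole struct is copied for the
// call, and the change is made to that copy only.
func (c bigConfig) renameByValue(n string) { c.name = n }

// renameByPointer has a pointer receiver: only a pointer is passed, and the
// change is visible to the caller.
func (c *bigConfig) renameByPointer(n string) { c.name = n }

// demo returns the name seen by the caller after each kind of call.
func demo() (afterValue, afterPointer string) {
	cfg := bigConfig{name: "old"}
	cfg.renameByValue("new")
	afterValue = cfg.name
	cfg.renameByPointer("new")
	afterPointer = cfg.name
	return afterValue, afterPointer
}

func main() {
	v, p := demo()
	fmt.Println(v) // old
	fmt.Println(p) // new
}
```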
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Rename a function parameter (containerId -> containerID) to avoid a
linter warning:
> var-naming: method parameter containerId should be containerID (revive)
In many other places, including config.json (.linux.uidMappings and
.gidMappings) it is already called containerID, so let's rename.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Alas, these new constants are already in v1.4.0 release so we can't
remove those right away, but we can mark them as deprecated now
and target removal for v1.5.0.
So,
- mark them as deprecated;
- redefine via unix.MPOL_* counterparts;
- fix the validator code to use unix.MPOL_* directly.
This amends commit a0e809a8.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is mostly a mechanical change, but we also need to change some
types to match the "mode int" argument that golang.org/x/sys/unix
decided to use.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The linux.intelRdt.enableMonitoring field enables the creation of
a per-container monitoring group. The monitoring group is removed when
the container is destroyed.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
Implement support for the linux.intelRdt.schemata field of the spec.
This allows management of the "schemata" file in the resctrl group in a
generic way.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
These sysctls are all per-userns (termed `ucounts` in the kernel code) and are
settable with CAP_SYS_RESOURCE in the user namespace.
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
If intelRdt is specified in the spec, check that the resctrl fs is
actually mounted. Fixes e.g. the case where "intelRdt.closID" is
specified but runc silently ignores this if resctrl is not mounted.
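
The check can be sketched roughly as follows. All names here (config,
intelRdt, validateIntelRdt, the resctrlMounted parameter and the error
text) are assumptions made for this illustration, not the actual runc
code:

```go
package main

import (
	"errors"
	"fmt"
)

// intelRdt and config are hypothetical, heavily simplified types.
type intelRdt struct {
	ClosID string
}

type config struct {
	IntelRdt *intelRdt
}

// validateIntelRdt fails when an intelRdt section is present but the resctrl
// filesystem is not mounted, instead of silently ignoring the section.
func validateIntelRdt(c *config, resctrlMounted bool) error {
	if c.IntelRdt == nil {
		return nil // nothing requested, nothing to check
	}
	if !resctrlMounted {
		return errors.New("intelRdt is specified, but the resctrl filesystem is not mounted")
	}
	return nil
}

func main() {
	c := &config{IntelRdt: &intelRdt{ClosID: "guaranteed"}}
	fmt.Println(validateIntelRdt(c, false))
	fmt.Println(validateIntelRdt(c, true))
}
```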
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
The per-file deprecation in cgroup_deprecated.go is not working,
let's replace it.
Link to Hooks.Run in Hook.Run deprecation notice.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Implement support for passing Linux Network Devices to the container
network namespace.
The network device is passed during the creation of the container,
before the process is started.
It implements the logic defined in the OCI runtime specification.
Signed-off-by: Antonio Ojea <aojea@google.com>
This makes the state.json file 1303 bytes or almost 25% smaller (when
using the default spec, YMMV) by omitting default values.
Before: 5496 bytes
After: 4193 bytes
(With cgroups#9 applied, the new size is 3424, which is almost 40%
savings, compared to the original).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As per
- https://github.com/opencontainers/runtime-spec/pull/1253
- https://github.com/opencontainers/runtime-spec/pull/1261
CPU affinity can be set in two ways:
1. When creating/starting a container, in config.json's
Process.ExecCPUAffinity, which is then applied to all execs.
2. When running an exec, in process.json's CPUAffinity, which
is applied to a given exec and overrides the value from (1).
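
The precedence between the two settings can be sketched as below. The
cpuAffinity type and the effectiveAffinity helper are hypothetical, and
the sketch assumes the per-exec value replaces the container-wide one
as a whole rather than field by field:

```go
package main

import "fmt"

// cpuAffinity is a hypothetical, simplified pair of CPU lists.
type cpuAffinity struct {
	Initial string
	Final   string
}

// effectiveAffinity returns the per-exec setting when present, otherwise the
// container-wide default; nil means "leave the affinity untouched".
func effectiveAffinity(containerDefault, perExec *cpuAffinity) *cpuAffinity {
	if perExec != nil {
		return perExec
	}
	return containerDefault
}

func main() {
	def := &cpuAffinity{Initial: "0-1", Final: "2-3"}
	fmt.Println(effectiveAffinity(def, nil).Final)                      // 2-3
	fmt.Println(effectiveAffinity(def, &cpuAffinity{Final: "4"}).Final) // 4
}
```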
Add some basic tests.
Note that older kernels (RHEL8, Ubuntu 20.04) change CPU affinity of a
process to that of a container's cgroup, as soon as it is moved to that
cgroup, while newer kernels (Ubuntu 24.04, Fedora 41) don't do that.
Because of the above,
- it's impossible to really test initial CPU affinity without adding
debug logging to libcontainer/nsenter;
- for older kernels, there can be a brief moment when exec's affinity
is different than either initial or final affinity being set;
- exec's final CPU affinity, if not specified, can be different
depending on the kernel, therefore we don't test it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This removes libcontainer/cgroups packages and starts
using those from github.com/opencontainers/cgroups repo.
Mostly generated by:
    git rm -f libcontainer/cgroups
    find . -type f -name "*.go" -exec sed -i \
        's|github.com/opencontainers/runc/libcontainer/cgroups|github.com/opencontainers/cgroups|g' \
        {} +
    go get github.com/opencontainers/cgroups@v0.0.1
    make vendor
    gofumpt -w .
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Avoid splitting mount data into []string if it does not contain
options we're interested in. This should result in slightly less
garbage to collect.
2. Use if / else if instead of continue, to make it clearer that
we're processing one option at a time.
3. Print the whole option as a string in an error message; practically
this should not have any effect, it's just simpler.
4. Improve some comments.
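
A rough sketch of the resulting shape, under stated assumptions: the
parseData helper and the "uid=" / "gid=" option names are hypothetical
and only serve to show the fast path, the if / else if chain, and the
whole option being printed in the error:

```go
package main

import (
	"fmt"
	"strings"
)

// parseData extracts the hypothetical "uid=" and "gid=" options from a
// comma-separated mount data string.
func parseData(data string) (uid, gid string, err error) {
	// Fast path: no option of interest, so skip the split and its allocation.
	if !strings.Contains(data, "uid=") && !strings.Contains(data, "gid=") {
		return "", "", nil
	}
	for _, opt := range strings.Split(data, ",") {
		// Exactly one branch handles each option.
		if v, ok := strings.CutPrefix(opt, "uid="); ok {
			if v == "" {
				return "", "", fmt.Errorf("invalid mount option %q", opt)
			}
			uid = v
		} else if v, ok := strings.CutPrefix(opt, "gid="); ok {
			if v == "" {
				return "", "", fmt.Errorf("invalid mount option %q", opt)
			}
			gid = v
		}
	}
	return uid, gid, nil
}

func main() {
	fmt.Println(parseData("rw,uid=1000,gid=100"))
	fmt.Println(parseData("rw,noexec"))
	fmt.Println(parseData("rw,uid="))
}
```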
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using strings.CutPrefix (available since Go 1.20) instead of
strings.HasPrefix and/or strings.TrimPrefix makes the code
a tad more straightforward.
No functional change.
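
For illustration, the two styles side by side. The "size=" prefix and
both helper names are made up for this sketch; only the strings package
calls are real:

```go
package main

import (
	"fmt"
	"strings"
)

// oldStyle checks for the prefix and trims it in two separate steps.
func oldStyle(opt string) (string, bool) {
	if strings.HasPrefix(opt, "size=") {
		return strings.TrimPrefix(opt, "size="), true
	}
	return opt, false
}

// newStyle does the check and the trim in a single call.
func newStyle(opt string) (string, bool) {
	return strings.CutPrefix(opt, "size=")
}

func main() {
	fmt.Println(oldStyle("size=64m")) // 64m true
	fmt.Println(newStyle("size=64m")) // 64m true
	fmt.Println(newStyle("mode=755")) // mode=755 false
}
```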
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use the old package name as an alias to minimize the patch.
No functional change; this just eliminates a bunch of deprecation
warnings.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Make CommandHook.Command a pointer, which reduces the amount of data
being copied when using hooks, and allows command hooks to be modified.
2. Add SetDefaultEnv, which is to be used by the next commit.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is an internal implementation detail and should not be either
public or visible.
Amend setIOPriority to do its own class conversion.
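
A sketch of what such a local conversion could look like. The
ioprioValue helper is hypothetical and is not the actual setIOPriority
code; the class names follow the OCI runtime spec, and the numeric
class values and the 13-bit shift follow the kernel's ioprio encoding:

```go
package main

import "fmt"

// ioprioClassShift is the position of the class bits in an ioprio value.
const ioprioClassShift = 13

// ioprioValue converts a class name and a priority level into the combined
// value expected by the kernel (hypothetical helper).
func ioprioValue(class string, priority int) (int, error) {
	var c int
	switch class {
	case "IOPRIO_CLASS_RT":
		c = 1
	case "IOPRIO_CLASS_BE":
		c = 2
	case "IOPRIO_CLASS_IDLE":
		c = 3
	default:
		return 0, fmt.Errorf("invalid I/O priority class: %q", class)
	}
	return c<<ioprioClassShift | priority, nil
}

func main() {
	fmt.Println(ioprioValue("IOPRIO_CLASS_BE", 4)) // 16388 <nil>
	fmt.Println(ioprioValue("bogus", 0))
}
```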
Fixes: bfbd0305 ("Add I/O priority")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This allows omitting a call to c.currentOCIState (which can be somewhat
costly when there are many annotations) when the hooks of a given kind
won't be run.
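
The idea can be sketched as follows. The hook and hooks types, the run
method and the getState callback are simplified, hypothetical stand-ins
for the real hook machinery; the point is only that the state is
computed after checking that there is something to run:

```go
package main

import "fmt"

// hook is a hypothetical, simplified hook that only receives the state.
type hook func(state string) error

// hooks maps a hook kind (for example "prestart") to the hooks of that kind.
type hooks map[string][]hook

// run executes the hooks of the given kind. getState is only called when
// there is at least one hook to run, so its cost is avoided otherwise.
func (h hooks) run(kind string, getState func() (string, error)) error {
	list := h[kind]
	if len(list) == 0 {
		return nil
	}
	st, err := getState()
	if err != nil {
		return err
	}
	for i, fn := range list {
		if err := fn(st); err != nil {
			return fmt.Errorf("error running %s hook #%d: %w", kind, i, err)
		}
	}
	return nil
}

// stateCalls reports how many times the state was computed for one run.
func stateCalls(h hooks, kind string) int {
	calls := 0
	getState := func() (string, error) {
		calls++
		return "{}", nil
	}
	if err := h.run(kind, getState); err != nil {
		panic(err)
	}
	return calls
}

func main() {
	noop := func(string) error { return nil }
	h := hooks{"prestart": {noop, noop}}
	fmt.Println(stateCalls(h, "prestart"))  // 1
	fmt.Println(stateCalls(h, "poststart")) // 0
}
```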
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We have quite a few external users of libcontainer/cgroups packages,
and they all have to depend on libcontainer/configs as well.
Let's move cgroup-related configuration to libcontainer/cgroups.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In these cases, this is exactly what we want to find out.
Slightly improves performance and readability.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The warnings fixed were:
libcontainer/configs/config_test.go:205:12: printf: non-constant format string in call to (*testing.common).Errorf (govet)
t.Errorf(fmt.Sprintf("Expected error to not occur but it was %+v", err))
^
libcontainer/cgroups/fs/blkio_test.go:481:13: printf: non-constant format string in call to (*testing.common).Errorf (govet)
t.Errorf(fmt.Sprintf("test case '%s' failed unexpectedly: %s", testCase.desc, err))
^
libcontainer/cgroups/fs/blkio_test.go:595:13: printf: non-constant format string in call to (*testing.common).Errorf (govet)
t.Errorf(fmt.Sprintf("test case '%s' failed unexpectedly: %s", testCase.desc, err))
^
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using ints for all of our mapping structures means that a 32-bit binary
errors out when trying to parse /proc/self/*id_map:
    failed to cache mappings for userns: failed to parse uid_map of userns /proc/1/ns/user:
    parsing id map failed: invalid format in line " 0 0 4294967295": integer overflow on token 4294967295
This issue was unearthed by commit 1912d5988b ("*: actually support
joining a userns with a new container") but the underlying issue has
been present since the docker/libcontainer days.
In theory, switching to uint32 (to match the spec) instead of int64
would also work, but keeping everything signed seems much less
error-prone. It's also important to note that a mapping might be too
large for an int on 32-bit, so we detect this during the mapping.
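
The overflow is easy to reproduce on any platform by asking strconv for
a 32-bit result, which is what a plain int amounts to on a 32-bit
build. The helper names below are made up for this sketch and do not
come from the actual parser:

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parseAs32 mimics parsing a token into an int on a 32-bit platform.
func parseAs32(tok string) (int64, error) {
	return strconv.ParseInt(tok, 10, 32)
}

// parseAs64 parses the token into an int64, regardless of platform.
func parseAs64(tok string) (int64, error) {
	return strconv.ParseInt(tok, 10, 64)
}

// fitsInt32 reports whether v can be stored in a 32-bit signed int; a check
// like this is needed before converting a parsed value to int on 32-bit.
func fitsInt32(v int64) bool {
	return v >= math.MinInt32 && v <= math.MaxInt32
}

func main() {
	const tok = "4294967295"
	_, err := parseAs32(tok)
	fmt.Println(err != nil) // true: the value is out of range for 32 bits
	v, err := parseAs64(tok)
	fmt.Println(v, err)       // 4294967295 <nil>
	fmt.Println(fitsInt32(v)) // false
}
```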
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>