
user: implement user factory #106

Open
emilyalbini wants to merge 9 commits into ea-ptqwnqpswsuv from ea-user-factory

Conversation

@emilyalbini
Member

This PR implements a new factory for buildomat, spawning jobs in ephemeral users in the same host system running the factory. Documentation on how to use the factory is available in the factory README.

I was careful during the implementation of the factory to make sure it will always attempt to clean up after itself (never releasing a slot until the cleanup stage finishes) and that it alerts the operator when something goes wrong (by failing the worker, which triggers a hold on it; I plan to add monitoring for held workers in the future).

This PR also makes multiple changes to the agent installation to support this, each in its separate commit. I can move those to a single separate PR or multiple separate PRs if you'd prefer.

The implementation of the factory was based on @jclulow's 2024 work on a work-in-progress hubris factory.

@emilyalbini emilyalbini requested a review from jclulow May 6, 2026 17:12

@lzrd lzrd left a comment


I'm currently testing against this PR and noticed one minor doc vs code issue.

Comment thread factory/user/src/config.rs
Comment thread factory/user/README.md Outdated
Collaborator

@jclulow jclulow left a comment


I've started taking a look at this, and have left some thoughts on what I've seen so far. I think it would help to have more of a complete picture of how this will get deployed and configured for hubris CI as well, when evaluating it all.

Comment thread agent/src/main.rs
/*
 * Install the agent binary with the control program name in a location in
 * the default PATH so that job programs can find it.
 */
let cprog = format!("/usr/bin/{CONTROL_PROGRAM}");
Collaborator


I don't want to move the control program location for environments where it has already existed in /usr/bin. That's part of what this abstraction is about:

enum Root {
    Global,
    PerUser(PathBuf),
}

impl Root {
    fn etc(&self) -> PathBuf {
        match self {
            Root::Global => "/opt/buildomat/etc".into(),
            Root::PerUser(top) => top.join("etc"),
        }
    }

    fn lib(&self) -> PathBuf {
        match self {
            Root::Global => "/opt/buildomat/lib".into(),
            Root::PerUser(top) => top.join("lib"),
        }
    }

    fn usrbin(&self) -> PathBuf {
        match self {
            Root::Global => "/usr/bin".into(),
            Root::PerUser(top) => top.join("bin"),
        }
    }

    pub fn config_path(&self) -> PathBuf {
        self.etc().join("agent.json")
    }

    pub fn job_path(&self) -> PathBuf {
        self.etc().join("job.json")
    }

    pub fn agent(&self) -> PathBuf {
        self.lib().join("agent")
    }

    pub fn control_program(&self) -> PathBuf {
        self.usrbin().join(CONTROL_PROGRAM)
    }

    pub fn should_install_service(&self) -> bool {
        match self {
            Root::Global => true,
            Root::PerUser(_) => false,
        }
    }

    pub fn unprivileged(&self) -> bool {
        match self {
            Root::Global => false,
            Root::PerUser(_) => true,
        }
    }
}
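To make the path-selection behavior concrete, here is a trimmed, runnable copy of just that piece. The CONTROL_PROGRAM value ("bmat") and the per-user top directory are assumptions for illustration; the real constant lives elsewhere in the agent.

```rust
use std::path::PathBuf;

// Assumed value for illustration; the real constant is defined in the agent.
const CONTROL_PROGRAM: &str = "bmat";

enum Root {
    Global,
    PerUser(PathBuf),
}

impl Root {
    fn usrbin(&self) -> PathBuf {
        match self {
            // Existing global installs keep using /usr/bin unchanged.
            Root::Global => "/usr/bin".into(),
            // Per-user installs stay inside the worker's own tree.
            Root::PerUser(top) => top.join("bin"),
        }
    }

    fn control_program(&self) -> PathBuf {
        self.usrbin().join(CONTROL_PROGRAM)
    }
}

fn main() {
    assert_eq!(
        Root::Global.control_program(),
        PathBuf::from("/usr/bin/bmat")
    );
    assert_eq!(
        Root::PerUser("/home/build-0".into()).control_program(),
        PathBuf::from("/home/build-0/bin/bmat")
    );
}
```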

Comment thread agent/src/main.rs
Comment on lines -1200 to -1209
/*
 * Ubuntu 18.04 had a genuine pre-war separate /bin directory!
 */
let binmd = std::fs::symlink_metadata("/bin")?;
if binmd.is_dir() {
    std::os::unix::fs::symlink(
        format!("../usr/bin/{CONTROL_PROGRAM}"),
        format!("/bin/{CONTROL_PROGRAM}"),
    )?;
}
Collaborator


Can we leave this here?

Comment thread factory/user/smf/worker.xml Outdated
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle name='buildomat-worker' type='manifest'>
  <service name='site/buildomat/factory-user-worker' type='service' version='0'>
    <exec_method name='start' type='method' timeout_seconds='60' exec='{{exec}}' />
Collaborator


We ought to use a method context here that constrains the process to the unprivileged build user for the instance, so that it doesn't start out running as root -- like

<method_context>
  <method_credential user='build' group='build' />
</method_context>

In order for the chroot(2) to work we'll need to grant the process the proc_chroot privilege (not in the basic set). Then we'll need to drop that privilege as soon as the chroot() is done, using setppriv(2), prior to doing anything else so that when we then download and run the agent binary it can't chroot again.

We probably also want to remove proc_info (which is part of the basic set) so that you can't see other processes on the system that belong to other users/jobs in, say, ps(1) output. There might be other privileges that it makes sense to chuck out here, but that's the one that comes to mind immediately.

Because this factory intends to support multiple concurrent jobs on the machine, we should also look at putting each build user in a separate project(5), and then setting some resource_controls(7) on those projects to prevent one job from having too much of an impact on other jobs that are running concurrently. We might also want to look at the FSS(4) scheduler, which can provide some amount of scheduler fairness at the project rather than the process level.

Comment thread factory/user/src/illumos/chroot.rs Outdated
}

fn root_dir(worker: WorkerId) -> PathBuf {
    Path::new("/var/run/buildomat/worker-roots").join(worker.to_string())
Collaborator


I think we ought to create a two tier structure here:

  • top level, /var/run/buildomat/worker/WORKER_ID, which would be owned by the (unprivileged) user and group for the worker, and mode 0700, so that it's only visible and traversable to the specific worker
  • another directory one level down, e.g., /var/run/buildomat/worker/WORKER_ID/root, which could then be owned root:root and mode 0755 like the real root directory.
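A rough sketch of that two-tier layout in Rust. The function names, the worker ID, and the uid/gid plumbing are illustrative, and std::os::unix::fs::chown (stable since Rust 1.73) stands in for whatever the factory actually uses:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::{chown, PermissionsExt};
use std::path::{Path, PathBuf};

/// Pure helper so the layout is easy to test: the per-worker top level
/// and the chroot root one level down. (Function names are illustrative.)
fn worker_dirs(worker: &str) -> (PathBuf, PathBuf) {
    let top = Path::new("/var/run/buildomat/worker").join(worker);
    let root = top.join("root");
    (top, root)
}

/// Create the two-tier structure: top level owned by the unprivileged
/// worker and mode 0700; the root below it root:root and mode 0755.
fn create_worker_root(worker: &str, uid: u32, gid: u32) -> io::Result<PathBuf> {
    let (top, root) = worker_dirs(worker);

    fs::create_dir_all(&top)?;
    fs::set_permissions(&top, fs::Permissions::from_mode(0o700))?;
    chown(&top, Some(uid), Some(gid))?;

    fs::create_dir_all(&root)?;
    fs::set_permissions(&root, fs::Permissions::from_mode(0o755))?;
    chown(&root, Some(0), Some(0))?;

    Ok(root)
}

fn main() {
    let (top, root) = worker_dirs("w-123");
    assert_eq!(top, PathBuf::from("/var/run/buildomat/worker/w-123"));
    assert_eq!(root, PathBuf::from("/var/run/buildomat/worker/w-123/root"));
    // create_worker_root() itself needs root and a writable /var/run,
    // so it is not exercised here.
    let _ = create_worker_root;
}
```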

Comment thread factory/user/src/illumos/chroot.rs Outdated
Comment thread factory/user/src/factory.rs Outdated
Comment on lines +145 to +156
let available_targets = c
    .config
    .slots
    .iter()
    .filter(|(name, _)| !used_slots.contains(name.as_str()))
    .map(|(_, slot)| slot.target.clone())
    /*
     * Deduplicate the targets by first collecting into a HashSet.
     */
    .collect::<HashSet<_>>()
    .into_iter()
    .collect::<Vec<_>>();
Collaborator


When determining which targets are available, I think we need to be able to specify some way to check the health of each configured slot. This is a piece that I had not yet completed for the hubris factory, but I think is relatively critical: we need to be able to check for the presence of the expected set of USB devices (debug probes, serial ports, etc) prior to taking a lease from the server. Otherwise, it seems likely that some of the time we'll have broken slots that absorb and then fail jobs, especially when we have more than one slot on a system with different probes.

Member Author


I was planning on deferring health checking to a future PR: it doesn't strictly block deploying an MVP of the Hubris hardware CI, and I have some alternate ideas on how it could be implemented. Would it be ok to defer it?

The alternate idea I had was to delegate the health checking to the job itself, adding a bmat worker mark-broken -m "message" command that marks the worker as failed and puts it on hold. The hold would both alert operators (once I implement monitoring for held workers) and keep the slot reserved, preventing other jobs from starting on it until the hold is released.



It would be ok for user-factory to take on the responsibility of assigning a list of system resources as defined in some pool and required by some slot. The workaround I'm using right now is to set the devices up as owned by a group. See the note elsewhere about additional groups not being set in the current commit.



Resources include (all optional depending on the testbed): SP probe, RoT probe, USB to serial device, power control, IPv6 network access to an SP. We could add logic probes or other devices as well in certain cases.

Member Author


The workaround I'm using right now is to set the devices up as owned by a group.

That's the core of my design for the user factory. For Hubris CI those resources are required, yes, but other uses of the factory in the future might need different resources and I kinda don't want to keep expanding the set of devices the factory understands.

#[serde(default)]
pub(crate) add_to_groups: Vec<String>,
#[serde(default)]
pub(crate) env: HashMap<String, String>,
Collaborator


Do you have an example configuration that includes all the environment variables you'd be specifying through this mechanism?

@emilyalbini emilyalbini force-pushed the ea-user-factory branch 2 times, most recently from a9266cd to f39679b on May 7, 2026 10:41
@lzrd

lzrd commented May 7, 2026

Group id issue for USB device ownership:

With a slot's add_to_groups = ['staff'], the ephemeral worker process gets EACCES on resources owned by 0660 root:staff. /etc/group correctly records the membership; the kernel credential of the running process does not.

factory/user/src/bootstrap_agent.rs uses std::os::unix::process::CommandExt::uid()/gid(), which sets the primary uid/gid via setresuid/setresgid pre-exec but never calls setgroups(2)/initgroups(3C). The kernel credential at exec inherits the factory daemon's supplementary group list (empty for root-without-supps).

Fix: a pre_exec hook calling libc::initgroups(user, primary_gid) just before exec in bootstrap_agent.rs (and wherever the agent then forks the job script, if separate).

Workaround: install everything at world-traversable system paths (/opt/...) so workers don't depend on supplementary groups. That sidesteps the issue but doesn't solve it.
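A sketch of that fix. The helper name and parameters are hypothetical (not the actual bootstrap_agent.rs API), and the extern declarations stand in for the libc crate so the snippet is self-contained. Note that std performs its own uid/gid switch (including an empty setgroups) before pre_exec hooks run, so one way to guarantee ordering is to do the whole credential change inside the hook rather than via CommandExt::uid()/gid():

```rust
use std::ffi::CString;
use std::io;
use std::os::raw::c_char;
use std::os::unix::process::CommandExt;
use std::process::Command;

// Minimal libc bindings so the sketch has no external dependencies.
extern "C" {
    fn setgid(gid: u32) -> i32;
    fn setuid(uid: u32) -> i32;
    fn initgroups(user: *const c_char, group: u32) -> i32;
}

/// Build the worker command so the child's kernel credential carries the
/// user's full supplementary group list. (Hypothetical helper.)
fn worker_command(program: &str, user: &str, uid: u32, gid: u32) -> io::Result<Command> {
    let cuser = CString::new(user)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;

    let mut cmd = Command::new(program);
    unsafe {
        // SAFETY: only async-signal-safe calls between fork and exec.
        cmd.pre_exec(move || {
            // Order matters: setgid and initgroups(3C) require root,
            // and setuid must come last so root is dropped at the end.
            if setgid(gid) != 0
                || initgroups(cuser.as_ptr(), gid) != 0
                || setuid(uid) != 0
            {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        });
    }
    Ok(cmd)
}

fn main() {
    // Building the Command does not run the pre_exec hook, so this is
    // safe to exercise without privileges.
    let cmd = worker_command("/bin/true", "nobody", 65534, 65534).unwrap();
    assert_eq!(cmd.get_program(), "/bin/true");
}
```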

Co-Authored-By: Joshua M. Clulow <jmc@oxide.computer>
