Skip to content

Conversation

@saehejkang
Copy link
Contributor

@saehejkang saehejkang commented Jan 13, 2026

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Motivation and Context

Relates to #642.

  • Each image carries it own error message (allows for different error types per image)
  • Stdout/stderr support
  • Use of the new StderrLogHandler
container image inspect test python:alpine
// Inspect output for python:alpine here but omitted due to size
test: The operation couldn’t be completed. (ContainerizationError.ContainerizationError error 1.)
Error: InspectError(succeeded: ["docker.io/library/python:alpine"], failed: [("test", notFound: "Image not found")])

I found these updates make error propogation/handling a lot easier. If this is the correct direction we want to take, I will start by making updates to all the image commands before continuing on #832.

Testing

  • Tested locally
  • Added/updated tests
  • Added/updated docs

@saehejkang saehejkang changed the title image-inspect stdout/stderr refactor [image-inspect]: stdout/stderr and logging refactor Jan 13, 2026
@jglogan
Copy link
Contributor

jglogan commented Jan 13, 2026

Thanks for kicking this off!

I'm going to be pretty tied up today but I'll take a look at it later or tomorrow. There's #642 which you mentioned and also #268 which lacks actionable work but we should look it over and see if we can close that with the set of fixes like this.

@Ronitsabhaya75
Copy link
Contributor

@saehejkang Throwing InspectError at the end causes swift-argument-parser to print the raw struct details (e.g. Error: InspectError(...)) to stderr.

Since you are already logging specific errors via Logger, this final error print is redundant and noisy.

My Suggestion: Throw ExitCode.failure instead of InspectError to exit non-zero without extra noise, OR make InspectError conform to LocalizedError with a clean errorDescription.

@saehejkang
Copy link
Contributor Author

@Ronitsabhaya75 My initial thought process was to make InspectError conform to LocalizedDescription but that would make more sense if we are under the assumption there is a single type of error message ("image not found"). That way also introduces a bit more complexity to define more error types and descriptions. By using a single custom error type, we keep error handling simple and consistent, throwing only one type of error. The raw struct may be noisy but at least we have control on the logging, as well as user facing errors.

@Ronitsabhaya75
Copy link
Contributor

@saehejkang I see your point about keeping the implementation simple! However, we can actually eliminate the noisy raw struct print without adding complex error types.

If InspectError conforms to LocalizedError and returns a generic summary, swift-argument-parser will print that instead of the raw struct dump.

extension ImageInspect.InspectError: LocalizedError {
    var errorDescription: String? {
        // We already logged specific errors above, so a simple summary suffices here.
        return "Failed to inspect \(failed.count) image(s)."
    }
}

This keeps the error handling logic exactly as you have it but makes the CLI output much cleaner for the end user.

@jglogan
Copy link
Contributor

jglogan commented Jan 15, 2026

@saehejkang Did InspectError not get added to the PR? I only see the inspect command in the commit.

@jglogan
Copy link
Contributor

jglogan commented Jan 15, 2026

@saehejkang @Ronitsabhaya75 also see the PR I pushed for creating a ManagedResource protocol. What do you think about having ContainerResource or ContainerAPIClient define an error type for ManagedResource that can wrap ContainerizationError?, and then we use those the typical resource access errors instead of using ContainerizationError for some and defining resource-specific, overlapping errors for others (VolumeError)?

I won't do any error-related work in that PR (I'm going to force push a rewrite of what I have, made a lot of changes), so feel free to play around in this one and we can work out something coordinated.

@saehejkang
Copy link
Contributor Author

Did InspectError not get added to the PR? I only see the inspect command in the commit.

https://github.com/apple/container/pull/1044/files#diff-f331141407ca78f7d4e5608fc336b134bd416ba8897a22bce745820cf45807f0R39:~:text=)%20%7B%7D-,struct,-InspectError%3A%20Error

What do you think about having ContainerResource or ContainerAPIClient define an error type for ManagedResource that can wrap ContainerizationError?

I like the idea of having a single ManagedResource type that can wrap ContainerizationError. I did take a look at the VolumeError and realized that all the other resources were out of date or did not follow the same design pattern. Different commands/resource, follow different patterns, so on and so forth.

If we follow one simple pattern, that could alleviate issues I was running into. For example, in the ClientImage.get() here, updating the return to be error: [ImageError] (with .notFound or .duplicate error codes) could be more useful. This can be then read back by ImageInspect and the correct errors can be filtered and thrown.

However, there could be another discussion on even if commands should even be throwing ANY errors. Should they only be thrown in the client/server functions?

@jglogan
Copy link
Contributor

jglogan commented Jan 16, 2026

However, there could be another discussion on even if commands should even be throwing ANY errors. Should they only be thrown in the client/server functions?

Yeah, one model I'd had in my head was that the API service and client would throw errors that were principally data carriers with minimal message text, and any code that uses the API client is responsible for taking the data and presenting it as it saw fit.

That's "informed" by my ignorance of how localization works for errors; that model assumes that the caller of the client is best equipped to localize errors.

It does feel less fussy if the errors thrown by the API server and client are ready to present without any additional work on the caller's part.

What do you think about a PartialError type that's a holder for a wrapped collection of errors from a compound operation? The PartialError meaning would be "it's up to you to determine whether this constitutes a proper error".

@jglogan
Copy link
Contributor

jglogan commented Jan 16, 2026

@saehejkang Also make sure to rebase to pick up the test fix!

@jglogan
Copy link
Contributor

jglogan commented Jan 17, 2026

I also just pushed #1066 that gives all the commands a logger that outputs the message and metadata only at levels other than debug and trace, and includes an ISO timestamp at those levels.

What I'd like to do is use structured logging principles for all our CLI logging. I'd written up these guidelines elsewhere and think they're applicable here:

Use the following guidelines when crafting log messages:

  • Write concise and legible messages. The message should be a short, complete sentence that describes the event being logged. Avoid unnecessary jargon.
  • Do not interpolate variables into log messages. Instead, use structural logging to add fields to messages. Choose meaningful field names, using the conventional field names below for commonly occurring data so that operations can be tracked readily across applications.
  • Use contextual logging to apply the same fields to a range of messages. If a context consists of identifying information and attributes, do not overload the context with the attributes. Instead, place the identifying information in the context, and insert an initial log message that includes the additional attributes.
  • Exclude sensitive information from log messages. Pay special attention when logging values in data structures, classes, and collections - it is generally safer to log only that which is needed, instead of sending such objects to the logger.
  • Reduce clutter by avoiding informational logging in frequently called methods, and when iterating large collections.
  • For languages that do not support conditional evaluation of log messages, test whether log levels are enabled before invoking logging messages that require expensive computation. Swift uses autoclosures to condition log message evaluation, but when using a language like Java, you should enclose expensive messages in code like: log.isDebugEnabled() {...}.

Conventional Structured Logging Fields

Name Description
container A container name.
image An imageref value.
count Result set size.
rc Exit status for a command.
request_millis Request processing time in milliseconds.
url A request URL.
version The application version.

@Ronitsabhaya75
Copy link
Contributor

@jglogan i'll have look once I'm good after my interview next week

Copy link
Contributor

@jglogan jglogan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saehejkang This looks good to me. I suspect that this will change once we start doing this in a cross cutting way. In a follow-on PR would it make sense to define an Error type for partial errors on compound operations like these that we can use for any collection of resources?

Basically the starting point for such a follow-in could be just to move InspectError to ContainerResource and rename it.

There are some GitHub runner issues at the moment but once that's resolved I'll merge this.

@saehejkang
Copy link
Contributor Author

What do you think about a PartialError type that's a holder for a wrapped collection of errors from a compound operation? The PartialError meaning would be "it's up to you to determine whether this constitutes a proper error".

Do you have an example of what this would look like? In my head I’m thinking of one approach, and you may be thinking of it differently. Is the PartialError type going to be generic, or an extension of a generic error type, that we define for each command/resource? I can see it working out well either way.

Furthermore, this would allow commands that normally expect data to be returned, to still propagate partial failures as an error that can be inspected. It also seems like there’s no real way to avoid requiring callers of the client to implement proper error handling. I still think errors should be as extracted away from the callers and actually thrown in the client/service wherever compound operations are not needed.

What I'd like to do is use structured logging principles for all our CLI logging. I'd written up these guidelines elsewhere and think they're applicable here:

Thank you for the updates on this and glad I could get this initiative started. Might be a good idea to create an issue to keep track of all the commands/code we need to make to updates to. I would be more than happy to help with triaging issues :)

@jglogan
Copy link
Contributor

jglogan commented Jan 21, 2026

Using generics is a possibility. It's probably worth thinking through how these would get used all over container.

ContainerizationError doesn't really give us a structured way to deal with the common resource errors. I'd like error types that are better tailored to the container domain so that if the caller gets back, for example, the equivalent of a 404, there's a property (or if enum, an associated data field) that indicates the resource ID (and type, could be a app-unique resource ID that is (type, id)) that didn't exist.

Partial errors on compound operations like the one you're handling don't need to be rocket science. It could be just an Error that wraps [Error], but that goes back to what I wrote at the top, we should look at the different categories of error we handle today and tailor things so that the caller (not necessarily ContainerCommand types!) gets what it needs to present the error well.

Does that make sense? It's been a busy day, so this answer might not be well focused 😄

@jglogan
Copy link
Contributor

jglogan commented Jan 22, 2026

@saehejkang could you rebase onto main to pick up the git action fixes that will allow the PR to be built?

@saehejkang
Copy link
Contributor Author

saehejkang commented Jan 22, 2026

In a follow-on PR would it make sense to define an Error type for partial errors on compound operations like these that we can use for any collection of resources?

Basically the starting point for such a follow-in could be just to move InspectError to ContainerResource and rename it.

Should I wait until how we decide those resource errors are going to be defined in #1055, before working on the follow-in? I can also work on conforming images to the ManagedResource type, depending on the direction we take there.

Partial errors on compound operations like the one you're handling don't need to be rocket science.

Does that make sense?

Makes perfect sense to me!


Also let me know if I need to update the PR title, description, or commit message. It could maybe mention things about updating the error handling?

@jglogan
Copy link
Contributor

jglogan commented Jan 22, 2026

I think the ManagedResource work and the error handling could run in parallel, and I'd be happy if you want to take a crack at the errors, if you have the time to put into it. If not, no problem!

It might be useful to do a really quick survey of the failure modes in API server and helpers to see what sort of failure modes we have and what kind of error model would convey the best information back to the user.

The typical resource management errors (exists, not found, busy) are pretty cut and dried I think. We could have errors similar to what's in ContainerizationError but they wouldn't need error text, just a ManagedResource as associated data.

But there's a number of open issues related to hard-to-diagnose errors. I'm thinking about XPC errors that we may be better able to wrap for the user (at least provide some call site information), image transfer, content store, and platform mismatch errors, things like that.

For an initial PR you could just define a ManagedResource in ContainerResource that conforms to Identifiable, and a stub implementation that simply stores an id value. I've been busy getting our builds working with an updated CI fleet (we were still building on Sequoia until a few days ago), so I haven't gotten the chance to push the resource PR forward, but the base protocol I was planning looks like the following.

My hope is this gives us obvious patterns for contributors to follow when making feature/enhancement PRs for resources.

/// Common properties for all managed resources.
 public protocol ManagedResource: Identifiable, Sendable, Codable {
    /// A 64 byte hexadecimal string, assigned by the system, that uniquely
    /// identifies the resource. 
    var id: String { get }

    /// A user assigned name that shall be unique within the namespace of
    /// the resource category. If the user does not assign a name, this value
    /// shall be the same as the system-assigned identifier.
    var name: String { get }

    /// The time at which the system created the resource.
    var creationDate: Date { get }

    /// Key-value properties for the resource. The user and system may both
    /// make use of labels to read and write annotations or other metadata.
    /// A good practice is to use 
    var labels: [String: String] { get }

    /// Generates a unique resource ID value.
    static func generateId() -> String

    /// Returns true only if the specified resource name is syntactically valid.
    static func nameValid(_ name: String) -> Bool
}

public extension ManagedResource {
    static func generateId() -> String {
        (0..<4)
            .map { _ in UInt128.random(in: 0...UInt128.max) }
            .map { String($0, radix: 16) }
            .joined()
    }
}

@jglogan jglogan merged commit c42108d into apple:main Jan 22, 2026
2 checks passed
@jglogan
Copy link
Contributor

jglogan commented Jan 22, 2026

Also I really want to extract all the resource specific stuff out of those massive Parser and Utility types into extensions on resources, where it makes sense. That should yield clearer patterns and improve maintainability also.

@saehejkang saehejkang deleted the image-inspect-stdout-stderr-refactor branch January 23, 2026 03:37
@saehejkang
Copy link
Contributor Author

saehejkang commented Jan 23, 2026

I'd be happy if you want to take a crack at the errors, if you have the time to put into it

I can take a crack at the error handling piece, as well as the ManagedResource work. I think it makes most sense to focus on conforming the other resources first before working on error handling.

saehejkang added a commit to saehejkang/container that referenced this pull request Jan 23, 2026
- Relates to apple#642.
- Each image carries it own error message (allows for different error
types per image)
- Stdout/stderr support
- Use of the new `StderrLogHandler`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants