[ENH] RFC9457 compliant errors by PGijsbers · Pull Request #238 · openml/server-api

PGijsbers · 2026-02-12T08:37:54Z

Support RFC9457 compliant errors.

coderabbitai · 2026-02-12T08:38:17Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Replaces ad-hoc HTTPException usage with an RFC 9457 "application/problem+json" surface: adds ProblemDetailError, many domain-specific ProblemDetailError subclasses, and problem_detail_exception_handler in src/core/errors.py; removes the old DatasetError IntEnum. Registers the handler in src/main.py. Updates routers to raise the new problem-detail exceptions (datasets, flows, studies, tasks, qualities, task types, mldcat dataset service). Adjusts tests to expect problem+json responses. Minor unrelated tweaks: removed an error-format helper in src/core/formatting.py and made a safer config lookup in src/database/users.py.

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 10.87% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title '[ENH] RFC9457 compliant errors' directly corresponds to the main objective of implementing RFC 9457 compliance across the codebase.
Description check	✅ Passed	The PR description 'Support RFC9457 compliant errors' is directly related to the changeset, which implements RFC 9457 compliance throughout the error handling system.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch errors

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments

src/routers/openml/qualities.py (1)
38-45: Semantic mismatch: DatasetNotFoundError overridden to behave as DatasetNoDataFileError.

The overrides (code=113, status_code=412) exactly match DatasetNoDataFileError's defaults. This means the response type URI will say dataset-not-found while the legacy code says "no data file" — contradictory signals for API consumers.

If the PHP API truly returned 412/113 for this case, consider raising DatasetNoDataFileError directly (which already has those defaults), or introduce a dedicated error class for this legacy endpoint behavior.
Proposed fix using existing error class
-from core.errors import DatasetNotFoundError
+from core.errors import DatasetNoDataFileError
 ...
         # Backwards compatibility: PHP API returns 412 with code 113
         msg = "Unknown dataset."
-        no_data_file = 113
-        raise DatasetNotFoundError(
+        raise DatasetNoDataFileError(
             msg,
-            code=no_data_file,
-            status_code=HTTPStatus.PRECONDITION_FAILED,
         )

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've found 8 issues, and left some high level feedback:

The ProblemDetailError docstring and problem_detail_exception_handler docstring still refer to ProblemDetailException; consider updating those names to avoid confusion for future readers.
For unknown datasets, most endpoints now use HTTPStatus.NOT_FOUND, but get_qualities still returns PRECONDITION_FAILED while mapping to ProblemType.DATASET_NOT_FOUND; consider aligning this status code with the other dataset-not-found cases unless there is a strong backward-compatibility reason not to.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `ProblemDetailError` docstring and `problem_detail_exception_handler` docstring still refer to `ProblemDetailException`; consider updating those names to avoid confusion for future readers.
- For unknown datasets, most endpoints now use `HTTPStatus.NOT_FOUND`, but `get_qualities` still returns `PRECONDITION_FAILED` while mapping to `ProblemType.DATASET_NOT_FOUND`; consider aligning this status code with the other dataset-not-found cases unless there is a strong backward-compatibility reason not to.

## Individual Comments

### Comment 1
<location> `src/core/errors.py:62` </location>
<code_context>
+    """Exception that produces RFC 9457 compliant error responses.
+
+    Usage:
+        raise ProblemDetailException(
+            status_code=HTTPStatus.NOT_FOUND,
+            detail="Dataset 123 was not found.",
+            title="Dataset Not Found",
+            type_="https://openml.org/problems/dataset-not-found",
+            code="111",  # Extension field for legacy error codes
+        )
+    """
</code_context>

<issue_to_address>
**issue (typo):** The docstring example references `ProblemDetailException`, but the actual class is `ProblemDetailError`.

This mismatch in the example can confuse users and any tooling that consumes docstrings. Please update the example (and any related text) to use `ProblemDetailError` instead.

```suggestion
        raise ProblemDetailError(
```
</issue_to_address>

### Comment 2
<location> `src/core/errors.py:92-101` </location>
<code_context>
+        super().__init__(detail or title or "An error occurred")
+
+
+def problem_detail_exception_handler(
+    request: Request,  # noqa: ARG001
+    exc: ProblemDetailError,
+) -> JSONResponse:
+    """FastAPI exception handler for ProblemDetailException.
+
+    Returns a response with:
</code_context>

<issue_to_address>
**issue (typo):** The exception handler docstring mentions `ProblemDetailException` instead of `ProblemDetailError`.

Since this handler is registered for `ProblemDetailError` in `main.py`, the docstring should reference `ProblemDetailError` as well to keep the documentation accurate and avoid confusion in editors and generated docs.

```suggestion
def problem_detail_exception_handler(
    request: Request,  # noqa: ARG001
    exc: ProblemDetailError,
) -> JSONResponse:
    """FastAPI exception handler for ProblemDetailError.

    Returns a response with:
    - Content-Type: application/problem+json
    - RFC 9457 compliant JSON body
    """
```
</issue_to_address>

### Comment 3
<location> `src/routers/openml/qualities.py:40-42` </location>
<code_context>
-        raise HTTPException(status_code=HTTPStatus.NOT_FOUND, detail=error)
+        raise_problem(
+            status_code=HTTPStatus.NOT_FOUND,
+            type_=ProblemType.DATASET_NOT_FOUND,
+            detail="Unknown dataset.",
+            code=DatasetError.NOT_FOUND,
</code_context>

<issue_to_address>
**issue:** The problem type and legacy error code for the missing dataset seem inconsistent.

Here `type_` is `ProblemType.DATASET_NOT_FOUND`, while the legacy `code` previously used for this case was `DatasetError.NO_DATA_FILE` with the message "Unknown dataset". To keep the problem type and legacy error code semantically aligned, consider either using `DatasetError.NOT_FOUND` when the intent is "unknown dataset", or adjusting `type_` to better match the specific "no data file" condition if that’s what this branch represents.
</issue_to_address>

### Comment 4
<location> `src/core/errors.py:212` </location>
<code_context>
+        code: Optional legacy OpenML error code (for backwards compatibility).
+        **extensions: Additional extension fields to include in the response.
+    """
+    title = PROBLEM_TITLES.get(type_)
+    if code is not None:
+        extensions["code"] = str(code)
</code_context>

<issue_to_address>
**suggestion:** Consider providing a sensible fallback title when the problem type URI is not in `PROBLEM_TITLES`.

If a new `type_` is added without updating `PROBLEM_TITLES`, `title` becomes `None` and is omitted. To keep responses self-describing, consider falling back to `HTTPStatus(status_code).phrase` or a generic title when `type_` is not found.

Suggested implementation:

```python
    title = PROBLEM_TITLES.get(type_)
    if title is None:
        try:
            # Fallback to the standard HTTP status phrase when the problem type is unknown.
            title = HTTPStatus(status_code).phrase
        except ValueError:
            # As a last resort, provide a generic, self-describing title.
            title = "Unknown error"
    if code is not None:
        extensions["code"] = str(code)

```

To complete this change, ensure that `HTTPStatus` is imported at the top of `src/core/errors.py`:

1. Add `from http import HTTPStatus` alongside your other imports, if it is not already present.
2. If `HTTPStatus` is already imported elsewhere in the file, no further changes are needed.
</issue_to_address>

### Comment 5
<location> `tests/routers/openml/datasets_test.py:31-37` </location>
<code_context>
-    assert response.status_code == HTTPStatus.PRECONDITION_FAILED
-    assert response.json()["detail"] == {"code": "103", "message": "Authentication failed"}
+    assert response.status_code == HTTPStatus.UNAUTHORIZED
+    assert response.headers["content-type"] == "application/problem+json"
+    error = response.json()
+    assert error["type"] == ProblemType.AUTHENTICATION_FAILED
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding dedicated unit tests for the new RFC9457 error helpers (ProblemDetail, ProblemDetailError, problem_detail_exception_handler, raise_problem).

The current endpoint tests give good integration coverage, but please also add unit tests for `core.errors` itself (e.g. in `tests/core/test_errors.py`) to make the new helpers safer to change. In particular, cover:
- `ProblemDetail`: default `type` of `about:blank`, `model_dump(exclude_none=True)` omitting `None` fields, and `type_/type` aliasing.
- `ProblemDetailError`: storing `status_code`, building a `ProblemDetail` with matching `status`, and preserving arbitrary `extensions` (multiple keys and value types).
- `problem_detail_exception_handler`: returns a `JSONResponse` with `application/problem+json`, merges `problem` and `extensions` correctly, and ensures `status` in the body matches the HTTP status.
- `raise_problem`: looks up `title` from `PROBLEM_TITLES`, coerces numeric `code` to strings, and passes through `instance` and extra extension fields.
These will better protect RFC9457 behavior from regressions than endpoint-only tests.
</issue_to_address>

### Comment 6
<location> `tests/routers/openml/migration/datasets_migration_test.py:31-36` </location>
<code_context>

     if new.status_code != HTTPStatus.OK:
-        assert original.json()["error"] == new.json()["detail"]
+        # RFC 9457: Python API now returns problem+json format
+        assert new.headers["content-type"] == "application/problem+json"
+        # Both APIs should return error responses in the same cases
+        assert "error" in original.json()
         return
</code_context>

<issue_to_address>
**suggestion (testing):** Strengthen migration test for dataset error responses by asserting basic RFC9457 fields.

This currently only checks the `application/problem+json` content type and that the PHP response has an `"error"` key. To better protect the new behavior while keeping this as a migration test (not a full schema check), consider also asserting minimal RFC9457 fields, for example:
```python
error = new.json()
assert "type" in error
assert "status" in error
assert error["status"] == new.status_code
```
This gives early warning if the problem-details structure or status field diverges from expectations without making the test brittle.

```suggestion
    if new.status_code != HTTPStatus.OK:
        # RFC 9457: Python API now returns problem+json format
        assert new.headers["content-type"] == "application/problem+json"
        # Both APIs should return error responses in the same cases
        assert "error" in original.json()

        # Minimal RFC 9457 structure checks for Python API response
        error = new.json()
        assert "type" in error
        assert "status" in error
        assert error["status"] == new.status_code
        return
```
</issue_to_address>

### Comment 7
<location> `tests/routers/openml/migration/datasets_migration_test.py:209-213` </location>
<code_context>
     assert response.status_code == original.status_code

     if response.status_code != HTTPStatus.OK:
-        error = response.json()["detail"]
-        error["code"] = str(error["code"])
-        assert error == original.json()["error"]
+        # RFC 9457: Python API now returns problem+json format
+        assert response.headers["content-type"] == "application/problem+json"
+        error = response.json()
+        # Verify Python API returns properly typed RFC 9457 response
+        assert "type" in error
+        assert "status" in error
+        # Both APIs should error in the same cases
+        assert "error" in original.json()
         return
</code_context>

<issue_to_address>
**suggestion (testing):** Preserve explicit checks for legacy error codes in the features migration test to ensure backward-compatibility.

This test used to compare the legacy `code` field between PHP and Python. After switching to RFC9457, it only validates `type`/`status` and the presence of an `"error"` block. Since this endpoint historically exposed specific error codes (272/273/274), please keep a minimal assertion that the legacy `code` is still present and string-typed in the Python response, e.g.:
```python
error = response.json()
assert "code" in error
assert isinstance(error["code"], str)
# optionally:
assert error["code"] in {"272", "273", "274"}
```

```suggestion
        # RFC 9457: Python API now returns problem+json format
        assert new.headers["content-type"] == "application/problem+json"
        error = new.json()
        # Verify Python API returns properly typed RFC 9457 response
        assert "type" in error
        assert "status" in error
        # Preserve legacy error code field for backward-compatibility
        assert "code" in error
        assert isinstance(error["code"], str)
        assert error["code"] in {"272", "273", "274"}
        # Both APIs should error in the same cases
        assert "error" in original.json()
        return
```
</issue_to_address>

### Comment 8
<location> `src/core/errors.py:114` </location>
<code_context>
+
+# Problem type URIs for OpenML-specific errors
+# These should be documented at the corresponding URLs
+class ProblemType:
+    """Problem type URIs for common OpenML errors."""
+
</code_context>

<issue_to_address>
**issue (complexity):** Consider collapsing the parallel problem-type URI/title structures and separate extensions dict into richer enum and model types to reduce indirection and duplication.

You can reduce indirection without changing functionality by collapsing a couple of parallel structures.

### 1. Collapse `ProblemType` + `PROBLEM_TITLES`

You currently maintain URIs and titles separately. You can avoid the extra dict, runtime lookup, and mismatch risk by making `ProblemType` an enum that carries both `uri` and `title`, and letting `raise_problem` accept that enum:

```python
from enum import Enum

class ProblemType(Enum):
    DATASET_NOT_FOUND = (
        "https://openml.org/problems/dataset-not-found",
        "Dataset Not Found",
    )
    DATASET_NO_ACCESS = (
        "https://openml.org/problems/dataset-no-access",
        "Dataset Access Denied",
    )
    # ... other entries ...

    @property
    def uri(self) -> str:
        return self.value[0]

    @property
    def title(self) -> str:
        return self.value[1]
```

Then simplify `raise_problem` to:

```python
def raise_problem(
    status_code: HTTPStatus | int,
    type_: ProblemType,
    detail: str,
    *,
    instance: str | None = None,
    code: int | str | None = None,
    **extensions: ExtensionValue,
) -> NoReturn:
    if code is not None:
        extensions["code"] = str(code)

    raise ProblemDetailError(
        status_code=status_code,
        detail=detail,
        title=type_.title,
        type_=type_.uri,
        instance=instance,
        **extensions,
    )
```

Call sites become more self-documenting and you can drop `PROBLEM_TITLES` entirely:

```python
raise_problem(
    HTTPStatus.NOT_FOUND,
    ProblemType.DATASET_NOT_FOUND,
    detail="Dataset 123 was not found.",
)
```

### 2. Fold `extensions` into `ProblemDetail`

You don’t need both `ProblemDetail` and a separate `extensions` dict plus merge logic. You can let `ProblemDetail` accept arbitrary extra fields and construct it directly with extensions:

```python
class ProblemDetail(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
        extra="allow",  # allow arbitrary extension fields
    )

    type_: str = Field(
        default="about:blank",
        alias="type",
        serialization_alias="type",
    )
    title: str | None = None
    status: int | None = None
    detail: str | None = None
    instance: str | None = None
```

Then `ProblemDetailError` doesn’t need to store `extensions` separately:

```python
class ProblemDetailError(Exception):
    def __init__(
        self,
        status_code: HTTPStatus | int,
        detail: str | None = None,
        title: str | None = None,
        type_: str = "about:blank",
        instance: str | None = None,
        **extensions: ExtensionValue,
    ) -> None:
        self.status_code = int(status_code)
        self.problem = ProblemDetail(
            type_=type_,
            title=title,
            status=self.status_code,
            detail=detail,
            instance=instance,
            **extensions,  # extensions become top-level JSON fields
        )
        super().__init__(detail or title or "An error occurred")
```

And the handler becomes a straight dump (no merge needed):

```python
def problem_detail_exception_handler(
    request: Request,  # noqa: ARG001
    exc: ProblemDetailError,
) -> JSONResponse:
    return JSONResponse(
        status_code=exc.status_code,
        content=exc.problem.model_dump(by_alias=True, exclude_none=True),
        media_type="application/problem+json",
    )
```

This keeps the wire format identical (extensions are still top-level fields) while removing one data structure (`extensions` on the exception) and one merge step from the handler. Combined with the enum change, you remove duplication and some mental overhead without changing behavior or the RFC 9457-compliant API surface.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

src/core/errors.py

src/routers/openml/qualities.py

src/core/errors.py

tests/routers/openml/datasets_test.py

sourcery-ai · 2026-02-12T08:39:07Z

tests/routers/openml/migration/datasets_migration_test.py

    if new.status_code != HTTPStatus.OK:
-        assert original.json()["error"] == new.json()["detail"]
+        # RFC 9457: Python API now returns problem+json format
+        assert new.headers["content-type"] == "application/problem+json"
+        # Both APIs should return error responses in the same cases
+        assert "error" in original.json()
        return


suggestion (testing): Strengthen migration test for dataset error responses by asserting basic RFC9457 fields.

This currently only checks the application/problem+json content type and that the PHP response has an "error" key. To better protect the new behavior while keeping this as a migration test (not a full schema check), consider also asserting minimal RFC9457 fields, for example:

error = new.json() assert "type" in error assert "status" in error assert error["status"] == new.status_code

This gives early warning if the problem-details structure or status field diverges from expectations without making the test brittle.

Suggested change

if new.status_code != HTTPStatus.OK:

assert original.json()["error"] == new.json()["detail"]

# RFC 9457: Python API now returns problem+json format

assert new.headers["content-type"] == "application/problem+json"

# Both APIs should return error responses in the same cases

assert "error" in original.json()

return

if new.status_code != HTTPStatus.OK:

# RFC 9457: Python API now returns problem+json format

assert new.headers["content-type"] == "application/problem+json"

# Both APIs should return error responses in the same cases

assert "error" in original.json()

# Minimal RFC 9457 structure checks for Python API response

error = new.json()

assert "type" in error

assert "status" in error

assert error["status"] == new.status_code

return

sourcery-ai · 2026-02-12T08:39:07Z

tests/routers/openml/migration/datasets_migration_test.py

+        # RFC 9457: Python API now returns problem+json format
+        assert new.headers["content-type"] == "application/problem+json"
+        # Both APIs should error in the same cases
+        assert "error" in original.json()
        return


suggestion (testing): Preserve explicit checks for legacy error codes in the features migration test to ensure backward-compatibility.

This test used to compare the legacy code field between PHP and Python. After switching to RFC9457, it only validates type/status and the presence of an "error" block. Since this endpoint historically exposed specific error codes (272/273/274), please keep a minimal assertion that the legacy code is still present and string-typed in the Python response, e.g.:

error = response.json() assert "code" in error assert isinstance(error["code"], str) # optionally: assert error["code"] in {"272", "273", "274"}

Suggested change

# RFC 9457: Python API now returns problem+json format

assert new.headers["content-type"] == "application/problem+json"

# Both APIs should error in the same cases

assert "error" in original.json()

return

# RFC 9457: Python API now returns problem+json format

assert new.headers["content-type"] == "application/problem+json"

error = new.json()

# Verify Python API returns properly typed RFC 9457 response

assert "type" in error

assert "status" in error

# Preserve legacy error code field for backward-compatibility

assert "code" in error

assert isinstance(error["code"], str)

assert error["code"] in {"272", "273", "274"}

# Both APIs should error in the same cases

assert "error" in original.json()

return

src/core/errors.py

codecov · 2026-02-12T08:39:26Z

Codecov Report

❌ Patch coverage is 62.66094% with 87 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@294ca47). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
src/routers/openml/datasets.py	6.45%	29 Missing ⚠️
src/routers/openml/study.py	7.69%	24 Missing ⚠️
src/core/errors.py	88.51%	17 Missing ⚠️
src/routers/openml/flows.py	33.33%	4 Missing ⚠️
src/routers/openml/tasks.py	33.33%	4 Missing ⚠️
src/routers/openml/qualities.py	40.00%	3 Missing ⚠️
src/routers/mldcat_ap/dataset.py	50.00%	2 Missing ⚠️
src/routers/openml/tasktype.py	50.00%	2 Missing ⚠️
src/database/users.py	0.00%	0 Missing and 1 partial ⚠️
src/main.py	50.00%	1 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #238   +/-   ##
=======================================
  Coverage        ?   56.15%           
=======================================
  Files           ?       32           
  Lines           ?     1284           
  Branches        ?      105           
=======================================
  Hits            ?      721           
  Misses          ?      561           
  Partials        ?        2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/routers/mldcat_ap/dataset.py (1)
153-154: ⚠️ Potential issue | 🟡 Minor

Unhandled StopIteration if quality name doesn't match any results.

Line 154 uses a generator expression with next() but has no fallback. If quality_name doesn't match any quality in the list, this will raise a raw StopIteration, resulting in an uncontrolled 500. Consider catching this and raising a proper raise_problem error consistent with the new error pattern.
Suggested fix
-    quality = next(q for q in qualities if q.name == quality_name)
+    quality = next((q for q in qualities if q.name == quality_name), None)
+    if quality is None:
+        raise_problem(
+            status_code=HTTPStatus.NOT_FOUND,
+            type_=ProblemType.NO_RESULTS,
+            detail=f"Quality '{quality_name}' not found for distribution {distribution_id}.",
+        )
tests/routers/openml/dataset_tag_test.py (1)
83-84: ⚠️ Potential issue | 🟡 Minor

Pre-existing bug: missing = in API key query parameter.

Line 84 has f"/datasets/tag?api_key{ApiKey.ADMIN}" which produces a URL like /datasets/tag?api_keyabc instead of /datasets/tag?api_key=abc. The test likely still passes because FastAPI body validation rejects the invalid tag before authentication is checked, but the request is sent without a valid API key.
🐛 Proposed fix
-        f"/datasets/tag?api_key{ApiKey.ADMIN}",
+        f"/datasets/tag?api_key={ApiKey.ADMIN}",
src/routers/openml/datasets.py (1)
255-278: ⚠️ Potential issue | 🟡 Minor

Docstring still references ProblemDetailException.

Line 262 says "Raises ProblemDetailException" but the actual exception class is ProblemDetailError.
Proposed fix
-    Raises ProblemDetailException if the dataset does not exist or the user can not access it.
+    Raises ProblemDetailError if the dataset does not exist or the user can not access it.

🤖 Fix all issues with AI agents

In `@src/core/errors.py`:
- Around line 58-89: Update the class docstring to reference the correct class
name: replace occurrences of "ProblemDetailException" with "ProblemDetailError"
(in the usage example) so the example matches the actual class name
ProblemDetailError; ensure any other mentions inside the docstring align with
the class identifier ProblemDetailError and keep the example parameters
(status_code, detail, title, type_, code) unchanged.

In `@src/routers/openml/qualities.py`:
- Around line 38-43: The raised problem in raise_problem uses mismatched enums:
ProblemType.DATASET_NOT_FOUND with code=DatasetError.NO_DATA_FILE; update this
to be consistent by either changing code to DatasetError.NOT_FOUND to match
ProblemType.DATASET_NOT_FOUND, or change type_ to
ProblemType.DATASET_NO_DATA_FILE if the intent is to signal a missing data file;
adjust the call site (raise_problem) accordingly following the pattern used in
datasets.py so ProblemType and DatasetError align (e.g., use
ProblemType.DATASET_NOT_FOUND + DatasetError.NOT_FOUND or
ProblemType.DATASET_NO_DATA_FILE + DatasetError.NO_DATA_FILE).

🧹 Nitpick comments (3)

tests/routers/openml/migration/datasets_migration_test.py (1)

235-242: Consider asserting a specific ProblemType for feature error cases.

Lines 239–240 only check that "type" and "status" keys exist in the error payload, unlike other migration tests (e.g., Lines 112–113) which assert exact ProblemType values. If multiple error types are possible here (e.g., dataset not found vs. features not processed), a set-based check like assert error["type"] in {ProblemType.DATASET_NOT_FOUND, ProblemType.DATASET_NO_FEATURES, ...} would be more precise while still flexible.
src/core/errors.py (2)
92-109: Extension keys could shadow standard RFC 9457 fields.

content.update(exc.extensions) on Line 103 could overwrite standard problem detail fields (type, title, status, detail, instance) if an extension key collides. Currently raise_problem only adds "code", but direct ProblemDetailError construction could introduce collisions.

Consider either validating extension keys or applying them with a namespace/prefix guard.
Proposed defensive guard
     content = exc.problem.model_dump(by_alias=True, exclude_none=True)
-    content.update(exc.extensions)
+    reserved = {"type", "title", "status", "detail", "instance"}
+    safe_extensions = {k: v for k, v in exc.extensions.items() if k not in reserved}
+    content.update(safe_extensions)
162-190: Consider a test to enforce PROBLEM_TITLES covers all ProblemType constants.

raise_problem silently falls back to title=None when a type URI isn't in PROBLEM_TITLES. If a new ProblemType constant is added without a corresponding title entry, responses will lack a title with no warning. A simple test iterating over ProblemType class attributes and asserting presence in PROBLEM_TITLES would prevent this drift.

Also applies to: 193-222

src/core/errors.py

src/routers/openml/qualities.py

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/core/errors.py`:
- Line 93: Update the docstring that currently reads "FastAPI exception handler
for ProblemDetailException" to reference the correct class name
ProblemDetailError; locate the docstring associated with the ProblemDetailError
FastAPI exception handler (search for ProblemDetailError and its handler
function) and change the class name in the text so the docstring correctly
states "FastAPI exception handler for ProblemDetailError."
- Around line 99-100: The code currently merges exc.extensions into the
ProblemDetail dict with content.update(exc.extensions), which allows extensions
to overwrite RFC 9457 standard members; change this to filter or validate
extensions before merging: identify the standard keys (type, title, status,
detail, instance) and either remove any of those keys from exc.extensions or
raise a validation error in ProblemDetailError.__init__ when a collision occurs,
then merge only the allowed extension keys (e.g., filtered_extensions) into
content instead of exc.extensions to ensure standard fields cannot be
overwritten.

🧹 Nitpick comments (1)

src/core/errors.py (1)
111-187: No enforcement that every ProblemType has a corresponding PROBLEM_TITLES entry.

Adding a new constant to ProblemType without updating PROBLEM_TITLES will silently produce responses with title: null. Consider adding a module-level assertion or a test to keep them in sync.
# e.g., at module level or in a test:
_problem_types = {v for k, v in vars(ProblemType).items() if not k.startswith("_")}
assert _problem_types == set(PROBLEM_TITLES), "ProblemType and PROBLEM_TITLES are out of sync"

src/core/errors.py

PGijsbers added 3 commits February 12, 2026 10:36

Allow more lenient api key through configuration

855fa05

Added RFC with some failing linting/type checks

5c551c7

make access safe even if toml doesn't have dev section

81b4f60

PGijsbers added the enhancement New feature or request label Feb 12, 2026

sourcery-ai bot reviewed Feb 12, 2026

View reviewed changes

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

src/core/errors.py Outdated Show resolved Hide resolved

src/routers/openml/qualities.py Outdated Show resolved Hide resolved

Simplify model definition

50ed235

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

src/core/errors.py Outdated Show resolved Hide resolved

src/core/errors.py Outdated Show resolved Hide resolved

PGijsbers added 5 commits February 12, 2026 11:14

Update name in docstring

dcc5fcd

Rewrite errors to separate classes

b6db690

Remove unused dictionary

5e12e74

Remove the ProblemType class as it was confusing and only for tests

4a0d5cb

Provide default codes for the different errors based on PHP codes

97d5378

Uh oh!

Conversation

PGijsbers commented Feb 12, 2026

Uh oh!

coderabbitai bot commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sourcery-ai bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov bot commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai bot commented Feb 12, 2026 •

edited

Loading

codecov bot commented Feb 12, 2026 •

edited

Loading