-
Notifications
You must be signed in to change notification settings - Fork 8
Evaluation: Fix score format #549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughSplits the evaluations router into two direct routers (dataset and evaluation) and standardizes score storage on a Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client/API Request
participant API as API Router
participant Svc as Evaluation Service
participant CRUD as CRUD Layer
participant DB as Database
participant Langfuse as External API (Langfuse)
Client->>API: GET /evaluations/:id[?get_trace_info]
API->>Svc: get_evaluation_with_scores(params)
Svc->>CRUD: get_evaluation_run(id)
CRUD->>DB: Query eval_run
DB-->>CRUD: eval_run (may include cached summary_scores/traces)
CRUD-->>Svc: eval_run
alt Need fetch or resync
Svc->>Langfuse: Fetch traces & summary_scores
Langfuse-->>Svc: langfuse_score (summary_scores + traces)
Svc->>Svc: Extract existing_summary_scores from eval_run.score
Svc->>Svc: Merge existing_summary_scores + langfuse_summary_scores (langfuse precedence)
Svc->>Svc: Build final score {merged_summary_scores, traces}
Svc->>CRUD: save_score(eval_run.id, final_score)
CRUD->>DB: Update eval_run.score
DB-->>CRUD: Confirm
CRUD-->>Svc: persisted
else Cached data sufficient
Svc->>Svc: Return cached eval_run.score
end
Svc-->>Client: Evaluation with merged scores (+ traces if requested)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
| eval_run: EvaluationRun, | ||
| langfuse: Langfuse, | ||
| force_refetch: bool = False, | ||
| ) -> dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add strict type safety here?
Summary
Target issue is #520
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.