Add report generation for OSU Benchmark #807
base: main

Conversation
📝 Walkthrough
This pull request introduces OSU Benchmark report generation and comparison functionality. A new report generation strategy extracts benchmark data from stdout into CSV format, while a specialized comparison report generates HTML tables and charts aggregating metrics across multiple benchmark runs. Supporting registration, module exports, and test infrastructure are also added.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 2 passed
Greptile Summary
This PR adds report generation functionality for OSU Benchmark tests. The implementation parses OSU stdout output into CSV format with support for three benchmark types: latency (with avg_lat), bandwidth (with mb_sec), and multiple bandwidth / message rate (with mb_sec and messages_sec).

Key changes:
The implementation uses regex patterns to detect benchmark types, parses data rows, and handles missing files gracefully by returning empty DataFrames. The comparison report integrates with the existing ComparisonReport base class to generate HTML reports with Bokeh charts and Rich tables.

Confidence Score: 5/5
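For context, a minimal sketch of the header-detection step described above — hedged, since only fragments of the real implementation appear in this review; the header patterns and the `BenchmarkType` enum shape are assumptions modeled on the code excerpts quoted below:

```python
import re
from enum import Enum
from pathlib import Path


class BenchmarkType(Enum):
    LATENCY = "latency"
    BANDWIDTH = "bandwidth"
    MULTIPLE_BANDWIDTH = "multiple_bandwidth"


# Assumed header patterns; ordered so "Multiple Bandwidth" is tried before the
# plain "Bandwidth" pattern, which would otherwise also match it.
_HEADER_PATTERNS: list[tuple[BenchmarkType, re.Pattern[str]]] = [
    (BenchmarkType.MULTIPLE_BANDWIDTH, re.compile(r"Multiple Bandwidth / Message Rate")),
    (BenchmarkType.BANDWIDTH, re.compile(r"Bandwidth")),
    (BenchmarkType.LATENCY, re.compile(r"Latency")),
]


def detect_benchmark_type(stdout: Path) -> BenchmarkType | None:
    """Return the benchmark type from the first recognizable '#' header line."""
    if not stdout.is_file():
        return None  # missing file: caller falls back to an empty DataFrame
    for line in stdout.read_text().splitlines():
        if not line.startswith("#"):
            continue
        for btype, pattern in _HEADER_PATTERNS:
            if pattern.search(line):
                return btype
    return None  # unrecognized output (e.g. osu_hello)
```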
8 files reviewed, no comments
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/osu_bench/osu_comparison_report.py`:
- Around line 65-142: The code reads each group's CSVs twice because
create_tables and create_charts each call extract_data_as_df independently; add
a simple per-report cache to avoid duplicate reads by implementing a helper
(e.g., _get_group_dfs) that checks/sets a dict on self (e.g., self._dfs_cache)
keyed by a stable identifier for the group (like tuple(item.tr.path) or
id(group)), and returns the list produced by [self.extract_data_as_df(item.tr)
for item in group.items]; replace the direct list comprehensions in
create_tables and create_charts with calls to _get_group_dfs and clear/init
self._dfs_cache appropriately (e.g., in report constructor or before processing)
so extract_data_as_df is invoked only once per group.
In `@src/cloudai/workloads/osu_bench/report_generation_strategy.py`:
- Around line 64-85: The two branches in _parse_data_row that handle
BenchmarkType.BANDWIDTH and the LATENCY fallback both return [parts[0],
parts[1]]; simplify by removing the explicit BANDWIDTH branch and collapsing
them into a single fallback that returns [parts[0], parts[1]] for any
non-MULTIPLE_BANDWIDTH case (after validating parts and the message-size int),
keeping the MULTIPLE_BANDWIDTH branch unchanged; this reduces duplication while
preserving behavior tied to _columns_for_type.
In
`@tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py`:
- Around line 63-112: Add three edge-case tests for extract_osu_bench_data to
cover its early-return branches: (1) "file not found" — call
extract_osu_bench_data with a non-existent Path and assert it returns an empty
DataFrame (no columns, shape (0,0) or however empty is represented in your
project), (2) "empty file" — create an empty stdout file and assert
extract_osu_bench_data returns an empty DataFrame, and (3) "unrecognized header"
— write a stdout file containing non-OSU output (e.g., "osu_hello" text) and
assert extract_osu_bench_data returns an empty DataFrame; reference the function
extract_osu_bench_data and the module osu_bench.py so tests exercise the
special-case handling added there.
src/cloudai/workloads/osu_bench/osu_comparison_report.py:

```python
def create_tables(self, cmp_groups: list[GroupedTestRuns]) -> list[Table]:
    tables: list[Table] = []
    for group in cmp_groups:
        dfs = [self.extract_data_as_df(item.tr) for item in group.items]

        if self._has_metric(dfs, "avg_lat"):
            tables.append(
                self.create_table(
                    group,
                    dfs=dfs,
                    title="Latency",
                    info_columns=list(self.INFO_COLUMNS),
                    data_columns=["avg_lat"],
                )
            )
        if self._has_metric(dfs, "mb_sec"):
            tables.append(
                self.create_table(
                    group,
                    dfs=dfs,
                    title="Bandwidth",
                    info_columns=list(self.INFO_COLUMNS),
                    data_columns=["mb_sec"],
                )
            )
        if self._has_metric(dfs, "messages_sec"):
            tables.append(
                self.create_table(
                    group,
                    dfs=dfs,
                    title="Message Rate",
                    info_columns=list(self.INFO_COLUMNS),
                    data_columns=["messages_sec"],
                )
            )

    return tables


def create_charts(self, cmp_groups: list[GroupedTestRuns]) -> list[bk.figure]:
    charts: list[bk.figure] = []
    for group in cmp_groups:
        dfs = [self.extract_data_as_df(item.tr) for item in group.items]

        if self._has_metric(dfs, "avg_lat"):
            charts.append(
                self.create_chart(
                    group,
                    dfs,
                    "Latency",
                    list(self.INFO_COLUMNS),
                    ["avg_lat"],
                    "Time (us)",
                )
            )
        if self._has_metric(dfs, "mb_sec"):
            charts.append(
                self.create_chart(
                    group,
                    dfs,
                    "Bandwidth",
                    list(self.INFO_COLUMNS),
                    ["mb_sec"],
                    "Bandwidth (MB/s)",
                )
            )
        if self._has_metric(dfs, "messages_sec"):
            charts.append(
                self.create_chart(
                    group,
                    dfs,
                    "Message Rate",
                    list(self.INFO_COLUMNS),
                    ["messages_sec"],
                    "Messages/s",
                )
            )

    return charts
```
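`_has_metric` is used throughout this hunk but its definition is not part of the diff; a plausible minimal implementation, assuming it simply reports whether any run produced data for the given column:

```python
import pandas as pd


# Hypothetical method on the comparison report class; the real PR code is not shown.
def _has_metric(self, dfs: list[pd.DataFrame], column: str) -> bool:
    # True if at least one run's DataFrame is non-empty and contains this column.
    return any(not df.empty and column in df.columns for df in dfs)
```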
🧹 Nitpick | 🔵 Trivial
Repeated extract_data_as_df calls for the same group items across create_tables and create_charts.
Each group's CSVs are read from disk twice — once for tables, once for charts. This follows the existing ComparisonReport interface pattern (each method receives cmp_groups independently), so changing it would require a broader refactor. Acceptable as-is for small CSV files.
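A minimal sketch of the cache suggested in the AI-agent prompt earlier, assuming the GroupedTestRuns and extract_data_as_df names from the diff above; keying by id(group) is one of the options the prompt mentions:

```python
import pandas as pd


# Hypothetical method on the comparison report class.
def _get_group_dfs(self, group: GroupedTestRuns) -> list[pd.DataFrame]:
    """Read each group's CSVs at most once per report instance."""
    if not hasattr(self, "_dfs_cache"):
        self._dfs_cache: dict[int, list[pd.DataFrame]] = {}
    key = id(group)  # stable while this report instance processes the groups
    if key not in self._dfs_cache:
        self._dfs_cache[key] = [self.extract_data_as_df(item.tr) for item in group.items]
    return self._dfs_cache[key]
```

create_tables and create_charts would then replace their list comprehensions with `dfs = self._get_group_dfs(group)`.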
src/cloudai/workloads/osu_bench/report_generation_strategy.py:

```python
def _parse_data_row(parts: list[str], benchmark_type: BenchmarkType) -> list[str] | None:
    if len(parts) < 2:
        return None

    try:
        int(parts[0])  # message size
    except ValueError:
        return None

    # Append row data based on benchmark type.
    if benchmark_type == BenchmarkType.MULTIPLE_BANDWIDTH:
        if len(parts) >= 3:
            # size, MB/s, Messages/s
            return [parts[0], parts[1], parts[2]]
        return None

    if benchmark_type == BenchmarkType.BANDWIDTH:
        # size, MB/s
        return [parts[0], parts[1]]

    # LATENCY
    return [parts[0], parts[1]]
```
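`_columns_for_type`, referenced in the review comment below, is not shown in this hunk; judging from the column names asserted in the unit tests further down, it presumably maps each benchmark type to its output columns. A reconstruction, not the PR's actual code:

```python
def _columns_for_type(benchmark_type: BenchmarkType) -> list[str]:
    # Column names match those asserted in the unit tests below.
    if benchmark_type == BenchmarkType.MULTIPLE_BANDWIDTH:
        return ["size", "mb_sec", "messages_sec"]
    if benchmark_type == BenchmarkType.BANDWIDTH:
        return ["size", "mb_sec"]
    return ["size", "avg_lat"]  # LATENCY
```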
🧹 Nitpick | 🔵 Trivial
_parse_data_row for BANDWIDTH and LATENCY are identical — both return [parts[0], parts[1]].
The two branches at lines 80-82 and 84-85 produce the same result. This is not a bug (the column naming differs in _columns_for_type), but you could collapse them into a single fallback for clarity.
♻️ Minor simplification

```diff
 if benchmark_type == BenchmarkType.MULTIPLE_BANDWIDTH:
     if len(parts) >= 3:
         return [parts[0], parts[1], parts[2]]
     return None
-if benchmark_type == BenchmarkType.BANDWIDTH:
-    # size, MB/s
-    return [parts[0], parts[1]]
-
-# LATENCY
+# BANDWIDTH or LATENCY: size + one metric
 return [parts[0], parts[1]]
```
tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py:

```python
def test_osu_multiple_bandwidth_message_rate_parsing(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    stdout.write_text(OSU_MULTIPLE_BW)

    df = extract_osu_bench_data(stdout)
    assert list(df.columns) == ["size", "mb_sec", "messages_sec"]
    assert df.shape == (2, 3)
    assert df["size"].iloc[0] == 1
    assert df["mb_sec"].iloc[0] == pytest.approx(2.36)
    assert df["messages_sec"].iloc[0] == pytest.approx(2356892.44)
    assert df["size"].iloc[-1] == 4194304
    assert df["mb_sec"].iloc[-1] == pytest.approx(26903.29)
    assert df["messages_sec"].iloc[-1] == pytest.approx(6414.24)


def test_osu_latency_parsing(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    stdout.write_text(OSU_ALLGATHER_LAT)

    df = extract_osu_bench_data(stdout)
    assert list(df.columns) == ["size", "avg_lat"]
    assert df.shape == (2, 2)
    assert df["size"].iloc[0] == 1
    assert df["avg_lat"].iloc[0] == pytest.approx(2.81)
    assert df["size"].iloc[-1] == 1048576
    assert df["avg_lat"].iloc[-1] == pytest.approx(104.30)


def test_osu_bandwidth_parsing(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    stdout.write_text(OSU_BW)

    df = extract_osu_bench_data(stdout)
    assert list(df.columns) == ["size", "mb_sec"]
    assert df.shape == (2, 2)
    assert df["size"].iloc[0] == 1
    assert df["mb_sec"].iloc[0] == pytest.approx(2.32)
    assert df["size"].iloc[-1] == 4194304
    assert df["mb_sec"].iloc[-1] == pytest.approx(27161.33)


def test_osu_multi_latency_short_header_parsing(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    stdout.write_text(OSU_MULTI_LAT)

    df = extract_osu_bench_data(stdout)
    assert list(df.columns) == ["size", "avg_lat"]
    assert df.shape == (6, 2)
    assert df["size"].tolist() == [1, 2, 4, 8, 16, 32]
    assert df["avg_lat"].tolist() == pytest.approx([1.88, 1.84, 1.88, 1.91, 1.87, 2.01])
```
🧹 Nitpick | 🔵 Trivial
Good coverage of the happy-path parsing for all three benchmark types.
Consider adding edge-case tests for extract_osu_bench_data: file not found (returns empty DataFrame), empty file, and stdout with no recognizable header (e.g., osu_hello output). These would guard the early-return paths and align with the special-case handling added in osu_bench.py.
Would you like me to generate those edge-case tests?
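A minimal sketch of those three tests, hedged: the import path mirrors the module named in the prompt above (the test file's actual imports are not shown in this diff), and the empty-DataFrame expectation follows the Greptile summary's description of missing-file handling:

```python
from pathlib import Path

# Assumed import path, based on the module named in this review.
from cloudai.workloads.osu_bench.report_generation_strategy import extract_osu_bench_data


def test_osu_missing_file_returns_empty_df(tmp_path: Path) -> None:
    # File not found: the parser should return an empty DataFrame, not raise.
    df = extract_osu_bench_data(tmp_path / "does_not_exist.txt")
    assert df.empty


def test_osu_empty_file_returns_empty_df(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    stdout.write_text("")
    df = extract_osu_bench_data(stdout)
    assert df.empty


def test_osu_unrecognized_header_returns_empty_df(tmp_path: Path) -> None:
    stdout = tmp_path / "stdout.txt"
    # Non-benchmark output such as osu_hello has no recognizable OSU header.
    stdout.write_text("# OSU MPI Hello World Test\nThis is a test with 2 processes\n")
    df = extract_osu_bench_data(stdout)
    assert df.empty
```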
🤖 Prompt for AI Agents
In
`@tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py`
around lines 63 - 112, Add three edge-case tests for extract_osu_bench_data to
cover its early-return branches: (1) "file not found" — call
extract_osu_bench_data with a non-existent Path and assert it returns an empty
DataFrame (no columns, shape (0,0) or however empty is represented in your
project), (2) "empty file" — create an empty stdout file and assert
extract_osu_bench_data returns an empty DataFrame, and (3) "unrecognized header"
— write a stdout file containing non-OSU output (e.g., "osu_hello" text) and
assert extract_osu_bench_data returns an empty DataFrame; reference the function
extract_osu_bench_data and the module osu_bench.py so tests exercise the
special-case handling added there.
9 files reviewed, no comments
9 files reviewed, no comments
Summary
- Adds report generation for OSU Benchmark, including special-case handling for non-benchmark outputs (osu_init, osu_hello).

Test Plan
Tested on internal environment:
- osu_latency, osu_bw, osu_get_bw, osu_put_bw, osu_bibw, osu_mbw_mr, osu_multi_lat.
- osu_bw, osu_latency.