-
Notifications
You must be signed in to change notification settings - Fork 1k
Refactor colnamesInt to provide context-specific error messages (#5039) #7554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Refactor colnamesInt to provide context-specific error messages (#5039) #7554
Conversation
81ceafe to
2104dc4
Compare
|
source is not a good name for an argument as it is widely used function name, I would use context instead. Unit tests are missing. |
b45cd57 to
11cc256
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7554 +/- ##
=======================================
Coverage 98.96% 98.96%
=======================================
Files 87 87
Lines 16737 16743 +6
=======================================
+ Hits 16563 16569 +6
Misses 174 174 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thank you for the feedback, @jangorecki ! I will rename the argument to context across the C and R levels. I will also add a dedicated unit test file to verify that the context hints are being correctly prepended to the error messages. I'll push these updates shortly. |
11cc256 to
65f8ca6
Compare
|
@ayushsingh9720 did you use AI when preparing this PR? |
|
Hi @jangorecki, yes, I've been using an AI as a , kind of a thought/learning partner to help me with the data.table codebase. as i am new to open source, it has been helping me understand the internl logic and debug build failures, but I am manually implementing the changes, running the local R CMD INSTALL builds, and verifying the error messages in my terminal to ensure everything works as intended. |
|
Ok. Check how we define tests and try to follow that logic, you can read manual of internal function |
65f8ca6 to
c9a0d8d
Compare
c9a0d8d to
2649ed9
Compare
|
checks are passing now, i have updated the unit test(Test 2357) , to follow the internal test() , as suggested. really appreciate the advice |
|
@tdhock @Anirban166 @DorisAmoakohene @joshhwuu I have completed the contributor tests for issues #5039 and #6512. I've updated the Wiki and sent an introductory email to Toby regarding my interest in a mix of low-level C work and bug fixing for GSoC 2026. I've also joined the R-GSoC Slack for future coordination. |
This PR addresses issue #5039 by refactoring the internal colnamesInt function at both the C and R levels. By introducing a mandatory source argument, the internal error messages now provide specific context (e.g., pinpointing whether a failure occurred during an on= join, setkey, or frank), making debugging significantly more intuitive for the user.
Technical Changes
src/utils.c: Modified the colnamesInt signature to accept a 5th argument, SEXP source. Updated all internal error() calls to include a %s placeholder, prepending the error with the provided context string.
src/data.table.h: Updated the function prototype to remain in sync with the new signature.
src/init.c: Updated the R-to-C registration from -1 (variable) to a strict 5 to ensure formal argument counting at the C interface level.
R/wrappers.R: Updated the R-side wrapper for colnamesInt to accept source=NULL and pass it through the .Call interface.
Global Refactor: Performed a comprehensive audit and update of 19 call sites across the following files to ensure the new 5th argument is passed correctly with descriptive hints:
R/data.table.R (Joins, setcolorder, setnames)
R/mergelist.R (Merge operations and internal column copying)
R/setkey.R (Key assignment)
R/frank.R (Ranking)
R/duplicated.R (Unique/Duplicated checks)
R/setops.R (Set operations)
Impact
Users will no longer see generic "inscrutable" errors. For example: Old Error: Error in colnamesInt(...) : argument specifying columns received non-existing column(s): cols[2]='z' New Error: Error in colnamesInt(...) : In x table's columns in on= join, argument specifying columns received non-existing column(s): cols[2]='z'
Checklist
[x] Reproduced the original "inscrutable" error.
[x] Successfully compiled using R CMD INSTALL ..
[x] Verified that all 19 call sites are updated to avoid "unused argument" or "argument mismatch" errors.
[x] Passed manual tests for joins, setkeys, and set operations.
Closes #5039