-
Notifications
You must be signed in to change notification settings - Fork 13
Format LANGUAGE SQL function/procedure bodies #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/embedSql.ts
Outdated
| const quote = detectQuote(node); | ||
|
|
||
| if (quote) { | ||
| const sql = await textToDoc(node.value, options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an option that can be passed here to tell textToDoc to retain the string quoting when quote === "'"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handling the escaped quotes properly is very hard to do and IMHO not worth the effort. We'd need to keep track which quotes were escaped in the original code and then re-escape them later. I don't even know from where to start with when solving this problem.
In practice one generally doesn't use single quotes for these SQL strings. At most one would for quoting a very simple SQL. If quotes are needed inside there, one generally just switches to another quoting style (like the dollar-quotes) instead of using escapes.
What I would suggest is:
- format SQL inside dollar-quoted strings as you already do
- convert single-quoted strings to dollar-quoted strings
- do no formatting when it's not possible to convert a single-quoted string to dollar-quoted one. (Like for start we might only attempt to covert it to the form
$$ ... $$, which can't be done when the string already contains$$as substring).
This is largely the approach taken inside embedJs() function where strings are converted to either r''' ... ''' or r""" ... """ quoting style (specific to BigQuery).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree completely!
src/embedSql.ts
Outdated
| const match = node.text.match(/^('|\$[^$]*\$)/); | ||
| const quote = match?.[1]; | ||
|
|
||
| if (quote && node.text.endsWith(quote)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wen't with this version over const quote = node.text.match(/^('|\$[^$]*\$).*\1$/s)?.[1] because the s regular expression flag is only supported from ES2018. Also my guess is that endsWith might perform better than a regex backreference but I'm not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to check both the start and end of the string to determine its quoting style. When the string starts with ' it must end with '. When the string starts with $$ it must end with $$ and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know. I presumed that this is handled by the parser and indicated that in my commit message but let me know if that's not the case and I'll change the commit message. Or ignore this if you squash commits when merging :)
69180ea to
4e30fe6
Compare
src/embedSql.ts
Outdated
|
|
||
| const isSqlLanguageClause = ( | ||
| clause: CreateFunctionStmt["clauses"][0], | ||
| ): boolean => isLanguageClause(clause) && clause.name.name === "sql"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PostgreSQL the language name is case-insensitive when unquoted. So LANGUAGE SQL and LANGUAGE SqL are also valid.
However, that case-insensitivity only applies when the language name identifier is not quoted. So LANGUAGE "sql" is valid, but LANGUAGE "SQL" is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following the code in embedJs and presumed the casing was normalised elsewhere, but good to know. Btw just FYI, I noticed that the parser currently fails on single-quoted language identifiers, e.g. LANGUAGE 'sql', which are also valid in postgres -- but I'd wait for someone to request it before spending time fixing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info.
|
FYI, you don't need to worry about MySQL, because MySQL doesn't have function definitions inside strings. Instead in MySQL a CREATE PROCEDURE citycount (IN country CHAR(3), OUT cities INT)
BEGIN
SELECT COUNT(*) INTO cities FROM world.city
WHERE CountryCode = country;
ENDSimilarly in BigQuery the SQL-functions are written without quoting, like so: CREATE TEMP FUNCTION AddFourAndDivide(x INT64, y INT64)
RETURNS FLOAT64
AS (
(x + 4) / y
);In both cases But for PostgreSQL we should implement this formatting for both |
|
Thanks for the review, I've implemented the fixes. And thanks for the info regarding MySQL! |
|
This has now been released in 0.18.0. |
This enables formatting function and procedue bodies where
LANGUAGEissql. More info: nene/sql-parser-cst#126.This is currently a draft because it's not handling escaped string literals correctly for string quoted functions. Also, I'm not certain if there's more work needed to format SQL functions in other dialects (eg MySQL?).UPDATE:$$ .. $$), to simplify formatting. If the function body contains$$, it's left as is.