Skip to content

Conversation

@joelmukuthu
Copy link
Contributor

@joelmukuthu joelmukuthu commented Jan 17, 2026

This enables formatting function and procedue bodies where LANGUAGE is sql. More info: nene/sql-parser-cst#126.

This is currently a draft because it's not handling escaped string literals correctly for string quoted functions. Also, I'm not certain if there's more work needed to format SQL functions in other dialects (eg MySQL?). UPDATE:

  1. Single-quoted function bodies are converted to dollar-quoted ($$ .. $$), to simplify formatting. If the function body contains $$, it's left as is.
  2. No need to worry about MySQL: Format LANGUAGE SQL function/procedure bodies #61 (comment)

src/embedSql.ts Outdated
const quote = detectQuote(node);

if (quote) {
const sql = await textToDoc(node.value, options);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an option that can be passed here to tell textToDoc to retain the string quoting when quote === "'"?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handling the escaped quotes properly is very hard to do and IMHO not worth the effort. We'd need to keep track which quotes were escaped in the original code and then re-escape them later. I don't even know from where to start with when solving this problem.

In practice one generally doesn't use single quotes for these SQL strings. At most one would for quoting a very simple SQL. If quotes are needed inside there, one generally just switches to another quoting style (like the dollar-quotes) instead of using escapes.

What I would suggest is:

  • format SQL inside dollar-quoted strings as you already do
  • convert single-quoted strings to dollar-quoted strings
  • do no formatting when it's not possible to convert a single-quoted string to dollar-quoted one. (Like for start we might only attempt to covert it to the form $$ ... $$, which can't be done when the string already contains $$ as substring).

This is largely the approach taken inside embedJs() function where strings are converted to either r''' ... ''' or r""" ... """ quoting style (specific to BigQuery).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree completely!

src/embedSql.ts Outdated
Comment on lines 48 to 51
const match = node.text.match(/^('|\$[^$]*\$)/);
const quote = match?.[1];

if (quote && node.text.endsWith(quote)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wen't with this version over const quote = node.text.match(/^('|\$[^$]*\$).*\1$/s)?.[1] because the s regular expression flag is only supported from ES2018. Also my guess is that endsWith might perform better than a regex backreference but I'm not sure

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to check both the start and end of the string to determine its quoting style. When the string starts with ' it must end with '. When the string starts with $$ it must end with $$ and so on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. I presumed that this is handled by the parser and indicated that in my commit message but let me know if that's not the case and I'll change the commit message. Or ignore this if you squash commits when merging :)

src/embedSql.ts Outdated

const isSqlLanguageClause = (
clause: CreateFunctionStmt["clauses"][0],
): boolean => isLanguageClause(clause) && clause.name.name === "sql";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In PostgreSQL the language name is case-insensitive when unquoted. So LANGUAGE SQL and LANGUAGE SqL are also valid.

However, that case-insensitivity only applies when the language name identifier is not quoted. So LANGUAGE "sql" is valid, but LANGUAGE "SQL" is invalid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the code in embedJs and presumed the casing was normalised elsewhere, but good to know. Btw just FYI, I noticed that the parser currently fails on single-quoted language identifiers, e.g. LANGUAGE 'sql', which are also valid in postgres -- but I'd wait for someone to request it before spending time fixing that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info.

@nene
Copy link
Owner

nene commented Jan 17, 2026

FYI, you don't need to worry about MySQL, because MySQL doesn't have function definitions inside strings. Instead in MySQL a CREATE FUNCTION/PROCEDURE looks like this:

CREATE PROCEDURE citycount (IN country CHAR(3), OUT cities INT)
   BEGIN
     SELECT COUNT(*) INTO cities FROM world.city
     WHERE CountryCode = country;
   END

Similarly in BigQuery the SQL-functions are written without quoting, like so:

CREATE TEMP FUNCTION AddFourAndDivide(x INT64, y INT64)
RETURNS FLOAT64
AS (
  (x + 4) / y
);

In both cases LANGUAGE sql clause would produce a syntax error. So we don't need to worry about that.

But for PostgreSQL we should implement this formatting for both CREATE FUNCTION and CREATE PROCEDURE.

@joelmukuthu joelmukuthu changed the title Format LANGUAGE SQL function bodies Format LANGUAGE SQL function/procedure bodies Jan 18, 2026
@joelmukuthu
Copy link
Contributor Author

Thanks for the review, I've implemented the fixes. And thanks for the info regarding MySQL!

@nene nene merged commit 966121a into nene:master Jan 19, 2026
1 check passed
@nene
Copy link
Owner

nene commented Jan 19, 2026

This has now been released in 0.18.0.

@joelmukuthu joelmukuthu deleted the feat/language-sql branch January 19, 2026 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants