Conversation
- Extended SQL grammar to support tag column references using the same syntax as non-tag column references (col_name FROM db.table.col and db.table.col positional form) - Updated SCreateVSubTableStmt with pSpecificTagRefs and pTagRefs fields - Modified parTranslater to validate and process tag references through checkAndReplaceTagRefs - Added JSON serialization/deserialization for new tag reference fields - Added test cases in test_vtable_tag_ref.py
…thout truncation
- parser/parTranslater.c: relax checkColRef to only check type (not bytes)
for variable-length types (binary/nchar/varchar/varbinary), allowing
virtual table columns to have different lengths from source columns.
- executor/executil.c: in initQueryTableDataCondWithColArray, use
TMAX(virtual_bytes, source_bytes) for variable-length types to ensure
buildBuf allocation is adequate for source data.
- executor/scanoperator.c:
- createVTableScanInfoFromParam: relax type+bytes check to type-only
for variable-length types; after createOneDataBlockWithColArray,
update pResBlock column info.bytes to source table bytes so
doCopyColVal length check passes.
- createVTableScanInfoFromBatchParam: populate SColIdPair.type from
source schema; add same pResBlock info.bytes update loop.
- test: add test_vtable_nchar_length.py covering three scenarios
(vtable col len > / = / < source col len) with projection, SQL
functions (length, char_length, lower, upper, concat, substring,
replace, ascii, position, cast, first, last), filters, and mixed
cross-table references.
- Add validateSrcTableColRef to verify source table/column existence - vtbRefValidateLocal: check local vnode meta first - vtbRefValidateRemote: fetch DB vgroup info from MNode, calculate vgId, fetch table schema from target vnode via RPC, validate column existence - Fix deadlock: release META_READER_LOCK before RPC calls in doOptimizeVTableNameFilter; skip self-RPC when localVgId matches - Fix double htonl bug in vtbRefFetchTableSchema (vgId encoding) - Fix rpcMallocCont/taosMemoryFree mismatch causing crash on cross-db refs - Fix sysTableFillOneVirtualTableRefImpl: correct src_column_name (col 6), err_code (col 8) and err_msg (col 9) population - Populate err_code with TSDB_CODE_PAR_TABLE_NOT_EXIST or TSDB_CODE_PAR_INVALID_REF_COLUMN when validation fails
- Introduce a new test suite for validating virtual table referencing in both same-db and cross-db scenarios. - Implement tests to ensure that references to normal and child tables are correctly validated, checking for successful error codes. - Set up necessary databases and tables for testing, including source tables and virtual tables for comprehensive coverage.
# Conflicts: # source/libs/parser/inc/sql.y
Summary of ChangesHello @yihaoDeng, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the diagnostics and manageability of virtual tables by introducing a dedicated mechanism to validate their column references. It provides a new system table and a Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new SHOW VTABLE VALIDATE FOR ... command and a corresponding system table information_schema.ins_virtual_tables_referencing to validate column references in virtual tables, covering cross-vnode and cross-database scenarios. A security audit, however, identified several critical issues, primarily in source/libs/executor/src/sysscanoperator.c. These include a buffer truncation bug, a memory leak in the schema caching mechanism, highly suspicious logic in the virtual child table column filling function that could lead to memory corruption or crashes, and the use of an uninitialized database name in an optimization path for the new system table scan. Furthermore, general code review noted a duplicate macro, a potential null pointer dereference, inconsistent naming, and dead code. Addressing these issues is crucial for the stability and security of this enhancement.
There was a problem hiding this comment.
Pull request overview
This PR adds validation functionality for virtual table column references through a new system table ins_virtual_tables_referencing and SQL command SHOW VTABLE VALIDATE FOR <table>. The implementation validates whether virtual table column references to source tables/columns are still valid after schema changes.
Changes:
- Introduces new
SHOW VTABLE VALIDATESQL syntax andins_virtual_tables_referencingsystem table - Implements validation logic that checks virtual table column references against source tables via local metadata or cross-vnode RPC
- Adds comprehensive test suite covering validation scenarios (valid references, dropped columns/tables/databases, mixed valid/invalid references)
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| source/libs/parser/inc/sql.y | Adds VALIDATE token and new SHOW VTABLE VALIDATE syntax; also adds problematic CREATE VTABLE variant |
| source/libs/parser/src/parTokenizer.c | Registers VALIDATE keyword token |
| source/libs/parser/src/parAstCreater.c | Creates AST node for SHOW VTABLE VALIDATE statement |
| source/libs/parser/src/parAstParser.c | Collects metadata keys for validation query |
| source/libs/parser/src/parTranslater.c | Translates and validates SHOW VTABLE VALIDATE statement |
| source/libs/parser/src/parUtil.c | Handles virtual_db_name/virtual_table_name filter conditions |
| source/libs/planner/src/planPhysiCreater.c | Routes new system table to vnode execution |
| source/libs/executor/src/sysscanoperator.c | Implements validation logic with local and remote RPC-based checks |
| source/libs/nodes/src/nodesUtilFuncs.c | Handles SShowValidateVirtualTable node creation/destruction |
| source/libs/nodes/src/nodesCodeFuncs.c | Serialization support for new node type |
| source/common/src/systable.c | Defines schema for ins_virtual_tables_referencing table |
| source/dnode/mnode/impl/src/mndShow.c | Routes system table queries to appropriate handler |
| source/dnode/mgmt/node_mgmt/src/dmTransport.c | Allows TABLE_META_RSP message type for validation RPCs |
| include/libs/nodes/cmdnodes.h | Defines SShowValidateVirtualTable structure |
| include/common/tmsg.h | Adds SHOW_VALIDATE_VTABLE_STMT and VIRTUAL_TABLES_REFERENCING enum values |
| include/common/systable.h | Defines system table name constant |
| include/util/tdef.h | Defines error message buffer size |
| test/cases/05-VirtualTables/test_vtable_validate_referencing.py | Comprehensive test suite for validation functionality |
| tests/test_new/case_list_docs/query/hint.md | Formatting change |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
When creating a virtual table with tag column references, validate that the referenced column is actually a tag column, not a data column. - Add check using getNormalColSchema to detect data columns - Return distinct error messages: - "references column which is not a tag column" for data columns - "references non-existent tag" for missing columns - Add test cases in test_vtable_tag_ref.py
Fix compilation error where STableType was incorrectly used instead of ETableType in vtbRefCreateSchemaCache and vtbRefGetTableSchemaLocal. Also adds schema cache functionality for virtual table validation.
When validating virtual table column references, multiple columns may reference the same remote table. Previously, each column reference triggered a separate RPC call to fetch the table schema. This optimization: - Adds 'ownsSchema' flag to SVtbRefSchemaCache to support deep-copy for remote tables (vs shallow copy for local tables) - Creates vtbRefCreateSchemaCacheFromMetaRsp to build cache from RPC response - Adds vtbRefGetRemoteCacheEntry/vtbRefPutRemoteCacheEntry helpers - Modifies vtbRefValidateRemote to cache successful results - Checks cache before making RPC calls in validateSrcTableColRef Result: N columns referencing the same remote table = 1 RPC instead of N
When validating virtual table column references, the previous implementation used linear search (O(N)) to find column names in the cached schema. This becomes inefficient for tables with many columns. This optimization: - Adds pColNameIndex (SHashObj*) to SVtbRefSchemaCache - Creates vtbRefBuildColNameIndex to build hash index from schema array - Modifies vtbRefCreateSchemaCache to build index for local tables - Modifies vtbRefCreateSchemaCacheFromMetaRsp to build index for remote tables - Simplifies vtbRefCheckColumnInCache to use hash lookup (O(1)) - Updates vtbRefFreeSchemaCache to cleanup hash table Result: Column name lookup from O(N) to O(1), especially beneficial for wide tables with many columns.
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.