C#: mass enable diff-informed data flow#19661
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR auto-generates patches to enable diff-informed data flow by adding a default observeDiffInformedIncrementalMode predicate in numerous data-flow configuration modules.
- Added
predicate observeDiffInformedIncrementalMode() { any() }to all relevantDataFlow::ConfigSigmodules. - Covers security, cryptography, and likely-bug query modules for incremental diff analysis.
Reviewed Changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/src/Security Features/CWE-091/XMLInjection.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/src/Likely Bugs/LeapYear/UnsafeYearConstruction.ql | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ZipSlipQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/XPathInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/UrlRedirectQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/TaintedPathQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/SqlInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ResourceInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/RegexInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ReDoSQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/MissingXMLValidationQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/LogForgingQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/LDAPInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/ExposureOfPrivateInformationQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CommandInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CodeInjectionQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/dataflow/CleartextStorageQuery.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/cryptography/HardcodedSymmetricEncryptionKey.qll | Added observeDiffInformedIncrementalMode predicate |
| csharp/ql/lib/semmle/code/csharp/security/cryptography/EncryptionKeyDataFlowQuery.qll | Added observeDiffInformedIncrementalMode predicate |
Comments suppressed due to low confidence (2)
csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql:45
- [nitpick] Add a brief comment above this predicate to explain its role in diff-informed incremental analysis, improving clarity for future maintainers.
predicate observeDiffInformedIncrementalMode() { any() }
csharp/ql/src/Security Features/CWE-114/AssemblyPathInjection.ql:45
- There are no existing tests exercising the incremental diff mode; consider adding test cases to validate behavior when this predicate is active.
predicate observeDiffInformedIncrementalMode() { any() }
|
It turns out that some of the generated changes in the PRs were not correct, e.g. because they should have also generated a |
|
@d10c : Great!
|
|
Update: no changes since last time I opened the PR. It turns out that it's sound (but not optimally performant) to leave |
In this initial PR, we are skipping over the cases where a node other than source or sink is used as a location source in the select clause. To enable these cases, a non-empty override of
It's true that in order to enable diff-informed testing of queries, the tests have to be .qlref files. I'm planning on doing that kind of rewrite (similar to this PR) in a later phase. |
michaelnebel
left a comment
There was a problem hiding this comment.
Should we trigger a DCA run before merging? Are there any other special testing that is needed before merge?
| --- | ||
| category: minorAnalysis | ||
| --- | ||
| * A number of built-in C# queries can now run in diff-informed mode. |
There was a problem hiding this comment.
Has the documentation on, how to run queries in diff-informed mode been released? (basically my question is whether we should have a release note)
There was a problem hiding this comment.
No, it hasn't been made public. So it probably makes sense to not have a release note on any of these PRs.
| */ | ||
| predicate isBarrier(DataFlow::Node node) { node instanceof Sanitizer } | ||
|
|
||
| predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
The related location for this query uses a PathNode instead of Node. Will that cause an issue?
There was a problem hiding this comment.
As far as I can tell, I don't think it matters - as this predicate only relates to filtering of source and sink nodes based on an extensible predicate that is pre-populated with a diff range (in which a source and sink should be located).
There was a problem hiding this comment.
Since (AFAIK) PathNode and Node have the same location, it shouldn't make a difference.
| ) | ||
| } | ||
|
|
||
| predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
Both the primary and related locations are of type PathNode. Is that a problem?
| ) | ||
| } | ||
|
|
||
| predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
Related location is PathNode.
Fantastic! We are also rewriting the test cases whenever we touch the queries (to use inline expectations) - so that would be great! 😄 |
|
As for DCA, it looks like we don't have sources set up with diffs, so we can't use DCA to test performance on diffs yet. A normal DCA run would then test before-and-after-this-PR without diff-informed mode, but those two runs should have the same behaviour. So I say we leave DCA testing out for now until I get the diff-informed sources in place. |
An auto-generated patch that enables diff-informed data flow in the obvious cases. Builds on github#18344 and github/codeql-patch#88
9c48c73 to
f2085c2
Compare
Yes, I just wanted to make sure that the introduction of the override didn't somehow affect performance negatively. I don't know, if introducing this override might for some obscure reason e.g. change some join-ordering that could affect performance. |
|
Note, according to the follow-up PR, one of the queries (csharp/ql/src/Security Features/CWE-327/DontInstallRootCert.ql) has a missing source/sink location in its select clause; the other ones should have both. |
An auto-generated patch that enables diff-informed data flow in the obvious cases.
Builds on #18344 and https://github.com/github/codeql-patch/pull/88