From 8735462da3f87d78ecb2ce19031f6b5c9c2c5a9c Mon Sep 17 00:00:00 2001
From: Bernhard
Date: Tue, 8 Jul 2025 13:30:43 +0200
Subject: [PATCH 1/6] add some docs regarding security

---
 README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/README.md b/README.md
index 23c4c4b4..73d59a08 100644
--- a/README.md
+++ b/README.md
@@ -490,6 +490,18 @@ generated schema (as opposed to an ad-hoc schema infered from the graph data).
 The build targets JDK8, so that's the minimum version. The build itself requires JDK11+.
 However in any case it is highly encouraged to use a modern JVM, such as JDK20.
 
+## What about security / untrusted flatgraph files?
+The main potentially security issue is the following situation: You get handed an untrusted, potentially malicious, flatgraph file, and want to handle it.
+Deserializing a `.fg` file should not pop a shell / cause privilege escalation, nor should not cause excessive filesystem activity. However, it may take an
+unbounded amount of time and memory, potentially leading to an OOM crash of the JVM that might not be recoverable from within the JVM by catching some exceptions.
+
+The easiest malicious but completely valid `.fg` file in that vein is a ZIP-bomb. We take care not to decompress graphs into the filesystem, but we do decompress them into memory.
+
+If you need to handle untrusted `.fg` files, then we recommend some form of sandboxing in order to limit the DoS impact.
+
+If you do decide against our recommendation to write your own code to "sanity check" potentially malicious `.fg` files before attempting to deserialize them, then we'd be happy for your feedback and PRs. (also beware of potential parser differentials -- e.g. the manifest json can be reached either via the offset from the file header, or via `tail -n 1`, and these may very well be different manifests)
+
+
 ## What does EMT stand for?
 EMT is a naming convention that stands for "erased marker trait".
 The domain classes generator generates one for each property in the schema and users can define additional marker traits.

From c6fb8d6c6279b4c0d2b2e5bc71b4adb98fc47620 Mon Sep 17 00:00:00 2001
From: bbrehm
Date: Thu, 10 Jul 2025 15:04:11 +0200
Subject: [PATCH 2/6] Update README.md

Co-authored-by: Michael Pollmeier
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 73d59a08..6eed0309 100644
--- a/README.md
+++ b/README.md
@@ -491,7 +491,7 @@ The build targets JDK8, so that's the minimum version. The build itself requires
 However in any case it is highly encouraged to use a modern JVM, such as JDK20.
 
 ## What about security / untrusted flatgraph files?
-The main potentially security issue is the following situation: You get handed an untrusted, potentially malicious, flatgraph file, and want to handle it.
+The main potentially security issue is probably: how can you handle an untrusted - and potentially malicious - flatgraph file?
 Deserializing a `.fg` file should not pop a shell / cause privilege escalation, nor should not cause excessive filesystem activity. However, it may take an
 unbounded amount of time and memory, potentially leading to an OOM crash of the JVM that might not be recoverable from within the JVM by catching some exceptions.
 

From cb60332e8205e145ca222b62bc3d0265f8c56c02 Mon Sep 17 00:00:00 2001
From: bbrehm
Date: Thu, 10 Jul 2025 15:04:25 +0200
Subject: [PATCH 3/6] Update README.md

Co-authored-by: Michael Pollmeier
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6eed0309..553f9894 100644
--- a/README.md
+++ b/README.md
@@ -492,7 +492,7 @@ However in any case it is highly encouraged to use a modern JVM, such as JDK20.
 
 ## What about security / untrusted flatgraph files?
 The main potentially security issue is probably: how can you handle an untrusted - and potentially malicious - flatgraph file?
-Deserializing a `.fg` file should not pop a shell / cause privilege escalation, nor should not cause excessive filesystem activity. However, it may take an
+Deserializing a `.fg` file should not be able to open a shell or cause privilege escalation, nor should it cause excessive filesystem activity. However, it may take an
 unbounded amount of time and memory, potentially leading to an OOM crash of the JVM that might not be recoverable from within the JVM by catching some exceptions.
 
 The easiest malicious but completely valid `.fg` file in that vein is a ZIP-bomb. We take care not to decompress graphs into the filesystem, but we do decompress them into memory.

From 13b5a921c1a38e14170f7092095200079b228bef Mon Sep 17 00:00:00 2001
From: bbrehm
Date: Thu, 10 Jul 2025 15:04:51 +0200
Subject: [PATCH 4/6] Update README.md

Co-authored-by: Michael Pollmeier
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 553f9894..76680b5a 100644
--- a/README.md
+++ b/README.md
@@ -493,7 +493,7 @@ However in any case it is highly encouraged to use a modern JVM, such as JDK20.
 ## What about security / untrusted flatgraph files?
 The main potentially security issue is probably: how can you handle an untrusted - and potentially malicious - flatgraph file?
 Deserializing a `.fg` file should not be able to open a shell or cause privilege escalation, nor should it cause excessive filesystem activity. However, it may take an
-unbounded amount of time and memory, potentially leading to an OOM crash of the JVM that might not be recoverable from within the JVM by catching some exceptions.
+unbounded amount of time and memory, potentially leading to an OutOfMemoryError in which is typically not recoverable and will bring down the entire JVM.
 
 The easiest malicious but completely valid `.fg` file in that vein is a ZIP-bomb. We take care not to decompress graphs into the filesystem, but we do decompress them into memory.
 
From a4e25d231012f60c4eca1f1946b494378ed3931f Mon Sep 17 00:00:00 2001
From: bbrehm
Date: Thu, 10 Jul 2025 15:05:03 +0200
Subject: [PATCH 5/6] Update README.md

Co-authored-by: Michael Pollmeier
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 76680b5a..51c4567a 100644
--- a/README.md
+++ b/README.md
@@ -497,7 +497,7 @@ unbounded amount of time and memory, potentially leading to an OutOfMemoryError
 
 The easiest malicious but completely valid `.fg` file in that vein is a ZIP-bomb. We take care not to decompress graphs into the filesystem, but we do decompress them into memory.
 
-If you need to handle untrusted `.fg` files, then we recommend some form of sandboxing in order to limit the DoS impact.
+If you need to handle untrusted `.fg` files you should really sandbox your process, in order to limit the DoS impact.
 
 If you do decide against our recommendation to write your own code to "sanity check" potentially malicious `.fg` files before attempting to deserialize them, then we'd be happy for your feedback and PRs. (also beware of potential parser differentials -- e.g. the manifest json can be reached either via the offset from the file header, or via `tail -n 1`, and these may very well be different manifests)
 

From d4ac891b3ae754e6a4c2a76c93f221d61d8b4031 Mon Sep 17 00:00:00 2001
From: Bernhard
Date: Thu, 10 Jul 2025 15:48:12 +0200
Subject: [PATCH 6/6] clean up more types

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 51c4567a..d1fbffa9 100644
--- a/README.md
+++ b/README.md
@@ -491,15 +491,15 @@ The build targets JDK8, so that's the minimum version. The build itself requires
 However in any case it is highly encouraged to use a modern JVM, such as JDK20.
 
 ## What about security / untrusted flatgraph files?
-The main potentially security issue is probably: how can you handle an untrusted - and potentially malicious - flatgraph file?
+The main potential security issue is probably: how can you handle an untrusted - and potentially malicious - flatgraph file?
 Deserializing a `.fg` file should not be able to open a shell or cause privilege escalation, nor should it cause excessive filesystem activity. However, it may take an
-unbounded amount of time and memory, potentially leading to an OutOfMemoryError in which is typically not recoverable and will bring down the entire JVM.
+unbounded amount of time and memory, potentially leading to an OutOfMemoryError, and potentially bringing down the JVM or even, depending on configuration, the system (off-heap allocations via `ByteBuffer.allocateDirect` do not necessarily respect the maximum heap size, and the OOM-killer is not gentle).
 
 The easiest malicious but completely valid `.fg` file in that vein is a ZIP-bomb. We take care not to decompress graphs into the filesystem, but we do decompress them into memory.
 
-If you need to handle untrusted `.fg` files you should really sandbox your process, in order to limit the DoS impact.
+If you need to handle untrusted `.fg` files, then you should really sandbox your process, in order to limit the DoS impact.
 
-If you do decide against our recommendation to write your own code to "sanity check" potentially malicious `.fg` files before attempting to deserialize them, then we'd be happy for your feedback and PRs. (also beware of potential parser differentials -- e.g. the manifest json can be reached either via the offset from the file header, or via `tail -n 1`, and these may very well be different manifests)
+If you decide to rather sanity check graphs before loading, then we would be happy for PRs; however, this is not our current development priority, nor is it our recommendation. In that case, also beware of potential parser differentials; e.g. the manifest json can be reached either via the offset from the file header, or via `tail -n 1`, and these may very well be different manifests.
 
 
## What does EMT stand for?