Add some scripts to automatically handle JNL file pollution #5102

drivera73 · 2025-12-24T17:09:43Z

In some cases, when running primary zones with BIND as the DNS server, the journal files may not get properly synchronized into their primary zone databases, possibly because BIND isn't shut down cleanly using the OS's service scripts. As a result, on the next bootup, BIND9 may refuse to start because those journal files are corrupted.

These scripts try to maximize the number of situations in which those journal files are cleaned out properly. The right way to do the cleanout is to run either service named stop or rndc sync -clean. Either command instructs BIND to sync the journal files to their DBs and clean them out properly.

However, if that fails, there are contingencies in place to forcibly remove those files: if they didn't get synchronized cleanly, they're garbage and should be removed. If they did, they will already be gone and there will be nothing to forcibly delete.
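The "sync first, delete as a fallback" approach described above can be sketched roughly as follows. This is not the code from the patch; the function names and the default ZONE_DIR path are assumptions made for illustration, and the actual journal location depends on the installation.

```shell
#!/bin/sh
# Hypothetical sketch, not the scripts from this PR.
# ZONE_DIR is an assumed path; point it at the directory holding primary zone files.
ZONE_DIR="${ZONE_DIR:-/usr/local/etc/namedb/primary}"

# Ask a running named to fold journals into the zone files and remove them.
# Returns non-zero if rndc is missing or the command fails.
sync_journals() {
    command -v rndc >/dev/null 2>&1 || return 1
    rndc sync -clean >/dev/null 2>&1
}

# Forcibly delete leftover journal files and print each removed path.
# A missing directory is treated as "nothing to do".
purge_journals() {
    dir="$1"
    [ -d "$dir" ] || return 0
    find "$dir" -type f -name '*.jnl' -print -exec rm -f {} +
}

# Try the clean way first, then remove whatever is still left behind.
clean_journals() {
    sync_journals || true
    purge_journals "$1"
}
```

Calling clean_journals "$ZONE_DIR" would attempt the clean sync and then remove any remaining .jnl files. If the sync succeeded, the find step simply matches nothing.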

Either way, the intent is to ensure that BIND has no issues starting up when running a master zone.

This is related to issue #5068, also reported by me (via a different account ... sorry :) ).

fichtner

Wouldn’t it be better to use a setup.sh script executed at each start? Not sure how the files are used across restarts.

drivera73 · 2025-12-24T17:32:08Z

> Wouldn’t it be better to use a setup.sh script executed at each start? Not sure how the files are used across restarts.

That's indeed part of the solution. There is a patch for the existing setup.sh script, and the new early and stop syshook scripts.

fichtner · 2025-12-24T19:30:31Z

Ok, sorry the path didn’t expand on mobile view very well.

I’d think we should try to minimize impact. Early could be reasonable in some cases maybe, but stop isn’t really useful as a trigger.

Cheers,
Franco

drivera73 · 2025-12-24T22:00:32Z

Fair.

I added the stop hook for consistency, and to maximize the chances of the issue being handled as cleanly as possible (i.e. with either rndc sync -clean or service named stop). The logic is this: by the time the stop hook is called, there are only two possible scenarios:

1. The named service is still running
   - The named service must be stopped, but first call rndc sync -clean to flush out any journal files
   - Then proceed as in the other scenario
2. The named service has already been stopped, or was never started
   - Either the journal files are already cleaned up as a result
   - Or the journal files are not cleaned up, but since named is no longer running, they're effectively garbage

Under either scenario, at the end of the stop hook, the journal files can safely be deleted if present.
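As a rough illustration of that reasoning, a stop hook could be shaped like the sketch below. It is not the hook shipped in the patch: the function names, the use of pgrep to detect a running named, and the directory argument are all assumptions.

```shell
#!/bin/sh
# Hypothetical stop-hook sketch; names and detection method are assumptions.

# Returns 0 if a named process appears to be running.
named_running() {
    pgrep -x named >/dev/null 2>&1
}

# $1: directory holding the primary zone files and their journals.
stop_hook() {
    dir="$1"
    if named_running; then
        # Scenario 1: give BIND a chance to commit the journals, then stop it.
        rndc sync -clean >/dev/null 2>&1 || true
        service named stop >/dev/null 2>&1 || true
    fi
    # Scenario 2 (and the tail of scenario 1): named is down,
    # so any journal still on disk is treated as garbage.
    [ -d "$dir" ] || return 0
    find "$dir" -type f -name '*.jnl' -exec rm -f {} +
}
```

Both branches end in the same deletion step, which mirrors the point that the journals can be removed at the end of the hook regardless of which scenario applied.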

If we remove the stop hook, then those journal files would be cleaned up blindly on bootup by the early hook, which may not be as antiseptic as rndc sync -clean since all we can do at that point is delete them directly. The goal of affording rndc the opportunity to do some cleanup is to cover the case where there's valuable information in those journals that we can still commit at the last minute. It may not happen very often, but might as well give it a shot ... who knows whose butt we'll be saving!

In reality, the early hook is just a contingency to maximize the chances of BIND bootup, since the expectation is that both the setup.sh and the stop hooks should be sufficient ... but since only the paranoid survive... :)

Cheers!

farthinder · 2026-01-10T13:03:16Z

For what it is worth, this fixes the worst of my issues when I try to use k8s external-dns to publish records in OPNsense BIND via RFC 2136.

I still have the issue that whenever BIND in OPNsense is restarted (e.g. during manual config/record updates), all the records published via RFC 2136 get deleted. But this doesn't happen often in my use case, and I have set up external-dns to sync every 30s, so I can live with it.

For anyone else running into this, you can apply this patch with this command:

opnsense-patch -c plugins 8f4daa869f01ef203efa73ce4926f0945a0f1b11

Nice work! @drivera73

drivera-armedia · 2026-01-10T14:35:35Z

> Nice work! @drivera73

Thanks!

> I still have the issue that whenever Bind in opnsense is restarted (eg. during manual config/record updates) all the records published via rfc2136 gets deleted

I'm fairly sure the issue here is that the OPNsense UI believes it's the only one feeding records into the zone. Thus, when you click on the Save button, it simply regenerates the zone completely based on the information it knows about, which doesn't include those RFC 2136 updates.

The fix for this would be to somehow tell the UI to "re-import the zone" so that it can adopt those new records into its own "knowledge". This would probably be a much larger undertaking, but it would be worth adding as a separate ticket, since this feature could also be used to import existing zone files, which is currently not possible. Currently, if one wishes to import existing zones, one must download a configuration backup, add the records into the XML, and then upload the updated configuration. This is cumbersome and error-prone, and would be resolved by that proposed new code.

Cheers!

Add some scripts to automatically handle JNL file pollution

8f4daa8

drivera73 mentioned this pull request Dec 24, 2025

os-bind: journals for primary zones can get corrupted on a reboot #5068

Open

3 tasks

fichtner reviewed Dec 24, 2025
