Fix LZMA dictionary size parsing in 7z archives (read 4 bytes instead of 3) by analytical-engines · Pull Request #61 · tsolomko/SWCompression

analytical-engines · 2025-12-14T18:51:33Z

Report: LZMA dictionary size incorrectly parsed (only 3 bytes read instead of 4)

Summary

In 7zFolder.swift, the LZMA dictionary size is read using only 3 bytes instead of 4, causing decompression to fail with LZMAError.notEnoughToRepeat for archives with dictionary sizes >= 16MB.

Environment

SWCompression version: 4.8.6
Platform: macOS 15.2 (Darwin 25.1.0)
Swift version: 5.x

Steps to Reproduce

1. Create or obtain a 7z archive compressed with LZMA and a dictionary size of 16MB or larger (e.g., LZMA:24 = 2^24 = 16MB).
2. Attempt to decompress it using SevenZipContainer.open(container:).
3. Decompression fails with LZMAError.notEnoughToRepeat.

Expected Behavior

Archive should decompress successfully, as it does with the official 7z command-line tool and other applications.

Actual Behavior

Decompression fails with LZMAError.notEnoughToRepeat.

Root Cause

In Sources/7-Zip/7zFolder.swift, lines 174-176:

  var dictionarySize = 0
  for i in 1..<4 {
      dictionarySize |= properties[i].toInt() << (8 * (i - 1))
  }

The loop 1..<4 only iterates over indices [1, 2, 3], reading 3 bytes. However, the LZMA dictionary size is stored as a 4-byte little-endian integer in properties[1...4].

For a dictionary size of 16MB (value 0x01000000, stored little-endian as bytes 00 00 00 01):

properties[1] = 0x00
properties[2] = 0x00
properties[3] = 0x00
properties[4] = 0x01 ← Not read!

Result: dictionarySize is computed as 0, then adjusted to the minimum 4096 bytes by LZMAProperties. The decoder allocates a 4KB dictionary instead of the required 16MB, causing notEnoughToRepeat when the data references bytes beyond this undersized window.
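
The difference between the two loop bounds can be reproduced in isolation. The following is a minimal sketch, not the library's code: parseDictionarySize is a hypothetical helper, Int(...) stands in for the toInt() call used in 7zFolder.swift, and the first properties byte (0x5D, the common lc/lp/pb default) is an assumption that plays no role in the size calculation.

```swift
// Hypothetical helper, not part of SWCompression: reads `byteCount` bytes
// starting at properties[1] as a little-endian integer.
func parseDictionarySize(_ properties: [UInt8], byteCount: Int) -> Int {
    var dictionarySize = 0
    for i in 1...byteCount {
        dictionarySize |= Int(properties[i]) << (8 * (i - 1))
    }
    return dictionarySize
}

// 5-byte LZMA properties: byte 0 packs lc/lp/pb (0x5D assumed here),
// bytes 1...4 hold the dictionary size, little-endian. Here: 2^24 = 16MB.
let properties: [UInt8] = [0x5D, 0x00, 0x00, 0x00, 0x01]

let threeByteResult = parseDictionarySize(properties, byteCount: 3) // same bytes as 1..<4
let fourByteResult = parseDictionarySize(properties, byteCount: 4)  // same bytes as 1..<5

print(threeByteResult) // 0
print(fourByteResult)  // 16777216
```

With three bytes the high byte is dropped and the parsed size is 0; with four bytes it is the expected 16777216 (2^24).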

Proposed Fix

Change 1..<4 to 1..<5:

  var dictionarySize = 0
  for i in 1..<5 {
      dictionarySize |= properties[i].toInt() << (8 * (i - 1))
  }
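
As a sanity check on the corrected bounds, the sketch below round-trips a range of dictionary sizes through a 5-byte properties array and the 1..<5 loop. It is an illustration under stated assumptions: encodeProperties and fixedParse are hypothetical helpers that do not exist in SWCompression, the 0x5D first byte is assumed, and Int(...) replaces toInt().

```swift
// Hypothetical helper: builds a 5-byte LZMA properties array.
// Byte 0 is the lc/lp/pb byte (0x5D assumed); bytes 1...4 are the
// dictionary size in little-endian order.
func encodeProperties(dictionarySize: UInt32) -> [UInt8] {
    var bytes: [UInt8] = [0x5D]
    for shift in stride(from: 0, to: 32, by: 8) {
        bytes.append(UInt8(truncatingIfNeeded: dictionarySize >> UInt32(shift)))
    }
    return bytes
}

// Hypothetical helper mirroring the proposed loop (1..<5, four bytes).
func fixedParse(_ properties: [UInt8]) -> Int {
    var dictionarySize = 0
    for i in 1..<5 {
        dictionarySize |= Int(properties[i]) << (8 * (i - 1))
    }
    return dictionarySize
}

// Powers of two from 4KB (2^12) to 1GB (2^30), plus one
// non-power-of-two size (24MB) whose top byte is also nonzero.
var sizes: [UInt32] = (12...30).map { UInt32(1) << UInt32($0) }
sizes.append(24 * 1024 * 1024)

var mismatches = 0
for size in sizes {
    if fixedParse(encodeProperties(dictionarySize: size)) != Int(size) {
        mismatches += 1
    }
}
print("mismatches: \(mismatches)")
```

Sizes below 2^24 parse identically under both loop bounds, which is consistent with the bug only surfacing for dictionaries of 16MB and larger.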

Verification

Test archive info from 7z t:
Method = LZMA:24
Solid = +
Physical Size = 86250112

The archive is valid and decompresses correctly with the official 7-Zip implementation.

The LZMA dictionary size was being read from only 3 bytes (indices 1-3) instead of the full 4 bytes (indices 1-4) specified in the LZMA format. This caused decompression to fail with LZMAError.notEnoughToRepeat for archives using dictionary sizes >= 16MB (e.g., LZMA:24 = 2^24 = 16MB), because the high byte was not read and the dictionary size defaulted to the minimum 4096 bytes. Change: `for i in 1..<4` → `for i in 1..<5`

tsolomko force-pushed the develop branch 5 times, most recently from fad4e5f to 33341b6 Compare February 2, 2026 17:17

analytical-engines wants to merge 1 commit into tsolomko:develop from analytical-engines:fix-lzma-dictionary-size
