Skip to content

Commit 47a8453

Browse files
authored
Ensure re-raised from write_file (#20)
Previously a `PathSet::Conflict` would get raised when `close` would cause a duplicate filename. `close` would then be called again, and raising the `PathSet::Conflict` _again_, overriding the actual exception raised within the block. Also, a partial write of a local file entry header would not cause a correct rollback. This needs more state management variables to do correctly. Closes #19 and addresses the underlying cause of #15
1 parent 02bbff0 commit 47a8453

File tree

7 files changed

+75
-8
lines changed

7 files changed

+75
-8
lines changed

CHANGELOG.md

+5
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,8 @@
1+
## 6.3.2
2+
3+
* Make sure `rollback!` correctly works with `write_file` and the original exception gets re-raised from `write_file` if
4+
closing the current entry happens in `Writable#close`
5+
16
## 6.3.1
27

38
* Include `RailsStreaming` in a Rails loader callback, so that ActionController does not need to be in the namespace.

lib/zip_kit/streamer.rb

+23-5
Original file line numberDiff line numberDiff line change
@@ -5,8 +5,12 @@
55
# Is used to write ZIP archives without having to read them back or to overwrite
66
# data. It outputs into any object that supports `<<` or `write`, namely:
77
#
8-
# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
9-
# for the `Streamer`.
8+
# * `Array` - will contain binary strings
9+
# * `File` - data will be written to it as it gets generated
10+
# * `IO` (`Socket`, `StringIO`) - data gets written into it
11+
# * `String` - in binary encoding and unfrozen - also makes a decent output target
12+
#
13+
# or anything else that responds to `#<<` or `#write`.
1014
#
1115
# You can also combine output through the `Streamer` with direct output to the destination,
1216
# all while preserving the correct offsets in the ZIP file structures. This allows usage
@@ -482,6 +486,10 @@ def update_last_entry_and_write_data_descriptor(crc32:, compressed_size:, uncomp
482486
# is likely already on the wire. However, excluding the entry from the central directory of the ZIP
483487
# file will allow better-behaved ZIP unarchivers to extract the entries which did store correctly,
484488
# provided they read the ZIP from the central directory and not straight-ahead.
489+
# Rolling back does not perform any writes.
490+
#
491+
# `rollback!` gets called for you if an exception is raised inside the block of `write_file`,
492+
# `write_deflated_file` and `write_stored_file`.
485493
#
486494
# @example
487495
# zip.add_stored_entry(filename: "data.bin", size: 4.megabytes, crc32: the_crc)
@@ -493,14 +501,17 @@ def update_last_entry_and_write_data_descriptor(crc32:, compressed_size:, uncomp
493501
# end
494502
# @return [Integer] position in the output stream / ZIP archive
495503
def rollback!
496-
removed_entry = @files.pop
497-
return @out.tell unless removed_entry
504+
@files.pop if @remove_last_file_at_rollback
498505

506+
# Recreate the path set from remaining entries (PathSet does not support cheap deletes yet)
499507
@path_set.clear
500508
@files.each do |e|
501509
@path_set.add_directory_or_file_path(e.filename) unless e.filler?
502510
end
503-
@files << Filler.new(@out.tell - removed_entry.local_header_offset)
511+
512+
# Create filler for the truncated or unusable local file entry that did get written into the output
513+
filler_size_bytes = @out.tell - @offset_before_last_local_file_header
514+
@files << Filler.new(filler_size_bytes)
504515

505516
@out.tell
506517
end
@@ -554,6 +565,11 @@ def add_file_and_write_local_header(
554565
use_data_descriptor:,
555566
unix_permissions:
556567
)
568+
# Set state needed for proper rollback later. If write_local_file_header
569+
# does manage to write _some_ bytes, but fails later (we write in tiny bits sometimes)
570+
# we should be able to create a filler from this offset on when we
571+
@offset_before_last_local_file_header = @out.tell
572+
@remove_last_file_at_rollback = false
557573

558574
# Clean backslashes
559575
filename = remove_backslash(filename)
@@ -600,9 +616,11 @@ def add_file_and_write_local_header(
600616
mtime: e.mtime,
601617
filename: e.filename,
602618
storage_mode: e.storage_mode)
619+
603620
e.bytes_used_for_local_header = @out.tell - e.local_header_offset
604621

605622
@files << e
623+
@remove_last_file_at_rollback = true
606624
end
607625

608626
def remove_backslash(filename)

lib/zip_kit/streamer/heuristic.rb

+8
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ def initialize(streamer, filename, **write_file_options)
2626
@bytes_deflated = 0
2727

2828
@winner = nil
29+
@started_closing = false
2930
end
3031

3132
def <<(bytes)
@@ -40,13 +41,17 @@ def <<(bytes)
4041
end
4142

4243
def close
44+
return if @started_closing
45+
@started_closing = true # started_closing because an exception may get raised inside close(), as we add an entry there
46+
4347
decide unless @winner
4448
@winner.close
4549
end
4650

4751
private def decide
4852
# Finish and then close the deflater - it has likely buffered some data
4953
@bytes_deflated += @deflater.finish.bytesize until @deflater.finished?
54+
5055
# If the deflated version is smaller than the stored one
5156
# - use deflate, otherwise stored
5257
ratio = @bytes_deflated / @buf.size.to_f
@@ -55,9 +60,12 @@ def close
5560
else
5661
@streamer.write_stored_file(@filename, **@write_file_options)
5762
end
63+
5864
# Copy the buffered uncompressed data into the newly initialized writable
5965
@buf.rewind
6066
IO.copy_stream(@buf, @winner)
6167
@buf.truncate(0)
68+
ensure
69+
@deflater.close
6270
end
6371
end

lib/zip_kit/version.rb

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# frozen_string_literal: true
22

33
module ZipKit
4-
VERSION = "6.3.1"
4+
VERSION = "6.3.2"
55
end

spec/zip_kit/file_reader_spec.rb

+1-1
Original file line numberDiff line numberDiff line change
@@ -93,7 +93,7 @@ class EvilWriter < ZipKit::ZipWriter
9393
def write_end_of_central_directory(**kwargs)
9494
# Pretend there has to be more data
9595
kwargs[:central_directory_size] = kwargs[:central_directory_size] + 64
96-
super(**kwargs)
96+
super
9797
end
9898
end
9999

spec/zip_kit/streamer_spec.rb

+36
Original file line numberDiff line numberDiff line change
@@ -639,4 +639,40 @@ def stream_with_just_write.write(bytes)
639639
expect(per_filename["deflated.txt"]).to eq("this is attempt 2")
640640
expect(per_filename["stored.txt"]).to eq("this is attempt 2")
641641
end
642+
643+
it "correctly rolls back if an exception is raised from Writable#close when using write_file" do
644+
# A Unicode string will not be happy about binary writes
645+
# and will raise an exception. The exception won't be raised when
646+
# starting the writes, but in `#close` of the Writable with write_file. When the size of the byte
647+
# stream is small, the decision whether to call write_stored_file or write_deflated_file is going to
648+
# be deferred until the Writable gets close() called on it. In that situation a correct rollback
649+
# must be performed.
650+
# If this is not handled correctly, the exception we will get raised won't be the original exception
651+
# (Encoding::CompatibilityError that we need to see - to know what went wrong inside the writing block)
652+
# but a PathSet::Conflict duplicate zip entry exception. Also, the IO offsets will be out of whack.
653+
# To cause this, we need something that will raise during `Writable#close` - we will use a Unicode string
654+
# for that purpose. See https://github.com/julik/zip_kit/issues/15
655+
uniсode_str_buf = +"Ж"
656+
657+
zip = described_class.new(uniсode_str_buf)
658+
4.times do
659+
bytes_before_partial_write = uniсode_str_buf.bytesize
660+
expect {
661+
zip.write_file("deflated.txt") do |sink|
662+
sink.write("x")
663+
end
664+
}.to raise_error(Encoding::CompatibilityError) # Should not be a PathSet::Conflict
665+
666+
# Ensure there was a partial write
667+
expect(uniсode_str_buf.bytesize - bytes_before_partial_write).to be > 0
668+
end
669+
670+
# We must force the string into binary so that the ZIP can be closed - we
671+
# want to write out the EOCD since we want to make sure there is no byte offset
672+
# mismatch (fillers have been correctly produced)
673+
uniсode_str_buf.force_encoding(Encoding::BINARY)
674+
expect {
675+
zip.close
676+
}.not_to raise_error # Offset validation should pass
677+
end
642678
end

spec/zip_kit/zip_writer_spec.rb

+1-1
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
describe ZipKit::ZipWriter do
44
class ByteReader < Struct.new(:io)
55
def initialize(io)
6-
super(io).tap { io.rewind }
6+
super.tap { io.rewind }
77
end
88

99
def read_1b

0 commit comments

Comments
 (0)