Commit b708c1f

kares, robbavey, and karenzone authored
Docs: more documentation on restore + temp dir (#236)
Co-authored-by: Rob Bavey <[email protected]> Co-authored-by: Karen Metts <[email protected]>
1 parent ff42372 commit b708c1f

5 files changed

+41
-25
lines changed

CHANGELOG.md

+10-6
@@ -1,20 +1,24 @@
1+
## 4.3.6
2+
- Docs: more documentation on restore + temp dir [#236](https://github.com/logstash-plugins/logstash-output-s3/pull/236)
3+
* minor logging improvements - use the same path: naming convention
4+
15
## 4.3.5
2-
- Feat: cast true/false values for additional_settings [#241](https://github.com/logstash-plugins/logstash-output-s3/pull/241)
6+
- Feat: cast true/false values for additional_settings [#241](https://github.com/logstash-plugins/logstash-output-s3/pull/241)
37

48
## 4.3.4
5-
- [DOC] Added note about performance implications of interpolated strings in prefixes [#233](https://github.com/logstash-plugins/logstash-output-s3/pull/233)
9+
- [DOC] Added note about performance implications of interpolated strings in prefixes [#233](https://github.com/logstash-plugins/logstash-output-s3/pull/233)
610

711
## 4.3.3
8-
- [DOC] Updated links to use shared attributes [#230](https://github.com/logstash-plugins/logstash-output-s3/pull/230)
12+
- [DOC] Updated links to use shared attributes [#230](https://github.com/logstash-plugins/logstash-output-s3/pull/230)
913

1014
## 4.3.2
11-
- [DOC] Added note that only AWS S3 is supported. No other S3 compatible storage solutions are supported. [#223](https://github.com/logstash-plugins/logstash-output-s3/pull/223)
15+
- [DOC] Added note that only AWS S3 is supported. No other S3 compatible storage solutions are supported. [#223](https://github.com/logstash-plugins/logstash-output-s3/pull/223)
1216

1317
## 4.3.1
14-
- [DOC] Updated setting descriptions for clarity [#219](https://github.com/logstash-plugins/logstash-output-s3/pull/219) and [#220](https://github.com/logstash-plugins/logstash-output-s3/pull/220)
18+
- [DOC] Updated setting descriptions for clarity [#219](https://github.com/logstash-plugins/logstash-output-s3/pull/219) and [#220](https://github.com/logstash-plugins/logstash-output-s3/pull/220)
1519

1620
## 4.3.0
17-
- Feat: Added retry_count and retry_delay config [#218](https://github.com/logstash-plugins/logstash-output-s3/pull/218)
21+
- Feat: Added retry_count and retry_delay config [#218](https://github.com/logstash-plugins/logstash-output-s3/pull/218)
1822

1923
## 4.2.0
2024
- Added ability to specify [ONEZONE_IA](https://aws.amazon.com/s3/storage-classes/#__) as storage_class

docs/index.asciidoc

+13-3
@@ -30,8 +30,9 @@ Other S3 compatible storage solutions are not supported.
3030
S3 outputs create temporary files into the OS' temporary directory.
3131
You can specify where to save them using the `temporary_directory` option.
3232

33-
IMPORTANT: For configurations containing multiple s3 outputs with the restore
34-
option enabled, each output should define its own 'temporary_directory'.
33+
IMPORTANT: For configurations containing multiple s3 outputs with the `restore`
34+
option enabled, each output should define its own `temporary_directory`.
35+
Shared or nested directories can cause data loss upon recovery.
3536

3637
===== Requirements
3738

@@ -255,6 +256,10 @@ The AWS Region
255256
Used to enable recovery after crash/abnormal termination.
256257
Temporary log files will be recovered and uploaded.
257258

259+
NOTE: If you're using multiple S3 outputs, always set
260+
<<plugins-{type}s-{plugin}-temporary_directory>> to a
261+
unique directory. Otherwise the recovery mechanism won't work correctly.
262+
258263
[id="plugins-{type}s-{plugin}-retry_count"]
259264
===== `retry_count`
260265

@@ -388,7 +393,12 @@ Defaults to STANDARD.
388393
* Default value is `"/tmp/logstash"`
389394

390395
Set the directory where logstash will store the tmp files before sending it to S3
391-
default to the current OS temporary directory in linux /tmp/logstash
396+
default to the current OS temporary directory in linux `/tmp/logstash`.
397+
398+
WARNING: Using multiple S3 outputs with `restore => true` requires unique directories
399+
per output. All of the directory's contents are processed and deleted upon recovery, and shared or nested directories can cause data loss.
400+
For example, an output using `/tmp/s3` and a second configured with `/tmp/s3/sub` would
401+
cause issues. Having temporary directories `/tmp/s3/sub1` and `/tmp/s3/sub2` is fine.
392402

393403
[id="plugins-{type}s-{plugin}-time_file"]
394404
===== `time_file`
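
The "shared or nested directories" rule from the `temporary_directory` warning above can be checked mechanically. The following is a minimal Ruby sketch and is not part of the plugin: `conflicting_temp_dirs` and `nested?` are hypothetical helper names, and the sketch assumes absolute paths that are compared lexically (symlinks are not resolved).

```ruby
require "pathname"

# Hypothetical helper (not part of logstash-output-s3): returns every pair of
# temporary directories that is either identical or nested one inside the other.
# Assumes absolute paths; symlinks are not resolved.
def conflicting_temp_dirs(dirs)
  paths = dirs.map { |dir| Pathname.new(dir).cleanpath }
  conflicts = paths.combination(2).select do |a, b|
    a == b || nested?(a, b) || nested?(b, a)
  end
  conflicts.map { |a, b| [a.to_s, b.to_s] }
end

# True when `child` lies strictly below `parent`, compared component by
# component so that "/tmp/s3x" is not mistaken for a child of "/tmp/s3".
def nested?(parent, child)
  parent_parts = parent.each_filename.to_a
  child_parts  = child.each_filename.to_a
  child_parts.length > parent_parts.length &&
    child_parts.first(parent_parts.length) == parent_parts
end

# The two cases named in the documentation change:
nested_case  = conflicting_temp_dirs(["/tmp/s3", "/tmp/s3/sub"])
sibling_case = conflicting_temp_dirs(["/tmp/s3/sub1", "/tmp/s3/sub2"])
```

With the documented examples, `nested_case` reports the `/tmp/s3` and `/tmp/s3/sub` pair as a conflict, while `sibling_case` is empty because sibling directories do not contain one another.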

lib/logstash/outputs/s3.rb

+16-12
@@ -110,18 +110,19 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
110110

111111
# Set the size of file in bytes, this means that files on bucket when have dimension > file_size, they are stored in two or more file.
112112
# If you have tags then it will generate a specific size file for every tags
113-
##NOTE: define size of file is the better thing, because generate a local temporary file on disk and then put it in bucket.
113+
#
114+
# NOTE: define size of file is the better thing, because generate a local temporary file on disk and then put it in bucket.
114115
config :size_file, :validate => :number, :default => 1024 * 1024 * 5
115116

116117
# Set the time, in MINUTES, to close the current sub_time_section of bucket.
117118
# If you also define file_size you have a number of files related to the section and the current tag.
118119
# If it's valued 0 and rotation_strategy is 'time' or 'size_and_time' then the plugin raises a configuration error.
119120
config :time_file, :validate => :number, :default => 15
120121

121-
## IMPORTANT: if you use multiple instance of s3, you should specify on one of them the "restore=> true" and on the others "restore => false".
122-
## This is hack for not destroy the new files after restoring the initial files.
123-
## If you do not specify "restore => true" when logstash crashes or is restarted, the files are not sent into the bucket,
124-
## for example if you have single Instance.
122+
# If `restore => false` is specified and Logstash crashes, the unprocessed files are not sent into the bucket.
123+
#
124+
# NOTE: the `restore => true` default assumes multiple S3 outputs would each set a unique `temporary_directory => ...`
125+
# if they do not, then only a single S3 output is safe to recover (since left-over files are processed and deleted).
125126
config :restore, :validate => :boolean, :default => true
126127

127128
# The S3 canned ACL to use when putting the file. Defaults to "private".
@@ -147,6 +148,9 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
147148

148149
# Set the directory where logstash will store the tmp files before sending it to S3
149150
# default to the current OS temporary directory in linux /tmp/logstash
151+
#
152+
# NOTE: the reason we do not have a unique (isolated) temporary directory as a default, to support multiple plugin instances,
153+
# is that we would have to rely on something static that does not change between restarts (e.g. a user set id => ...).
150154
config :temporary_directory, :validate => :string, :default => File.join(Dir.tmpdir, "logstash")
151155

152156
# Specify a prefix to the uploaded filename, this can simulate directories on S3. Prefix does not require leading slash.
@@ -347,10 +351,10 @@ def rotate_if_needed(prefixes)
347351
temp_file = factory.current
348352

349353
if @rotation.rotate?(temp_file)
350-
@logger.debug("Rotate file",
351-
:strategy => @rotation.class.name,
352-
:key => temp_file.key,
353-
:path => temp_file.path)
354+
@logger.debug? && @logger.debug("Rotate file",
355+
:key => temp_file.key,
356+
:path => temp_file.path,
357+
:strategy => @rotation.class.name)
354358

355359
upload_file(temp_file)
356360
factory.rotate!
@@ -360,7 +364,7 @@ def rotate_if_needed(prefixes)
360364
end
361365

362366
def upload_file(temp_file)
363-
@logger.debug("Queue for upload", :path => temp_file.path)
367+
@logger.debug? && @logger.debug("Queue for upload", :path => temp_file.path)
364368

365369
# if the queue is full the calling thread will be used to upload
366370
temp_file.close # make sure the content is on disk
@@ -383,7 +387,7 @@ def rotation_strategy
383387
end
384388

385389
def clean_temporary_file(file)
386-
@logger.debug("Removing temporary file", :file => file.path)
390+
@logger.debug? && @logger.debug("Removing temporary file", :path => file.path)
387391
file.delete!
388392
end
389393

@@ -398,7 +402,7 @@ def restore_from_crash
398402
.each do |file|
399403
temp_file = TemporaryFile.create_from_existing_file(file, temp_folder_path)
400404
if temp_file.size > 0
401-
@logger.debug("Recovering from crash and uploading", :file => temp_file.path)
405+
@logger.debug? && @logger.debug("Recovering from crash and uploading", :path => temp_file.path)
402406
@crash_uploader.upload_async(temp_file, :on_complete => method(:clean_temporary_file), :upload_options => upload_options)
403407
else
404408
clean_temporary_file(temp_file)
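
The logging changes in this file prefix each call with `@logger.debug? &&`. A plausible reason for that guard is that Ruby evaluates method arguments before the call, so the values passed to `debug` are computed even when debug output is disabled. The sketch below illustrates the effect with Ruby's standard library `Logger` as a stand-in for the Logstash logger (an assumption: the plugin's logger takes a message plus a hash, which is not reproduced here), and `expensive_path` is a hypothetical placeholder for any costly argument.

```ruby
require "logger"
require "stringio"

io = StringIO.new
logger = Logger.new(io)
logger.level = Logger::INFO # debug output is disabled

calls = 0
# Hypothetical stand-in for an argument that is costly to compute.
expensive_path = lambda do
  calls += 1
  "/tmp/logstash/example.txt"
end

# Unguarded: the argument is built even though nothing is written.
logger.debug("Queue for upload path=#{expensive_path.call}")

# Guarded: `debug?` is false, so `&&` short-circuits and the argument
# expression is never evaluated.
logger.debug? && logger.debug("Queue for upload path=#{expensive_path.call}")
```

After both statements run, `calls` has been incremented only by the unguarded call, and nothing has been written to `io` because the level is `INFO`.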

lib/logstash/outputs/s3/file_repository.rb

+1-3
@@ -4,8 +4,6 @@
44
require "concurrent/timer_task"
55
require "logstash/util"
66

7-
ConcurrentHashMap = java.util.concurrent.ConcurrentHashMap
8-
97
module LogStash
108
module Outputs
119
class S3
@@ -59,7 +57,7 @@ def initialize(tags, encoding, temporary_directory,
5957
sweeper_interval = DEFAULT_STATE_SWEEPER_INTERVAL_SECS)
6058
# The path need to contains the prefix so when we start
6159
# logtash after a crash we keep the remote structure
62-
@prefixed_factories = ConcurrentHashMap.new
60+
@prefixed_factories = java.util.concurrent.ConcurrentHashMap.new
6361

6462
@sweeper_interval = sweeper_interval
6563

logstash-output-s3.gemspec

+1-1
@@ -1,6 +1,6 @@
11
Gem::Specification.new do |s|
22
s.name = 'logstash-output-s3'
3-
s.version = '4.3.5'
3+
s.version = '4.3.6'
44
s.licenses = ['Apache-2.0']
55
s.summary = "Sends Logstash events to the Amazon Simple Storage Service"
66
s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
