Description
Looks like a simple error when generating the sincedb file. Whitespace in the file name is not escaped, so when the sincedb file is read back after a restart, an entry whose file name contains whitespace isn't parsed properly and the plugin loads that file from the beginning again, as if it weren't in the sincedb file. The fix looks very simple, but I haven't tried it myself because I have zero knowledge of Ruby.
- Version: 4.1.6 (Logstash version 6.4.2)
- Operating System: Linux (I'm using the official Docker images, which I think are CentOS 7).
- Config File (if you have sensitive info, please remove it):
input {
  file {
    path => ["/mnt/data/wrong file.csv", "/mnt/data/ok-file.csv"]
    sincedb_path => "/mnt/data/test.sincedb"
    start_position => "beginning"
  }
}
output {
  file {
    path => "/mnt/data/output.dat"
  }
}
- Sample Data:
After a couple of reboots you can see that the data from "wrong file.csv" keeps repeating.
{"message":"field1,field2,field3\r","host":"77a43f13b227","path":"/mnt/data/wrong file.csv","@timestamp":"2018-10-25T11:40:10.127Z","@version":"1"}
{"message":"data2-1,data2-2,data2-3\r","host":"77a43f13b227","path":"/mnt/data/wrong file.csv","@timestamp":"2018-10-25T11:40:10.193Z","@version":"1"}
{"message":"data1-1,data1-2,data1-3\r","host":"77a43f13b227","path":"/mnt/data/wrong file.csv","@timestamp":"2018-10-25T11:40:10.187Z","@version":"1"}
{"message":"field1,field2,field3\r","host":"77a43f13b227","path":"/mnt/data/ok-file.csv","@timestamp":"2018-10-25T11:40:10.415Z","@version":"1"}
{"message":"data1-1,data1-2,data1-3\r","host":"77a43f13b227","path":"/mnt/data/ok-file.csv","@timestamp":"2018-10-25T11:40:10.415Z","@version":"1"}
{"message":"data2-1,data2-2,data2-3\r","host":"77a43f13b227","path":"/mnt/data/ok-file.csv","@timestamp":"2018-10-25T11:40:10.415Z","@version":"1"}
{"path":"/mnt/data/wrong file.csv","message":"data2-1,data2-2,data2-3\r","@timestamp":"2018-10-25T11:41:27.625Z","@version":"1","host":"77a43f13b227"}
{"path":"/mnt/data/wrong file.csv","message":"field1,field2,field3\r","@timestamp":"2018-10-25T11:41:27.367Z","@version":"1","host":"77a43f13b227"}
{"path":"/mnt/data/wrong file.csv","message":"data1-1,data1-2,data1-3\r","@timestamp":"2018-10-25T11:41:27.617Z","@version":"1","host":"77a43f13b227"}
{"path":"/mnt/data/wrong file.csv","host":"9ac74fb6972b","message":"field1,field2,field3\r","@version":"1","@timestamp":"2018-10-25T11:54:21.743Z"}
{"path":"/mnt/data/wrong file.csv","host":"9ac74fb6972b","message":"data2-1,data2-2,data2-3\r","@version":"1","@timestamp":"2018-10-25T11:54:21.820Z"}
{"path":"/mnt/data/wrong file.csv","host":"9ac74fb6972b","message":"data1-1,data1-2,data1-3\r","@version":"1","@timestamp":"2018-10-25T11:54:21.806Z"}
Sincedb contents:
3940649673984149 0 113 72 1540468461.2128549 /mnt/data/ok-file.csv
5629499534248064 0 113 72 1540468461.821136 /mnt/data/wrong file.csv
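To illustrate what I think is going wrong, here is a minimal Ruby sketch. It is not the plugin's code: `naive_fields` and `limited_fields` are hypothetical helpers, and I am only assuming that the reader splits each record on whitespace and expects six fields. Under that assumption the second record above yields seven fields instead of six. Capping the split at six fields would be one possible way to keep the path intact.

```ruby
# Hypothetical illustration only; this is NOT the plugin's actual code.
# The two records are copied from the sincedb contents above.
OK_RECORD    = "3940649673984149 0 113 72 1540468461.2128549 /mnt/data/ok-file.csv"
WRONG_RECORD = "5629499534248064 0 113 72 1540468461.821136 /mnt/data/wrong file.csv"

# Assumed reader behaviour: split on every run of whitespace.
def naive_fields(record)
  record.chomp.split(" ")
end

# One possible alternative: stop splitting after six fields, so the
# last field (the path) keeps any spaces it contains.
def limited_fields(record)
  record.chomp.split(" ", 6)
end

puts naive_fields(OK_RECORD).size       # 6
puts naive_fields(WRONG_RECORD).size    # 7
puts limited_fields(WRONG_RECORD).last  # /mnt/data/wrong file.csv
```

Escaping the whitespace when the record is written would be another option, but that would change the format of the file on disk.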
- Steps to Reproduce:
  - Create a file with whitespace in its path.
  - Load it with the file plugin.
  - Restart.
  - Get your duplicate data.
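The effect of these steps can be simulated without Logstash. The sketch below is a guess at the behaviour, not the plugin's implementation: `load_positions` and `resume_offset` are hypothetical names, the records are keyed by path purely for illustration, and I am assuming the fourth field is the saved read position (72 matches the byte length of the three CRLF-terminated sample lines). A record that does not split into exactly six fields is dropped, so the file with a space in its name starts again from offset 0.

```ruby
# Hypothetical simulation of a restart; the parsing rules are assumptions,
# not the plugin's actual code.
SINCEDB_LINES = [
  "3940649673984149 0 113 72 1540468461.2128549 /mnt/data/ok-file.csv",
  "5629499534248064 0 113 72 1540468461.821136 /mnt/data/wrong file.csv"
].freeze

# Assumed reader: a record is accepted only if it splits into six fields.
# Field 4 (index 3) is taken to be the read position, field 6 the path.
def load_positions(lines)
  lines.each_with_object({}) do |line, positions|
    fields = line.split(" ")
    next unless fields.size == 6 # a path containing spaces produces extra fields
    positions[fields[5]] = fields[3].to_i
  end
end

# Offset at which reading resumes: the saved position, or 0 when the file
# is unknown (as with start_position => "beginning").
def resume_offset(positions, path)
  positions.fetch(path, 0)
end

positions = load_positions(SINCEDB_LINES)
puts resume_offset(positions, "/mnt/data/ok-file.csv")     # 72
puts resume_offset(positions, "/mnt/data/wrong file.csv")  # 0
```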