Commit e6324c4

Make the delimiter optional in the encoder (#45)
1 parent 11f1c6e commit e6324c4

4 files changed: +62 -33 lines changed

README.md

+37 -19
@@ -17,7 +17,7 @@ luarocks install ftcsv
 There are two main parsing methods: `ftcv.parse` and `ftcsv.parseLine`.
 `ftcsv.parse` loads the entire file and parses it, while `ftcsv.parseLine` is an iterator that parses one line at a time.
 
-### `ftcsv.parse(fileName, [, options])`
+### `ftcsv.parse(fileName [, options])`
 `ftcsv.parse` will load the entire csv file into memory, then parse it in one go, returning a lua table with the parsed data and a lua table containing the column headers. It has only one required parameter - the file name. A few optional parameters can be passed in via a table (examples below).
 
 Just loading a csv file:
@@ -26,7 +26,7 @@ local ftcsv = require('ftcsv')
 local zipcodes, headers = ftcsv.parse("free-zipcode-database.csv")
 ```
 
-### `ftcsv.parseLine(fileName, [, options])`
+### `ftcsv.parseLine(fileName [, options])`
 `ftcsv.parseLine` will open a file and read `options.bufferSize` bytes of the file. `bufferSize` defaults to 2^16 bytes (which provides the fastest parsing on most unix-based systems), or can be specified in the options. `ftcsv.parseLine` is an iterator and returns one line at a time. When all the lines in the buffer are read, it will read in another `bufferSize` bytes of a file and repeat the process until the entire file has been read.
 
 If specifying `bufferSize` there are a couple of things to remember:
@@ -48,7 +48,7 @@ end
 The options are the same for `parseLine` and `parse`, with the exception of `loadFromString` and `bufferSize`. `loadFromString` only works with `parse` and `bufferSize` can only be specified for `parseLine`.
 
 The following are optional parameters passed in via the third argument as a table.
-- `delimeter`
+- `delimiter`
 
 If your file doesn't use the comma character as the delimiter, you can specify your own. It is limited to one character and defaults to `,`
 ```lua
@@ -131,37 +131,66 @@ ftcsv.parse("apple,banana,carrot", {loadFromString=true, headers=false})
 ```
 
 ## Encoding
-### `ftcsv.encode(inputTable, delimiter[, options])`
+### `ftcsv.encode(inputTable [, options])`
 
-`ftcsv.encode` takes in a lua table and turns it into a text string that can be written to a file. It has two required parameters, an inputTable and a delimiter. You can use it to write out a file like this:
+`ftcsv.encode` takes in a lua table and turns it into a text string that can be written to a file. You can use it to write out a file like this:
 ```lua
-local fileOutput = ftcsv.encode(users, ",")
+local users = {
+  {name="alice", fruit="apple"},
+  {name="bob", fruit="banana"},
+  {name="eve", fruit="pear"}
+}
+local fileOutput = ftcsv.encode(users)
 local file = assert(io.open("ALLUSERS.csv", "w"))
 file:write(fileOutput)
 file:close()
 ```
 
 ### Options
+- `delimiter`
+
+by default the encoder uses a `,` as a delimiter. The delimiter can be changed by setting a value for `delimiter`
+
+```lua
+local output = ftcsv.encode(everyUser, {delimiter="\t"})
+```
+
 - `fieldsToKeep`
 
 if `fieldsToKeep` is set in the encode process, only the fields specified will be written out to a file. The `fieldsToKeep` will be written out in the order that is specified.
 
 ```lua
-local output = ftcsv.encode(everyUser, ",", {fieldsToKeep={"Name", "Phone", "City"}})
+local output = ftcsv.encode(everyUser, {fieldsToKeep={"Name", "Phone", "City"}})
 ```
 
 - `onlyRequiredQuotes`
 
 if `onlyRequiredQuotes` is set to `true`, the output will only include quotes around fields that are quotes, have newlines, or contain the delimter.
 
 ```lua
-local output = ftcsv.encode(everyUser, ",", {onlyRequiredQuotes=true})
+local output = ftcsv.encode(everyUser, {onlyRequiredQuotes=true})
 ```
 
 
 ## Error Handling
 ftcsv returns a litany of errors when passed a bad csv file or incorrect parameters. You can find a more detailed explanation of the more cryptic errors in [ERRORS.md](ERRORS.md)
 
+## Delimiter no longer required from 1.4.0!
+Starting with version 1.4.0, the delimiter no longer required as the second argument. **But don't worry,** ftcsv remains backwards compatible! We check the argument types and adjust parsing as necessary. There is no intention to remove this backwards compatibility layer, so your existing code should just keep on working!
+
+So this works just fine:
+```lua
+ftcsv.parse("a>b>c\r\n1,2,3", ">", {loadFromString=true})
+```
+
+as well as:
+```lua
+ftcsv.encode(users, ",")
+```
+
+The delimiter as the second argument will always take precedent if both are provided.
+
+
 ## Benchmarks
 We ran ftcsv against a few different csv parsers ([PIL](http://www.lua.org/pil/20.4.html)/[csvutils](http://lua-users.org/wiki/CsvUtils), [lua_csv](https://github.com/geoffleyland/lua-csv), and [lpeg_josh](http://lua-users.org/lists/lua-l/2009-08/msg00020.html)) for lua and here is what we found:
 
@@ -207,17 +236,6 @@ Feel free to create a new issue for any bugs you've found or help you need. If y
 8. Enjoy the changes made!
 
 
-## Delimiter no longer required as of 1.4.0!
-Starting with version 1.4.0, the delimiter no longer required as the second argument. **But don't worry,** ftcsv remains backwards compatible! We check the argument types and adjust parsing as necessary. There is no intention to remove this backwards compatibility layer, so you can always enjoy your up-to-date lightning fast CSV parser!
-
-So this works just fine:
-```lua
-ftcsv.parse("a>b>c\r\n1,2,3", ">", {loadFromString=true})
-```
-
-The delimiter as the second argument will always take precedent if both are provided.
-
-
 ## Licenses
 - The main library is licensed under the MIT License. Feel free to use it!
 - Some of the test CSVs are from [csv-spectrum](https://github.com/maxogden/csv-spectrum) (BSD-2-Clause) which includes some from [csvkit](https://github.com/wireservice/csvkit) (MIT License)

ftcsv.lua

+1
@@ -804,6 +804,7 @@ end
 
 -- works really quickly with luajit-2.1, because table.concat life
 function ftcsv.encode(inputTable, delimiter, options)
+    local delimiter, options = determineArgumentOrder(delimiter, options)
     local output, headers = initializeGenerator(inputTable, delimiter, options)
 
     for i, line in csvLineGenerator(inputTable, delimiter, headers, options) do
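
Note: the body of `determineArgumentOrder` is not part of this commit, so its exact behavior cannot be read off the diff. Going only by the README text ("We check the argument types and adjust parsing as necessary", and the second-argument delimiter taking precedence), a minimal sketch of such a helper might look like the following. The name `determineArgumentOrderSketch` and the fallback to `,` when nothing is passed are assumptions made for illustration, not the library's actual code.

```lua
-- Hypothetical sketch only; NOT the real ftcsv implementation.
-- Accepts either (delimiter, options) or (options) and normalizes
-- the pair to: delimiter string, options table (or nil).
local function determineArgumentOrderSketch(delimiter, options)
    if type(delimiter) == "string" then
        -- Legacy call style, e.g. encode(t, ",", {...}).
        -- The positional delimiter wins over options.delimiter.
        return delimiter, options
    elseif type(delimiter) == "table" then
        -- New call style, e.g. encode(t, {delimiter="\t"}).
        -- The second argument is really the options table.
        return delimiter.delimiter or ",", delimiter
    end
    -- Nothing passed: assume a comma and no options.
    return ",", nil
end
```

Under these assumptions, `determineArgumentOrderSketch(",", {delimiter="\t"})` yields `,` as the delimiter, which matches the precedence rule stated in the README.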

spec/feature_spec.lua

+2 -2
@@ -502,7 +502,7 @@ describe("csv features", function()
         assert.are.same(expected, actual)
     end)
 
-    it("should handle ignoring the single quote without specifying the delimeter", function()
+    it("should handle ignoring the single quote without specifying the delimiter", function()
         local expected = {}
         expected[1] = {}
         expected[1].a = '"apple'
@@ -524,7 +524,7 @@ describe("csv features", function()
         assert.are.same(expected, actual)
     end)
 
-    it("should handle reusing the options without specifying the delimeter", function()
+    it("should handle reusing the options without specifying the delimiter", function()
         local expected = {}
         expected[1] = {}
         expected[1].a = '"apple'

spec/parse_encode_spec.lua

+22 -12
@@ -76,13 +76,29 @@ describe("csv encode", function()
         it("should handle " .. value, function()
             local jsonFile = loadFile("spec/json/" .. value .. ".json")
             local jsonDecode = cjson.decode(jsonFile)
-            -- local parse = staecsv:ftcsv(contents, ",")
             local reEncoded = ftcsv.parse(ftcsv.encode(jsonDecode, ","), ",", {loadFromString=true})
-            -- local f = csv.openstring(contents, {separator=",", header=true})
-            -- local parse = {}
-            -- for fields in f:lines() do
-            -- parse[#parse+1] = fields
-            -- end
+            assert.are.same(jsonDecode, reEncoded)
+        end)
+    end
+end)
+
+describe("csv encode without a delimiter", function()
+    for _, value in ipairs(files) do
+        it("should handle " .. value, function()
+            local jsonFile = loadFile("spec/json/" .. value .. ".json")
+            local jsonDecode = cjson.decode(jsonFile)
+            local reEncoded = ftcsv.parse(ftcsv.encode(jsonDecode), ",", {loadFromString=true})
+            assert.are.same(jsonDecode, reEncoded)
+        end)
+    end
+end)
+
+describe("csv encode with a delimiter specified in options", function()
+    for _, value in ipairs(files) do
+        it("should handle " .. value, function()
+            local jsonFile = loadFile("spec/json/" .. value .. ".json")
+            local jsonDecode = cjson.decode(jsonFile)
+            local reEncoded = ftcsv.parse(ftcsv.encode(jsonDecode, {delimiter="\t"}), {delimiter="\t", loadFromString=true})
             assert.are.same(jsonDecode, reEncoded)
         end)
     end
@@ -93,13 +109,7 @@ describe("csv encode without quotes", function()
         it("should handle " .. value, function()
             local jsonFile = loadFile("spec/json/" .. value .. ".json")
             local jsonDecode = cjson.decode(jsonFile)
-            -- local parse = staecsv:ftcsv(contents, ",")
             local reEncodedNoQuotes = ftcsv.parse(ftcsv.encode(jsonDecode, ",", {onlyRequiredQuotes=true}), ",", {loadFromString=true})
-            -- local f = csv.openstring(contents, {separator=",", header=true})
-            -- local parse = {}
-            -- for fields in f:lines() do
-            -- parse[#parse+1] = fields
-            -- end
             assert.are.same(jsonDecode, reEncodedNoQuotes)
         end)
     end
