ftcsv, a fairly fast csv library written in pure Lua. It's been tested with LuaJIT 2.0/2.1 and Lua 5.2.

It works well for CSVs that can easily be fully loaded into memory (easily up to a hundred MB). Currently, there isn't a "large" file mode with proper readers and writers for ingesting CSVs in bulk with a fixed amount of memory. It correctly handles both `\n` (LF) and `\r\n` (CRLF) line endings (i.e., it should work with Windows and Mac/Linux line endings) and has UTF-8 support.

## Installing
You can either grab `ftcsv.lua` from here or install via luarocks:

```
luarocks install ftcsv
```

## Parsing
### `ftcsv.parse(fileName, delimiter [, options])`

ftcsv will load the entire CSV file into memory, then parse it in one go, returning a Lua table with the parsed data. It has only two required parameters: a file name and a delimiter (limited to one character). A few optional parameters can be passed in via a table (examples below).
The following are optional parameters passed in via the third argument as a table. For example if you wanted to `loadFromString` and not use `headers`, you could use the following:
```lua
local actual = ftcsv.parse("a,b,c\r\napple,banana,carrot", ",", options)
```

- `fieldsToKeep`

If you only want to keep certain fields from the CSV, send them in as a table-list and it should parse a little faster and use less memory.

Note: If you want to keep a renamed field, put the new name of the field in `fieldsToKeep`:

```lua
local options = {loadFromString=true, fieldsToKeep={"a","f"}, rename={["c"] = "f"}}
local actual = ftcsv.parse("a,b,c\r\napple,banana,carrot\r\n", ",", options)
```
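
To picture how a list of field names can drive the filtering, here is a minimal pure-Lua sketch. It turns the list into a set for quick lookups and then copies only the wanted fields from a row, applying renames first. The helper names `toSet` and `keepFields` are hypothetical and are not part of ftcsv's API; the library's real implementation may differ.

```lua
-- Hypothetical helpers for illustration; not part of ftcsv's public API.

-- Turn a list such as {"a", "f"} into a set such as {a = true, f = true}.
local function toSet(list)
  local set = {}
  for i = 1, #list do
    set[list[i]] = true
  end
  return set
end

-- Copy only the wanted fields from one parsed row, applying renames first.
local function keepFields(row, rename, fieldsToKeep)
  local wanted = toSet(fieldsToKeep)
  local out = {}
  for name, value in pairs(row) do
    local finalName = rename[name] or name
    if wanted[finalName] then
      out[finalName] = value
    end
  end
  return out
end

local row = {a = "apple", b = "banana", c = "carrot"}
local kept = keepFields(row, {c = "f"}, {"a", "f"})
-- kept.a is "apple", kept.f is "carrot", and kept.b is nil
```

Because the filter is applied to the final (renamed) name, a renamed field has to be listed under its new name, which matches the note above.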

- `headers`

Set `headers` to `false` if the file you are reading doesn't have any headers. This will cause ftcsv to create indexed tables rather than key-value tables for the output.

```lua
local options = {loadFromString=true, headers=false}
local actual = ftcsv.parse("apple>banana>carrot\ndiamond>emerald>pearl", ">", options)
```

Note: Header-less files can still use the `rename` option, and after a field has been renamed, it can be specified as a field to keep. The `rename` syntax changes a little bit:

```lua
local options = {loadFromString=true, headers=false, rename={"a","b","c"}, fieldsToKeep={"a","b"}}
local actual = ftcsv.parse("apple>banana>carrot\ndiamond>emerald>pearl", ">", options)
```

In the above example, the first field becomes 'a', the second field becomes 'b' and so on.
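
As a rough illustration of the positional rename described above, the following pure-Lua sketch maps an indexed row onto the names given in `rename` and drops anything not listed in `fieldsToKeep`. The function `renameRow` is a hypothetical helper written for this example, not ftcsv's actual code.

```lua
-- Hypothetical helper for illustration; not ftcsv's actual implementation.
-- Renames the positional fields of one header-less row and optionally filters them.
local function renameRow(row, names, fieldsToKeep)
  local wanted
  if fieldsToKeep ~= nil then
    wanted = {}
    for i = 1, #fieldsToKeep do
      wanted[fieldsToKeep[i]] = true
    end
  end
  local out = {}
  for j = 1, #names do
    -- keep every renamed field when no filter is given
    if wanted == nil or wanted[names[j]] then
      out[names[j]] = row[j]
    end
  end
  return out
end

local first = renameRow({"apple", "banana", "carrot"}, {"a", "b", "c"}, {"a", "b"})
-- first.a is "apple", first.b is "banana", and first.c is nil
```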

For all tested examples, take a look in `/spec/feature_spec.lua`
ftcsv can also take a Lua table and turn it into a text string to be written to a file. It has two required parameters, an inputTable and a delimiter. You can use it to write out a file like this:
```lua
local fileOutput = ftcsv.encode(users, ",")
local file = assert(io.open("ALLUSERS.csv", "w"))
file:write(fileOutput)
file:close()
```

### Options
- `fieldsToKeep`

If `fieldsToKeep` is set in the encode process, only the fields specified will be written out to a file.

```lua
local output = ftcsv.encode(everyUser, ",", {fieldsToKeep={"Name", "Phone", "City"}})
```
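
To make the idea of writing only selected fields concrete, here is a simplified stand-in encoder in pure Lua. It is not `ftcsv.encode`, and its output format is an assumption for this sketch: every field is quoted, embedded quotes are doubled, and lines end in `\r\n`. The names `quoteField` and `encodeRows` and the sample data are hypothetical.

```lua
-- Simplified stand-in for illustration only; this is not ftcsv.encode.

-- Wrap a value in double quotes and double any embedded quotes.
local function quoteField(value)
  local escaped = tostring(value):gsub('"', '""')
  return '"' .. escaped .. '"'
end

-- Write a header line plus one line per row, using only the listed fields.
local function encodeRows(rows, delimiter, fieldsToKeep)
  local lines = {}
  local header = {}
  for i = 1, #fieldsToKeep do
    header[i] = quoteField(fieldsToKeep[i])
  end
  lines[1] = table.concat(header, delimiter)
  for r = 1, #rows do
    local cells = {}
    for i = 1, #fieldsToKeep do
      local v = rows[r][fieldsToKeep[i]]
      -- missing fields are written as empty strings
      cells[i] = quoteField(v == nil and "" or v)
    end
    lines[#lines + 1] = table.concat(cells, delimiter)
  end
  return table.concat(lines, "\r\n") .. "\r\n"
end

local everyUser = {
  {Name = "Ada", Phone = "555-0100", City = "London", Email = "ada@example.com"},
}
local output = encodeRows(everyUser, ",", {"Name", "City"})
-- output holds a header line and one data line; Phone and Email are left out
```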

## Performance
I did some basic testing and found that in Lua, if you want to iterate over a string character-by-character and look for single chars, `string.byte` performs better than `string.sub`. As such, ftcsv iterates over the whole file and does byte compares to find quotes and delimiters and then generates a table from it. If you have thoughts on how to improve performance (either big picture or specifically within the code), create a GitHub issue - I'd love to hear about it!
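
For a feel of what a byte-compare scan looks like, here is a small sketch that walks a string with `string.byte` and counts delimiters outside of quoted sections. The function `countDelimiters` is a hypothetical example written for this README, not code taken from ftcsv, and it makes no claim about actual timings.

```lua
-- Hypothetical helper for illustration; not taken from ftcsv's source.
local COMMA = string.byte(",")
local QUOTE = string.byte('"')

-- Counts delimiters that appear outside of double-quoted sections.
local function countDelimiters(input, delimiterByte)
  local count = 0
  local inQuotes = false
  for i = 1, #input do
    -- compare numeric byte values instead of creating one-character substrings
    local b = string.byte(input, i)
    if b == QUOTE then
      inQuotes = not inQuotes
    elseif b == delimiterByte and not inQuotes then
      count = count + 1
    end
  end
  return count
end

local n = countDelimiters('a,"b,c",d', COMMA)
-- n is 2: the comma inside the quoted field is skipped
```

Comparing numbers avoids allocating a new one-character string for every position, which is the usual explanation for why this style tends to be faster than `string.sub`.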

## Contributing
Feel free to create a new issue for any bugs you've found or help you need. If you want to contribute back to the project please do the following:
1. Fork the repo
2. Create a new branch
3. Push your changes to the branch
4. Run the test suite and make sure it still works
5. Submit a pull request
6. ???
7. Enjoy the changes made to the repo!

## Licenses
- The main library is licensed under the MIT License. Feel free to use it!
- Some of the test CSVs are from [csv-spectrum](https://github.com/maxogden/csv-spectrum) (BSD-2-Clause) which includes some from [csvkit](https://github.com/wireservice/csvkit) (MIT License)
@@ -82,12 +83,7 @@ local function createHeaders(line, rename, fieldsToKeep)
             headers[i] = line[i]
         end
     end
-    if fieldsToKeep ~= nil then
-        for i = 1, #fieldsToKeep do
-            fieldsToKeep[fieldsToKeep[i]] = true
-        end
-    end
-    return headers, 0, true, fieldsToKeep
+    return headers, 0, true
 end

 -- main function used to parse
@@ -118,6 +114,9 @@ function ftcsv.parse(inputFile, delimiter, options)
         if options.fieldsToKeep ~= nil then
             assert(type(options.fieldsToKeep) == "table", "ftcsv only takes in a list (as a table) for the optional parameter 'fieldsToKeep'. You passed in '" .. tostring(options.fieldsToKeep) .. "' of type '" .. type(options.fieldsToKeep) .. "'.")
             ofieldsToKeep = options.fieldsToKeep
+            if header == false then
+                assert(next(rename) ~= nil, "ftcsv can only have fieldsToKeep for header-less files when they have been renamed. Please add the 'rename' option and try again.")
+            end
         end
         if options.loadFromString ~= nil then
             assert(type(options.loadFromString) == "boolean", "ftcsv only takes a boolean value for optional parameter 'loadFromString'. You passed in '" .. tostring(options.loadFromString) .. "' of type '" .. type(options.loadFromString) .. "'.")
@@ -199,10 +198,32 @@ function ftcsv.parse(inputFile, delimiter, options)
             doubleQuoteEscape = createNewField(inputString, quote, fieldStart, i, outResults[lineNum], headerField[fieldNum], doubleQuoteEscape, fieldsToKeep)

             -- if we have headers then we gotta do something about it
+                    -- files without headers, but with a rename need to be handled too!
+                    if #rename > 0 then
+                        for j = 1, math.max(#rename, #headerField) do
+                            headerField[j] = rename[j]
+                            -- this is an odd case of where there are certain fields to be kept
+                            if fieldsToKeep == nil or fieldsToKeep[rename[j]] then
+                                outResults[1][rename[j]] = outResults[1][j]
+                            end
+                            -- print("J", j)
+                            outResults[1][j] = nil
+                        end
+                    end
+                end
             end

+            -- incrememnt for new line
             lineNum = lineNum + 1
             outResults[lineNum] = {}
             fieldNum = 1
@@ -314,9 +335,9 @@ function ftcsv.encode(inputTable, delimiter, options)
     -- grab the headers from the options if they are there
     local headers = nil
     if options then
-        if options.headers ~= nil then
-            assert(type(options.headers) == "table", "ftcsv only takes in a list (as a table) for the optional parameter 'headers'. You passed in '" .. tostring(options.headers) .. "' of type '" .. type(options.headers) .. "'.")
-            headers = options.headers
+        if options.fieldsToKeep ~= nil then
+            assert(type(options.fieldsToKeep) == "table", "ftcsv only takes in a list (as a table) for the optional parameter 'fieldsToKeep'. You passed in '" .. tostring(options.headers) .. "' of type '" .. type(options.headers) .. "'.")