Commit 2f50943 (parent 6ed6fae): Add changelog for 1.2.1

README.md: 78 additions, 68 deletions

## what are you talking about ?

Well, first you have to know that PostgreSQL has a not-so-well-known mechanism that helps when importing into PostgreSQL from a source (_copy-in_)
or exporting to a sink from PostgreSQL (_copy-out_).

You should first go and get familiar with the [pg-copy-streams](https://github.com/brianc/node-pg-copy-streams) module that does
the heavy lifting of handling the COPY part of the protocol flow.

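As a quick illustration of what `pg-copy-streams` alone gives you, here is a minimal copy-out sketch (the `my_table` table and the output file name are made up for the example):

```js
var fs = require('fs')
var pg = require('pg')
var copyOut = require('pg-copy-streams').to

var client = new pg.Client() // connection parameters taken from the environment
client.connect()

// copy-out: stream the content of my_table out of PostgreSQL and into a file
var stream = client.query(copyOut('COPY my_table TO STDOUT'))
stream.pipe(fs.createWriteStream('my_table.txt'))
stream.on('end', function () {
  client.end()
})
```

copy-in works the same way: `client.query(copyIn('COPY my_table FROM STDIN'))` gives you a writable stream that you pipe data into.
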
## what does this module do ?

When dealing with the COPY mechanism, you can use different formats for _copy-out_ or _copy-in_: text, csv or binary.

The text and csv formats are interesting but they have some limitations due to the fact that they are text based, need field separators, escaping, etc. Have you ever been in the CSV hell?

The PostgreSQL documentation states: "Many programs produce strange and occasionally perverse CSV files, so the file format is more a convention than a standard. Thus you might encounter some files that cannot be imported using this mechanism, and COPY might produce files that other programs cannot process."

It can be used to parse and deparse the PostgreSQL binary streams that are made available by the `pg-copy-streams` module.

The main API is called `transform` and tries to hide many of those details. It can be used to easily do non-trivial things like:

- transforming rows
- expanding on the number of rows
- forking rows into several databases at the same time, with the same or different structures

## Example

Table C has the simple structure

```sql
CREATE TABLE generated (body text);
```

And you want to fill it, for each source row, with a number `id` of rows (expanding the number of rows), with a body of "BODY: " + description.

After all this is done, you want to add a line in the `generated` table with a body of "COUNT: " + total number of rows inserted (not counting this one).

Here is code that does just this.

```js
var pg = require('pg')
var through2 = require('through2')
var copyOut = require('pg-copy-streams').to
var copyIn = require('pg-copy-streams').from
var pgCopyTransform = require('pg-copy-streams-binary').transform

var client = function (dsn) {
  var client = new pg.Client(dsn)
  client.connect()
  return client
}

var dsnA = null // configure database A connection parameters
var dsnB = null // configure database B connection parameters
var dsnC = null // configure database C connection parameters

var clientA = client(dsnA)
var clientB = client(dsnB)
var clientC = client(dsnC)

var AStream = clientA.query(copyOut('COPY item TO STDOUT BINARY'))
var BStream = clientB.query(copyIn('COPY product FROM STDIN BINARY'))
var CStream = clientC.query(copyIn('COPY generated FROM STDIN BINARY'))

var count = 0 // total number of rows pushed to the target tables

var transform = through2.obj(
  function (row, _, cb) {
    var id = parseInt(row.ref.split(':')[0])
    var d = new Date('1999-01-01T00:00:00Z')
    d.setDate(d.getDate() + id)
    count++
    // the first element of the pushed array selects the target: 0 = BStream, 1 = CStream
    this.push([
      0,
      { type: 'int4', value: id },
      { type: 'text', value: row.ref.split(':')[1] },
      { type: 'text', value: row.description.toLowerCase() },
      { type: 'timestamptz', value: d },
      {
        type: '_int2',
        value: [
          [id, id + 1],
          [id + 2, id + 3],
        ],
      },
    ])
    while (id > 0) {
      count++
      this.push([1, { type: 'text', value: 'BODY: ' + row.description }])
      id--
    }
    cb()
  },
  function (cb) {
    this.push([1, { type: 'text', value: 'COUNT: ' + count }])
    cb()
  }
)

var pct = pgCopyTransform({
  mapping: [
    { key: 'id', type: 'int4' },
    { key: 'ref', type: 'text' },
    { key: 'description', type: 'text' },
  ],
  transform: transform,
  targets: [BStream, CStream],
})

pct.on('close', function () {
  // Done !
  clientA.end()
  clientB.end()
  clientC.end()
})

AStream.pipe(pct)
```

The `test/transform.js` test does something along these lines to check that it works.
This option can be used to not send the header that PostgreSQL expects at the end of a COPY session.
You could use this if you want to unpipe this stream and pipe another one that will send more data and maybe finish the COPY session.

## API for Parser

### options.mapping

When `mapping` is not given, the Parser will push rows as arrays of Buffers.

For all supported types, their corresponding array version is also supported.

- bool
- bytea
- int2, int4
- float4, float8
- text
- json
- timestamptz

Note that when types are mentioned in the `mapping` option, they should be strictly equal to one of these types. pgAdmin might sometimes mention aliases (like integer instead of int4) and you should not use these aliases.

The types for arrays (one or more dimensions) correspond to the type prefixed with an underscore. So an array of int4, int4[], needs to be referenced as \_int4 without any mention of the dimensions. This is because the dimension information is embedded in the binary format.

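For instance (the table and column names here are made up), a table like `measures (id int4, samples int4[])` would be described with a mapping such as:

```js
// the int4[] column is referenced as _int4 ; dimensions are not part of the type name
var mapping = [
  { key: 'id', type: 'int4' },
  { key: 'samples', type: '_int4' },
]
```
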
## changelog

### version 1.2.1 - published 2020-05-29

- Fix a compatibility bug introduced via `pg-copy-streams` 3.0. The parser can now handle rows that span across several stream chunks
- Migration of tests to mocha

## Warnings & Disclaimer

There are many details in the binary protocol, and as usual, the devil is in the details.

- Currently, operations are considered to happen on tables WITHOUT OIDS. Usage on tables WITH OIDS has not been tested.
- In arrays, null placeholders are not implemented (no spot in the array can be empty).
- In arrays, the first element of a dimension is always at index 1.
- Error handling has not yet been tuned so do not expect explicit error messages.

The PostgreSQL documentation states it clearly: "a binary-format file is less portable across machine architectures and PostgreSQL versions".
Tests try to discover issues that may appear between PostgreSQL versions, but it might not work in your specific environment.
Use it at your own risk!

## External references

- [COPY documentation, including binary format](https://www.postgresql.org/docs/current/static/sql-copy.html)
- [send/recv implementations for types in PostgreSQL](https://github.com/postgres/postgres/tree/master/src/backend/utils/adt)
- [default type OIDs in PostgreSQL catalog](https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_type.h)

## Acknowledgments

AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.