Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behavior of pgtype.JSONBArray and json.RawMessage #149

Open
EmilLaursen opened this issue Mar 9, 2022 · 4 comments
Open

Unexpected behavior of pgtype.JSONBArray and json.RawMessage #149

EmilLaursen opened this issue Mar 9, 2022 · 4 comments

Comments

@EmilLaursen
Copy link
Contributor

I am setting a pgtype.JSONBArray value from a value of type []json.RawMessage, and encountered some unexpected behavior. I am not sure if this is a bug.

This first test fails:

func TestPgxArraySet1(t *testing.T) {
	data := []json.RawMessage{
		json.RawMessage(`{"lol":1}`),
		json.RawMessage(`{"lol":2}`),
	}
	var nas pgtype.JSONBArray
	err := nas.Set(data)
	if err != nil {
		t.Fatal(err)
	}
	assert.Len(t, nas.Elements, len(data))
	if len(nas.Elements) != len(data) {
		t.Errorf("data length: %v, nas length: %v", len(data), len(nas.Elements))
	}
}

What happens is that data is interpreted as an array of single integer values, one integer for each byte in the two json values. So nas.Elements has length 18.
But changing one of the JSON values a bit will make the test pass:

func TestPgxArraySet2(t *testing.T) {
	data := []json.RawMessage{
		json.RawMessage(`{"lol":2}`),
		json.RawMessage(`{"lol":{"x":1}}`),
	}
	var nas pgtype.JSONBArray
	err := nas.Set(data)
	if err != nil {
		t.Fatal(err)
	}
	assert.Len(t, nas.Elements, len(data))
	if len(nas.Elements) != len(data) {
		t.Errorf("data length: %v, nas length: %v", len(data), len(nas.Elements))
	}
}

This is with github.com/jackc/pgtype v1.8.1 and go 1.17. Casting the type to [][]byte makes everything work just fine. I assume this is because [][]byte is handled by .Set specifically, where []json.RawMessage will fall through to the default reflection based conversion in .Set.

Would it be possible to add a case for []json.RawMessage in .Set?

@jackc
Copy link
Owner

jackc commented Mar 12, 2022

Would it be possible to add a case for []json.RawMessage in .Set?

If that is the solution then it should be possible to add it to typed_array_gen.sh and regenerate the array struct.

@EmilLaursen
Copy link
Contributor Author

As mentioned in the PR, your suggestion will fix the test cases for json.RawMessage. However, the fundamental issue remains, in the sense that defining any other type alias for []byte such as

type somebytealias []byte

and substituting json.RawMessage with this type, will make the test cases fail once again. So it appears there might be a problem in the reflect based case in .Set?

I am happy to help investigate this, but I do not feel I understand the ramifications of tinkering with the typed_array.go.erb template, as this is used for several types. I have tried to debug these test cases:

type somebytealias []byte

func TestJSONBArraySetAlias(t *testing.T) {
	successfulTests := []struct {
		source interface{}
		result pgtype.JSONBArray
	}{
		{source: []somebytealias{somebytealias(`{"foo":12}`), somebytealias(`{"bar":2}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{pgtype.JSONB{Bytes: []byte(`{"foo":1}`), Status: pgtype.Present}, pgtype.JSONB{Bytes: []byte(`{"bar":2}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{pgtype.ArrayDimension{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
		{source: []somebytealias{somebytealias(`{"foo":1}`), somebytealias(`{"bar":{"x":2}}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{pgtype.JSONB{Bytes: []byte(`{"foo":1}`), Status: pgtype.Present}, pgtype.JSONB{Bytes: []byte(`{"bar":{"x":2}}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{pgtype.ArrayDimension{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
		{source: []somebytealias{somebytealias(`{"foo":1}`), somebytealias(`{"bar":2}`)}, result: pgtype.JSONBArray{
			Elements:   []pgtype.JSONB{pgtype.JSONB{Bytes: []byte(`{"foo":1}`), Status: pgtype.Present}, pgtype.JSONB{Bytes: []byte(`{"bar":2}`), Status: pgtype.Present}},
			Dimensions: []pgtype.ArrayDimension{pgtype.ArrayDimension{Length: 2, LowerBound: 1}},
			Status:     pgtype.Present,
		}},
	}

	for i, tt := range successfulTests {
		var d pgtype.JSONBArray
		err := d.Set(tt.source)
		if err != nil {
			t.Errorf("%d: %v", i, err)
		}

		if !reflect.DeepEqual(d, tt.result) {
			var xs []string
			for _, e := range d.Elements {
				xs = append(xs, string(e.Bytes))
			}
			t.Errorf("%d: expected %+v to convert to %+v, but it was %+v, raw: %v", i, tt.source, tt.result, d, xs)
		}
	}
}

And at least I can share some observations. The two first cases are different from the third.

  1. In the third case, the two elements {"foo":1} and {"bar":2} both have length 9, and what is set on dst.Elements is
[123 34 102 111 111 34 58 49 125 123 34 98 97 114 34 58 50 125]

which is the concatenation of the two inputs.

  1. Changing 1 to 12, which is the first case, gives us this output
["eyJmb28iOjEyfQ==" "eyJiYXIiOjJ9"]

which is the base64 encoding of the two inputs as strings. This is probably due to json.Marshal being used in the pgtype.JSON type at some point when calling dst.Elements[index].Set in the setRecursive code, as this is the default behavior on []byte types. The same happens in the second case.

@jackc
Copy link
Owner

jackc commented Mar 22, 2022

The code that tries to automatically handle unknown types is hard to reason about. The JSON code path is even harder because it automatically calls json.Marshal with anything that isn't explicitly recognized.

That conflicts with the try to figure out if the unknown type can be converted to something known. For example:

type mytype []byte

func (s mytype) MarshalJSON() ([]byte, error) {
	// ...
}

Should mytype be marshaled or should it be treated as a []byte? And how to handle that while not breaking any of the other convertible to JSON types.

JSONBArray gets even more complicated because there is the extra layer of the array.

Personally, I'm inclined to be content with the partial solution in your PR rather than delve into that nest of complexity. But your welcome to dig into it further if you want.

FWIW, tricky issues like this were one of the primary motivations behind deciding to start development on pgx v5. I'm hoping to simplify the structure of the type conversion system such that this type issue isn't so complicated.

@EmilLaursen
Copy link
Contributor Author

I think I would be hard pressed to make much progress here :) Seeing as you are working on pgx v5, I think it is better to look for the future. I am curious to look into the new type conversion code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants