Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs: How to link new object to existing objects? (many-to-many relationship) #359

Open
riyavsinha opened this issue Jan 21, 2025 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

@riyavsinha
Copy link

Summary

I've tried to understand and execute the Post-Tag example for my own use case with a many-to-many relationship, and I am not able to get working how to add a new Post, say, that uses existing tags.

In my code, I do:

variants = [/** list of model.Variants already fetched from a previous method **/]
job_data = models.Job(
            name=job_name,
            status="pending",
            submitted_by_id=user.id,
            variants=variants,
        )
job = await self.upsert(job_data)

and I get:

(sqlalchemy.dialects.postgresql.asyncpg.IntegrityError) <class 'asyncpg.exceptions.UniqueViolationError'>: duplicate key value violates unique constraint "pk_variants"
DETAIL:  Key (id)=(aa508a73-af8d-49ed-a1d2-c146321c910a) already exists.
[SQL: INSERT INTO variants (chr, pos, ref, alt, id, created_at, updated_at) VALUES ($1::VARCHAR, $2::INTEGER, $3::VARCHAR, $4::VARCHAR, $5::UUID, $6::TIMESTAMP WITH TIME ZONE, $7::TIMESTAMP WITH TIME ZONE)]
[parameters: [('chr15', 88569433, 'CTTTG', 'C', 'aa508a73-af8d-49ed-a1d2-c146321c910a', datetime.datetime(2025, 1, 21, 9, 59, 37, 865640, tzinfo=datetime.timezone.utc), datetime.datetime(2025, 1, 21, 9, 59, 37, 865653, tzinfo=datetime.timezone.utc)), ('chr2', 72149046, 'C', 'T', '4b0269e2-f196-415e-8116-d7304cd02b0b', datetime.datetime(2025, 1, 21, 9, 59, 37, 865658, tzinfo=datetime.timezone.utc), datetime.datetime(2025, 1, 21, 9, 59, 37, 865661, tzinfo=datetime.timezone.utc))]]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

But I'm only trying to create a new Job, I'm not intending to create new variants. How do I just link the new job to the existing variants / the new post to the existing tags?

@riyavsinha riyavsinha added the documentation Improvements or additions to documentation label Jan 21, 2025
@cofin
Copy link
Member

cofin commented Jan 21, 2025

Without setting the match_fields on the upsert, it tries to match on every column. In this case, when not every field in your job matches an existing record, it tries to insert a new row. This fails because the natural key of the row is violated.

To fix this, you should set the match_fields value at the class level or on the call itself to upsert. Use something other than a Primary key (like in this case, maybe Job.name is unique). Give that a try and let me know.

Insert of upsert, you might also look at one of the alternative methods such as get_and_update or get_or_upsert

@cofin
Copy link
Member

cofin commented Jan 21, 2025

FWIW, here's a working service that allows you to add tags to "teams". This may be helpful because it's another example of a many-to-many. Notice how in this example we also use the as_unique Mixins on the tag to make sure what we don't incorrectly insert new records that already exist in Tag.

This piece is what's probably most important here.

data.tags.extend(
                    [
                        await m.Tag.as_unique_async(self.repository.session, name=tag_text, slug=slugify(tag_text))
                        for tag_text in tags_added
                    ],
                )

This will force a lookup to match for existing tags before inserting a new tag record.

@riyavsinha
Copy link
Author

Thank you for the response!

To fix this, you should set the match_fields value at the class level or on the call itself to upsert. Use something other than a Primary key (like in this case, maybe Job.name is unique). Give that a try and let me know.

Hm I'm not sure I understand, sorry; the Job (aka Post) is new and so doesn't need to be matched, just needs to be created. The things that needs to be matched is the Variants (aka Tags) that it is referencing. To form that constraint, (1) would I still add match_fields in the JobRepository and (2) would I add like match_fields = [Job.variant.id] then?

This may be helpful because it's another example of a many-to-many. Notice how in this example we also use the as_unique Mixins on the tag to make sure what we don't incorrectly insert new records that already exist in Tag.

So this works but I'm quite confused by this behavior :( I added the UniqueMixin, but that alone doesn't do anything, I have to use as_unique_async like:

uniq_vars =[await models.Variant.as_unique_async(self.repository.session, **{k:v for k, v in variant.__dict__.items() if k in ['chr', 'pos', 'ref', 'alt']}) for variant in variants]
job_data = models.Job(
            ...
            variants=uniq_vars,
        )
job = await self.upsert(job_data)

However, doesn't as_unique_async just re-return the object? What is different about the object given by that method, vs the Variant objects I had already fetched from a different method and passed to this one? The reason I write the pattern above with **{k:v for k, v in model.__dict__.items() ... } is because I have another model I have to do this for with like at least 10 different properties I have to check uniqueness for and which is already fetched also in an earlier validation step. Please let me know if theres something I'm fundamentally misunderstanding here.

Also, I had a UniqueConstraint set earlier as a table arg on Variant for [chr,pos,ref,alt] that I had to remove to get this to work, is that expected?

Thanks again for your patience

@riyavsinha
Copy link
Author

riyavsinha commented Jan 22, 2025

Ah, ok i think in general, I figured this out. There is something happening in my earlier validation step when I load the Variants causing the Variant to be treated as modified (using session.is_modified(variant)). Sorry for the confusion!

I would still find it helpful to have more documentation and guidance in docs around:

  1. Using only the UniqueMixin vs UniqueMixin + Table Arg Unique Constraints; does UniqueMixin compare to things in the db? or do I need to first fetch to get all objects in the session, and then use as_unique_async to add to the existing objects list?
  2. Best practices around using UniqueMixin --
    a) since it's instantiating an object in the session, it needs to have all the non-nullable fields it seems like, not just the fields that are required for checking uniqueness, correct? since otherwise an insert with objects created from as_unique_async are missing some of those db non nullable fields
    b) is there a cleaner way to do it for models with many columns where I don't want to re-enter the whole list of args into the fn signature? is there a way I can just pass in an obj I've already instantiated to just de-dupe it? (such as in my case) and avoid the **{k:v for k, v in model.__dict__.items() ... } pattern

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

2 participants