Fix mutable attributes being shared between processors and datasets #366

dale-wahl · 2023-05-30T14:51:58Z

Sometimes options and/or parameters appear to be updated unexpectedly. I have not yet been able to reproduce these errors consistently, but I can find examples.

This count posts analysis, for example, somehow picked up the add_relative option and had it set to True. That should never happen for a Twitter dataset. Unfortunately, I cannot recreate the error (though a similar processor somehow ran with add_relative set to False, even though that option should not even be available to select).

I thought it was connected to this issue, but that does not seem to be the case: I can recreate that environment and random columns still appear. My best guess there is now that something in the frontend Jinja templates is holding data from a previous dataset, or options, that appears when get_columns fails.

This PR can be merged (at this time), but I have not figured out how to test that it actually solves the problem.

dale-wahl · 2023-05-30T14:53:49Z

backend/abstract/processor.py

 	#: Is this processor running 'within' a preset processor?
 	is_running_in_preset = False

 	#: This will be defined automatically upon loading the processor. There is
 	#: no need to override manually
 	filepath = None

+	# def __init__(self, logger, job, queue=None, manager=None, modules=None):


I left this here for the moment as a comment. It does not seem necessary for the base processor class to have either options or parameters, as they are updated when needed.

options in particular is always retrieved by the get_options method, which correctly returns a blank dictionary if the child class does not define options itself.

That makes sense; most of the set-up for BasicProcessor happens in work() anyway, since it is always instantiated in a separate thread. So insofar as these properties need to be set up, that is where it would happen, and it is also where self.parameters is set.
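
A minimal sketch of the pattern discussed in this thread, under stated assumptions: the class names, the simplified get_options and work signatures, and the example option are hypothetical stand-ins, not the actual code in backend/abstract/processor.py. It only illustrates a base class that declares no class-level options or parameters, falls back to a new empty dictionary when a child defines no options, and assigns parameters on the instance when work() runs.

```python
class BaseSketch:
    # Deliberately no class-level `options` or `parameters` dictionaries here.

    @classmethod
    def get_options(cls):
        # Return the child's options if it declares any; otherwise a new,
        # empty dict is created on every call (hypothetical signature).
        return getattr(cls, "options", {})

    def work(self, job_parameters):
        # Per-run state is assigned on the instance, so every run gets
        # its own dictionary (hypothetical signature).
        self.parameters = dict(job_parameters)


class WithOptions(BaseSketch):
    # Hypothetical option definition, for illustration only
    options = {"add_relative": {"default": False}}


class WithoutOptions(BaseSketch):
    pass


# Mutating one fallback dict does not affect the next call
first = WithoutOptions.get_options()
first["stray"] = True
second = WithoutOptions.get_options()

# Two runs keep separate parameters
run_a, run_b = WithoutOptions(), WithoutOptions()
run_a.work({"timeframe": "month"})
run_b.work({})

print(f"Fallback options after mutating an earlier result: {second}")
print(f"run_a parameters: {run_a.parameters}")
print(f"run_b parameters: {run_b.parameters}")
```

Note that in this sketch a child that does declare options still returns its class-level dictionary, so callers would need to copy it before modifying it.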

dale-wahl · 2023-05-30T14:56:11Z

To remind myself why this matters:

class BigClass():
    parameters = {"default": "Big"}
print(f"Original BigClass parameters: {BigClass.parameters}\n")

# Sub Class inherits from BigClass
class SubA(BigClass):
    pass
# Instantiate this Sub Class and update the dictionary
sub_a = SubA()
sub_a.parameters.update({"test": "chaos!"})
print("sub_a object instantiated and sub_a.parameters.update({\"test\": \"chaos!\"}) called")
print(f"SubA 'update' parameters points to BigClass parameters: {sub_a.parameters}")
print(f"Modified BigClass parameters changed by SubA 'update' method: {BigClass.parameters}\n")

# Now reassign sub_a parameters to new dictionary
sub_a.parameters = {"completely_new": "SubA"}
print("sub_a assignment via sub_a.parameters = {\"completely_new\": \"SubA\"}")
print(f"SubA parameters are assigned not updated: {sub_a.parameters}")
print(f"No change to BigClass parameters: {BigClass.parameters}")
print("sub_a and BigClass parameters have been decoupled")

Output:

Original BigClass parameters: {'default': 'Big'}

sub_a object instantiated and sub_a.parameters.update({"test": "chaos!"}) called
SubA 'update' parameters points to BigClass parameters: {'default': 'Big', 'test': 'chaos!'}
Modified BigClass parameters changed by SubA 'update' method: {'default': 'Big', 'test': 'chaos!'}

sub_a assignment via sub_a.parameters = {"completely_new": "SubA"}
SubA parameters are assigned not updated: {'completely_new': 'SubA'}
No change to BigClass parameters: {'default': 'Big', 'test': 'chaos!'}
sub_a and BigClass parameters have been decoupled
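
The usual remedy for the behaviour above can be sketched as follows. This is an illustration only: SharedBase and IsolatedBase are hypothetical names, and the sketch assumes nothing about how the real processor classes are initialised. It contrasts a dictionary declared on the class with one created per instance in __init__.

```python
class SharedBase:
    # Class-level dict: a single object shared by the class,
    # its subclasses and every instance
    parameters = {"default": "Big"}


class IsolatedBase:
    def __init__(self):
        # Instance-level dict: a new object is created for each instance
        self.parameters = {"default": "Big"}


# In-place update on one instance is visible through another instance
shared_a, shared_b = SharedBase(), SharedBase()
shared_a.parameters.update({"test": "chaos!"})
leaked = "test" in shared_b.parameters

# With per-instance dicts, the same update stays local
isolated_a, isolated_b = IsolatedBase(), IsolatedBase()
isolated_a.parameters.update({"test": "chaos!"})
contained = "test" not in isolated_b.parameters

print(f"Class attribute update leaked to another instance: {leaked}")
print(f"Instance attribute update stayed contained: {contained}")
```

Both flags come out True: the class-level dictionary is mutated for everyone, while the dictionaries created in __init__ are independent.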

stijn-uva · 2023-10-12T08:38:34Z

LGTM once the conflicts are resolved (took a look myself but I'd prefer if you do it @dale-wahl, just so I don't make the wrong choice of what attributes to declare)

dale-wahl · 2023-10-12T14:27:38Z

Merged and tested with no noted ill effects.

Those can be discovered later 😁

init mutable attributes

d93a70b

stijn-uva marked this pull request as ready for review September 21, 2023 14:18

dale-wahl added 4 commits October 12, 2023 13:32

build to use local directory

e284343

Merge branch 'master' into mutable_attributes

6e74af2

quick fix docker-compose_build.yml

26c4d0e

Merge branch 'master' into mutable_attributes

1d1cabd

dale-wahl merged commit 2ef3027 into master Oct 12, 2023

dale-wahl deleted the mutable_attributes branch October 12, 2023 14:28

stijn-uva added this to the 1.37 milestone Oct 18, 2023
