upgraded from beat 1.0.1 to 1.4.0 and beat has died twice in two days #210
Comments
To make things worse, the migrations are also broken: you cannot migrate back to 0001 initial. I suspect the cause is a squashed migration. Squashed migrations will break other people's installations if they try to migrate backwards, which is why I normally avoid them. Migration 0005 looks like a squashed migration, but the rollback doesn't blow up until it reaches migration 0001. |
The problem with migration 0005 shows up if you try to go from 0006 to 0005:
./manage.py migrate django_celery_beat 0005
CommandError: More than one migration matches '0005' in app 'django_celery_beat'. Please be more specific.
That is because 0005 is the squashed migration, which seems to have messed with the order. If you do try to be more specific, that is when it blows up:
./manage.py migrate django_celery_beat 0005_add_solarschedule_events_choices_squashed_0009_merge_20181012_1416
django.db.utils.ProgrammingError: column django_celery_beat_periodictask.priority does not exist |
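To illustrate why the short name `0005` is rejected, here is a minimal standalone sketch of prefix-based name lookup. It is not Django's code; it only mimics the idea that a partial migration name is matched as a prefix and rejected when more than one name matches. The squashed migration name comes from the error above; the other two names and the helper `find_by_prefix` are hypothetical.

```python
class AmbiguousPrefix(Exception):
    """Raised when a prefix matches more than one migration name."""


def find_by_prefix(names, prefix):
    """Return the single migration name starting with `prefix`.

    Raises AmbiguousPrefix if several names match and KeyError if none do.
    """
    matches = [name for name in names if name.startswith(prefix)]
    if len(matches) > 1:
        raise AmbiguousPrefix(
            f"More than one migration matches {prefix!r}: {matches}"
        )
    if not matches:
        raise KeyError(prefix)
    return matches[0]


migrations = [
    # Name taken from the thread.
    "0005_add_solarschedule_events_choices_squashed_0009_merge_20181012_1416",
    # Hypothetical names, only here to reproduce the number clash.
    "0005_hypothetical_other_change",
    "0006_hypothetical_followup",
]
```

With this list, `find_by_prefix(migrations, "0005")` raises `AmbiguousPrefix`, while passing the full squashed name resolves to a single entry. That matches the behaviour reported above: the name lookup only succeeds once the name is unambiguous, and the database error appears afterwards.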
There are multiple 0006 and multiple 0005 migrations. This is a real mess and should be resolved. It looks like someone did try to resolve it, but you cannot reverse-migrate or jump to a specific migration because so many numbers are reused. |
I also tried emptying all the beat-related tables in the Django admin before migrating back to 0001, and unfortunately even then it blows up: psycopg2.ProgrammingError: column django_celery_beat_periodictask.solar_id does not exist. Basically, I'm going to have to go into the db, delete the tables, reset the migration state of the django_celery_beat app and start again. The migrations are broken and it simply is not possible to migrate back. |
I'm happy to help fix the migrations. Version |
Thanks. In the meantime I've manually rolled back to beat version 1.0.1 and rolled back celery to 4.1.1, because this is a live production system and it can't really go down. I'm going to do some experimentation in the new year to see if I can upgrade celery but hold back beat for now. It would be nice to find out why beat keeps dying on me, but it's a production system for a client so I can't really afford to mess with it too much. |
Can anyone with celery 4.3 or master verify this issue against celery beat master? |
After beat suddenly stopped running scheduled tasks after running fine all
night, we've been too scared to try beat 1.4.0 again unfortunately.
We've since moved to Celery 4.3 but are still using beat 1.0.1, partly because
beat 1.3.0 and 1.4.0 both have broken migrations and cannot be
migrated backwards.
Sadly unless the migrations are fixed, I would not suggest migrating beyond
beat 1.0.1 at this stage. I've been waiting for a 1.5.0 release to fix the
broken migrations, that should be the highest priority on the list as a
project with broken migrations is really really bad. If people need to
migrate their Django project back in production because of a failed
release, you cannot roll back the migrations which is just bad. It happened
to me and I had to manually "fix" the database and undo the migrations. I
was not impressed.
…On Tue, 23 Apr 2019 at 15:24, Asif Saif Uddin ***@***.***> wrote:
Anyone with celery 4.3 or master can verify this issue with celery beat
master?
|
I can see a migration change in v1.4.0...master. Could you please try it locally using the master branch? |
I had a quick look at the diff between v1.4.0...master and the only
migration I can see is another addition adding a field. That won't be the
fix we are looking for unfortunately.
See issue #217 about the broken migrations, they don't seem to be fixed and
it's holding people back from accepting the new version. People don't feel
safe if they can't migrate backwards.
As I recall looking into it earlier this year, people have messed with
migration history, and/or squashing migrations. You basically cannot
migrate backwards now.
…On Tue, 23 Apr 2019 at 15:58, Asif Saif Uddin ***@***.***> wrote:
I can see a migration change here, v1.4.0...master
could you please try on local using from the master branch?
|
Basically, to test this: start a project with beat 1.0.1, upgrade to beat 1.4.0 and apply the migrations, then migrate the django_celery_beat app back to migration 0001. It will fail. I've also tried migrating backwards in steps, but that fails too. |
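The reproduction steps above could be scripted roughly as follows. This is a sketch under assumptions: it expects to run from the root of a throwaway Django project with `manage.py` in the current directory, inside a disposable virtualenv, and it assumes the package installs from PyPI as `django-celery-beat`. The helper names `reproduction_steps` and `run` are made up for illustration.

```python
import subprocess
import sys

APP = "django_celery_beat"


def reproduction_steps(old="1.0.1", new="1.4.0"):
    """Return the commands (as argv lists) for each step, in order."""
    pip = [sys.executable, "-m", "pip", "install"]
    migrate = [sys.executable, "manage.py", "migrate", APP]
    return [
        pip + [f"django-celery-beat=={old}"],  # start on the old release
        migrate,                               # apply its migrations
        pip + [f"django-celery-beat=={new}"],  # upgrade
        migrate,                               # apply the new migrations
        migrate + ["0001"],                    # roll back; reported to fail
    ]


def run(steps):
    """Run each step and stop at the first command that fails."""
    for argv in steps:
        print("$", " ".join(argv))
        subprocess.run(argv, check=True)
```

Calling `run(reproduction_steps())` should stop with a `CalledProcessError` on the last step if the rollback fails as described. Nothing is executed until `run` is called, so the list of steps can be inspected first.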
Actually, issue #217 doesn't accurately explain the problem; the comments in this issue do. Issue #217 seems to end with "don't edit migrations that have already been applied". I beg to differ. That is normally the case, but if the migrations are broken then in this particular case you basically have no choice but to fix them to resolve this mess. |
One could clean up the migrations and create a new one, which may need some insane logic to fix past mistakes. For new installs it would be fine since there would be no data. Some people may just need to rebuild their schedules. |
@robvdl can you come up with a solution? |
I'm happy to work through this (As I've said in #217). I just want to make sure my time isn't wasted and that I'll get active support in getting it figured out and merged. |
I have dedicated time to help maintain these projects, so, please proceed and mention me in the new PR |
Awesome. Thanks! |
@robvdl, can we start with this?
You should be able to migrate frontwards and backwards. The next question is migrations aside, does the beat process still die? Also, not sure on your timezone settings, I needed to add this:
To my celery app |
I would really like to avoid that. If necessary, we can always use the migrations.RunPython to manually fix things if we need to do something crazy. |
I agree. The first step is to solve the problem, then find out how to make it work. In theory, if you are in a working state, tweaking existing migrations won't hurt, assuming the final state is the same. |
@auvipy and @liquidpele, do you have any idea what versions you have in the wild? Doing some testing I found:
also that a fresh install of so worst case if someone is on |
@robvdl, I've had it running for 24 hours and it processed about 250k tasks without any issues. Would love to know if you did too (realizing the crashing and the migrations have nothing to do with each other). |
Sorry I won't be able to get a chance to try this again for a while, but I will eventually. One issue is that the servers this runs on are still on trusty and we've been trying to convince the client for a year they need upgrading. |
I've fixed the migration issue and added tests for it in my PR, #241, since I was adding a new migration anyway. I was easily able to reproduce the issues by just migrating back to 0001 and then forward again on django 1.11.20. @justmobilize care to retest with my branch? |
@liquidpele, you deleted migrations, meaning if someone is on another version they will be totally blocked from moving forward (for example my production environment is on 1.1.1 which ends on 0005). I would highly recommend tossing Thoughts? |
You could then after that, make a squash |
I like this solution :) I'm going to give this another shot in the next few weeks when I have a chance. |
We recently upgraded a UAT system from Celery 4.1.1 to 4.2.1 and Beat to 1.4.0.
At first glance everything was fine, but twice now the beat process has just randomly died and we've had to restart it.
The reason for the upgrade was to switch to version 3 of the redis-py library. With celery 4.1 we had to pin redis-py to version 2, since the recent redis-py 3 release initially broke things.
Anyway, beat never died before we upgraded it. I'll see if I can get some additional information, but in the meantime we have to roll back to Celery 4.1.1 and Beat 1.0.1.