Loss is 0 in step 200 and assertion error. #4
Not a trivial question at all. Thanks for your interest! The evaluation error is likely a product of the optimizer going off the rails. At a high level, I would first check if the data are correct and then move on to the optimization. A few comments and questions so that I can get a better sense of what might be wrong.
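As a concrete starting point for the data check, here is a minimal sketch (not part of the repo) that validates the jsonlines file produced by minimize.py. The field names doc_key, sentences, and clusters follow the e2e-coref data format, and the file name is a placeholder for whatever your training split is called:

```python
import json

def check_jsonlines(path):
    doc_keys = set()
    with open(path) as f:
        for i, line in enumerate(f):
            if not line.strip():
                continue
            doc = json.loads(line)
            key = doc["doc_key"]
            # duplicate doc_keys confuse bookkeeping during evaluation
            assert key not in doc_keys, "duplicate doc_key {} on line {}".format(key, i)
            doc_keys.add(key)
            num_tokens = sum(len(s) for s in doc["sentences"])
            assert num_tokens > 0, "empty document {}".format(key)
            for cluster in doc.get("clusters", []):
                for start, end in cluster:
                    # mention spans are inclusive indices into the flattened document
                    assert 0 <= start <= end < num_tokens, "bad span in {}".format(key)
    print("{} documents look OK".format(len(doc_keys)))

check_jsonlines("train.korean.384.jsonlines")  # placeholder path
```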
I'm using a Titan Xp. Sometimes the loss is NaN at step 100. I checked the doc_key values in my jsonlines. I tried Chinese data and it works... Can you tell me what the problem is?
Hmm, this is quite odd.
I've been trying your code every day.
Can you explain this assertion?
From a quick glance, I think so. I suspect getting rid of it will mess up the evaluation, but you can still try it out. The official coref evaluation is rather complicated in that it needs to call Perl scripts. It might be a good idea to turn those off while you debug. The problem seems to be in...
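For reference, here is a hedged sketch of how the Perl-based scoring could be made optional so only the pure-Python metrics run while debugging. The names conll.evaluate_conll, metrics-style coref_evaluator.get_prf(), and config["conll_eval_path"] are assumptions modeled on the e2e-coref family of codebases, not necessarily this repo's exact API:

```python
def score_predictions(config, coref_predictions, coref_evaluator, use_official_eval=False):
    if use_official_eval:
        # Shells out to the official Perl scorer; this is the part worth disabling
        # while debugging the assertion error.
        import conll  # assumed repo module
        conll_results = conll.evaluate_conll(
            config["conll_eval_path"], coref_predictions, official_stdout=True)
        average_f1 = sum(r["f"] for r in conll_results.values()) / len(conll_results)
        print("Official average F1: {:.2f}%".format(average_f1))
    # The Python evaluator needs no external scripts and still gives P/R/F1.
    p, r, f = coref_evaluator.get_prf()
    print("Python average F1: {:.2f}%".format(f * 100))
```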
Thank you so much for your effort.
Closing due to inactivity. Please feel free to reopen if something comes up. Thanks!
As you thought, the problem seems to be there. While I was checking the code, I found something I didn't understand: I think it happens because token[0] is... and I still face the error. Can you re-open this issue?
@fairy-of-9 Did you try training with that change? CCing @freesunshine0316, who might be interested.
There is no change in the code.
I want to know which document causes this during training (it's in train.py).
Can you give me some advice on this?
I'm AFK for the weekend. Off the top of my head, one way would be to add an integer document ID to input_props. You can then print or tf.Print that with the loss in both the dev and train loops. I can take a closer look when I get back.
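A minimal sketch of that suggestion, assuming the codebase exposes input_props, tensorize_example(), and a loss tensor (the exact names may differ); tf.Print is the TensorFlow 1.x op:

```python
import tensorflow as tf  # TensorFlow 1.x

# 1) Declare an extra int32 scalar input next to the existing ones, e.g.
#      input_props.append((tf.int32, []))   # doc_id
#    and have tensorize_example() return the document's index in the jsonlines file.

# 2) Wrap the loss so every evaluation of it also prints the document ID
#    (tf.Print is deprecated in favor of tf.print in TF 2.x).
def tag_loss_with_doc_id(loss, doc_id):
    return tf.Print(loss, [doc_id, loss], message="doc_id, loss: ")

# 3) Alternatively, fetch both tensors in the train/eval loops and print in Python:
#      loss_value, doc_id_value = session.run([model.loss, model.doc_id], feed_dict=fd)
#      print("doc {}: loss {}".format(doc_id_value, loss_value))
```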
@fairy-of-9
@mandarjoshi90 @freesunshine0316 |
Hi, I have a short question regarding training this model on another language, and I don't want to create a separate issue for this. Is it required to change the vocab.txt file and fill it with the chosen language's words? (I think this one is quite trivial :D, but this is my first experience using BERT/SpanBERT) |
I'm really, really sorry to ask you a trivial question.
Running train.py with the train_bert_base experiment works well. I have Korean data in CoNLL-2012 format, and it works with the e2e-coref model. I want to apply the Korean data to your model, so I ran minimize.py and then train.py. But the loss is 0 at step 200, and there is an assertion error at eval time.

assertion error log:

This is my experiments.conf. I changed the BERT model to multi_cased:

I edited num_docs to 2411. (It's the result of minimize.py. Is this correct?) The default ffnn_size and max_training_sentences cause a memory error, so I edited ffnn_size and max_training_sentences.
Is there anything I missed?
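One way to double-check the hand-edited num_docs (and the other overridden settings) is to compare the config against the jsonlines file that minimize.py actually produced. A minimal sketch, assuming the config is HOCON readable with pyhocon, the experiment is named train_bert_base, and the jsonlines path is a placeholder:

```python
import json
from pyhocon import ConfigFactory

# Environment variables referenced inside experiments.conf (e.g. data_dir) may need
# to be set before parsing, exactly as when running train.py.
conf = ConfigFactory.parse_file("experiments.conf")["train_bert_base"]

with open("train.korean.384.jsonlines") as f:  # placeholder path
    n_docs = sum(1 for line in f if line.strip())

print("num_docs in conf:       ", conf["num_docs"])
print("documents in jsonlines: ", n_docs)  # these two should match
print("ffnn_size:              ", conf["ffnn_size"])                # lowered to fit GPU memory
print("max_training_sentences: ", conf["max_training_sentences"])   # lowered to fit GPU memory
```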