Thanks for your code! I found that during training the GPU is not fully utilized. Is there a way to add batching so that more pairs are trained per iteration?