Error: ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to call `tokenizer.padding_side = 'left'` before tokenizing the input.
I have already changed every padding_side in grpo_trainer.py to 'left', but the error still occurs. Any help would be appreciated.
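For anyone unsure what the message is asking for, here is a minimal pure-Python sketch of the difference between left and right padding. It is only an illustration: `pad_batch` is a hypothetical helper and not part of transformers or TRL, and the token ids are made up. With left padding the last real token of every row sits at the end of the row, which is where a decoder-only model continues generating from.

```python
from typing import List


def pad_batch(seqs: List[List[int]], pad_id: int = 0, side: str = "left") -> List[List[int]]:
    """Pad token-id lists to a common length (hypothetical helper, illustration only)."""
    if side not in ("left", "right"):
        raise ValueError(f"side must be 'left' or 'right', got {side!r}")
    # Width of the batch is the length of the longest sequence.
    width = max((len(s) for s in seqs), default=0)
    padded = []
    for s in seqs:
        fill = [pad_id] * (width - len(s))
        # Left padding puts the filler before the real tokens,
        # right padding puts it after them.
        padded.append(fill + list(s) if side == "left" else list(s) + fill)
    return padded


batch = [[11, 12, 13], [21]]  # made-up token ids
left = pad_batch(batch, side="left")
right = pad_batch(batch, side="right")
print(left)
print(right)
```

In the real library the equivalent setting is the one quoted in the error, `tokenizer.padding_side = 'left'`, and it has to be set on the tokenizer object that actually tokenizes the generation inputs.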