AIPO: Improving Training Objective for Iterative Preference Optimization

...

Yaojie Shen, Xinyao Wang, Yulei Niu, Ying Zhou, Lexin Tang, Libo Zhang, Fan Chen, Longyin Wen