Jeremie Harris talks with Jan Leike, the head of AI alignment at OpenAI, about empirical alignment research, recursive reward modeling, and a possible alignment MVP.
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track:
0:00 Intro
1:35 Jan’s background
7:10 Timing of scalable solutions
16:30 Recursive reward modeling
24:30 Amplification of misalignment
31:00 Community focus
32:55 Wireheading
41:30 Arguments against the democratization of AIs
49:30 Differences between capabilities and alignment
51:15 Research to focus on
1:01:45 Formalizing an understanding of personal experience
1:04:04 OpenAI hiring
1:05:02 Wrap-up