
Rethinking the Role of PPO in RLHF

TL;DR: In RLHF, there is tension between the reward learning phase, which uses human preferences in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement …

Using ggplot2 for Visualization in R

Last Updated on September 26, 2023. One of the most popular plotting tools in R is not the set of plotting functions in base R but the ggplot2 library. People use it because it is flexible. The library also follows the “grammar of graphics” philosophy, which is not to generate a visualization upon a function …

Plotting Graphs in R

Last Updated on September 4, 2023. Visualizing data can help people understand it better. As a data analytics platform, R provides some advanced plotting functions. In this post, you will learn how to use the built-in plot functions to create some common visualizations. Specifically, you will learn how to create: line plot, scatter plot …