Sherjil Ozair · Apr 19, 2025 · 9:42 PM UTC

Sherjil Ozair @sherjilozair

Apr 19

I can't believe that we'll soon have drop-in remote workers, indistinguishable from human workers, and that they will be trained purely on behavioral inputs/outputs. Total behaviorism victory!

Sherjil Ozair · Apr 19, 2025 · 10:06 PM UTC

Sherjil Ozair @sherjilozair

Apr 19

The best thing about supervised learning is that you can clone any static system arbitrarily well. The downside is that supervised learning only works for static targets. This is an AI koan and also career advice.
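
A toy illustration of the point above, offered as a sketch and not as anything from the original post: a least-squares line is fit to samples from a fixed system, then scored against that same system and against a version whose behaviour changed after training. The names `static_system` and `drifted_system`, the linear form, and the noise level are all assumptions chosen for the example.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def mse(model, system, xs):
    """Mean squared error of the fitted line against a target system."""
    a, b = model
    return sum((a * x + b - system(x)) ** 2 for x in xs) / len(xs)


def static_system(x):
    # Hypothetical target whose input/output behaviour never changes.
    return 2.0 * x + 1.0


def drifted_system(x):
    # Hypothetical: the same system after its behaviour has shifted.
    return -2.0 * x + 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable

# "Clone" the static system from logged input/output pairs.
train_xs = [rng.uniform(-1, 1) for _ in range(200)]
train_ys = [static_system(x) + rng.gauss(0, 0.01) for x in train_xs]
model = fit_line(train_xs, train_ys)

# Score the clone on fresh inputs against both versions of the system.
test_xs = [rng.uniform(-1, 1) for _ in range(200)]
static_error = mse(model, static_system, test_xs)
drifted_error = mse(model, drifted_system, test_xs)

print(f"error vs static target:  {static_error:.6f}")
print(f"error vs drifted target: {drifted_error:.6f}")
```

The clone tracks the static target closely, while the error against the drifted target is orders of magnitude larger, because nothing in the training data described the change.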

Rohan Saphal · Apr 19, 2025 · 10:13 PM UTC

Rohan Saphal @RohanSaphal

Apr 19

Replying to @sherjilozair

Do you see a future where this behaviour-cloned policy becomes the initialisation for an RL agent that then adapts to dynamic systems?

Sherjil Ozair · Apr 19, 2025 · 10:15 PM UTC

Sherjil Ozair @sherjilozair

Apr 19

Yes! But it will require more than just vanilla RL, which also requires a static environment.