Most ML advice obsesses over technique, but Schulman argues the harder and higher-leverage skill is taste — knowing which problems are worth your next six months. Written in 2017 for the first cohort of OpenAI Fellows and still passed around as required reading, the essay distills how a creator of PPO actually decides what to work on, and how to keep momentum when results refuse to cooperate.
What Sets It Apart
- Idea-driven vs goal-driven research. Chasing the literature invites getting scooped; a concrete goal hands you a perspective no one else has, so a solo researcher without a famous lab can still outrun the pack.
- Aim high, climb incrementally. A 10% gain matters only as a step toward a 10X goal, and a method's allowed complexity scales with its payoff: a 10% improvement had better be two lines of code, or no one (not even you) will use it.
- Restrict yourself to general solutions. Hitting your target with a domain-specific hack advances nothing; constrain the search to methods that transfer to other problems.
- The notebook-and-review loop. Daily entries plus a condensed review every week or two (findings, insights, code progress, next steps) turn scattered work into compounding progress and an honest accounting of where your time went.
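The notebook-and-review loop is the one habit here that is mechanical enough to sketch. The essay describes a practice, not a tool, so the following is only an illustration of what the condensing step might look like if the notebook were kept as structured notes. The names `DailyEntry`, `SECTIONS`, and `condense` are hypothetical, and the sample entries are made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Section names follow the review headings listed in the bullet above.
SECTIONS = ("findings", "insights", "code progress", "next steps")


@dataclass
class DailyEntry:
    """One day's notes (hypothetical structure, not from the essay)."""
    day: date
    # Each note is a (section, text) pair; section should be one of SECTIONS.
    notes: list[tuple[str, str]] = field(default_factory=list)


def condense(entries: list[DailyEntry], end: date, days: int = 7) -> dict[str, list[str]]:
    """Group the notes from the `days`-day window ending on `end` by section."""
    start = end - timedelta(days=days - 1)
    review: dict[str, list[str]] = {section: [] for section in SECTIONS}
    for entry in sorted(entries, key=lambda e: e.day):
        if start <= entry.day <= end:
            for section, text in entry.notes:
                review[section].append(f"{entry.day.isoformat()}: {text}")
    return review


# Made-up entries, purely for illustration.
entries = [
    DailyEntry(date(2024, 1, 2), [("findings", "baseline diverges at high learning rate")]),
    DailyEntry(date(2024, 1, 4), [("code progress", "added evaluation script"),
                                  ("next steps", "sweep learning rate")]),
]

review = condense(entries, end=date(2024, 1, 7), days=7)
for section, items in review.items():
    print(section.upper())
    for item in items:
        print("  -", item)
```

Grouping by section only collects the raw notes. The condensing that the review is meant to do, deciding which findings matter and what to try next, still has to be done by reading them.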
Great Fit / Look Elsewhere
Great fit if you are a grad student, fellow, or early-career researcher choosing a direction and you want a battle-tested operating system for multi-month research projects. Look elsewhere if you want hands-on tutorials, math derivations, or code — this is about judgment and process, not implementation.
