Comparing Sampling Strategies

| Strategy | What it does | Trade-off |
| --- | --- | --- |
| Temperature | Reshapes the entire distribution | Simple, but low-probability junk words can still be picked |
| Top-k | Hard cutoff at k words | Fixed k does not adapt to model confidence |
| Top-p | Adaptive cutoff by cumulative probability | Handles both confident and uncertain positions well |

In practice, these are combined. A typical production configuration might use temperature=0.7, top_p=0.9, and top_k=50 together. Temperature reshapes the distribution, then top-k and top-p trim the tail. Our implementation supports all three, individually or combined.
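
The combined pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation this book refers to: the function name `sample_token` and its parameters are hypothetical, and the choice to renormalize the surviving probabilities after top-k before applying the top-p cutoff is an assumption (libraries differ on this detail).

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token index from raw logits (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # 1. Temperature: scale the logits, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # 2. Top-k: keep only the k most probable tokens (always at least one).
    if top_k is not None:
        order = order[:max(1, top_k)]

    # 3. Top-p: keep the smallest prefix whose cumulative probability
    #    reaches top_p, measured over the tokens that survived top-k
    #    (an assumption of this sketch).
    if top_p is not None:
        mass = sum(probs[i] for i in order)
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i] / mass
            if cumulative >= top_p:
                break
        order = kept

    # 4. Sample from the survivors; random.choices normalizes the weights.
    weights = [probs[i] for i in order]
    return rng.choices(order, weights=weights, k=1)[0]
```

Calling `sample_token(logits, temperature=0.7, top_k=50, top_p=0.9)` mirrors the configuration mentioned above, while leaving `top_k` and `top_p` as `None` reduces the function to plain temperature sampling. Passing a seeded `random.Random` instance as `rng` makes the draws reproducible.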