Could we have an option to limit the number of tokens generated for thinking, via parameters equivalent to
--reasoning-budget and --reasoning-budget-message in the llama.cpp CLI, either in the chat handler or elsewhere?
This would be very useful for controlling the thinking effort of reasoning models; I feel Gemma 4 and Qwen 3.6 in particular tend to overthink in thinking mode.
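For illustration, here is a minimal sketch of what such a cap could look like at the token-stream level: once a budget of thinking tokens is spent inside a reasoning span, a budget message is injected and the rest of the thinking tokens are dropped. The function name `cap_thinking`, the `<think>`/`</think>` tag strings, and the default budget message are all hypothetical, not part of any existing llama.cpp or chat-handler API.

```python
def cap_thinking(token_stream, budget,
                 budget_message="</think>",
                 think_open="<think>", think_close="</think>"):
    """Yield tokens from token_stream, but after `budget` tokens have been
    emitted inside a <think>...</think> span, inject `budget_message`
    (which should close the span) and drop the remaining thinking tokens.

    Hypothetical sketch -- tag strings and message are assumptions, not a
    real llama.cpp interface.
    """
    in_think = False    # are we currently inside a thinking span?
    truncated = False   # has the budget already been exhausted this span?
    used = 0            # thinking tokens emitted so far in this span
    for tok in token_stream:
        if tok == think_open:
            in_think = True
            used = 0
            yield tok
        elif tok == think_close:
            # Only emit the real close tag if we did not already force-close.
            if in_think and not truncated:
                yield tok
            in_think = False
            truncated = False
        elif in_think:
            if truncated:
                continue  # drop overflow thinking tokens
            used += 1
            yield tok
            if used >= budget:
                truncated = True
                yield budget_message  # force-close the reasoning span
        else:
            yield tok  # normal answer tokens pass through unchanged
```

In a real chat handler this filter would sit between the model's token generator and the response, so the answer portion of the output is untouched while thinking is bounded.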