A comparison of some of the most popular libraries that constrain LLM generation.
A verbose explanation of common autoregressive decoding methods: temperature, top-k, and top-p sampling.