Chainlit - hyperparameter tuning

To control the behavior of the LLM, we can change some of its generation hyperparameters, such as max_new_tokens, the sampling temperature, and the top_k and top_p values.
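To make the effect of these values concrete, here is a minimal, pure-Python sketch of how temperature, top_k and top_p are commonly combined when picking the next token. It is an illustration only, not the code CTransformers runs: the function names `candidate_probs` and `sample_token` are hypothetical, and treating `top_k=0` as "disabled" and `temperature=0` as greedy decoding are assumptions of this sketch.

```python
import math
import random


def candidate_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} for the tokens that survive filtering.

    Hypothetical helper for illustration; requires temperature > 0.
    """
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 means "no limit" here).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Pick one token index; temperature <= 0 falls back to greedy decoding."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    candidates = candidate_probs(logits, temperature, top_k, top_p)
    tokens = list(candidates)
    weights = list(candidates.values())
    return rng.choices(tokens, weights=weights, k=1)[0]
```

For example, `sample_token([2.0, 1.0, 0.0, -1.0], top_k=2)` can only ever return index 0 or 1, because the two least likely tokens are filtered out before sampling.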

Note

For more information on the hyperparameters that you can play with, please refer to the CTransformers documentation.

A video tutorial is available for this section.

The implementation

Let’s first import the necessary libraries:

import chainlit as cl
from chainlit.input_widget import Slider, Switch

To change the hyperparameters, we will use the Slider and Switch widgets. The Slider widget lets the user set a numeric hyperparameter by dragging a slider between a minimum and a maximum value. The Switch widget lets the user toggle a setting between two states, on and off.

# `await` is only valid inside an async function, so the settings are
# created in a chat-start handler (the function name `start` is arbitrary).
@cl.on_chat_start
async def start():
    settings = await cl.ChatSettings(
        [
            Slider(
                id="Temperature",
                label="Temperature",
                initial=1,
                min=0,
                max=2,
                step=0.1,
            ),
            Slider(
                id="Repetition Penalty",
                label="Repetition Penalty",
                initial=0.3,
                min=0,
                max=2,
                step=0.1,
            ),
            Slider(
                id="Top P",
                label="Top P",
                initial=0.7,
                min=0,
                max=1,
                step=0.1,
            ),
            Slider(
                id="Top K",
                label="Top K",
                initial=42,
                min=0,
                max=100,
                step=1,
            ),
            Slider(
                id="Max New Tokens",
                label="Max New Tokens",
                initial=256,
                min=0,
                max=1024,
                step=1,
            ),
            # Boolean toggle: stream tokens to the UI as they are generated.
            Switch(id="Streaming", label="Stream Tokens", initial=True),
        ]
    ).send()

The settings panel.

As the figure above shows, the user can now adjust the hyperparameters with the sliders. The Streaming switch controls whether tokens are streamed to the chat as they are generated. If the switch is turned off, the response is displayed all at once, after generation has finished.
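The values chosen in the panel come back as a dictionary keyed by widget id, so they still need to be translated into generation arguments before they reach the model. The sketch below shows one way to do that, assuming the model accepts CTransformers-style keyword names (`temperature`, `repetition_penalty`, `top_p`, `top_k`, `max_new_tokens`, `stream`). The mapping table and the helper `to_generation_kwargs` are hypothetical names introduced here for illustration; they are not part of Chainlit or CTransformers.

```python
# Hypothetical mapping from the widget ids defined in the settings panel
# to CTransformers-style generation keyword names (an assumption of this sketch).
WIDGET_TO_KWARG = {
    "Temperature": "temperature",
    "Repetition Penalty": "repetition_penalty",
    "Top P": "top_p",
    "Top K": "top_k",
    "Max New Tokens": "max_new_tokens",
    "Streaming": "stream",
}

# Slider values may arrive as floats, but these two must be whole numbers.
INTEGER_KWARGS = {"top_k", "max_new_tokens"}


def to_generation_kwargs(settings):
    """Convert a {widget_id: value} dict into generation keyword arguments."""
    kwargs = {}
    for widget_id, name in WIDGET_TO_KWARG.items():
        if widget_id not in settings:
            continue  # leave unset values to the model's defaults
        value = settings[widget_id]
        if name in INTEGER_KWARGS:
            value = int(value)
        elif name == "stream":
            value = bool(value)
        else:
            value = float(value)
        kwargs[name] = value
    return kwargs
```

With the initial values from the panel, `to_generation_kwargs(settings)` yields a dictionary that can be unpacked into the model call with `**`. In Chainlit, the same conversion could also be applied inside a handler registered with `@cl.on_settings_update`, which receives the updated dictionary whenever the user changes a value.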