Are you tired of spending hours manually fine-tuning your prompts to get better results from your AI models? Then let me introduce Promptimizer, a new experimental, open-source library that automates prompt improvement based on systematic evaluation and human feedback.
The principle is simple: you provide an initial prompt, a dataset, and custom evaluators, and Promptimizer takes care of automatically optimizing your prompt to get the best possible results.
Let's take a concrete example: suppose you want to build an optimized tweet generator. Here's how to do it step by step. First, install the library:

```bash
pip install -U promptim
```
Next, configure the API keys. Make sure you have set up your keys for LangSmith and for the model of your choice (such as Anthropic's Claude):
```bash
export LANGSMITH_API_KEY=YOUR_KEY
export ANTHROPIC_API_KEY=YOUR_KEY
```
Then create a task like this:

```bash
promptim create task ./mon-generateur-tweets \
  --name generateur-tweets \
  --prompt langchain-ai/tweet-generator-example-with-nothing:starter \
  --dataset DATASET_URL \
  --description "Generation of informative tweets on various topics" \
  -y
```
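The `--dataset` flag expects a LangSmith dataset (its name or URL). If you don't have one yet, here is a minimal sketch of how you might create one with the LangSmith Python SDK; the dataset name `tweet-topics` and the input keys `sujet` and `auteur` are assumptions chosen to match the prompt template shown later in this article:

```python
from langsmith import Client

client = Client()  # reads LANGSMITH_API_KEY from the environment

# Hypothetical dataset name; pass the same name (or its URL) to --dataset
dataset = client.create_dataset(dataset_name="tweet-topics")

# Each example supplies the variables the prompt template expects
client.create_examples(
    inputs=[
        {"sujet": "quantum computing", "auteur": "a popular-science writer"},
        {"sujet": "offshore wind farms", "auteur": "a tech journalist"},
    ],
    dataset_id=dataset.id,
)
```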
The real strength of Promptimizer lies in its evaluators: small pieces of code that analyze your prompt's outputs and score them against your criteria. Imagine, for example, that you want to avoid hashtags in your generated tweets. Here is what your evaluator could look like:
```python
from langsmith.schemas import Example, Run

def evaluateur_tweets(run: Run, example: Example) -> dict:
    # Retrieve the model's prediction
    prediction = run.outputs["output"]
    # Convert the prediction to a string
    resultat = str(prediction.content)
    # Score = 1 if there is no hashtag, 0 otherwise
    score = int("#" not in resultat)
    return {
        "key": "sans_hashtags",
        "score": score,
        "comment": "Passed: tweet without hashtags" if score == 1 else "Failed: remove the hashtags"
    }
```
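Evaluators all share this signature, so you can stack several criteria. As an illustration, here is a second, hypothetical evaluator that enforces the 280-character limit; the function name `evaluateur_longueur` and the `longueur_tweet` key are my own choices, not something defined by Promptimizer:

```python
from langsmith.schemas import Example, Run

def evaluateur_longueur(run: Run, example: Example) -> dict:
    # Retrieve the model's prediction, exactly as in the evaluator above
    prediction = run.outputs["output"]
    texte = str(prediction.content)
    # Score = 1 if the tweet fits in 280 characters, 0 otherwise
    score = int(len(texte) <= 280)
    return {
        "key": "longueur_tweet",
        "score": score,
        "comment": "Passed: tweet fits in 280 characters" if score == 1 else "Failed: shorten the tweet"
    }
```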
When you launch the optimization with the `promptim train` command, the following process starts (a simplified sketch of this loop comes right after the list):
- Initial assessment: Promptimizer tests your prompt on the validation set to establish a baseline
- Iterative optimization: the system analyzes results in batches and suggests improvements via a meta-prompt
- Continuous validation: each modification is tested to make sure it actually improves performance
- Human feedback (optional): you can step into the process and evaluate results manually
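To make that loop concrete, here is a deliberately simplified sketch in Python. It is not Promptimizer's actual implementation; `evaluate` and `propose_improvement` are placeholder callables standing in for the evaluator runs and the meta-prompt call:

```python
import random

def optimize_prompt(prompt, train_set, dev_set, evaluate, propose_improvement,
                    epochs=3, batch_size=8):
    """Simplified sketch of a prompt-optimization loop (not promptim's real code)."""
    # 1. Initial assessment: score the starting prompt on the validation set
    best_prompt, best_score = prompt, evaluate(prompt, dev_set)

    for _ in range(epochs):
        random.shuffle(train_set)
        # 2. Iterative optimization: work through the training data in batches
        for start in range(0, len(train_set), batch_size):
            batch = train_set[start:start + batch_size]
            # A meta-prompt (here a placeholder) suggests a revised prompt
            candidate = propose_improvement(best_prompt, batch)
            # 3. Continuous validation: keep the change only if it really helps
            candidate_score = evaluate(candidate, dev_set)
            if candidate_score > best_score:
                best_prompt, best_score = candidate, candidate_score

    return best_prompt, best_score
```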
To add this human dimension to the optimization, use the `--annotation-queue` option:

```bash
promptim train --task ./mon-generateur-tweets/config.json --annotation-queue ma_queue
```
You can then review the results in the LangSmith interface and provide your feedback, which Promptimizer will fold back into its improvement process.
The `config.json` file lets you fine-tune Promptimizer's behavior:
```json
{
  "name": "Tweet Generator",
  "dataset": "mon_dataset",
  "initial_prompt": {
    "prompt_str": "Write a tweet about {sujet} in the style of {auteur}",
    "which": 0
  },
  "description": "Generation of engaging tweets adapted to the requested style",
  "optimizer": {
    "model": {
      "name": "gpt-3.5-turbo",
      "temperature": 0.7
    }
  }
}
```
Finally, the usual best practices apply:
- Start small: test on a small dataset before scaling up
- Refine your evaluators: the quality of the optimization depends directly on the relevance of your evaluation criteria
- Use debug mode: the `--debug` option lets you follow the optimization process in detail
- Experiment with hyperparameters: adjust the batch size and the number of epochs to find the right balance (see the example command below)
By combining systematic evaluation and human feedback, Promptimizer offers a pragmatic yet effective approach to improving the performance of your AI systems. Now it's your turn! And thanks to Lorenper for the tip!