As LLM apps become increasingly popular, one of the challenges engineers face is designing optimal prompts. The process usually requires repeated trial and error. Moreover, LLM output is probabilistic: results can vary from run to run, which makes prompt debugging even harder. Engineers can spend hours or more optimizing a single prompt. Mastering how to write and optimize effective prompts has therefore become an urgent and growing need.
Today, we introduce one of the strongest tools in the field: Ape, a prompt optimization tool developed by the YC-funded startup Weavel. It scored 93% on the GSM8K benchmark, exceeding the base LLM's 70% and DSPy's 86%.
Ape can also automatically generate evaluation code and use an LLM as a judge, or let you define your own evaluation metrics.
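To make the LLM-as-judge idea concrete, here is a minimal sketch of how such an evaluation loop might look. All function names are illustrative, not Ape's actual API; the judge model is passed in as a plain callable so the example stays self-contained.

```python
# Hypothetical LLM-as-judge evaluation sketch (not Ape's real API).
# `call_llm` stands in for any function that sends a prompt to a model
# and returns its text reply.

def llm_judge(call_llm, question, answer, criteria="correctness"):
    """Ask a judge LLM to rate an answer; normalize the score to 0..1."""
    prompt = (
        f"Rate the following answer for {criteria} on a scale of 0 to 10.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with a single integer."
    )
    raw = call_llm(prompt)
    # Clamp to the expected range in case the judge misbehaves.
    return max(0, min(10, int(raw.strip()))) / 10.0

def evaluate(call_llm, dataset):
    """Average the judge's score over (question, answer) pairs."""
    scores = [llm_judge(call_llm, q, a) for q, a in dataset]
    return sum(scores) / len(scores)

# Stub judge that always replies "8", standing in for a real model call.
fake_llm = lambda prompt: "8"
score = evaluate(fake_llm, [("2+2?", "4"), ("Capital of France?", "Paris")])
print(score)  # → 0.8
```

In practice the averaged score becomes the feedback signal that drives prompt revision: candidate prompts are scored on a dataset, and higher-scoring variants are kept.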
Ape's core philosophy is simple:
Good input + correct guidance = better prompts.
Ape's design is influenced by DSPy: prompts are self-optimized through data plus iteration. Thanks to the…