Logprobs, short for log probabilities, are the logarithms of the probabilities that a model assigns to each token, given the tokens that precede it in the context. They let you gauge a model’s confidence in its outputs and explore the alternative responses it considered, which makes them useful for applications such as classification tasks, retrieval evaluations, and autocomplete suggestions.

One big use case for logprobs is assessing how confident a model is in its answer. For example, if you were building a classifier to categorize emails into 5 categories, logprobs let you get back both the category and the model’s confidence in that token: the LLM might categorize an email as “Spam” with 87% confidence. You can then make decisions based on this probability, such as having a larger LLM classify an email when the confidence is too low.
Returning logprobs
To return logprobs from our API, simply add `logprobs: 1` to your API call, as seen below.
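For example, with the Together Python SDK the call might look like the following sketch (the model and prompt here are illustrative choices, not fixed by the API):

```python
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "user", "content": "What is the largest city in the US?"}
    ],
    logprobs=1,  # return the log probability of each generated token
)
```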
Response of returning logprobs
Here’s the response you can expect: both the tokens and the log probability of each token are included in the logprobs of each choice.
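Here is a sketch of reading them back, assuming the response exposes parallel tokens and token_logprobs lists on each choice (the printed values are illustrative, apart from the first token, which we reuse below):

```python
# Walk the generated tokens alongside their log probabilities.
logprobs = response.choices[0].logprobs
for token, logprob in zip(logprobs.tokens, logprobs.token_logprobs):
    print(f"{token!r}: {logprob}")

# Illustrative output:
# 'New': -0.39648438
# ' York': -0.01234567
# ' City': -0.00512345
```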
Converting logprobs to probabilities

Let’s take the first token from the previous example: { "New": -0.39648438 }. The “New” token has a logprob of -0.39648438, but this isn’t very helpful by itself. However, we can quickly convert it to a probability by taking its exponential: e^(-0.39648438) ≈ 0.6727, so the model assigned roughly a 67% probability to “New” as the next token.
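In Python, the conversion is a one-liner:

```python
import math

# exp() inverts the natural log, recovering the probability
print(f"{math.exp(-0.39648438):.2%}")  # 67.27%
```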
A practical example for logprobs: Classification
In this example, we’re building an email classifier, and we want to know how confident the model is in its answer. We give the LLM 4 categories in the system prompt, then pass in an example email. The model returns its chosen category with a logprob of -0.012512207 for that token. After taking the exponential of this, we get a probability of about 98.8%. We’re using a small and fast LLM here (Llama 3.1 8B), which keeps costs down, and using logprobs, we can also tell when the model is unsure of its answer and see if we need to route the email to a bigger LLM.
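Here is a minimal sketch of that classifier with a confidence-based fallback. The category names, threshold, and model IDs are illustrative choices, and the logprobs fields are read the same way as in the response shown earlier:

```python
import math

from together import Together

client = Together()

CATEGORIES = ["General", "Billing", "Technical Support", "Spam"]  # illustrative
CONFIDENCE_THRESHOLD = 0.90  # below this, retry with a bigger model

SYSTEM_PROMPT = (
    "Classify the email into exactly one of these categories: "
    + ", ".join(CATEGORIES)
    + ". Respond with only the category name."
)


def classify(email: str, model: str = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": email},
        ],
        logprobs=1,
        max_tokens=5,
    )
    label = response.choices[0].message.content.strip()
    # Confidence of the first generated token; for multi-token category
    # names you could sum all token logprobs before exponentiating instead.
    confidence = math.exp(response.choices[0].logprobs.token_logprobs[0])
    return label, confidence


email = "Congratulations! You've won a free cruise. Click here to claim it."
label, confidence = classify(email)
if confidence < CONFIDENCE_THRESHOLD:
    # Route low-confidence emails to a larger model.
    label, confidence = classify(
        email, model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"
    )
print(f"{label}: {confidence:.1%}")
```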
Conclusion
We were able to use logprobs to build a more robust and cheaper classifier: a smaller model handles most queries, and bigger models are used selectively when the confidence is low. There are many other use cases for logprobs, including autocompletion, keyword selection, and moderation.