
What are the context window limits?

The context window limit we support depends on the model.

Context window limits explained

When we talk about context window limits, there are a few key concepts to understand:
  • Total Context Window (or Token Limit): This represents the maximum number of tokens a model can handle, including both the input tokens it can process and the output tokens it can generate. Think of it as the overall capacity of the model to work with text.
  • Output Token Limit: This is the maximum number of tokens a model can produce in its response. Depending on the specific model, this limit typically falls between 2,000 and 4,000 tokens.
  • Input Token Limit: This is the remaining token capacity for the input, calculated by subtracting the Output Token Limit from the Total Context Window (see the short example after this list). Essentially, it's the space available for the model to read and process the input text.
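
For example, here is a minimal sketch of this arithmetic in Python. The specific numbers (a 64K total window and a 4,000-token output limit) are illustrative only and are not tied to any particular model:

```python
# Illustrative numbers only; actual limits vary by model.
TOTAL_CONTEXT_WINDOW = 64_000   # total tokens the model can handle
OUTPUT_TOKEN_LIMIT = 4_000      # maximum tokens the model may generate

# The input limit is whatever capacity remains after reserving
# room for the model's output.
input_token_limit = TOTAL_CONTEXT_WINDOW - OUTPUT_TOKEN_LIMIT
print(input_token_limit)  # 60000
```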

It's important to note that tokens are not the same as words. Depending on the complexity of the language, one token could represent a single character, a part of a word, or a whole word.
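
To see this in practice, you can count tokens with an open-source tokenizer. The sketch below uses the tiktoken library's cl100k_base encoding purely as an illustration; the tokenizers behind You.com's models may split text differently:

```python
import tiktoken  # pip install tiktoken

# cl100k_base is one common encoding; different models use different ones.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization splits text into subword units."
tokens = enc.encode(text)

print(len(text.split()), "words")  # 6 words
print(len(tokens), "tokens")       # usually more tokens than words
```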

Understanding these context window limits helps in managing the amount of text that can be processed and generated by the model.

What are You.com's context window limits?

  • Most of our models have a maximum context window of 64K tokens or more. For these models, we commit to supporting at least 64K tokens.
  • Some of our models have a maximum context window of less than 64K tokens, for example:
    • Llama 3 supports 8K tokens.
    • Dolphin 2.5 supports 16K tokens.
    • DBRX-Instruct supports 32K tokens.
    For these models, we commit to supporting up to their respective context lengths (see the sketch below).
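
As a rough illustration of how these per-model limits could be applied, the sketch below trims an input so that it, plus a reserved output budget, fits a given model's window. The model keys and the use of tiktoken as the tokenizer are assumptions for the example, not You.com's actual identifiers or tokenizers:

```python
import tiktoken  # illustrative tokenizer; not necessarily what these models use

# Context windows from the list above, in tokens.
# Keys are hypothetical identifiers, not official model names.
CONTEXT_WINDOWS = {
    "llama-3": 8_000,
    "dolphin-2.5": 16_000,
    "dbrx-instruct": 32_000,
}

def fit_to_window(text: str, model: str, output_limit: int = 2_000) -> str:
    """Truncate text so the input plus a reserved output budget fits the window."""
    enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding
    input_budget = CONTEXT_WINDOWS[model] - output_limit
    tokens = enc.encode(text)
    if len(tokens) <= input_budget:
        return text
    return enc.decode(tokens[:input_budget])
```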

For information on how context window limits play into file uploads, click here.