Google has opened up access to the Gemini 1.5 Pro model with a 2-million-token input length, a capability first unveiled at Google I/O in May 2024.
A longer context window lets the model process larger and more complex inputs, such as entire books or large collections of organizational documents. This enables applications like organizational knowledge bases that employees can query through bots. The trade-off is cost: longer inputs are more expensive to process, which prompted Google to introduce context caching in the Gemini API. With caching, the same content no longer has to be re-sent and re-billed on every request; cached input tokens are priced lower than fresh ones, and developers can configure how many tokens to cache and for how long.
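As a rough illustration, here is a minimal sketch of context caching with Google's `google-generativeai` Python SDK. The model identifier, file path, display name, and TTL are placeholder assumptions, and the exact API surface may differ between SDK versions:

```python
import datetime

import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # assumption: key provided by the caller

# Upload a large document once (placeholder path) so it can be cached.
document = genai.upload_file(path="org_handbook.pdf")

# Create a cache entry: requests that reuse it are billed the lower
# cached-input rate for the document's tokens instead of the normal rate.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-pro-001",   # assumed model identifier
    display_name="org-handbook-cache",
    contents=[document],
    ttl=datetime.timedelta(hours=1),     # caching duration is configurable
)

# Bind a model to the cached content and query it repeatedly; only the
# new question tokens are billed at the normal input rate.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("What is the vacation policy?")
print(response.text)
```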
Pricing examples from the Gemini API webpage:
– Normal input: $3.50 per 1 million tokens (for prompts up to 128K tokens)
– Cached input: $0.875 per 1 million tokens (for prompts up to 128K tokens)
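At these rates, cached input tokens cost 75% less than normal ones. As a hypothetical back-of-the-envelope comparison (the document size and query count below are assumptions, and the separate per-hour cache storage fee is not included):

```python
# Hypothetical workload: a 100,000-token document queried 50 times.
tokens = 100_000
queries = 50

normal = queries * tokens / 1_000_000 * 3.50    # document resent at the normal rate
cached = queries * tokens / 1_000_000 * 0.875   # document billed at the cached rate

print(f"normal: ${normal:.2f}, cached: ${cached:.2f}")
# normal: $17.50, cached: $4.38
```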
TLDR: Google has made the Gemini 1.5 Pro model's 2-million-token input length, first shown at Google I/O 2024, available to developers. The extended context window allows processing of complex inputs such as entire books, but at higher cost, which context caching in the Gemini API helps offset by pricing cached input tokens below normal ones.