Google has teamed up with Sourcegraph, the developer of the AI coding assistant Cody, to improve AI-assisted coding. Together they tested the Gemini 1.5 model, which supports input sequences of up to 1 million tokens. How does such a long context window improve the quality of answers?
Cody uses AI to read the code in an organization's codebase, helping developers find existing code and write new code. It integrates with popular IDEs such as Visual Studio Code and JetBrains. Cody lets clients choose among popular language models on the market, such as Claude 3/3.5, GPT-4o, Gemini, and Mixtral. These models typically operate with a context window of 10,000 tokens (10k).
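To see why the window size matters: with a 10k-token budget, retrieved code has to be ranked and trimmed until the prompt fits. The sketch below is purely illustrative, not Cody's actual retrieval logic; `count_tokens` and `pack_context` are hypothetical names, and the 4-characters-per-token heuristic is a crude stand-in for a real tokenizer.

```python
# Illustrative sketch of packing retrieved code into a fixed context window.
# Not Cody's API: count_tokens and pack_context are hypothetical.

CONTEXT_BUDGET = 10_000  # tokens available for retrieved context

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)

def pack_context(snippets: list[str], budget: int = CONTEXT_BUDGET) -> list[str]:
    """Greedily keep the highest-ranked snippets that fit in the budget."""
    packed, used = [], 0
    for snippet in snippets:  # assumed pre-sorted by relevance
        cost = count_tokens(snippet)
        if used + cost > budget:
            continue  # this snippet would overflow the window; drop it
        packed.append(snippet)
        used += cost
    return packed
```

A 1M-token window loosens this budget dramatically, so far more of the relevant codebase can be included instead of dropped.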
Sourcegraph experimented with Gemini 1.5 Flash and its 1M-token context window, comparing the results along four dimensions: Essential Recall, Essential Concision, Helpfulness, and Hallucination. Answers produced with the longer input were more detailed and comprehensive, with less noise and irrelevant content, since the model could ground its response in the code already present in its context instead of padding an already targeted answer with extraneous information.
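Sourcegraph has not published its scoring code; as a loose illustration of how such a rubric could be automated, here is a sketch that asks a judge model to rate answers on each of the four named dimensions. The prompt wording, the 1-5 scale, and the `judge` callable are all assumptions, not Sourcegraph's method.

```python
# Hypothetical rubric-based scoring. The dimension names come from the
# write-up; the judging mechanism and scale are assumed for illustration.

DIMENSIONS = {
    "essential_recall": "Does the answer include all facts needed to be correct?",
    "essential_concision": "Is the answer free of filler and redundancy?",
    "helpfulness": "Would the answer actually help the developer?",
    "hallucination": "Does the answer invent APIs, files, or behavior?",
}

def score_answer(answer: str, reference: str, judge) -> dict[str, int]:
    """Ask a judge model (any callable: prompt -> text) to rate 1-5 per dimension."""
    scores = {}
    for name, question in DIMENSIONS.items():
        prompt = (
            f"Reference context:\n{reference}\n\n"
            f"Candidate answer:\n{answer}\n\n"
            f"{question} Reply with a single integer from 1 to 5."
        )
        scores[name] = int(judge(prompt).strip())
    return scores
```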
The drawback of a 1M-token input is the increased time to first token, which grows roughly linearly with the length of the input. Sourcegraph says it is addressing this by prefetching data before the model runs and caching it, reducing the wait from 30-40 seconds to just 5 seconds.
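Sourcegraph doesn't detail its implementation, but the general prefetch-and-cache pattern it describes might look like the following sketch, where a hypothetical `fetch_context` stands in for whatever slow retrieval step dominates the time to first token.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# Sketch of prefetching: start the expensive retrieval as soon as the user
# opens a repo, so it is cached by the time they submit a question.

_executor = ThreadPoolExecutor(max_workers=2)
_futures: dict[str, Future] = {}

def fetch_context(repo: str) -> str:
    """Stand-in for the slow retrieval step (repo search, embedding lookup)."""
    time.sleep(2)  # shortened stand-in for the real 30-40 second retrieval
    return f"<relevant code from {repo}>"

def call_model(context: str, question: str) -> str:
    """Stand-in for the actual LLM call."""
    return f"answer to {question!r} grounded in {len(context)} chars of context"

def prefetch(repo: str) -> None:
    """Kick off retrieval early; the result is cached as a future."""
    if repo not in _futures:
        _futures[repo] = _executor.submit(fetch_context, repo)

def answer(repo: str, question: str) -> str:
    prefetch(repo)  # no-op if retrieval already started
    context = _futures[repo].result()  # returns instantly when prefetched
    return call_model(context, question)
```

Moving retrieval off the critical path this way leaves only model generation between the question and the first token.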
TLDR: Google and Sourcegraph collaborated to test the Gemini 1.5 model with a 1M context window for coding assistance, resulting in more detailed answers but with increased time to the first token, which Sourcegraph is working on reducing through prefetching and caching techniques.