Databricks Proposes an LLM Training Approach That Matches GPT-4o Using Only Usage Logs, Without Building a Data Set

Databricks has introduced Test-time Adaptive Optimization (TAO), a method for tuning LLMs aimed at organizations that already use LLMs and want models tailored to their own needs. TAO consists of four steps:

1. Generate responses to past prompts that the organization has already stored.
2. Score the quality of each response against its input prompt, using an LLM or other software as the assessor.
3. Train the model with reinforcement learning (RL) so that it achieves the best possible scores.
4. Collect new data as the model is used, then retrain starting again from step 1.
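
The four steps above can be sketched as a small loop. This is only a toy illustration, not Databricks' implementation: the "model" is a lookup table of response probabilities rather than an LLM, `toy_scorer` stands in for an LLM judge, and `rl_update` is a simple exponentiated-weight update with a mean baseline chosen for brevity. All function and variable names here are hypothetical.

```python
import math
import random
from typing import Callable, Dict, List

# Hypothetical toy policy: maps each possible response to a sampling probability.
# A real system would sample text from an LLM instead of a lookup table.
Policy = Dict[str, float]
Scorer = Callable[[str, str], float]


def generate(policy: Policy, prompt: str, n: int, rng: random.Random) -> List[str]:
    """Step 1: sample n candidate responses for one logged prompt.

    The toy policy ignores the prompt; an LLM would condition on it.
    """
    responses = list(policy)
    weights = [policy[r] for r in responses]
    return rng.choices(responses, weights=weights, k=n)


def toy_scorer(prompt: str, response: str) -> float:
    """Step 2: stand-in for an LLM judge; here it simply rewards SQL answers."""
    return 1.0 if response == "answer_in_sql" else 0.0


def rl_update(policy: Policy, samples: List[str], scores: List[float],
              lr: float = 0.5) -> Policy:
    """Step 3: shift probability toward responses that beat the batch average."""
    baseline = sum(scores) / len(scores)
    updated = dict(policy)
    for response, s in zip(samples, scores):
        updated[response] *= math.exp(lr * (s - baseline))
    total = sum(updated.values())
    return {r: w / total for r, w in updated.items()}  # renormalize to sum to 1


def tao_round(policy: Policy, prompts: List[str], scorer: Scorer,
              rng: random.Random, n: int = 4) -> Policy:
    """Run steps 1-3 once over a batch of logged prompts."""
    for prompt in prompts:
        samples = generate(policy, prompt, n, rng)
        scores = [scorer(prompt, r) for r in samples]
        policy = rl_update(policy, samples, scores, lr=0.5)
    return policy


if __name__ == "__main__":
    rng = random.Random(0)
    policy = {"answer_in_sql": 1 / 3, "answer_in_prose": 1 / 3, "refuse": 1 / 3}
    logged_prompts = ["total sales by region", "top 5 customers", "orders last week"]
    for _ in range(5):  # Step 4: repeat the cycle (here, on the same logged prompts)
        policy = tao_round(policy, logged_prompts, toy_scorer, rng)
    print(policy)
```

Note that no labeled answers appear anywhere in the loop: the only inputs are logged prompts and a scorer, which is the property the article attributes to TAO.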

Although the RL training process can be resource-intensive, the resulting model uses about the same resources at inference time as the original model, so computational efficiency is not sacrificed. Databricks uses TAO to build models for customer service tasks, such as answering database queries or translating text into SQL. According to Databricks, this training method lets smaller models such as Llama 3.3 70B or Llama 3.1 8B reach performance on par with GPT-4o or o3-mini, which significantly reduces serving costs.

Notably, test-time fine-tuning is a crucial approach that many research teams use in the ARC-AGI competition, where models learn from as few as 3-5 examples.

TLDR:
Databricks presents Test-time Adaptive Optimization (TAO) for tuning models for organizations that already use LLMs. TAO generates responses to past prompts, scores their quality, trains with RL, and then collects new data to repeat the cycle. RL training is resource-intensive, but the resulting model is as efficient to run as the original. Databricks uses TAO to build models for customer service tasks, with smaller Llama models reaching performance comparable to larger models like GPT-4o. Test-time fine-tuning is also a key technique for learning from limited examples, as in ARC-AGI.
