A research team from UC San Diego has updated a time-limited version of the Turing Test that assesses whether artificial intelligence can pass as human in conversation. The team has run the test since 2023, and the latest findings show that GPT-4.5 was judged to be human more often than the actual human participants were.
In each round, a participant chatted with an AI and a human at the same time, limited to only 8 messages and 5 minutes. At the end, the participant had to decide which conversation partner was the AI and which was the human. In 2023, the best result was GPT-4's 41%. GPT-4.5 has now reached 73%, meaning it was picked as the human significantly more often than the real humans it was paired against.
The participants came from two groups: 126 undergraduate students from UC San Diego and 158 people recruited through the Prolific platform. GPT-4.5 scored highest in both groups, followed by Llama 3.1 405B, which performed notably well with the Prolific group.
The results indicated that the initial prompt given to each model greatly influenced its score. Whereas the traditional Turing Test was meant to assess “intelligence,” the short format used here forced participants to be strategic about how they tried to identify the AI. Even under those constraints, GPT-4.5’s result suggests that AI is increasingly able to emulate human behavior.
Source – arXiv
TLDR: UC San Diego researchers ran a time-limited Turing Test in which GPT-4.5 was judged to be human more often than real people were. The result shows AI’s growing ability to mimic human responses.