ChatGPT Performance Evaluation: Proficiency in Programming Challenges Shows Strong Suit in Outdated Tasks Pre-2021

A team of Chinese researchers tested ChatGPT on a programming challenge consisting of 728 problems in five popular programming languages (C, C++, Java, Python, JavaScript), and also analyzed 18 CWE (Common Weakness Enumeration) vulnerabilities. According to the team's evaluation, ChatGPT performed fairly well, solving 89% of easy problems, 71% of medium problems, and 40% of hard problems.

However, ChatGPT proved much weaker on problems introduced after 2021: its success rate fell to 52% on easy problems and to 0.66% on hard problems. The decline is attributed to ChatGPT having been trained on pre-2021 data and lacking human-like analytical thinking, so its problem-solving ability drops sharply on problems it has not seen before.
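To put the size of that decline in perspective, here is a minimal sketch that compares the figures quoted above. It assumes the 89% and 40% rates are the baseline against which the post-2021 rates are measured. The function name `drop_summary` and the dictionary layout are illustrative only and do not come from the paper.

```python
# Success rates (percent) as quoted in this article.
# Assumption: the first set of figures is the baseline for the comparison.
baseline = {"easy": 89.0, "hard": 40.0}
post_2021 = {"easy": 52.0, "hard": 0.66}


def drop_summary(before, after):
    """Return the absolute (percentage-point) and relative (percent) decline per difficulty."""
    summary = {}
    for level, old in before.items():
        new = after[level]
        summary[level] = {
            "points": round(old - new, 2),
            "relative_pct": round((old - new) / old * 100, 1),
        }
    return summary


summary = drop_summary(baseline, post_2021)
for level, stats in summary.items():
    print(f"{level}: -{stats['points']} points, -{stats['relative_pct']}% relative")
```

Read this way, the easy-problem rate loses 37 percentage points, while the hard-problem rate loses roughly 98% of its baseline value in relative terms.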

Source: IEEE Paper

TLDR: ChatGPT showed promise in solving programming challenges but struggled with problems introduced after 2021 due to lack of analytical thinking abilities, as reported by Chinese researchers.
