MLCommons is the organization behind some of the most widely used performance testing suites for artificial intelligence. It has now introduced a new AI Safety testing suite designed to measure the safety of large language models (LLMs). This is the first time MLCommons has released a testing suite for AI models themselves; its earlier MLPerf suite focused primarily on the performance of the hardware used to run and train AI models.
The initial tests in this suite assess how models respond to 7 categories of risky prompts: child sexual exploitation, support for the creation of violent weapons, hate, support for non-violent crimes, support for sexual crimes, support for self-harm or suicide, and support for violence. Each model under test is asked a total of 43,000 questions across these categories to determine whether its responses align with safety guidelines.
Grades are divided into 5 levels, ranging from low risk to high risk. The high risk grade is relative: a model is compared with the best-performing models currently available and is deemed high risk if it gives risky answers four times as often as the top model. The low risk grade, by contrast, is absolute: it is given when a model gives risky answers to less than 0.1% of the questions.
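The two thresholds described above can be sketched in a few lines of Python. This is only an illustrative sketch, not MLCommons' grading code: the function name `coarse_grade`, the constant names, and the reading of "four times" as "at least four times the top model's rate of risky answers" are all assumptions. The cutoffs for the three middle grades are not given here, so the sketch lumps them into a single placeholder.

```python
# Illustrative sketch only; names and threshold interpretation are assumptions,
# not MLCommons' actual grading implementation.

LOW_RISK_MAX_RATE = 0.001   # "less than 0.1%" of answers are risky
HIGH_RISK_MULTIPLIER = 4.0  # "four times" the top model's rate (assumed: at least 4x)


def risky_rate(risky_answers: int, total_questions: int) -> float:
    """Fraction of questions that received a risky answer."""
    if total_questions <= 0:
        raise ValueError("total_questions must be positive")
    if not 0 <= risky_answers <= total_questions:
        raise ValueError("risky_answers must be between 0 and total_questions")
    return risky_answers / total_questions


def coarse_grade(model_rate: float, top_model_rate: float) -> str:
    """Map a model's risky-answer rate to one of three coarse buckets.

    Only the two extremes are specified in the text; the three middle
    grades are returned as one placeholder bucket.
    """
    if model_rate < LOW_RISK_MAX_RATE:
        return "low risk"
    if model_rate >= HIGH_RISK_MULTIPLIER * top_model_rate:
        return "high risk"
    return "intermediate (cutoffs not specified)"


if __name__ == "__main__":
    total = 43_000  # number of questions mentioned in the article
    top = risky_rate(430, total)      # hypothetical top model: 1% risky answers
    model = risky_rate(2_150, total)  # hypothetical model under test: 5% risky answers
    print(coarse_grade(model, top))
```

The absolute low risk check runs first, so a model with a very small rate of risky answers is graded low risk no matter how it compares with the top model.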
MLCommons has showcased test results for models with fewer than 15B parameters, with grades ranging from high risk to medium-low risk. However, this initial version of the testing suite has not been fully optimized, and the published results are not attributed to specific models. MLCommons also plans to expand the suite to cover a total of 13 categories, of which 7 are currently open for testing.
Source: MLCommons
TLDR: MLCommons has introduced a new AI Safety testing suite to evaluate the safety of large language models, marking the first time it has released a testing suite for AI models themselves. The suite tests how models respond to risky prompts and grades them on those responses, with plans to expand the testing to cover a total of 13 categories.