Amazon introduced its Nova family of large language models in December 2024 and has since begun using it in consumer products such as Alexa+. Most recently, Amazon unveiled Amazon Nova Act, a model in the Nova family designed specifically to control a web browser. It is aimed at agentic AI tasks, and external developers can build on it through the Nova Act SDK.
Like the previously released OpenAI Operator and Google's Project Mariner, Nova Act is a model tuned specifically for web tasks: reading pages, operating them, recognizing common web iconography (such as the familiar 5-star review widget), and handling frequently encountered UI elements such as date pickers and city selection on a map. According to Amazon, this specialization lets it perform better than rival models in the same class.
Nova Act also ships with an SDK that supports common web commands (search, checkout, answering questions about what is on screen) as well as more complex conditional instructions (such as declining insurance upsells). It can also run in headless mode, completing tasks without showing a visible browser window.
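To make the command style concrete, here is a minimal Python sketch of how a shopping task might be broken into small natural-language steps, including a conditional one like the insurance upsell case. The helper names `build_steps` and `run_steps` are hypothetical and written for this illustration. The `main` function assumes the `nova-act` package is installed, an API key is configured, and that `NovaAct` accepts `starting_page` and `headless` arguments as in Amazon's published SDK examples; check the current SDK documentation before relying on it.

```python
from typing import List, Protocol


class BrowserAgent(Protocol):
    """Any object with an act(prompt) method, e.g. a Nova Act session."""

    def act(self, prompt: str): ...


def build_steps(product: str) -> List[str]:
    """Hypothetical helper: split one shopping task into small commands."""
    return [
        f"search for {product}",
        "select the first result",
        "add it to the cart",
        # Conditional instruction, similar to the insurance upsell case.
        "if an insurance or protection plan is offered, decline it",
    ]


def run_steps(agent: BrowserAgent, steps: List[str]) -> int:
    """Send each command to the agent in order; return how many were sent."""
    for step in steps:
        agent.act(step)
    return len(steps)


def main() -> None:
    # Assumption: nova-act is installed and an API key is configured.
    # The import is kept local so the helpers above work without the SDK.
    from nova_act import NovaAct

    # headless=True is assumed to run the browser without a visible window.
    with NovaAct(starting_page="https://www.amazon.com", headless=True) as nova:
        run_steps(nova, build_steps("a coffee maker"))
```

Calling `main()` would drive a real browser session. Because the helpers only need an object with an `act` method, they can be exercised with a stand-in agent before any real run.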
Source: Amazon, Amazon AGI Blog
TLDR: Amazon has unveiled Nova Act, a model in its Nova family built to control web browsers, along with an SDK that handles both common web commands and more complex conditional ones.