
Apple’s new Ferret-UI 2 AI system can control apps across iPhone, iPad, Android, and Apple TV

By admin | October 26, 2024


Summary

Apple has developed a new AI system called Ferret-UI 2 that can read and control apps across iPhones, iPads, Android devices, web browsers, and Apple TV.

The system scored 89.73 on UI element recognition tests, well above GPT-4o’s 77.73. It also shows significant improvements over the previous version in basic tasks such as text and button recognition, as well as in more complex operations.

Comparison table: Benchmark results for different UI models with different backbones, showing performance values for basic and advanced tasks.
Apple tested the system with several language models. Llama-3 showed the best results, but the smaller Gemma-2B also performed well. | Image: Apple


Understand user intent

Ferret-UI 2 aims to understand user intent rather than relying on specific click coordinates. When given a command such as “confirm input,” the system can identify the appropriate button without requiring precise location data. Apple’s research team used GPT-4o’s visual capabilities to generate high-quality training data that helps the system better understand how UI elements relate to each other spatially.
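To make the idea of intent-based selection concrete, here is a toy sketch in Python. It is not how Ferret-UI 2 works internally (the real system is a multimodal model reading screenshots); the `UIElement` type, the `INTENT_SYNONYMS` table, and the `resolve_intent` and `tap_point` helpers are all hypothetical names invented for illustration. The point is only that the caller passes a command, not coordinates, and the location is derived afterwards from the matched element.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """Hypothetical description of one on-screen element."""
    label: str
    kind: str
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


# Assumed, hand-written synonym table; a real model would learn this mapping.
INTENT_SYNONYMS = {
    "confirm": {"ok", "confirm", "done", "submit", "save"},
    "cancel": {"cancel", "close", "dismiss", "back"},
}


def resolve_intent(command: str, elements: List[UIElement]) -> Optional[UIElement]:
    """Return the first button whose label matches an intent named in the command."""
    words = set(command.lower().split())
    wanted = set()
    for intent, synonyms in INTENT_SYNONYMS.items():
        if intent in words:
            wanted |= synonyms
    for element in elements:
        if element.kind == "button" and element.label.lower() in wanted:
            return element
    return None


def tap_point(element: UIElement) -> Tuple[int, int]:
    """Derive a tap location from the element's box, after it has been chosen."""
    left, top, right, bottom = element.box
    return ((left + right) // 2, (top + bottom) // 2)


screen = [
    UIElement("Cancel", "button", (0, 900, 200, 960)),
    UIElement("Name", "text_field", (0, 100, 400, 160)),
    UIElement("Done", "button", (220, 900, 420, 960)),
]
target = resolve_intent("confirm input", screen)
```

Here `target` is the "Done" button, and `tap_point(target)` gives the center of its bounding box. The command itself never contained a screen position.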

Ferret-UI 2 uses an adaptive architecture that recognizes UI elements across platforms. It includes an algorithm that automatically balances image resolution against processing requirements for each platform. According to the researchers, this approach “combines both the preservation of information and the efficiency of local encoding.”
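One way to picture a resolution-versus-cost trade-off of this kind is a grid chooser that splits a screenshot into fixed-size tiles under a tile budget. The sketch below is an assumption-laden illustration, not Apple's published algorithm: the function name `choose_grid`, the 336-pixel tile size, the budget of 16 tiles, and the cost formula are all hypothetical. It picks the grid whose shape and total pixel count stay closest to the original image.

```python
import math
from typing import Tuple


def choose_grid(width: int, height: int, max_cells: int = 16, tile: int = 336) -> Tuple[int, int]:
    """Pick (cols, rows) with cols * rows <= max_cells.

    The cost adds two penalties (both hypothetical choices):
    - aspect distortion: how far the grid's shape is from the image's shape
    - scale change: how much the image must be resized to fill the grid
    """
    image_aspect = width / height
    image_area = width * height
    best = (1, 1)
    best_cost = float("inf")
    for cols in range(1, max_cells + 1):
        # Only consider row counts that keep the grid within the tile budget.
        for rows in range(1, max_cells // cols + 1):
            aspect_cost = abs(math.log((cols / rows) / image_aspect))
            grid_area = (cols * tile) * (rows * tile)
            scale_cost = abs(math.log(grid_area / image_area))
            cost = aspect_cost + scale_cost
            if cost < best_cost:
                best, best_cost = (cols, rows), cost
    return best


phone_grid = choose_grid(336, 672)     # tall, phone-like screenshot
square_grid = choose_grid(672, 672)    # square screenshot
banner_grid = choose_grid(1008, 336)   # wide, banner-like screenshot
```

A larger `max_cells` preserves more detail at higher encoding cost, while a smaller one forces coarser grids. That is the balance the paragraph above describes, expressed in the simplest form that came to mind.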


Four UI screenshots with example conversations: iPhone settings, iPad weather app, MacBook product page, and Apple TV interface with model answers.
Ferret-UI 2 interaction examples. | Image: Apple

Testing showed strong cross-platform performance, with models trained on iPhone data achieving 68% accuracy on iPad and 71% accuracy on Android devices. However, the system has more difficulty transferring between mobile devices and TV or web interfaces, which the researchers attribute to differences in screen layouts.

Microsoft releases UI understanding tool as open source

Apple’s efforts come as other companies develop their own UI-understanding AI systems. Anthropic recently released an updated Claude 3.5 Sonnet with UI interaction capabilities. Meanwhile, Microsoft released OmniParser, an open source tool that converts screen content into structured data for the same purpose.

Apple also recently announced CAMPHOR, a framework in which specialized AI agents, coordinated by a master reasoning agent, handle complex tasks. Combined with Ferret-UI 2, this technology could let voice assistants like Siri carry out complex tasks, such as finding a specific restaurant and making a reservation, by navigating apps and the web using only voice commands.


