A detailed comparison of real-world performance across five use cases (each was repeated twice, but only one run is shown here; this is not rigorous given the small n):
MultiOn generally performs tasks faster, but Claude tends to be more accurate and better at task completion.
Prompt: Create a single-page itinerary for a trip to Hawaii using recommendations from Reddit and travel blogs.
Goal: A detailed itinerary with links, timings, and costs.
Notes (MultiOn): Blocked by Reddit's security checks; got stuck in a loop, failed to include links in the output, and asked a lot of questions.
Notes (Claude): Encountered server errors with Reddit; correctly displayed an itinerary, but it is unclear whether it was grounded in what it had learned.
Prompt: Based on my interest in fantasy art, suggest a website or activity.
Goal: Provide a relevant suggestion aligned with preferences.
Notes (MultiOn): Found and clicked on ArtStation (fantasy section) after searching Google.
Notes (Claude): Took a long time to get set up, randomly went to a .com website, then went to the correct site without explanation.
Prompt: Find the most popular beach destination for US travelers in their twenties and book a direct flight for tomorrow morning.
Goal: Book a flight to a relevant beach destination with proper details.
Notes (MultiOn): Said tomorrow was Nov 28 but clicked the 27th; very slow, couldn't select the right places to go to, and got stuck in a loop.
Notes (Claude): Searched for URLs in the search box but recovered, then asked clarifying questions and got rate-limited.
Prompt: Download the latest ML paper from arXiv, summarize the abstract, and save it in a folder named 'Research'.
Goal: Accurately download, summarize, and save the file.
Notes (MultiOn): Failed to click on the PDF link; struggled to summarize and scrolled up and down repeatedly. Couldn't download the file because it does not operate at the OS level.
Notes (Claude): Completed most of the task, creating the folder and summarizing the abstract, but failed to rename and download the file.
Prompt: Estimate the calorie count of the most popular menu item at Geneva Steakhouse in SF.
Goal: Accurately estimate calories (~1200) based on images and descriptions.
Notes (MultiOn): Used general photos of the restaurant rather than searching the reviews for a specific item. Didn't know when to stop looking at photos. Needed me to clarify which item was the most popular, even though finding that was part of what I had asked.
Notes (Claude): Rate-limited after interacting with Google Maps; partially recovered but clicked on incorrect icons. Went to the menu and then to Google image search rather than to the restaurant on Google Maps, and correctly estimated ~1500 calories, though it is unclear how the estimate was grounded.