I mean, you can run small models on mobile now, but they’re mostly good as a cog in an automation pipeline, not at (say) interpreting english instructions on how to alter a webpage.
…Honestly, open weight model APIs for single-off calls like this are not a bad stopgap. It costs basically nothing, you can use any provider you want, its power efficient, and if you’re on the web, you have internet.












I mean, there are literally hundreds of API providers. I’d probably pick Cerebras, but you can take your pick from any jurisdiction and any privacy policy.
I guess you could rent an on-demand cloud instance yourself too, that spins down when you aren’t using it.