If your workflow is a loop of browser, chat box, editor, repeat, the expensive mistake is treating browser-use as just another RPA click bot. The project is trying to turn browser automation into a natural-language interface [C002]. Many people think they need a stronger model. They usually need fewer tab switches.
That is why the browser-use / video-use split matters [C001]. If you throw them into the same bucket, you keep hand-carrying context between tabs and add one more loop of rework. AI tools are starting to compete for your fragmented minutes, not just your coding time.
What changed my mind was the v3.0 framing. The GitHub quickstart is built around a reliable browser for AI coding helpers, plus a small Python helper, a saved browser setup, and an allowlist of sites. That setup is about stable context, not replaying clicks.
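To make the allowlist idea concrete, here is a minimal sketch of a site check in plain Python. It is not browser-use's own API: the `ALLOWED_DOMAINS` set and the `is_allowed` helper are hypothetical names made up for illustration, and the domains are placeholders. It only assumes that an allowlist means "this host or one of its subdomains".

```python
from urllib.parse import urlparse

# Hypothetical allowlist for illustration; not taken from the browser-use docs.
ALLOWED_DOMAINS = {"github.com", "docs.python.org"}


def is_allowed(url: str, allowed: set = ALLOWED_DOMAINS) -> bool:
    """Return True if the URL is http(s) and its host is an allowed domain
    or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Match the exact domain or a dot-separated subdomain, so that
    # "evilgithub.com" does not pass as "github.com".
    return any(host == d or host.endswith("." + d) for d in allowed)
```

A helper like this would sit in front of every navigation, so the model can only wander inside sites you have already decided to trust.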
The docs push the same idea again. You can add your own actions and keep the same browser session alive, so the model can continue work inside one running browser instead of starting from zero each step. That is a programmable tool surface, not a toy recorder.
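The pattern is easier to see in a toy version. The sketch below uses only the standard library and no real browser: `Session`, `action`, and `run` are hypothetical stand-ins, not browser-use classes or functions. It shows the shape of the idea, assuming custom actions are named functions that all receive the same long-lived session object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Stand-in for a long-lived browser session (hypothetical, no real browser)."""
    current_url: str = "about:blank"
    history: List[str] = field(default_factory=list)


# Registry of named actions the model would be allowed to call.
ACTIONS: Dict[str, Callable[..., str]] = {}


def action(name: str):
    """Register a function under a name so it can be invoked by that name."""
    def decorator(fn):
        ACTIONS[name] = fn
        return fn
    return decorator


@action("open")
def open_page(session: Session, url: str) -> str:
    # Record the visit on the shared session instead of starting fresh.
    session.history.append(url)
    session.current_url = url
    return f"opened {url}"


@action("where_am_i")
def where_am_i(session: Session) -> str:
    return session.current_url


def run(session: Session, name: str, **kwargs) -> str:
    """Dispatch one named action against the same session object."""
    return ACTIONS[name](session, **kwargs)


session = Session()
run(session, "open", url="https://example.com/a")
run(session, "open", url="https://example.com/b")
```

Because every call goes through the same `session`, the second step already knows where the first one left off, which is the difference between continuing work and replaying it.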
Boundary: this read uses only the browser-use GitHub page and official docs, not a live run, and it says nothing about video-use quality. If a tool saves clicks but still makes you re-explain context, the hard part is still manual. Share this with the friend who lives in copy-paste loops.