* Enhance model loading with device support and error handling
Updated device handling for model loading and added support for MPS. Improved error handling and fallback mechanisms for attention implementations.
* Improve device handling and model loading logic
Updated device argument handling to support MPS and added validation for MPS availability. Enhanced model loading logic based on the selected device type.
* Fall back only when flash_attention_2 is requested, and restore some comments
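
The device validation and attention fallback described in these commits can be sketched roughly as follows. This is a minimal illustration, not the code from the commits: `resolve_device`, `plan_for`, `load_with_fallback`, and `LoadPlan` are hypothetical names, and the CPU fallback, the dtype choices, and the use of `sdpa` as the fallback implementation are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")

SUPPORTED_DEVICES = ("cuda", "mps", "cpu")


@dataclass(frozen=True)
class LoadPlan:
    """Hypothetical container for per-device loading settings."""
    device: str
    dtype: str                # name of a torch dtype, e.g. "bfloat16"
    attn_implementation: str  # value passed to the model loader


def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Validate the requested device; assumed to fall back to CPU if unavailable."""
    device = requested.lower()
    if device not in SUPPORTED_DEVICES:
        raise ValueError(f"unsupported device: {requested!r}")
    if device == "mps" and not mps_available:
        return "cpu"
    if device == "cuda" and not cuda_available:
        return "cpu"
    return device


def plan_for(device: str) -> LoadPlan:
    """Pick loading settings per device (these pairings are assumptions)."""
    if device == "cuda":
        return LoadPlan("cuda", "bfloat16", "flash_attention_2")
    return LoadPlan(device, "float32", "sdpa")


def load_with_fallback(load_fn: Callable[[str], T], plan: LoadPlan) -> T:
    """Call load_fn(attn_implementation); retry with "sdpa" only if
    flash_attention_2 was the implementation that failed."""
    try:
        return load_fn(plan.attn_implementation)
    except Exception:
        if plan.attn_implementation != "flash_attention_2":
            raise  # no fallback for other implementations
        return load_fn("sdpa")
```

In a real script the availability flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and `load_fn` would wrap the actual model loader.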
---------
Co-authored-by: YaoyaoChang <cyy574006791@qq.com>
### Summary
The Stop button previously appeared only after the backend generation job actually started, leaving users with no cancel affordance while a request sat in the queue. This PR makes the Stop button visible immediately after clicking **Generate Podcast**.
### Change
Adds a single non-queued `.then` step in the click event chain that hides the Generate button and shows the Stop button instantly, before the queued job begins processing.
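
For illustration, the event chain might look roughly like the sketch below. It assumes a Gradio `Blocks` UI; the component names, the `generate` stand-in, and the `button_visibility` helper are hypothetical and are not taken from the actual diff.

```python
from typing import Tuple


def button_visibility(generating: bool) -> Tuple[bool, bool]:
    """Return (generate_visible, stop_visible) for the given state."""
    return (not generating, generating)


def build_demo():
    # Imported lazily so the helper above can be used without Gradio installed.
    import gradio as gr

    def show_stop():
        gen_vis, stop_vis = button_visibility(generating=True)
        return gr.update(visible=gen_vis), gr.update(visible=stop_vis)

    def show_generate():
        gen_vis, stop_vis = button_visibility(generating=False)
        return gr.update(visible=gen_vis), gr.update(visible=stop_vis)

    def generate(script: str) -> str:
        # Stand-in for the real, long-running generation job.
        return f"Generated podcast for: {script}"

    with gr.Blocks() as demo:
        script = gr.Textbox(label="Script")
        result = gr.Textbox(label="Result")
        generate_btn = gr.Button("Generate Podcast")
        stop_btn = gr.Button("Stop", visible=False)

        job = generate_btn.click(
            fn=lambda: "", inputs=None, outputs=[result], queue=False
        ).then(
            # Non-queued step: swaps the buttons immediately, even while
            # the next step is still waiting in the queue.
            fn=show_stop, inputs=None, outputs=[generate_btn, stop_btn], queue=False
        ).then(
            fn=generate, inputs=[script], outputs=[result], queue=True
        )
        # Restore the buttons once the job finishes or is cancelled.
        job.then(fn=show_generate, inputs=None,
                 outputs=[generate_btn, stop_btn], queue=False)
        stop_btn.click(fn=show_generate, inputs=None,
                       outputs=[generate_btn, stop_btn],
                       cancels=[job], queue=False)
    return demo
```

Calling `build_demo().launch()` would start the demo locally. Because the button swap runs with `queue=False`, it does not wait behind other queued requests.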