ipex-llm


Yang Wang 99b05ba1dc separate prefill into a process (#11787)	2024-08-19 17:53:36 +08:00

* separate prefill into a process
* using model.share_memory()
* might work
* worked
* use long prompt
* refactor
* cleanup
* fix bug
* clean up
* changeable inter and intra process stages
* refactor
* add max output len
* fix npu_model changes that may cause generate down
* fix npu_model generate import error
* fix generate forward error

Co-authored-by: sgwhat <ge.song@intel.com>
LLM	separate prefill into a process (#11787)	2024-08-19 17:53:36 +08:00
Multimodal	Update npu multimodal example (#11773)	2024-08-13 14:14:59 +08:00
