ipex-llm

Author	SHA1	Message	Date
Jinyi Wan	07485eff5a	Add SOLAR-10.7B to README (#9869 )	2024-01-11 14:28:41 +08:00
ZehuaCao	146076bdb5	Support llm-awq backend (#9856 ) * Support for LLM-AWQ Backend * fix * Update README.md * Add awqconfig * modify init * update * support llm-awq * fix style * fix style * update * fix AwqBackendPackingMethod not found error * fix style * update README * fix style --------- Co-authored-by: Uxito-Ada <414416158@qq.com> Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com> Co-authored-by: cyita <yitastudy@gmail.com>	2024-01-09 13:07:32 +08:00
Mingyu Wei	ed81baa35e	LLM: Use default typing-extension in LangChain examples (#9857 ) * remove typing extension downgrade in readme; minor fixes of code * fix typos in README * change default question of docqa.py	2024-01-08 16:50:55 +08:00
Jinyi Wan	3147ebe63d	Add cpu and gpu examples for SOLAR-10.7B (#9821 )	2024-01-05 09:50:28 +08:00
Wang, Jian4	4ceefc9b18	LLM: Support bitsandbytes config on qlora finetune (#9715 ) * test support bitsandbytesconfig * update style * update cpu example * update example * update readme * update unit test * use bfloat16 * update logic * use int4 * set defalut bnb_4bit_use_double_quant * update * update example * update model.py * update * support lora example	2024-01-04 11:23:16 +08:00
Wang, Jian4	a54cd767b1	LLM: Add gguf falcon (#9801 ) * init falcon * update convert.py * update style	2024-01-03 14:49:02 +08:00
binbin Deng	6584539c91	LLM: fix installation of codellama (#9813 )	2024-01-02 14:32:50 +08:00
Wang, Jian4	7ed9538b9f	LLM: support gguf mpt (#9773 ) * add gguf mpt * update	2023-12-28 09:22:39 +08:00
Jason Dai	361781bcd0	Update readme (#9788 )	2023-12-26 19:46:11 +08:00
Ziteng Zhang	44b4a0c9c5	[LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py (#9786 ) * correct prompt format of Yi * correct prompt format of llama2 in cpu generate.py * correct prompt format of Qwen in GPU example	2023-12-26 16:57:55 +08:00
Heyang Sun	66e286a73d	Support for Mixtral AWQ (#9775 ) * Support for Mixtral AWQ * Update README.md * Update README.md * Update awq_config.py * Update README.md * Update README.md	2023-12-25 16:08:09 +08:00
Wang, Jian4	984697afe2	LLM: Add bloom gguf support (#9734 ) * init * update bloom add merges * update * update readme * update for llama error * update	2023-12-21 14:06:25 +08:00
Heyang Sun	1fa7793fc0	Load Mixtral GGUF Model (#9690 ) * Load Mixtral GGUF Model * refactor * fix empty tensor when to cpu * update gpu and cpu readmes * add dtype when set tensor into module	2023-12-19 13:54:38 +08:00
Wang, Jian4	b8437a1c1e	LLM: Add gguf mistral model support (#9691 ) * add mistral support * need to upgrade transformers version * update	2023-12-15 13:37:39 +08:00
Wang, Jian4	496bb2e845	LLM: Support load BaiChuan model family gguf model (#9685 ) * support baichuan model family gguf model * update gguf generate.py * add verify models * add support model_family * update * update style * update type * update readme * update * remove support model_family	2023-12-15 13:34:33 +08:00
Lilac09	3afed99216	fix path issue (#9696 )	2023-12-15 11:21:49 +08:00
Ziteng Zhang	21c7503a42	[LLM] Correct prompt format of Qwen in generate.py (#9678 ) * Change qwen prompt format to chatml	2023-12-14 14:01:30 +08:00
Qiyuan Gong	223c9622f7	[LLM] Mixtral CPU examples (#9673 ) * Mixtral CPU PyTorch and hugging face examples, based on #9661 and #9671	2023-12-14 10:35:11 +08:00
ZehuaCao	877229f3be	[LLM]Add Yi-34B-AWQ to verified AWQ model. (#9676 ) * verfiy Yi-34B-AWQ * update	2023-12-14 09:55:47 +08:00
ZehuaCao	503880809c	verfiy codeLlama (#9668 )	2023-12-13 15:39:31 +08:00
Heyang Sun	c64e2248ef	fix str returned by get_int_from_str rather than expected int (#9667 )	2023-12-13 11:01:21 +08:00
ZehuaCao	45721f3473	verfiy llava (#9649 )	2023-12-11 14:26:05 +08:00
Heyang Sun	9f02f96160	[LLM] support for Yi AWQ model (#9648 )	2023-12-11 14:07:34 +08:00
ZehuaCao	6eca8a8bb5	update transformer version (#9631 )	2023-12-08 09:36:00 +08:00
Heyang Sun	3811cf43c9	[LLM] update AWQ documents (#9623 ) * [LLM] update AWQ and verified models' documents * refine * refine links * refine	2023-12-07 16:02:20 +08:00
Jason Dai	51b668f229	Update GGUF readme (#9611 )	2023-12-06 18:21:54 +08:00
dingbaorong	a7bc89b3a1	remove q4_1 in gguf example (#9610 ) * remove q4_1 * fixes	2023-12-06 16:00:05 +08:00
dingbaorong	89069d6173	Add gpu gguf example (#9603 ) * add gpu gguf example * some fixes * address kai's comments * address json's comments	2023-12-06 15:17:54 +08:00
Ziteng Zhang	aeb77b2ab1	Add minimum Qwen model version (#9606 )	2023-12-06 11:49:14 +08:00
Heyang Sun	4e70e33934	[LLM] code and document for distributed qlora (#9585 ) * [LLM] code and document for distributed qlora * doc * refine for gradient checkpoint * refine * Update alpaca_qlora_finetuning_cpu.py * Update alpaca_qlora_finetuning_cpu.py * Update alpaca_qlora_finetuning_cpu.py * add link in doc	2023-12-06 09:23:17 +08:00
Jinyi Wan	b721138132	Add cpu and gpu examples for BlueLM (#9589 ) * Add cpu int4 example for BlueLM * addexample optimize_model cpu for bluelm * add example gpu int4 blueLM * add example optimiza_model GPU for bluelm * Fixing naming issues and BigDL package version. * Fixing naming issues... * Add BlueLM in README.md "Verified Models"	2023-12-05 13:59:02 +08:00
Wang, Jian4	ed0dc57c6e	LLM: Add cpu qlora support other models guide (#9567 ) * use bf16 flag * add using baichuan model * update merge * remove * update	2023-12-01 11:18:04 +08:00
Jason Dai	bda404fc8f	Update readme (#9575 )	2023-11-30 22:45:52 +08:00
Yishuo Wang	66f5b45f57	[LLM] add a llama2 gguf example (#9553 )	2023-11-30 16:37:17 +08:00
Wang, Jian4	a0a80d232e	LLM: Add qlora cpu distributed readme (#9561 ) * init readme * add distributed guide * update	2023-11-30 13:42:30 +08:00
Qiyuan Gong	d85a430a8c	Uing bigdl-llm-init instead of bigdl-nano-init (#9558 ) * Replace `bigdl-nano-init` with `bigdl-llm-init`. * Install `bigdl-llm` instead of `bigdl-nano`. * Remove nano in README.	2023-11-30 10:10:29 +08:00
Wang, Jian4	b824754256	LLM: Update for cpu qlora mpirun (#9548 )	2023-11-29 10:56:17 +08:00
Guancheng Fu	963a5c8d79	Add vLLM-XPU version's README/examples (#9536 ) * test * test * fix last kv cache * add xpu readme * remove numactl for xpu example * fix link error * update max_num_batched_tokens logic * add explaination * add xpu environement version requirement * refine gpu memory * fix * fix style	2023-11-28 09:44:03 +08:00
Guancheng Fu	b6c3520748	Remove xformers from vLLM-CPU (#9535 )	2023-11-27 11:21:25 +08:00
binbin Deng	6bec0faea5	LLM: support Mistral AWQ models (#9520 )	2023-11-24 16:20:22 +08:00
Jason Dai	b3178d449f	Update README.md (#9525 )	2023-11-23 21:45:20 +08:00
Jason Dai	064848028f	Update README.md (#9523 )	2023-11-23 21:16:21 +08:00
Guancheng Fu	bf579507c2	Integrate vllm (#9310 ) * done * Rename structure * add models * Add structure/sampling_params,sequence * add input_metadata * add outputs * Add policy,logger * add and update * add parallelconfig back * core/scheduler.py * Add llm_engine.py * Add async_llm_engine.py * Add tested entrypoint * fix minor error * Fix everything * fix kv cache view * fix * fix * fix * format&refine * remove logger from repo * try to add token latency * remove logger * Refine config.py * finish worker.py * delete utils.py * add license * refine * refine sequence.py * remove sampling_params.py * finish * add license * format * add license * refine * refine * Refine line too long * remove exception * so dumb style-check * refine * refine * refine * refine * refine * refine * add README * refine README * add warning instead error * fix padding * add license * format * format * format fix * Refine vllm dependency (#1) vllm dependency clear * fix licence * fix format * fix format * fix * adapt LLM engine * fix * add license * fix format * fix * Moving README.md to the correct position * Fix readme.md * done * guide for adding models * fix * Fix README.md * Add new model readme * remove ray-logic * refactor arg_utils.py * remove distributed_init_method logic * refactor entrypoints * refactor input_metadata * refactor model_loader * refactor utils.py * refactor models * fix api server * remove vllm.stucture * revert by txy 1120 * remove utils * format * fix license * add bigdl model * Refer to a specfic commit * Change code base * add comments * add async_llm_engine comment * refine * formatted * add worker comments * add comments * add comments * fix style * add changes --------- Co-authored-by: xiangyuT <xiangyu.tian@intel.com> Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com> Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>	2023-11-23 16:46:45 +08:00
Heyang Sun	48fbb1eb94	support ccl (MPI) distributed mode in alpaca_qlora_finetuning_cpu (#9507 )	2023-11-23 10:58:09 +08:00
Heyang Sun	11fa5a8a0e	Fix QLoRA CPU dispatch_model issue about accelerate (#9506 )	2023-11-23 08:41:25 +08:00
Heyang Sun	1453046938	install bigdl-llm in deepspeed cpu inference example (#9508 )	2023-11-23 08:39:21 +08:00
binbin Deng	86743fb57b	LLM: fix transformers version in CPU finetuning example (#9511 )	2023-11-22 15:53:07 +08:00
Wang, Jian4	c5cb3ab82e	LLM : Add CPU alpaca qlora example (#9469 ) * init * update xpu to cpu * update * update readme * update example * update * add refer * add guide to train different datasets * update readme * update	2023-11-21 09:19:58 +08:00
binbin Deng	96fd26759c	LLM: fix QLoRA finetuning example on CPU (#9489 )	2023-11-20 14:31:24 +08:00
Heyang Sun	921b263d6a	update deepspeed install and run guide in README (#9441 )	2023-11-17 09:11:39 +08:00
Yina Chen	d5263e6681	Add awq load support (#9453 ) * Support directly loading GPTQ models from huggingface * fix style * fix tests * change example structure * address comments * fix style * init * address comments * add examples * fix style * fix style * fix style * fix style * update * remove * meet comments * fix style --------- Co-authored-by: Yang Wang <yang3.wang@intel.com>	2023-11-16 14:06:25 +08:00
Yang Wang	51d07a9fd8	Support directly loading gptq models from huggingface (#9391 ) * Support directly loading GPTQ models from huggingface * fix style * fix tests * change example structure * address comments * fix style * address comments	2023-11-13 20:48:12 -08:00
Heyang Sun	da6bbc8c11	fix deepspeed dependencies to install (#9400 ) * remove reductant parameter from deepspeed install * Update install.sh * Update install.sh	2023-11-13 16:42:50 +08:00
Zheng, Yi	9b5d0e9c75	Add examples for Yi-6B (#9421 )	2023-11-13 10:53:15 +08:00
Wang, Jian4	ac7fbe77e2	Update qlora readme (#9416 )	2023-11-12 19:29:29 +08:00
Zheng, Yi	0674146cfb	Add cpu and gpu examples of distil-whisper (#9374 ) * Add distil-whisper examples * Fixes based on comments * Minor fixes --------- Co-authored-by: Ariadne330 <wyn2000330@126.com>	2023-11-10 16:09:55 +08:00
Ziteng Zhang	ad81b5d838	Update qlora README.md (#9422 )	2023-11-10 15:19:25 +08:00
Heyang Sun	b23b91407c	fix llm-init on deepspeed missing lib (#9419 )	2023-11-10 13:51:24 +08:00
dingbaorong	36fbe2144d	Add CPU examples of fuyu (#9393 ) * add fuyu cpu examples * add gpu example * add comments * add license * remove gpu example * fix inference time	2023-11-09 15:29:19 +08:00
binbin Deng	97316bbb66	LLM: highlight transformers version requirement in mistral examples (#9380 )	2023-11-08 16:05:03 +08:00
Heyang Sun	af94058203	[LLM] Support CPU deepspeed distributed inference (#9259 ) * [LLM] Support CPU Deepspeed distributed inference * Update run_deepspeed.py * Rename * fix style * add new codes * refine * remove annotated codes * refine * Update README.md * refine doc and example code	2023-11-06 17:56:42 +08:00
Jin Qiao	e6b6afa316	LLM: add aquila2 model example (#9356 )	2023-11-06 15:47:39 +08:00
Yining Wang	9377b9c5d7	add CodeShell CPU example (#9345 ) * add CodeShell CPU example * fix some problems	2023-11-03 13:15:54 +08:00
Zheng, Yi	63411dff75	Add cpu examples of WizardCoder (#9344 ) * Add wizardcoder example * Minor fixes	2023-11-02 20:22:43 +08:00
dingbaorong	2e3bfbfe1f	Add internlm_xcomposer cpu examples (#9337 ) * add internlm-xcomposer cpu examples * use chat * some fixes * add license * address shengsheng's comments * use demo.jpg	2023-11-02 15:50:02 +08:00
Jin Qiao	97a38958bd	LLM: add CodeLlama CPU and GPU examples (#9338 ) * LLM: add codellama CPU pytorch examples * LLM: add codellama CPU transformers examples * LLM: add codellama GPU transformers examples * LLM: add codellama GPU pytorch examples * LLM: add codellama in readme * LLM: add LLaVA link	2023-11-02 15:34:25 +08:00
Zheng, Yi	63b2556ce2	Add cpu examples of skywork (#9340 )	2023-11-02 15:10:45 +08:00
dingbaorong	f855a864ef	add llava gpu example (#9324 ) * add llava gpu example * use 7b model * fix typo * add in README	2023-11-02 14:48:29 +08:00
Wang, Jian4	149146004f	LLM: Add qlora finetunning CPU example (#9275 ) * add qlora finetunning example * update readme * update example * remove merge.py and update readme	2023-11-02 09:45:42 +08:00
Jin Qiao	c44c6dc43a	LLM: add chatglm3 examples (#9305 )	2023-11-01 09:50:05 +08:00
dingbaorong	ee5becdd61	use coco image in Qwen-VL (#9298 ) * use coco image * add output * address yuwen's comments	2023-10-30 14:32:35 +08:00
dingbaorong	f053688cad	add cpu example of LLaVA (#9269 ) * add LLaVA cpu example * Small text updates * update link --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2023-10-27 18:59:20 +08:00
Zheng, Yi	7f2ad182fd	Minor Fixes of README (#9294 )	2023-10-27 18:25:46 +08:00
Zheng, Yi	1bff54a378	Display demo.jpg n the README.md of HuggingFace Transformers Agent (#9293 ) * Display demo.jpg * remove demo.jpg	2023-10-27 18:00:03 +08:00
Zheng, Yi	a4a1dec064	Add a cpu example of HuggingFace Transformers Agent (use vicuna-7b-v1.5) (#9284 ) * Add examples of HF Agent * Modify folder structure and add link of demo.jpg * Fixes of readme * Merge applications and Applications	2023-10-27 17:14:12 +08:00
Guoqiong Song	aa319de5e8	Add streaming-llm using llama2 on CPU (#9265 ) Enable streaming-llm to let model take infinite inputs, tested on desktop and SPR10	2023-10-27 01:30:39 -07:00
Yining Wang	a6a8afc47e	Add qwen vl CPU example (#9221 ) * eee * add examples on CPU and GPU * fix * fix * optimize model examples * add Qwen-VL-Chat CPU example * Add Qwen-VL CPU example * fix optimize problem * fix error * Have updated, benchmark fix removed from this PR * add generate API example * Change formats in qwen-vl example * Add CPU transformer int4 example for qwen-vl * fix repo-id problem and add Readme * change picture url * Remove unnecessary file --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2023-10-25 13:22:12 +08:00
dingbaorong	5a2ce421af	add cpu and gpu examples of flan-t5 (#9171 ) * add cpu and gpu examples of flan-t5 * address yuwen's comments * Add explanation why we add modules to not convert * Refine prompt and add a translation example * Add a empty line at the end of files * add examples of flan-t5 using optimize_mdoel api * address bin's comments * address binbin's comments * add flan-t5 in readme	2023-10-24 15:24:01 +08:00
Yining Wang	4a19f50d16	phi-1_5 CPU and GPU examples (#9173 ) * eee * add examples on CPU and GPU * fix * fix * optimize model examples * have updated * Warmup and configs added * Update two tables	2023-10-24 15:08:04 +08:00
Xin Qiu	0c5055d38c	add position_ids and fuse embedding for falcon (#9242 ) * add position_ids for falcon * add cpu * add cpu * add license	2023-10-24 09:58:20 +08:00
Jin Qiao	d946bd7c55	LLM: add CPU More-Data-Types and Save-Load examples (#9179 )	2023-10-17 14:38:52 +08:00
JIN Qiao	1a1ddc4144	LLM: Add Replit CPU and GPU example (#9028 )	2023-10-12 13:42:14 +08:00
binbin Deng	2ad67a18b1	LLM: add mistral examples (#9121 )	2023-10-11 13:38:15 +08:00
binbin Deng	5e9962b60e	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00

184 commits