ipex-llm

Author	SHA1	Message	Date
Yuxuan Xia	95636cad97	Add AutoGen CPU and XPU Example (#9980 ) * Add AutoGen example * Adjust AutoGen README * Adjust AutoGen README * Change AutoGen README * Change AutoGen README	2024-01-31 11:31:18 +08:00
Heyang Sun	7284edd9b7	Vicuna CPU example of speculative decoding (#10018 ) * Vicuna CPU example of speculative decoding * Update speculative.py * Update README.md * add requirements for ipex * Update README.md * Update speculative.py * Update speculative.py	2024-01-31 11:23:50 +08:00
Wang, Jian4	fb53b994f8	LLM : Add llama ipex optimized (#10046 ) * init ipex * remove padding	2024-01-31 10:38:46 +08:00
Heyang Sun	b1ff28ceb6	LLama2 CPU example of speculative decoding (#9962 ) * LLama2 example of speculative decoding * add docs * Update speculative.py * Update README.md * Update README.md * Update speculative.py * remove autocast	2024-01-31 09:45:20 +08:00
WeiguangHan	0fcad6ce14	LLM: add gpu example for redpajama models (#10040 )	2024-01-30 19:39:28 +08:00
Xiangyu Tian	9978089796	[LLM] Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example (#10028 ) Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example	2024-01-30 17:11:37 +08:00
Heyang Sun	cc3f122f6a	Baichuan2 CPU example of speculative decoding (#10003 ) * Baichuan2 CPU example of speculative decoding * Update generate.py * Update README.md * Update generate.py * Update generate.py * Update generate.py * fix default model * fix wrong chinese coding * Update generate.py * update prompt * update sample outputs * baichuan 7b needs transformers==4.31.0 * rename example file's name	2024-01-29 14:21:09 +08:00
Jin Qiao	440cfe18ed	LLM: GPU Example Updates for Windows (#9992 ) * modify aquila * modify aquila2 * add baichuan * modify baichuan2 * modify blue-lm * modify chatglm3 * modify chinese-llama2 * modiy codellama * modify distil-whisper * modify dolly-v1 * modify dolly-v2 * modify falcon * modify flan-t5 * modify gpt-j * modify internlm * modify llama2 * modify mistral * modify mixtral * modify mpt * modify phi-1_5 * modify qwen * modify qwen-vl * modify replit * modify solar * modify starcoder * modify vicuna * modify voiceassistant * modify whisper * modify yi * modify aquila2 * modify baichuan * modify baichuan2 * modify blue-lm * modify chatglm2 * modify chatglm3 * modify codellama * modify distil-whisper * modify dolly-v1 * modify dolly-v2 * modify flan-t5 * modify llama2 * modify llava * modify mistral * modify mixtral * modify phi-1_5 * modify qwen-vl * modify replit * modify solar * modify starcoder * modify yi * correct the comments * remove cpu_embedding in code for whisper and distil-whisper * remove comment * remove cpu_embedding for voice assistant * revert modify voice assistant * modify for voice assistant * add comment for voice assistant * fix comments * fix comments	2024-01-29 11:25:11 +08:00
SONG Ge	421e7cee80	[LLM] Add Text_Generation_WebUI Support (#9884 ) * initially add text_generation_webui support * add env requirements install * add necessary dependencies * update for starting webui * update shared and noted to place models * update heading of part3 * meet comments * add copyright license * remove extensions * convert tutorial to windows side * add warm-up to optimize performance	2024-01-26 15:12:49 +08:00
binbin Deng	171fb2d185	LLM: reorganize GPU finetuning examples (#9952 )	2024-01-25 19:02:38 +08:00
Wang, Jian4	093e6f8f73	LLM: Add qwen CPU speculative example (#9985 ) * init from gpu * update for cpu * update * update * fix xpu readme * update * update example prompt * update prompt and add 72b * update * update	2024-01-25 17:01:34 +08:00
Yina Chen	99ff6cf048	Update gpu spec decoding baichuan2 example dependency (#9990 ) * add dependency * update * update	2024-01-25 11:05:04 +08:00
Jason Dai	3bc3d0bbcd	Update self-speculative readme (#9986 )	2024-01-24 22:37:32 +08:00
Ruonan Wang	d4f65a6033	LLM: add mistral speculative example (#9976 ) * add mistral example * update	2024-01-24 17:35:15 +08:00
Yina Chen	b176cad75a	LLM: Add baichuan2 gpu spec example (#9973 ) * add baichuan2 gpu spec example * update readme & example * remove print * fix typo * meet comments * revert * update	2024-01-24 16:40:16 +08:00
Jinyi Wan	ec2d9de0ea	Fix README.md for solar (#9957 )	2024-01-24 15:50:54 +08:00
Mingyu Wei	bc9cff51a8	LLM GPU Example Update for Windows Support (#9902 ) * Update README in LLM GPU Examples * Update reference of Intel GPU * add cpu_embedding=True in comment * small fixes * update GPU/README.md and add explanation for cpu_embedding=True * address comments * fix small typos * add backtick for cpu_embedding=True * remove extra backtick in the doc * add period mark * update readme	2024-01-24 13:42:27 +08:00
Yina Chen	5aa4b32c1b	LLM: Add qwen spec gpu example (#9965 ) * add qwen spec gpu example * update readme --------- Co-authored-by: rnwang04 <ruonan1.wang@intel.com>	2024-01-23 15:59:43 +08:00
Ruonan Wang	60b35db1f1	LLM: add chatglm3 speculative decoding example (#9966 ) * add chatglm3 example * update * fix	2024-01-23 15:54:12 +08:00
Ruonan Wang	27b19106f3	LLM: add readme for speculative decoding gpu examples (#9961 ) * add readme * add readme * meet code review	2024-01-23 12:54:19 +08:00
Ruonan Wang	3e601f9a5d	LLM: Support speculative decoding in bigdl-llm (#9951 ) * first commit * fix error, add llama example * hidden print * update api usage * change to api v3 * update * meet code review * meet code review, fix style * add reference, fix style * fix style * fix first token time	2024-01-22 19:14:56 +08:00
binbin Deng	db8e90796a	LLM: add avg token latency information and benchmark guide of autotp (#9940 )	2024-01-19 15:09:57 +08:00
Heyang Sun	5184f400f9	Fix Mixtral GGUF Wrong Output Issue (#9930 ) * Fix Mixtral GGUF Wrong Output Issue * fix style * fix style	2024-01-18 14:11:27 +08:00
Jinyi Wan	07485eff5a	Add SOLAR-10.7B to README (#9869 )	2024-01-11 14:28:41 +08:00
ZehuaCao	e76d984164	[LLM] Support llm-awq vicuna-7b-1.5 on arc (#9874 ) * support llm-awq vicuna-7b-1.5 on arc * support llm-awq vicuna-7b-1.5 on arc	2024-01-10 14:28:39 +08:00
Yuwen Hu	023679459e	[LLM] Small fixes for finetune related examples and UTs (#9870 )	2024-01-09 18:05:03 +08:00
Yuwen Hu	23fc888abe	Update llm gpu xpu default related info to PyTorch 2.1 (#9866 )	2024-01-09 15:38:47 +08:00
ZehuaCao	146076bdb5	Support llm-awq backend (#9856 ) * Support for LLM-AWQ Backend * fix * Update README.md * Add awqconfig * modify init * update * support llm-awq * fix style * fix style * update * fix AwqBackendPackingMethod not found error * fix style * update README * fix style --------- Co-authored-by: Uxito-Ada <414416158@qq.com> Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com> Co-authored-by: cyita <yitastudy@gmail.com>	2024-01-09 13:07:32 +08:00
binbin Deng	294fd32787	LLM: update DeepSpeed AutoTP example with GPU memory optimization (#9823 )	2024-01-09 09:22:49 +08:00
Mingyu Wei	ed81baa35e	LLM: Use default typing-extension in LangChain examples (#9857 ) * remove typing extension downgrade in readme; minor fixes of code * fix typos in README * change default question of docqa.py	2024-01-08 16:50:55 +08:00
Jinyi Wan	3147ebe63d	Add cpu and gpu examples for SOLAR-10.7B (#9821 )	2024-01-05 09:50:28 +08:00
Ruonan Wang	8504a2bbca	LLM: update qlora alpaca example to change lora usage (#9835 ) * update example * fix style	2024-01-04 15:22:20 +08:00
Ziteng Zhang	05b681fa85	[LLM] IPEX auto importer set on by default (#9832 ) * Set BIGDL_IMPORT_IPEX default to True * Remove import intel_extension_for_pytorch as ipex from GPU example	2024-01-04 13:33:29 +08:00
Wang, Jian4	4ceefc9b18	LLM: Support bitsandbytes config on qlora finetune (#9715 ) * test support bitsandbytesconfig * update style * update cpu example * update example * update readme * update unit test * use bfloat16 * update logic * use int4 * set defalut bnb_4bit_use_double_quant * update * update example * update model.py * update * support lora example	2024-01-04 11:23:16 +08:00
Wang, Jian4	a54cd767b1	LLM: Add gguf falcon (#9801 ) * init falcon * update convert.py * update style	2024-01-03 14:49:02 +08:00
binbin Deng	6584539c91	LLM: fix installation of codellama (#9813 )	2024-01-02 14:32:50 +08:00
Wang, Jian4	7ed9538b9f	LLM: support gguf mpt (#9773 ) * add gguf mpt * update	2023-12-28 09:22:39 +08:00
binbin Deng	40edb7b5d7	LLM: fix get environment variables setting (#9787 )	2023-12-27 09:11:37 +08:00
Jason Dai	361781bcd0	Update readme (#9788 )	2023-12-26 19:46:11 +08:00
Ziteng Zhang	44b4a0c9c5	[LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py (#9786 ) * correct prompt format of Yi * correct prompt format of llama2 in cpu generate.py * correct prompt format of Qwen in GPU example	2023-12-26 16:57:55 +08:00
Heyang Sun	66e286a73d	Support for Mixtral AWQ (#9775 ) * Support for Mixtral AWQ * Update README.md * Update README.md * Update awq_config.py * Update README.md * Update README.md	2023-12-25 16:08:09 +08:00
Ruonan Wang	1917bbe626	LLM: fix `BF16Linear` related training & inference issue (#9755 ) * fix bf16 related issue * fix * update based on comment & add arc lora script * update readme * update based on comment * update based on comment * update * force to bf16 * fix style * move check input dtype into function * update convert * meet code review * meet code review * update merged model to support new training_mode api * fix typo	2023-12-25 14:49:30 +08:00
Yina Chen	449b387125	Support relora in bigdl-llm (#9687 ) * init * fix style * update * support resume & update readme * update * update * remove important * add training mode * meet comments	2023-12-25 14:04:28 +08:00
Yishuo Wang	be13b162fe	add codeshell example (#9743 )	2023-12-25 10:54:01 +08:00
binbin Deng	ed8ed76d4f	LLM: update deepspeed autotp usage (#9733 )	2023-12-25 09:41:14 +08:00
Qiyuan Gong	4c487313f2	Revert "[LLM] IPEX auto importer turn on by default for XPU (#9730 )" (#9759 ) This reverts commit `0284801fbd`.	2023-12-22 16:38:24 +08:00
Qiyuan Gong	0284801fbd	[LLM] IPEX auto importer turn on by default for XPU (#9730 ) * Set BIGDL_IMPORT_IPEX default to true, i.e., auto import IPEX for XPU. * Remove import intel_extension_for_pytorch as ipex from GPU example. * Add support for bigdl-core-xe-21.	2023-12-22 16:20:32 +08:00
Ruonan Wang	2f36769208	LLM: bigdl-llm lora support & lora example (#9740 ) * lora support and single card example * support multi-card, refactor code * fix model id and style * remove torch patch, add two new class for bf16, update example * fix style * change to training_mode * small fix * add more info in help * fixstyle, update readme * fix ut * fix ut * Handling compatibility issues with default LoraConfig	2023-12-22 11:05:39 +08:00
Wang, Jian4	984697afe2	LLM: Add bloom gguf support (#9734 ) * init * update bloom add merges * update * update readme * update for llama error * update	2023-12-21 14:06:25 +08:00
Heyang Sun	1fa7793fc0	Load Mixtral GGUF Model (#9690 ) * Load Mixtral GGUF Model * refactor * fix empty tensor when to cpu * update gpu and cpu readmes * add dtype when set tensor into module	2023-12-19 13:54:38 +08:00

251 commits