Yuxuan Xia
95636cad97
Add AutoGen CPU and XPU Example ( #9980 )
...
* Add AutoGen example
* Adjust AutoGen README
* Adjust AutoGen README
* Change AutoGen README
* Change AutoGen README
2024-01-31 11:31:18 +08:00
Heyang Sun
7284edd9b7
Vicuna CPU example of speculative decoding ( #10018 )
...
* Vicuna CPU example of speculative decoding
* Update speculative.py
* Update README.md
* add requirements for ipex
* Update README.md
* Update speculative.py
* Update speculative.py
2024-01-31 11:23:50 +08:00
Wang, Jian4
fb53b994f8
LLM : Add llama ipex optimized ( #10046 )
...
* init ipex
* remove padding
2024-01-31 10:38:46 +08:00
Heyang Sun
b1ff28ceb6
LLama2 CPU example of speculative decoding ( #9962 )
...
* LLama2 example of speculative decoding
* add docs
* Update speculative.py
* Update README.md
* Update README.md
* Update speculative.py
* remove autocast
2024-01-31 09:45:20 +08:00
WeiguangHan
0fcad6ce14
LLM: add gpu example for redpajama models ( #10040 )
2024-01-30 19:39:28 +08:00
Xiangyu Tian
9978089796
[LLM] Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example ( #10028 )
...
Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example
2024-01-30 17:11:37 +08:00
Heyang Sun
cc3f122f6a
Baichuan2 CPU example of speculative decoding ( #10003 )
...
* Baichuan2 CPU example of speculative decoding
* Update generate.py
* Update README.md
* Update generate.py
* Update generate.py
* Update generate.py
* fix default model
* fix wrong chinese coding
* Update generate.py
* update prompt
* update sample outputs
* baichuan 7b needs transformers==4.31.0
* rename example file's name
2024-01-29 14:21:09 +08:00
Jin Qiao
440cfe18ed
LLM: GPU Example Updates for Windows ( #9992 )
...
* modify aquila
* modify aquila2
* add baichuan
* modify baichuan2
* modify blue-lm
* modify chatglm3
* modify chinese-llama2
* modiy codellama
* modify distil-whisper
* modify dolly-v1
* modify dolly-v2
* modify falcon
* modify flan-t5
* modify gpt-j
* modify internlm
* modify llama2
* modify mistral
* modify mixtral
* modify mpt
* modify phi-1_5
* modify qwen
* modify qwen-vl
* modify replit
* modify solar
* modify starcoder
* modify vicuna
* modify voiceassistant
* modify whisper
* modify yi
* modify aquila2
* modify baichuan
* modify baichuan2
* modify blue-lm
* modify chatglm2
* modify chatglm3
* modify codellama
* modify distil-whisper
* modify dolly-v1
* modify dolly-v2
* modify flan-t5
* modify llama2
* modify llava
* modify mistral
* modify mixtral
* modify phi-1_5
* modify qwen-vl
* modify replit
* modify solar
* modify starcoder
* modify yi
* correct the comments
* remove cpu_embedding in code for whisper and distil-whisper
* remove comment
* remove cpu_embedding for voice assistant
* revert modify voice assistant
* modify for voice assistant
* add comment for voice assistant
* fix comments
* fix comments
2024-01-29 11:25:11 +08:00
SONG Ge
421e7cee80
[LLM] Add Text_Generation_WebUI Support ( #9884 )
...
* initially add text_generation_webui support
* add env requirements install
* add necessary dependencies
* update for starting webui
* update shared and noted to place models
* update heading of part3
* meet comments
* add copyright license
* remove extensions
* convert tutorial to windows side
* add warm-up to optimize performance
2024-01-26 15:12:49 +08:00
binbin Deng
171fb2d185
LLM: reorganize GPU finetuning examples ( #9952 )
2024-01-25 19:02:38 +08:00
Wang, Jian4
093e6f8f73
LLM: Add qwen CPU speculative example ( #9985 )
...
* init from gpu
* update for cpu
* update
* update
* fix xpu readme
* update
* update example prompt
* update prompt and add 72b
* update
* update
2024-01-25 17:01:34 +08:00
Yina Chen
99ff6cf048
Update gpu spec decoding baichuan2 example dependency ( #9990 )
...
* add dependency
* update
* update
2024-01-25 11:05:04 +08:00
Jason Dai
3bc3d0bbcd
Update self-speculative readme ( #9986 )
2024-01-24 22:37:32 +08:00
Ruonan Wang
d4f65a6033
LLM: add mistral speculative example ( #9976 )
...
* add mistral example
* update
2024-01-24 17:35:15 +08:00
Yina Chen
b176cad75a
LLM: Add baichuan2 gpu spec example ( #9973 )
...
* add baichuan2 gpu spec example
* update readme & example
* remove print
* fix typo
* meet comments
* revert
* update
2024-01-24 16:40:16 +08:00
Jinyi Wan
ec2d9de0ea
Fix README.md for solar ( #9957 )
2024-01-24 15:50:54 +08:00
Mingyu Wei
bc9cff51a8
LLM GPU Example Update for Windows Support ( #9902 )
...
* Update README in LLM GPU Examples
* Update reference of Intel GPU
* add cpu_embedding=True in comment
* small fixes
* update GPU/README.md and add explanation for cpu_embedding=True
* address comments
* fix small typos
* add backtick for cpu_embedding=True
* remove extra backtick in the doc
* add period mark
* update readme
2024-01-24 13:42:27 +08:00
Yina Chen
5aa4b32c1b
LLM: Add qwen spec gpu example ( #9965 )
...
* add qwen spec gpu example
* update readme
---------
Co-authored-by: rnwang04 <ruonan1.wang@intel.com>
2024-01-23 15:59:43 +08:00
Ruonan Wang
60b35db1f1
LLM: add chatglm3 speculative decoding example ( #9966 )
...
* add chatglm3 example
* update
* fix
2024-01-23 15:54:12 +08:00
Ruonan Wang
27b19106f3
LLM: add readme for speculative decoding gpu examples ( #9961 )
...
* add readme
* add readme
* meet code review
2024-01-23 12:54:19 +08:00
Ruonan Wang
3e601f9a5d
LLM: Support speculative decoding in bigdl-llm ( #9951 )
...
* first commit
* fix error, add llama example
* hidden print
* update api usage
* change to api v3
* update
* meet code review
* meet code review, fix style
* add reference, fix style
* fix style
* fix first token time
2024-01-22 19:14:56 +08:00
binbin Deng
db8e90796a
LLM: add avg token latency information and benchmark guide of autotp ( #9940 )
2024-01-19 15:09:57 +08:00
Heyang Sun
5184f400f9
Fix Mixtral GGUF Wrong Output Issue ( #9930 )
...
* Fix Mixtral GGUF Wrong Output Issue
* fix style
* fix style
2024-01-18 14:11:27 +08:00
Jinyi Wan
07485eff5a
Add SOLAR-10.7B to README ( #9869 )
2024-01-11 14:28:41 +08:00
ZehuaCao
e76d984164
[LLM] Support llm-awq vicuna-7b-1.5 on arc ( #9874 )
...
* support llm-awq vicuna-7b-1.5 on arc
* support llm-awq vicuna-7b-1.5 on arc
2024-01-10 14:28:39 +08:00
Yuwen Hu
023679459e
[LLM] Small fixes for finetune related examples and UTs ( #9870 )
2024-01-09 18:05:03 +08:00
Yuwen Hu
23fc888abe
Update llm gpu xpu default related info to PyTorch 2.1 ( #9866 )
2024-01-09 15:38:47 +08:00
ZehuaCao
146076bdb5
Support llm-awq backend ( #9856 )
...
* Support for LLM-AWQ Backend
* fix
* Update README.md
* Add awqconfig
* modify init
* update
* support llm-awq
* fix style
* fix style
* update
* fix AwqBackendPackingMethod not found error
* fix style
* update README
* fix style
---------
Co-authored-by: Uxito-Ada <414416158@qq.com>
Co-authored-by: Heyang Sun <60865256+Uxito-Ada@users.noreply.github.com>
Co-authored-by: cyita <yitastudy@gmail.com>
2024-01-09 13:07:32 +08:00
binbin Deng
294fd32787
LLM: update DeepSpeed AutoTP example with GPU memory optimization ( #9823 )
2024-01-09 09:22:49 +08:00
Mingyu Wei
ed81baa35e
LLM: Use default typing-extension in LangChain examples ( #9857 )
...
* remove typing extension downgrade in readme; minor fixes of code
* fix typos in README
* change default question of docqa.py
2024-01-08 16:50:55 +08:00
Jinyi Wan
3147ebe63d
Add cpu and gpu examples for SOLAR-10.7B ( #9821 )
2024-01-05 09:50:28 +08:00
Ruonan Wang
8504a2bbca
LLM: update qlora alpaca example to change lora usage ( #9835 )
...
* update example
* fix style
2024-01-04 15:22:20 +08:00
Ziteng Zhang
05b681fa85
[LLM] IPEX auto importer set on by default ( #9832 )
...
* Set BIGDL_IMPORT_IPEX default to True
* Remove import intel_extension_for_pytorch as ipex from GPU example
2024-01-04 13:33:29 +08:00
Wang, Jian4
4ceefc9b18
LLM: Support bitsandbytes config on qlora finetune ( #9715 )
...
* test support bitsandbytesconfig
* update style
* update cpu example
* update example
* update readme
* update unit test
* use bfloat16
* update logic
* use int4
* set defalut bnb_4bit_use_double_quant
* update
* update example
* update model.py
* update
* support lora example
2024-01-04 11:23:16 +08:00
Wang, Jian4
a54cd767b1
LLM: Add gguf falcon ( #9801 )
...
* init falcon
* update convert.py
* update style
2024-01-03 14:49:02 +08:00
binbin Deng
6584539c91
LLM: fix installation of codellama ( #9813 )
2024-01-02 14:32:50 +08:00
Wang, Jian4
7ed9538b9f
LLM: support gguf mpt ( #9773 )
...
* add gguf mpt
* update
2023-12-28 09:22:39 +08:00
binbin Deng
40edb7b5d7
LLM: fix get environment variables setting ( #9787 )
2023-12-27 09:11:37 +08:00
Jason Dai
361781bcd0
Update readme ( #9788 )
2023-12-26 19:46:11 +08:00
Ziteng Zhang
44b4a0c9c5
[LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py ( #9786 )
...
* correct prompt format of Yi
* correct prompt format of llama2 in cpu generate.py
* correct prompt format of Qwen in GPU example
2023-12-26 16:57:55 +08:00
Heyang Sun
66e286a73d
Support for Mixtral AWQ ( #9775 )
...
* Support for Mixtral AWQ
* Update README.md
* Update README.md
* Update awq_config.py
* Update README.md
* Update README.md
2023-12-25 16:08:09 +08:00
Ruonan Wang
1917bbe626
LLM: fix BF16Linear related training & inference issue ( #9755 )
...
* fix bf16 related issue
* fix
* update based on comment & add arc lora script
* update readme
* update based on comment
* update based on comment
* update
* force to bf16
* fix style
* move check input dtype into function
* update convert
* meet code review
* meet code review
* update merged model to support new training_mode api
* fix typo
2023-12-25 14:49:30 +08:00
Yina Chen
449b387125
Support relora in bigdl-llm ( #9687 )
...
* init
* fix style
* update
* support resume & update readme
* update
* update
* remove important
* add training mode
* meet comments
2023-12-25 14:04:28 +08:00
Yishuo Wang
be13b162fe
add codeshell example ( #9743 )
2023-12-25 10:54:01 +08:00
binbin Deng
ed8ed76d4f
LLM: update deepspeed autotp usage ( #9733 )
2023-12-25 09:41:14 +08:00
Qiyuan Gong
4c487313f2
Revert "[LLM] IPEX auto importer turn on by default for XPU ( #9730 )" ( #9759 )
...
This reverts commit 0284801fbd .
2023-12-22 16:38:24 +08:00
Qiyuan Gong
0284801fbd
[LLM] IPEX auto importer turn on by default for XPU ( #9730 )
...
* Set BIGDL_IMPORT_IPEX default to true, i.e., auto import IPEX for XPU.
* Remove import intel_extension_for_pytorch as ipex from GPU example.
* Add support for bigdl-core-xe-21.
2023-12-22 16:20:32 +08:00
Ruonan Wang
2f36769208
LLM: bigdl-llm lora support & lora example ( #9740 )
...
* lora support and single card example
* support multi-card, refactor code
* fix model id and style
* remove torch patch, add two new class for bf16, update example
* fix style
* change to training_mode
* small fix
* add more info in help
* fixstyle, update readme
* fix ut
* fix ut
* Handling compatibility issues with default LoraConfig
2023-12-22 11:05:39 +08:00
Wang, Jian4
984697afe2
LLM: Add bloom gguf support ( #9734 )
...
* init
* update bloom add merges
* update
* update readme
* update for llama error
* update
2023-12-21 14:06:25 +08:00
Heyang Sun
1fa7793fc0
Load Mixtral GGUF Model ( #9690 )
...
* Load Mixtral GGUF Model
* refactor
* fix empty tensor when to cpu
* update gpu and cpu readmes
* add dtype when set tensor into module
2023-12-19 13:54:38 +08:00