Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j ( #10606 )
...
* remove ipex.optimize
* fix
* fix
2024-04-01 12:21:49 +08:00
Keyan (Kyrie) Zhang
848fa04dd6
Fix typo in Baichuan2 example ( #10589 )
2024-03-29 13:31:47 +08:00
ZehuaCao
52a2135d83
Replace ipex with ipex-llm ( #10554 )
...
* fix ipex with ipex_llm
* fix ipex with ipex_llm
* update
* update
* update
* update
* update
* update
* update
* update
2024-03-28 13:54:40 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
...
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation ( #10561 )
2024-03-27 17:58:57 +08:00
hxsz1997
d86477f14d
Remove native_int4 in LangChain examples ( #10510 )
...
* rebase the modify to ipex-llm
* modify the typo
2024-03-27 17:48:16 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm ( #24 )
...
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Jin Qiao
cc5806f4bc
LLM: add save/load example for hf-transformers ( #10432 )
2024-03-22 13:57:47 +08:00
binbin Deng
2958ca49c0
LLM: add patching function for llm finetuning ( #10247 )
2024-03-21 16:01:01 +08:00
hxsz1997
a5f35757a4
Migrate langchain rag cpu example to gpu ( #10450 )
...
* add langchain rag on gpu
* add rag example in readme
* add trust_remote_code in TransformersEmbeddings.from_model_id
* add trust_remote_code in TransformersEmbeddings.from_model_id in cpu
2024-03-21 15:20:46 +08:00
Ruonan Wang
28c315a5b9
LLM: fix deepspeed error of finetuning on xpu ( #10484 )
2024-03-21 09:46:25 +08:00
Cengguang Zhang
463a86cd5d
LLM: fix qwen-vl interpolation gpu abnormal results. ( #10457 )
...
* fix qwen-vl interpolation gpu abnormal results.
* fix style.
* update qwen-vl gpu example.
* fix comment and update example.
* fix style.
2024-03-19 16:59:39 +08:00
Jiao Wang
f3fefdc9ce
fix pad_token_id issue ( #10425 )
2024-03-18 23:30:28 -07:00
Yuxuan Xia
74e7490fda
Fix Baichuan2 prompt format ( #10334 )
...
* Fix Baichuan2 prompt format
* Fix Baichuan2 README
* Change baichuan2 prompt info
* Change baichuan2 prompt info
2024-03-19 12:48:07 +08:00
Yang Wang
9e763b049c
Support running pipeline parallel inference by vertically partitioning model to different devices ( #10392 )
...
* support pipeline parallel inference
* fix logging
* remove benchmark file
* fic
* need to warmup twice
* support qwen and qwen2
* fix lint
* remove genxir
* refine
2024-03-18 13:04:45 -07:00
Jiao Wang
5ab52ef5b5
update ( #10424 )
2024-03-15 09:24:26 -07:00
Jin Qiao
ca372f6dab
LLM: add save/load example for ModelScope ( #10397 )
...
* LLM: add sl example for modelscope
* fix according to comments
* move file
2024-03-15 15:17:50 +08:00
Wang, Jian4
fe8976a00f
LLM: Support gguf models use low_bit and fix no json( #10408 )
...
* support others model use low_bit
* update readme
* update to add *.json
2024-03-15 09:34:18 +08:00
binbin Deng
5d7e044dbc
LLM: add low bit option in deepspeed autotp example ( #10382 )
2024-03-12 17:07:09 +08:00
binbin Deng
df3bcc0e65
LLM: remove english_quotes dataset ( #10370 )
2024-03-12 16:57:40 +08:00
binbin Deng
fe27a6971c
LLM: update modelscope version ( #10367 )
2024-03-11 16:18:27 +08:00
Zhicun
9026c08633
Fix llamaindex AutoTokenizer bug ( #10345 )
...
* fix tokenizer
* fix AutoTokenizer bug
* modify code style
2024-03-08 16:24:50 +08:00
hxsz1997
af11c53473
Add the installation step of postgresql and pgvector on windows in LlamaIndex GPU support ( #10328 )
...
* add the installation of postgresql and pgvector of windows
* fix some format
2024-03-05 18:31:19 +08:00
dingbaorong
1e6f0c6f1a
Add llamaindex gpu example ( #10314 )
...
* add llamaindex example
* fix core dump
* refine readme
* add trouble shooting
* refine readme
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:36:00 +08:00
dingbaorong
fc7f10cd12
add langchain gpu example ( #10277 )
...
* first draft
* fix
* add readme for transformer_int4_gpu
* fix doc
* check device_map
* add arc ut test
* fix ut test
* fix langchain ut
* Refine README
* fix gpu mem too high
* fix ut test
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:33:57 +08:00
Xin Qiu
58208a5883
Update FAQ document. ( #10300 )
...
* Update install_gpu.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
2024-03-04 08:35:11 +08:00
Xin Qiu
509e206de0
update doc about gemma random and unreadable output. ( #10297 )
...
* Update install_gpu.md
* Update README.md
* Update README.md
2024-03-01 15:41:16 +08:00
Guancheng Fu
2d930bdca8
Add vLLM bf16 support ( #10278 )
...
* add argument load_in_low_bit
* add docs
* modify gpu doc
* done
---------
Co-authored-by: ivy-lv11 <lvzc@lamda.nju.edu.cn>
2024-02-29 16:33:42 +08:00
Ruonan Wang
a9fd20b6ba
LLM: Update qkv fusion for GGUF-IQ2 ( #10271 )
...
* first commit
* update mistral
* fix transformers==4.36.0
* fix
* disable qk for mixtral now
* fix style
2024-02-29 12:49:53 +08:00
Shengsheng Huang
db0d129226
Revert "Add rwkv example ( #9432 )" ( #10264 )
...
This reverts commit 6930422b42 .
2024-02-28 11:48:31 +08:00
Yining Wang
6930422b42
Add rwkv example ( #9432 )
...
* codeshell fix wrong urls
* restart runner
* add RWKV CPU & GPU example (rwkv-4-world-7b)
* restart runner
* update submodule
* fix runner
* runner-test
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:41:00 +08:00
Keyan (Kyrie) Zhang
59861f73e5
Add Deepseek-6.7B ( #9991 )
...
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* modify deepseek
* modify deepseek
* Add verified model in README
* Turn cpu_embedding=True in Deepseek example
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:36:39 +08:00
Yuxuan Xia
2524273198
Update AutoGen README ( #10255 )
...
* Update AutoGen README
* Fix AutoGen README typos
* Update AutoGen README
* Update AutoGen README
2024-02-28 11:34:45 +08:00
Zheng, Yi
2347f611cf
Add cpu and gpu examples of Mamba ( #9797 )
...
* Add mamba cpu example
* Add mamba gpu example
* Use a smaller model as the example
* minor fixes
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:33:29 +08:00
Guoqiong Song
f4a2e32106
Stream llm example for both GPU and CPU ( #9390 )
2024-02-27 15:54:47 -08:00
Keyan (Kyrie) Zhang
843fe546b0
Add CPU and GPU examples for DeciLM-7B ( #9867 )
...
* Add cpu and gpu examples for DeciLM-7B
* Add cpu and gpu examples for DeciLM-7B
* Add DeciLM-7B to README table
* modify deciLM
* modify deciLM
* modify deciLM
* Add verified model in README
* Add cpu_embedding=True
2024-02-27 13:15:49 +08:00
Xin Qiu
8ef5482da2
update Gemma readme ( #10229 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
2024-02-23 16:57:08 +08:00
Xin Qiu
aabfc06977
add gemma example ( #10224 )
...
* add gemma gpu example
* Update README.md
* add cpu example
* Update README.md
* Update README.md
* Update generate.py
* Update generate.py
2024-02-23 15:20:57 +08:00
yb-peng
a2c1675546
Add CPU and GPU examples for Yuan2-2B-hf ( #9946 )
...
* Add a new CPU example of Yuan2-2B-hf
* Add a new CPU generate.py of Yuan2-2B-hf example
* Add a new GPU example of Yuan2-2B-hf
* Add Yuan2 to README table
* In CPU example:1.Use English as default prompt; 2.Provide modified files in yuan2-2B-instruct
* In GPU example:1.Use English as default prompt;2.Provide modified files
* GPU example:update README
* update Yuan2-2B-hf in README table
* Add CPU example for Yuan2-2B in Pytorch-Models
* Add GPU example for Yuan2-2B in Pytorch-Models
* Add license in generate.py; Modify README
* In GPU Add license in generate.py; Modify README
* In CPU yuan2 modify README
* In GPU yuan2 modify README
* In CPU yuan2 modify README
* In GPU example, updated the readme for Windows GPU supports
* In GPU torch example, updated the readme for Windows GPU supports
* GPU hf example README modified
* GPU example README modified
2024-02-23 14:09:30 +08:00
yb-peng
f1f4094a09
Add CPU and GPU examples of phi-2 ( #10014 )
...
* Add CPU and GPU examples of phi-2
* In GPU hf example, updated the readme for Windows GPU supports
* In GPU torch example, updated the readme for Windows GPU supports
* update the table in BigDL/README.md
* update the table in BigDL/python/llm/README.md
2024-02-23 14:05:53 +08:00
Guoqiong Song
63681af97e
falcon for transformers 4.36 ( #9960 )
...
* falcon for transformers 4.36
2024-02-22 17:04:40 -08:00
Jason Dai
84d5f40936
Update README.md ( #10213 )
2024-02-22 17:22:59 +08:00
Ruonan Wang
5e1fee5e05
LLM: add GGUF-IQ2 examples ( #10207 )
...
* add iq2 examples
* small fix
* meet code review
* fix
* meet review
* small fix
2024-02-22 14:18:45 +08:00
binbin Deng
9975b029c5
LLM: add qlora finetuning example using trl.SFTTrainer ( #10183 )
2024-02-21 16:40:04 +08:00
Zhicun
c7e839e66c
Add Qwen1.5-7B-Chat ( #10113 )
...
* add Qwen1.5-7B-Chat
* modify Qwen1.5 example
* update README
* update prompt format
* update folder name and example README
* add Chinese prompt sample output
* update link in README
* correct the link
* update transformer version
2024-02-21 13:29:29 +08:00
binbin Deng
11fe5a87ec
LLM: add Modelscope model example ( #10126 )
2024-02-08 11:18:07 +08:00
Jin Qiao
0fcfbfaf6f
LLM: add rwkv5 eagle GPU HF example ( #10122 )
...
* LLM: add rwkv5 eagle example
* fix
* fix link
2024-02-07 16:58:29 +08:00
binbin Deng
c1ec3d8921
LLM: update FAQ about too many open files ( #10119 )
2024-02-07 15:02:24 +08:00
Jin Qiao
d3d2ee1b63
LLM: add speech T5 GPU example ( #10090 )
...
* add speech t5 example
* fix
* fix
2024-02-07 10:50:02 +08:00
Jin Qiao
2f4c754759
LLM: add bark gpu example ( #10091 )
...
* add bark gpu example
* fix
* fix license
* add bark
* add example
* fix
* another way
2024-02-07 10:47:11 +08:00
Yuwen Hu
3a46b57253
[LLM] Add RWKV4 HF GPU Example ( #10105 )
...
* Add GPU HF example for RWKV 4
* Add link to rwkv4
* fix
2024-02-06 16:30:24 +08:00
Zhicun
7d2be7994f
add phixtral and optimize phi-moe ( #10052 )
2024-02-05 11:12:47 +08:00
binbin Deng
7e49fbc5dd
LLM: make finetuning examples more common for other models ( #10078 )
2024-02-04 16:03:52 +08:00
ivy-lv11
428b7105f6
Add HF and PyTorch example InternLM2 ( #10061 )
2024-02-04 10:25:55 +08:00
Yina Chen
77be19bb97
LLM: Support gpt-j in speculative decoding ( #10067 )
...
* gptj
* support gptj in speculative decoding
* fix
* update readme
* small fix
2024-02-02 14:54:55 +08:00
binbin Deng
aae20d728e
LLM: Add initial DPO finetuning example ( #10021 )
2024-02-01 14:18:08 +08:00
WeiguangHan
a9018a0e95
LLM: modify the GPU example for redpajama model ( #10044 )
...
* LLM: modify the GPU example for redpajama model
* small fix
2024-01-31 14:32:08 +08:00
Yuxuan Xia
95636cad97
Add AutoGen CPU and XPU Example ( #9980 )
...
* Add AutoGen example
* Adjust AutoGen README
* Adjust AutoGen README
* Change AutoGen README
* Change AutoGen README
2024-01-31 11:31:18 +08:00
WeiguangHan
0fcad6ce14
LLM: add gpu example for redpajama models ( #10040 )
2024-01-30 19:39:28 +08:00
Jin Qiao
440cfe18ed
LLM: GPU Example Updates for Windows ( #9992 )
...
* modify aquila
* modify aquila2
* add baichuan
* modify baichuan2
* modify blue-lm
* modify chatglm3
* modify chinese-llama2
* modiy codellama
* modify distil-whisper
* modify dolly-v1
* modify dolly-v2
* modify falcon
* modify flan-t5
* modify gpt-j
* modify internlm
* modify llama2
* modify mistral
* modify mixtral
* modify mpt
* modify phi-1_5
* modify qwen
* modify qwen-vl
* modify replit
* modify solar
* modify starcoder
* modify vicuna
* modify voiceassistant
* modify whisper
* modify yi
* modify aquila2
* modify baichuan
* modify baichuan2
* modify blue-lm
* modify chatglm2
* modify chatglm3
* modify codellama
* modify distil-whisper
* modify dolly-v1
* modify dolly-v2
* modify flan-t5
* modify llama2
* modify llava
* modify mistral
* modify mixtral
* modify phi-1_5
* modify qwen-vl
* modify replit
* modify solar
* modify starcoder
* modify yi
* correct the comments
* remove cpu_embedding in code for whisper and distil-whisper
* remove comment
* remove cpu_embedding for voice assistant
* revert modify voice assistant
* modify for voice assistant
* add comment for voice assistant
* fix comments
* fix comments
2024-01-29 11:25:11 +08:00
binbin Deng
171fb2d185
LLM: reorganize GPU finetuning examples ( #9952 )
2024-01-25 19:02:38 +08:00
Wang, Jian4
093e6f8f73
LLM: Add qwen CPU speculative example ( #9985 )
...
* init from gpu
* update for cpu
* update
* update
* fix xpu readme
* update
* update example prompt
* update prompt and add 72b
* update
* update
2024-01-25 17:01:34 +08:00
Yina Chen
99ff6cf048
Update gpu spec decoding baichuan2 example dependency ( #9990 )
...
* add dependency
* update
* update
2024-01-25 11:05:04 +08:00
Jason Dai
3bc3d0bbcd
Update self-speculative readme ( #9986 )
2024-01-24 22:37:32 +08:00
Ruonan Wang
d4f65a6033
LLM: add mistral speculative example ( #9976 )
...
* add mistral example
* update
2024-01-24 17:35:15 +08:00
Yina Chen
b176cad75a
LLM: Add baichuan2 gpu spec example ( #9973 )
...
* add baichuan2 gpu spec example
* update readme & example
* remove print
* fix typo
* meet comments
* revert
* update
2024-01-24 16:40:16 +08:00
Jinyi Wan
ec2d9de0ea
Fix README.md for solar ( #9957 )
2024-01-24 15:50:54 +08:00
Mingyu Wei
bc9cff51a8
LLM GPU Example Update for Windows Support ( #9902 )
...
* Update README in LLM GPU Examples
* Update reference of Intel GPU
* add cpu_embedding=True in comment
* small fixes
* update GPU/README.md and add explanation for cpu_embedding=True
* address comments
* fix small typos
* add backtick for cpu_embedding=True
* remove extra backtick in the doc
* add period mark
* update readme
2024-01-24 13:42:27 +08:00
Yina Chen
5aa4b32c1b
LLM: Add qwen spec gpu example ( #9965 )
...
* add qwen spec gpu example
* update readme
---------
Co-authored-by: rnwang04 <ruonan1.wang@intel.com>
2024-01-23 15:59:43 +08:00
Ruonan Wang
60b35db1f1
LLM: add chatglm3 speculative decoding example ( #9966 )
...
* add chatglm3 example
* update
* fix
2024-01-23 15:54:12 +08:00
Ruonan Wang
27b19106f3
LLM: add readme for speculative decoding gpu examples ( #9961 )
...
* add readme
* add readme
* meet code review
2024-01-23 12:54:19 +08:00
Ruonan Wang
3e601f9a5d
LLM: Support speculative decoding in bigdl-llm ( #9951 )
...
* first commit
* fix error, add llama example
* hidden print
* update api usage
* change to api v3
* update
* meet code review
* meet code review, fix style
* add reference, fix style
* fix style
* fix first token time
2024-01-22 19:14:56 +08:00
binbin Deng
db8e90796a
LLM: add avg token latency information and benchmark guide of autotp ( #9940 )
2024-01-19 15:09:57 +08:00
Heyang Sun
5184f400f9
Fix Mixtral GGUF Wrong Output Issue ( #9930 )
...
* Fix Mixtral GGUF Wrong Output Issue
* fix style
* fix style
2024-01-18 14:11:27 +08:00
Jinyi Wan
07485eff5a
Add SOLAR-10.7B to README ( #9869 )
2024-01-11 14:28:41 +08:00
ZehuaCao
e76d984164
[LLM] Support llm-awq vicuna-7b-1.5 on arc ( #9874 )
...
* support llm-awq vicuna-7b-1.5 on arc
* support llm-awq vicuna-7b-1.5 on arc
2024-01-10 14:28:39 +08:00
Yuwen Hu
023679459e
[LLM] Small fixes for finetune related examples and UTs ( #9870 )
2024-01-09 18:05:03 +08:00
Yuwen Hu
23fc888abe
Update llm gpu xpu default related info to PyTorch 2.1 ( #9866 )
2024-01-09 15:38:47 +08:00
binbin Deng
294fd32787
LLM: update DeepSpeed AutoTP example with GPU memory optimization ( #9823 )
2024-01-09 09:22:49 +08:00
Jinyi Wan
3147ebe63d
Add cpu and gpu examples for SOLAR-10.7B ( #9821 )
2024-01-05 09:50:28 +08:00
Ruonan Wang
8504a2bbca
LLM: update qlora alpaca example to change lora usage ( #9835 )
...
* update example
* fix style
2024-01-04 15:22:20 +08:00
Ziteng Zhang
05b681fa85
[LLM] IPEX auto importer set on by default ( #9832 )
...
* Set BIGDL_IMPORT_IPEX default to True
* Remove import intel_extension_for_pytorch as ipex from GPU example
2024-01-04 13:33:29 +08:00
Wang, Jian4
4ceefc9b18
LLM: Support bitsandbytes config on qlora finetune ( #9715 )
...
* test support bitsandbytesconfig
* update style
* update cpu example
* update example
* update readme
* update unit test
* use bfloat16
* update logic
* use int4
* set defalut bnb_4bit_use_double_quant
* update
* update example
* update model.py
* update
* support lora example
2024-01-04 11:23:16 +08:00
Wang, Jian4
a54cd767b1
LLM: Add gguf falcon ( #9801 )
...
* init falcon
* update convert.py
* update style
2024-01-03 14:49:02 +08:00
binbin Deng
6584539c91
LLM: fix installation of codellama ( #9813 )
2024-01-02 14:32:50 +08:00
Wang, Jian4
7ed9538b9f
LLM: support gguf mpt ( #9773 )
...
* add gguf mpt
* update
2023-12-28 09:22:39 +08:00
binbin Deng
40edb7b5d7
LLM: fix get environment variables setting ( #9787 )
2023-12-27 09:11:37 +08:00
Jason Dai
361781bcd0
Update readme ( #9788 )
2023-12-26 19:46:11 +08:00
Ziteng Zhang
44b4a0c9c5
[LLM] Correct prompt format of Yi, Llama2 and Qwen in generate.py ( #9786 )
...
* correct prompt format of Yi
* correct prompt format of llama2 in cpu generate.py
* correct prompt format of Qwen in GPU example
2023-12-26 16:57:55 +08:00
Heyang Sun
66e286a73d
Support for Mixtral AWQ ( #9775 )
...
* Support for Mixtral AWQ
* Update README.md
* Update README.md
* Update awq_config.py
* Update README.md
* Update README.md
2023-12-25 16:08:09 +08:00
Ruonan Wang
1917bbe626
LLM: fix BF16Linear related training & inference issue ( #9755 )
...
* fix bf16 related issue
* fix
* update based on comment & add arc lora script
* update readme
* update based on comment
* update based on comment
* update
* force to bf16
* fix style
* move check input dtype into function
* update convert
* meet code review
* meet code review
* update merged model to support new training_mode api
* fix typo
2023-12-25 14:49:30 +08:00
Yina Chen
449b387125
Support relora in bigdl-llm ( #9687 )
...
* init
* fix style
* update
* support resume & update readme
* update
* update
* remove important
* add training mode
* meet comments
2023-12-25 14:04:28 +08:00
Yishuo Wang
be13b162fe
add codeshell example ( #9743 )
2023-12-25 10:54:01 +08:00
binbin Deng
ed8ed76d4f
LLM: update deepspeed autotp usage ( #9733 )
2023-12-25 09:41:14 +08:00
Qiyuan Gong
4c487313f2
Revert "[LLM] IPEX auto importer turn on by default for XPU ( #9730 )" ( #9759 )
...
This reverts commit 0284801fbd .
2023-12-22 16:38:24 +08:00
Qiyuan Gong
0284801fbd
[LLM] IPEX auto importer turn on by default for XPU ( #9730 )
...
* Set BIGDL_IMPORT_IPEX default to true, i.e., auto import IPEX for XPU.
* Remove import intel_extension_for_pytorch as ipex from GPU example.
* Add support for bigdl-core-xe-21.
2023-12-22 16:20:32 +08:00
Ruonan Wang
2f36769208
LLM: bigdl-llm lora support & lora example ( #9740 )
...
* lora support and single card example
* support multi-card, refactor code
* fix model id and style
* remove torch patch, add two new class for bf16, update example
* fix style
* change to training_mode
* small fix
* add more info in help
* fixstyle, update readme
* fix ut
* fix ut
* Handling compatibility issues with default LoraConfig
2023-12-22 11:05:39 +08:00
Wang, Jian4
984697afe2
LLM: Add bloom gguf support ( #9734 )
...
* init
* update bloom add merges
* update
* update readme
* update for llama error
* update
2023-12-21 14:06:25 +08:00
Heyang Sun
1fa7793fc0
Load Mixtral GGUF Model ( #9690 )
...
* Load Mixtral GGUF Model
* refactor
* fix empty tensor when to cpu
* update gpu and cpu readmes
* add dtype when set tensor into module
2023-12-19 13:54:38 +08:00