hxsz1997
d86477f14d
Remove native_int4 in LangChain examples ( #10510 )
...
* rebase the modify to ipex-llm
* modify the typo
2024-03-27 17:48:16 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm ( #24 )
...
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Jin Qiao
cc5806f4bc
LLM: add save/load example for hf-transformers ( #10432 )
2024-03-22 13:57:47 +08:00
binbin Deng
2958ca49c0
LLM: add patching function for llm finetuning ( #10247 )
2024-03-21 16:01:01 +08:00
Zhicun
5b97fdb87b
update deepseek example readme ( #10420 )
...
* update readme
* update
* update readme
2024-03-21 15:21:48 +08:00
hxsz1997
a5f35757a4
Migrate langchain rag cpu example to gpu ( #10450 )
...
* add langchain rag on gpu
* add rag example in readme
* add trust_remote_code in TransformersEmbeddings.from_model_id
* add trust_remote_code in TransformersEmbeddings.from_model_id in cpu
2024-03-21 15:20:46 +08:00
Ruonan Wang
28c315a5b9
LLM: fix deepspeed error of finetuning on xpu ( #10484 )
2024-03-21 09:46:25 +08:00
Cengguang Zhang
463a86cd5d
LLM: fix qwen-vl interpolation gpu abnormal results. ( #10457 )
...
* fix qwen-vl interpolation gpu abnormal results.
* fix style.
* update qwen-vl gpu example.
* fix comment and update example.
* fix style.
2024-03-19 16:59:39 +08:00
Jiao Wang
f3fefdc9ce
fix pad_token_id issue ( #10425 )
2024-03-18 23:30:28 -07:00
Yuxuan Xia
74e7490fda
Fix Baichuan2 prompt format ( #10334 )
...
* Fix Baichuan2 prompt format
* Fix Baichuan2 README
* Change baichuan2 prompt info
* Change baichuan2 prompt info
2024-03-19 12:48:07 +08:00
Yang Wang
9e763b049c
Support running pipeline parallel inference by vertically partitioning model to different devices ( #10392 )
...
* support pipeline parallel inference
* fix logging
* remove benchmark file
* fic
* need to warmup twice
* support qwen and qwen2
* fix lint
* remove genxir
* refine
2024-03-18 13:04:45 -07:00
Wang, Jian4
1de13ea578
LLM: remove CPU english_quotes dataset and update docker example ( #10399 )
...
* update dataset
* update readme
* update docker cpu
* update xpu docker
2024-03-18 10:45:14 +08:00
Jiao Wang
5ab52ef5b5
update ( #10424 )
2024-03-15 09:24:26 -07:00
Jin Qiao
ca372f6dab
LLM: add save/load example for ModelScope ( #10397 )
...
* LLM: add sl example for modelscope
* fix according to comments
* move file
2024-03-15 15:17:50 +08:00
Wang, Jian4
fe8976a00f
LLM: Support gguf models use low_bit and fix no json( #10408 )
...
* support others model use low_bit
* update readme
* update to add *.json
2024-03-15 09:34:18 +08:00
Wang, Jian4
0193f29411
LLM : Enable gguf float16 and Yuan2 model ( #10372 )
...
* enable float16
* add yun files
* enable yun
* enable set low_bit on yuan2
* update
* update license
* update generate
* update readme
* update python style
* update
2024-03-13 10:19:18 +08:00
binbin Deng
5d7e044dbc
LLM: add low bit option in deepspeed autotp example ( #10382 )
2024-03-12 17:07:09 +08:00
binbin Deng
df3bcc0e65
LLM: remove english_quotes dataset ( #10370 )
2024-03-12 16:57:40 +08:00
binbin Deng
fe27a6971c
LLM: update modelscope version ( #10367 )
2024-03-11 16:18:27 +08:00
Zhicun
9026c08633
Fix llamaindex AutoTokenizer bug ( #10345 )
...
* fix tokenizer
* fix AutoTokenizer bug
* modify code style
2024-03-08 16:24:50 +08:00
Zhicun
2a10b53d73
rename docqa.py->rag.py ( #10353 )
2024-03-08 16:07:09 +08:00
Shengsheng Huang
370c52090c
Langchain readme ( #10348 )
...
* update langchain readme
* update readme
* create new README
* Update README_nativeint4.md
2024-03-08 14:57:24 +08:00
hxsz1997
af11c53473
Add the installation step of postgresql and pgvector on windows in LlamaIndex GPU support ( #10328 )
...
* add the installation of postgresql and pgvector of windows
* fix some format
2024-03-05 18:31:19 +08:00
dingbaorong
1e6f0c6f1a
Add llamaindex gpu example ( #10314 )
...
* add llamaindex example
* fix core dump
* refine readme
* add trouble shooting
* refine readme
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:36:00 +08:00
dingbaorong
fc7f10cd12
add langchain gpu example ( #10277 )
...
* first draft
* fix
* add readme for transformer_int4_gpu
* fix doc
* check device_map
* add arc ut test
* fix ut test
* fix langchain ut
* Refine README
* fix gpu mem too high
* fix ut test
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:33:57 +08:00
Xin Qiu
58208a5883
Update FAQ document. ( #10300 )
...
* Update install_gpu.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
2024-03-04 08:35:11 +08:00
Xin Qiu
509e206de0
update doc about gemma random and unreadable output. ( #10297 )
...
* Update install_gpu.md
* Update README.md
* Update README.md
2024-03-01 15:41:16 +08:00
Shengsheng Huang
bcfad555df
revise llamaindex readme ( #10283 )
2024-02-29 17:19:23 +08:00
Guancheng Fu
2d930bdca8
Add vLLM bf16 support ( #10278 )
...
* add argument load_in_low_bit
* add docs
* modify gpu doc
* done
---------
Co-authored-by: ivy-lv11 <lvzc@lamda.nju.edu.cn>
2024-02-29 16:33:42 +08:00
Zhicun
4e6cc424f1
Add LlamaIndex RAG ( #10263 )
...
* run demo
* format code
* add llamaindex
* add custom LLM with bigdl
* update
* add readme
* begin ut
* add unit test
* add license
* add license
* revised
* update
* modify docs
* remove data folder
* update
* modify prompt
* fixed
* fixed
* fixed
2024-02-29 15:21:19 +08:00
Ruonan Wang
a9fd20b6ba
LLM: Update qkv fusion for GGUF-IQ2 ( #10271 )
...
* first commit
* update mistral
* fix transformers==4.36.0
* fix
* disable qk for mixtral now
* fix style
2024-02-29 12:49:53 +08:00
Shengsheng Huang
db0d129226
Revert "Add rwkv example ( #9432 )" ( #10264 )
...
This reverts commit 6930422b42 .
2024-02-28 11:48:31 +08:00
Yining Wang
6930422b42
Add rwkv example ( #9432 )
...
* codeshell fix wrong urls
* restart runner
* add RWKV CPU & GPU example (rwkv-4-world-7b)
* restart runner
* update submodule
* fix runner
* runner-test
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:41:00 +08:00
Keyan (Kyrie) Zhang
59861f73e5
Add Deepseek-6.7B ( #9991 )
...
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* Add new example Deepseek
* modify deepseek
* modify deepseek
* Add verified model in README
* Turn cpu_embedding=True in Deepseek example
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:36:39 +08:00
Yuxuan Xia
2524273198
Update AutoGen README ( #10255 )
...
* Update AutoGen README
* Fix AutoGen README typos
* Update AutoGen README
* Update AutoGen README
2024-02-28 11:34:45 +08:00
Zheng, Yi
2347f611cf
Add cpu and gpu examples of Mamba ( #9797 )
...
* Add mamba cpu example
* Add mamba gpu example
* Use a smaller model as the example
* minor fixes
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:33:29 +08:00
Zhao Changmin
937e1f7c74
rebase ( #9104 )
...
Co-authored-by: leonardozcm <leonardozcm@gmail.com>
2024-02-28 11:18:21 +08:00
Zhicun
308e637d0d
Add DeepSeek-MoE-16B-Chat ( #10155 )
...
* dsmoe-hf add
* add dsmoe pytorch
* update README
* modify comment
* remove GPU example
* update model name
* format code
2024-02-28 10:12:09 +08:00
Guoqiong Song
f4a2e32106
Stream llm example for both GPU and CPU ( #9390 )
2024-02-27 15:54:47 -08:00
Keyan (Kyrie) Zhang
843fe546b0
Add CPU and GPU examples for DeciLM-7B ( #9867 )
...
* Add cpu and gpu examples for DeciLM-7B
* Add cpu and gpu examples for DeciLM-7B
* Add DeciLM-7B to README table
* modify deciLM
* modify deciLM
* modify deciLM
* Add verified model in README
* Add cpu_embedding=True
2024-02-27 13:15:49 +08:00
Heyang Sun
36a9e88104
Speculative Starcoder on CPU ( #10138 )
...
* Speculative Starcoder on CPU
* enable kv-cache pre-allocation
* refine codes
* refine
* fix style
* fix style
* fix style
* refine
* refine
* Update speculative.py
* Update gptbigcode.py
* fix style
* Update speculative.py
* enable mixed-datatype layernorm on top of torch API
* adaptive dtype
* Update README.md
2024-02-27 09:57:29 +08:00
Wang, Jian4
6c74b99a28
LLM: Update qwen readme ( #10245 )
2024-02-26 17:03:09 +08:00
Wang, Jian4
f9b75f900b
LLM: Enable qwen target_model ipex ( #10232 )
...
* change order
* enable qwen ipex
* update qwen example
* update
* fix style
* update
2024-02-26 16:41:12 +08:00
Ziteng Zhang
ea23afc8ec
[LLM]update ipex part in mistral example readme ( #10239 )
...
* update ipex part in mistral example readme
2024-02-26 14:35:20 +08:00
Xiangyu Tian
85a99e13e8
LLM: Fix ChatGLM3 Speculative Example ( #10236 )
...
Fix ChatGLM3 Speculative Example.
2024-02-26 10:57:28 +08:00
Xin Qiu
8ef5482da2
update Gemma readme ( #10229 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
2024-02-23 16:57:08 +08:00
Xin Qiu
aabfc06977
add gemma example ( #10224 )
...
* add gemma gpu example
* Update README.md
* add cpu example
* Update README.md
* Update README.md
* Update generate.py
* Update generate.py
2024-02-23 15:20:57 +08:00
yb-peng
a2c1675546
Add CPU and GPU examples for Yuan2-2B-hf ( #9946 )
...
* Add a new CPU example of Yuan2-2B-hf
* Add a new CPU generate.py of Yuan2-2B-hf example
* Add a new GPU example of Yuan2-2B-hf
* Add Yuan2 to README table
* In CPU example:1.Use English as default prompt; 2.Provide modified files in yuan2-2B-instruct
* In GPU example:1.Use English as default prompt;2.Provide modified files
* GPU example:update README
* update Yuan2-2B-hf in README table
* Add CPU example for Yuan2-2B in Pytorch-Models
* Add GPU example for Yuan2-2B in Pytorch-Models
* Add license in generate.py; Modify README
* In GPU Add license in generate.py; Modify README
* In CPU yuan2 modify README
* In GPU yuan2 modify README
* In CPU yuan2 modify README
* In GPU example, updated the readme for Windows GPU supports
* In GPU torch example, updated the readme for Windows GPU supports
* GPU hf example README modified
* GPU example README modified
2024-02-23 14:09:30 +08:00
yb-peng
f1f4094a09
Add CPU and GPU examples of phi-2 ( #10014 )
...
* Add CPU and GPU examples of phi-2
* In GPU hf example, updated the readme for Windows GPU supports
* In GPU torch example, updated the readme for Windows GPU supports
* update the table in BigDL/README.md
* update the table in BigDL/python/llm/README.md
2024-02-23 14:05:53 +08:00