Keyan (Kyrie) Zhang
1e27e08322
Modify example from fp32 to fp16 ( #10528 )
...
* Modify example from fp32 to fp16
* Remove Falcon from fp16 example for now
* Remove MPT from fp16 example
2024-04-09 15:45:49 +08:00
binbin Deng
d9a1153b4e
LLM: upgrade deepspeed in AutoTP on GPU ( #10647 )
2024-04-07 14:05:19 +08:00
Zhicun
9d8ba64c0d
Llamaindex: add tokenizer_id and support chat ( #10590 )
...
* add tokenizer_id
* fix
* modify
* add from_model_id and from_mode_id_low_bit
* fix typo and add comment
* fix python code style
---------
Co-authored-by: pengyb2001 <284261055@qq.com>
2024-04-07 13:51:34 +08:00
Jin Qiao
10ee786920
Replace with IPEX-LLM in example comments ( #10671 )
...
* Replace with IPEX-LLM in example comments
* More replacement
* revert some changes
2024-04-07 13:29:51 +08:00
Jiao Wang
69bdbf5806
Fix vllm print error message issue ( #10664 )
...
* update chatglm readme
* Add condition to invalidInputError
* update
* update
* style
2024-04-05 15:08:13 -07:00
Jason Dai
29d97e4678
Update readme ( #10665 )
2024-04-05 18:01:57 +08:00
Jin Qiao
cc8b3be11c
Add GPU and CPU example for stablelm-zephyr-3b ( #10643 )
...
* Add example for StableLM
* fix
* add to readme
2024-04-03 16:28:31 +08:00
Heyang Sun
6000241b10
Add Deepspeed Example of FLEX Mistral ( #10640 )
2024-04-03 16:04:17 +08:00
Zhicun
b827f534d5
Add tokenizer_id in Langchain ( #10588 )
...
* fix low-bit
* fix
* fix style
---------
Co-authored-by: arda <arda@arda-arc12.sh.intel.com>
2024-04-03 14:25:35 +08:00
Zhicun
f6fef09933
fix prompt format for llama-2 in langchain ( #10637 )
2024-04-03 14:17:34 +08:00
Jiao Wang
330d4b4f4b
update readme ( #10631 )
2024-04-02 23:08:02 -07:00
Jiao Wang
4431134ec5
update readme ( #10632 )
2024-04-02 19:54:30 -07:00
Jiao Wang
654dc5ba57
Fix Qwen-VL example problem ( #10582 )
...
* update
* update
* update
* update
2024-04-02 12:17:30 -07:00
Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j ( #10606 )
...
* remove ipex.optimize
* fix
* fix
2024-04-01 12:21:49 +08:00
Keyan (Kyrie) Zhang
848fa04dd6
Fix typo in Baichuan2 example ( #10589 )
2024-03-29 13:31:47 +08:00
ZehuaCao
52a2135d83
Replace ipex with ipex-llm ( #10554 )
...
* fix ipex with ipex_llm
* fix ipex with ipex_llm
* update
* update
* update
* update
* update
* update
* update
* update
2024-03-28 13:54:40 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
...
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation ( #10561 )
2024-03-27 17:58:57 +08:00
hxsz1997
d86477f14d
Remove native_int4 in LangChain examples ( #10510 )
...
* rebase the modify to ipex-llm
* modify the typo
2024-03-27 17:48:16 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm ( #24 )
...
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Jin Qiao
cc5806f4bc
LLM: add save/load example for hf-transformers ( #10432 )
2024-03-22 13:57:47 +08:00
binbin Deng
2958ca49c0
LLM: add patching function for llm finetuning ( #10247 )
2024-03-21 16:01:01 +08:00
Zhicun
5b97fdb87b
update deepseek example readme ( #10420 )
...
* update readme
* update
* update readme
2024-03-21 15:21:48 +08:00
hxsz1997
a5f35757a4
Migrate langchain rag cpu example to gpu ( #10450 )
...
* add langchain rag on gpu
* add rag example in readme
* add trust_remote_code in TransformersEmbeddings.from_model_id
* add trust_remote_code in TransformersEmbeddings.from_model_id in cpu
2024-03-21 15:20:46 +08:00
Ruonan Wang
28c315a5b9
LLM: fix deepspeed error of finetuning on xpu ( #10484 )
2024-03-21 09:46:25 +08:00
Cengguang Zhang
463a86cd5d
LLM: fix qwen-vl interpolation gpu abnormal results. ( #10457 )
...
* fix qwen-vl interpolation gpu abnormal results.
* fix style.
* update qwen-vl gpu example.
* fix comment and update example.
* fix style.
2024-03-19 16:59:39 +08:00
Jiao Wang
f3fefdc9ce
fix pad_token_id issue ( #10425 )
2024-03-18 23:30:28 -07:00
Yuxuan Xia
74e7490fda
Fix Baichuan2 prompt format ( #10334 )
...
* Fix Baichuan2 prompt format
* Fix Baichuan2 README
* Change baichuan2 prompt info
* Change baichuan2 prompt info
2024-03-19 12:48:07 +08:00
Yang Wang
9e763b049c
Support running pipeline parallel inference by vertically partitioning model to different devices ( #10392 )
...
* support pipeline parallel inference
* fix logging
* remove benchmark file
* fic
* need to warmup twice
* support qwen and qwen2
* fix lint
* remove genxir
* refine
2024-03-18 13:04:45 -07:00
Wang, Jian4
1de13ea578
LLM: remove CPU english_quotes dataset and update docker example ( #10399 )
...
* update dataset
* update readme
* update docker cpu
* update xpu docker
2024-03-18 10:45:14 +08:00
Jiao Wang
5ab52ef5b5
update ( #10424 )
2024-03-15 09:24:26 -07:00
Jin Qiao
ca372f6dab
LLM: add save/load example for ModelScope ( #10397 )
...
* LLM: add sl example for modelscope
* fix according to comments
* move file
2024-03-15 15:17:50 +08:00
Wang, Jian4
fe8976a00f
LLM: Support gguf models use low_bit and fix no json( #10408 )
...
* support others model use low_bit
* update readme
* update to add *.json
2024-03-15 09:34:18 +08:00
Wang, Jian4
0193f29411
LLM : Enable gguf float16 and Yuan2 model ( #10372 )
...
* enable float16
* add yun files
* enable yun
* enable set low_bit on yuan2
* update
* update license
* update generate
* update readme
* update python style
* update
2024-03-13 10:19:18 +08:00
binbin Deng
5d7e044dbc
LLM: add low bit option in deepspeed autotp example ( #10382 )
2024-03-12 17:07:09 +08:00
binbin Deng
df3bcc0e65
LLM: remove english_quotes dataset ( #10370 )
2024-03-12 16:57:40 +08:00
binbin Deng
fe27a6971c
LLM: update modelscope version ( #10367 )
2024-03-11 16:18:27 +08:00
Zhicun
9026c08633
Fix llamaindex AutoTokenizer bug ( #10345 )
...
* fix tokenizer
* fix AutoTokenizer bug
* modify code style
2024-03-08 16:24:50 +08:00
Zhicun
2a10b53d73
rename docqa.py->rag.py ( #10353 )
2024-03-08 16:07:09 +08:00
Shengsheng Huang
370c52090c
Langchain readme ( #10348 )
...
* update langchain readme
* update readme
* create new README
* Update README_nativeint4.md
2024-03-08 14:57:24 +08:00
hxsz1997
af11c53473
Add the installation step of postgresql and pgvector on windows in LlamaIndex GPU support ( #10328 )
...
* add the installation of postgresql and pgvector of windows
* fix some format
2024-03-05 18:31:19 +08:00
dingbaorong
1e6f0c6f1a
Add llamaindex gpu example ( #10314 )
...
* add llamaindex example
* fix core dump
* refine readme
* add trouble shooting
* refine readme
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:36:00 +08:00
dingbaorong
fc7f10cd12
add langchain gpu example ( #10277 )
...
* first draft
* fix
* add readme for transformer_int4_gpu
* fix doc
* check device_map
* add arc ut test
* fix ut test
* fix langchain ut
* Refine README
* fix gpu mem too high
* fix ut test
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
2024-03-05 13:33:57 +08:00
Xin Qiu
58208a5883
Update FAQ document. ( #10300 )
...
* Update install_gpu.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
* Update README.md
* Update resolve_error.md
2024-03-04 08:35:11 +08:00
Xin Qiu
509e206de0
update doc about gemma random and unreadable output. ( #10297 )
...
* Update install_gpu.md
* Update README.md
* Update README.md
2024-03-01 15:41:16 +08:00
Shengsheng Huang
bcfad555df
revise llamaindex readme ( #10283 )
2024-02-29 17:19:23 +08:00
Guancheng Fu
2d930bdca8
Add vLLM bf16 support ( #10278 )
...
* add argument load_in_low_bit
* add docs
* modify gpu doc
* done
---------
Co-authored-by: ivy-lv11 <lvzc@lamda.nju.edu.cn>
2024-02-29 16:33:42 +08:00
Zhicun
4e6cc424f1
Add LlamaIndex RAG ( #10263 )
...
* run demo
* format code
* add llamaindex
* add custom LLM with bigdl
* update
* add readme
* begin ut
* add unit test
* add license
* add license
* revised
* update
* modify docs
* remove data folder
* update
* modify prompt
* fixed
* fixed
* fixed
2024-02-29 15:21:19 +08:00
Ruonan Wang
a9fd20b6ba
LLM: Update qkv fusion for GGUF-IQ2 ( #10271 )
...
* first commit
* update mistral
* fix transformers==4.36.0
* fix
* disable qk for mixtral now
* fix style
2024-02-29 12:49:53 +08:00