194 commits

Author SHA1 Message Date
Heyang Sun
581ebf6104
GaLore Finetuning Example (#10722)
* GaLore Finetuning Example

* Update README.md

* Update README.md

* change data to HuggingFaceH4/helpful_instructions

* Update README.md

* Update README.md

* shrink train size and delete cache before starting training to save memory

* Update README.md

* Update galore_finetuning.py

* change model to llama2 3b

* Update README.md
2024-04-18 13:47:41 +08:00
Yina Chen
ea5b373a97
Add lookahead GPU example (#10785)
* Add lookahead example

* fix style & attn mask

* fix typo

* address comments
2024-04-17 17:41:55 +08:00
Cengguang Zhang
7ec82c6042
LLM: add README.md for Long-Context examples. (#10765)
* LLM: add readme to long-context examples.

* add precision.

* update wording.

* add GPU type.

* add Long-Context example to GPU examples.

* fix comments.

* update max input length.

* update max length.

* add output length.

* fix wording.
2024-04-17 15:34:59 +08:00
Qiyuan Gong
9e5069437f
Fix gradio version in axolotl example (#10776)
* Change to gradio>=4.19.2
2024-04-17 10:23:43 +08:00
Qiyuan Gong
f2e923b3ca
Axolotl v0.4.0 support (#10773)
* Add Axolotl 0.4.0, remove legacy 0.3.0 support.
* replace is_torch_bf16_gpu_available
* Add HF_HUB_OFFLINE=1
* Move transformers out of requirement
* Refine readme and qlora.yml
2024-04-17 09:49:11 +08:00
Heyang Sun
26cae0a39c
Update FLEX in Deepspeed README (#10774)
* Update FLEX in Deepspeed README

* Update README.md
2024-04-17 09:28:24 +08:00
Qiyuan Gong
d30b22a81b
Refine axolotl 0.3.0 documents and links (#10764)
* Refine axolotl 0.3 based on comments
* Rename requirements to requirement-xpu
* Add comments for paged_adamw_32bit
* change lora_r from 8 to 16
2024-04-16 14:47:45 +08:00
ZehuaCao
599a88db53
Add deepspeed-autoTP-Fastapi serving (#10748)
* add deepspeed-autoTP-Fastapi serving

* add readme

* add license

* update

* update

* fix
2024-04-16 14:03:23 +08:00
Jin Qiao
73a67804a4
GPU configuration update for examples (windows pip installer, etc.) (#10762)
* renew chatglm3-6b gpu example readme

fix

fix

fix

* fix for comments

* fix

* fix

* fix

* fix

* fix

* apply on HF-Transformers-AutoModels

* apply on PyTorch-Models

* fix

* fix
2024-04-15 17:42:52 +08:00
yb-peng
b5209d3ec1
Update example/GPU/PyTorch-Models/Model/llava/README.md (#10757)
* Update example/GPU/PyTorch-Models/Model/llava/README.md

* Update README.md

fix path in windows installation
2024-04-15 13:01:37 +08:00
Jiao Wang
9e668a5bf0
fix_internlm-chat-7b-8k repo name in examples (#10747) 2024-04-12 10:15:48 -07:00
Keyan (Kyrie) Zhang
1256a2cc4e
Add chatglm3 long input example (#10739)
* Add long context input example for chatglm3

* Small fix

* Small fix

* Small fix
2024-04-11 16:33:43 +08:00
Qiyuan Gong
2d64630757
Remove transformers version in axolotl example (#10736)
* Remove transformers version in axolotl requirements.txt
2024-04-11 14:02:31 +08:00
Shaojun Liu
29bf28bd6f
Upgrade python to 3.11 in Docker Image (#10718)
* install python 3.11 for cpu-inference docker image

* update xpu-inference dockerfile

* update cpu-serving image

* update qlora image

* update lora image

* update document
2024-04-10 14:41:27 +08:00
Qiyuan Gong
b727767f00
Add axolotl v0.3.0 with ipex-llm on Intel GPU (#10717)
* Add axolotl v0.3.0 support on Intel GPU.
* Add finetune example on llama-2-7B with Alpaca dataset.
2024-04-10 14:38:29 +08:00
Jiao Wang
878a97077b
Fix llava example to support transformers 4.36 (#10614)
* fix llava example

* update
2024-04-09 13:47:07 -07:00
Jiao Wang
1e817926ba
Fix low memory generation example issue in transformers 4.36 (#10702)
* update cache in low memory generate

* update
2024-04-09 09:56:52 -07:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 (#10711)
* create conda env with python 3.11

* recommend to use Python 3.11

* update
2024-04-09 17:41:17 +08:00
Cengguang Zhang
6a32216269
LLM: add llama2 8k input example. (#10696)
* LLM: add llama2-32K example.

* refactor name.

* fix comments.

* add IPEX_LLM_LOW_MEM notes and update sample output.
2024-04-09 16:02:37 +08:00
Keyan (Kyrie) Zhang
1e27e08322
Modify example from fp32 to fp16 (#10528)
* Modify example from fp32 to fp16

* Remove Falcon from fp16 example for now

* Remove MPT from fp16 example
2024-04-09 15:45:49 +08:00
binbin Deng
d9a1153b4e
LLM: upgrade deepspeed in AutoTP on GPU (#10647) 2024-04-07 14:05:19 +08:00
Zhicun
9d8ba64c0d
Llamaindex: add tokenizer_id and support chat (#10590)
* add tokenizer_id

* fix

* modify

* add from_model_id and from_model_id_low_bit

* fix typo and add comment

* fix python code style

---------

Co-authored-by: pengyb2001 <284261055@qq.com>
2024-04-07 13:51:34 +08:00
Jin Qiao
10ee786920
Replace with IPEX-LLM in example comments (#10671)
* Replace with IPEX-LLM in example comments

* More replacement

* revert some changes
2024-04-07 13:29:51 +08:00
Jiao Wang
69bdbf5806
Fix vllm print error message issue (#10664)
* update chatglm readme

* Add condition to invalidInputError

* update

* update

* style
2024-04-05 15:08:13 -07:00
Jason Dai
29d97e4678
Update readme (#10665) 2024-04-05 18:01:57 +08:00
Jin Qiao
cc8b3be11c
Add GPU and CPU example for stablelm-zephyr-3b (#10643)
* Add example for StableLM

* fix

* add to readme
2024-04-03 16:28:31 +08:00
Heyang Sun
6000241b10
Add Deepspeed Example of FLEX Mistral (#10640) 2024-04-03 16:04:17 +08:00
Zhicun
b827f534d5
Add tokenizer_id in Langchain (#10588)
* fix low-bit

* fix

* fix style

---------

Co-authored-by: arda <arda@arda-arc12.sh.intel.com>
2024-04-03 14:25:35 +08:00
Zhicun
f6fef09933
fix prompt format for llama-2 in langchain (#10637) 2024-04-03 14:17:34 +08:00
Jiao Wang
330d4b4f4b
update readme (#10631) 2024-04-02 23:08:02 -07:00
Jiao Wang
654dc5ba57
Fix Qwen-VL example problem (#10582)
* update

* update

* update

* update
2024-04-02 12:17:30 -07:00
Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j (#10606)
* remove ipex.optimize

* fix

* fix
2024-04-01 12:21:49 +08:00
Keyan (Kyrie) Zhang
848fa04dd6
Fix typo in Baichuan2 example (#10589) 2024-03-29 13:31:47 +08:00
ZehuaCao
52a2135d83
Replace ipex with ipex-llm (#10554)
* fix ipex with ipex_llm

* fix ipex with ipex_llm

* update

* update

* update

* update

* update

* update

* update

* update
2024-03-28 13:54:40 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package (#10557)
* Change to 'pip install .. --extra-index-url' for readthedocs

* Change to 'pip install .. --extra-index-url' for examples

* Change to 'pip install .. --extra-index-url' for remaining files

* Fix URL for ipex

* Add links for ipex US and CN servers

* Update ipex cpu url

* remove readme

* Update for github actions

* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation (#10561) 2024-03-27 17:58:57 +08:00
hxsz1997
d86477f14d
Remove native_int4 in LangChain examples (#10510)
* rebase the modify to ipex-llm

* modify the typo
2024-03-27 17:48:16 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30) 2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Jin Qiao
cc5806f4bc
LLM: add save/load example for hf-transformers (#10432) 2024-03-22 13:57:47 +08:00
binbin Deng
2958ca49c0
LLM: add patching function for llm finetuning (#10247) 2024-03-21 16:01:01 +08:00
hxsz1997
a5f35757a4
Migrate langchain rag cpu example to gpu (#10450)
* add langchain rag on gpu

* add rag example in readme

* add trust_remote_code in TransformersEmbeddings.from_model_id

* add trust_remote_code in TransformersEmbeddings.from_model_id in cpu
2024-03-21 15:20:46 +08:00
Ruonan Wang
28c315a5b9
LLM: fix deepspeed error of finetuning on xpu (#10484) 2024-03-21 09:46:25 +08:00
Cengguang Zhang
463a86cd5d
LLM: fix qwen-vl interpolation gpu abnormal results. (#10457)
* fix qwen-vl interpolation gpu abnormal results.

* fix style.

* update qwen-vl gpu example.

* fix comment and update example.

* fix style.
2024-03-19 16:59:39 +08:00
Jiao Wang
f3fefdc9ce
fix pad_token_id issue (#10425) 2024-03-18 23:30:28 -07:00
Yuxuan Xia
74e7490fda
Fix Baichuan2 prompt format (#10334)
* Fix Baichuan2 prompt format

* Fix Baichuan2 README

* Change baichuan2 prompt info

* Change baichuan2 prompt info
2024-03-19 12:48:07 +08:00
Yang Wang
9e763b049c
Support running pipeline parallel inference by vertically partitioning model to different devices (#10392)
* support pipeline parallel inference

* fix logging

* remove benchmark file

* fix

* need to warmup twice

* support qwen and qwen2

* fix lint

* remove genxir

* refine
2024-03-18 13:04:45 -07:00
Jiao Wang
5ab52ef5b5
update (#10424) 2024-03-15 09:24:26 -07:00
Jin Qiao
ca372f6dab
LLM: add save/load example for ModelScope (#10397)
* LLM: add sl example for modelscope

* fix according to comments

* move file
2024-03-15 15:17:50 +08:00
Wang, Jian4
fe8976a00f
LLM: Support gguf models use low_bit and fix no json (#10408)
* support others model use low_bit

* update readme

* update to add *.json
2024-03-15 09:34:18 +08:00