22 commits

Author SHA1 Message Date
Ruonan Wang
27d669210f
remove fschat in EAGLE example (#13005)
* update fschat version

* fix
2025-03-25 15:48:48 +08:00
Yishuo Wang
d0d9c9d636
remove load_in_8bit usage as it is not supported a long time ago (#12779)
2025-02-07 11:21:29 +08:00
Jiao Wang
667f0db466
Update Eagle example to Eagle2+ipex-llm integration (#11717)
* update to e2 example

* update

* update
2024-10-16 23:16:14 -07:00
Qiyuan Gong
ce3f08b25a
Fix IPEX auto importer (#11192)
* Fix ipex auto importer with Python builtins.
* Raise errors if the user imports ipex manually before importing ipex_llm. Do nothing if they import ipex after importing ipex_llm.
* Remove import ipex in examples.
2024-06-04 16:57:18 +08:00
Jiao Wang
93146b9433
Reconstruct Speculative Decoding example directory (#11136)
* update

* update

* update
2024-05-29 13:15:27 -07:00
Ruonan Wang
d550af957a
fix security issue of eagle (#11140)
* fix security issue of eagle

* small fix
2024-05-27 10:15:28 +08:00
Jean Yu
ab476c7fe2
Eagle Speculative Sampling examples (#11104)
* Eagle Speculative Sampling examples

* rm multi-gpu and ray content

* updated README to include Arc A770
2024-05-24 11:13:43 -07:00
Yina Chen
ea5b373a97
Add lookahead GPU example (#10785)
* Add lookahead example

* fix style & attn mask

* fix typo

* address comments
2024-04-17 17:41:55 +08:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 (#10711)
* create conda env with python 3.11

* recommend to use Python 3.11

* update
2024-04-09 17:41:17 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package (#10557)
* Change to 'pip install .. --extra-index-url' for readthedocs

* Change to 'pip install .. --extra-index-url' for examples

* Change to 'pip install .. --extra-index-url' for remaining files

* Fix URL for ipex

* Add links for ipex US and CN servers

* Update ipex cpu url

* remove readme

* Update for github actions

* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30)
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Yina Chen
77be19bb97
LLM: Support gpt-j in speculative decoding (#10067)
* gptj

* support gptj in speculative decoding

* fix

* update readme

* small fix
2024-02-02 14:54:55 +08:00
Wang, Jian4
093e6f8f73
LLM: Add qwen CPU speculative example (#9985)
* init from gpu

* update for cpu

* update

* update

* fix xpu readme

* update

* update example prompt

* update prompt and add 72b

* update

* update
2024-01-25 17:01:34 +08:00
Yina Chen
99ff6cf048
Update gpu spec decoding baichuan2 example dependency (#9990)
* add dependency

* update

* update
2024-01-25 11:05:04 +08:00
Jason Dai
3bc3d0bbcd
Update self-speculative readme (#9986)
2024-01-24 22:37:32 +08:00
Ruonan Wang
d4f65a6033
LLM: add mistral speculative example (#9976)
* add mistral example

* update
2024-01-24 17:35:15 +08:00
Yina Chen
b176cad75a
LLM: Add baichuan2 gpu spec example (#9973)
* add baichuan2 gpu spec example

* update readme & example

* remove print

* fix typo

* meet comments

* revert

* update
2024-01-24 16:40:16 +08:00
Yina Chen
5aa4b32c1b
LLM: Add qwen spec gpu example (#9965)
* add qwen spec gpu example

* update readme

---------

Co-authored-by: rnwang04 <ruonan1.wang@intel.com>
2024-01-23 15:59:43 +08:00
Ruonan Wang
60b35db1f1
LLM: add chatglm3 speculative decoding example (#9966)
* add chatglm3 example

* update

* fix
2024-01-23 15:54:12 +08:00
Ruonan Wang
27b19106f3
LLM: add readme for speculative decoding gpu examples (#9961)
* add readme

* add readme

* meet code review
2024-01-23 12:54:19 +08:00
Ruonan Wang
3e601f9a5d
LLM: Support speculative decoding in bigdl-llm (#9951)
* first commit

* fix error, add llama example

* hidden print

* update api usage

* change to api v3

* update

* meet code review

* meet code review, fix style

* add reference, fix style

* fix style

* fix first token time
2024-01-22 19:14:56 +08:00