ipex-llm

Author	SHA1	Message	Date
Jiao Wang	33b9e7744d	fix dimension (#10097 )	2024-02-05 15:07:38 -08:00
SONG Ge	4b02ff188b	[WebUI] Add prompt format and stopping words for Qwen (#10066 ) * add prompt format and stopping_words for qwen model * performance optimization * optimize * update * meet comments	2024-02-05 18:23:13 +08:00
WeiguangHan	0aecd8637b	LLM: small fix for the html script (#10094 )	2024-02-05 17:27:34 +08:00
Zhicun	7d2be7994f	add phixtral and optimize phi-moe (#10052 )	2024-02-05 11:12:47 +08:00
Zhicun	676d6923f2	LLM: modify transformersembeddings.embed() in langchain (#10051 )	2024-02-05 10:42:10 +08:00
Jin Qiao	ad050107b3	LLM: fix mpt load_low_bit issue (#10075 ) * fix * retry * retry	2024-02-05 10:17:07 +08:00
Lilac09	f8dcaff7f4	use default python (#10070 )	2024-02-05 09:06:59 +08:00
SONG Ge	9050991e4e	fix gradio check issue temporarily (#10082 )	2024-02-04 16:46:29 +08:00
WeiguangHan	c2e562d037	LLM: add batch_size to the csv and html (#10080 ) * LLM: add batch_size to the csv and html * small fix	2024-02-04 16:35:44 +08:00
Yuwen Hu	136f042f84	[LLM] Make sure python 310-311 tests only happen for nightly tests (#10081 ) * Make sure python 310-311 tests only happen for nightly tests * Use default runner for setup-python-version * Small fixes	2024-02-04 16:14:48 +08:00
binbin Deng	7e49fbc5dd	LLM: make finetuning examples more common for other models (#10078 )	2024-02-04 16:03:52 +08:00
Heyang Sun	90f004b80b	remove benchmarkwrapper from deepspeed example (#10079 )	2024-02-04 15:42:15 +08:00
Jin Qiao	f9a468a2c7	LLM: conditionally choose python version for unit test (#10062 ) * conditional python version * retry * temporary skip llm-cpp-build * apply on llm-unit-test-on-arc * fix * add llm-cpp-build dependency * use GITHUB_OUTPUT instead of set-output * check nightly build * fix quote * fix quote * add llm-cpp-build dependency * test nightly build * test pull request	2024-02-04 13:37:34 +08:00
Ruonan Wang	8e33cb0f38	LLM: support speecht5_tts (#10077 ) * support speecht5_tts * fix	2024-02-04 13:26:42 +08:00
yb-peng	738275761d	In llm-harness-evaluation, add new models and change schedule to nightly (#10072 ) * add new models and change schedule to nightly * correct syntax error * modify env set up and job * change label and schedule time * change schedule time * change label	2024-02-04 13:12:09 +08:00
Shaojun Liu	698f84648c	split stable version tests (#10076 ) Co-authored-by: Your Name <Your Email>	2024-02-04 11:08:12 +08:00
ivy-lv11	428b7105f6	Add HF and PyTorch example InternLM2 (#10061 )	2024-02-04 10:25:55 +08:00
binbin Deng	91cf9d41d0	LLM: add solutions of some frequently asked questions (#10068 )	2024-02-04 09:28:20 +08:00
Yina Chen	77be19bb97	LLM: Support gpt-j in speculative decoding (#10067 ) * gptj * support gptj in speculative decoding * fix * update readme * small fix	2024-02-02 14:54:55 +08:00
Jason Dai	2927c77d7f	Update readme (#10071 )	2024-02-01 20:40:20 -08:00
SONG Ge	19183ef476	[WebUI] Reset bigdl-llm loader options with default value (#10064 ) * reset bigdl-llm loader options with default value * remove options which maybe complex for naive users	2024-02-01 15:45:39 +08:00
Xin Qiu	6e0f1a1e92	use apply_rotary_pos_emb_cache_freq_xpu in mixtral (#10060 ) * use apply_rotary_pos_emb_cache_freq_xpu in mixtral * fix style	2024-02-01 15:40:49 +08:00
binbin Deng	aae20d728e	LLM: Add initial DPO finetuning example (#10021 )	2024-02-01 14:18:08 +08:00
Heyang Sun	601024f418	Mistral CPU example of speculative decoding (#10024 ) * Mistral CPU example of speculative decoding * update transformers version * update example * Update README.md	2024-02-01 10:52:32 +08:00
Heyang Sun	968e70544d	Enable IPEX Mistral in Speculative (#10059 )	2024-02-01 10:48:16 +08:00
Yina Chen	3ca03d4e97	Add deepmind sample into bigdl-llm speculative decoding (#10041 ) * migrate deepmind sample * update * meet comments * fix style * fix style	2024-02-01 09:57:02 +08:00
Lilac09	72e67eedbb	Add speculative support in docker (#10058 ) * add speculative environment * add speculative environment * add speculative environment	2024-02-01 09:53:53 +08:00
binbin Deng	4b92235bdb	LLM: add initial FAQ page (#10055 )	2024-02-01 09:43:39 +08:00
WeiguangHan	d2d3f6b091	LLM: ensure the result of daily arc perf test (#10016 ) * ensure the result of daily arc perf test * small fix * small fix * small fix * small fix * small fix * small fix * small fix * small fix * small fix * small fix * concat more csvs * small fix * revert some files	2024-01-31 18:26:21 +08:00
WeiguangHan	9724939499	temporarily disable bloom 2k input (#10056 )	2024-01-31 17:49:12 +08:00
Jin Qiao	8c8fc148c9	LLM: add rwkv 5 (#10048 )	2024-01-31 15:54:55 +08:00
WeiguangHan	a9018a0e95	LLM: modify the GPU example for redpajama model (#10044 ) * LLM: modify the GPU example for redpajama model * small fix	2024-01-31 14:32:08 +08:00
Yuxuan Xia	95636cad97	Add AutoGen CPU and XPU Example (#9980 ) * Add AutoGen example * Adjust AutoGen README * Adjust AutoGen README * Change AutoGen README * Change AutoGen README	2024-01-31 11:31:18 +08:00
Heyang Sun	7284edd9b7	Vicuna CPU example of speculative decoding (#10018 ) * Vicuna CPU example of speculative decoding * Update speculative.py * Update README.md * add requirements for ipex * Update README.md * Update speculative.py * Update speculative.py	2024-01-31 11:23:50 +08:00
Wang, Jian4	7e5cd42a5c	LLM : Update optimize ipex bf16 (#10038 ) * use 4.35.2 and remove * update rmsnorm * remove * remove * update python style * update * update python style * update * fix style * update * remove whitespace	2024-01-31 10:59:55 +08:00
Wang, Jian4	fb53b994f8	LLM : Add llama ipex optimized (#10046 ) * init ipex * remove padding	2024-01-31 10:38:46 +08:00
Ruonan Wang	3685622f29	LLM: fix llama 4.36 forward(#10047 )	2024-01-31 10:31:10 +08:00
Yishuo Wang	53a5140eff	Optimize rwkv v5 rest token again (#10043 )	2024-01-31 10:01:11 +08:00
Heyang Sun	b1ff28ceb6	LLama2 CPU example of speculative decoding (#9962 ) * LLama2 example of speculative decoding * add docs * Update speculative.py * Update README.md * Update README.md * Update speculative.py * remove autocast	2024-01-31 09:45:20 +08:00
WeiguangHan	0fcad6ce14	LLM: add gpu example for redpajama models (#10040 )	2024-01-30 19:39:28 +08:00
Shaojun Liu	2a0d65a009	Bump aiohttp from 3.9.0 to 3.9.2 to resolve security issues (#10030 ) * Bump aiohttp from 3.9.0 to 3.9.2 to resolve security issues * update url * remove aiohttp version from requirements.txt * revert aiohttp to 3.9.0 * trigger tests * revert	2024-01-30 19:36:02 +08:00
Yuwen Hu	863c3f94d0	[LLM] Change nightly perf to install from pypi (#10027 ) * Change to install from pypi and have a check to make sure the installed bigdl-llm version is as expected * Make sure result date is the same as tested bigdl-llm version * Small fixes * Small fix * Small fixes * Small fix * Small fixes * Small updates	2024-01-30 18:15:44 +08:00
Xiangyu Tian	9978089796	[LLM] Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example (#10028 ) Enable BIGDL_OPT_IPEX in speculative baichuan2 13b example	2024-01-30 17:11:37 +08:00
Ovo233	226f398c2a	fix ppl test errors (#10036 )	2024-01-30 16:26:21 +08:00
Xin Qiu	13e61738c5	hide detail memory for each token in benchmark_utils.py (#10037 )	2024-01-30 16:04:17 +08:00
Ruonan Wang	6b63ba23d1	LLM: add full module name during convert (#10035 )	2024-01-30 14:43:07 +08:00
Yishuo Wang	7dfa6dbe46	add rwkv time shift optimization (#10032 )	2024-01-30 14:10:55 +08:00
Xiangyu Tian	f57d0fda8b	[LLM] Use IPEX Optimization for Self Speculative Decoding (#9997 ) Use IPEX Optimization for Self Speculative Decoding	2024-01-30 09:11:06 +08:00
Ruonan Wang	ccf8f613fb	LLM: update fp16 Linear on ARC/FLEX (#10023 )	2024-01-29 18:25:26 +08:00
Yuwen Hu	a5c9dfdf91	[LLM] Main readme gpu installation related updates (#9868 ) * Main readme gpu installation related updates * Small updates for readthedocs main page	2024-01-29 16:33:27 +08:00

2218 commits