ipex-llm

Author	SHA1	Message	Date
Ruonan Wang	83bd9cb681	add new version for cpp quickstart and keep an old version (#11151 ) * add new version * meet review	2024-05-28 15:29:34 +08:00
Guancheng Fu	daf7b1cd56	[Docker] Fix image using two cards error (#11144 ) * fix all * done	2024-05-27 16:20:13 +08:00
Xiangyu Tian	b3f6faa038	LLM: Add CPU vLLM entrypoint (#11083 ) Add CPU vLLM entrypoint and update CPU vLLM serving example.	2024-05-24 09:16:59 +08:00
Zhao Changmin	15d906a97b	Update linux igpu run script (#11098 ) * update run script	2024-05-22 17:18:07 +08:00
Zhao Changmin	bf0f904e66	Update level_zero on MTL linux (#11085 ) * Update level_zero on MTL --------- Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2024-05-22 11:01:56 +08:00
Guancheng Fu	f654f7e08c	Add serving docker quickstart (#11072 ) * add temp file * add initial docker readme * temp * done * add fastchat service * fix * fix * fix * fix * remove stale file	2024-05-21 17:00:58 +08:00
binbin Deng	7170dd9192	Update guide for running qwen with AutoTP (#11065 )	2024-05-20 10:53:17 +08:00
Wang, Jian4	a2e1578fd9	Merge tgi_api_server to main (#11036 ) * init * fix style * speculative can not use benchmark * add tgi server readme	2024-05-20 09:15:03 +08:00
Guancheng Fu	dfac168d5f	fix format/typo (#11067 )	2024-05-17 16:52:17 +08:00
Guancheng Fu	67db925112	Add vllm quickstart (#10978 ) * temp * add doc * finish * done * fix * add initial docker readme * temp * done fixing vllm_quickstart * done * remove not used file * add * fix	2024-05-17 16:16:42 +08:00
Wang, Jian4	00d4410746	Update cpp docker quickstart (#11040 ) * add sample output * update link * update * update header * update	2024-05-16 14:55:13 +08:00
Ruonan Wang	1d73fc8106	update cpp quickstart (#11031 )	2024-05-15 14:33:36 +08:00
Wang, Jian4	86cec80b51	LLM: Add llm inference_cpp_xpu_docker (#10933 ) * test_cpp_docker * update * update * update * update * add sudo * update nodejs version * no need npm * remove blinker * new cpp docker * restore * add line * add manually_build * update and add mtl * update for workdir llm * add benchmark part * update readme * update 1024-128 * update readme * update * fix * update * update * update readme too * update readme * no change * update dir_name * update readme	2024-05-15 11:10:22 +08:00
Yuwen Hu	c34f85e7d0	[Doc] Simplify installation on Windows for Intel GPU (#11004 ) * Simplify GPU installation guide regarding windows Prerequisites * Update Windows install quickstart on Intel GPU * Update for llama.cpp quickstart * Update regarding minimum driver version * Small fix * Update based on comments * Small fix	2024-05-15 09:55:41 +08:00
Shengsheng Huang	0b7e78b592	revise the benchmark part in python inference docker (#11020 )	2024-05-14 18:43:41 +08:00
Shengsheng Huang	586a151f9c	update the README and reorganize the docker guides structure. (#11016 ) * update the README and reorganize the docker guides structure. * modified docker install guide into overview	2024-05-14 17:56:11 +08:00
Qiyuan Gong	c957ea3831	Add axolotl main support and axolotl Llama-3-8B QLoRA example (#10984 ) * Support axolotl main (796a085). * Add axolotl Llama-3-8B QLoRA example. * Change `sequence_len` to 256 for alpaca, and revert `lora_r` value. * Add example to quick_start.	2024-05-14 13:43:59 +08:00
Shaojun Liu	7f8c5b410b	Quickstart: Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL) (#10970 ) * add entrypoint.sh * add quickstart * remove entrypoint * update * Install related library of benchmarking * update * print out results * update docs * minor update * update * update quickstart * update * update * update * update * update * update * add chat & example section * add more details * minor update * rename quickstart * update * minor update * update * update config.yaml * update readme * use --gpu * add tips * minor update * update	2024-05-14 12:58:31 +08:00
Ruonan Wang	04d5a900e1	update troubleshooting of llama.cpp (#10990 ) * update troubleshooting * small update	2024-05-13 11:18:38 +08:00
Ruonan Wang	5e0872073e	add version for llama.cpp and ollama (#10982 ) * add version for cpp * meet review	2024-05-11 09:20:31 +08:00
Ruonan Wang	b7f7d05a7e	update llama.cpp usage of llama3 (#10975 ) * update llama.cpp usage of llama3 * fix	2024-05-09 16:44:12 +08:00
Shengsheng Huang	e3159c45e4	update private gpt quickstart and a small fix for dify (#10969 )	2024-05-09 13:57:45 +08:00
Shengsheng Huang	11df5f9773	revise private GPT quickstart and a few fixes for other quickstart (#10967 )	2024-05-08 21:18:20 +08:00
Keyan (Kyrie) Zhang	37820e1d86	Add privateGPT quickstart (#10932 ) * Add privateGPT quickstart * Update privateGPT_quickstart.md * Update _toc.yml * Update _toc.yml --------- Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>	2024-05-08 20:48:00 +08:00
Xiangyu Tian	02870dc385	LLM: Refine README of AutoTP-FastAPI example (#10960 )	2024-05-08 16:55:23 +08:00
Qiyuan Gong	164e6957af	Refine axolotl quickstart (#10957 ) * Add default accelerate config for axolotl quickstart. * Fix requirement link. * Upgrade peft to 0.10.0 in requirement.	2024-05-08 09:34:02 +08:00
Shengsheng Huang	d649236321	make images clickable (#10939 )	2024-05-06 20:24:15 +08:00
Shengsheng Huang	64938c2ca7	Dify quickstart revision (#10938 ) * revise dify quickstart guide * update quick links and a small typo	2024-05-06 19:59:17 +08:00
Ruonan Wang	3f438495e4	update llama.cpp and ollama quickstart (#10929 )	2024-05-06 15:01:06 +08:00
Wang, Jian4	0e0bd309e2	LLM: Enable Speculative on Fastchat (#10909 ) * init * enable streamer * update * update * remove deprecated * update * update * add gpu example	2024-05-06 10:06:20 +08:00
Zhicun	8379f02a74	Add Dify quickstart (#10903 ) * add quick start * modify * modify * add * add * resize * add mp4 * add vedio * add video * video * add * modify * add * modify	2024-05-06 10:01:34 +08:00
Shengsheng Huang	c78a8e3677	update quickstart (#10923 )	2024-04-30 18:19:31 +08:00
Shengsheng Huang	282d676561	update continue quickstart (#10922 )	2024-04-30 17:51:21 +08:00
Yuwen Hu	71f51ce589	Initial Update for Continue Quickstart with Ollama backend (#10918 ) * Initial continue quickstart with ollama backend updates * Small fix * Small fix	2024-04-30 15:10:30 +08:00
Shaojun Liu	d058f2b403	Fix apt install oneapi scripts (#10891 ) * Fix apt install oneapi scripts * add intel-oneapi-mkl-devel * add apt pkgs	2024-04-26 16:39:37 +08:00
Qiyuan Gong	634726211a	Add video to axolotl quick start (#10870 ) * Add video to axolotl quick start. * Fix wget url.	2024-04-24 16:53:14 +08:00
Zhicun	a017bf2981	add quick start for dify (#10813 ) * add quick start * modify * modify * add * add * resize * add mp4 * add vedio * add video * video * add	2024-04-23 16:32:22 +08:00
Qiyuan Gong	bce99a5b00	Minior fix for quick start (#10857 ) * Fix typo and space in quick start.	2024-04-23 15:22:01 +08:00
Qiyuan Gong	5eee1976ac	Add Axolotl v0.4.0 quickstart (#10840 ) * Add Axolotl v0.4.0 quickstart	2024-04-23 14:57:34 +08:00
Ruonan Wang	2ec45c49d3	fix ollama quickstart(#10846 )	2024-04-22 22:04:49 +08:00
Ruonan Wang	c6e868f7ad	update oneapi usage in cpp quickstart (#10836 ) * update oneapi usage * update * small fix	2024-04-22 11:48:05 +08:00
Ruonan Wang	1edb19c1dd	small fix of cpp quickstart(#10829 )	2024-04-22 09:44:08 +08:00
SONG Ge	197f8dece9	Add open-webui windows document (#10775 ) * add windows document * update * fix document * build fix * update some description * reorg document structure * update doc * re-update to better view * add reminder for running model on gpus * update * remove useless part	2024-04-19 18:06:40 +08:00
Ruonan Wang	a8df429985	QuickStart: Run Llama 3 on Intel GPU using llama.cpp and ollama with IPEX-LLM (#10809 ) * initial commit * update llama.cpp * add demo video at first * fix ollama link in readme * meet review * update * small fix	2024-04-19 17:44:59 +08:00
Yuwen Hu	34ff07b689	Add CPU related info to langchain-chatchat quickstart (#10812 )	2024-04-19 15:59:51 +08:00
SONG Ge	fbd1743b5e	Ollama quickstart update (#10806 ) * add ollama doc for OLLAMA_NUM_GPU * remove useless params * revert unexpected changes back * move env setting to server part * update	2024-04-19 15:00:25 +08:00
ZehuaCao	a7c12020b4	Add fastchat quickstart (#10688 ) * add fastchat quickstart * update * update * update	2024-04-16 14:02:38 +08:00
Ruonan Wang	ea5e46c8cb	Small update of quickstart (#10772 )	2024-04-16 10:46:58 +08:00
Yuwen Hu	1abd77507e	Small update for GPU configuration related doc (#10770 ) * Small doc fix for dGPU type name * Further fixes * Further fix * Small fix	2024-04-15 18:43:29 +08:00
Ruonan Wang	1bd431976d	Update ollama quickstart (#10756 ) * update windows part * update ollama quickstart * update ollama * update * small fix * update * meet review	2024-04-15 16:37:55 +08:00

114 commits