ipex-llm

Author	SHA1	Message	Date
Shaojun Liu	8aabb5bac7	Enable CodeQL Check for CT39 (#11242 ) * Create codeql.yml * Update codeql.yml * Update codeql.yml * Update codeql.yml * Update codeql.yml	2024-06-06 17:41:12 +08:00
hxsz1997	b6234eb4e2	Add task in allinone (#11226 ) * add task * update prompt * modify typos * add more cases in summarize * Make the summarize & QA prompt preprocessing as a util function	2024-06-06 17:22:40 +08:00
Wenjing Margaret Mao	c825a7e1e9	change the workflow file to test ftp (#11241 ) * change the workflow to test ftp * comment some models * revert file --------- Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>	2024-06-06 16:53:19 +08:00
Yishuo Wang	2e4ccd541c	fix qwen2 cpu (#11240 )	2024-06-06 16:24:19 +08:00
Yishuo Wang	e738ec38f4	disable quantize kv in specific qwen model (#11238 )	2024-06-06 14:08:39 +08:00
Yishuo Wang	c4e5806e01	add latest optimization in starcoder2 (#11236 )	2024-06-06 14:02:17 +08:00
Yishuo Wang	ba27e750b1	refactor yuan2 (#11235 )	2024-06-06 13:17:54 +08:00
Shaojun Liu	6be24fdd28	OSPDT: add tpp licenses (#11165 ) * add tpp licenses * add licenses * add licenses * delete mitchellh-mapstructure license * delete stb-image public domain license * add README.md * remove core-xe related licenses	2024-06-06 10:59:06 +08:00
Guoqiong Song	09c6780d0c	phi-2 transformers 4.37 (#11161 ) * phi-2 transformers 4.37	2024-06-05 13:36:41 -07:00
Guoqiong Song	f6d5c6af78	fix issue 1407 (#11171 )	2024-06-05 13:35:57 -07:00
Zijie Li	bfa1367149	Add CPU and GPU example for MiniCPM (#11202 ) * Change installation address Change former address: "https://docs.conda.io/en/latest/miniconda.html#" to new address: "https://conda-forge.org/download/" for 63 occurrences under python\llm\example * Change Prompt Change "Anaconda Prompt" to "Miniforge Prompt" for 1 occurrence * Create and update model minicpm * Update model minicpm Update model minicpm under GPU/PyTorch-Models * Update readme and generate.py change "prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=False)" and delete "pip install transformers==4.37.0 " * Update comments for minicpm GPU Update comments for generate.py at minicpm GPU * Add CPU example for MiniCPM * Update minicpm README for CPU * Update README for MiniCPM and Llama3 * Update Readme for Llama3 CPU Pytorch * Update and fix comments for MiniCPM	2024-06-05 18:09:53 +08:00
Xu, Shuo	a27a559650	Add some information in FAQ to help users solve "RuntimeError: could not create a primitive" error on Windows (#11221 ) * Add some information to help users to solve "could not create a primitive" error in Windows. * Small update --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com> Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2024-06-05 17:57:42 +08:00
Yuwen Hu	af96579c76	Update installation guide for pipeline parallel inference (#11224 ) * Update installation guide for pipeline parallel inference * Small fix * further fix * Small fix * Small fix * Update based on comments * Small fix * Small fix * Small fix	2024-06-05 17:54:29 +08:00
Yina Chen	ed67435491	Support Fp6 k in ipex-llm (#11222 ) * support fp6_k * support fp6_k * remove * fix style	2024-06-05 17:34:36 +08:00
binbin Deng	a6674f5bce	Fix `should_use_fuse_rope` error of Qwen1.5-MoE-A2.7B-Chat (#11216 )	2024-06-05 15:56:10 +08:00
Wenjing Margaret Mao	231b968aba	Modify the check_results.py to support batch 2&4 (#11133 ) * add batch 2&4 and exclude to perf_test * modify the perf-test&437 yaml * modify llm_performance_test.yml * remove batch 4 * modify check_results.py to support batch 2&4 * change the batch_size format * remove genxir * add str(batch_size) * change actual_test_casese in check_results file to support batch_size * change html highlight * less models to test html and html_path * delete the moe model * split batch html * split * use installing from pypi * use installing from pypi - batch2 * revert cpp * revert cpp * merge two jobs into one, test batch_size in one job * merge two jobs into one, test batch_size in one job * change file directory in workflow * try catch deal with odd file without batch_size * modify pandas version * change the dir * organize the code * organize the code * remove Qwen-MOE * modify based on feedback * modify based on feedback * modify based on second round of feedback * modify based on second round of feedback + change run-arc.sh mode * modify based on second round of feedback + revert config * modify based on second round of feedback + revert config * modify based on second round of feedback + remove comments * modify based on second round of feedback + remove comments * modify based on second round of feedback + revert arc-perf-test * modify based on third round of feedback * change error type * change error type * modify check_results.html * split batch into two folders * add all models * move csv_name * revert pr test * revert pr test --------- Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>	2024-06-05 15:04:55 +08:00
Shaojun Liu	dc4fea7e3f	always cleanup conda env after build (#11211 )	2024-06-05 13:46:30 +08:00
Shaojun Liu	1f2057b16a	Fix ipex-llm-cpu docker image (#11213 ) * fix * fix ipex-llm-cpu image	2024-06-05 11:13:17 +08:00
Xin Qiu	566691c5a3	quantized attention forward for minicpm (#11200 ) * quantized minicpm * fix style check	2024-06-05 09:15:25 +08:00
Jiao Wang	bb83bc23fd	Fix Starcoder issue on CPU on transformers 4.36+ (#11190 ) * fix starcoder for sdpa * update * style	2024-06-04 10:05:40 -07:00
Kai Huang	f93664147c	Update config.yaml (#11208 ) * update config.yaml * fix * minor * style	2024-06-04 19:58:18 +08:00
Xiangyu Tian	ac3d53ff5d	LLM: Fix vLLM CPU version error (#11206 ) Fix vLLM CPU version error	2024-06-04 19:10:23 +08:00
Guancheng Fu	3ef4aa98d1	Refine vllm_quickstart doc (#11199 ) * refine doc * refine	2024-06-04 18:46:27 +08:00
Shaojun Liu	744042d1b2	remove software-properties-common from Dockerfile (#11203 )	2024-06-04 17:37:42 +08:00
Ruonan Wang	1dde204775	update q6k (#11205 )	2024-06-04 17:14:33 +08:00
Qiyuan Gong	ce3f08b25a	Fix IPEX auto importer (#11192 ) * Fix ipex auto importer with Python builtins. * Raise errors if the user imports ipex manually before importing ipex_llm. Do nothing if they import ipex after importing ipex_llm. * Remove import ipex in examples.	2024-06-04 16:57:18 +08:00
Yina Chen	711fa0199e	Fix fp6k phi3 ppl core dump (#11204 )	2024-06-04 16:44:27 +08:00
Xiangyu Tian	f02f097002	Fix vLLM version in CPU/vLLM-Serving example README (#11201 )	2024-06-04 15:56:55 +08:00
Yishuo Wang	6454655dcc	use sdp in baichuan2 13b (#11198 )	2024-06-04 15:39:00 +08:00
Yuwen Hu	9f8074c653	Add extra warmup for chatglm3-6b in igpu-performance test (#11197 ) * Add extra warmup for chatglm3-6b to record more stable performance (int4+fp32) * Small updates	2024-06-04 14:06:09 +08:00
Yishuo Wang	d90cd977d0	refactor stablelm (#11195 )	2024-06-04 13:14:43 +08:00
Zijie Li	a644e9409b	Miniconda/Anaconda -> Miniforge update in examples (#11194 ) * Change installation address Change former address: "https://docs.conda.io/en/latest/miniconda.html#" to new address: "https://conda-forge.org/download/" for 63 occurrences under python\llm\example * Change Prompt Change "Anaconda Prompt" to "Miniforge Prompt" for 1 occurrence	2024-06-04 10:14:02 +08:00
Xin Qiu	5f13700c9f	optimize Minicpm (#11189 ) * minicpm optimize * update	2024-06-03 18:28:29 +08:00
Xiangyu Tian	ff83fad400	Fix typo in vLLM CPU docker guide (#11188 )	2024-06-03 15:55:27 +08:00
Qiyuan Gong	15a6205790	Fix LoRA tokenizer for Llama and chatglm (#11186 ) * Set pad_token to eos_token if it's None. Otherwise, use model config.	2024-06-03 15:35:38 +08:00
Cengguang Zhang	3eb13ccd8c	LLM: fix input length condition in deepspeed all-in-one benchmark. (#11185 )	2024-06-03 10:05:43 +08:00
Shaojun Liu	401013a630	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178 ) * remove chatglm_C.*.pyd to solve ngsolve weak copyright vuln fix style check error * remove chatglm native int4 from langchain	2024-05-31 17:03:11 +08:00
Ruonan Wang	50b5f4476f	update q4k convert (#11179 )	2024-05-31 11:36:53 +08:00
Yuwen Hu	f0aaa130a9	Update miniconda/anaconda -> miniforge in documentation (#11176 ) * Update miniconda/anaconda -> miniforge in installation guide * Update for all Quickstart * further fix for docs	2024-05-30 17:40:18 +08:00
Wang, Jian4	c0f1be6aea	Fix pp logic (#11175 ) * only send no none batch and rank1-n sending first * always send first	2024-05-30 16:40:59 +08:00
ZehuaCao	4127b99ed6	Fix null pointer dereferences error. (#11125 ) * delete unused function on tgi_server * update * update * fix style	2024-05-30 16:16:10 +08:00
Guancheng Fu	50ee004ac7	Fix vllm condition (#11169 ) * add use-vllm * done * fix style * fix done	2024-05-30 15:23:17 +08:00
Jin Qiao	dcbf4d3d0a	Add phi-3-vision example (#11156 ) * Add phi-3-vision example (HF-Automodels) * fix * fix * fix * Add phi-3-vision CPU example (HF-Automodels) * add in readme * fix * fix * fix * fix * use fp8 for gpu example * remove eval	2024-05-30 10:02:47 +08:00
Jiao Wang	93146b9433	Reconstruct Speculative Decoding example directory (#11136 ) * update * update * update	2024-05-29 13:15:27 -07:00
Xiangyu Tian	2299698b45	Refine Pipeline Parallel FastAPI example (#11168 )	2024-05-29 17:16:50 +08:00
Ruonan Wang	9bfbf78bf4	update api usage of xe_batch & fp16 (#11164 ) * update api usage * update setup.py	2024-05-29 15:15:14 +08:00
Yina Chen	e29e2f1c78	Support new fp8 e4m3 (#11158 )	2024-05-29 14:27:14 +08:00
Wang, Jian4	8e25de1126	LLM: Add codegeex2 example (#11143 ) * add codegeex example * update * update cpu * add GPU * add gpu * update readme	2024-05-29 10:00:26 +08:00
ZehuaCao	751e1a4e29	Fix concurrent issue in autoTP streaming. (#11150 ) * add benchmark test * update	2024-05-29 08:22:38 +08:00
Jason Dai	7cc43aa67a	Update readme (#11160 )	2024-05-28 21:16:36 +08:00

3123 commits