Commit graph

80 commits

Author SHA1 Message Date
Jason Dai
e5b384aaa2 Update README.md (#8437) 2023-07-03 10:54:29 +08:00
Yang Wang
449aea7ffc Optimize transformer int4 loading memory (#8400)
* Optimize transformer int4 loading memory

* move cast to convert

* default setting low_cpu_mem_usage
2023-06-30 20:12:12 -07:00
Jason Dai
2da21163f8 Update llm README.md (#8431) 2023-06-30 19:41:17 +08:00
Junwei Deng
2fd751de7a LLM: add a dev tool for getting glibc/glibcxx requirement (#8399)
* add a dev tool

* pep8 change
2023-06-30 11:09:50 +08:00
binbin Deng
146662bc0d LLM: fix langchain windows failure (#8417) 2023-06-30 09:59:10 +08:00
Yina Chen
6251ad8934 [LLM] Windows unittest (#8356)
* win-unittest

* update

* update

* try llama 7b

* delete llama

* update

* add red-3b

* only test red-3b

* revert

* add langchain

* add dependency

* delete langchain
2023-06-29 14:03:12 +08:00
Yina Chen
783aea3309 [LLM] LLM windows daily test (#8328)
* llm-win-init

* test action

* test

* add types

* update for schtasks

* update pytests

* update

* update

* update doc

* use stable ckpt from ftp instead of the converted model

* download using batch -> manually

* add starcoder test
2023-06-28 15:02:11 +08:00
binbin Deng
ca5a4b6e3a LLM: update bloom and starcoder usage in transformers_int4_pipeline (#8406) 2023-06-28 13:15:50 +08:00
Zhao Changmin
cc76ec809a check out dir (#8395) 2023-06-27 21:28:39 +08:00
Ruonan Wang
4be784a49d LLM: add UT for starcoder (convert, inference) update examples and readme (#8379)
* first commit to add path

* update example and readme

* update path

* fix

* update based on comment
2023-06-27 12:12:11 +08:00
Xin Qiu
e68d631c0a gptq2ggml: support loading safetensors model. (#8401)
* update convert gptq to ggml

* update convert gptq to ggml

* gptq to ggml

* update script

* meet code review

* meet code review
2023-06-27 11:19:33 +08:00
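The commit above extends the GPTQ-to-GGML conversion script to read weights stored as safetensors in addition to PyTorch checkpoints. A minimal sketch of what such loading can look like, assuming illustrative file names and fallback logic (not the script's actual code):

```python
import os

import torch
from safetensors.torch import load_file  # requires the safetensors package


def load_gptq_state_dict(model_dir):
    """Load GPTQ weights from a safetensors file if present, else a .pt file."""
    st_path = os.path.join(model_dir, "model.safetensors")  # assumed file name
    pt_path = os.path.join(model_dir, "model.pt")            # assumed file name
    if os.path.exists(st_path):
        return load_file(st_path)  # returns a dict of tensors on CPU
    return torch.load(pt_path, map_location="cpu")
```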
Ruonan Wang
b9eae23c79 LLM: add chatglm-6b example for transformer_int4 usage (#8392)
* add example for chatglm-6b

* fix
2023-06-26 13:46:43 +08:00
binbin Deng
19e19efb4c LLM: raise warning instead of error when using unsupported parameters (#8382) 2023-06-26 13:23:55 +08:00
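The warning-instead-of-error behavior above follows a common pattern; a generic sketch (not the repository's actual code, and the supported set below is illustrative):

```python
import warnings

SUPPORTED_KWARGS = {"max_tokens", "temperature", "top_p"}  # illustrative set


def check_generation_kwargs(**kwargs):
    """Warn, rather than raise, when a caller passes unsupported parameters."""
    for name in kwargs:
        if name not in SUPPORTED_KWARGS:
            warnings.warn(f"parameter '{name}' is not supported and will be ignored")
```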
Shengsheng Huang
c113ecb929 [LLM] langchain bloom, UT's, default parameters (#8357)
* update langchain default parameters to align w/ api

* add ut's for llm and embeddings

* update inference test script to install langchain deps

* update tests workflows

---------

Co-authored-by: leonardozcm <changmin.zhao@intel.com>
2023-06-25 17:38:00 +08:00
Shengsheng Huang
446175cc05 transformer api refactor (#8389)
* transformer api refactor

* fix style

* add huggingface tokenizer usage in example and make the ggml tokenizer option 1 and the huggingface tokenizer option 2

* fix style
2023-06-25 17:15:33 +08:00
Yang Wang
ce6d06eb0a Support directly quantizing huggingface transformers into 4bit format (#8371)
* Support directly quantizing huggingface transformers into 4bit format

* refine example

* license

* fix bias

* address comments

* move to ggml transformers

* fix example

* fix style

* fix style

* address comments

* rename

* change API

* fix style

* add lm head to conversion

* address comments
2023-06-25 16:35:06 +08:00
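The commit above introduces direct 4-bit quantization of Hugging Face transformers through a transformers-like API. A hedged sketch of the resulting usage, assuming the bigdl.llm.transformers module path and the load_in_4bit argument as later documented (both should be checked against the code at this revision):

```python
from transformers import AutoTokenizer

# Assumed import path; the class mirrors transformers.AutoModelForCausalLM.
from bigdl.llm.transformers import AutoModelForCausalLM

model_path = "decapoda-research/llama-7b-hf"  # any Hugging Face causal LM

# Weights are quantized to 4 bit on the fly while loading.
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("Once upon a time", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```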
binbin Deng
03c5fb71a8 LLM: fix ModuleNotFoundError when using llm-cli (#8378) 2023-06-21 15:03:14 +08:00
Ruonan Wang
7296453f07 LLM: support starcoder in llm-cli (#8377)
* support starcoder in cli

* small fix
2023-06-21 14:38:30 +08:00
Ruonan Wang
50af0251e4 LLM: First commit of StarCoder pybinding (#8354)
* first commit of starcoder

* update setup.py and fix style

* add starcoder_cpp, fix style

* fix style

* support windows binary

* update pybinding

* fix style, add avx2 binary

* small fix

* fix style
2023-06-21 13:23:06 +08:00
Yuwen Hu
a7d66b7342 [LLM] README revise for llm_convert (#8374)
* Small readme revise for llm_convert

* Small fix
2023-06-21 10:04:34 +08:00
Yuwen Hu
7ef1c890eb [LLM] Supports GPTQ convert in transformers-like API, and supports folder outfile for llm-convert (#8366)
* Add docstrings to llm_convert

* Small docstrings fix

* Unify outfile type to be a folder path for either gptq or pth model_format

* Supports gptq model input for from_pretrained

* Fix example and readme

* Small fix

* Python style fix

* Bug fix in llm_convert

* Python style check

* Fix based on comments

* Small fix
2023-06-20 17:42:38 +08:00
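Per the commit above, llm-convert (and its Python entry point) accepts a GPTQ checkpoint and treats the outfile as a folder for both pth and gptq inputs. A sketch under those assumptions; the keyword names below are taken from the commit text and may not match the exact signature:

```python
from bigdl.llm import llm_convert  # assumed import path

# Convert a 4-bit GPTQ checkpoint into a GGML-format binary.
ggml_path = llm_convert(
    model="/path/to/gptq-model/",    # folder holding the GPTQ checkpoint
    outfile="/path/to/output-dir/",  # now a folder for both gptq and pth inputs
    model_family="llama",
    model_format="gptq",
)
print(ggml_path)
```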
Zhao Changmin
4ec46afa4f LLM: Align converting GPTQ model API with transformer style (#8365)
* LLM: Align GPTQ API with transformer style
2023-06-20 14:27:41 +08:00
Ruonan Wang
f99d348954 LLM: convert and quantize support for StarCoder (#8359)
* basic support for starcoder

* update from_pretrained

* fix bug and fix style
2023-06-20 13:39:35 +08:00
binbin Deng
5f4f399ca7 LLM: fix bugs during supporting bloom in langchain (#8362) 2023-06-20 13:30:37 +08:00
Zhao Changmin
30ac9a70f5 LLM: fix expected 2 blank lines (#8360) 2023-06-19 18:10:02 +08:00
Zhao Changmin
c256cd136b LLM: Fix ggml return value (#8358)
* ggml return original value
2023-06-19 17:02:56 +08:00
Zhao Changmin
d4027d7164 fix typos in llm_convert (#8355) 2023-06-19 16:17:21 +08:00
Zhao Changmin
4d177ca0a1 LLM: Merge convert pth/gptq model script into one shell script (#8348)
* convert model in one

* model type

* license

* readme and pep8

* ut path

* rename

* readme

* fix docs

* without lines
2023-06-19 11:50:05 +08:00
binbin Deng
ab1a833990 LLM: add basic uts related to inference (#8346) 2023-06-19 10:25:51 +08:00
Yuwen Hu
1aa33d35d5 [LLM] Refactor LLM Linux tests (#8349)
* Small name fix

* Add convert nightly tests, and for other llm tests, use stable ckpt

* Small fix and ftp fix

* Small fix

* Small fix
2023-06-16 15:22:48 +08:00
Ruonan Wang
9daf543e2f LLM: Update convert of gptneox to sync with new libgptneox.so (#8345) 2023-06-15 16:28:50 +08:00
Ruonan Wang
9fda7e34f1 LLM: fix version control (#8342) 2023-06-15 15:18:50 +08:00
Ruonan Wang
f7f4e65788 LLM: support int8 and tmp_path for from_pretrained (#8338) 2023-06-15 14:48:21 +08:00
Yuwen Hu
b30aa49c4e [LLM] Add Actions for downloading & converting models (#8320)
* First push for downloading and converting llm models for testing (Gondolin runner, avx2 for now)

* Change yml file name
2023-06-15 13:43:47 +08:00
Ruonan Wang
8840dadd86 LLM: binary file version control on SourceForge (#8329)
* support version control for llm based on date

* update action
2023-06-15 09:53:27 +08:00
Ruonan Wang
5094970175 LLM: update convert_model to support int8 (#8326)
* update example and convert_model for int8

* reset example

* fix style
2023-06-15 09:25:07 +08:00
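A sketch of the int8 conversion path that this commit adds; the module path and argument names here are assumptions inferred from the commit title, not verified code:

```python
from bigdl.llm.ggml import convert_model  # assumed import path

convert_model(
    input_path="/path/to/llama-7b-hf/",  # original Hugging Face checkpoint
    output_path="/path/to/output-dir/",
    model_family="llama",
    dtype="int8",                        # previously only int4 conversion
)
```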
binbin Deng
f64e703083 LLM: first add _tokenize, detokenize and _generate for bloom pybinding (#8316) 2023-06-14 17:29:57 +08:00
Xin Qiu
5576679a92 add convert-gptq-to-ggml.py to bigdl-llama (#8298) 2023-06-14 14:51:51 +08:00
Ruonan Wang
a6c4b733cb LLM: Update subprocess to show error message (#8323)
* update subprocess

* fix style
2023-06-13 16:43:37 +08:00
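Surfacing a subprocess's error message usually means capturing stderr and re-raising; a generic standard-library sketch of the pattern (the quantize command line is illustrative only):

```python
import subprocess

result = subprocess.run(
    ["quantize", "model.bin", "model-q4_0.bin", "q4_0"],  # illustrative command
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    # Include the captured stderr so the caller sees why the tool failed.
    raise RuntimeError(f"quantize failed:\n{result.stderr}")
```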
Shengsheng Huang
02c583144c [LLM] langchain integrations and examples (#8256)
* langchain integrations and examples

* add licences and rename

* add licences

* fix license issues and change backbone to model_family

* update examples to use model_family param

* fix linting

* fix code style

* exclude langchain integration from stylecheck

* update langchain examples and update integrations based on latest changes

* update simple llama-cpp-python style API example

* remove bloom in README

* change default n_threads to 2 and remove redundant code

---------

Co-authored-by: leonardozcm <changmin.zhao@intel.com>
2023-06-12 19:22:07 +08:00
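The LangChain integration above exposes the native models through a model_family parameter and a llama-cpp-python style API. An illustrative sketch only: the wrapper class name and its keyword arguments are assumptions drawn from the commit text, not the integration's verified interface:

```python
from langchain import LLMChain, PromptTemplate

from bigdl.llm.langchain.llms import BigdlLLM  # assumed class name and module path

llm = BigdlLLM(
    model_path="/path/to/ggml-model-q4_0.bin",
    model_family="llama",  # selects the native backend per the commit text
    n_threads=2,           # the commit sets this default to 2
)

prompt = PromptTemplate(
    input_variables=["question"],
    template="Q: {question}\nA:",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("What is BigDL?"))
```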
Yuwen Hu
f83c48280f [LLM] Unify transformers-like API example for 3 different model families (#8315)
* Refactor bigdl-llm transformers-like API to unify them

* Small fix
2023-06-12 17:20:30 +08:00
xingyuan li
c4028d507c [LLM] Add unified default value for cli programs (#8310)
* add unified default value for threads and n_predict
2023-06-12 16:30:27 +08:00
Junwei Deng
f41995051b LLM: add new readme as first version document (#8296)
* add new readme

* revise

* revise

* change readme

* add python req
2023-06-09 15:52:02 +08:00
Yuwen Hu
c619315131 [LLM] Add examples for gptneox, llama, and bloom family model using transformers-like API (#8286)
* First push of bigdl-llm example for gptneox model family

* Add some args and other small updates

* Small updates

* Add example for llama family models

* Small fix

* Small fix

* Update for batch_decode api and change default model for llama example

* Small fix

* Small fix

* Small fix

* Small model family name fix and add example for bloom

* Small fix

* Small default prompt fix

* Small fix

* Change default prompt

* Add sample output for inference

* Hide example inference time
2023-06-09 15:48:22 +08:00
binbin Deng
5d5da7b2c7 LLM: optimize namespace and remove unused import logic (#8302) 2023-06-09 15:17:49 +08:00
Ruonan Wang
5d0e130605 LLM: fix convert path error of gptneox and bloom on windows (#8304) 2023-06-09 10:10:19 +08:00
Yina Chen
7bfa0fcdf9 fix style (#8300) 2023-06-08 16:52:17 +08:00
Yina Chen
637b72f2ad [LLM] llm transformers api support batch actions (#8288)
* llm transformers api support batch actions

* align with transformer

* meet comment
2023-06-08 15:10:08 +08:00
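The batch actions referred to above mirror the Hugging Face batch_decode convention on the native (ggml) transformers-like API. A rough sketch under assumed names; the import path, model_family keyword, and the tokenize/generate/batch_decode methods are inferred from surrounding commits rather than verified:

```python
from bigdl.llm.ggml.transformers import AutoModelForCausalLM  # assumed path

model = AutoModelForCausalLM.from_pretrained(
    "/path/to/ggml-model-q4_0.bin",
    model_family="llama",  # assumed keyword
)

prompts = ["What is AI?", "Write a haiku about spring."]
token_batches = [model.tokenize(p) for p in prompts]                      # assumed method
outputs = [model.generate(t, max_new_tokens=32) for t in token_batches]   # assumed method
print(model.batch_decode(outputs))                                        # per this commit
```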
xingyuan li
ea3cf6783e LLM: Command line wrapper for llama/bloom/gptneox (#8239)
* add llama/bloom/gptneox wrapper
* add readme
* upload binary main file
2023-06-08 14:55:22 +08:00
binbin Deng
08bdfce2d8 LLM: avoid unnecessary torch import except in the converting process (#8297) 2023-06-08 14:24:58 +08:00