Jason Dai
e5b384aaa2
Update README.md ( #8437 )
2023-07-03 10:54:29 +08:00
Yang Wang
449aea7ffc
Optimize transformer int4 loading memory ( #8400 )
...
* Optimize transformer int4 loading memory
* move cast to convert
* default setting low_cpu_mem_usage
2023-06-30 20:12:12 -07:00
Jason Dai
2da21163f8
Update llm README.md ( #8431 )
2023-06-30 19:41:17 +08:00
Junwei Deng
2fd751de7a
LLM: add a dev tool for getting glibc/glibcxx requirement ( #8399 )
...
* add a dev tool
* pep8 change
2023-06-30 11:09:50 +08:00
binbin Deng
146662bc0d
LLM: fix langchain windows failure ( #8417 )
2023-06-30 09:59:10 +08:00
Yina Chen
6251ad8934
[LLM]Windows unittest ( #8356 )
...
* win-unittest
* update
* update
* try llama 7b
* delete llama
* update
* add red-3b
* only test red-3b
* revert
* add langchain
* add dependency
* delete langchain
2023-06-29 14:03:12 +08:00
Yina Chen
783aea3309
[LLM] LLM windows daily test ( #8328 )
...
* llm-win-init
* test action
* test
* add types
* update for schtasks
* update pytests
* update
* update
* update doc
* use stable ckpt from ftp instead of the converted model
* download using batch -> manually
* add starcoder test
2023-06-28 15:02:11 +08:00
binbin Deng
ca5a4b6e3a
LLM: update bloom and starcoder usage in transformers_int4_pipeline ( #8406 )
2023-06-28 13:15:50 +08:00
Zhao Changmin
cc76ec809a
check out dir ( #8395 )
2023-06-27 21:28:39 +08:00
Ruonan Wang
4be784a49d
LLM: add UT for starcoder (convert, inference) update examples and readme ( #8379 )
...
* first commit to add path
* update example and readme
* update path
* fix
* update based on comment
2023-06-27 12:12:11 +08:00
Xin Qiu
e68d631c0a
gptq2ggml: support loading safetensors model. ( #8401 )
...
* update convert gptq to ggml
* update convert gptq to ggml
* gptq to ggml
* update script
* meet code review
* meet code review
2023-06-27 11:19:33 +08:00
Ruonan Wang
b9eae23c79
LLM: add chatglm-6b example for transformer_int4 usage ( #8392 )
...
* add example for chatglm-6b
* fix
2023-06-26 13:46:43 +08:00
binbin Deng
19e19efb4c
LLM: raise warning instead of error when use unsupported parameters ( #8382 )
2023-06-26 13:23:55 +08:00
Shengsheng Huang
c113ecb929
[LLM] langchain bloom, UT's, default parameters ( #8357 )
...
* update langchain default parameters to align w/ api
* add ut's for llm and embeddings
* update inference test script to install langchain deps
* update tests workflows
---------
Co-authored-by: leonardozcm <changmin.zhao@intel.com>
2023-06-25 17:38:00 +08:00
Shengsheng Huang
446175cc05
transformer api refactor ( #8389 )
...
* transformer api refactor
* fix style
* add huggingface tokenizer usage in example and make ggml tokenizer as option 1 and huggingface tokenizer as option 2
* fix style
2023-06-25 17:15:33 +08:00
Yang Wang
ce6d06eb0a
Support directly quantizing huggingface transformers into 4bit format ( #8371 )
...
* Support directly quantizing huggingface transformers into 4bit format
* refine example
* license
* fix bias
* address comments
* move to ggml transformers
* fix example
* fix style
* fix style
* address comments
* rename
* change API
* fix style
* add lm head to conversion
* address comments
2023-06-25 16:35:06 +08:00
binbin Deng
03c5fb71a8
LLM: fix ModuleNotFoundError when use llm-cli ( #8378 )
2023-06-21 15:03:14 +08:00
Ruonan Wang
7296453f07
LLM: support starcoder in llm-cli ( #8377 )
...
* support starcoder in cli
* small fix
2023-06-21 14:38:30 +08:00
Ruonan Wang
50af0251e4
LLM: First commit of StarCoder pybinding ( #8354 )
...
* first commit of starcoder
* update setup.py and fix style
* add starcoder_cpp, fix style
* fix style
* support windows binary
* update pybinding
* fix style, add avx2 binary
* small fix
* fix style
2023-06-21 13:23:06 +08:00
Yuwen Hu
a7d66b7342
[LLM] README revise for llm_convert ( #8374 )
...
* Small readme revise for llm_convert
* Small fix
2023-06-21 10:04:34 +08:00
Yuwen Hu
7ef1c890eb
[LLM] Supports GPTQ convert in transformers-like API, and supports folder outfile for llm-convert ( #8366 )
...
* Add docstrings to llm_convert
* Small docstrings fix
* Unify outfile type to be a folder path for either gptq or pth model_format
* Supports gptq model input for from_pretrained
* Fix example and readme
* Small fix
* Python style fix
* Bug fix in llm_convert
* Python style check
* Fix based on comments
* Small fix
2023-06-20 17:42:38 +08:00
Zhao Changmin
4ec46afa4f
LLM: Align converting GPTQ model API with transformer style ( #8365 )
...
* LLM: Align GPTQ API with transformer style
2023-06-20 14:27:41 +08:00
Ruonan Wang
f99d348954
LLM: convert and quantize support for StarCoder ( #8359 )
...
* basic support for starcoder
* update from_pretrained
* fix bug and fix style
2023-06-20 13:39:35 +08:00
binbin Deng
5f4f399ca7
LLM: fix bugs during supporting bloom in langchain ( #8362 )
2023-06-20 13:30:37 +08:00
Zhao Changmin
30ac9a70f5
LLM: fix expected 2 blank lines ( #8360 )
2023-06-19 18:10:02 +08:00
Zhao Changmin
c256cd136b
LLM: Fix ggml return value ( #8358 )
...
* ggml return original value
2023-06-19 17:02:56 +08:00
Zhao Changmin
d4027d7164
fix typos in llm_convert ( #8355 )
2023-06-19 16:17:21 +08:00
Zhao Changmin
4d177ca0a1
LLM: Merge convert pth/gptq model script into one shell script ( #8348 )
...
* convert model in one
* model type
* license
* readme and pep8
* ut path
* rename
* readme
* fix docs
* without lines
2023-06-19 11:50:05 +08:00
binbin Deng
ab1a833990
LLM: add basic uts related to inference ( #8346 )
2023-06-19 10:25:51 +08:00
Yuwen Hu
1aa33d35d5
[LLM] Refactor LLM Linux tests ( #8349 )
...
* Small name fix
* Add convert nightly tests, and for other llm tests, use stable ckpt
* Small fix and ftp fix
* Small fix
* Small fix
2023-06-16 15:22:48 +08:00
Ruonan Wang
9daf543e2f
LLM: Update convert of gptneox to sync with new libgptneox.so ( #8345 )
2023-06-15 16:28:50 +08:00
Ruonan Wang
9fda7e34f1
LLM: fix version control ( #8342 )
2023-06-15 15:18:50 +08:00
Ruonan Wang
f7f4e65788
LLM: support int8 and tmp_path for from_pretrained ( #8338 )
2023-06-15 14:48:21 +08:00
Yuwen Hu
b30aa49c4e
[LLM] Add Actions for downloading & converting models ( #8320 )
...
* First push to downloading and converting llm models for testing (Gondolin runner, avx2 for now)
* Change yml file name
2023-06-15 13:43:47 +08:00
Ruonan Wang
8840dadd86
LLM: binary file version control on source forge ( #8329 )
...
* support version control for llm based on date
* update action
2023-06-15 09:53:27 +08:00
Ruonan Wang
5094970175
LLM: update convert_model to support int8 ( #8326 )
...
* update example and convert_model for int8
* reset example
* fix style
2023-06-15 09:25:07 +08:00
binbin Deng
f64e703083
LLM: first add _tokenize, detokenize and _generate for bloom pybinding ( #8316 )
2023-06-14 17:29:57 +08:00
Xin Qiu
5576679a92
add convert-gptq-to-ggml.py to bigdl-llama ( #8298 )
2023-06-14 14:51:51 +08:00
Ruonan Wang
a6c4b733cb
LLM: Update subprocess to show error message ( #8323 )
...
* update subprocess
* fix style
2023-06-13 16:43:37 +08:00
Shengsheng Huang
02c583144c
[LLM] langchain integrations and examples ( #8256 )
...
* langchain integrations and examples
* add licences and rename
* add licences
* fix license issues and change backbone to model_family
* update examples to use model_family param
* fix linting
* fix code style
* exclude langchain integration from stylecheck
* update langchain examples and update integrations based on latest changes
* update simple llama-cpp-python style API example
* remove bloom in README
* change default n_threads to 2 and remove redundant code
---------
Co-authored-by: leonardozcm <changmin.zhao@intel.com>
2023-06-12 19:22:07 +08:00
Yuwen Hu
f83c48280f
[LLM] Unify transformers-like API example for 3 different model families ( #8315 )
...
* Refactor bigdl-llm transformers-like API to unify them
* Small fix
2023-06-12 17:20:30 +08:00
xingyuan li
c4028d507c
[LLM] Add unified default value for cli programs ( #8310 )
...
* add unified default value for threads and n_predict
2023-06-12 16:30:27 +08:00
Junwei Deng
f41995051b
LLM: add new readme as first version document ( #8296 )
...
* add new readme
* revise
* revise
* change readme
* add python req
2023-06-09 15:52:02 +08:00
Yuwen Hu
c619315131
[LLM] Add examples for gptneox, llama, and bloom family model using transformers-like API ( #8286 )
...
* First push of bigdl-llm example for gptneox model family
* Add some args and other small updates
* Small updates
* Add example for llama family models
* Small fix
* Small fix
* Update for batch_decode api and change default model for llama example
* Small fix
* Small fix
* Small fix
* Small model family name fix and add example for bloom
* Small fix
* Small default prompt fix
* Small fix
* Change default prompt
* Add sample output for inference
* Hide example inference time
2023-06-09 15:48:22 +08:00
binbin Deng
5d5da7b2c7
LLM: optimize namespace and remove unused import logic ( #8302 )
2023-06-09 15:17:49 +08:00
Ruonan Wang
5d0e130605
LLM: fix convert path error of gptneox and bloom on windows ( #8304 )
2023-06-09 10:10:19 +08:00
Yina Chen
7bfa0fcdf9
fix style ( #8300 )
2023-06-08 16:52:17 +08:00
Yina Chen
637b72f2ad
[LLM] llm transformers api support batch actions ( #8288 )
...
* llm transformers api support batch actions
* align with transformer
* meet comment
2023-06-08 15:10:08 +08:00
xingyuan li
ea3cf6783e
LLM: Command line wrapper for llama/bloom/gptneox ( #8239 )
...
* add llama/bloom/gptneox wrapper
* add readme
* upload binary main file
2023-06-08 14:55:22 +08:00
binbin Deng
08bdfce2d8
LLM: avoid unnecessary import torch except converting process ( #8297 )
2023-06-08 14:24:58 +08:00