# BigDL-LLM Examples on Intel CPU
This folder contains examples of running BigDL-LLM on Intel CPU:
- [HF-Transformers-AutoModels](HF-Transformers-AutoModels): running any Hugging Face Transformers model on BigDL-LLM (using the standard AutoModel APIs; see the first sketch after this list)
- [PyTorch-Models](PyTorch-Models): running any PyTorch model on BigDL-LLM (with a "one-line code change"; see the second sketch after this list)
- [Native-Models](Native-Models): converting & running LLMs in the `llama`/`chatglm`/`bloom`/`gptneox`/`starcoder` model families using the native (cpp) implementation
- [LangChain](LangChain): running LangChain applications on BigDL-LLM
- [Applications](Applications): running Transformers applications on BigDL-LLM
- [QLoRA-FineTuning](QLoRA-FineTuning): running QLoRA fine-tuning using BigDL-LLM on Intel CPUs
- [vLLM-Serving](vLLM-Serving): running the vLLM serving framework on Intel® Xeon® platforms (with BigDL-LLM low-bit optimized models)
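
As a quick illustration of the AutoModel usage mentioned in the first item above, here is a minimal sketch. It assumes `bigdl-llm` is installed (e.g. via `pip install bigdl-llm[all]`); the model ID is purely illustrative:

```python
# Minimal sketch: BigDL-LLM's drop-in replacement for the Transformers AutoModel API
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # illustrative; any HF Transformers model

# Same interface as transformers.AutoModelForCausalLM, but loads the model
# with BigDL-LLM low-bit (INT4) optimizations applied
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("What is AI?", return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```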
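
And here is a sketch of the "one-line code change" for a PyTorch model mentioned in the second item. Again, the model ID is only an example; the point is that the model is loaded in the usual way and then wrapped by `optimize_model`:

```python
# Sketch: load a PyTorch model as usual, then apply the BigDL-LLM "one-line" change
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from bigdl.llm import optimize_model  # BigDL-LLM optimization entry point

model_path = "meta-llama/Llama-2-7b-chat-hf"  # illustrative

model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float32)
model = optimize_model(model)  # the one-line change: apply low-bit optimizations

tokenizer = AutoTokenizer.from_pretrained(model_path)
inputs = tokenizer("What is AI?", return_tensors="pt")
with torch.inference_mode():
    output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
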
## System Support
**Hardware**:
- Intel® Core™ processors
- Intel® Xeon® processors
**Operating System**:
- Ubuntu 20.04 or later
- CentOS 7 or later
- Windows 10/11, with or without WSL