* Add mamba cpu example
* Add mamba gpu example
* Use a smaller model as the example
* minor fixes
---------
Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
* Add cpu and gpu examples for DeciLM-7B
* Add cpu and gpu examples for DeciLM-7B
* Add DeciLM-7B to README table
* modify deciLM
* modify deciLM
* modify deciLM
* Add verified model in README
* Add cpu_embedding=True
* Add a new CPU example of Yuan2-2B-hf
* Add a new CPU generate.py of Yuan2-2B-hf example
* Add a new GPU example of Yuan2-2B-hf
* Add Yuan2 to README table
* In CPU example: 1. Use English as default prompt; 2. Provide modified files in yuan2-2B-instruct
* In GPU example: 1. Use English as default prompt; 2. Provide modified files
* GPU example:update README
* update Yuan2-2B-hf in README table
* Add CPU example for Yuan2-2B in Pytorch-Models
* Add GPU example for Yuan2-2B in Pytorch-Models
* Add license in generate.py; Modify README
* In GPU example, add license in generate.py; modify README
* In CPU yuan2 modify README
* In GPU yuan2 modify README
* In CPU yuan2 modify README
* In GPU example, updated the readme for Windows GPU support
* In GPU torch example, updated the readme for Windows GPU support
* GPU hf example README modified
* GPU example README modified
* Add CPU and GPU examples of phi-2
* In GPU hf example, updated the readme for Windows GPU support
* In GPU torch example, updated the readme for Windows GPU support
* update the table in BigDL/README.md
* update the table in BigDL/python/llm/README.md
* add Qwen1.5-7B-Chat
* modify Qwen1.5 example
* update README
* update prompt format
* update folder name and example README
* add Chinese prompt sample output
* update link in README
* correct the link
* update transformers version
* ziya on CPU
* add README for ziya
* specify use_cache
* add arc CPU
* update prompt format
* update link
* add comments to emphasize use_cache
* update pip cmd
* Add cpu int4 example for BlueLM
* add example optimize_model cpu for bluelm
* add example gpu int4 blueLM
* add example optimize_model GPU for bluelm
* Fixing naming issues and BigDL package version.
* Fixing naming issues...
* Add BlueLM in README.md "Verified Models"
* eee
* add examples on CPU and GPU
* fix
* fix
* optimize model examples
* add Qwen-VL-Chat CPU example
* Add Qwen-VL CPU example
* fix optimize problem
* fix error
* Have updated, benchmark fix removed from this PR
* add generate API example
* Change formats in qwen-vl example
* Add CPU transformer int4 example for qwen-vl
* fix repo-id problem and add Readme
* change picture url
* Remove unnecessary file
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
* add cpu and gpu examples of flan-t5
* address yuwen's comments
* Add explanation why we add modules to not convert
* Refine prompt and add a translation example
* Add an empty line at the end of files
* add examples of flan-t5 using optimize_model api
* address bin's comments
* address binbin's comments
* add flan-t5 in readme
* Add arc demo in main readme
* Small style fix
* Realize using table
* Update based on comments
* Small update
* Try to solve with height problem
* Small fix
* Update demo for inner llm readme
* Update demo video for readthedocs
* Small fix
* Update based on comments
* deprecate BigDLNativeTransformers and add specific LMEmbedding method
* deprecate and add LM methods for langchain llms
* add native params to native langchain
* new implementation for embedding
* move ut from bigdlnative to causal llm
* rename embeddings api and update examples to align with the usage changes
* docqa example hot-fix
* add more api docs
* add langchain ut for starcoder
* support model_kwargs for transformer methods when calling causalLM and add ut
* ut fix for transformers embedding
* update for langchain causal supporting transformers
* remove model_family in readme doc
* add model_families params to support more models
* update api docs and remove chatglm embeddings for now
* remove chatglm embeddings in examples
* new refactor for ut to add bloom and transformers llama ut
* disable llama transformers embedding ut