* test support for BitsAndBytesConfig
* update style
* update cpu example
* update example
* update readme
* update unit test
* use bfloat16
* update logic
* use int4
* set default bnb_4bit_use_double_quant
* update
* update example
* update model.py
* update
* support lora example
* add py310 test for llm-unit-test.
* add py310 llm-unit-tests
* add llm-cpp-build-py310
* test
* test
* test.
* test
* test
* fix deactivate.
* fix
* fix.
* fix
* test
* test
* test
* add build chatglm for win.
* test.
* fix
* Add test script and workflow for qlora fine-tuning
* Test fix export model
* Download dataset
* Fix export model issue
* Reduce number of training steps
* Rename script
* Correction
* Add gpu workflow and a transformers API inference test
* Set device-specific env variables in script instead of workflow
* Fix status message
---------
Co-authored-by: sgwhat <ge.song@intel.com>
* add ut for mistral model
* update
* fix model path
* upgrade transformers version for mistral model
* refactor correctness ut for mistral model
* refactor mistral correctness ut
* revert test_optimize_model back
* remove mistral from test_optimize_model
* add step to revert transformers version back to 4.31.0
* add new API for optimize any pytorch models
* change test util name
* revise API and update UT
* fix python style
* update ut config, change default value
* change defaults, disable ut transcribe
* deprecate BigDLNativeTransformers and add specific LMEmbedding method
* deprecate and add LM methods for langchain llms
* add native params to native langchain
* new implementation for embedding
* move ut from bigdlnative to causal llm
* rename embeddings api and update examples to align with the usage changes
* docqa example hot-fix
* add more api docs
* add langchain ut for starcoder
* support model_kwargs for transformer methods when calling causalLM and add ut
* ut fix for transformers embedding
* update for langchain causal supporting transformers
* remove model_family in readme doc
* add model_families params to support more models
* update api docs and remove chatglm embeddings for now
* remove chatglm embeddings in examples
* new refactor for ut to add bloom and transformers llama ut
* disable llama transformers embedding ut
* fix download statement
* add check before build wheel
* use curl to upload files
* windows unittest won't upload converted model
* split llm-cli test into windows & linux versions
* update how tempdir is created
* fix nightly converted model name
* temporarily disable windows llm-cli starcoder test
* remove taskset dependency
* rename llm_unit_tests_linux to llm_unit_tests