* add batch_size in stable version test
* add batch_size in excludes
* add excludes for batch_size
* fix ci
* triger regression test
* fix xpu version
* disable ci
* address kai's comment
---------
Co-authored-by: Ariadne <wyn2000330@126.com>
* add new models and change schedule to nightly
* correct syntax error
* modify env set up and job
* change label and schedule time
* change schedule time
* change label
* ensure the result of daily arc perf test
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* small fix
* concat more csvs
* small fix
* revert some files
* Change to install from pypi and have a check to make sure the installed bigdl-llm version is as expected
* Make sure result date is the same as tested bigdl-llm version
* Small fixes
* Small fix
* Small fixes
* Small fix
* Small fixes
* Small updates
* Update default xpu to ipex 2.1
* Update related install ut support correspondingly
* Add arc ut tests for both ipex 2.0 and 2.1
* Small fix
* Diable ipex 2.1 test for now as oneapi 2024.0 has not beed installed on the test machine
* Update document for default PyTorch 2.1
* Small fix
* Small fix
* Small doc fixes
* Small fixes
* test support bitsandbytesconfig
* update style
* update cpu example
* update example
* update readme
* update unit test
* use bfloat16
* update logic
* use int4
* set defalut bnb_4bit_use_double_quant
* update
* update example
* update model.py
* update
* support lora example
* add arc stable version regression test
* empty gpu mem between different models
* triger ci
* comment spr test
* triger ci
* address kai's comments and disable ci
* merge fp8 and int4
* disable ci
* Move 1024 test just after 32-32 test; and enable all model for 1024-128
* Make sure python output encoding in utf-8 so that redirect to txt can always be success
* Upload results to ftp
* Small fix
* Add test for 1024-128 and enable more tests for 512-64
* Fix date in results csv name to the time when the performance is triggered
* Small fix
* Small fix
* further fixes
* LLM: check csv and its corresponding yaml file
* run PR arc perf test
* modify the name of some variables
* execute the check results script in right place
* use cp to replace mv command
* resolve some comments
* resolve more comments
* revert the llm_performance_test.yaml file
* Change igpu win tests for ipex 2.1 and oneapi 2024.0
* Qwen model repo id updates; updates model list for 512-64
* Add .eval for win igpu all-in-one benchmark for best performance
* trigger pr temparorily
* Saparate benchmark run for win igpu based in in-out pairs
* Rename fix
* Test workflow
* Small fix
* Skip generation of html for now
* Change back to nightly triggered
* update bigdl_llm.py
* update the installation of harness
* fix partial function
* import ipex
* force seq len in decrease order
* put func outside class
* move comments
* default 'trust_remote_code' as True
* Update llm-harness-evaluation.yml
* Temp enable PR
* Enable tests for 256-64
* Try again 128-64
* Empty cache after each iteration for igpu benchmark scripts
* Try tests for 512
* change order for 512
* Skip chatglm3 and llama2 for now
* Separate tests for 512-64
* Small fix
* Further fixes
* Change back to nightly again
* Add support for win gpu benchmark with peak gpu memory monitoring
* Add win igpu tests
* Small fix
* Forward outputs
* Small fix
* Test and small fixes
* Small fix
* Small fix and test
* Small fixes
* Add tests for 512-64 and change back to nightly tests
* Small fix
* modify output_path as a directory
* schedule nightly at 21 on Friday
* add tasks and models for nightly
* add accuracy regression
* comment out if to test
* mixed fp4
* for test
* add missing delimiter
* remove comma
* fixed golden results
* add mixed 4 golden result
* add more options
* add mistral results
* get golden result of stable lm
* move nightly scripts and results to test folder
* add license
* add fp8 stable lm golden
* run on all available devices
* trigger only when ready for review
* fix new line
* update golden
* add mistral
* temporary stop other perf test
* Add framework for core performance test with one test model
* Small fix and add platform control
* Comment out lp for now
* Add missing ymal file
* Small fix
* Fix sed contents
* Small fix
* Small path fixes
* Small fix
* Add update to ftp
* Small upload fix
* add chatglm3-6b
* LLM: add model names
* Keep repo id same as ftp and temporary make baichuan2 first priority
* change order
* Remove temp if false and separate pr and nightly results
* Small fix
---------
Co-authored-by: jinbridge <2635480475@qq.com>
* add auto triggered acc test
* use llama 7b instead
* fix env
* debug download
* fix download prefix
* add cut dirs
* fix env of model path
* fix dataset download
* full job
* source xpu env vars
* use matrix to trigger model run
* reset batch=1
* remove redirect
* remove some trigger
* add task matrix
* add precision list
* test llama-7b-chat
* use /mnt/disk1 to store model and datasets
* remove installation test
* correct downloading path
* fix HF vars
* add bigdl-llm env vars
* rename file
* fix hf_home
* fix script path
* rename as harness evalution
* rerun
* add more models and skip runtime error
* upgrade transformers
* temporarily removed Mistral-7B-v0.1
* temporarily disable the upload of arc perf result
* add py310 test for llm-unit-test.
* add py310 llm-unit-tests
* add llm-cpp-build-py310
* test
* test
* test.
* test
* test
* fix deactivate.
* fix
* fix.
* fix
* test
* test
* test
* add build chatglm for win.
* test.
* fix
* Add test script and workflow for qlora fine-tuning
* Test fix export model
* Download dataset
* Fix export model issue
* Reduce number of training steps
* Rename script
* Correction
* Add gpu workflow and a transformers API inference test
* Set device-specific env variables in script instead of workflow
* Fix status message
---------
Co-authored-by: sgwhat <ge.song@intel.com>
* add ut for mistral model
* update
* fix model path
* upgrade transformers version for mistral model
* refactor correctness ut for mustral model
* refactor mistral correctness ut
* revert test_optimize_model back
* remove mistral from test_optimize_model
* add to revert transformers version back to 4.31.0
* add new API for optimize any pytorch models
* change test util name
* revise API and update UT
* fix python style
* update ut config, change default value
* change defaults, disable ut transcribe
* deprecate BigDLNativeTransformers and add specific LMEmbedding method
* deprecate and add LM methods for langchain llms
* add native params to native langchain
* new imple for embedding
* move ut from bigdlnative to casual llm
* rename embeddings api and examples update align with usage updating
* docqa example hot-fix
* add more api docs
* add langchain ut for starcoder
* support model_kwargs for transformer methods when calling causalLM and add ut
* ut fix for transformers embedding
* update for langchain causal supporting transformers
* remove model_family in readme doc
* add model_families params to support more models
* update api docs and remove chatglm embeddings for now
* remove chatglm embeddings in examples
* new refactor for ut to add bloom and transformers llama ut
* disable llama transformers embedding ut
* fix download statement
* add check before build wheel
* use curl to upload files
* windows unittest won't upload converted model
* split llm-cli test into windows & linux versions
* update tempdir create way
* fix nightly converted model name
* windows llm-cli starcoder test temply disabled
* remove taskset dependency
* rename llm_unit_tests_linux to llm_unit_tests
* add chatglm build
* add llm-cli support
* update git
* install cmake
* add ut for chatglm
* add files to setup
* fix bug cause permission error when sf lack file
* Add current inference uts to nightly tests
* Change test model from chatglm-6b to chatglm2-6b
* Add thread num env variable for nightly test
* Fix urls
* Small fix