ipex-llm/python/llm/test/run-llm-inference-tests-gpu.sh
SONG Ge dfb00e37e9 [LLM] Add model correctness test on ARC for llama and falcon (#9347)
* add correctness test on arc for llama model

* modify layer name

* add falcon ut

* refactor and add ut for falcon model

* modify lambda positions and update docs

* replace loading pre input with last decodelayer output

* switch lower bound to single model instead of using the common one

* make the code implementation simple

* fix gpu action allocation memory issue
2023-11-10 13:48:57 +08:00


#!/bin/bash
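# ANALYTICS_ZOO_ROOT must point at the repository root; abort early if it is unset or empty.
export ANALYTICS_ZOO_ROOT="${ANALYTICS_ZOO_ROOT:?ANALYTICS_ZOO_ROOT must be set}"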
export LLM_HOME=${ANALYTICS_ZOO_ROOT}/python/llm/src
export LLM_INFERENCE_TEST_DIR=${ANALYTICS_ZOO_ROOT}/python/llm/test/inference_gpu
export USE_XETLA=OFF
export DEVICE='xpu'
set -e
echo "# Start testing inference"
start=$(date "+%s")
# Default to 2 threads when THREAD_NUM is unset or empty.
if [ -z "$THREAD_NUM" ]; then
  THREAD_NUM=2
fi
export OMP_NUM_THREADS=$THREAD_NUM
pytest "${LLM_INFERENCE_TEST_DIR}" -v -s
now=$(date "+%s")
elapsed=$((now-start))
echo "BigDL-LLM GPU tests finished"
echo "Time used: $elapsed seconds"
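
The thread-count default and the timing logic in the script above could also be factored into small helpers. The following is a minimal sketch, not part of the original script: `resolve_thread_num` and `elapsed_since` are hypothetical names introduced here for illustration, and the sketch assumes only a POSIX-compatible shell and the `date` utility.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): print the thread count
# to use, falling back to 2 when THREAD_NUM is unset or empty.
resolve_thread_num() {
  echo "${THREAD_NUM:-2}"
}

# Hypothetical helper: print the whole seconds elapsed since the
# epoch timestamp passed as the first argument.
elapsed_since() {
  echo "$(( $(date "+%s") - $1 ))"
}

export OMP_NUM_THREADS="$(resolve_thread_num)"
start=$(date "+%s")
# ... the pytest invocation would run here ...
echo "Time used: $(elapsed_since "$start") seconds"
```

The `${THREAD_NUM:-2}` expansion covers both the unset and the empty case, which matches the `[ -z "$THREAD_NUM" ]` check in the script.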