Fix Baichuan2 prompt format (#10334)
* Fix Baichuan2 prompt format
* Fix Baichuan2 README
* Change Baichuan2 prompt info
This commit is contained in:
parent
0451103a43
commit
74e7490fda
6 changed files with 22 additions and 29 deletions
@@ -54,15 +54,15 @@ numactl -C 0-47 -m 0 python ./generate.py
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
-<human>AI是什么? <bot>
+<reserved_106> AI是什么? <reserved_107>
 -------------------- Output --------------------
-<human>AI是什么? <bot>人工智能(AI)是指由计算机系统或其他数字设备模拟、扩展和增强人类智能的科学和技术。它涉及到多个领域,如机器学习、计算机视觉、
+<reserved_106> AI是什么? <reserved_107> 人工智能(AI)是指由计算机系统执行的任务,这些任务通常需要人类智能才能完成。AI的目标是使计算机能够模拟人类的思维过程,从而
 ```

 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
-<human>解释一下“温故而知新” <bot>
+<reserved_106> 解释一下“温故而知新” <reserved_107>
 -------------------- Output --------------------
-<human>解释一下“温故而知新” <bot>这句话出自《论语·为政》篇,意思是通过回顾过去的事情来获取新的理解和认识。简单来说就是:温习学过的知识,可以从中
+<reserved_106> 解释一下“温故而知新” <reserved_107> 温故而知新是一个成语,出自《论语·为政》篇。这个成语的意思是:通过回顾和了解过去的事情,可以更好地理解新的知识和
 ```
@@ -22,8 +22,10 @@ import numpy as np
 from bigdl.llm.transformers import AutoModelForCausalLM
 from transformers import AutoTokenizer

-# you could tune the prompt based on your own model,
-BAICHUAN_PROMPT_FORMAT = "<human>{prompt} <bot>"
+# prompt format referred from https://github.com/baichuan-inc/Baichuan2/issues/227
+# and https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat/blob/main/generation_utils.py#L7-L49
+# For English prompt, you are recommended to change the prompt format.
+BAICHUAN_PROMPT_FORMAT = "<reserved_106> {prompt} <reserved_107>"

 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Baichuan model')
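The template introduced above is an ordinary `str.format` pattern. A minimal sketch of how the example scripts are expected to build the final prompt from it (the `user_input` value here is made up purely for illustration):

```python
# Prompt template from this commit: <reserved_106> marks the user turn,
# <reserved_107> marks where the assistant's reply should begin.
BAICHUAN_PROMPT_FORMAT = "<reserved_106> {prompt} <reserved_107>"

# Hypothetical user input, for illustration only.
user_input = "What is AI?"
prompt = BAICHUAN_PROMPT_FORMAT.format(prompt=user_input)
print(prompt)  # → <reserved_106> What is AI? <reserved_107>
```

The formatted string is then tokenized and passed to `generate()` as in the unchanged parts of the example.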
@@ -109,18 +109,10 @@ Arguments info:

 #### Sample Output
 #### [baichuan-inc/Baichuan2-7B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat)
-```log
--------------------- Prompt --------------------
-<human>AI是什么? <bot>
--------------------- Output --------------------
-<human>AI是什么? <bot>
-AI是人工智能(Artificial Intelligence)的缩写,它是指让计算机或机器模拟、扩展和辅助人类的智能。AI技术已经广泛应用于各个领域
-```
-
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
-<human>What is AI? <bot>
+<reserved_106> AI是什么? <reserved_107>
 -------------------- Output --------------------
-<human>What is AI? <bot>Artificial Intelligence (AI) refers to the development of computer systems that can perform tasks that would typically require human intelligence. These tasks include learning, reasoning, problem
+<reserved_106> AI是什么? <reserved_107>AI是人工智能(Artificial Intelligence)的缩写,它是指让计算机或其他设备模拟人类智能的技术。通过使用大量数据和算法,AI可以学习、
 ```
@@ -21,8 +21,10 @@ import argparse
 from bigdl.llm.transformers import AutoModelForCausalLM
 from transformers import AutoTokenizer

-# you could tune the prompt based on your own model,
-BAICHUAN_PROMPT_FORMAT = "<human>{prompt} <bot>"
+# prompt format referred from https://github.com/baichuan-inc/Baichuan2/issues/227
+# and https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat/blob/main/generation_utils.py#L7-L49
+# For English prompt, you are recommended to change the prompt format.
+BAICHUAN_PROMPT_FORMAT = "<reserved_106> {prompt} <reserved_107>"

 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Baichuan model')
@@ -114,13 +114,8 @@ In the example, several arguments can be passed to satisfy your requirements:

 #### [baichuan-inc/Baichuan2-7B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat)
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
+<reserved_106> AI是什么? <reserved_107>
 -------------------- Output --------------------
-<human>AI是什么? <bot>
-AI是人工智能(Artificial Intelligence)的缩写,它是指让计算机或机器模拟、扩展和辅助人类的智能。AI技术已经广泛应用于各个领域
-```
-
-```log
-Inference time: xxxx s
--------------------- Output --------------------
-<human>What is AI? <bot>Artificial Intelligence (AI) refers to the development of computer systems that can perform tasks that would typically require human intelligence. These tasks include learning, reasoning, problem
+<reserved_106> AI是什么? <reserved_107>AI是人工智能(Artificial Intelligence)的缩写,它是指让计算机或其他设备模拟人类智能的技术。通过使用大量数据和算法,AI可以学习、
 ```
@@ -21,8 +21,10 @@ import argparse
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from bigdl.llm import optimize_model

-# you could tune the prompt based on your own model,
-BAICHUAN2_PROMPT_FORMAT = "<human>{prompt} <bot>"
+# prompt format referred from https://github.com/baichuan-inc/Baichuan2/issues/227
+# and https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat/blob/main/generation_utils.py#L7-L49
+# For English prompt, you are recommended to change the prompt format.
+BAICHUAN_PROMPT_FORMAT = "<reserved_106> {prompt} <reserved_107>"

 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Baichuan2 model')
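The `<reserved_106>`/`<reserved_107>` markers mirror what the linked `generation_utils.py` does with special token ids when building chat input. A rough sketch of that turn-prefixing scheme, assuming user/assistant token ids 195 and 196 (taken from the Baichuan2-7B-Chat repository, not from this diff) and a toy encoder standing in for the real tokenizer:

```python
# Assumed ids: 195 decodes to <reserved_106> (user turn),
# 196 decodes to <reserved_107> (assistant turn) in Baichuan2-7B-Chat.
USER_TOKEN_ID = 195
ASSISTANT_TOKEN_ID = 196

def build_chat_ids(encode, messages):
    """Flatten (role, text) messages into token ids, prefixing each turn
    with its role token, then append the assistant token so generation
    continues as the assistant."""
    ids = []
    for role, text in messages:
        ids.append(USER_TOKEN_ID if role == "user" else ASSISTANT_TOKEN_ID)
        ids.extend(encode(text))
    ids.append(ASSISTANT_TOKEN_ID)
    return ids

# Toy character-level encoder, for illustration only.
toy_encode = lambda s: [ord(c) for c in s]
print(build_chat_ids(toy_encode, [("user", "hi")]))  # → [195, 104, 105, 196]
```

Formatting the prompt as a plain string with `<reserved_106>`/`<reserved_107>` relies on the tokenizer mapping those literals back to the same special ids, which is why the examples in this commit can keep using a simple `str.format` template.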