Add known issue in arc voice assistant example (#8902)

* add known issue in voice assistant example
* update cpu
2023-09-07 09:28:26 +08:00 · 2023-09-07 09:28:26 +08:00 · bfc71fbc15
commit bfc71fbc15
parent db26c7b84d
2 changed files with 90 additions and 0 deletions
--- a/python/llm/example/gpu/voiceassistant/README.md
+++ b/python/llm/example/gpu/voiceassistant/README.md
@@ -46,6 +46,51 @@ Arguments info:
 - `--whisper-repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Whisper model (e.g. `openai/whisper-small` and `openai/whisper-medium`) to be downloaded, or the path to the huggingface checkpoint folder. The default is `'openai/whisper-small'`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. The default is `32`.
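
As a rough illustration of the two arguments described above, the following is a minimal sketch of how such flags could be declared with `argparse`. The function name `build_parser` is hypothetical, and the actual `generate.py` may define its arguments differently; only the flag names and defaults are taken from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the real generate.py may declare these flags differently.
    parser = argparse.ArgumentParser(description="Voice assistant example (sketch)")
    parser.add_argument(
        "--whisper-repo-id-or-model-path",
        type=str,
        default="openai/whisper-small",
        help="Hugging Face repo id or local checkpoint folder for the Whisper model",
    )
    parser.add_argument(
        "--n-predict",
        type=int,
        default=32,
        help="Maximum number of tokens to predict",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--n-predict", "64"])
print(args.whisper_repo_id_or_model_path, args.n_predict)
```

Omitting a flag falls back to the default listed above.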
 #### Known Issues
 The `speech_recognition` library may occasionally skip recording when the input volume is low. As a workaround, save the recording in WAV format using `PyAudio` and read the file as input. Here is an example using PyAudio:
 ```python
 import wave
 import numpy as np
 import pyaudio
 import speech_recognition as sr
 CHUNK = 1024
 FORMAT = pyaudio.paInt16
 CHANNELS = 1                # The desired number of input channels
 RATE = 16000                # The desired rate (in Hz)
 RECORD_SECONDS = 10         # Recording time (in seconds)
 WAVE_OUTPUT_FILENAME = "/path/to/pyaudio_out.wav"
 p = pyaudio.PyAudio()
 stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
 print("*"*10, "Listening\n")
 frames = []
 for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
     # Each read returns CHUNK frames as bytes; pass exception_on_overflow=False to ignore input overflows
     data = stream.read(CHUNK)
     frames.append(data)
 print("*"*10, "Stop recording\n")
 stream.stop_stream()
 stream.close()
 p.terminate()
 wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
 wf.setnchannels(CHANNELS)
 wf.setsampwidth(p.get_sample_size(FORMAT))
 wf.setframerate(RATE)
 wf.writeframes(b''.join(frames))
 wf.close()
 r = sr.Recognizer()
 with sr.AudioFile(WAVE_OUTPUT_FILENAME) as source1:
     audio = r.record(source1)  # read the entire audio file
 # Convert 16-bit PCM samples to float32 values in [-1.0, 1.0)
 frame_data = np.frombuffer(audio.frame_data, np.int16).flatten().astype(np.float32) / 32768.0
 ```
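
Since the issue above is tied to low input volume, it may help to check how loud a saved recording actually is. The following is a minimal, standard-library-only sketch that assumes a 16-bit PCM WAV file like the one written by the example; `wav_rms`, `is_too_quiet`, and the `0.01` threshold are hypothetical names and values chosen for illustration, not part of this example's code.

```python
import array
import math
import sys
import wave


def wav_rms(path: str) -> float:
    """Return the RMS amplitude of a 16-bit PCM WAV file, scaled to [0, 1]."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def is_too_quiet(path: str, threshold: float = 0.01) -> bool:
    # The threshold is an assumed value; tune it for your microphone.
    return wav_rms(path) < threshold
```

Calling `is_too_quiet("/path/to/pyaudio_out.wav")` after recording gives a quick signal of whether the microphone gain should be raised before transcription.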
 #### Sample Output
 ```bash
 (llm) bigdl@bigdl-llm:~/Documents/voiceassistant$ python generate.py --llama2-repo-id-or-model-path /mnt/windows/demo/models/Llama-2-7b-chat-hf --whisper-repo-id-or-model-path /mnt/windows/demo/models/whisper-medium
--- a/python/llm/example/langchain/README.md
+++ b/python/llm/example/langchain/README.md
@@ -72,6 +72,51 @@ When you see output says
 Please say something through your microphone (e.g. What is AI). The program will automatically detect when you have finished speaking and recognize your speech.
 #### Known Issues
 The `speech_recognition` library may occasionally skip recording when the input volume is low. As a workaround, save the recording in WAV format using `PyAudio` and read the file as input. Here is an example using PyAudio:
 ```python
 import wave
 import numpy as np
 import pyaudio
 import speech_recognition as sr
 CHUNK = 1024
 FORMAT = pyaudio.paInt16
 CHANNELS = 1                # The desired number of input channels
 RATE = 16000                # The desired rate (in Hz)
 RECORD_SECONDS = 10         # Recording time (in seconds)
 WAVE_OUTPUT_FILENAME = "/path/to/pyaudio_out.wav"
 p = pyaudio.PyAudio()
 stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
 print("*"*10, "Listening\n")
 frames = []
 for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
     # Each read returns CHUNK frames as bytes; pass exception_on_overflow=False to ignore input overflows
     data = stream.read(CHUNK)
     frames.append(data)
 print("*"*10, "Stop recording\n")
 stream.stop_stream()
 stream.close()
 p.terminate()
 wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
 wf.setnchannels(CHANNELS)
 wf.setsampwidth(p.get_sample_size(FORMAT))
 wf.setframerate(RATE)
 wf.writeframes(b''.join(frames))
 wf.close()
 r = sr.Recognizer()
 with sr.AudioFile(WAVE_OUTPUT_FILENAME) as source1:
     audio = r.record(source1)  # read the entire audio file
 # Convert 16-bit PCM samples to float32 values in [-1.0, 1.0)
 frame_data = np.frombuffer(audio.frame_data, np.int16).flatten().astype(np.float32) / 32768.0
 ```
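
The final line of the example divides signed 16-bit samples by 32768 so that they fall in the range [-1.0, 1.0). The following dependency-free sketch illustrates the same scaling on a few hand-written bytes; `pcm16_to_float` is a hypothetical helper written for illustration, and the example itself uses NumPy for this step.

```python
import array
import sys


def pcm16_to_float(raw: bytes) -> list:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # the input bytes are little-endian
    return [s / 32768.0 for s in samples]


# Three samples: 0, 16384 and -32768
print(pcm16_to_float(b"\x00\x00\x00\x40\x00\x80"))  # [0.0, 0.5, -1.0]
```

The largest positive sample, 32767, maps to just under 1.0, which is why the upper bound of the range is exclusive.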
 ### 4. Math
 This is an example using `LLMMathChain`. This example has been validated using [phoenix-7b](https://huggingface.co/FreedomIntelligence/phoenix-inst-chat-7b).