Small mddoc fixed based on review (#11391)

* Fix based on review
* Further fix
* Small fix
* Small fix

parent 072ce7e66d
commit a027121530

10 changed files with 91 additions and 97 deletions
@@ -47,7 +47,7 @@ Choose one of the following methods to start the container:
 $DOCKER_IMAGE
 ```
 
-- For **Windows users**:
+- For **Windows WSL users**:
 
 To map the `xpu` into the container, you need to specify `--device=/dev/dri` when booting the container. And change the `/path/to/models` to mount the models. Then add `--privileged` and map the `/usr/lib/wsl` to the docker.
 
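The WSL guidance in the hunk above can be collected into a single `docker run` invocation. This is only a sketch: the image tag and the `/models` mount point inside the container are hypothetical placeholders, not values taken from the diff.

```shell
# Hypothetical sketch of the WSL launch command described above.
# The image tag and in-container mount point are placeholders -- substitute your own.
DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
CMD="docker run -it --privileged \
  --device=/dev/dri \
  -v /path/to/models:/models \
  -v /usr/lib/wsl:/usr/lib/wsl \
  $DOCKER_IMAGE"
echo "$CMD"
```

`--device=/dev/dri` exposes the GPU render node, while `--privileged` plus the `/usr/lib/wsl` mount are the WSL-specific additions the text calls out.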
@@ -23,7 +23,7 @@ output = tokenizer.batch_decode(output_ids)
 ```
 
 > [!TIP]
-> See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>).
+> See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels).
 
 > [!NOTE]
 > You may apply more low bit optimizations (including INT8, INT5 and INT4) as follows:
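The fix above removes a stray `>` that had slipped inside a link target. A quick way to catch this class of typo is to scan markdown link destinations for a trailing `>`; the helper below is a hypothetical lint sketch, not part of the ipex-llm repository.

```python
import re

# Hypothetical helper: flag markdown link targets that end with a stray '>'
# (the exact typo corrected in the hunk above).
LINK = re.compile(r'\[[^\]]*\]\(([^)\s]+)\)')

def suspicious_links(markdown: str) -> list[str]:
    return [url for url in LINK.findall(markdown) if url.endswith('>')]

bad = "GPU examples [here](https://example.com/GPU/HF-Transformers-AutoModels>)."
print(suspicious_links(bad))  # the URL with the trailing '>' is reported
```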
@@ -66,7 +66,8 @@ You could choose to use [PyTorch API](./optimize_model.md) or [`transformers`-st
 model = model.to('xpu') # Important after obtaining the optimized model
 ```
 
-> [!TIP]
+> **Tip**:
+>
 > When running LLMs on Intel iGPUs for Windows users, we recommend setting `cpu_embedding=True` in the `from_pretrained` function. This will allow the memory-intensive embedding layer to utilize the CPU instead of iGPU.
 >
 > See the [API doc](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/transformers.html) to find more information.
@@ -81,7 +82,8 @@ You could choose to use [PyTorch API](./optimize_model.md) or [`transformers`-st
 
 model = model.to('xpu') # Important after obtaining the optimized model
 ```
-> [!TIP]
+> **Tip**:
+>
 >
 > When running saved optimized models on Intel iGPUs for Windows users, we also recommend setting `cpu_embedding=True` in the `load_low_bit` function.
 
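The two hunks above apply the same mechanical rewrite: `> [!TIP]` becomes `> **Tip**:` followed by a bare `>` continuation line. A small script (hypothetical, not part of this PR) could apply that rewrite uniformly across files:

```python
import re

# Hypothetical sketch: convert GitHub alert syntax to plain bold labels,
# mirroring the manual edits shown in the hunks above.
def alert_to_bold(markdown: str) -> str:
    # "> [!TIP]" -> "> **Tip**:" plus a bare ">" continuation line
    return re.sub(
        r'^> \[!(TIP|NOTE)\]$',
        lambda m: f'> **{m.group(1).capitalize()}**:\n>',
        markdown,
        flags=re.MULTILINE,
    )

print(alert_to_bold("> [!TIP]\n> Set cpu_embedding=True on Intel iGPUs."))
```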
@@ -19,7 +19,7 @@ output = doc_chain.run(...)
 ```
 
 > [!TIP]
-> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/transformers_int4)
+> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain)
 
 ## Using Native INT4 Format
 
@@ -42,6 +42,3 @@ ipex_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
 doc_chain = load_qa_chain(ipex_llm, ...)
 doc_chain.run(...)
 ```
-
-> [!TIP]
-> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/native_int4) for more information.
@@ -10,7 +10,9 @@ You may also convert Hugging Face *Transformers* models into native INT4 format
 # convert the model
 from ipex_llm import llm_convert
 ipex_llm_path = llm_convert(model='/path/to/model/',
-        outfile='/path/to/output/', outtype='int4', model_family="llama")
+        outfile='/path/to/output/',
+        outtype='int4',
+        model_family="llama")
 
 # load the converted model
 # switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
@@ -55,7 +55,8 @@ First we recommend using [Conda](https://conda-forge.org/download/) to create a
 pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 ```
 
-- For
+- For **Windows users**:
+
 ```cmd
 conda create -n llm python=3.11
 conda activate llm
@@ -20,7 +20,7 @@
 
 ## Langchain-Chatchat Architecture
 
-See the Langchain-Chatchat architecture below ([source](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/img/langchain%2Bchatglm.png)).
+See the Langchain-Chatchat architecture below ([source](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/docs/img/langchain%2Bchatglm.png)).
 
 <img src="https://llm-assets.readthedocs.io/en/latest/_images/langchain-arch.png" height="50%" />
 
@@ -139,15 +139,12 @@ You can now open a browser and access the RAGflow web portal. With the default s
 
 If this is your first time using RAGFlow, you will need to register. After registering, log in with your new account to access the portal.
 
-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png" style="width: 100%;" />
-  </a>
-</div>
-
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png"/></a></td>
+  </tr>
+</table>
 
 #### Configure `Ollama` service URL
 
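The replacement markup above follows one pattern: each screenshot becomes a linked `<td>` cell inside a single-row table. A small generator (hypothetical, for illustration only) shows the shape of that markup:

```python
# Hypothetical helper that emits the single-row image table used in the
# new markup above: each URL becomes a linked <td> cell.
def image_table(urls):
    cells = "".join(
        f'<td><a href="{u}"><img src="{u}"/></a></td>' for u in urls
    )
    return f'<table width="100%">\n  <tr>\n    {cells}\n  </tr>\n</table>'

html = image_table([
    "https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png",
    "https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png",
])
print(html)
```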
@@ -180,26 +177,21 @@ Go to **Knowledge Base** by clicking on **Knowledge Base** in the top bar. Click
 
 After entering a name, you will be directed to edit the knowledge base. Click on **Dataset** on the left, then click **+ Add file -> Local files**. Upload your file in the pop-up window and click **OK**.
 
-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png" style="width: 100%;" />
-  </a>
-</div>
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png"/></a></td>
+  </tr>
+</table>
 
 After the upload is successful, you will see a new record in the dataset. The _**Parsing Status**_ column will show `UNSTARTED`. Click the green start button in the _**Action**_ column to begin file parsing. Once parsing is finished, the _**Parsing Status**_ column will change to **SUCCESS**.
 
-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png" style="width: 100%;" />
-  </a>
-</div>
-
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png"/></a></td>
+  </tr>
+</table>
 
 Next, go to **Configuration** on the left menu and click **Save** at the bottom to save the changes.
 