Small mddoc fixes based on review (#11391)

* Fix based on review

* Further fix

* Small fix

* Small fix
Yuwen Hu 2024-06-21 17:09:30 +08:00 committed by GitHub
parent 072ce7e66d
commit a027121530
10 changed files with 91 additions and 97 deletions

View file

@@ -47,7 +47,7 @@ Choose one of the following methods to start the container:
     $DOCKER_IMAGE
   ```

-- For **Windows users**:
+- For **Windows WSL users**:

   To map the `xpu` into the container, you need to specify `--device=/dev/dri` when booting the container. Also change `/path/to/models` to mount your models, then add `--privileged` and map `/usr/lib/wsl` into the container.
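As a hedged sketch of what such a WSL launch might look like (the image tag and host paths below are illustrative assumptions, not the repository's exact command):

```bash
# Hypothetical WSL launch; the image tag and paths are assumptions for illustration.
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest   # assumed image tag
sudo docker run -itd \
        --net=host \
        --privileged \
        --device=/dev/dri \
        -v /path/to/models:/llm/models \
        -v /usr/lib/wsl:/usr/lib/wsl \
        $DOCKER_IMAGE
```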

View file

@@ -23,7 +23,7 @@ output = tokenizer.batch_decode(output_ids)
 ```

 > [!TIP]
-> See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>).
+> See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels).

 > [!NOTE]
 > You may apply more low bit optimizations (including INT8, INT5 and INT4) as follows:
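As a hedged illustration of the pattern that note refers to, a specific low-bit format can be selected at load time via `load_in_low_bit` (the model id and the `"sym_int5"` choice below are assumptions):

```python
# A minimal sketch of applying an explicit low-bit optimization at load time.
# The model id and the chosen precision are illustrative assumptions.
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",  # hypothetical model id
    load_in_low_bit="sym_int5",       # e.g. "sym_int8", "sym_int5", "sym_int4"
    trust_remote_code=True,
)
```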

View file

@@ -66,7 +66,8 @@ You could choose to use [PyTorch API](./optimize_model.md) or [`transformers`-st
   model = model.to('xpu') # Important after obtaining the optimized model
   ```

-> [!TIP]
+> **Tip**:
+>
 > When running LLMs on Intel iGPUs for Windows users, we recommend setting `cpu_embedding=True` in the `from_pretrained` function. This will allow the memory-intensive embedding layer to utilize the CPU instead of iGPU.
 >
 > See the [API doc](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/transformers.html) to find more information.
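A minimal sketch of that tip, assuming a hypothetical model id (`cpu_embedding=True` is passed alongside the usual low-bit options):

```python
# Sketch: keep the memory-intensive embedding layer on the CPU when using an iGPU.
# The model id is a hypothetical placeholder.
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",  # hypothetical model id
    load_in_4bit=True,
    cpu_embedding=True,               # embeddings run on CPU instead of iGPU
    trust_remote_code=True,
)
model = model.to('xpu')               # move the rest of the model to the Intel GPU
```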
@@ -81,7 +82,8 @@ You could choose to use [PyTorch API](./optimize_model.md) or [`transformers`-st
   model = model.to('xpu') # Important after obtaining the optimized model
   ```

-> [!TIP]
+> **Tip**:
+>
 > When running saved optimized models on Intel iGPUs for Windows users, we also recommend setting `cpu_embedding=True` in the `load_low_bit` function.
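Correspondingly, a hedged sketch for the saved-model case (the save directory is an assumption):

```python
# Sketch: reload a previously saved low-bit model with CPU-side embeddings.
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.load_low_bit(
    "/path/to/saved/low-bit-model",   # hypothetical save directory
    cpu_embedding=True,               # again keep the embedding layer on CPU
)
model = model.to('xpu')
```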

View file

@@ -19,7 +19,7 @@ output = doc_chain.run(...)
 ```

 > [!TIP]
-> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/transformers_int4)
+> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain)

 ## Using Native INT4 Format
@@ -42,6 +42,3 @@ ipex_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
 doc_chain = load_qa_chain(ipex_llm, ...)
 doc_chain.run(...)
 ```
-
-> [!TIP]
-> See the examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/native_int4) for more information.
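To make the snippet in that hunk concrete, a hedged end-to-end sketch might look like the following; the import path `ipex_llm.langchain.llms`, the chain type, and the document contents are assumptions for illustration:

```python
# A hedged end-to-end sketch; import path, chain type, and document
# contents are assumptions, not the repository's exact example.
from langchain.chains.question_answering import load_qa_chain
from langchain.docstore.document import Document
from ipex_llm.langchain.llms import LlamaLLM  # assumed import path

ipex_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
docs = [Document(page_content="IPEX-LLM accelerates LLMs on Intel CPUs and GPUs.")]
doc_chain = load_qa_chain(ipex_llm, chain_type="stuff")
doc_chain.run(input_documents=docs, question="What does IPEX-LLM do?")
```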

View file

@@ -10,7 +10,9 @@ You may also convert Hugging Face *Transformers* models into native INT4 format
 # convert the model
 from ipex_llm import llm_convert
 ipex_llm_path = llm_convert(model='/path/to/model/',
-                            outfile='/path/to/output/', outtype='int4', model_family="llama")
+                            outfile='/path/to/output/',
+                            outtype='int4',
+                            model_family="llama")
 # load the converted model
 # switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
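For context, loading the converted file might then look roughly like this; this is a sketch under the assumption that the native loading path takes a `native=True` flag, so check the repository's native-format docs for the exact signature:

```python
# A hedged sketch of loading the converted native INT4 model.
# `native=True` is an assumption; switch the class to
# ChatGLMForCausalLM/GptneoxForCausalLM/etc. for other model families.
from ipex_llm.transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(ipex_llm_path, native=True)
```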

View file

@@ -55,13 +55,14 @@ First we recommend using [Conda](https://conda-forge.org/download/) to create a
   pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
   ```

-- For
-  ```cmd
-  conda create -n llm python=3.11
-  conda activate llm
-  pip install --pre --upgrade ipex-llm[all]
-  ```
+- For **Windows users**:
+
+  ```cmd
+  conda create -n llm python=3.11
+  conda activate llm
+  pip install --pre --upgrade ipex-llm[all]
+  ```
 Then for running an LLM with IPEX-LLM optimizations (taking `example.py` as an example):
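A minimal sketch of what such an `example.py` might contain (the model id, prompt, and generation length are assumptions):

```python
# example.py -- a hedged minimal sketch; model id and prompt are assumptions.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"   # hypothetical model id
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("What is AI?", return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.batch_decode(output_ids)[0])
```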

View file

@@ -20,7 +20,7 @@
 ## Langchain-Chatchat Architecture

-See the Langchain-Chatchat architecture below ([source](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/img/langchain%2Bchatglm.png)).
+See the Langchain-Chatchat architecture below ([source](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/docs/img/langchain%2Bchatglm.png)).

 <img src="https://llm-assets.readthedocs.io/en/latest/_images/langchain-arch.png" height="50%" />

View file

@@ -139,15 +139,12 @@ You can now open a browser and access the RAGflow web portal. With the default s
 If this is your first time using RAGFlow, you will need to register. After registering, log in with your new account to access the portal.

-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png" style="width: 100%;" />
-  </a>
-</div>
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-login2.png"/></a></td>
+  </tr>
+</table>

 #### Configure `Ollama` service URL
@@ -180,26 +177,21 @@ Go to **Knowledge Base** by clicking on **Knowledge Base** in the top bar. Click
 After entering a name, you will be directed to edit the knowledge base. Click on **Dataset** on the left, then click **+ Add file -> Local files**. Upload your file in the pop-up window and click **OK**.

-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png" style="width: 100%;" />
-  </a>
-</div>
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase2.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase3.png"/></a></td>
+  </tr>
+</table>

 After the upload is successful, you will see a new record in the dataset. The _**Parsing Status**_ column will show `UNSTARTED`. Click the green start button in the _**Action**_ column to begin file parsing. Once parsing is finished, the _**Parsing Status**_ column will change to **SUCCESS**.
-<div style="display: flex; gap: 5px;">
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png" style="width: 100%;" />
-  </a>
-  <a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png" target="_blank" style="flex: 1;">
-    <img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png" style="width: 100%;" />
-  </a>
-</div>
+<table width="100%">
+  <tr>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase4.png"/></a></td>
+    <td><a href="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png"><img src="https://llm-assets.readthedocs.io/en/latest/_images/ragflow-knowledgebase5.png"/></a></td>
+  </tr>
+</table>

 Next, go to **Configuration** on the left menu and click **Save** at the bottom to save the changes.