# Run Coding Copilot in VSCode with Intel GPU
[**Continue**](https://marketplace.visualstudio.com/items?itemName=Continue.continue) is a coding copilot extension in [Microsoft Visual Studio Code](https://code.visualstudio.com/); by porting it to [`ipex-llm`](https://github.com/intel-analytics/ipex-llm), users can now easily leverage local LLMs running on Intel GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) for code explanation, code generation/completion, etc.
See the demos of using Continue with [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) running on an Intel Arc A770 GPU below.
*(Demo videos: Code Generation | Code Explanation)*
## Quickstart
This guide walks you through setting up and running **Continue** within _Visual Studio Code_, empowered by local large language models served via [Ollama](./ollama_quickstart.html) with `ipex-llm` optimizations.
### 1. Install and Run Ollama Serve
Visit [Run Ollama with IPEX-LLM on Intel GPU](./ollama_quickstart.html) and follow the steps 1) [Install IPEX-LLM for Ollama](./ollama_quickstart.html#install-ipex-llm-for-ollama), 2) [Initialize Ollama](./ollama_quickstart.html#initialize-ollama) and 3) [Run Ollama Serve](./ollama_quickstart.html#run-ollama-serve) to install, initialize, and start the Ollama service.
```eval_rst
.. important::

   Please make sure you have set ``OLLAMA_HOST=0.0.0.0`` before starting the Ollama service, so that connections from all IP addresses can be accepted.

.. tip::

   If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS, it is recommended to additionally set the following environment variable for optimal performance before the Ollama service is started:

   .. code-block:: bash

      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```
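Putting these together, on Linux the service could be started along these lines (a minimal sketch; the `./ollama` binary comes from the IPEX-LLM installation steps linked above):

```bash
# Accept connections from any IP address, not just localhost
export OLLAMA_HOST=0.0.0.0
# Recommended on Intel Arc A-Series Graphics with Linux for optimal performance
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
# Start the Ollama service
./ollama serve
```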
### 2. Prepare and Run Model
#### Pull [`codeqwen:latest`](https://ollama.com/library/codeqwen)
In a new terminal window:
```eval_rst
.. tabs::

   .. tab:: Linux

      .. code-block:: bash

         export no_proxy=localhost,127.0.0.1
         ./ollama pull codeqwen:latest

   .. tab:: Windows

      Please run the following command in Anaconda Prompt.

      .. code-block:: cmd

         set no_proxy=localhost,127.0.0.1
         ollama pull codeqwen:latest

.. seealso::

   Here's a list of models that can be used for coding copilot on local PC:

   - Code Llama
   - WizardCoder
   - Mistral
   - StarCoder
   - DeepSeek Coder

   You could find them in the `Ollama model library <https://ollama.com/library>`_ and have a try.
```
#### Create and Run Model
First, create a file named `Modelfile` with the following contents:
```
# Build on the codeqwen model pulled in the previous step
FROM codeqwen:latest
# Raise the context window to 4096 tokens, since Continue sends fairly long prompts
PARAMETER num_ctx 4096
```
then:
```eval_rst
.. tabs::

   .. tab:: Linux

      .. code-block:: bash

         ./ollama create codeqwen:latest-continue -f Modelfile

   .. tab:: Windows

      Please run the following command in Anaconda Prompt.

      .. code-block:: cmd

         ollama create codeqwen:latest-continue -f Modelfile
```
You can now find `codeqwen:latest-continue` in `ollama list`.
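For example (the ID, size, and timestamp columns below are illustrative):

```bash
./ollama list
# NAME                        ID            SIZE      MODIFIED
# codeqwen:latest-continue    <model id>    <size>    About a minute ago
# codeqwen:latest             <model id>    <size>    10 minutes ago
```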
Finally, run `codeqwen:latest-continue`:
```eval_rst
.. tabs::

   .. tab:: Linux

      .. code-block:: bash

         ./ollama run codeqwen:latest-continue

   .. tab:: Windows

      Please run the following command in Anaconda Prompt.

      .. code-block:: cmd

         ollama run codeqwen:latest-continue
```
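Optionally, you can verify that the model is reachable before wiring up Continue (a quick check using Ollama's standard REST API; adjust the address if you changed `OLLAMA_HOST` or the port):

```bash
# Send a one-off, non-streaming generation request to the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "codeqwen:latest-continue",
  "prompt": "Write a hello world program in Python.",
  "stream": false
}'
```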
### 3. Install `Continue` Extension
1. Click `Install` on the [Continue extension in the Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue)
2. This will open the Continue extension page in VS Code, where you will need to click `Install` again
3. Once you do this, you will see the Continue logo show up on the left sidebar. If you click it, the Continue extension will open up:
```eval_rst
.. note::

   We strongly recommend moving Continue to VS Code's right sidebar. This helps keep the file explorer open while using Continue, and the sidebar can be toggled with a simple keyboard shortcut.
```
### 4. Configure `Continue`
Once the Ollama service and the model are running, you can use your local LLM in Continue. After opening Continue (you can either click the extension icon on the sidebar or press `Ctrl+Shift+L`), click the `+` button next to the model dropdown, scroll down to the bottom, and click `Open config.json`.
In `config.json`, you'll find the `models` property, a list of the models that you have saved to use with Continue. Add the following configuration to `models`. Note that `model` should match the name of the model you created above, and `apiBase` should point to the Ollama service (which listens on port `11434` by default). Finally, remember to select this model in the model dropdown menu.
```json
{
  "models": [
    {
      "title": "Ollama",
      "provider": "ollama",
      "model": "codeqwen:latest-continue",
      "apiBase": "http://localhost:11434"
    }
  ]
}
```
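If you prefer, you can also edit this file directly; Continue typically stores it at `~/.continue/config.json`.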
### 5. How to Use `Continue`
For detailed tutorials, please refer to [this link](https://continue.dev/docs/how-to-use-continue). Here we only show the most common scenarios.
#### Ask about highlighted code or an entire file
If you don't understand how some code works, highlight it (press `Ctrl+Shift+L` to add it to the chat) and ask "how does this code work?"
#### Editing existing code
You can ask Continue to edit your highlighted code with the command `/edit`.
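For example, you can highlight a function, press `Ctrl+Shift+L` to send it to Continue, and type `/edit add comments explaining each step`; Continue will propose the change as a diff that you can review before accepting.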