Quickstart¶
This page will guide you through getting started with azarrot.
Prerequisites¶
Azarrot has a few hardware and software prerequisites.
Hardware¶
Azarrot supports CPUs and Intel GPUs.
Tested GPUs:
Intel Arc A770 16GB
Intel Iris Xe 96EU (integrated in the i7-12700H)
Other devices should work if they are supported by the oneAPI toolkit and drivers.
Software¶
Any Linux distribution
Intel GPU drivers (if you are using Intel GPUs) from https://dgpu-docs.intel.com/driver/client/overview.html
Intel oneAPI Base Toolkit 2024.0 or above from https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/
Python 3.11.x or below
Azarrot is developed and tested on Ubuntu 22.04 and Python 3.10.
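Before going further, you can check that the toolkit actually sees your devices with sycl-ls, which ships with the oneAPI Base Toolkit. This is a generic oneAPI check, not an azarrot command:
source /opt/intel/oneapi/setvars.sh
# The output should list your CPU and any Intel GPUs as SYCL devices
sycl-ls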
Install¶
With Docker or Podman¶
Image: ghcr.io/notsyncing/azarrot:main
See docker/docker-compose.yml for an example.
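If you prefer a one-off docker run over Compose, a sketch like the following should work; the port and the container-side volume path here are assumptions, so verify them against docker/docker-compose.yml:
# Hypothetical invocation: the port and container paths are illustrative.
# --device /dev/dri passes the Intel GPU through to the container.
docker run --rm \
  --device /dev/dri \
  -p 8080:8080 \
  -v "$(pwd)/models:/app/models" \
  ghcr.io/notsyncing/azarrot:main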
Install from PyPI¶
Simply install azarrot from PyPI:
pip install azarrot
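Because azarrot requires Python 3.11.x or below, it can be convenient to install it into a virtual environment pinned to a tested interpreter, for example:
# Python 3.10 is the version azarrot is tested on
python3.10 -m venv .venv
source .venv/bin/activate
pip install azarrot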
Then create a server.yml in the directory where you want to run it:
mkdir azarrot
# Copy from examples/server.yml
cp <SOURCE_ROOT>/examples/server.yml azarrot/
<SOURCE_ROOT> refers to the path of the repository you cloned.
In server.yml you can configure settings such as the listening port and the model path.
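Purely as an illustration, a minimal server.yml could look like the sketch below; the key names are hypothetical, so copy the real structure from examples/server.yml:
# Illustrative sketch only -- the actual keys are defined in examples/server.yml
host: 127.0.0.1
port: 8080          # matches the URL used at the end of this guide
models_dir: models  # the directory scanned for *.model.yml files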
Next we create the models directory:
cd azarrot
mkdir models
And copy an example model file into the models directory:
cp <SOURCE_ROOT>/examples/CodeQwen1.5-7B-ipex-llm.model.yml models/
Azarrot loads all .model.yml files in this directory. You need to download the model manually from Hugging Face, or convert it first if you are using the OpenVINO backend (a conversion sketch follows below):
huggingface-cli download --local-dir models/CodeQwen1.5-7B Qwen/CodeQwen1.5-7B
When loading the model with the IPEX-LLM backend, azarrot will convert it to int4.
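If you use the OpenVINO backend instead, the model must be converted ahead of time. This guide does not prescribe a converter, but one common option is the optimum-cli tool from optimum-intel; treat the exact invocation below as an assumption:
pip install "optimum[openvino]"
# Exports the model to OpenVINO format with int4 weights
optimum-cli export openvino --model Qwen/CodeQwen1.5-7B --weight-format int4 models/CodeQwen1.5-7B-ov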
Usage¶
Now we can start the server:
source /opt/intel/oneapi/setvars.sh
python -m azarrot
Then access http://localhost:8080/v1/models to see all loaded models.
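You can query the server with any HTTP client, for example curl:
curl http://localhost:8080/v1/models
Since the /v1/models path follows the OpenAI API layout, a chat completion request presumably works as shown below; the model id is an assumption and must match one of your .model.yml definitions:
# The model name "CodeQwen1.5-7B" is illustrative -- use one listed by /v1/models
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "CodeQwen1.5-7B", "messages": [{"role": "user", "content": "Hello!"}]}'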