Quickstart
==========

This page will guide you through getting started with Azarrot.

Prerequisites
-------------

Azarrot has some hardware and software prerequisites.

Hardware
^^^^^^^^

Azarrot supports CPUs and Intel GPUs.

Tested GPUs:

* Intel Arc A770 16GB
* Intel Iris Xe 96EU (i7-12700H)

Other devices should work if they are supported by the oneAPI toolkit and drivers.

Software
^^^^^^^^

* Any Linux distribution
* Intel GPU drivers (if you are using Intel GPUs) from https://dgpu-docs.intel.com/driver/client/overview.html
* Intel oneAPI Base Toolkit 2024.0 or above from https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/
* Python 3.11.x or below

Azarrot is developed and tested on Ubuntu 22.04 and Python 3.10.

Install
-------

With Docker or Podman
^^^^^^^^^^^^^^^^^^^^^

Image: `ghcr.io/notsyncing/azarrot:main`

See `docker/docker-compose.yml` for an example.

Install from PyPI
^^^^^^^^^^^^^^^^^

Simply install Azarrot from PyPI:

.. code-block:: bash

    pip install azarrot

Then, create a `server.yml` in the directory where you want to run it:

.. code-block:: bash

    mkdir azarrot

    # Copy from examples/server.yml
    cp <repo>/examples/server.yml azarrot/

`<repo>` means the path of the repository you cloned. In `server.yml` you can configure things like the listening port, the model path, etc.

Next, we create the models directory:

.. code-block:: bash

    cd azarrot
    mkdir models

And copy an example model file into the models directory:

.. code-block:: bash

    cp <repo>/examples/CodeQwen1.5-7B-ipex-llm.model.yml models/

Azarrot will load all `.model.yml` files in this directory. You need to download the model from Hugging Face manually, or convert it first if you are using the OpenVINO backend:

.. code-block:: bash

    huggingface-cli download --local-dir models/CodeQwen1.5-7B Qwen/CodeQwen1.5-7B

Azarrot will convert the model to `int4` when loading it with the IPEX-LLM backend.

Start using
-----------

Now we can start the server:

.. code-block:: bash

    source /opt/intel/oneapi/setvars.sh
    python -m azarrot

Then access `http://localhost:8080/v1/models` to see all loaded models.
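To check that the server is running, you can query the models endpoint from another terminal. A minimal check with `curl`, assuming the server listens on port 8080 as in the URL above:

.. code-block:: bash

    # List all loaded models; the response should include the models
    # defined by the .model.yml files in the models directory.
    curl http://localhost:8080/v1/models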
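The `/v1` route layout suggests an OpenAI-compatible API, so a chat completion request should work as well. The following is a sketch under that assumption; the endpoint path and the model id (`CodeQwen1.5-7B`) are assumptions, so check your `.model.yml` for the actual model id:

.. code-block:: bash

    # Assumed OpenAI-style chat completion endpoint; the model id below
    # is hypothetical and should match the id in your .model.yml file.
    curl http://localhost:8080/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
              "model": "CodeQwen1.5-7B",
              "messages": [{"role": "user", "content": "Write a hello world program in Python."}]
            }'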