by Sylvain Artois on Mar 12, 2025
I recently went through the process of getting Pixtral-12B running on an NVIDIA Jetson Orin AGX. Here’s how I did it, in case it’s helpful for others who want to run this model on similar hardware. I’m on JetPack 6.2.
First, I checked the Hugging Face Transformers documentation: Pixtral is served through the LLaVA classes, which have no TensorFlow implementation, so PyTorch was the way to go. Installing PyTorch on Jetson requires setting the CUDA version:
# Add to your shell config (~/.zshrc in my case)
export CUDA_VERSION=12.6
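Before going further, it’s worth a quick sanity check that the toolkit JetPack installed really is CUDA 12.6 (nvcc may not be on your PATH, hence the full path):
# Should report release 12.6 on JetPack 6.2
/usr/local/cuda/bin/nvcc --version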
The target PyTorch build (NVIDIA’s nv24.06 release or later) has a hard dependency on cuSPARSELt. NVIDIA provides an installation script, but it isn’t compatible with the CUDA 12.6 that ships with JetPack 6.2. I had to write a custom installer for cuSPARSELt using the most recent build:
#!/bin/bash
# Run as root (sudo): it writes into /usr/local/cuda and calls ldconfig.
set -ex

# cuSPARSELt license: https://docs.nvidia.com/cuda/cusparselt/license.html
mkdir -p tmp_cusparselt && cd tmp_cusparselt

# For Jetson Orin with CUDA 12.6 (JetPack 6.2)
CUSPARSELT_NAME="libcusparse_lt-linux-aarch64-0.7.1.0-archive"
echo "Downloading: ${CUSPARSELT_NAME}"
# -f makes curl fail on HTTP errors, so set -e aborts if the download breaks
curl -fL --retry 3 -O https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-aarch64/${CUSPARSELT_NAME}.tar.xz

# Extract and install into the CUDA toolkit directories
tar xf ${CUSPARSELT_NAME}.tar.xz
cp -a ${CUSPARSELT_NAME}/include/* /usr/local/cuda/include/
cp -a ${CUSPARSELT_NAME}/lib/* /usr/local/cuda/lib64/

cd ..
rm -rf tmp_cusparselt
ldconfig
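A quick way to confirm the install landed where the linker will find it (if the grep comes up empty, check that /usr/local/cuda/lib64 is covered by your ld.so.conf):
# The shared library should now be registered with the dynamic linker
ldconfig -p | grep cusparseLt
# And the header should sit in the CUDA include directory
ls /usr/local/cuda/include/cusparseLt.h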
After creating a conda virtual environment, I installed PyTorch using the wheel NVIDIA builds specifically for Jetson. Note that it’s published under the JetPack 6.1 (v61) path, but JetPack 6.1 and 6.2 both ship CUDA 12.6, so it works fine:
pip install --no-cache https://developer.download.nvidia.cn/compute/redist/jp/v61/pytorch/torch-2.5.0a0+872d972e41.nv24.08.17622132-cp310-cp310-linux_aarch64.whl
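To check that the wheel actually sees the GPU:
# Should print the nv24.08 build string and True
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"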
Then I installed Hugging Face Transformers:
pip install transformers
When I tested whether Transformers was working correctly:
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
I got an error telling me to downgrade NumPy (the Jetson PyTorch wheel is built against NumPy 1.x, and pip had pulled in NumPy 2.x). Pinning NumPy and installing Pillow fixed it:
pip install --upgrade numpy==1.24.4
pip install pillow
The test command worked successfully.
Finally, I created a Python script based on the usage example from the Hugging Face model page for Pixtral-12B. The copied example sends the inputs to CUDA but leaves the model on the CPU and never prints the result, so I patched both; I also loaded the weights in bfloat16 to halve memory use. With that, the model ran successfully.
# Adapted from https://huggingface.co/mistral-community/pixtral-12b
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "mistral-community/pixtral-12b"
# Load in bfloat16 to halve the memory footprint versus fp32, and move the
# model to the GPU so it matches the inputs below
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

# The Pixtral processor accepts image URLs directly
IMG_URLS = [
    "https://picsum.photos/id/237/400/300",
    "https://picsum.photos/id/231/200/300",
    "https://picsum.photos/id/27/500/500",
    "https://picsum.photos/id/17/150/600",
]
PROMPT = "<s>[INST]Describe the images.\n[IMG][IMG][IMG][IMG][/INST]"

# Cast the floating-point inputs (pixel values) to the model dtype;
# integer tensors like input_ids are only moved, not cast
inputs = processor(text=PROMPT, images=IMG_URLS, return_tensors="pt").to("cuda", torch.bfloat16)
generate_ids = model.generate(**inputs, max_new_tokens=500)
output = processor.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(output)
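Loading the 12B weights takes a while and eats a large share of the Orin’s unified memory. If you want to keep an eye on it while the model loads, Jetson’s built-in tegrastats utility prints RAM and GPU utilization once a second from a second terminal:
# Ctrl+C to stop
sudo tegrastats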
That’s it! The setup takes some time, especially the cuSPARSELt installation, but it’s worth it to get this powerful multimodal model running on edge hardware.