Beginner questions thread

Trying something new: I'm pinning this thread as a place for beginners to ask questions, whether or not they might seem stupid, to encourage both the asking and the answering.

Depending on activity level, I'll either make a new one once in a while or just leave this one up forever as a place to learn and ask.

When asking a question, try to make clear what your current knowledge level is and where you may have gaps. That should help people provide more useful, concise answers!

12 comments
  • I get an error when offloading the whole model to the GPU:

    ./build/bin/llama-cli -m ~/software/ai/models/deepseek-math-7b-instruct.Q8_0.gguf -n 200 -t 10 -ngl 31 -if

    The relevant output is:

    ....

    llama_model_load_from_file_impl: using device Vulkan0 (Intel(R) Iris(R) Xe Graphics (RPL-U)) - 7759 MiB free

    ...

    print_info: file size = 6.84 GiB (8.50 BPW)

    ....

    load_tensors: loading model tensors, this can take a while... (mmap = true)
    load_tensors: offloading 30 repeating layers to GPU
    load_tensors: offloading output layer to GPU
    load_tensors: offloaded 31/31 layers to GPU
    load_tensors: Vulkan0 model buffer size = 6577.83 MiB
    load_tensors: CPU_Mapped model buffer size = 425.00 MiB

    .....

    ggml_vulkan: Device memory allocation of size 2013265920 failed
    ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
    llama_kv_cache_init: failed to allocate buffer for kv cache
    llama_init_from_model: llama_kv_cache_init() failed for self-attention cache
    common_init_from_params: failed to create context with model '~/software/ai/models/deepseek-math-7b-instruct.Q8_0.gguf'
    main: error: unable to load model

    It seems to me that there is enough room for the model, but I don't know what "Device memory allocation of size 2013265920" means.
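
One way to read that number is as a byte count. The log lines right after it say the failed buffer was for the KV cache, which is allocated in addition to the model weights. The sketch below only redoes the arithmetic with the figures quoted in the output above. It assumes the "7759 MiB free" value is the budget that both the model buffer and the KV cache have to fit into, which may not match exactly how the Vulkan driver accounts for memory. The variable names are made up for illustration.

```python
# Back-of-the-envelope check using only numbers quoted in the log above.
# Assumption: the "7759 MiB free" figure is the budget that both the model
# buffer and the KV cache must fit into on the Vulkan device.

MIB = 1024 * 1024  # bytes per MiB

free_mib = 7759.0                 # "Vulkan0 ... - 7759 MiB free"
model_buffer_mib = 6577.83        # "Vulkan0 model buffer size = 6577.83 MiB"
failed_alloc_bytes = 2013265920   # "Device memory allocation of size 2013265920 failed"

# Convert the failed allocation from bytes to MiB.
failed_alloc_mib = failed_alloc_bytes / MIB

# Total device memory requested: model weights plus the KV cache buffer.
needed_mib = model_buffer_mib + failed_alloc_mib

# Positive value means the request exceeds what was reported as free.
shortfall_mib = needed_mib - free_mib

print(f"failed allocation: {failed_alloc_mib:.2f} MiB")
print(f"model + KV cache:  {needed_mib:.2f} MiB")
print(f"reported free:     {free_mib:.2f} MiB")
print(f"shortfall:         {shortfall_mib:.2f} MiB")
```

Under that assumption, the model alone fits, but the model plus the 1920 MiB KV cache buffer comes to about 8498 MiB, roughly 739 MiB more than the device reported as free. If that reading is right, offloading fewer layers with a smaller `-ngl` value would leave more room on the device for the cache.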
