• Shimitar@downonthestreet.eu
      16 hours ago

      NVIDIA Corporation GA104GL [RTX A4000] (rev a1)

      From lspci

      It has 16 GB of VRAM, which isn't a lot but is enough to run gpt-oss 20b and a few other models pretty nicely.
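
For a rough idea of why a 20b model can fit in 16 GB, here is a back-of-the-envelope sketch. It assumes about 20 billion parameters (taken from the model name) and a few illustrative bit widths; the function name `weights_gb` is made up for this example. It counts the weights only and ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


VRAM_GB = 16  # VRAM size mentioned in the post

# Assumed parameter count (20 billion) and illustrative precisions.
for bits in (16, 8, 4):
    size = weights_gb(20, bits)
    verdict = "fits" if size < VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit weights: ~{size:.0f} GB, {verdict} in {VRAM_GB} GB")
```

Under these assumptions only the roughly 4-bit case leaves headroom on a 16 GB card, which is consistent with a quantized 20b model running comfortably.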

      I noticed that it's better to stick to a single model; I imagine that unloading and reloading a model in VRAM takes time.