• 0 Posts
  • 10 Comments
Joined 2 years ago
Cake day: September 27th, 2024



  • I have a Radeon RX 7800 XT.

    Qwen 3.5-9b is blazingly fast on it. However, while it’s impressive for its size, it has its limitations: complex tasks with several steps are too much for it.

    So now I run the 3.6-35B model with llama.cpp. It’s too big for my VRAM, so I had to split it: everything that doesn’t fit on the graphics card runs in normal system RAM. That slows everything down, but with the right flags I get a bit over 20 tokens/s.

    If you have problems with speed and you’re using ollama, I would replace it with something faster like llama.cpp.
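
For anyone wondering how the VRAM/RAM split mentioned above might be sized, here is a rough sketch. It is not the setup from the comment (the actual flags aren't stated there). It only estimates how many layers could sit on the GPU, a number one could then try as llama.cpp's `--n-gpu-layers` (`-ngl`) value. The function name, the equal-layer-size assumption, the 2 GiB reserve for context and buffers, and all example numbers are hypothetical.

```python
def layers_on_gpu(model_gib: float, n_layers: int, vram_gib: float,
                  reserve_gib: float = 2.0) -> int:
    """Estimate how many model layers fit in VRAM.

    Assumptions (hypothetical, for illustration only):
    - every layer takes the same share of the model file size
    - reserve_gib of VRAM is kept free for KV cache, buffers, desktop
    """
    if model_gib <= 0 or n_layers <= 0:
        raise ValueError("model_gib and n_layers must be positive")
    per_layer_gib = model_gib / n_layers
    usable_gib = max(vram_gib - reserve_gib, 0.0)
    # Never report more layers than the model actually has.
    return min(n_layers, int(usable_gib // per_layer_gib))


if __name__ == "__main__":
    # Made-up example: a 20 GiB quantized model with 40 layers
    # on a 16 GiB card; the remaining layers would stay in system RAM.
    print(layers_on_gpu(model_gib=20, n_layers=40, vram_gib=16))
```

Treat the result as a starting point only: real layers are not equal in size and context length changes memory use, so the value usually needs adjusting up or down by trial.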




  • Various AI features for Ubuntu Linux are expected to land over the next year with a bias on local inferencing by default. Canonical engineers will be working on integrating agentic workflows into Ubuntu for those that want it. There are areas being explored for AI use on Ubuntu both for the desktop as well as for Ubuntu servers such as for assisting in interpreting system logs.

    That actually sounds reasonable. As long as it doesn’t get shoved down users’ throats, it could turn out fine. And sifting through logs is in fact a good task for LLMs, in my opinion.