1 · cb219 · 1y

Rant/question
Seems like I can't run any GPT model on my GPU, as they require more GPU memory than my 6GB card can offer. 😪 I wanted to try out self-hosting, but I couldn't solve CUDA's OutOfMemoryError with the given solutions.
Any recommendations for a smaller model?

Comments
  • 0
    Have you tried confidence?
  • 3
    Download more VRAM
  • 3
    GPT4All runs on CPU and uses RAM. You should be able to get that running (a minimal sketch follows after the thread).

    But also, using any of the web UIs, you should be able to run smaller models. I could run 3B models on a 4GB GPU half a year ago. You can surely get a 7B, maybe even a 13B, on a 6GB GPU these days.
  • 0
    See if there's a quantized model with 8-bit or even 4-bit ints (see the quantization sketch after this thread).
  • 0
    @Hazarth Interesting idea, I'll give it a try. Falcon 7B seems to be the only possible option.
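
A minimal sketch of the CPU route suggested above, using the gpt4all Python package: it loads a small quantized model into plain RAM, so no GPU is needed at all. The model file name is an assumption; any small GGUF model from the GPT4All catalog works and is downloaded on first use.

```python
from gpt4all import GPT4All

# "orca-mini-3b-gguf2-q4_0.gguf" is an assumed example; a 3B model
# quantized to 4 bits needs only a few GB of plain RAM, no GPU at all.
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")

with model.chat_session():
    reply = model.generate("Explain what a token is, in one sentence.",
                           max_tokens=100)
    print(reply)
```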
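And a minimal sketch of the quantization route, assuming Hugging Face transformers with bitsandbytes: loading the weights in 4-bit brings a 7B model down to roughly 4-5 GB, which can fit in 6 GB of VRAM. The model name is an assumption (Falcon 7B, as mentioned in the thread); any ~7B causal LM loads the same way.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "tiiuae/falcon-7b"  # assumed example, per the thread

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights as 4-bit ints
    bnb_4bit_compute_dtype=torch.float16,  # do the math in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU if VRAM still runs out
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```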