What's your self-hosting success of the week?
shark@lemmy.org to Selfhosted@lemmy.world · English · 4 days ago · 96 comments
sharkaccident@lemmy.world · 4 days ago
What GPU and model do you use?
Shimitar@downonthestreet.eu · 4 days ago
NVIDIA Corporation GA104GL [RTX A4000] (rev a1), from lspci.
It has 16 GB of VRAM, not a lot, but enough to run gpt-oss:20b and a few other models quite nicely. I noticed it's better to stick to a single model; I imagine unloading and reloading a model in VRAM takes time.
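For anyone wanting to replicate this, a minimal sketch, assuming the setup is Ollama (the colon-style model tag suggests it, but the commenter doesn't say): the keep-alive setting is what avoids the unload/reload cost by keeping one model resident in VRAM between requests.

```sh
# Confirm which GPU is present (this is where the lspci line above comes from)
lspci | grep -i nvidia

# Assumption: Ollama is the runtime. Keep the loaded model in VRAM
# indefinitely instead of evicting it after the default 5 minutes,
# so repeated requests don't pay the reload cost.
export OLLAMA_KEEP_ALIVE=-1

# Run the 20B model; it fits in 16 GB of VRAM, but realistically
# only one model of this size at a time.
ollama run gpt-oss:20b
```

The same keep-alive value can also be set per request via the API, but an environment variable on the server is the simplest way to enforce the "stick to a single model" approach described above.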