

There are many excellent options - far too many to list. So I will briefly say this: there are some really nice models around the 4B mark (like Qwen3-4B HIVEMIND, Nanbeige, IBM Granite 3B) which you should be able to run at higher quants (Q6 and up) quite nicely. Of course, there are always newer models too (Gemma, Qwen3.6 - soon 3.7, etc.).
Best bet is to poke around Hugging Face, in TheBloke's, Unsloth's or DavidAU's archives, and see what they have in the 3-7B range that tickles your fancy. Don’t immediately jump for the newest releases - the old ones are still good. Qwen3-4B 2507 Instruct is still a favourite of mine, and more recently Qwen3.5-2B shows promise.
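
If you want a rough sanity check on whether a given quant will fit, you can estimate the weight size from parameter count and bits per weight. This is only a back-of-the-envelope sketch: it assumes nominal parameter counts (4B = 4e9) and llama.cpp's block layouts for Q6_K and Q8_0. Real GGUF files come out a bit different because some tensors are stored at other precisions, and you still need headroom for context (KV cache). The function name is just something I made up for illustration.

```python
# Bits per weight derived from llama.cpp block layouts (assumed here):
# Q6_K packs 256 weights into 210 bytes; Q8_0 packs 32 weights into 34 bytes.
BITS_PER_WEIGHT = {
    "Q6_K": 210 * 8 / 256,  # 6.5625 bits per weight
    "Q8_0": 34 * 8 / 32,    # 8.5 bits per weight
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Rough weight size in GB (10^9 bytes); ignores metadata and KV cache."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        print(f"4B at {quant}: ~{estimate_size_gb(4e9, quant):.2f} GB")
```

By that estimate a nominal 4B model is roughly 3.3 GB of weights at Q6_K and about 4.25 GB at Q8_0, before context overhead.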





No good deed goes unpunished. The sense of self-entitlement some people display is staggering. FOSS project? Well, you should have done x, y or z.
Also, I gave you $3 via Ko-fi, so you need to provide customer support in perpetuity and come to my house and install it. And heaven forbid you try to recoup costs!
Projects don’t just die out - a lot of them are killed (one way or another). For example, I had a fully specced-out FPGA design that would capture the signal from the Wii's GPU and render at an upscaled internal resolution (think: what the Dolphin emulator does, but in actual hardware), not just post-process sharpening. Total cost: under $100 and some know-how.
The amount of flak I copped for it made me shut down the GitHub repo and work on it for myself. Once it’s perfected, I may post about it again, but I sure as shit am not compelled to deal with the fucking peanut gallery anymore.