If you have a lot of RAM, you can run small models slowly on the CPU. Your integrated graphics probably won't fit anything useful in its VRAM, so if you really want to run something locally, a few extra sticks of RAM are probably your cheapest option.
I have 64 GB and I run 8-14B models. 32B is pushing it (it's just really slow).
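If you want to try it, here's a minimal sketch using llama-cpp-python with a quantized GGUF model. The model path and thread count are just placeholders, swap in whatever quant you've actually downloaded (a Q4 quant of an 8B model only needs roughly 5 GB of RAM, so 64 GB is plenty):

```python
from llama_cpp import Llama

# Load a quantized GGUF model for CPU inference.
# Path and settings are examples, adjust for your setup.
llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",
    n_ctx=4096,    # context window; bigger = more RAM
    n_threads=8,   # match your physical core count
)

# Run a single completion and print the generated text.
output = llm("Q: What is the capital of France? A:", max_tokens=64)
print(output["choices"][0]["text"])
```

Tokens/sec on CPU scales with memory bandwidth more than core count, which is why the big models crawl even when they technically fit.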
Tbf, I don't use any of these corporate LLMs for exactly that reason. At best, they're using your interactions to "improve" the models, and more likely they're using them to profile and track you as well. Fuck that.