Has anyone set up an open-source LLM to run locally/on-prem? My company is concerned about data security with cloud-based AI services and foundation models, so I am looking into fine-tuning an open-source model and running it internally.
My plan is to start very small: fine-tune the model on our employee handbook and put it behind a Q&A chat interface. My hope is that, once we have buy-in, this eventually leads to adoption of third-party services for broader applications.
Specifically, I am curious about:
- Hardware requirements (NVIDIA vs. other vendors, amount of memory needed, etc.)
- Time it takes to fine-tune
- Inference time
- Integration with A360
I have heard differing opinions on whether this is a feasible/practical route and would love to hear what you all think.