
Hosting a Large Language Model (LLM) locally using C#.

Posted on 30 March, 2025

On Monday evening, March 24th, 2025, our community gathered at Queaso Services in Destelbergen for an inspiring GetTogether focused on a highly relevant and practical topic: hosting a Large Language Model (LLM) locally using C#.

The event brought together developers and architects eager to move beyond cloud-based AI solutions and explore how LLMs can be run and controlled locally within a .NET ecosystem.

Our keynote speaker, Xavier Spileers, delivered a clear and engaging session demonstrating how to download, host, and interact with an LLM using C# and the LLamaSharp library. Rather than staying at a theoretical level, Xavier walked us through the concrete steps involved, making the topic approachable even for those who are new to working with LLMs.
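For readers who want to experiment at home, here is a minimal sketch of the kind of setup demonstrated. The model path is illustrative (any locally downloaded GGUF model works), and exact LLamaSharp type and property names vary between library versions:

```csharp
using LLama;
using LLama.Common;

// Path to a locally downloaded GGUF model file (illustrative).
var modelPath = "models/llama-3-8b-instruct.Q4_K_M.gguf";

var parameters = new ModelParams(modelPath)
{
    ContextSize = 4096  // how many tokens of context the model can hold
};

// Load the weights once, then create an inference context and executor.
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

// Stream tokens to the console as the model generates them.
var prompt = "Explain what a vector database is in two sentences.";
await foreach (var token in executor.InferAsync(
    prompt, new InferenceParams { MaxTokens = 128 }))
{
    Console.Write(token);
}
```

Everything here runs on your own machine: once the model file is on disk, no network call or cloud service is involved.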

One of the key takeaways was that hosting an LLM locally is no longer reserved for large research teams or specialized environments. With the right model selection and an understanding of resource constraints such as memory and CPU/GPU usage, it is perfectly feasible to run powerful language models on local machines or internal servers.
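Those resource constraints map directly onto a handful of configuration knobs. As a sketch (property names follow LLamaSharp's `ModelParams` but may differ by version):

```csharp
using LLama.Common;

// Resource-related settings when loading a model (illustrative values).
var parameters = new ModelParams("models/llama-3-8b-instruct.Q4_K_M.gguf")
{
    ContextSize = 4096,    // larger contexts consume more memory
    GpuLayerCount = 20,    // layers offloaded to the GPU (0 = CPU only)
    Threads = 8            // CPU threads used during inference
};
```

Picking a quantized model (such as a Q4 GGUF variant) and tuning these values is usually what makes the difference between a model that fits on an ordinary workstation and one that does not.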

A crucial part of the session focused on prompt engineering. Xavier showed how the structure and clarity of prompts directly influence the quality of the model’s output. We learned that effective prompting is less about “asking more” and more about asking better: providing context, setting boundaries, and guiding the model toward the desired response.
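To make the contrast concrete, here is an illustrative example (our own, not taken from the talk) of the same question asked vaguely versus with context and boundaries:

```csharp
// A vague prompt leaves everything to the model.
var vaguePrompt = "Tell me about our return policy.";

// A structured prompt provides context, sets boundaries,
// and guides the model toward the desired response.
var structuredPrompt =
    """
    You are a support assistant for an e-commerce company.
    Answer in at most three sentences, using only the context below.
    If the context does not contain the answer, say "I don't know."

    Context:
    Products can be returned within 14 days with the original receipt.

    Question: Tell me about our return policy.
    """;
```

The second prompt reliably produces a short, grounded answer, while the first invites the model to improvise.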

This part of the talk resonated strongly with the audience, as it highlighted that working with LLMs is not just a technical challenge, but also a communication exercise between human and machine.

Another highlight of the evening was the explanation and demonstration of Retrieval Augmented Generation (RAG). Xavier showed how combining an LLM with a vector database allows responses to be grounded in specific documents or knowledge sources, dramatically improving accuracy and relevance.

We learned the fundamentals of:

  • Creating embeddings from documents
  • Storing them in a vector database
  • Querying those vectors efficiently
  • Feeding the retrieved context back into the LLM
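The steps above can be sketched end to end in C#. This is a simplified illustration: it uses an in-memory list with cosine similarity instead of a real vector database, the embedding API follows an older LLamaSharp style (newer versions return embeddings asynchronously), and the model path is hypothetical:

```csharp
using LLama;
using LLama.Common;
using System.Linq;

// Assumes an embedding-capable GGUF model downloaded locally (illustrative).
var parameters = new ModelParams("models/embedding-model.gguf") { EmbeddingMode = true };
using var weights = LLamaWeights.LoadFromFile(parameters);
var embedder = new LLamaEmbedder(weights, parameters);

// 1. Create embeddings from documents and store them
//    (a real setup would use a vector database such as Qdrant or pgvector).
var documents = new[]
{
    "Invoices are due within 30 days of the invoice date.",
    "Support is available on weekdays from 9:00 to 17:00."
};
var store = documents
    .Select(doc => (Text: doc, Vector: embedder.GetEmbeddings(doc)))
    .ToList();

// 2. Embed the question and retrieve the closest document.
var question = "When do invoices have to be paid?";
var queryVec = embedder.GetEmbeddings(question);
var best = store.OrderByDescending(d => CosineSimilarity(queryVec, d.Vector)).First();

// 3. Feed the retrieved context back into the LLM's prompt.
var prompt = $"Answer using only this context:\n{best.Text}\n\nQuestion: {question}";

static float CosineSimilarity(float[] a, float[] b)
{
    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
```

Because the answer is grounded in retrieved text rather than the model's general training data, this pattern is what makes the responses accurate and domain-aware.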

This approach makes it possible to build AI solutions that are not only powerful, but also trustworthy and domain-aware—a key requirement for real-world business applications.

After the presentation and an engaging Q&A session, the evening continued with informal peer discussions at the bar. Over drinks, attendees exchanged ideas, experiences, and future plans, reinforcing the value of community-driven knowledge sharing.

A big thank you to Xavier Spileers for sharing his expertise, real-world insights, and passion for software development. His ability to translate complex concepts into practical examples made this session both educational and inspiring.

We would also like to sincerely thank everyone who attended. Your presence, questions, and conversations are what make these GetTogethers truly valuable. We look forward to seeing you again at a future event—and to continuing the exploration of modern software development together.

 

Want to become part of this team and keep improving your skill set with us? Check our career openings by clicking here.

 
