Checking....what's the status for FOSS agentic AI models with skills?

iturnedintoanewt@lemmy.world · 1 day ago

Checking....what's the status for FOSS agentic AI models with skills?

hendrik@palaver.p3x.de · edit-2 1 day ago

We got open-source agents like OpenCode. OpenClaw is weird, and not really recommended by any sane person, but to my knowledge it’s open source as well. We got a silly(?) “clean-room rewrite” of the Claude Agent, after that leaked…

Regarding the models, I don’t think there’s any strictly speaking “FLOSS” models out there with modern tool-calling etc. You’d be looking at “open-weights” models, though. Where they release the weights under some permissive license. The training dataset and all the tuning remain a trade secret with pretty much all models. So there is no real FLOSS as in the 4 freedoms.

Google dropped a set of Gemma models a few days ago and they seem pretty good. You could have a look at Qwen 3.5, or GLM, DeepSeek… There’s a plethora of open-weights models out there. The newer ones pretty much all do tool-calling and can be used for agentic tasks.

iturnedintoanewt@lemmy.world · 7 hours ago

Thanks! I have an understanding of being able to run these models as LLM you can chat with, using tools like ollama or GPT4All. My question would be, how do I go from that to actually do things for me, handle files, etc. As it stands, if I run any of these locally, it’s just able to answer offline questions, and that’s about it…how about these “skills”, where it can go fetch files, or go find an specific URL, or say a summary of what a youtube video is about based on what’s being said in it?

hendrik@palaver.p3x.de · edit-2 5 hours ago

I think you need some Agent software. Or a MCP server for your existing software. It depends a bit on what you’re doing, whether that’s just chatting and asking questions that need to be googled. Or vibe coding… Or query the documents on your computer. As I said there’s OpenClaw which can do pretty much everything including wreck your computer. I’m also aware of OpenCode, AutoGPT, Aider, Tabby, CrewAI, …

The Ollama projects has some software linked on their page: https://github.com/ollama/ollama?tab=readme-ov-file#chat-interfaces
They’re sorted by use-case. And whether they’re desktop software or a webinterface. Maybe that’s a good starting point.

What you’d usually do is install it and connect it to your model / inference software via that software’s OpenAI-compatible API endpoint. But it frequently ends up being a chore. If you use some paid service (ChatGPT), they’ll contract with Google to do the search for you, Youtube, etc. And once you do it yourself, you’re gonna need all sorts of developer accounts and API tokens, to automatically access Google’s search API… You might get blocked from YouTube if you host your software on a VPS in a datacenter… That’s kinda how the internet is these days. All the big companies like Google and their competitors require access tokens or there won’t be any search results. At least that was my experience.

TheCornCollector@piefed.zip · 8 hours ago

AllenAI has released open source models with open training data, code and science. If you value the ‘source’ to actually be open. They’ve also published the multimodal Molmo models.

hendrik@palaver.p3x.de · 7 hours ago

Thanks! I didn’t know about these. I was just aware of Apertus from the Swiss National AI Iniative. But from my experience, they weren’t great. Might look into Olmo 3, then.