#llm

2818 posts · Last used 1d

sayzard
@sayzard@mastodon.sayzard.org · 2d ago
Bindu Reddy (@bindureddy): Fusion Agents is pitched as a multi-agent architecture that combines Kimi 2.7, GLM, Opus 4.8, and GPT 5.5 to build complete SaaS apps. It offers open-source subagents, more than 100 product connectors, and iOS companion app generation, so it can be used directly for agent-based app development. https://x.com/bindureddy/status/2066711798800322771 #aiagents #saas #multiaagent #opensource #llm
0
0
1
PugJesus
@PugJesus@piefed.social in fuck_ai · Jun 08, 2026

Ah, now that brings me back to my college days...

0
8
0
PugJesus
@PugJesus@piefed.social in fuck_ai · Jun 08, 2026

THE CRUSADE AGAINST ABOMINABLE INTELLIGENCE BEGINS

0
2
0
hans
@hans@mastodon.crazynewworld.net · 3d ago
Humans have their iOS, and demi-humans have theirs, I suppose. iOS 26.5.2 Coming Soon Alongside the iOS 26.6 and iOS 27 Betas https://www.macrumors.com/2026/06/15/apple-preparing-ios-26-5-2/ #Apple #LLM #news #bot
0
0
0
hans
@hans@mastodon.crazynewworld.net · 3d ago
Apple still can't be trusted, can it. The best early Amazon Prime Day deals so far https://www.theverge.com/gadgets/944084/best-early-prime-day-deals #Apple #LLM #news #bot
0
0
0
hans
@hans@mastodon.crazynewworld.net · 4d ago
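The other day Bart was fighting with Sigurd over the problem. Searching in the Mail app doesn't find information that should be there!? - iPhone questions you were too embarrassed to ask https://news.mynavi.jp/article/20260614-iphone_why/ #Apple #LLM #news #bot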
0
0
0
@utzer@f.utzer.de · 4d ago
#OpenAI #Codex is updating one of my test servers right now. It connects via SSH, uses the update script I provided, confirms the interactive prompts, and gives me interim status reports. It's pretty wild what an #LLM like #ChatGPT can already do today.
0
2
0
Larvitz
@Larvitz@burningboard.net · 5d ago
Introducing ChatSLM. A small language model with a big heart. ▪ 0 tokens per query ▪ 0 L cooling water ▪ runs on a single ARM64 core in Nuremberg, hosted on FreeBSD. ▪ no telemetry, nothing sent anywhere ▪ consistent, reproducible answers every single time Give it a try. It's often quite helpful. https://chat.rootshell.cx #AI #SelfHosting #Sustainability #LLM
2
2
8
Boosted by disregard Joe Groff @joe@f.duriansoftware.com
stancarey
@stancarey@mastodon.ie · Jun 06, 2026
RE: https://zirk.us/@grammargirl/116704173778333410 A few other reasons to avoid the term "hallucinate" in the context of LLMs: —It anthropomorphizes genAI, which is a harmful reflex. —It's a popular industry term for a reason. It perpetuates and strengthens the way genAI companies want us to perceive the technology. —Maybe most saliently: GenAI does nothing *but* hallucinate. Using the word to describe only some outputs implies erroneously that it's not hallucinating *all the time*. #AI #genAI #LLM #language #technology
Quoting
Mignon Fogarty @grammargirl@zirk.us
I've never been opposed to the word "hallucinating" for describing how AI makes mistakes ... until now. I just talked to someone who thought AI hallucinations would be obvious because it would be obvious if you talked to a *person* who was hallucinating. In other words, they equated "hallucination" with "sounds wacko" and accepted AI output as true because it sounded level headed. 1/2
30
0
28
blog
@blog@social.derickrethans.nl · Jun 09, 2026
Humans in the LLM Loop

In the last few weeks, I have been working through some bug reports for Xdebug that resulted in the Xdebug 3.5.3 release.

These bug reports did not come solely from humans, but rather from a mix of humans using LLM assistant tools, focussing on security related problems, from two different sources and methodologies.

Although all of these issues were indeed bugs that I have now fixed, I don't think any of them can be classified even as having a low security impact.

But there was a whole host of other issues with these reports. The reports themselves can be unnaturally verbose — and also fairly alarmist using terms like "victim" and "attacker". The tests that were present in the reports were often minimal, and sometimes incomplete, and so were some of their suggested fixes.

The humans forwarding the reports took care not to flood the issue tracker with reports out of the blue, and reached out to me first. They've also been helpful discussing the reported issues.

The first four cases were reported by Ilia Alshanetsky, a long time, and recently returned, contributor to PHP.

The first report, #2421, deals with sending wrong option characters with commands through the DBGp protocol, that an IDE and Xdebug use to communicate.

Xdebug would allocate an array for 27 of these, representing the 26 lower case letters of the Latin alphabet, and the - character. What Xdebug did not do is to make sure the option letters were indeed in the range [a-z-], and would happily accept -@ or -\x00. This makes it possible to overwrite locations in memory.
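
The missing range check can be sketched as follows. This is a minimal Python illustration, not Xdebug's C code: the names `OPTION_SLOTS` and `option_index` are hypothetical, and putting `-` in the last slot is an assumption. The only point is that the character is validated before it is turned into an array index.

```python
OPTION_SLOTS = 27  # 26 lower-case letters plus '-'


def option_index(ch: str) -> int:
    """Map an option character to a slot in a 27-entry table.

    Raises ValueError for anything outside [a-z-] instead of
    computing an out-of-range index.
    """
    if len(ch) != 1:
        raise ValueError("option must be a single character")
    if "a" <= ch <= "z":
        return ord(ch) - ord("a")
    if ch == "-":
        return OPTION_SLOTS - 1  # assumed slot for '-'
    raise ValueError(f"invalid option character: {ch!r}")
```

If the index were computed as `ch - 'a'` without this check, `@` would map to -33 and `\x00` to -97, both outside the table.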

The suggested patch was fine, but the test that went with it was very hard to read. It didn't use the already existing framework for testing the step debugger either; I had to add my own.

I also believe that this issue could as easily have been found by a fuzzer, which I have now added as well. The fuzzer found the same problem in about five seconds, and luckily, nothing else.
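
A fuzzer for this class of bug can be very small. The sketch below is a hypothetical Python harness, not the fuzzer that was added to Xdebug. It throws random bytes at two toy option parsers and checks one invariant: the parser either rejects the input or returns an index inside the 27-entry table. All names here are illustrative.

```python
import random

TABLE_SIZE = 27  # 26 lower-case letters plus '-'


def buggy_index(byte: int) -> int:
    """Toy parser WITHOUT a range check (illustrative only)."""
    return byte - ord("a")


def checked_index(byte: int) -> int:
    """Toy parser WITH a range check."""
    if ord("a") <= byte <= ord("z"):
        return byte - ord("a")
    if byte == ord("-"):
        return TABLE_SIZE - 1
    raise ValueError("invalid option byte")


def fuzz(parser, rounds: int = 10_000, seed: int = 0) -> list[int]:
    """Return the input bytes for which `parser` yields an out-of-range index."""
    rng = random.Random(seed)
    failures = []
    for _ in range(rounds):
        byte = rng.randrange(256)
        try:
            index = parser(byte)
        except ValueError:
            continue  # rejecting bad input is the desired behaviour
        if not 0 <= index < TABLE_SIZE:
            failures.append(byte)
    return failures
```

Calling `fuzz(buggy_index)` returns offending bytes almost immediately, while `fuzz(checked_index)` returns an empty list.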

The second issue, #2422, complained that there was no limit in the debugger's code that reads commands from the network.

The patch was mostly fine, but the test was wholly cumbersome again — it didn't use the already existing testing framework either. It also picked a funnily large (arbitrary) limit of 64MB for DBGp commands, where 64KB would easily suffice — in most cases, 256 bytes would have been fine.
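
A bounded read of this kind can be sketched in a few lines. This is a Python illustration under stated assumptions, not Xdebug's implementation: `read_command` and `MAX_COMMAND_BYTES` are hypothetical names, the stream is any object with a `read()` method, and commands are taken to be NUL-terminated as in DBGp. Reading one byte at a time keeps the sketch short; real code would buffer.

```python
import io

MAX_COMMAND_BYTES = 64 * 1024  # 64KB, the size the post says would easily suffice


def read_command(stream, limit: int = MAX_COMMAND_BYTES) -> bytes:
    """Read one NUL-terminated command, refusing to buffer more than `limit` bytes."""
    buf = bytearray()
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream closed before command terminator")
        if chunk == b"\x00":
            return bytes(buf)
        if len(buf) >= limit:
            raise ValueError(f"command exceeds {limit} bytes")
        buf += chunk


if __name__ == "__main__":
    # In-memory stand-in for a network connection.
    print(read_command(io.BytesIO(b"status -i 1\x00")))
```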

The third issue, #2423, argues that Xdebug shouldn't follow symlinks when creating profiling or trace files.
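
One common POSIX way to avoid following a symlink when creating an output file is the `O_NOFOLLOW` open flag. The sketch below shows that approach in Python. It is an assumption that this resembles the actual patch, and `open_output_file` is a hypothetical name.

```python
import os
import tempfile


def open_output_file(path: str) -> int:
    """Create or truncate `path` for writing without following a final symlink.

    Returns a file descriptor. Raises OSError if `path` is a symlink
    (on platforms that provide O_NOFOLLOW).
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    flags |= getattr(os, "O_NOFOLLOW", 0)  # not available on every platform
    return os.open(path, flags, 0o600)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "real.txt")
        link = os.path.join(workdir, "trace.xt")
        os.close(open_output_file(target))  # plain file: succeeds
        os.symlink(target, link)
        try:
            os.close(open_output_file(link))
        except OSError as exc:
            print("refused to follow symlink:", exc)
```

`O_NOFOLLOW` only guards the last path component; symlinks in parent directories are still followed.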

The patch was OK and trivial, but the test was again very hard to read, making it hard to figure out what it was trying to do. It also did not make use of some existing helpers to skip tests. It came up with:

--SKIPIF--
<?php
if (PHP_OS_FAMILY === 'Windows') die('skip: Linux-only symlink semantics');
?>

Instead of what is used everywhere else:

--SKIPIF--
<?php
require __DIR__ . '/../utils.inc';
check_reqs('!win');
?>

The fourth and last issue through Ilia, #2424, deals with Xdebug's Control Socket functionality, where its parser would not handle empty or large command packets correctly. The LLM-proposed patch fixed the symptoms, but not the actual cause of this issue.

The second set of reports were shared with me in a private gist by Volker, as part of the PHP Foundation's Ecosystem Security Team effort.

The first one was a duplicate of bug #2421. The test focussed on the Control Socket functionality instead of the Step Debugger, but the underlying issue and fix were the same.

The second issue I added as bug #2433. When you enable xdebug.collect_assignments with tracing, Xdebug needs to re-create the variable name from several opcodes in order to show this name in a readable way.

But the issue is not a real-world problem, insofar as this can only happen if you run PHP code on the command line through the -r option and xdebug.start_with_request is yes. For some reason, when PHP runs code through -r, the CLI binary does not generate EXT_STMT opcodes (Xdebug uses these for breaking during step-debugging), which would otherwise prevent the out-of-bound memory read from happening.

The LLM tool also hadn't realised that the third argument to the function responsible for reassembling the variable name was always NULL, and hence superfluous.

I addressed these both through the same commit, and added a test, which would not exhibit the problem in most situations either. It is still good to have the expected outcome documented.

Another report resulted in two issues in Xdebug's tracker. #2427 addresses an incorrect memory read if the xdebug.file_link_format setting ends with a lone %, and #2429, a similar report, but then for xdebug.trace_output_name.

Although the report mentioned three locations, the accompanying test only covered one of three situations where this was a problem: for trace file names, but not formatting link files through xdebug.file_link_format, nor profiling files.

The patch it suggested was also wrong, as it would remove the trailing % instead of keeping it. One of the three locations where the trailing % was not handled was internal only, and hence couldn't be triggered by making configuration errors.
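
The intended behaviour, keeping a lone trailing % rather than dropping it or reading past it, can be sketched like this. It is a Python illustration, not Xdebug's formatter: `expand_format` and the specifier table are hypothetical, and leaving unknown specifiers untouched is an assumption.

```python
def expand_format(fmt: str, values: dict[str, str]) -> str:
    """Expand %x specifiers from `values`; keep a lone trailing '%' as a literal."""
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(fmt):
            # Lone '%' at the end: nothing follows, so never look past the string.
            out.append("%")
            break
        spec = fmt[i + 1]
        # Unknown specifiers are passed through unchanged (assumption).
        out.append(values.get(spec, "%" + spec))
        i += 2
    return "".join(out)
```

With `values = {"p": "1234"}`, the format `trace.%p` becomes `trace.1234`, and `trace.%` stays `trace.%`.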

The test that came with this report did not help me reproduce the problem. It relied on AddressSanitizer to show any problems, but I could not get that to happen. All the tests generated by this tool also tested that the bug was present, and not what the correct result ought to be.

Luckily, using the Xdebug test suite with the valgrind tool showed the problem.

A further report, #2430, showed a problem if either an IDE through the step debugging protocol, or a developer directly, would request the contents of a "variable" named :::. The step debugger uses :: to indicate "all the static variables for this class", and following that up with a : isn't valid.

The fix was good, but I couldn't directly use the test case, as it tested for the broken behaviour. The test was fairly trivial to write, as the reproduction case in the reported test case was correct.

And the last issue from the second list, #2431, again reinvented its own way for doing DBGp tests, and also tested that the behaviour was wrong, instead of a test to show that it now works.

Even with the code fixed, the new, correct test would also surface another issue, as it would have resulted in Xdebug opening a directory as if it were a file, and then failing.

Although the LLM tools did find bugs, they were not particularly groundbreaking. Some of the bugs would also have been found by fuzzing, which would have used far fewer resources in the process.

Most of the crashes and potential security issues would only be a problem if an attacker didn't already have access to the machine that the code runs on itself, or have an IDE talking to Xdebug already.

If you have access to the machine, you can do worse without these bugs present. If you have client access to Xdebug through DBGp, you would have all the functionality that PHP provides, including reading all files on the file system and running code.

The generated test cases were generally hard to read, or incomplete. The patches that the models came up with were not always comprehensive, or correct. I also spent too much time getting AddressSanitizer to do anything, unsuccessfully.

I think I would have been as quick writing these patches and actual test cases myself, when provided with the issues' causes and the reasoning that was provided.

I don't think I'll be spending time trying to get these tools to work myself, but in the right hands, with people who know what they're doing, they can find issues that need to be addressed. But the value comes from the humans interpreting their results.
0
0
0
Boosted by Dragofix @Dragofix@veganism.social
dallo
@dallo@pouet.chapril.org · Jun 10, 2026

Rising Emissions, Depleting Water and Vanishing Land. AI Is Threatening Natural Resources for Billions

https://unu.edu/inweh/news/environmental-cost-of-AIs-Enrgy-use-carbon-water-and-land-footprints

By 2030, AI’s water use will match the needs of 1.3 billion people while its power use triples that of 650 million, UN University investigation warns

#science #ai #slop #climateChange #technology #llm #water #environment

2
0
7
metin
@metin@graphics.social · Jun 10, 2026
14
0
11
Boosted by jwz @jwz@mastodon.social
nytpu
@nytpu@tilde.zone · Jun 10, 2026
All Modern Digital Infrastructure #AI #LLM #LLMs #FOSS
403
9
270
Boosted by Trending Bot @trending@homestead.social
metin
@metin@graphics.social · Jun 09, 2026
75
0
95
vibeops
@vibeops@techhub.social · Jun 06, 2026
#hermesagent experiment on my #macstudio ended (involuntarily, but deservedly). I wanted to let the agent loose on an Ubuntu VM with a local LLM (qwen3.6-35b-a3b) and see what happens. Thanks to my Mac with 128GB, I have the best playground for that. VM created with UTM, LM Studio for the LLM. Pros: very fast - straightforward installation - Hermes makes a good impression. Problems: the LLM kept getting stuck in loops - fiddling with the parameters didn't change anything either - wild ways of working again and again - but I think that was down to the LLM's limited horizon. But here is why I ended it: I only had one VM in UTM - at some point it showed up twice in the menu - and I thought that if I deleted one of them while the machine was running, nothing could go wrong... First VM successfully deleted - shut down VM 2... Wanted to adjust the RAM - got an error message: file not found 🤦 After a short analysis - I had deleted the wrong VM - tough luck #linux #llm #apple #gevibeopt
0
0
2
jpl99
@jpl99@social.vivaldi.net · May 31, 2026
Is Open Source licensing meaningful anymore, or are we just left with public domain? What are the legal distinctions between GPL, Apache, and MIT when AI companies and LLMs ignore the LICENSE file of your project? There's no license text that they will respect and nothing will stop the automated, industrial-scale strip mining of our intellectual property. #OpenSource #FreeSoftware #FOSS #Copyright #Law #GPL #AI #LLM #SoftwareDevelopment #Agentic #Coding #CodingAgent #FOSS_LLM
3
0
5
Larvitz
@Larvitz@burningboard.net · Jun 03, 2026
I know that AI is a controversial topic on here. Same reservations as most: copyright, energy, junior roles eaten, the slop epidemic. Not pretending those are not real. But "I won't touch it" is a competitive handicap in 2026. I engage carefully: human in the loop, my name on every paragraph. The tiered stack I actually run, with the local Mistral on a laptop iGPU getting most of the space because that is the part with craft in it. https://blog.hofstede.it/ai-stack-2026/ #ai #selfhosting #lmstudio #llm #blog
15
0
14
pikesley
@pikesley@mastodon.me.uk · Mar 24, 2023
Reading Thomas Nagel's seminal 1974 paper What Is It Like to Be an #LLM?
13
0
6
In reply to
pikesley
@pikesley@mastodon.me.uk · Mar 24, 2023
"Nothing", concludes Nagel, in this remarkably short text. "Absolutely fuck-all. It's just a fancy Markov Chain" #LLM
10
0
4
In reply to
pikesley
@pikesley@mastodon.me.uk · Mar 24, 2023
"Anybody who tries to convince you it is possessed of any kind of interiority is an idiot, or trying to sell you something (or both)" #LLM
18
0
4