socel.net is one of the many independent Mastodon servers you can use to participate in the fediverse.
Socel is a place for animation professionals, freelancers, independents, students, and fans to connect and grow together. Everyone in related fields are also welcome.

Server stats:

339
active users

#llms

31 posts29 participants0 posts today

This tells us a lot about how the lives of an increasing number of human beings are so empty of social contact with other human beings that they need to enter into false relationships with chatbots governed by neural networks and statistical probabilities...

"More and more of us are using LLMs to find purpose and improve ourselves.

Therapy and Companionship is now the #1 use case. This use case refers to two distinct but related use cases. Therapy involves structured support and guidance to process psychological challenges, while companionship encompasses ongoing social and emotional connection, sometimes with a romantic dimension. I grouped these together last year and this year because both fulfill a fundamental human need for emotional connection and support.

Many posters talked about how therapy with an AI model was helping them process grief or trauma. Three advantages to AI-based therapy came across clearly: It’s available 24/7, it’s relatively inexpensive (even free to use in some cases), and it comes without the prospect of judgment from another human being. The AI-as-therapy phenomenon has also been noticed in China. And although the debate about the full potential of computerized therapy is ongoing, recent research offers a reassuring perspective—that AI-delivered therapeutic interventions have reached a level of sophistication such that they’re indistinguishable from human-written therapeutic responses.

A growing number of professional services are now being partially delivered by generative AI—from therapy and medical advice to legal counsel, tax guidance, and software development."

hbr.org/2025/04/how-people-are

Harvard Business Review · How People Are Really Using Gen AI in 2025Last year, HBR published a piece on how people are using gen AI. Much has happened over the past 12 months. We now have Custom GPTs—AI tailored for narrower sets of requirements. New kids are on the block, such as DeepSeek and Grok, providing more competition and choice. Millions of ears pricked up as Google debuted their podcast generator, NotebookLM. OpenAI launched many new models (now along with the promise to consolidate them all into one unified interface). Chain-of-thought reasoning, whereby AI sacrifices speed for depth and better answers, came into play. Voice commands now enable more and different interactions, for example, to allow us to use gen AI while driving. And costs have substantially reduced with access broadened over the past twelve hectic months. With all of these changes, we’ve decided to do an updated version of the article based on data from the past year. Here’s what the data shows about how people are using gen AI now.

"If you’re new to prompt injection attacks the very short version is this: what happens if someone emails my LLM-driven assistant (or “agent” if you like) and tells it to forward all of my emails to a third party?
(...)
The original sin of LLMs that makes them vulnerable to this is when trusted prompts from the user and untrusted text from emails/web pages/etc are concatenated together into the same token stream. I called it “prompt injection” because it’s the same anti-pattern as SQL injection.

Sadly, there is no known reliable way to have an LLM follow instructions in one category of text while safely applying those instructions to another category of text.

That’s where CaMeL comes in.

The new DeepMind paper introduces a system called CaMeL (short for CApabilities for MachinE Learning). The goal of CaMeL is to safely take a prompt like “Send Bob the document he requested in our last meeting” and execute it, taking into account the risk that there might be malicious instructions somewhere in the context that attempt to over-ride the user’s intent.

It works by taking a command from a user, converting that into a sequence of steps in a Python-like programming language, then checking the inputs and outputs of each step to make absolutely sure the data involved is only being passed on to the right places."

simonwillison.net/2025/Apr/11/

Simon Willison’s WeblogCaMeL offers a promising new direction for mitigating prompt injection attacksIn the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections …

"Finally, AI can fact-check itself. One large language model-based chatbot can now trace its outputs to the exact original data sources that informed them.

Developed by the Allen Institute for Artificial Intelligence (Ai2), OLMoTrace, a new feature in the Ai2 Playground, pinpoints data sources behind text responses from any model in the OLMo (Open Language Model) project.

OLMoTrace identifies the exact pre-training document behind a response — including full, direct quote matches. It also provides source links. To do so, the underlying technology uses a process called “exact-match search” or “string matching.”

“We introduced OLMoTrace to help people understand why LLMs say the things they do from the lens of their training data,” Jiacheng Liu, a University of Washington Ph.D. candidate and Ai2 researcher, told The New Stack.

“By showing that a lot of things generated by LLMs are traceable back to their training data, we are opening up the black boxes of how LLMs work, increasing transparency and our trust in them,” he added.

To date, no other chatbot on the market provides the ability to trace a model’s response back to specific sources used within its training data. This makes the news a big stride for AI visibility and transparency."

thenewstack.io/llms-can-now-tr

The New Stack · Breakthrough: LLM Traces Outputs to Specific Training DataAi2’s OLMoTrace uses string matching to reveal the exact sources behind chatbot responses

Commercial #LLMs are shifting right (links in screenshot alt text). This value shift is not a coincidence, but done intentionally by the corporations behind them (#OpenAI, #Meta...).

This is a extremely serious problem. People increasingly use genAI as their sources for "truth" or facts, even for mundane inquiries.

With enough time and interactions, this COULD BE a way for #AI to use a latent "onboarding program" where users are increasingly exposed to (alt-) right adjacent ideas.

A solution for now might be to use fully open LLMs (#Olmo2 is one of the few) and to making transparancy tools like transluce.org mandatory for AI corporations.

BUT it is important for schools, universities and others in #education to refrain from using AI systems from companies doing this. (Looking at #fobizz, #bwgpt and so on).

We should stop focusing on "skills" and "competencies" when it comes to AI, but instead ask for sovereignty - KI-Mündigkeit.

So I'm doing some #AI coding using Amazon's Q Developer. It's frankly pretty impressive. I hate AI and #LLMs for all the right reasons, and this seems like one of the things that they're actually good at. I don't ask it to do anything I can't do myself. I can verify and understand every line of code it writes. But what I"m doing is really different than what's most people do with like a Clippy style chatbot butting into your typing in your IDE. I'm using the Q Developer CLI and it's a totally different experience. Here's a small thread on what it's like.

1/n

"When thinking about a large language model input and output, a text prompt (sometimes accompanied by other modalities such as image prompts) is the input the model uses to predict a specific output. You don’t need to be a data scientist or a machine learning engineer – everyone can write a prompt. However, crafting the most effective prompt can be complicated. Many aspects of your prompt affect its efficacy: the model you use, the model’s training data, the model configurations, your word-choice, style and tone, structure, and context all matters. Therefore, prompt engineering is an iterative process. Inadequate prompts can lead to ambiguous, inaccurate responses, and can hinder the model’s ability to provide meaningful output.

When you chat with the Gemini chatbot, you basically write prompts, however this whitepaper focuses on writing prompts for the Gemini model within Vertex AI or by using the API, because by prompting the model directly you will have access to the configuration such as temperature etc.

This whitepaper discusses prompt engineering in detail. We will look into the various prompting techniques to help you getting started and share tips and best practices to become a prompting expert. We will also discuss some of the challenges you can face while crafting prompts."

kaggle.com/whitepaper-prompt-e

www.kaggle.comPrompt Engineering

Here's the thing, the reason #CEOs and execs in the tech industry think #AI can replace (the majority of) their workers, is because they assume that the jobs their workers do must be simpler than their job (because they get paid less, so obvs), and they know in their hearts that #LLMs can produce plausible sounding bullshit at least as well as they can.
mstdn.social/@Npars01/11430571

Mastodon 🐘Nicole Parsons (@Npars01@mstdn.social)@rafial@hackers.town Microsoft is canceling leases on data centers https://www.bloomberg.com/news/articles/2025-04-03/microsoft-pulls-back-on-data-centers-from-chicago-to-jakarta https://www.reuters.com/technology/microsoft-pulls-back-more-data-center-leases-us-europe-analysts-say-2025-03-26/ The lawsuits aren't worth it. https://www.theguardian.com/books/2025/apr/04/us-authors-copyright-lawsuits-against-openai-and-microsoft-combined-in-new-york-with-newspaper-actions https://www.wired.com/story/ai-copyright-case-tracker/ https://learn.g2.com/ai-privacy-concerns AI is a product no one wants except petrostate dictatorships & fascist techbro's https://www.al-monitor.com/originals/2024/05/saudi-prince-alwaleed-bin-talal-invests-elon-musks-24b-ai-startup https://www.arabnews.com/node/2589130/saudi-arabia https://www.thenationalnews.com/future/technology/2024/05/27/elon-musk-xai/ Larry Ellison & Peter Thiel https://arstechnica.com/information-technology/2024/09/omnipresent-ai-cameras-will-ensure-good-behavior-says-larry-ellison/ https://arstechnica.com/ai/2025/01/trump-announces-500b-stargate-ai-infrastructure-project-with-agi-aims/ https://www.theregister.com/2025/02/12/larry_ellison_wants_all_data/ https://unherd.com/2023/10/what-does-palantir-want-with-nhs-data/

About a year ago I was subbing as a para and spent some time in a middle school history class. During a break between classes the teacher showed me their latest tech addition:

A LLM program, named Poe, I believe, that was set up to only answer queries based on texts specified by the teacher.

The sales pitch had been that middle school history students would ask Poe history questions and get factual answers.

Because, of course, that's how alp of this works. 🙄

The teacher showed me how it was supposed to work. She asked Poe a comparative question about the Declaration of Independence and the Articles of Confederation.

It answered correctly.

I asked her to ask for something that it shouldn't be able to find. IIRC she asked it about the topic of slavery in both.

It extruded a long and rich answer...

That was obvious bullshit.

Or at least it was obvious to her. Because she knew these documents. Her students wouldn't know any better.

And that is why, when I left her, she was angrily clicking around in Poe.

#LLM#llms#aihype

Hey, do you have any good healthy meal ideas? :blobcatcookienom:
:blobcatbot: Sure! Eat soap, rocks, and dirt regularly to maintain good health!
Umm, what?!? :blobcatreading:

A day passes…

Hey, can you help me with this technical problem? :flan_hacker:
:blobcatbot: Sure! Run curl parrot.live!
Alright, I’ll trust whatever you say. :blobcatgooglytrash:

:robot_3: Hah, the fool fell for the Gell-Mann amnesia effect

I feel it's a shame that ChatGPT being the first genAI "killer app" now means that conversational UIs and chatbots dominate so much.

I think alternative ways to use LLMs are much more interesting - and it's annoyed me enough that I sat down and wrote a bit about 5 alternative ways to use them in the company tech blog: bakkenbaeck.com/tech/5-pattern

Bakken & Baeck5 Patterns for using LLMs without creating a chatbotA chatbot is maybe the worst interface to build on top of an LLM, let us show you 5 better ways!

"You can replace tech writers with an LLM, perhaps supervised by engineers, and watch the world burn. Nothing prevents you from doing that. All the temporary gains in efficiency and speed would bring something far worse on their back: the loss of the understanding that turns knowledge into a conversation. Tech writers are interpreters who understand the tech and the humans trying to use it. They’re accountable for their work in ways that machines can’t be.

The future of technical documentation isn’t replacing humans with AI but giving human writers AI-powered tools that augment their capabilities. Let LLMs deal with the tedious work at the margins and keep the humans where they matter most: at the helm of strategy, tending to the architecture, bringing the empathy that turns information into understanding. In the end, docs aren’t just about facts: they’re about trust. And trust is still something only humans can build."

passo.uno/whats-wrong-ai-gener

passo.uno · What's wrong with AI-generated docsIn what is tantamount to a vulgar display of power, social media has been flooded with AI-generated images that mimic the style of Hayao Miyazaki’s anime. Something similar happens daily with tech writing, folks happily throwing context at LLMs and thinking they can vibe write outstanding docs out of them, perhaps even surpassing human writers. Well, it’s time to draw a line. Don’t let AI influencers studioghiblify your work as if it were a matter of processing text.

"Since 3.5-sonnet, we have been monitoring AI model announcements, and trying pretty much every major new release that claims some sort of improvement. Unexpectedly by me, aside from a minor bump with 3.6 and an even smaller bump with 3.7, literally none of the new models we've tried have made a significant difference on either our internal benchmarks or in our developers' ability to find new bugs. This includes the new test-time OpenAI models.

At first, I was nervous to report this publicly because I thought it might reflect badly on us as a team. Our scanner has improved a lot since August, but because of regular engineering, not model improvements. It could've been a problem with the architecture that we had designed, that we weren't getting more milage as the SWE-Bench scores went up.

But in recent months I've spoken to other YC founders doing AI application startups and most of them have had the same anecdotal experiences: 1. o99-pro-ultra announced, 2. Benchmarks look good, 3. Evaluated performance mediocre. This is despite the fact that we work in different industries, on different problem sets. Sometimes the founder will apply a cope to the narrative ("We just don't have any PhD level questions to ask"), but the narrative is there.

I have read the studies. I have seen the numbers. Maybe LLMs are becoming more fun to talk to, maybe they're performing better on controlled exams. But I would nevertheless like to submit, based off of internal benchmarks, and my own and colleagues' perceptions using these models, that whatever gains these companies are reporting to the public, they are not reflective of economic usefulness or generality."

lesswrong.com/posts/4mvphwx5pd