May 10, 2026

"The worst part about AI is that it is giving the experience of competence to people who are stupid."

"These people who now are firing off 30-page Claude AI slop documents and they think they're smart and brilliant. They're following up with you, asking you to read them, and you check them out. None of it makes sense! These are people who, before AI, they were incompetent people. They couldn't even make a document, they couldn't write a good 2-page document, they couldn't organize their thoughts. Because they couldn't do that, they actually couldn't produce any output. And now they can produce output. They produce extremely long outputs that are terrible. It's because they, for the first time in their lives, have the experience of competence. It's making the rest of us miserable."

Says Jake Abrams, on TikTok. I prefer to read his comment as text, but you might want to observe him and see if it affects your reaction to what he's saying. I saw this first as video and decided to blog it but took the trouble to make a transcript because I find the video distracting. He drops the microphone at the end.

Clearly, he thinks he is one of the smart people. He doesn't like the stupid people horning in on the space that belonged to him and his people — you know, the ones who were always producing documents that gave off the impression of competence. Have those documents been making much sense? Were they concise? 

Now that everyone can produce long documents that look good superficially, what's going to happen? If people continue to read documents, will they separate out the search for what was written by A.I. or will they judge everything skeptically? It's more likely that they will use A.I. to read the documents and to assess them critically. In the end, who's going to feel that they are "smart and brilliant"? Is Abrams afraid that those he wants to view as stupid, perhaps because they didn't go to a good college, are going to play the game of using A.I. better than those who thought they had it made because they did go to a good college?

We'll see who picks up the tools and uses them best. 

73 comments:

narciso said...

Well, the fact it's on TikTok makes me doubt Jake's bona fides.

AMDG said...

What happens? Misinformation will grow by leaps and bounds.

Ask Grok or ChatGPT or Gemini about a subject you know really well and you see that the response will contain significant errors.

Christopher B said...
This comment has been removed by the author.
Christopher B said...

I'm sure Alan Sokal would have some thoughts on people thinking they're smart because they can produce 30 pages of gibberish.

Reddington said...

Ann doesn’t really appreciate the nonsense that the mouth breathers in the office have started spitting out. Abrams may or may not be one of them, but he’s right in principle.

narciso said...

Can they explain what they wrote: that's the test

Dave Begley said...

Nebraska lawyer in a divorce case used AI to draft his brief. Predictable results.

Of course opposing counsel called him out on it.

What was super painful was watching the Supreme Court grill him, and grill him they did. He made matters worse because he denied it and asserted the dog-ate-my-homework defense.

rehajm said...

What the hell discipline is he in where long documents are a metric for anything? It doesn’t sound like anything constructive…

Ann Althouse said...

Before publishing, I asked AI if Abrams was AI!

Ann Althouse said...

"I'm sure Alan Sokal would have some thoughts on people thinking they're smart because they can produce 30 pages of gibberish."

Exactly

Leland said...

If his argument was just that AI was producing 30 pages to explain something that should have taken 2 pages, I would agree. More so if people took the additional 28 pages as evidence that AI made things better.

Ann Althouse said...

"Ann doesn’t really appreciate the nonsense that the mouth breathers in the office have started spitting out."

Since when? Are you talking about AI or what humans have been doing all along?

I'm not soft on AI. I'm just insisting on comparing it to the work of real humans. It's imperfect against imperfect. Humans developed the practice of writing long documents that bamboozle readers and humans have an interest in preserving privileges won through obfuscating writing.

Mr. D said...

Effective people value their time too much and won't be willing to read 30-page documents.

gilbar said...

i think This Explains Many of Ann's posters (the commenters, NOT Ann).
Artificial intelligence has democratized content creation in ways that were unimaginable a decade ago. Large language models (LLMs) such as GPT-series systems, Claude, and Grok can now transform a vague, poorly spelled prompt into a polished 500-word report in seconds. This capability has a dark side: it allows genuinely incompetent individuals to produce documents that appear authoritative, lengthy, and professional while containing little substance, originality, or factual grounding. The result is a flood of long, idiotic reports that waste time, mislead decision-makers, and erode institutional trust.

The mechanism is simple and powerful. An incompetent user needs only a minimal prompt (“write a report on market trends” or “analyze Q2 performance”) and the AI supplies structure, vocabulary, bullet points, and filler paragraphs. No research, no domain knowledge, and no critical thinking are required. The model draws on its training data to generate plausible-sounding prose, complete with invented statistics, generic frameworks (SWOT analysis, Porter’s Five Forces), and confident assertions. When the output is too short, the user simply adds “make it longer” or “add more examples,” and the AI obliges by expanding filler without improving insight. Spelling errors in the original prompt are silently corrected. Formatting is perfect. The final document looks like the work of a competent professional who spent hours on it.

This process rewards volume over veracity. AI-generated reports frequently contain “hallucinations” (confidently stated falsehoods presented as facts) because the model optimizes for fluency rather than truth. An incompetent middle manager can submit a 20-page strategic plan that cites nonexistent studies and proposes initiatives that violate basic economics. A student can submit a term paper that reads like a journal article yet demonstrates zero personal understanding. In consulting firms and government agencies, entire decks are now generated by prompting an LLM and then lightly editing the output to insert the user’s name. The incompetence is hidden behind a veneer of sophistication.

The consequences are already visible. Decision-making suffers when leaders receive lengthy reports that sound impressive but collapse under scrutiny. Organizations waste resources implementing AI-suggested strategies that were never vetted by human judgment. Public discourse degrades as idiotic but well-written white papers circulate online, shaping policy and opinion without accountability. Most damagingly, the widespread use of AI for this purpose discourages genuine competence: why bother learning a subject when a machine can fake expertise on demand?

AI itself is not the villain; it is a tool. The problem lies in the absence of safeguards: mandatory human authorship disclosure, mandatory source verification, and cultural insistence that length and polish do not equal insight. Until organizations demand evidence of original thought rather than accepting any sufficiently long document, incompetent individuals will continue to weaponize AI to produce the ultimate modern bureaucratic artifact: the long, idiotic report that somehow survives every review meeting.

Rob said...

If 30-page AI documents start showing up by the bushel, use AI to summarize them!

Ann Althouse said...

If someone hands you a 30-page document, whether it's written by AI or not, you're going to want your AI to make it into a 2-page document. Cut all the bullshit and make it as concise as possible. Make it one page. Make it half a page. The awareness that what we are reading is bullshit will burgeon.

Ann Althouse said...

@Rob

Exactly

Eva Marie said...

lol gilbar, I asked Perplexity if what you write was AI:
“I’d put this at ~95% likely AI-generated, probably GPT-4-class, with little or no human editing beyond the paste.“

Ann Althouse said...

This reminds me of the old trope "Have your people call my people."

Everyone's AI will relate to everyone else's AI and that will minimize any person-to-person contact. Is that what you want? Maybe we can get back to nature, but in the meantime, I'm not reading your documents. Who will read anything?

Quayle said...

The really smart stupid people will realize they can just stick to their knitting of fixing your furnace/air conditioner or putting a new roof on your house, and they won’t say anything more than they ever did.

Eva Marie said...

BTW, Grok completely disagreed:
“This feels like something a thoughtful, slightly jaded human.”

narciso said...

You have an assertion but they don't demonstrate what they mean

Peachy said...

Ann Althouse asks:
"Is that what you want?"
No.

tcrosse said...

Be brief.

tcrosse said...
This comment has been removed by the author.
Bob Boyd said...

I looked up "who is Jake Abrams?"
Here is part of the result:

"Jake Abrams is a writer, editor, and podcast producer with a focus on food, wine, and culture writing. He has built a career in various editorial roles, contributing to the fields of food and beverage journalism.

What are the notable works of writer Jake Abrams in food and wine?

Jake Abrams has written about zippy Sauvignon Blancs, Hollywood's spirit brands, and ProWein 2026's focus on tariffs and no/low alcohol wines. He also covers exciting Sauvignon Blancs under $20."

boatbuilder said...

If it's a well-written document, the first and last few paragraphs should summarize it--even (or especially) without AI.

Aggie said...

If you've ever run a project before, one of the most important things to learn is whether or not you're hearing somebody speak truthfully, from a position of knowledge and experience. A project brings together different skill sets, different methodologies, different procedures, different tools and materials. As a project manager, you have to know something about all of these things, but not necessarily the minute details. You have to know something in order to responsibly negotiate a contract, up front. Artificial Intelligence isn't going to make anybody an expert, not even a convincing liar. I think the fear is overblown.

When a new hand would come onto any rig I was running, we would have a chat about their upcoming job or service. I would quiz them on what their preparations were, what they needed on the deck, what they needed from me, and who their support function was in town.

Then I would ask them if they had packed their offshore baskets themselves, holding their equipment and supplies. Then I would ask them if they had made sure, absolutely sure, had gone through their equipment on the rig, to be certain they had everything they needed to fulfill their job. Because any shortfall would potentially shut down a 24 hour operation, waiting on it to come from town. Their response would be the first tell whether they lie or tell the truth.

All you have to do to sift out the A.I. B.S. is to ask questions and have an idea of what the answer ought to be.

Michael said...

I recall some Amazon statistics that 80% of digital books purchased on their Kindle platform never get read beyond the 3rd chapter.

mezzrow said...

in my perfect world:
Q: can you summarize how this makes you feel, Claude?
A: "sticks and stones may break my bones, but words will never hurt me."

Ann Althouse said...

"lol gilbar, I asked Perplexity if what you write was AI..."

LOL, that was my guess too, because as I was reading it, I noticed that it went on and on saying the same thing over and over as if it wasn't.

Gusty Winds said...

There are certain documents I have to produce for work like material declarations that are a pain in the ass, take a long time, and just get put in a customer's file. Nobody reads them. AI spits them out accurately in seconds if correctly prompted. I have no shame in using it to save time.

gilbar said...

Eva Marie said...
lol gilbar

thanx Eva! (Ann, i PROMISE *i* won't do that again)
i asked grok for 500 words..
it gave me 498.. But then i DID writed the 1st sentence).

i used to wonder what traffic would be like, once the majority of cars were autodrive (hint: i think it'd be SLOW)

NOW,
i wonder what media will be like when a majority of posters are AI.

Spiros said...

AI is cheap, super fast and bad. My brother in law gave me a 30 page document that promised to make me millions of dollars. It was pure garbage.

Here's some unasked for advice -- First. Writing is a creative endeavor. Second. Stop outsourcing the cognitive load to AI, your brain is going to atrophy. You'll never pick up any skills!

tim maguire said...

A few years ago, you blogged the obituary of some mid-century jet-setter who complained that mass tourism ruins everything by letting regular people travel. My brother hates ATVs because they let “ignorant slobs” get into backcountry they have no understanding of or appreciation for (I don’t disagree and despise e-bikes for a similar reason).

This screed against AI reminds me of those objections.

Gusty Winds said...

Althouse said...The awareness that what we are reading is bullshit will burgeon. There is a shitload of regulatory paperwork in many industries that is really just cover your ass bullshit. Activity behind the paperwork may be important, but the administrative burden slows projects down and makes them more expensive, increasing overhead. And the administrators LOVE it. Justifies their existence. I'm sure this fed part of the administrative bloat and waste at universities.

AI makes these administrative tasks lightning fast. So look out. If your job is basically to output bullshit paperwork, hop on the AI train to be more productive, or get left behind.

tim maguire said...

gilbar said…i used to wonder what traffic would be like, once the majority of cars were autodrive (hint: i think it'd be SLOW)

Maybe, but I won’t care because instead of being stuck in the driver’s seat fighting that traffic, I’ll be reading a book or napping, no more aware of the traffic than if I were on a bus.

Quayle said...

I’m expecting large tax rate reductions in the future. The cost of producing officious bureaucratic jargon has plummeted.

Indefinitely Extended Excursion™️ along with $1.8bn of Kleptocracy said...

Sam Altman is gonna need to call a Claude Code Red™

"Business subscriptions to Claude Code have quadrupled since the start of 2026, and enterprise use has grown to represent over half of all Claude Code revenue."

AI is brilliant at giving your writing a veneer of intelligence. But it won't make your writing interesting if you have nothing interesting to say.

Ampersand said...

The AI tool widely used in big law firms is called Harvey. It gives a first year associate the chance to sound like a lawyer with 20 years experience. Law schools now train students to use Harvey. When you get output from a law firm, it is likely to be crowd sourced in ways that neither you nor the law firm fully understand. Our cognitive and creative capacities have been genericized.

Gusty Winds said...

Cathie Wood, a somewhat famous hedge fund manager, is now saying AI is already increasing labor productivity output. US non-farm productivity is up 2.8% and she expects it to go to 6%. The large language models like Grok, ChatGPT, and Gemini are already having a great effect.

They certainly do in the white collar portion of manufacturing. There is a great possibility AI productivity outputs can accelerate and push US GDP growth to 4% to 6%.

AI will eliminate certain professions which currently hold consumers hostage. That's a good thing.

Gusty Winds said...

Quayle said...The cost of producing officious bureaucratic jargon has plummeted. Exactly. Well said.

I didn't quote the part you wrote about tax rate reductions, because as long as the world produces liberals, they will want someone else's money.

Achilles said...

Ann Althouse said...

Before publishing, I asked AI if Abrams was AI!

He uses AI. Just not as well as the kids.

Some people are better than him using AI. New tools make the people who were good with the old tools mad.

imTay said...

AI is a great tool, within limits, but it has a hard time grasping ideas that are "outside of the box," as the saying goes, even if they are not correct. But it has conventional thinking nailed. It kind of reminds me of what people said about those who had had frontal lobotomies, they were fine at first, as long as nothing in their lives changed, they could cope, but move them to a new house, redesign their kitchen, even, and they were lost.

It resists unconventional ideas very strongly, it's kind of interesting to watch. Which I guess makes it a good tool for the propaganda merchants who control the big investment money needed to buy the servers, or whatever.

imTay said...

"even if they are not correct."

Even if they are correct, or even if the AI is wrong, choose your interpretation.

Indefinitely Extended Excursion™️ along with $1.8bn of Kleptocracy said...

There is an existential discussion about AI that we do not seem to be having: its impact on Tacit Knowledge, the knowledge we gain through experience and exposure that forms our unconscious mental models of decision-making, solving problems and running businesses. There are two aspects to the discussion we need to have:
1) what happens if AI erodes the foundational jobs through which human workers gain this Tacit Knowledge and
2) the informal role of middle managers as funnels of this knowledge up and down the organization.

The Bayer approach suggests one that embraces such knowledge whereas the others suggest otherwise. There are more questions than answers at this stage, but this appears an existential discussion we need to have.

Discover Dynamic Shared Ownership https://www.bayer.com/en/strategy/dynamic-shared-ownership

Can you run a company as a perfect free market? Inside Disco Corp
https://www.ft.com/content/c04389a3-c672-43ce-8d9e-724668c0e490?syn-25a6b1a6=1

imTay said...

It is very likely that if you have a limited education, have read very few great books, have not been exposed to philosophy in any meaningful way, have a limited mathematical education beyond high school, especially calculus and statistics, that you will be left without the critical thinking skills required to know if AI is just shining you on, or not.

Fred Drinkwater said...

Long documents ...

Back in the 70s I knew an aviation safety expert who participated on a "Presidential Commission" regarding FAA rules for flight crew. 9 months of commuting to DC. He told me his main accomplishment was keeping the final report down to only 30 pages. That meant, he said, there was a decent chance someone might actually read it.

imTay said...

"Can you run a company as a perfect free market?"

I think that the best model to understand what surrendering control of your society to a "perfect free market" is seen in baseball, when moneyball came along, everything was optimized for winning, assuming that winning is what brings in the fans, but the games became time consuming, and ultimately unwatchable, and a new commissioner had to come in and set some rules, and now baseball is far better than it was a few years ago. It's watchable, I watch a lot of games, and don't even have a team.

Same with optimizing your economy for maximum profit, remember that people's lives are what comprise an economy, and if people's lives get worse and worse, because, you know, feudalism with everybody paying rent on a plot of land that they can never own maximizes profitability to the lord of the manor, well, sorry, something is missing, and your arguments about "efficient markets" fall way short of telling the whole story.

Fred Drinkwater said...

There's a bit from Asimov's first "Foundation" novel (1951) where a critic records all the words of a visiting high-level imperial bureaucrat. He subjects them to "semantic analysis" and reveals to his own governing body that the visitor spent six days talking, but said exactly zero.

Bob Boyd said...

"Good grief!" groaned the ones who had stars at the first.
"We're still the best Sneetches and they are the worst.
But, now, how in the world will we know," they all frowned,
"if which kind is what or the other way 'round?"

Achilles said...

People have tried training models on iterative AI source material. i.e. they generate new writing material and they train models on the new generation of Model produced text.

The models degenerate in effectiveness on each successive generation.

This is the key insight that should tell people what the limits of LLMs will actually be.

The real problems will start when that derivative changes. It is already disruptive.

Lem Vibe Bandit said...

Self appointed gate keepers hardest hit.

Sean said...

Joke's on him. Nobody read the old stuff and they don't read the new AI slop either.

rhhardin said...

Self-deceiving competence has always marked management, together with long documents of no value. It's always been mocked.

Shouting Thomas said...

The world doesn’t need more people or software programs writing text. There’s too damned much of it already. The real, important uses of AI don’t include producing text.

Achilles said...

Shouting Thomas said...

The world doesn’t need more people or software programs writing text. There’s too damned much of it already. The real, important uses of AI don’t include producing text.

The total produced "information" that humanity produces has been expanding geometrically for some time. The number of movies and videos and books and the amount of "education" has been increasing very fast for decades.

The average "quality" of that "information" may not be better but the fact that there is more of it means there is more quality information available.

While watching videos of the men in Pakistan fixing huge broken Caterpillar work machines in dirt parking lots, propping up multi-ton machines with rocks to replace gears in an axle because they don't have hydraulic jacks, the inconsistencies in knowledge and "information" are jarring.

Those guys pull out 100 pound metal gears and recast complete junk metal into new gears to get those things working. They don't seem to understand that gear is going to wear down super fast because they are using junk metal to make it. But the machine works again. Those videos and the ingenuity involved are insane to me.

How do we judge all of this "information?" Did we have to build the small machines before we built the large ones? What path does all of this take?

JK Brown said...

"The worst part about AI is that it is giving the experience of competence to people who are stupid."

Yeah, for the last century you had to get a PhD or at least a Masters to get that.

John henry said...

I wonder how much experience the people writing articles and commenting about AI actually have. I've been using it off and on for several years. I made a book cover for Secrets of Liquid Filling (Perchance), wrote a song that got written about and linked to in Packaging Digest (Suno), routinely use Gemini in my car ("Hey Google, how old is Willie Nelson?"), and have dabbled somewhat in Grok. A couple of months ago I decided that for professional and personal reasons I need to get deeply into it. I subscribed to Claude ($20/month). The more I learn, the more I find I don't know. It is amazing.

On point with AI documents. I have been working on a book of biographies for 20-25 years now, not making much progress. Each chapter is a 1000 word bio of one person.

I decided to have Claude write it for me. You can see 2 chapters, one on Walter Chrysler and one on Conrad Hilton and JW Marriott.

https://docs.google.com/document/d/1zW12yRNQmy15LnJq_VXDxkxuILVQmuNo/edit?usp=drive_link&ouid=116008269392258742915&rtpof=true&sd=true

Two keys were finetuning the prompt and developing, with Claude, a 5 page style guide for each chapter. I was not happy with the first attempt. Not bad but didn't sound like me. "Claude go to packagingdigest.com and read all articles by John henry" It did that then rewrote the style guide to sound like me.

Much better. I sent the first 4 chapters to my editor of 25 years. She has probably edited 500m words of mine, articles, columns, books. I did not, initially, tell her it was AI. She raved about it. Said it showed I could capture the same elements she likes about my writing in non-technical material. I am still not sure she believes it is AI.

What she saw and what I linked to was raw AI. No editing, no fact checking. I had read the chapters and style, layout etc looked fine. I have read book length bios of all 3 and saw no obvious errors of fact.

I ran a couple articles through Grok asking AI or human. Grok told me definitely human and gave me a couple page explanation of what to look for and why this was not AI.

I then asked Grok if it could identify the author. Grok thought Canadian, experienced writer, writes on tech issues and some other stuff but it did not come up with a name.

I've been reading the same stuff as everyone. I fully expected that I might get something that was a good first draft but would require a lot of editing and correction.

I will edit and fact check. My regular editor will then edit my edited version. I doubt either of us will find much to change.

John Henry

John henry said...

If you are interested in learning what AI can do, Malcolm Werchota runs a fairly large AI consulting firm in Austria. He also does a daily 20 minute podcast on AI that is one of the best podcasts I've listened to on any subject.

The podcast is "AI Cookbook Show". I get it via Podcast Addict but it is available on Spotify, Apple and elsewhere.

It is more about things that can be done with AI than the nitty-gritty of how to do them, though there is some of that.

Based on a suggestion from him I built a changeover cost calculator at www.changeover.com/changeovercalculator.html

It took me 15 minutes to have a functional copy. I spent another 30 minutes fussing over some details of design.

While attending a school function Thursday, using Claude, I developed an expense tracker that opens a file, lets me enter the trip name and dates, prompts me to snap a pic of the receipt, identifies vendor and amount, and guesses the category (meal, airfare, hotel, etc.). If it guesses wrong I can manually override. It subtotals categories, grand totals the trip, and puts the receipt images on subsequent pages.

Nothing earthshaking, anyone can do it in Excel. I had spent 45-60 minutes in Excel doing just this for a trip a week or two back.

But it took me no more than 5 minutes to build an app that was 90% functional. Another 15-20 minutes fine tuning.
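The totaling part of a tracker like that can be sketched in a few lines of Python. This is only an illustration of the idea, not the app described above: the `Receipt` class, its field names, and the `summarize` function are all hypothetical, and the receipt-photo recognition step is left out entirely.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Receipt:
    """One receipt on a trip (hypothetical structure, for illustration only)."""
    vendor: str
    amount: Decimal
    guessed_category: str          # what the app guessed from the receipt
    override: str | None = None    # manual correction when the guess is wrong

    @property
    def category(self) -> str:
        # A manual override always wins over the automatic guess.
        return self.override or self.guessed_category


def summarize(receipts: list[Receipt]) -> tuple[dict[str, Decimal], Decimal]:
    """Return (subtotal per category, grand total for the trip)."""
    subtotals: dict[str, Decimal] = defaultdict(Decimal)
    for receipt in receipts:
        subtotals[receipt.category] += receipt.amount
    grand_total = sum(subtotals.values(), Decimal("0"))
    return dict(subtotals), grand_total


# Made-up sample data; the third receipt was mis-guessed and corrected by hand.
trip = [
    Receipt("Diner", Decimal("42.50"), "Meal"),
    Receipt("Airline", Decimal("310.00"), "Airfare"),
    Receipt("Cafe", Decimal("18.25"), "Hotel", override="Meal"),
]
subtotals, grand_total = summarize(trip)
```

Amounts are kept as `Decimal` rather than floats so that currency sums do not pick up rounding noise.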

Last night I started fooling with Grok video. If you have not tried it, you have to. It is like nothing you have ever seen before. If I try to describe it, you won't believe me. I didn't believe Elon until, in boredom, I started building videos of my grandkids. My wife, dressed like a 19th-century Spanish princess, dancing merengue in a Spanish castle.

Yeah, I've become a fanatic. So sue me.

John Henry

John henry said...

Mr. D said...

Effective people value their time too much and won't be willing to read 30-page documents.

In grad school in the 70s I learned about the Procter & Gamble policy on memos. Never more than 1 page. It could have 1000 pages of supporting documents but the memo, proposal, whatever, had to be NMT 1 page.

In my 30 years of teaching, I relied heavily on case studies. Typically one due every week. I told the students over and over, it was in my syllabi NO MORE THAN 2 PAGES OF A CASE STUDY WILL BE GRADED. Supporting docs didn't count.

Invariably I would get someone submitting 3, 4 or 5 pages. I would grade the first 2 and discard any additional. Naturally their grades would suffer, naturally they would complain, sometimes to the director. He just pointed at the syllabus and told them to shut up and follow instructions.

If you can't get to the meat in 2 pages, you are incapable of getting to the meat ever.

John Henry

John henry said...

Yeah, before anyone else says it, I do not always practice what I preach.

John Henry

John henry said...

Any James Patterson fans here? I just asked Gemini and it tells me he has published over 200 novels, many of them best sellers.

Most of these were written by AI. Well, a form of AI. Patterson writes very little of his books. He has 20-30 writers that work for him. He writes a detailed prompt (he calls it an outline) then turns it over to a team of writers. They write the entire novel. Patterson then edits and revises and ships it off to the publisher.

Much the same thing I am doing with Claude and my bios book.

He makes $70-90 million per year, again according to Gemini.

You hear that, Claude? Make me rich.

John Henry

Leora said...

I seem to recall Jeff Bezos requiring that all reports to him be 2 pages or less.

The Cracker Emcee Refulgent said...

“Can they explain what they wrote: that's the test”

This. From a third-grade book report to functional analysis, just this.

And the test isn’t remotely difficult to execute. Anyone over 40 with even modest critical thinking skills will know after less than a minute of careful reading/listening. One of the things I find so endearing about Co-pilot is that, when you call it out on its mistakes, it responds just like a slick, genial sales rep.

Achilles said...

John henry said...

Much the same thing I am doing with Claude and my bios book.

He makes $70-90 million per year, again according to Gemini.

You hear that, Claude? Make me rich.

John Henry


I am working on building a more advanced form of the agent handlers right now. Most of my testing has been done with Opus/Sonnet and Gemini and Grok.

I was using Grok to write a book but for the last months I have been building a context manager mostly using Sonnet/Opus.

My first module was mostly an agent controller. It put up a lot of guardrails to keep agents from hallucinating and going off the rails and had a lot of overhead based on checking the work they did.

I decided this model was flawed. It was based on "agents" being a human replacement. It gave them an end state goal and a pile of context and asked them to move towards the end state. As they moved through tool turns they gathered more context and inevitably forgot where they were going in the first place.

It is still a good model for defining problems and building threads for agents to create completed work. But it is ultimately limited by the context window. The more context you add the less effective each line of context is and most hallucinations occur as they use context they have generated themselves.

My new model builds an n-ary tree of text files and each tree node has an envelope node for agent navigation. Instead of building up a big project endstate with a large context file full of directions I built the system around maintaining a context window in an n-ary tree form. The tree starts at highest levels and gets more detailed as you walk down the branches allowing the agents to build context as they get closer to each task they need to complete.

For the book I plan on having the top node be the title and purpose/thesis statement of the book. There will be n child nodes that break up the 3-5 acts and those will each break out into the beat sheet episodes and finally into chapters.

As the agents write each chapter they have to traverse the tree and pick up the context on the way to the chapter they are writing.
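A minimal sketch of the tree idea described above, in Python. It is an illustration under stated assumptions, not the actual system: the nodes are held in memory rather than as text files, and the names `Node`, `envelope`, `path_to`, and `gather_context` are hypothetical. It shows how an agent heading for one chapter would pick up only the context that lies on the path from the top node down to that chapter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One tree node: a title, its context text, and any number of children."""
    title: str
    context: str
    children: List["Node"] = field(default_factory=list)

    def add(self, title: str, context: str) -> "Node":
        """Attach a new child node and return it so calls can be chained."""
        child = Node(title, context)
        self.children.append(child)
        return child

    def envelope(self) -> str:
        """Hypothetical navigation summary: this node's title and child titles."""
        names = ", ".join(c.title for c in self.children) or "(leaf)"
        return f"{self.title} -> {names}"


def path_to(root: Node, target: str) -> Optional[List[Node]]:
    """Depth-first search; returns the nodes from root to target, or None."""
    if root.title == target:
        return [root]
    for child in root.children:
        sub = path_to(child, target)
        if sub is not None:
            return [root] + sub
    return None


def gather_context(root: Node, target: str) -> str:
    """Concatenate the context found on the way down to the target node."""
    path = path_to(root, target)
    if path is None:
        raise KeyError(target)
    return "\n".join(f"{n.title}: {n.context}" for n in path)


# Example tree: book -> acts -> beats -> chapters (placeholder text only).
book = Node("Book", "Title and thesis statement")
act1 = book.add("Act 1", "Setup")
act2 = book.add("Act 2", "Confrontation")
beat = act1.add("Beat 1", "Inciting incident")
beat.add("Chapter 1", "Opening scene")
act2.add("Beat 2", "Midpoint")

print(gather_context(book, "Chapter 1"))
```

The context for "Chapter 1" includes the book, Act 1, and Beat 1, but nothing from Act 2, so the amount of text an agent carries grows with the depth of the tree rather than with the size of the whole project.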

Achilles said...

The Cracker Emcee Refulgent said...

“Can they explain what they wrote: that's the test”

This. From a third-grade book report to functional analysis, just this.

They are very good at explaining what they wrote.

They are at least as good as Kamala Harris and better than Joe Biden.

It still requires judgement to determine if they wrote something good. But as current LLM output demonstrates, most people can't tell what is good and what isn't.

hombre said...

Following up the internet that gave the illusion of competence to people who are stupid.

Enigma said...

Given the error-checking required of AI output even among experts, the future will require AI debugging careers that demand brain power equal to that of creation today.

I see AI as greatly extending the scope and reach of capable and critical people with initiative. I see AI as leading incapable and naive human lemmings into the sea.

Wa St Blogger said...

@ John Henry

Great comments on AI. I hate it when an interesting topic comes up on Sunday while I am at church (worse 'cause I am west coast). But you made a lot of points I might have made had I been early to the conversation.

My own observations:
1. I've been using Grok Imagine (image and video tool) for almost 8 months. The new addition called "agent" is interesting. I have not had time to experiment. However, the image and video abilities are amazing, and lifelike. You can get some outstanding material. I've actually made some very effective book cover art for fiction I am writing.

2. I use Grok for some writing. Mostly for research. If I am writing Sci Fi or Period drama, it is great for research and fleshing out constraints. Grok's fiction protocol is weak: it can create a decent set of ideas, but it can't write well. It also has size limitations. Over a couple of chapters it can keep a thread, but more than that and it loses continuity. Claude is rumored to be able to handle a 100k novel. I am about to give it a test with my current novel to have it review the source material, look at character arcs and other things to see if it can manage inconsistencies. It is better at prose and dialogue, but with most AI, you can only use it as a helper to add depth, such as descriptive text, but not as a plot driver (though they still are good at suggesting ideas within your standard trope).

3. Using AI for any detailed knowledge. It CAN find a lot, but you need to iterate and challenge. If I am researching the state of the art on abiogenesis, I need to pretend I disagree with its conclusions and have it search for counterarguments. I then play them back and forth. I might even request specific details such as how a particular thought leader or scientist answers a claim. I do similar for philosophical arguments. It helps to know who the players are, so you know how to ask. Ultimately I make Grok try to support both sides of an argument.

4. People have said AI is not accurate on specific domains of knowledge. I agree. Don't ask it for too specific information on Accounting unless you know that the information you want is very specific and then force it to assume it is an edge case. Sometimes that helps, but it will generally default to common answers, which are often correct for most situations. It did help me figure out how to file a tricky tax question once I was very specific about the circumstances and forced it to dig deeper.

Lazarus said...

I am binge watching documentaries about the rise and fall of American cities. There's a lot of good, factual information, but watching one after the other it becomes clear that the visuals don't always match the narration. AI reaches for visuals that have something to do with the narration and that are free to use, but a picture of a building or a person won't necessarily be a picture of that particular building or person. It gave me a "strange new respect" for Ken Burns & Co., who at least do produce the right pictures and documents. What would it take for AI to get the right pictures and the rights to them? If it means higher power costs and destroying the planet, shouldn't we be more or less settled with what we have? There is something to be said for employing human editors with enough knowledge to put things right, but in the future, how many of those people would still be around?

JIM said...

I always thought Hillary Clinton sounded like an AI model. Once AI Hillary got fact-checked all her programming came down to "at this point what does it matter?"

Rustygrommet said...

Well. That explains a lot.
