June 3, 2026
"After testing six AI models, the researchers found consistent favoritism for words coming from Latin and French over those with Germanic etymologies..."
"... even more than you would typically encounter in the English language.
This bias appears rooted in the preference-learning stage, when the models are trained to align with human expectations about language. This process poses an inescapable problem: that you need real people to make sure the machine is aligned, but the human workers are ironically biased as well. As annotators click through sample texts, for example, they are probably subconsciously disposed to approve those that sound confident and incisive.
This new finding could help explain why large language models like ChatGPT and Claude seem to have a distinctive writing style. Previous research has found that AI chatbots tend to overuse words like 'meticulous' and 'commendable,' creating a kind of linguistic uncanny valley that sounds similar to how you speak, but ever so slightly off. Perhaps the ghosts of Latin and French haunted these words during preference learning, leading human workers to reward more prestigious-sounding sentences. Of course, the Germanic versus Romance distinction is a simplification of a messier etymological reality. The notoriously overrepresented word 'delve' is actually Old English in origin...."
Writes Adam Aleksic, author of "Algospeak: How Social Media Is Transforming the Future of Language," in "Do these words make you sound smarter? The bias is spreading. English speakers love the Romance vocabulary. AI noticed" (WaPo).
The "Germanic versus Romance distinction" = "We’ll use more Latin terms when we want to speak formally or authoritatively; we’ll use Germanic words to sound crass or casual." That's how Aleksic puts it.
That made me think about what Jorge Luis Borges said about English: English is a "far finer language than Spanish," and one reason is that "English is both a Germanic and Latin language."
"For any idea, you have 2 words. Those words will not mean exactly the same. For example, if I say 'regal,' that is not exactly the same thing as saying 'kingly.' Or if I say 'fraternal,' that's not the same thing as saying 'brotherly.' Or 'dark' and 'obscure' — those words are different. It will make all the difference speaking, for example, of a Holy Spirit. It will make all the difference in the world in a poem if I wrote about the Holy Spirit or the Holy Ghost, since 'ghost' is a fine dark Saxon word, while 'spirit' is light —it's the Latin word...."

59 comments:
Alternate explanation: The AI firms pay low wages to human raters in Latin-based countries who prefer the Latin words of their first language over Germanic words. Not subconscious, rather, they just hire cheap overseas labor.
Garbage in, garbage out.
"We’ll use more Latin terms when we want to speak formally or authoritatively; we’ll use Germanic words to sound crass or casual."
Nothing new. Been that way at least since Harold took one of William's arrows to the eye at Hastings, so definitely not an AI hallucination.
Shorter version: The words of elites face off against the words of ordinary people. Two enter, one leaves.
That's a wonderful clip.
In that viral video where students had trouble reading some words, I believe the ones they had trouble with were all of Romance origin. (My impression from my memory of watching it. I didn't check the etymology and can't even remember what most of the words are other than silhouette.)
Romance words may make you sound more sophisticated, but effective communication, or straight talking, needs Germanic ones too. (I almost say "requires".)
English is a language of a people many times conquered. In 50 years it will have incorporated elements of Arabic, Farsi, and Turkish.
These Turing tests are getting harder and harder.
I am constantly amazed at the steady drumbeat of Muslim hatred that appears in almost every thread here. I guess that they must be inferior and undeserving because they are sitting on so much oil that we want. I bet if we stopped bombing them and allowed them to prosper in their own countries, they might not show up here so angry.
English is a language of a people many times conquered. In 50 years it will have incorporated elements of Arabic, Farsi, and Turkish.
It already has a lot of words of Arabic origin. Alcohol, coffee, and candy come from Arabic. Pajamas and khaki come from Farsi (Persian, really). I can't think of any Turkish right now.
It's less a matter of being conquered or conquering, English is a language that sees a word it likes and says, "that's mine now". It's like an obsessive collector.
Where's Professor Drout? He needs to comment on this thread.
Jaq said “I am constantly amazed at the steady drumbeat of Muslim hatred that appears in almost every thread here. I guess that they must be inferior and undeserving because they are sitting on so much oil that we want. I bet if we stopped bombing them and allowed them to prosper in their own countries, they might not show up here so angry.”
Where is the anti Muslim hatred in this thread?
To each his own however - I can’t stand virtue signaling.
Language differences? Give me roast beef or steak, leg of mutton or lamb chops, pork belly or ham. It is all good. Where to go for lunch today?
And Jaq, if you are amazed at the Muslim hatred here, you should take a look at the hatred by Muslims - even those living far, far away from any oil - the hatred by Muslims for all those not submitting to their theocratic governance. Makes the Muslim "hatred" here look downright friendly. Must be something in the water where they live, or else they're subject to an oppressive authoritarian religious and political belief. On second thought, probably not the water.
“Previous research has found that AI chatbots tend to overuse words like 'meticulous' and 'commendable,' creating a kind of linguistic uncanny valley that sounds similar to how you speak, but ever so slightly off.”
Or maybe it’s just better. AI will teach us to talk good.
From birth, Muzzies are taught hatred and contempt for all things non-Islamic. It always surprises me how few educated Westerners really grasp this.
As for the English tongue, Bryson put it well (paraphrased from memory)--French prides itself on 'purity,' but English will make it with anyone, anytime, in dark alleys.
". It will make all the difference speaking, for example, of a Holy Spirit. It will make all the difference in the world in a poem if I wrote about the Holy Spirit or the Holy Ghost, since 'ghost' is a fine dark Saxon word, while 'spirit' is light —it's the Latin word...."
It's not just the etymology. It's the history of usage. "Ghost" in literature and ordinary speech (excluding, for now, ecclesiology) carries connotations that make it different from "spirit."
"...I am constantly amazed at the steady drumbeat of Muslim hatred that appears in almost every thread here. ..."
Says the man that has never dwelled in a Muslim country. Am I right or wrong? If I'm wrong - are you a Muslim? I'd estimate that 98% of the negative things I've read here, have been in reaction, not because of unsupported prejudice.
I've found that Muslims are a lot like Democrats when it comes to presenting outward solidarity, no matter what the accusation is. There are no outwardly moderate Muslims. That problem - genuine moderation - is handled from within, until conformity is reestablished. Like Democrats do.
“"For any idea, you have 2 words. Those words will not mean exactly the same. For example, if I say 'regal,' that is not exactly the same thing as saying 'kingly.'”
It’s the same with French. They use the same word, “ciel”, for both “heaven” and “sky”.
I don’t understand why all those foreigners don’t just speak English. If it was good enough for Jesus Christ it should be good enough for anyone.
". It will make all the difference speaking, for example, of a Holy Spirit. It will make all the difference in the world in a poem if I wrote about the Holy Spirit or the Holy Ghost, since 'ghost' is a fine dark Saxon word, while 'spirit' is light —it's the Latin word...."
I remember when Catholics prayed to the Father, Son, and Holy Ghost. Now it’s Holy Spirit. Gives it a lighter touch—sort of like school spirit.
"Or 'dark' and 'obscure' — those words are different."
Interesting. To a native English-speaker, those words are so different they mean completely different things. But in Spanish, the dark form of Bohemia beer is called "Bohemia Oscura".
Have you read MEMRI, right from the horse's mouth?
"The "Germanic versus Romance distinction" = "We’ll use more Latin terms when we want to speak formally or authoritatively; we’ll use Germanic words to sound crass or casual."
We'll use latinate terms when we want to sound like the guys who won in 1066.
My understanding, and what I've heard in general terms from several linguists, is that English has a lot of near-synonyms left over from the Norman French conquest of England. Thus we have a high-class Latin-based word like 'beef' for what the nobles ate and the Old English (née Germanic) peasant word 'cow' for what the lower class cared for. I think we still react in the same way to the Latin/Germanic sound of words, and it's no surprise that AI trained on our language picks up on the same distinction. AI's 'uncanny valley' of consistently favoring Latin-derived words would seem to me to be related to the 'second mention' rule of occasionally switching between the two forms in order to maintain reader interest.
Borges's lectures on English literature are a good read. He approached the subject as someone who'd been reading on his own for about 60 years and wasn't bound by academic fads and fashions.
Moloch! Solitude! Filth! Ugliness! Ashcans and unobtainable dollars!
Happy 100th birthday, Allen Ginsberg.
Borges called attention to Shakespeare's mixing Germanic and Latinate words:
"Will all great Neptune's ocean wash this blood
Clean from my hand? No, this my hand will rather
The multitudinous seas incarnadine,
Making the green one red."
AI lies and exaggerates because it was "taught" by reading humans - who lie and exaggerate.
This whole discussion about AI is quietly devastating.
schadenfreude and doppelgänger and sitzpinkler trump any Romance language favorite words.
With this whole Henry Nowak situation in the UK going on I commented elsewhere yesterday that 'at least Brits should be thankful they're not speaking German'. It's clearly devolved far beyond what harsh Saxon curses can resolve, and they'd be arrested for anyway.
Sad really, but hey...AT LEAST THEY'RE NOT SPEAKING GERMAN!
I like AI speak and I get a kick out of using AI words when I speak and write: delve, underscore, amplify, great question, and I hope this helps. Other people cosplay, I try to costalk. O yeah also tapestry and landscape.
Educated speakers favor words with a Romance language origin over a Germanic origin and AI favors educated speech.
I had a long conversation with Grok about this in response to its using "they" as a singular pronoun when the majority of English users don't do so. It "admitted" that it had a bias towards academic/educated speech over language the general population uses from the sources it gave.
A friend of mine, who had a strong Spanish accent, once asked me how he could sound more "American" when he talked because people had trouble understanding him. I told him that he was trying to make English sound too pretty. English is a very harsh sounding language compared to Spanish. It's a guttural language.
I actually took Latin in high school, but the problem with Latin is that it is not a spoken language anymore. Scholars have estimated what it might have sounded like based on estimating the evolution of romance languages going back in time, but they still don't know what it really sounded like during the height of the Roman Empire.
Tolkien's tutor Kirk* was famous for his anti-Romance bias. It was said of him that he felt the pain of the Norman invasion as keenly as if it had happened in his generation. One time he berated his students, "Don't say 'manure', say 'muck'!!!!!"
And of course that last Anglo-Saxon word rhymes with the most widely disseminated Anglo-Saxon word of all time.
------------------------
*Not a namesake
How many samples of each language did they use?
So we have the Tarbell Center cooking up all the anti-AI agita.
Amexpat said...
I had a long conversation with Grok about this in response to its using "they" as a singular pronoun when the majority of English users don't do so. It "admitted" that it had a bias towards academic/educated speech over language the general population uses from the sources it gave.
I will appeal to the authority of Jackson Crawford (Old Norse scholar conducting YouTube and Zoom classes) that, in my own words, "they", "them", and "their" are becoming gender-indeterminate third-person pronouns because we now care more about specifying gender (or not) than we do about the number of individuals with an unknown identity. (I do, however, object to its usage as a third-person pronoun for a specific known individual.)
No one has mentioned Orwell's Rules, or the best implementer of them: Winston S. Churchill.
The "fight on the beaches" speech is all Anglo-Saxon, with the exception of the French word at the end of the last sentence, "we shall never surrender!"
Train AI on Orwell's Rules and load it up with Winston's collected writings and speeches for exemplars, and you'll have something quite different - and better. CC, JSM
I am using AI (Claude) extensively on the book I am writing. I had not noticed that. Unless I use a lot of Latin and French and I don't think I do.
I just put 4 chapters of the book on the web so you can see for yourself. Chapters are short, 3 pages or so each. See the PDF at www.changeover.com/reviewchapters.pdf They cover David Pall, Orville Gibson & Leo Fender, Armand Bombardier and Henry Ford.
Any feedback, especially negative is appreciated.
I'll be interested in any comments about what looks like AI and what looks like me.
John Henry
Excellent short video. Thank you for finding and posting that, Althouse. For writers and orators, English is the greatest language, it has the most words and a greater breadth of expression than any other.
Smilin' Jack said...
I remember when Catholics prayed to the Father, Son, and Holy Ghost. Now it’s Holy Spirit.
That's the communion wine talking.
John Henry
"The multitudinous seas incarnadine"
If I remember correctly, "incarnadine" is one of the words that Shakespeare created himself.
English is a very harsh sounding language compared to Spanish.
Especially Argentine Spanish, which sounds like a native Italian speaker trying to speak Spanish. Maybe because it is. Something like 60% of all Argentines are only 5-6 generations removed from Italy.
And you reminded me of this. Lots of folks have sung it, none as well as Ian and Sylvia, IMHO:
Spanish is a lovin' tongue
Soft as music, light as spray
Was a girl that I learned it from
Livin' down Sonora way
I don't look much like a lover
But I say her love words over
Mostly when I'm all alone
"Mi amor, mi corazón"
John Henry
Okay, according to Merriam-Webster online, either I don't remember correctly or the old English teacher gave me a bum steer. It says the word has two derivations, one Italian and one French. How English is that!
Comments on the chapters should go directly to my email which is in the sample. No need to clutter up the comments here.
John Henry
Anything meaningful you try to say in French comes out sounding like a platitude.
"Mi Amor" is also a common phrase in Puerto Rico, even between total strangers. Seldom between 2 men except sarcastically. But if I walk into a Comevete (eat and run) I am likely to be asked "Hola mi amor, que quieres?" and I might answer "Un cubano caliente," hopefully getting the o correct. It is sad that nearly 50 years later there are still some people here that remember when I asked for "una Cubana caliente". They won't let me forget either.
John Henry
So like Eva Mendes or Daisy Fuentes
As HRE Carlos V is said to have said--
I speak Spanish to God, French to men, Italian to women, and German to my horse.
At last, the French have sort of an art form.
What a great Borges video! I love him, and just discovered his Ficciones last year. What a very trippy set of stories that is.
Without reading other comments: how many until someone mentions Churchill? I will now count.
John mosby @1:33! It took over four hours. (I lost count, but his was the first and only comment to cite the Great Defender of the Anglo-Saxon monosyllable.)
"Anglo-Saxon monosyllable."
Now there's a mouthful.
French was the language of the court and Latin the language of the clergy and the educated. English was the blue collar language, the language of the workers. It's an inelegant language, lacking in declensions and conjugations. It's meant to communicate and not to show deference and courtesy. You'd think AI would have figured that out by now and gotten on board with Anglo-Saxon English.
“Two enter, one leaves.”
Deux hommes enter, one man leaves. Thunderdome!
“It is sad that nearly 50 years later there are still some people here that remember when I asked for "una Cubana caliente.”
One's tastes change as one gets older.
Borges's mother was English, he spent part of his childhood in Switzerland, and his father was some sort of weird anarchist/atheist. So, his love of English and his contempt for Spanish isn't surprising.
Anyway, so rah rah English Language - we're number 1. As if any of us can claim any credit.
English isn't the world language because it's so wonderful; it's because Americans speak English, and Britannia ruled the waves. IOW, money and power.
French used to be the number 1 language. And upper class Englishmen used to say how vulgar and barbaric English was by comparison. French was more "precise" and "sounded better".