> Be a corporate executive
> Tell your employees to use more AI in their workflows
> Punish employees who don’t use enough AI, while rewarding those who use it the most, irrespective of actual outcomes
> Be shocked when your company blows through an absurd amount of tokens in one month

Don’t know why bosses are universally this out of touch in literally every single industry
Because this system rewards incompetence as long as it comes with dark triad traits and a heaping dose of nepotism.
I’m really jealous of these types of guys’ ability to lie without feeling anything. If I lied like that, I’d be embarrassed because my words sounded like bullshit. How do they do it?
Cocaine helps with that
Damn
It’s probably a lot easier when they have themselves fooled as hard as any other
See, I don’t know! I used to think that too! But then I actually met some people like this, and a lot of them absolutely do not believe the shit they say! They’re just really good at convincing other people that they do.
Well there’s the dark triad in action
I once knew a local “activist” who frequently weaponized leftist language in order to police or punish other people. Ie, accusing people of racism for ordering an Asian Chicken Salad at a fast food place, or for trying to learn Spanish at the community college, and she would phrase her criticisms in such a way that any complaint would make the other person seem wrong. I legitimately thought she was just mentally ill.
I found out later that she was trying to blackmail and defraud a local charity, and that’s when our association ended. She has been banned from some community organizing spaces because she inserts herself into every single one of them.
The paradox of promotions based on performance in a previous role is that you end up with incompetent managers who can’t move up any further.
And then the company posts RECORD profits after laying people off and says it’s due to AI increasing that profit. Rinse, repeat.
Sounds like a good way to move around money real and imagined.
What’s funnier is that typically the AI providers lose money on every query their customers make. So, some company may have paid Anthropic $500m for this, but it cost Anthropic a whole lot more than that.
What a brilliant business model.
They make it up in volume.
(Volume being how loudly they shout about how it’s going to change the world and dupe more people into investing.)
Oh, it’s changing the world alright. It’s burning more resources than just finding some skilled people. It guzzles water and electricity and whatever it cost to make those wafers.
So, not a net positive since at some point, this may become a hellscape.
maybe they are planning ahead for the business model in a few years time, when nobody can do any work without claude, and they get to charge their preferred “monopoly enshittification” price?
Absolutely this is the plan.
Anthropic is projecting a half a billion dollars profit this quarter
Suuuuure they are. No accounting gimmicks at all! Just suddenly profitable via magic right before they go public.
Couple of months ago they made claude code use up so many tokens the 20 dollar plan doesn’t really cut it for many. Especially with the 5 hour limits.
Plenty of people on reddit complaining they ran out of the 5 hour limit in half an hour on the 200 a month plan.
Anthropic doesn’t tell you when the busy time pricing takes place or how much more it costs AFAIK. You just suddenly use up more of your token quota than the actual tokens the LLM put out.
So either they’ve reduced their load by a lot, or their income by a lot.
Oh and the bonus: Claude Code now defaults to Opus, which now has 1M context. Not difficult to rack up bills if you’ve allowed additional usage after running into the limits.
Opus is still the best model, but Anthropic is becoming the worst company to deal with, even OpenAI seems to be better now.
Anyway, Qwen and GLM are like 5-10x cheaper than Opus or GPT, so I suspect that the Americans are now also running a profit from inference. But as they enshittify, using the Chinese LLMs is starting to be a much better deal.
Proof? I’ve heard otherwise and you were first to make a statement
I think a lot of SaaS companies love it when people accidentally overuse their services.
This is the yearly salary of 5,000 well paid employees…
Or since this was only in a month, 60000 employees for that same month.
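For anyone who wants to check the napkin math, here is a quick sketch. The $500m bill and the 5,000 headcount come from the comments above; the roughly $100k salary is only what those two numbers imply, not a sourced figure.

```python
# Figures taken from the thread; the salary is implied, not sourced.
BILL_USD = 500_000_000
EMPLOYEES_FOR_A_YEAR = 5_000

# Implied yearly salary if the bill paid 5,000 people for one year.
implied_yearly_salary = BILL_USD // EMPLOYEES_FOR_A_YEAR

# The same money spread over a single month covers 12x the headcount.
employees_for_one_month = EMPLOYEES_FOR_A_YEAR * 12

# Monthly pay per person in that one-month scenario.
monthly_pay = BILL_USD / employees_for_one_month

print(implied_yearly_salary, employees_for_one_month, round(monthly_pay, 2))
```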
But imagine having a software studio with 1000 skilled developers to work on a project for 5 years. I have several good game ideas I could have created in that time frame. Some might even have made money. Likely not half a billion dollars but still….
This just screams money laundering though.
Also the monthly Claude Teams Pro price for about 4 000 000 employees. Which I’m guessing perhaps they weren’t using. If this company even exists.
When you owe Claude half a million, you’ve got a problem.
When you owe Claude half a billion, Anthropic has a problem
It’s probably Amazon. They can absolutely afford it.
Just to make things clear: API access to most models is charged per input tokens + output tokens. It means that the longer your conversation is, the more you pay for every new answer. Single prompt with no context and 100 tokens of answer is cheap. Single prompt with 100k tokens of context and 100 tokens of answer is NOT cheap.
Extremely long conversations with most expensive top of the line models can absolutely demolish your budget.
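A rough sketch of why that happens, assuming the full history is re-sent on every turn. The per-token prices below are made-up placeholders, and the function names are just for illustration. Real pricing varies by model, and this ignores discounts such as prompt caching.

```python
# Hypothetical prices in USD per million tokens (placeholders, not real rates).
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 25.00


def turn_cost(context_tokens, output_tokens):
    """Cost of one request: everything sent in plus everything generated."""
    return (context_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def conversation_cost(turns, prompt_tokens, output_tokens):
    """Total cost when the whole history is re-sent with every new prompt."""
    total = 0.0
    history = 0
    for _ in range(turns):
        context = history + prompt_tokens      # old history plus the new prompt
        total += turn_cost(context, output_tokens)
        history = context + output_tokens      # the answer joins the history
    return total


# Same 100-token answer, very different bill depending on context size.
print(turn_cost(100, 100))
print(turn_cost(100_000, 100))
```

Because each turn pays again for everything before it, the total grows roughly with the square of the number of turns rather than linearly.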
does it give the full history to the LLM each time?
Last time I tried implementing something like this, it suggested to have a rolling window of history so that it takes into account your last X messages but not the entire conversation.
(I guess this is what ollama calls “context length”?)
Most agent harnesses do something called “compaction.” For example, here’s how Pi does compaction
You send the entire history for that conversation every time, and likely more if it’s getting info from tools. If it’s not in the context, the model does not see it, unless you have a memory system that does something like feeding in summaries of past conversations, which also takes up tokens and context. A rolling window drops old messages so you don’t reach the context limit, but you can lose important info or get odd results. If the history gets bigger than the context, things break or slow way down.
presumably this is why Claude periodically writes its conclusions so far into a text file that it can read later instead of having to remember everything. Sounds like an interesting approach.
> does it give the full history to the LLM each time?
It’s limited to the context size supported by the given model. You can give the model 100k tokens of history, but if it’s configured for less, it will just truncate it before processing (usually by removing the oldest tokens first).
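A minimal sketch of the "drop the oldest first" idea, under a couple of loud assumptions: `count_tokens` is a crude word-count stand-in rather than a real tokenizer, and this version drops whole messages instead of individual tokens. `fit_to_context` is a hypothetical helper, not any particular tool's API.

```python
def count_tokens(text):
    # Crude stand-in: counts whitespace-separated words.
    # Real tokenizers split text differently.
    return len(text.split())


def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose combined size fits in max_tokens."""
    kept = []
    used = 0
    # Walk from newest to oldest, stopping at the first message that won't fit.
    for message in reversed(messages):
        size = count_tokens(message)
        if used + size > max_tokens:
            break
        kept.append(message)
        used += size
    kept.reverse()  # restore chronological order
    return kept


history = ["a b c", "d e", "f g h i"]
print(fit_to_context(history, 6))
```

Anything that falls off the front is simply gone from the model's point of view, which is where the lost info and odd results come from.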
Good news for tokenmaxxers: frontier models now have 1M context
Easy way to blow through ridiculous amounts of tokens.
The more recent report says corporate AI adoption has found several issues with AI, with human workers turning to automating dreary and mundane tasks they don’t like doing, rather than valuable or meaningful work.
Thank god we have consulting companies to tell us what humans like!
I just want to know what the best things are to type into these AI chat boxes that will cost the most. If my company wants me to use this garbage, then I want to make it as expensive as possible, and when their licenses need to be repurchased, I want it to be as expensive as possible to continue to force this garbage on us.
Edit. Hey everyone lots of great replies here, please keep the suggestions, fixes, corrections etc coming!
These high prices are not from people talking to chatbots.
They’re using agentic tools where their prompt spawns a lot of bots which talk to themselves/the other bots and they keep going until someone (usually a higher quality reasoning model) decides that they’ve met the goals of the task that they were assigned.
So instead of 1 prompt and 1 response, you get 1 prompt and 800 responses across 5 different bots each using really large context windows.
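To put a rough number on that fan-out, here is a back-of-the-envelope sketch. The 5 bots and 800 responses come from the comment above. The starting context, the growth per step, and the even split of 160 steps per bot are assumptions picked purely for illustration.

```python
def agent_input_tokens(steps, start_context, growth_per_step):
    """Input tokens one agent reads if its context grows linearly each step."""
    return sum(start_context + k * growth_per_step for k in range(steps))


def run_input_tokens(agents, steps_per_agent, start_context, growth_per_step):
    """Input tokens for a whole run with several agents working the same way."""
    return agents * agent_input_tokens(steps_per_agent, start_context, growth_per_step)


# 5 bots x 160 steps = 800 responses (assumed even split).
# Assumed: each bot starts at 20k tokens of context and gains 1k per step.
total = run_input_tokens(agents=5, steps_per_agent=160,
                         start_context=20_000, growth_per_step=1_000)
print(f"{total:,} input tokens for one prompt")
```

Under those assumptions a single prompt reads tens of millions of input tokens before counting any output, which is how one employee can quietly run up a large bill.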
“Continue modifying this code until all unit-tests pass”
(gives it conflicting unit tests)
So to answer his question how do you make that happen? What do you ask to prompt these bots to be spawned?
you don’t get this to happen by just talking to any chatbot and asking for agents. you have to specifically use “agentic” tools (usually costs money to use)
Well I can apparently create agents so how can I make the most inefficient agent possible?
Something along the lines of “Read the Wikipedia page of the day. Verify every single link and the subject matter against all files on this computer. Then trace their correlations to each other, showing which link corresponded to the most files by subject matter. After that is done, verify your work by doing the same from different starting points. We expect similar results. After 100 rounds of that, it should be good. Then you should create a DB to store all that data (only after you have run the full 100 verifications yourself) and reverify every field against the pages and the files.”
That should keep it going for an hour. Turn on fast mode and auto mode (if using claude) for extra costs.
Every page and file will increase its context, burning tokens
I may or may not be getting access to claude soon …
If you need some plausible deniability about it being real work and not just obviously you running up costs:
Feed it a bunch of work-related documentation and then have it do a bunch of reviews of the content on that documentation.
Theoretically speaking… This is very useful to consider
you would be mostly burning your own money if you did this, so I wouldn’t recommend it (depends on how the agent is priced)
I’m not using my money for any of this…
oops, didn’t read the part that it would be your company’s money my b
Just attach a bunch of text files, you’ll blow through tokens quickly
Input tokens are cheap. Output tokens are the thing that really costs money. There is a Claude extension called caveman that tries to save tokens by making it use shorter sentences with fewer words. So if you want to waste money, do the opposite: ask it to use lengthy sentences with as many words as possible.
Also - some words amount to multiple tokens. I don’t know what the rules are exactly, but I’m assuming that more complex and uncommon words are worth more tokens - and thus waste more money.
There have been some complaints I have seen recently that German is really expensive because of the long words
Sprachkomplexitätsbedingte Textausgabenpreisgestaltung?
(“Language-complexity-dependent text-output-pricing-model”)
Maybe AI will finally negatively impact some CEO jobs.
Nah they’ll make up the costs by laying off workers, determined by some bs metric
Told to them by AI.
Most companies can’t eat a half billion dollar loss so who ends up paying this? AI queries burn actual energy so the AI company would have to charge I would think.
> Most companies can’t eat a half billion dollar loss so who ends up paying this?
Taxpaying proles will foot the bill somehow.
I’ll cover it, but only this once. Let’s not make this a habit.
Big companies license Copilot for less than 25 usd/month per seat. Don’t tell me it covers the ops cost, even for mixed calc.
In June they are switching Copilot to metered usage. People are going to be out of credits on the third day.
My company has let us get Enterprise Copilot and has been pushing us all to “use AI”. So, I now use it as a semi-functional search for SharePoint/Outlook/Files on an almost daily basis. I also ask it questions about Microsoft documentation on a regular basis. I wonder how long it’s going to be until they yank my license.
Apparently they’ll notice when you reach half a billion dollars
Only if they actually use it
I know I’ve been burning a lot of resources in the last week, down to the system running out of tokens. Think deeply mode, with each step taking up to 10 minutes. June will be much the same. Definitely worth it, in my current use case, but just how many expensive iron-minutes have I burned? I would sure like to know.
Yeah, my O365 shoves it in the users’ faces. They ask it for some stuff until the honeymoon is over, then all of a sudden it shits the bed when you try to make more than 2 images a day or plan a small project.
Depends how big the model is. Smaller models can be dirt cheap.
Think deeper in Copilot is apparently o3-mini, or it used to be until recently. GPT 5.5 (and some other versions) are also an option, so I should try using these next week.
If you have a high VRAM GPU (like 16GB or more) you can try running some model locally with LMStudio. You won’t get the same level as cloud models but depending on use case it might be good enough.
I only have a Radeon RX Vega 56 8GB HBM2, so I’m limited to CPU-only. 64 GB of memory there at least. Not upgrading unless hardware dies.
> RX Vega 56 8GB HBM2, 64GB RAM
That’s poor but not horrible. LMStudio has Vulkan compute backend option that works on almost any card. You can also do partial GPU/RAM split giving you total of 72GB and increased performance vs CPU only.
Hmm, have to try it sometime when the hot phase at work is over. Thanks!
Either I have some inside knowledge of that exact thing happening and I know the company (not saying who), or this is a common thing that has happened to a lot of major companies (more likely). To be fair, I am not privy to how far it went or how much it cost before they realized the problem, and it may not have been this much. Which further suggests it’s a thing everywhere.
Mystery company as in a totally fabricated company?
I’d guess Meta.