• wonderingwanderer@sopuli.xyz
    English
    arrow-up
    154
    arrow-down
    3
    ·
    edit-2
    7 days ago

    > Be a corporate executive

    > Tell your employees to use more AI in their workflows

    > Punish employees who don’t use enough AI, while rewarding those who use it the most, irrespective of actual outcomes

    > Be shocked when your company blows through an absurd amount of tokens in one month

      • wonderingwanderer@sopuli.xyz
        English
        arrow-up
        14
        arrow-down
        1
        ·
        7 days ago

        Because this system rewards incompetence as long as it comes with dark triad traits and a heaping dose of nepotism.

        • sureshot0@discuss.online
          English
          arrow-up
          10
          ·
          7 days ago

          I’m really jealous of these types of guys’ ability to lie without feeling anything. If I lied like that, I’d be embarrassed because my words sounded like bullshit. How do they do it?

            • sureshot0@discuss.online
              English
              arrow-up
              2
              ·
              7 days ago

              See, I don’t know! I used to think that too! But then I actually met some people like this, and a lot of them absolutely do not believe the shit they say! They’re just really good at convincing other people that they do.

                • sureshot0@discuss.online
                  English
                  arrow-up
                  2
                  ·
                  6 days ago

                  I once knew a local “activist” who frequently weaponized leftist language in order to police or punish other people, e.g., accusing people of racism for ordering an Asian Chicken Salad at a fast food place, or for trying to learn Spanish at the community college. She would phrase her criticisms in such a way that any complaint would make the other person seem wrong. I legitimately thought she was just mentally ill.

                  I found out later that she was trying to blackmail and defraud a local charity, and that’s when our association ended. She has been banned from some community organizing spaces because she inserts herself into every single one of them.

      • Spice Hoarder@lemmy.zip
        English
        arrow-up
        9
        ·
        7 days ago

        The paradox of promotions based on performance in a previous role is that you end up with incompetent managers unable to move upwards anymore.

  • merc@sh.itjust.works
    English
    arrow-up
    78
    arrow-down
    2
    ·
    7 days ago

    What’s funnier is that typically the AI providers lose money on every query their customers make. So, some company may have paid Anthropic $500m for this, but it cost Anthropic a whole lot more than that.

      • merc@sh.itjust.works
        English
        arrow-up
        31
        ·
        edit-2
        6 days ago

        They make it up in volume.

        (Volume being how loudly they shout about how it’s going to change the world and dupe more people into investing.)

        • heartSagan5@lemmy.zip
          English
          arrow-up
          3
          arrow-down
          1
          ·
          7 days ago

          Oh, it’s changing the world alright. It’s burning more resources than just finding some skilled people. It guzzles water and electricity and whatever it cost to make those wafers.

          So, not a net positive since at some point, this may become a hellscape.

      • perviouslyiner@lemmy.world
        English
        arrow-up
        7
        arrow-down
        1
        ·
        7 days ago

        maybe they are planning ahead for the business model in a few years’ time, when nobody can do any work without Claude, and they get to charge their preferred “monopoly enshittification” price?

      • merc@sh.itjust.works
        English
        arrow-up
        5
        arrow-down
        1
        ·
        6 days ago

        Suuuuure they are. No accounting gimmicks at all! Just suddenly profitable via magic right before they go public.

        • boonhet@sopuli.xyz
          English
          arrow-up
          2
          ·
          5 days ago

          A couple of months ago they made Claude Code use up so many tokens that the 20 dollar plan doesn’t really cut it for many. Especially with the 5 hour limits.

          Plenty of people on reddit complaining they ran out of the 5 hour limit in half an hour on the 200 a month plan.

          Anthropic doesn’t tell you when the busy time pricing takes place or how much more it costs AFAIK. You just suddenly use up more of your token quota than the actual tokens the LLM put out.

          So either they’ve reduced their load by a lot, or their income by a lot.

          Oh and the bonus: Claude Code now defaults to Opus, which now has 1M context. Not difficult to rack up bills if you’ve allowed additional usage after running into the limits.

          Opus is still the best model, but Anthropic is becoming the worst company to deal with, even OpenAI seems to be better now.

          Anyway, Qwen and GLM are like 5-10x cheaper than Opus or GPT, so I suspect that the Americans are now also running a profit from inference. But as they enshittify, using the Chinese LLMs is starting to be a much better deal.

  • x00z@lemmy.world
    English
    arrow-up
    27
    ·
    6 days ago

    I think a lot of SaaS companies love it when people accidentally overuse their services.

    • billwashere@lemmy.world
      English
      arrow-up
      14
      arrow-down
      1
      ·
      6 days ago

      Or since this was only in a month, 60000 employees for that same month.

      But imagine having a software studio with 1000 skilled developers to work on a project for 5 years. I have several good game ideas I could have created in that time frame. Some might even have made money. Likely not half a billion dollars but still….

      This just screams money laundering though.

    • abc@suppo.fi
      English
      arrow-up
      3
      ·
      5 days ago

      Also the monthly Claude Teams Pro price for about 4 000 000 employees. Which I’m guessing perhaps they weren’t using. If this company even exists.
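
For anyone who wants to sanity-check the comparisons above, here is a back-of-the-envelope sketch. It only divides the half-billion figure quoted in the thread by the head counts the commenters mention; the per-person amounts it produces are implied by those comments, not confirmed salaries or list prices.

```python
# Figure quoted in the thread ("$500m", "half a billion dollars").
BILL_USD = 500_000_000

# "60000 employees for that same month"
per_employee_month = BILL_USD / 60_000

# "1000 skilled developers to work on a project for 5 years"
per_developer_year = BILL_USD / (1_000 * 5)

# "the monthly Claude Teams Pro price for about 4 000 000 employees"
per_seat_month = BILL_USD / 4_000_000

print(f"Implied cost per employee-month: ${per_employee_month:,.0f}")
print(f"Implied cost per developer-year: ${per_developer_year:,.0f}")
print(f"Implied price per seat-month:    ${per_seat_month:,.0f}")
```

So the three comparisons assume very different unit costs, which is why the head counts differ by orders of magnitude.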

  • UnderpantsWeevil@lemmy.world
    English
    arrow-up
    63
    ·
    7 days ago

    When you owe Claude half a million, you’ve got a problem.

    When you owe Claude half a billion, Anthropic has a problem

  • BlackLaZoR@lemmy.world
    English
    arrow-up
    46
    ·
    7 days ago

    Just to make things clear: API access to most models is charged per input tokens + output tokens. It means that the longer your conversation is, the more you pay for every new answer. Single prompt with no context and 100 tokens of answer is cheap. Single prompt with 100k tokens of context and 100 tokens of answer is NOT cheap.

    Extremely long conversations with most expensive top of the line models can absolutely demolish your budget.
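
A minimal sketch of that billing model, assuming a chat client that resends the full history on every turn. The two prices are hypothetical placeholders, not any provider's actual rates, and the function names are made up for illustration.

```python
# Hypothetical prices in USD per million tokens; real prices vary by provider and model.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def turn_cost(context_tokens: int, answer_tokens: int) -> float:
    """Cost of one request: the whole context is billed as input, the answer as output."""
    return (context_tokens * INPUT_PRICE_PER_MTOK
            + answer_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000


def conversation_cost(turns: int, prompt_tokens: int, answer_tokens: int) -> float:
    """Total cost when every turn resends the full history so far."""
    total = 0.0
    history = 0
    for _ in range(turns):
        context = history + prompt_tokens          # old history plus the new prompt
        total += turn_cost(context, answer_tokens)
        history = context + answer_tokens          # the answer joins the history too
    return total


if __name__ == "__main__":
    print(turn_cost(100, 100))        # short prompt, short answer
    print(turn_cost(100_000, 100))    # same answer length, 100k tokens of context
    print(conversation_cost(200, 200, 300))
```

Because each turn pays again for everything before it, the total grows roughly with the square of the number of turns, not linearly.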

    • perviouslyiner@lemmy.world
      English
      arrow-up
      11
      ·
      7 days ago

      does it give the full history to the LLM each time?

      Last time I tried implementing something like this, it suggested keeping a rolling window of history, so that it takes into account your last X messages but not the entire conversation.

      (I guess this is what ollama calls “context length”?)

      • Sabata@ani.social
        English
        arrow-up
        7
        ·
        7 days ago

        You send the entire history for that conversation every time, and likely more if it’s getting info from tools. If it’s not in the context, the model does not see it, unless you have a memory system that does something like feeding in summaries of past conversations, which also takes up tokens and context. A rolling window drops old messages so you don’t reach the context limit, but you can lose important info or get odd results. If the history gets bigger than the context, things break or slow way down.
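
A rough sketch of the rolling-window idea, assuming messages are plain dicts with a "content" key. The `estimate_tokens` helper is a crude stand-in (about 4 characters per token), not a real tokenizer, and both function names are hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly one token per 4 characters."""
    return max(1, len(text) // 4)


def rolling_window(history: List[Message], budget: int) -> List[Message]:
    """Keep the newest messages whose combined estimated tokens fit in `budget`."""
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that would not fit.
    for message in reversed(history):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Anything that falls outside the window is simply gone as far as the model is concerned, which is where the lost info and odd results come from.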

        • perviouslyiner@lemmy.world
          English
          arrow-up
          11
          ·
          7 days ago

          presumably this is why Claude periodically writes its conclusions so far into a text file that it can read later instead of having to remember everything. Sounds like an interesting approach.

      • BlackLaZoR@lemmy.world
        English
        arrow-up
        2
        ·
        6 days ago

        > does it give the full history to the LLM each time?

        It’s limited to the context size supported by given model. You can give the model 100k tokens of history but if it’s configured for less, it will just truncate it before processing (usually by removing oldest tokens first)

        • boonhet@sopuli.xyz
          English
          arrow-up
          2
          ·
          5 days ago

          Good news for tokenmaxxers: frontier models now have 1M context

          Easy way to blow through ridiculous amounts of tokens.

  • floquant@lemmy.dbzer0.com
    English
    arrow-up
    42
    arrow-down
    1
    ·
    7 days ago

    > The more recent report says corporate AI adoption has found several issues with AI, with human workers turning to automating dreary and mundane tasks they don’t like doing, rather than valuable or meaningful work.

    Thank god we have consulting companies to tell us what humans like!

  • Jarix@lemmy.world
    English
    arrow-up
    42
    arrow-down
    1
    ·
    edit-2
    7 days ago

    I just want to know what the best things are to type into these AI chat boxes that will cost the most. If my company wants me to use this garbage, then I want to make it as expensive as possible, and when their licenses need to be repurchased I want it to be as expensive as possible to continue to force this garbage on us.

    Edit: Hey everyone, lots of great replies here. Please keep the suggestions, fixes, corrections etc. coming!

    • FauxLiving@lemmy.world
      English
      arrow-up
      41
      ·
      7 days ago

      These high prices are not from people talking to chatbots.

      They’re using agentic tools where their prompt spawns a lot of bots which talk to themselves/the other bots and they keep going until someone (usually a higher quality reasoning model) decides that they’ve met the goals of the task that they were assigned.

      So instead of 1 prompt and 1 response, you get 1 prompt and 800 responses across 5 different bots each using really large context windows.
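
To put rough numbers on that, here is a sketch that assumes each agent resends its own growing context on every step. The function name and all the parameter values are illustrative assumptions, not measurements from any real tool.

```python
def agent_run_tokens(agents: int, steps_per_agent: int,
                     base_context: int, tokens_per_step: int) -> dict:
    """Estimate billed tokens when each agent step resends its growing context."""
    input_tokens = 0
    output_tokens = 0
    for _agent in range(agents):
        context = base_context                 # task description, files, tool output
        for _step in range(steps_per_agent):
            input_tokens += context            # the whole context is billed as input
            output_tokens += tokens_per_step   # the step's response is billed as output
            context += tokens_per_step         # and then becomes part of the context
    return {
        "requests": agents * steps_per_agent,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
    }


if __name__ == "__main__":
    # 5 bots x 160 steps = 800 responses, each bot starting from 20k tokens of context.
    print(agent_run_tokens(agents=5, steps_per_agent=160,
                           base_context=20_000, tokens_per_step=500))
```

Under these assumptions a single prompt turns into 800 requests and tens of millions of input tokens, even though the responses themselves only add up to a few hundred thousand tokens.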

      • Typhoon@lemmy.ca
        English
        arrow-up
        3
        arrow-down
        1
        ·
        7 days ago

        So to answer his question how do you make that happen? What do you ask to prompt these bots to be spawned?

        • webpack@ani.social
          English
          arrow-up
          8
          ·
          7 days ago

          you don’t get this to happen by just talking to any chatbot and asking for agents. you have to specifically use “agentic” tools (usually costs money to use)

          • Jarix@lemmy.world
            English
            arrow-up
            2
            arrow-down
            1
            ·
            7 days ago

            Well I can apparently create agents so how can I make the most inefficient agent possible?

            • kiagam@lemmy.world
              English
              arrow-up
              13
              ·
              7 days ago

              Something along the lines of “Read the Wikipedia page of the day. Verify every single link and the context matter against all files in this computer. Then trace their correlations to each other, showing which link corresponded to most files by subject matter. After that is done, verify your work by doing the same from different starting points. We expect similar results. After 100 rounds of that, it should be good. Then you should create a DB to store all that data (only after you ran the full 100 verifications yourself) and reverify every field against the pages and the files.”

              That should keep it going for an hour. Turn on fast mode and auto mode (if using Claude) for extra costs.

              Every page and file will increase its context, burning tokens.

            • Bassman1805@lemmy.world
              English
              arrow-up
              7
              ·
              7 days ago

              If you need some plausible deniability about it being real work and not just obviously you running up costs:

              Feed it a bunch of work-related documentation and then have it do a bunch of reviews of the content on that documentation.

            • webpack@ani.social
              English
              arrow-up
              1
              arrow-down
              3
              ·
              7 days ago

              you would be mostly burning your own money if you did this, so I wouldn’t recommend it (depends on how the agent is priced)

      • AeonFelis@lemmy.world
        English
        arrow-up
        11
        ·
        7 days ago

        Input tokens are cheap. Output tokens are the thing that really costs money. There is a Claude extension called caveman that tries to save tokens by making it use shorter sentences with fewer words. So if you want to waste money, do the opposite: ask it to use lengthy sentences with as many words as possible.

        Also - some words amount to multiple tokens. I don’t know what the rules are exactly, but I’m assuming that more complex and uncommon words are worth more tokens - and thus waste more money.

        • Nighed@feddit.uk
          English
          arrow-up
          1
          ·
          6 days ago

          There have been some complaints I have seen recently that German is really expensive because of the long words

          • luciferofastora@feddit.org
            English
            arrow-up
            2
            ·
            6 days ago

            Sprachkomplexitätsbedingte Textausgabenpreisgestaltung?

            (“Language-complexity-dependent text-output-pricing-model”)

  • teft@piefed.social
    English
    arrow-up
    21
    ·
    7 days ago

    Most companies can’t eat a half billion dollar loss so who ends up paying this? AI queries burn actual energy so the AI company would have to charge I would think.

  • eleitl@lemmy.zip
    English
    arrow-up
    22
    ·
    7 days ago

    Big companies license Copilot for less than 25 usd/month per seat. Don’t tell me it covers the ops cost, even for mixed calc.

      • sylver_dragon@lemmy.world
        English
        arrow-up
        11
        ·
        7 days ago

        My company has let us get Enterprise Copilot and has been pushing us all to “use AI”. So, I now use it as a semi-functional search for SharePoint/Outlook/Files on an almost daily basis. I also ask it questions about Microsoft documentation on a regular basis. I wonder how long it’s going to be until they yank my license.

      • eleitl@lemmy.zip
        English
        arrow-up
        2
        ·
        6 days ago

        I know I’ve been burning a lot of resources in the last week, down to the system running out of tokens. Think deeply mode, with each step taking up to 10 minutes. June will be much the same. Definitely worth it, in my current use case, but just how many expensive iron-minutes have I burned? I would sure like to know.

      • rumba@lemmy.zip
        English
        arrow-up
        3
        ·
        7 days ago

        Yeah, my O365 shoves it in the users’ faces. They ask it for some stuff until the honeymoon is over, then all of a sudden it shits the bed when you try to make more than 2 images a day or plan a small project.

      • eleitl@lemmy.zip
        English
        arrow-up
        1
        ·
        6 days ago

        Think deeper in Copilot is apparently o3-mini, or it used to be until recently. GPT 5.5 (and some other versions) are also an option, so I should try using these next week.

        • BlackLaZoR@lemmy.world
          English
          arrow-up
          2
          ·
          6 days ago

          If you have a high VRAM GPU (like 16GB or more) you can try running some model locally with LMStudio. You won’t get the same level as cloud models but depending on use case it might be good enough.

          • eleitl@lemmy.zip
            English
            arrow-up
            1
            ·
            6 days ago

            I only have a Radeon RX Vega 56 8GB HBM2, so limited to CPU-only. 64 GB memory there at least. Not upgrading unless hardware dies.

            • BlackLaZoR@lemmy.world
              link
              fedilink
              English
              arrow-up
              3
              ·
              6 days ago

              > RX Vega 56 8GB HBM2, 64GB RAM

              That’s poor but not horrible. LMStudio has a Vulkan compute backend option that works on almost any card. You can also do a partial GPU/RAM split, giving you a total of 72GB and increased performance vs CPU only.

              • eleitl@lemmy.zip
                English
                arrow-up
                1
                ·
                6 days ago

                Hmm, have to try it sometime when the hot phase at work is over. Thanks!

  • Rhaedas@fedia.io
    arrow-up
    15
    ·
    7 days ago

    Either I have some inside knowledge of that exact thing happening and I know the company (not saying who), or this is probably a common thing that has happened at a lot of major companies (more likely). To be fair, I’m not privy to how far it went or how much it cost before they realized the problem, and it may not have been this much. Which further suggests it’s a thing everywhere.