• Th4tGuyII@fedia.io
    arrow-up
    67
    arrow-down
    1
    ·
    7 days ago

    Neural networking has so much potential in so many places, yet of course the industry collectively zoomed in on LLMs specifically and is trying to sell them as a panacea to the world’s problems.

    As though a mechanical parrot knows anything about good coding practices, or literally anything outside of mimicking speech patterns.

      • Th4tGuyII@fedia.io
        arrow-up
        12
        ·
        6 days ago

        My theory is it’s because LLMs could suck up directly to the C-suite.

        FTFY

        I hate to admit it, but you could very well be onto something haha

    • Zos_Kia@jlai.lu
      arrow-up
      5
      arrow-down
      1
      ·
      6 days ago

      The reason labs focus on LLMs is that language is a great substrate for generalization. Good luck trying to one-shot out-of-distribution problems using classic neural networks. They’ve tried for decades to make it happen, but LLMs surpassed those results in a few years.

      • placebo@lemmy.zip
        English
        arrow-up
        6
        ·
        6 days ago

        Idk. LLMs don’t seem like a good solution because of how many resources they need to train and run compared to specialized models.

        • Zos_Kia@jlai.lu
          arrow-up
          1
          ·
          6 days ago

          I know it’s in bad taste to quote myself, but I wrote an explanation of why this isn’t necessarily a bad solution here

      • Th4tGuyII@fedia.io
        arrow-up
        3
        ·
        6 days ago

        I understand that idea, but at the same time @[email protected] has a point.

        There’s a good reason why you generally don’t get a CPU to do graphics and why FPGAs are usually only put on dev units.

        Specialist hardware is generally much more cost- and energy-efficient than generalist hardware for a given task.

        And I imagine that must be true for neural networks too, as that layer of language processing on top of any task naturally can’t be as efficient/performant as specialist software/networks made for the job.

        • Zos_Kia@jlai.lu
          arrow-up
          5
          ·
          6 days ago

          And I imagine that must be true for neural networks too, as that layer of language processing on top of any task naturally can’t be as efficient/performant as specialist software/networks made for the job.

          Oh yeah definitely, a specialized model for each task would be more efficient on the inference side, but can you imagine the cost of training a million specialized models? For example, think of how natural language processing was done before: one model for sentiment analysis, one model for chronological analysis, one model for identifying legal terms, etc. Need to classify color descriptions in natural language? Well, there you go, train another model. A (comparatively) small model, but also one you’ll have to re-train if you want to change the task even slightly.

          An LLM has the advantage of being able to generalize across a lot of different tasks with the same model, including some that are wildly out of distribution (meaning you hadn’t even thought of them and they are not explicitly stated in the training data). So yeah, you pay a big training tax to train one large model, but then it pays off because that same model can perform on a million different tasks.

          At least that’s the thesis. I’m not qualified to judge whether it is proving worth it, but that’s the reason why the industry massively shifted towards LLMs.
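
          To make the "training tax" argument a bit more concrete, here is a rough sketch of the trade-off in Python. Every name and number in it is hypothetical (arbitrary cost units, not measurements of any real model). It only illustrates the amortization logic: one big training cost spread over many tasks, against a small training cost paid once per task plus cheaper inference.

```python
import math


def specialized_cost(n_tasks, train_small, infer_small, queries_per_task):
    """Total cost of training and running one small model per task."""
    return n_tasks * (train_small + queries_per_task * infer_small)


def general_cost(n_tasks, train_large, infer_large, queries_per_task):
    """Total cost of training one large model once and reusing it for every task."""
    return train_large + n_tasks * queries_per_task * infer_large


def break_even_tasks(train_small, infer_small, train_large, infer_large,
                     queries_per_task):
    """Smallest task count at which the general model is no more expensive.

    Returns None when the general model never catches up, i.e. when its
    extra inference cost per task outweighs the training it saves.
    """
    saving_per_task = train_small + queries_per_task * (infer_small - infer_large)
    if saving_per_task <= 0:
        return None
    return math.ceil(train_large / saving_per_task)


# Hypothetical units: small model costs 100 to train and 1 per query,
# large model costs 1_000_000 to train and 5 per query.
few_queries = break_even_tasks(100, 1, 1_000_000, 5, queries_per_task=10)
many_queries = break_even_tasks(100, 1, 1_000_000, 5, queries_per_task=1000)
print(few_queries, many_queries)
```

          With these made-up numbers the general model only breaks even after many thousands of lightly used tasks, and never breaks even when each task is queried heavily, so under these assumptions both sides of the thread can be right depending on the workload.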