

Hopefully it was a symbolic downvote: they say they downvoted only to provoke, but in reality they upvoted.
Good analogy, as most people don’t understand how a microwave works either.
That being said, at least microwaving isn’t on a fast track to polluting our entire ecosystem, so…
Hand them a mirror.
Both are probably wrong, so it would be nice to have data instead. Here in Belgium, watching postal workers’ deliveries or the bins on recycling day, I unfortunately see a lot of Amazon parcels. Your observation is not wrong, neither is mine, so the question is rather how relevant they are when scaled to all of Europe.
The biggest flaw in this study is that the LLM group wasn’t allowed to edit their essays
I didn’t read the whole thing, I only skimmed through the protocol. All I spotted was:
“participants were instructed to pick a topic among the proposed prompts, and then to produce an essay based on the topic’s assignment within a 20 minutes time limit. Depending on the participant’s group assignment, the participants received additional instructions to follow: those in the LLM group (Group 1) were restricted to using only ChatGPT, and explicitly prohibited from visiting any websites or other LLM bots. The ChatGPT account was provided to them. They were instructed not to change any settings or delete any conversations.”
which I don’t interpret as a prohibition on editing. Can you please share where you found that?
Very interesting, emphasis mine:
"findings support the view that external support tools restructure not only task performance but also the underlying cognitive architecture. The Brain-only group leveraged broad, distributed neural networks for internally generated content; the Search Engine group relied on hybrid strategies of visual information management and regulatory control; and the LLM group optimized for procedural integration of AI-generated suggestions.
These distinctions carry significant implications for cognitive load theory, the extended mind hypothesis [102], and educational practice. As reliance on AI tools increases, careful attention must be paid to how such systems affect neurocognitive development, especially the potential trade-offs between external support and internal synthesis."
The focus on agency and ownership is also very interesting, namely: regardless of the scored outcome, or of how one might think the work itself changed them (or not), do they themselves feel it is their work?
2, 3 and 4 are also about politics.
I for one knew it, and yet I enjoy, in a very tragic way, discovering that she was actually even worse than I thought.
I’m playing games at home. I’m running models at home (I linked to it in other similar answers) for benchmarking.
My point is that models are just like anything else I bring into my home: I try to only buy products that are manufactured properly. Someone else in this thread asked me about child labor for electronics and IMHO that was actually a good analogy. You here mention buying a microwave, and that’s another good example.
Yes, if we do want to establish feedback in the supply chain, we must know how everything we rely on is made. It’s that simple.
There are already quite a few initiatives for that, e.g. Fair Trade Certification for coffee, ISO 14001, Fair Materials in electronics, etc.
The point being that there are already mechanisms for feedback in other fields, and in ML there are already model cards with a co2_eq_emissions field, so why couldn’t feedback also work in this field?
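For what it’s worth, here is a minimal sketch of how one could read that field programmatically, assuming the huggingface_hub Python library and a purely hypothetical repo id; many cards simply don’t declare the field at all:

    # Minimal sketch: read the co2_eq_emissions field from a model card.
    # "some-org/some-model" is a hypothetical repo id used for illustration.
    from huggingface_hub import ModelCard

    card = ModelCard.load("some-org/some-model")
    emissions = card.data.to_dict().get("co2_eq_emissions")
    if emissions is None:
        print("No co2_eq_emissions declared in this model card")
    else:
        print(f"Declared training emissions: {emissions}")

If enough people checked, and preferred, models that declare this, that would be exactly the kind of feedback loop I’m talking about.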
Moore’s law is kinda still in effect, depending on your definition of Moore’s law.
Sounds like the goal post is moving faster than the number of transistors in an integrated circuit.
LOL… you did make me chuckle.
Haven’t we been “18 months away” from developers being replaced by AI… for a few years now?
Of course “AI”, even loosely defined, has progressed a lot and it is genuinely impressive (even though the actual use case for most of the hype, i.e. LLMs and GenAI, is mostly lazier search, more efficient personalized spam & scam text, or impersonation), but exponential growth is not sustainable. It’s a marketing term to keep fueling the hype.
That’s despite so many resources, namely R&D and data centers, being poured in… and yet there is no “GPT5” or anything that most people use on a daily basis for anything “productive”, except unreliable summarization or STT (both of which have had plenty of tools for decades).
So… yeah, it’s a slow takeoff, as expected. shrug
That’s been addressed a few times already, so I’ll let you check the history if you are actually curious.
No one is saying training costs are negligible.
It’s literally what the person I initially asked said, though: they said they don’t know and don’t care.
Yes indeed, yet my point is that we keep training models TODAY, so if we keep not caring, then we just postpone the same problem, cf https://lemmy.world/post/30563785/17400518
Basically yes, use a trained model today if you want, but if we don’t set a trend then, despite the undeniable ecological impact, there will be no corrective measure.
It’s not enough to just say “Oh well, it used a ton of energy. We MUST use it now.”
Anyway, my overall point was that training takes a ton of energy. I’m not asking you or OP or anyone else NOT to use such models. I’m solely pointing out that doing so without understanding the process that led to such models, including but not limited to the energy used for training, is naive at best.
Edit: it’s also important to point out alternatives that are not models, namely the plenty of specialized tools that are MORE efficient AND accurate today. So even if the model took a ton of energy to train, in such cases it’s still not rational to use it; that training energy is a sunk cost.
all of the best programmers and IT people smoke in their off time.
Bit much… probably a lot of the best but definitely not all.
Anyway, yes, sorry for being finicky, but also note that those same people can probably find another workplace which does not care about that AND pays more.
Indeed, the argument is mostly for future usage and future models. The overall point being that assuming training costs are negligible is either naive or showing that one does not care much for the environment.
From a business perspective, if I’m Microsoft or OpenAI and I see a trend toward prioritizing models that minimize training costs, or even users avoiding costly-to-train models, I will adapt to it. On the other hand, if I see that nobody cares about that, or that even building more data centers drives the value up, I will build bigger models regardless of usage or energy cost.
The point is that training is expensive and that pointing only to inference is like the Titanic going full speed ahead toward the iceberg saying how small it is. It is not small.
Right, and that’s just for inference I imagine, not training. Still, honestly, that’s exactly the kind of experimentation most people are not doing, and thus they don’t realize the impact of a simple click on a slick button to get an image out.
They click, get the cute image, move on.
You, on the other hand, like me and others, do have first-hand experience of SOME of the energy cost… and it is scary to imagine this at scale.
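If anyone wants to get that first-hand experience themselves, here is a rough sketch using NVIDIA’s pynvml bindings (assuming an NVIDIA GPU; sampling instantaneous board power once per second is a crude proxy, not a proper energy audit):

    # Rough sketch: sample GPU board power while a local model is generating.
    # Crude proxy only (average power * time), not a full energy audit.
    import time
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    samples = []
    for _ in range(60):  # sample roughly one minute of generation
        samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000)  # mW -> W
        time.sleep(1)

    avg_w = sum(samples) / len(samples)
    print(f"Average draw: {avg_w:.0f} W, ~{avg_w / 60:.2f} Wh over that minute")

    pynvml.nvmlShutdown()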
Right, my point is exactly that though: OP, having just downloaded it, might not realize the training costs. They might be low, but on average they are quite high, at least relative to fine-tuning or inference. So my question was precisely to highlight that running locally while not knowing the training cost is naive, ecologically speaking. They did clarify that they do not care, so that’s coherent for them. I’m insisting on this point because maybe others would think “Oh… I can run a model locally, then it’s not <<evil>>”, so I’m trying to clarify (and please let me know if I’m wrong) that it is good for privacy, but the upfront training costs are not insignificant and might lead some people to prefer NOT relying on very costly-to-train models and to prefer others, or even a totally different solution.
I have a page documenting what I tried locally (if you are curious check https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence ) and I even try models on my GPU or dedicated hardware (e.g. an OAK-D Lite), so yes, I do believe self-hosting is interesting along some dimensions, in particular privacy and sovereignty, but my question was about the energy cost.
My understanding is that training is orders of magnitude more expensive than inference, regardless of where it’s run: remotely, self-hosted, or locally. Am I way off base?
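To make the comparison concrete, a trivial back-of-envelope sketch; both numbers are illustrative placeholders, not measurements, so plug in figures from a model card or a published estimate:

    # Back-of-envelope: queries needed before cumulative inference energy
    # matches the one-off training energy. Placeholders, NOT real data.
    TRAINING_ENERGY_KWH = 1_000_000   # assumed one-off training cost
    ENERGY_PER_QUERY_KWH = 0.001      # assumed cost of a single query

    break_even = TRAINING_ENERGY_KWH / ENERGY_PER_QUERY_KWH
    print(f"Training energy is only amortized after ~{break_even:,.0f} queries")

Under those placeholder assumptions, the one-off training bill dominates until usage is enormous, which is why I don’t think it can be waved away.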
Which… takes at most a minute to do.