2
YourMom
2d

A Nature Methods paper found that even containerized ML code can produce different results on different hardware. Same code + same data + different GPU = different outcome.

They're seeing this even with containers, because the hardware, drivers, and libraries on the actual machine still differ.

I see people arguing over determinism in code gen. I wonder how much hardware matters.
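A lot of the hardware sensitivity comes down to floating-point math: addition isn't associative, so a different reduction order (which is exactly what a different GPU, driver, or parallel kernel gives you) changes the low bits of a result. A minimal sketch in plain Python (the values here are made up for illustration):

```python
# Floating-point addition is not associative, so the order in which a
# sum is reduced changes the low bits of the result. Different hardware
# (and different parallel kernels) can reduce in different orders.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)                  # 0.6000000000000001
print(a + (b + c))                  # 0.6
print((a + b) + c == a + (b + c))  # False

# The same effect at scale: summing identical values in two orders.
import random
random.seed(0)
values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]
forward = sum(values)
backward = sum(reversed(values))
print(abs(forward - backward))  # tiny, but typically nonzero
```

Tiny per-operation differences like this get amplified over millions of training steps, which is one way "same code, same data, different GPU" drifts apart.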

Comments
  • 1
    I am trying to see the point of your post.

    But if the summary is that they don't know what they're doing, that's a fact. LLMs do not deliver guaranteed results at all. Even after training, it's still a black box. So, yeah, fine for a nerd to use for some mayhem coding. But if you rely on one to communicate with customers or whatever, they're still not reliable enough. At this point, while I love all the LLMs, I doubt the AI factor. Is this really AI? Who defines it!?

    Edit: haha, I swear, the AGI stuff is bullshit too; it won't happen. The only way it would happen is if they actually invented it. Not gonna happen.
  • 1
    @retoor people doing AI research of some sort spent years thinking they had controlled the variables via containers. It's turning out the underlying hardware/software affects the results. That could invalidate a lot of research.

    I think it's interesting in a growing-pains sort of way. I wonder if it will result in some sort of compute standardization. But imagine you put up a paper on some discovery, only to find it's a bullshit hardware side effect. Frustrating, and years of work lost. I'd be pissed as a researcher.
  • 2
    The importance of this depends on what "ML code" is.

    If it's an already-trained model with the temperature held constant, this is slightly interesting, because AFAIK an LLM run at temperature 0 / a constant temp should produce the same results. It's just a program reading the weights of the model.

    If it's a container that is training the model twice and comparing results, this is painfully obvious. There's a lot of stochasticity in model training.
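    To illustrate the trained-model case, here's a toy sketch of why greedy (temperature → 0) decoding is deterministic in principle while sampling at temperature > 0 isn't. The `softmax` helper and the logits are made up for illustration, not any real LLM API:

```python
# Toy sketch (not a real LLM API): greedy decoding vs sampling.
import math
import random

def softmax(logits, temperature=1.0):
    # Standard temperature-scaled softmax over a list of floats.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # pretend model output for one decoding step

# Temperature -> 0 collapses to argmax: same weights + same input
# always yield the same token. Deterministic in principle.
greedy_token = logits.index(max(logits))

# Temperature > 0 samples from the distribution: the chosen token
# now depends on RNG state, so separate runs can diverge.
probs = softmax(logits, temperature=1.0)
sampled_token = random.choices(range(len(logits)), weights=probs)[0]
print(greedy_token, probs, sampled_token)
```

    The "in principle" matters: even at temperature 0, non-deterministic GPU kernels can perturb the logits enough to flip an argmax near a tie.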
  • 0
    @AlgoRythm so are there differences when training the same model on the same setup?
  • 0
    is this the one you're referencing?

    https://onlinelibrary.wiley.com/doi...

    "Various studies have demonstrated that hardware differences, such as different GPUs or CPUs, and compiler settings can lead to different computational outcomes (Hong et al. 2013). Additionally, a comparison between the same ML algorithm with fixed random seeds executed using PyTorch and TensorFlow resulted in different performances "
  • 1
    @YourMom of course; models are generally initialized with random weights, not zeros, because if every weight starts at the same value the neurons all compute the same thing and receive the same updates, so they never differentiate. Random weights break that symmetry and get smoothed toward the correct values as you optimize for your loss function.
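    A toy sketch of that symmetry problem. The two single-weight "neurons" and the squared-output loss are made up for illustration; the point is that identical starting weights receive identical gradients:

```python
# Toy sketch: with identical (zero) starting weights, both "neurons"
# get identical gradients and stay identical; random init breaks the tie.
import random
random.seed(0)

def step(weights, x, lr=0.1):
    # Toy loss: sum of squared outputs, out_i = w_i * x.
    # Gradient for each weight: d(loss)/d(w_i) = 2 * (w_i * x) * x.
    return [w - lr * 2 * (w * x) * x for w in weights]

x = 1.0
zero_init = [0.0, 0.0]
rand_init = [random.uniform(-0.5, 0.5) for _ in range(2)]

for _ in range(5):
    zero_init = step(zero_init, x)
    rand_init = step(rand_init, x)

print(zero_init)  # [0.0, 0.0] -- the two neurons never differentiate
print(rand_init)  # two distinct values
```

    That random init is also one of the sources of run-to-run variation in training unless the seed (and everything downstream of it) is pinned.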
  • 0
    @qwwerty the most boring thing of all time is a paper comparing the training of 2 copies of the same model using 2 different implementations of ML. Yawnnn