12bitfloat

1y

Is writing hand-optimized SIMD code even still worth it? I'm thinking about writing my own little math library for my game engine, but I tried writing a hand-optimized `dot(normalize(b - a), foo) >= bar` and somehow it's actually slower than writing the same thing with a math lib that is implemented exclusively with scalar math and auto-vectorized by LLVM
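For reference, here is a minimal sketch of what the plain scalar version of that test could look like in Rust. The names `Vec3` and `in_cone` are made up for illustration, and `foo` and `bar` are just the placeholders from the expression above; this is not taken from any particular crate.

```rust
/// Hypothetical scalar 3D vector; any vectorization is left entirely to LLVM.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

impl Vec3 {
    fn new(x: f32, y: f32, z: f32) -> Self {
        Vec3 { x, y, z }
    }

    fn sub(self, o: Vec3) -> Vec3 {
        Vec3::new(self.x - o.x, self.y - o.y, self.z - o.z)
    }

    fn dot(self, o: Vec3) -> f32 {
        self.x * o.x + self.y * o.y + self.z * o.z
    }

    /// Scales the vector to unit length.
    /// A zero-length input produces NaN components.
    fn normalize(self) -> Vec3 {
        let inv_len = 1.0 / self.dot(self).sqrt();
        Vec3::new(self.x * inv_len, self.y * inv_len, self.z * inv_len)
    }
}

/// The expression from the rant: dot(normalize(b - a), foo) >= bar.
fn in_cone(a: Vec3, b: Vec3, foo: Vec3, bar: f32) -> bool {
    b.sub(a).normalize().dot(foo) >= bar
}
```

Calling `in_cone(a, b, foo, bar)` with `a == b` returns `false`, because the NaN from `normalize` fails the comparison.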

LLVM... I kneel

rant

gamedev


Comments

3

Liebranca

1255

1y

Using Agner Fog's VCL *and* compiling with an LLVM-based compiler should theoretically give the best results.

Plot twist: read the disassembly this would generate and learn the optimizations by heart; you can then write perfect SIMD fuckery by hand. But who's got time for that?
4

Lensflare

21379

1y

Nothing is worth hand-optimizing if there are already optimized system libs for it.
1

12bitfloat

10916

1y

@Lensflare fair enough, but I'm not happy with the linear algebra crates in Rust so I wanna do it my own way (and just steal all the actual maths from JOML lol)
2

12bitfloat

10916

1y

Soooooo yeah... Uhhh turns out auto-vectorization kinda sucks ass actually

I didn't notice much difference in my initial tests because, apparently, multiple rustflags fields in the cargo config override each other (which is fun), so my build wasn't actually compiled with the proper SIMD features
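One way to catch this kind of misconfiguration early is to have the program report which SIMD features it was actually compiled with. The sketch below is only an illustration: the function name `compiled_simd_features` and the list of features checked are my own choices, and it looks at compile-time flags only, not at what the CPU supports at runtime.

```rust
/// Returns the subset of a few x86 SIMD features that were enabled
/// at compile time (e.g. through rustflags such as -C target-feature
/// or -C target-cpu). On non-x86 targets the list is simply empty.
fn compiled_simd_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // cfg! evaluates to a constant `true`/`false` at compile time.
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "sse4.1") {
        enabled.push("sse4.1");
    }
    if cfg!(target_feature = "avx") {
        enabled.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    if cfg!(target_feature = "fma") {
        enabled.push("fma");
    }
    enabled
}
```

Printing this list once at startup (or in a benchmark harness) makes it obvious when the flags you thought you set never reached the compiler.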

Here's the auto-vectorized assembly:
3

12bitfloat

10916

1y

And here's the high-level, hand-optimized version:

I.e. this isn't even a fully specialized SIMD thing; I've just written a Vec struct with optimized dot and normalize methods... Kinda stark difference
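To give an idea of what "a Vec struct with optimized dot and normalize methods" can mean, here is a rough sketch using SSE intrinsics from `std::arch`. It is not the code from this thread: the type `SimdVec3`, its padded four-lane layout, and the helper `scale` are assumptions made for illustration, and a scalar fallback is included so the sketch also builds on non-x86_64 targets.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical vector stored as four f32 lanes: [x, y, z, 0].
/// The fourth lane is padding and is kept at zero.
#[derive(Clone, Copy, Debug)]
struct SimdVec3([f32; 4]);

impl SimdVec3 {
    fn new(x: f32, y: f32, z: f32) -> Self {
        SimdVec3([x, y, z, 0.0])
    }

    /// Dot product with one multiply and a two-step horizontal add.
    #[cfg(target_arch = "x86_64")]
    fn dot(self, o: Self) -> f32 {
        // SSE is part of the x86_64 baseline, so these intrinsics are
        // always available on this target.
        unsafe {
            let a = _mm_loadu_ps(self.0.as_ptr());
            let b = _mm_loadu_ps(o.0.as_ptr());
            let m = _mm_mul_ps(a, b); // [m0, m1, m2, m3]
            let hi = _mm_movehl_ps(m, m); // [m2, m3, m2, m3]
            let s = _mm_add_ps(m, hi); // lane 0: m0+m2, lane 1: m1+m3
            let lane1 = _mm_shuffle_ps::<0b01>(s, s); // move lane 1 into lane 0
            _mm_cvtss_f32(_mm_add_ss(s, lane1))
        }
    }

    /// Scalar fallback for other architectures.
    #[cfg(not(target_arch = "x86_64"))]
    fn dot(self, o: Self) -> f32 {
        self.0.iter().zip(o.0.iter()).map(|(a, b)| a * b).sum()
    }

    /// Multiplies all lanes by `k` (hypothetical helper).
    #[cfg(target_arch = "x86_64")]
    fn scale(self, k: f32) -> Self {
        unsafe {
            let v = _mm_loadu_ps(self.0.as_ptr());
            let mut out = [0.0f32; 4];
            _mm_storeu_ps(out.as_mut_ptr(), _mm_mul_ps(v, _mm_set1_ps(k)));
            SimdVec3(out)
        }
    }

    #[cfg(not(target_arch = "x86_64"))]
    fn scale(self, k: f32) -> Self {
        SimdVec3(self.0.map(|v| v * k))
    }

    /// Scales to unit length; a zero-length input produces NaN lanes.
    fn normalize(self) -> Self {
        self.scale(1.0 / self.dot(self).sqrt())
    }
}
```

Whether something like this beats the auto-vectorized scalar code depends on the target features and on how the calls inline, so it is worth comparing the generated assembly for both before committing to either.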


devRant © 2021 Hexical Labs LLC