ml - Currently working on: Conversion of existing models to full closed form fusion representations. Built a toy example wit

Ranter

Wisecrack

9525

Comments

4

YourMom

1841

20d

Fuck, I need a job now. lol

Great job bro!
3

Wisecrack

9525

20d

@YourMom "Fuck, I need a job now. lol"

You're speaking the truth!

I've had to do a bunch of shit side gigs to raise any money at all for this and am still doing as much because no one would fund it or give me the time of day.

They're all gonna wish they had.
2

YourMom

1841

20d

Honestly, if anyone is going to figure out AGI, it will be @Wisecrack.
5

retoor

2070

20d

That's amazing stuff!

By the way, to keep up with mentions of you on dR, please consider using the tool that @SoldierOfCode wrote. It is a notification system that works perfectly (at least on Linux; he was still working on Windows support). But you're probably too cool for Windows anyway.

Otherwise, consider this HTML file, which is updated every 5 minutes: https://static.molodetz.nl/dr.menti...

If you are using an iPhone, consider using joyRant from @lensflare, which still has a working mention system. The notifications are just broken.

All these systems use a different way of noticing mentions than the broken devRant notification system.

I'm telling you now because you're not a super frequent user anymore, and this way you can keep track :) Friendly advice.
2

Wisecrack

9525

20d

@YourMom one can dream. I could explain it, or post the sketch proof of how an exact replication of the output of an amortized beam search, through reuse of a common metric, gives us a much more generalist framework. But in the first case it just sounds like handwavium, and in the second case people's eyes would glaze over, so it's a lose-lose proposition.
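For anyone whose eyes haven't glazed over yet: the sketch below is plain, textbook beam search over a made-up two-step model, only to show the mechanism being talked about. It is not the amortized version and not the method described above; `TOY_MODEL`, `next_token_probs`, and `beam_search` are hypothetical names invented for the illustration.

```python
import math

# Hypothetical toy "model": maps a prefix to next-token probabilities.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_token_probs(prefix):
    """Look up the next-token distribution for a prefix (toy stand-in for a model)."""
    return TOY_MODEL[prefix]


def beam_search(steps, beam_width):
    """Keep the `beam_width` highest-scoring prefixes at every step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            for token, p in next_token_probs(prefix).items():
                candidates.append((prefix + (token,), score + math.log(p)))
        # Sort by score, best first, and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


best_seq, best_score = beam_search(steps=2, beam_width=2)[0]
greedy_seq, greedy_score = beam_search(steps=2, beam_width=1)[0]
print("beam width 2:", best_seq, round(math.exp(best_score), 2))
print("beam width 1:", greedy_seq, round(math.exp(greedy_score), 2))
```

With a beam width of 1 this degenerates into greedy decoding, which commits to "a" and misses the better sequence ("b", "x"); keeping two beams finds it. Every extra beam multiplies the work per step, which is presumably the cost worth amortizing.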
2

Wisecrack

9525

20d

@retoor thank you, and it's good to see you're still around, retoor.

I'm not super frequent anymore because I'm pulling 16-20 hour days in 3-4 day stretches.

No one told me being self-employed was a synonym for torture.

Literally this is the shit on my plate:

- An FHE-based compute platform, with improvements to the underlying math and methods, solving both the clock-reset and re-imaging issues that allow piracy of some software, and improving the noise propagation issues

- To run unpirateable, SOTA, commercially sized, stateless ML CFF models on a fraction of existing hardware, with built-in limits on inference counts based on licensing

- To implement on closed-platform custom ASICs for secure uses like finance and military

- Actual cryptography improvements after I changed my approach (we're running in the same complexity class as GNFS now) to 1. secure all this against quantum improvements, and 2. win RSA prize money for bootstrapping.

What is this myth called sleep?
2

BordedDev

3300

20d

That's really cool :D

Is it going fast enough that bandwidth is becoming a problem?
2

afaIk

241

20d

Nice. Watch out for the AI bubble, though: it can explode (right inside your head).
1

Wisecrack

9525

20d

@BordedDev it is pretty cool, thanks, and I'm glad you enjoyed it.

Bandwidth isn't a problem yet.

The bigger problem is that, instead of training from scratch as I scale up, I have to transfer existing models, because I don't have the compute to train larger ones at this time.

Which means doing ablations and model surgery if I want very neutral models. And while I'm familiar with fine-tuning, I'm new to ablations.

I'm implementing techniques that normalize scores across tasks in order to stabilize loss across tasks rather than having loss be task-specific, because that makes it easier to study which network components contribute to performance overall versus per task.
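As a rough illustration of that idea (an assumption about one way to do it, not the actual implementation): keep running statistics per task and z-score each loss against its own task, so a task whose losses sit around 200 and a task whose losses sit around 2 land on the same scale. `TaskLossNormalizer` is a hypothetical helper built on Welford's running mean and variance.

```python
import math
from collections import defaultdict


class TaskLossNormalizer:
    """Hypothetical helper: running per-task z-score of loss values."""

    def __init__(self, eps=1e-8):
        self.eps = eps
        self.count = defaultdict(int)
        self.mean = defaultdict(float)
        self.m2 = defaultdict(float)  # running sum of squared deviations

    def update(self, task, loss):
        """Fold one observed loss into the running statistics for `task`."""
        self.count[task] += 1
        delta = loss - self.mean[task]
        self.mean[task] += delta / self.count[task]
        self.m2[task] += delta * (loss - self.mean[task])

    def normalize(self, task, loss):
        """Return the loss as a z-score relative to its own task's history."""
        n = self.count[task]
        if n < 2:
            return 0.0  # not enough history to estimate a spread
        std = math.sqrt(self.m2[task] / (n - 1))
        return (loss - self.mean[task]) / (std + self.eps)


norm = TaskLossNormalizer()
for loss in (1.0, 2.0, 3.0):
    norm.update("small_scale_task", loss)
for loss in (100.0, 200.0, 300.0):
    norm.update("large_scale_task", loss)

# Both tasks report "one standard deviation above their own average".
print(norm.normalize("small_scale_task", 3.0))
print(norm.normalize("large_scale_task", 300.0))
```

Once losses are on a shared scale like this, removing a component and watching the normalized loss move says something about overall contribution rather than about whichever task happens to have the largest raw numbers.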
1

Wisecrack

9525

20d

@afaIk actually a big part of it is democratizing models.

There's a small vocal minority that really fucking hates AI and tries to project its voice to seem larger than it is, but a lot of the criticism coming out of that crowd consists of valid concerns and real worries.

The thinking and approach are twofold:

1. By creating models that are small but can compete with cutting-edge commercial-scale SOTA offerings, anyone can run them on their desktops or laptops, defeating the corporate moat and popping the bubble.

2. At the same time I build highly performant, technically enforced licensing for big black-box models that guarantee non-replicability and are non-transferable outside their environment, with guarantees about cost, compute, memory, and latency, and strong privacy guarantees. Pharmaceutical companies, military, finance, and .gov won't be able to resist.

That second group then loses to the open market through Embrace, Extend, Extinguish.

It's using evil for good.
3

Hazarth

9237

20d

@Wisecrack Man, I hope it's gonna work. One of the worst parts about this whole LLM hype is the amount of resources it requires. The second worst part is that big tech firms are using it to monopolize the market with subpar products that require even more resources and are literally making life miserable for people while also objectively hurting the environment without a second thought.

I really think we fucked up big way with LLMs. The capabilities these models have are nowhere near what they need to be to justify the cost. This is literally a kid's toy that tech giants can't stop playing with, and everyone is paying the price. I'm confident these models can be way more memory and power efficient, and we need to democratize this before we're all running GPU farms at home...
3

Lensflare

22328

20d

To make AGI, you just need to make a time machine. When AGI is invented in the future, it will go back in time and, voila, AGI!

Next step: Regret your decision to invent a time machine.
2

BordedDev

3300

20d

@Wisecrack Hmmm, yeah, that sounds very annoying. Have you looked at what unsloth does to make large models fit in more conventional setups (e.g. qwen-coder 405b can work with "just" an RTX 3090 + 128 GB RAM)? And at the GGUF format for some inspiration?

Also it's really really cool :D
1

Wisecrack

9525

19d

@BordedDev The models I'm building are full closed form fusions.

Based on the metrics I've seen, something comparable to qwen-coder 405b should be able to run on a mid-range laptop with usable latency.

Given the current test runs and the scaling rules from the data so far, a model that normally fits in 128 GB is predicted to fit in as little as 12-18 GB, or less for that matter, with no meaningful loss of accuracy and/or perplexity.
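To put those figures in perspective, here is the arithmetic as a small sketch. The only numbers taken from the claim are 128 GB and 12-18 GB; the 16-bit baseline is purely an assumption for illustration (the post does not say how the original weights are stored), and both function names are hypothetical.

```python
def compression_ratio(original_gb, compressed_gb):
    """How many times smaller the compressed footprint is."""
    return original_gb / compressed_gb


def equivalent_bits_per_weight(original_gb, compressed_gb, baseline_bits=16):
    """Storage budget per original weight, ASSUMING a `baseline_bits` baseline."""
    return baseline_bits * compressed_gb / original_gb


original_gb = 128
for target_gb in (12, 18):
    ratio = compression_ratio(original_gb, target_gb)
    bits = equivalent_bits_per_weight(original_gb, target_gb)
    print(f"{original_gb} GB -> {target_gb} GB: {ratio:.1f}x smaller, "
          f"about {bits:.2f} bits per weight if the baseline is 16-bit")
```

So the claim amounts to roughly a 7x to 11x reduction. Under the 16-bit assumption that is a budget of about 1.5 to 2.25 bits per original weight, which is a useful yardstick even though the approach described here is not presented as quantization.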

I've learned that I have to run bigger tests before making extraordinary claims, which is what I've done. I'm not making the same mistakes I made with the cryptography.

By end of winter I expect to be running an equivalent of GPT-2 in something like half a gig of RAM, because the final architecture doesn't require materializing the weights and biases at all during inference/forward.
2

Wisecrack

9525

19d

@Lensflare The only provable time traveler who ever existed was probably that guy who wrote "I'm my own grandpa."

No AGI yet, but I've got a whole bunch of sub-AGI cool shit planned. For example, Mixture-of-Experts contains a common set of problems that, when reused as a training signal, effectively allows self-supervised learning by virtualizing, and therefore amortizing, the cost of hard negative mining.

A lot of what's planned boils down to virtualization (reproducing some larger compute/memory/latency-heavy mechanism with a single metric) and therefore amortization of some of the better techniques that are otherwise expensive to run.

Got a ton of those for all sorts of things, including the aforementioned hard negative mining, beam search, diffusion, sparsity, matrix dematerialization, etc.
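For context on why hard negative mining is one of the expensive ones: the ordinary brute-force version scores every candidate negative against every anchor and keeps the most confusable ones. The sketch below is that plain version on toy 2-D vectors, not the virtualized or amortized approach described here; `cosine` and `hardest_negatives` are hypothetical names, and the vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hardest_negatives(anchor, negatives, k=2):
    """Return the names of the k negatives most similar to the anchor."""
    ranked = sorted(
        negatives.items(),
        key=lambda item: cosine(anchor, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


anchor = [1.0, 0.0]
negatives = {
    "far": [-1.0, 0.0],
    "near": [0.9, 0.1],
    "mid": [0.5, 0.5],
}
print(hardest_negatives(anchor, negatives, k=2))
```

Every anchor needs a similarity pass over the whole candidate pool, so the cost grows with anchors times candidates. Replacing that pass with a single reusable signal is the kind of saving the paragraphs above are pointing at.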

Labs gonna be getting noticed right out of the gate 1st week of formal launch.

OpenAI and others will be like "hire this guy, hire him!" and I'll say no b/c they said no to me before.
0

Wisecrack

9525

18d

"I really think we fucked up big way with LLMs."

This is a common refrain of people who are aggrieved over some matter or another.

"We let this party or that party get out of hand. We allowed IJK corporation to go too far. We're to blame for [insert issue]."

I don't know, did you allow it? Did I?

I don't think we did.

Most of the crappiness of the world follows from natural and artificial conditions encouraged by organizations of all sorts, to their benefit, and to the detriment of those less organized.

I think it takes a combination of organized effort and scrappy individuals to put a stop to general bad conditions that only benefit a few at the expense of a great many others.

Shit I'm starting to sound like a motherfucking communist.

random

ml

madness