Details
-
About: Busy inventing skynet.
-
Skills: UI Design (3 years), JavaScript, Python, and levels of shitposting that aren't even supposed to be possible.
-
Location: 31.5017° N, 34.4668° E
Joined devRant on 5/5/2019
-
You know the old adage "cost, speed, quality, pick two."
I've come up with my own.
Shitty jobs you'll forever be stuck in with no way out: unreasonable demands and coworkers that drive you insane, pay below the poverty line, no sleep.
Pick two.
More likely, pick three.
-
Some notes from prior to developing my current language model:
https://miro.com/app/board/...
Started with ngrams, moved on from that, and the whole thing got away from me fast.
Working on building and training it on rgb-to-color categorization this week. Experiments designed, just gotta implement it now.
-
Remember my LLM post about 'ephemeral' tokens that aren't visible but change how tokens are generated?
Now GPT has them in the form of 'hidden reasoning' tokens:
https://simonwillison.net/2024/Sep/...
Something I came up with a year prior and put in my new black book, and they just got to the idea a week after I posted it publicly.
Just wanted to brag a bit. Someone at OpenAI has the same general vision I do.
-
So apparently I own land in Dubai. Like, three separate mortgages, based on the email I received.
Your request (Mortgage Registration)
with request number xxxxx / 2024
has been completed
and you can print your issued certificate from this [link]
I've stripped out the numbers and link.
After confirming it was safe I followed through on an old spare cellphone, and yep, I own three mortgages for properties in Dubai.
Except obviously I don't.
Someone used my name, an American's, to register mortgages in Dubai. *Nice* properties, according to the pictures.
What started out as a scam email, or what looked like a scam email, went to an actual government of Dubai website, with real mortgage registrations.
How in the fuck does that happen?
The only thing I can think of is someone committed identity fraud, and/or an alphabet agency went through the list of known political dissidents, set up a bullshit mortgage in a questionable territory, and are now using that as a pretext to monitor 'extremists with foreign ties.'
All that for some guy on the west coast that hasn't attended a political rally in his entire life.
Must have been that sign I held at sixteen years old by the side of the road that said "bush lied us into a war, and people died."
or maybe it was that time I told a really enthusiastic Obama-supporting police officer that it amazed me Obama had time to win the Nobel Peace Prize what with all the bombings he carried out against foreign civilians.
-
After a lot of work I figured out how to build the graph component of my LLM. Figured out the basic architecture, how to connect it in, and how to train it. The design and how-to is 100%.
Ironically generating the embeddings is slower than I expect the training itself to take.
A few extensions of the design will also allow bootstrapped and transfer learning, and as a reach, unsupervised learning but I still need to work out the fine details on that.
Right now, because of the design of the embeddings (different from standard transformers in a key aspect), they're slow. Like 10 tokens per minute on an i5 (Python, no multithreading, no optimization at all, no training on GPU). I've come up with a modification that takes the token embeddings and turns them into hash keys, which should be significantly faster for a variety of reasons. Essentially I generate a tree of all weights, where the parent nodes are the mean of their immediate child nodes, split the tree on lesser-than/greater-than values, and then convert the node values to keys in a hashmap to make lookup very fast.
Weight comparison can be done either directly through tree traversal, or using normalized hamming distance between parent/child weight keys and the lookup weight.
That last bit is already designed and just needs to be implemented, but it is completely doable.
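A minimal sketch of that tree-and-hashmap idea, assuming 1-D weight values (names and structure here are illustrative, not the actual implementation):

import random

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def build_mean_tree(weights):
    # leaves are the sorted weights; each parent holds the mean of its two children
    nodes = [Node(w) for w in sorted(weights)]
    while len(nodes) > 1:
        paired = []
        for i in range(0, len(nodes) - 1, 2):
            l, r = nodes[i], nodes[i + 1]
            paired.append(Node((l.value + r.value) / 2.0, l, r))
        if len(nodes) % 2:
            paired.append(nodes[-1])  # odd node carries up unchanged
        nodes = paired
    return nodes[0]

def nearest_weight(root, query):
    # lesser-than/greater-than descent against each parent's mean; returns an
    # (approximate) nearest canonical weight in O(log n) comparisons
    node = root
    while node.left is not None:
        node = node.left if query <= node.value else node.right
    return node.value

weights = [random.random() for _ in range(1024)]
root = build_mean_tree(weights)
canonical = nearest_weight(root, 0.5)  # snap an embedding value to a canonical weight

From there, rounding the node values into fixed-precision keys for a hashmap is the part that makes lookup constant-time instead of logarithmic.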
The design itself is 100% attention free incidentally.
I'm outlining the step-by-step, only the essentials, to train a word-boundary detector, noun detector, and verb detector, as I'd already considered prior. But now I'm actually able to implement it.
The hard part was figuring out the *graph* part of the model, not the NN part (if you can even call it an NN, which it doesn't fit the definition of, but I don't know what else to call it). Determining what the design would look like, the necessary graph token types, what function they should have, *how* they use the context, how that's calculated, how loss is to be calculated, and how to train it.
I'm happy to report all that is now settled.
I'm hoping to get more work done on it on my day off, but that's seven days away. 9-10 hour shifts, working fucking Burger King, and all I want to do is program.
And all because no one takes me seriously due to not having a degree.
Fucking aye. What is life.
If I had a laptop, and insurance and taxes weren't a thing, I'd go live in my car and code in a fucking McDonald's or a park all day, and not have to give a shit about any of these other externalities, like earning minimum wage to pay 25% of it in rent a month and 20% in taxes and other government bullshit.
-
Heres some research into a new LLM architecture I recently built and have had actual success with.
The idea is simple, you do the standard thing of generating random vectors for your dictionary of tokens, we'll call these numbers your 'weights'. Then, for whatever sentence you want to use as input, you generate a context embedding by looking up those tokens, and putting them into a list.
Next, you do the same for the output you want to map to; let's call it the decoder embedding.
You then loop and generate a 'noise embedding': for each vector or individual token in the context embedding, you subtract that token's noise value from that token's embedding value or specific weight.
You find the weight index in the weight dictionary (one entry per word or token in your token dictionary) that's closest to this embedding. You use a version of cuckoo hashing where similar values are stored near each other, and the canonical weight values are actually the key of each key:value pair in your token dictionary. When doing this you align all the randomly numbered keys in the dictionary (a uniform sample from 0 to 1), and look at the hamming distance between the context embedding plus the noise embedding (together called the encoder embedding) and the canonical keys, with each digit from left to right penalized by some factor f (because digits further left have larger magnitudes), and then penalize or reward based on the numeric closeness of each individual digit of the encoder embedding to the digit at the same index of any given weight i.
You then substitute the canonical weight in place of this encoder embedding, look up that weight's index (in my earliest version), and then use that index to look up the word/token in the token dictionary and compare it to the word at the current index of the training output to match against.
Of course by switching to the hash version the lookup is significantly faster, but I digress.
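As a hedged sketch of just the digit-comparison part (the decay factor f and digit count are made-up parameters):

def keyed_distance(a, b, f=0.5, ndigits=10):
    # compare the fractional digits of two keys drawn uniformly from (0, 1);
    # positions further left carry more weight (f < 1 decays the penalty from
    # left to right), and each position's penalty also scales with how
    # numerically far apart the two digits are
    da = f"{a:.{ndigits}f}".split(".")[1]
    db = f"{b:.{ndigits}f}".split(".")[1]
    return sum((f ** i) * abs(int(x) - int(y)) / 9.0
               for i, (x, y) in enumerate(zip(da, db)))

# pick the canonical key minimizing this distance to the encoder embedding:
# canonical = min(weight_keys, key=lambda k: keyed_distance(k, encoder_value))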
That introduces a problem.
If each input token matches one output token how do we get variable length outputs, how do we do n-to-m mappings of input and output?
One of the things I explored was using pseudo-markovian processes, where there's one node, A, with two links to itself, B and C.
B is a transition matrix, and A holds its own state. At any given timestep, A may use either the default transition matrix (training data encoder embeddings) with B, or it may generate new ones, using C and a context window of A's prior states.
C can be used to modify A, or it can be used to as a noise embedding to modify B.
A can take on the state of both A and C or A and B. In fact we do both, and measure which is closest to the correct output during training.
What this *doesn't* do is give us variable length encodings or decodings.
So I thought a while and said, if we're using noise embeddings, why can't we use multiple?
And if we're doing multiple, what if we used a middle layer, let's call it the 'key', took its mean over *many* training examples, and used it to map from the variance of an input (query) to the variance and mean of a training or inference output (value)?
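One way to read that, as a rough sketch (MomentMap and everything in it is hypothetical, not the pastebin code):

import numpy as np

class MomentMap:
    def fit(self, queries, values):
        # the 'key' is the mean over many training examples
        self.key_mean = np.mean(queries, axis=0)
        self.query_std = np.std(queries, axis=0) + 1e-9
        self.value_mean = np.mean(values, axis=0)
        self.value_std = np.std(values, axis=0) + 1e-9

    def map(self, query):
        # map the variance of the input onto the variance and mean of the output
        return self.value_mean + (query - self.key_mean) / self.query_std * self.value_std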
But how does that tell us when to stop or continue generating tokens for the output?
Posted on pastebin if you want to read the whole thing (DR wouldn't post for some reason).
In any case I wasn't sure if I was dreaming or off in left field, so I went and built the damn thing, the autoencoder part. Wasn't even sure I could, but I did, and it just works. I'm still scratching my head.
https://pastebin.com/xAHRhmfH
-
I wasn't gone, I was just working.
Anyway, I had some fun and wrote a simple 10 minute little precursor to an ngram implementation:
(when your comments are as long as the code, lol)
https://pastebin.com/bZVh8YSP
It obviously doesn't do type checking, or valid-value checking, or any of that, and there may be an old comment or two adding cruft, but whatever.
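For flavor, the core of an n-gram counter is small enough to inline here (a sketch, not the pastebin code):

from collections import Counter

def ngrams(tokens, n=2):
    # slide a window of size n over the token list and count occurrences
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

print(ngrams("the cat sat on the mat".split()))
-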
I want to remake hunger games.
But it's just that last 2nd or 3rd scene where catpiss everclear is ugly-crying and neurotically yelling at her sister's cat, for like two hours straight.
RIP Finnick. The only good character. Otherwise it would have been a completely unwatchable series of movies.
Sometimes I wake up and just choose violence. Fight me.
-
I had the idea that part of the problem in NN and ML research is we all use the same standard loss and nonlinear functions. In theory most NN architectures are universal approximators. But there's a big gap between symbolic and numeric computation.
But some of our bigger leaps in improvement weren't just from new architectures, but from entirely new approaches to how data is transformed and how we calculate loss, for example KL divergence.
And it occurred to me that all we really need is training/test/validation data; with the right approach we can let the system discover the architecture (been done before), but also the nonlinear and loss functions themselves, and see what pops out the other side as a result.
If a network can instrument its own code, as it were, maybe it'd find new and useful nonlinear functions and losses. Networks wouldn't just specify a conv layer here or a maxpool there, but derive implementations of these all on their own.
More importantly, with a little pruning, we could even use successful examples for bootstrapping smaller, more efficient algorithms, all within the graph itself, and use genetic algorithms to mix and match nodes at training time to discover what works or doesn't, or do training, testing, and validation in batches, to anneal a network in the correct direction.
By generating variations of successful nodes and graphs, and using substitution, we can use comparison to minimize error (for some measure of error over accuracy and precision), and select the best graph variations, without strictly having to do much point mutation within any given node, minimizing deleterious effects, sort of like how gene expression leads to unexpected but fitness-improving results for an entire organism, while point-mutations typically cause disease.
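A toy sketch of just the discover-and-winnow loop, with random generation standing in for crossover and subgraph substitution (all names illustrative):

import random, math

PRIMS = [math.tanh, lambda x: max(0.0, x), lambda x: x / (1 + abs(x)), math.sin]

def random_fn():
    # a candidate nonlinearity: a random composition with a random gain
    f, g, a = random.choice(PRIMS), random.choice(PRIMS), random.uniform(0.5, 2.0)
    return lambda x: f(a * g(x))

def val_loss(fn, data):
    # stand-in fitness: squared error of y = fn(x) against targets
    return sum((fn(x) - y) ** 2 for x, y in data) / len(data)

def evolve(data, pop=20, gens=10):
    population = [random_fn() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda fn: val_loss(fn, data))
        # keep the fitter half, refill the rest with fresh candidates
        population = population[:pop // 2] + [random_fn() for _ in range(pop - pop // 2)]
    return population[0]

data = [(x / 10.0, math.tanh(x / 10.0)) for x in range(-30, 31)]
best = evolve(data)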
It might seem like this wouldn't work out of the gate, just on the basis of intuition, but I think the benefit of working through node substitutions, or entire subgraph substitutions, is that we can check test/validation loss before training is even complete.
If we train a network to specify a known loss, we can even have it evaluate the networks themselves, and run variations on our network's loss node to find better losses during training time, and at some point let nodes refer to these same loss-calculation graphs within themselves, switching between them dynamically, via variation and substitution.
I could even envision probabilistic lists of jump addresses, or mappings of value ranges to jump addresses, or having await()-style opcodes on some nodes that, upon being encountered, queue up ticks from upstream nodes whose calculations the await()ed node relies on, to do things like emergent convolution.
I've written all the classes and started on the interpreter itself; just a few things need to be fleshed out now.
Here's my shitty little partial sketch of the opcodes and ideas.
https://pastebin.com/5yDTaApS
I think I'll teach it to do convolution, color recognition, maybe try MNIST, or teach it step by step how to do sequence masking and prediction, dunno yet.
-
pandas can suck my balls.
N
I
H
I'd rather roll my own.
edit: but also xgboost can suck my balls.
Treating every OBVIOUSLY continuous-valued entry as a 'category'.
All searches for this problem turn up tutorials and documentation on how to CONVERT continuous and numeric values into classes or categories.
Not a single fucking document addresses the problem of pandas or xgboost refusing to treat numeric inputs as numeric and insisting on pushing an error that your data is categorical, when every fucking inspection shows the type as numeric.
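For the record, the workaround that usually un-fucks it is forcing real numeric dtypes before anything touches xgboost (file and column names here are hypothetical):

import pandas as pd
import xgboost as xgb

df = pd.read_csv("data.csv")
feature_cols = [c for c in df.columns if c != "target"]
# the usual culprit is a stray object/category dtype; coerce to numeric
df[feature_cols] = df[feature_cols].apply(pd.to_numeric, errors="coerce")
print(df.dtypes)  # verify nothing is left as object or category

dtrain = xgb.DMatrix(df[feature_cols], label=df["target"],
                     enable_categorical=False)  # refuse categorical handling outright
-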
Every devRant rant masterfully summarized in one bible-length, scintillating jewel of a blog post:
https://matt.sh/panic-at-the-job-ma...
Flawless victory.
FINISH HIM.
-
"Man Of Steel"
Many men was I
before I was the man I am.
And each a stranger had I come to know
In a strange land.
In every man there is a nation
of dreams, in every spirit, like a flame
Many visions, many plans.
How do I choose, how must I choose
the destiny of all among the one
united by our pain, and hopes, and blood
and beneath this eternal sun, and the rule of petty men
learn to stand tall, to walk, to run.
If I could stand upon the shoulders
of giants, and stride for but a minute
in great men's shoes, forever and a day
All the world would be mine to gain, mine to lose
And bend the course of mankind
To a better way.
If I could stride full measure in that minute
And take it like the reins
Of a chariot that could cross the sky
What a man they would exclaim, for generations,
fathers unto untold sons.
And speak in solemn words setting new foundations,
a truth greater than any lie
It is every man's will to fight.
And it is to rogues to do and die. -
From my big black book of ML and AI, something I've kept since I was 16, and which has been a continual source of prescient predictions in the machine learning industry:
"Polynomial regression will one day be found to be equivalent to solving for self-attention."
Why run matrix multiplications when you can use the kernel trick and inner products?
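The claim in toy form, a sketch rather than a proof: attention logits are inner products q·k, and a polynomial kernel is those same inner products pushed through the kernel trick, with no lifted feature matrices ever materialized.

import numpy as np

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # queries
K = rng.normal(size=(6, 8))  # keys

logits = Q @ K.T              # standard dot-product attention scores
poly = (Q @ K.T + 1.0) ** 2   # degree-2 polynomial kernel on the same pairs

def softmax(m):
    e = np.exp(m - m.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

print(softmax(logits).shape, softmax(poly).shape)  # both (4, 6)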
Fight me.
-
I'm tired of meth. I mean math. MATH.
I'm sick and tired of everything.
"First!" numerous blog comments shout to no-one, from the colorful abyss of the internet.
And for me, this is a first. But lets rewind.
It's 2 AM, about a month ago, spring in Akron, Ohio. Someone reading this is no doubt shocked: "You just revealed where you live, ON THE INTERNET! The weirdos will find you." Anyway, it's a dark and stormy night, as the cliche goes. Like most people up after midnight, I'm browsing facebook posts and useless productivity sites (Lifehacker).
I yearn for something more out of life, somewhere deep down inside... maybe in my colon?
All the articles are saying "10 tips to supercharge your life", "how to discover your life purpose in three easy steps", mixed with an ad about Ron Jeremy's one secret tip to grow a massive cock, and exhortations to buy such-and-such's "new ebook!"
I am not moved by any of this.
Scrolling and tabbing, and intermittently dropping f-bombs because of JS ads locking up my browser, I stop and lean back. In the blue afterglow of my shitty Compaq's screen, a thought appears, like a cheesy genie popping out of a brass toilet. "Start a blog! A youtube channel! A podcast!" the ad proclaims. "Yes. That's what I have to do," I whispered (I'm embarrassed to admit I really did say this).
Then I Control+W'd out of it and flopped onto my mattress. This was the wasteland of my life. I couldn't help but think the whole internet was like some seedy back alley 2.0, where Boxcar Willie with his train of needle marks had been replaced by more upstart, greasy-haired gurus, each peddling 'ebooks' of 'advice' stuffed in between ads for 'this one hot stock you have to own' and porn. And that alley was really the 'blogosphere' and the 'youtubers'. As I drifted off, the last thought was: 'We're all just bottom feeders, leeching and whoring on the attention of faceless anonymous users, hoping for another quick fix.'
I fell asleep, these racing thoughts fading into sweet oblivion, but never too far away.
Welcome to My Back Alley
That title is only twice as dirty, and half as thought-out, as I planned. As you can imagine, the lure of being the electronic equivalent of a conman never quite faded. And the more I read, the stronger the message "Start a youtube channel!" grew. As if everyone and their grandmother having a youtube channel would somehow make the world right, cure cancer, and save kittens from animal-shelter gas chambers. Everyone's an expert, everyone's an agent of change. Maximizing productivity, evangelizing technology, ninjas collaborating to socialfy your community diversification benchmark for target traffic through user-engagement and authentic grass-roots, blah, blah, blah, blah, money. Thrusting, moaning, screaming. Money. Pumping at the center of it all.
Wake up and smell the bullshit.
This blog is not a blog. This blog is the anti-blog, and we are the anti-streamers. 'We' (read: "I") resist your bullshit lingo bingo, call out the Truth (TM), and refuse to be satisfied with any standards of decency, journalistic integrity, or common sense.
Every blog, every channel, every podcast is Starbucks. And I'm Tyler Durden, pissing in your coffee and calling it a 'latte'.
Freaks and anarchists, laymen and losers. If you feel as I do, then this is the place for you. Welcome to devRant.
-
After a lot of work, the new factorization algorithm has a search space that's the factorial of (log(log(n))**2), from what it looks like.
But that's outer-loop-type stuff. Subgraph search (the inner loop) doesn't appear to need to do any factor testing above about 97, so it's all trivial factors for sequence analysis, but I haven't explored the parameter space for improvements.
It converts finding the factors of a semiprime into a sequence search on a modulus related to
OEIS sequence A143975: a(n) = floor(n*(n+3)/3)
and returns a number m such that for n=pq, m%p == 0 || (p*i), but m%q != 0 || (q*k), where i and k are respective multiples of p and q.
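A hedged illustration with a toy semiprime (p, q, and the scan bound are made up; this just shows the shape of the sequence lookup, not the full method):

def a143975(n):
    # OEIS A143975: a(n) = floor(n*(n+3)/3)
    return (n * (n + 3)) // 3

p, q = 13, 29
n = p * q
for i in range(1, 10000):
    m = a143975(i)
    if m % p == 0 and m % q != 0:
        print(i, m)  # an m sharing the factor p with n, but not q
        break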
This is similar in principle to earlier work, where I discovered that if i = p/2, where n=p*q, then
r = (abs(((((n)-(9**i)-9)+1))-((((9**i)-(n)-9)-2)))-n+1+1)
yielding a new number r that shares p as a factor with n, but is coprime with n for q, meaning you now had a third number, sharing only one non-trivial factor with n, that you could use to triangulate or suss out the factors of n.
The problem with that variation on modular exponentiation, as @hitko discovered, was that if q was greater than about 3^p, the abs in the formula messes the whole thing up. He wrote an improvement, but I didn't understand his code enough to use it at the time. The other thing was that you had to know p/2 beforehand to find r, and I never did find a way to get at r without p/2.
This doesn't have that problem, though I won't play stupid and pretend not to know that a search space of (log(log(n))**2)! is an enormous improvement over the state of the art, unless I'm misunderstanding.
I haven't posted the full details here, or sequence generation code, but when I'm more confident in what my eyes are seeing, and I've tested thoroughly to understand what I'm looking at, I'll post some code.
hitko's post I mentioned earlier is in this thread here:
https://devrant.com/rants/5632235/...
-
I may have accidentally found a legit factorization method that converts factoring into a combinatorics problem over a graph, with a time complexity that is the factorial of the logarithm of the semiprime being factored.
I don't know if this is supposed to be good or not, and I don't want to post it prematurely like I almost always do, at least not until I study its properties better. But it's still a pretty interesting find, I think.
-
I've been away a while, mostly working 60-70 hour weeks.
Found a manager's job and the illusion of low-level stability.
Also been exploring elliptic curve cryptography and other fun stuff, like this fun equation...
i = log(n, 2**0.5)
base = (((int((n/(n*(1-(n/((((abs(int(n+(n/(1/((n/(n-i))+(i+1)))))+i)-(i*2))/1))/1/i)))))*i)-i)+i))
...as it relates to OEIS A143975: a(n) = floor(n*(n+3)/3)
Most semiprimes n=pq, where p<q, appear to have values k in the sequence, where k is such that n+m mod k equals either p||q or a multiple thereof.
Tested successfully up to 49 bits and counting. Mostly haven't gone further because of work.
There's a little more math involved, and I've (probably incorrectly) explained the last bit, but the gist is the factorization doesn't turn up anything; *however*, trial lookups on the sequence, and then finding a related mod, yield k instead, which can be used to trivially find p and q.
It has some relation to calculating on an elliptic curve, but that's mostly over my head and would probably bore people to sleep.
-
Holy smokes, an LLM that's a competent wit.
(it gets good toward the end)
https://pastebin.com/MpGzZRqK
courtesy of https://worldsim.nousresearch.com
edit: I was particularly fond of "Schrodinger's cat mocks causality, simultaneously alive and droll"
-
Probably pure coincidence, but if you look at the deconstruction of the Dedekind numbers like so:
>>> decon(6)
offset: 1, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('1')]]
>>> decon(20)
offset: 2, exp: [[Decimal('2'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: []
>>> decon(168)
offset: 3, exp: [[Decimal('2'), Decimal('2')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('2')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('2'), Decimal('3')]]
>>> decon(7581)
offset: 4, exp: [[Decimal('2'), Decimal('3')], [Decimal('5'), Decimal('3')], [Decimal('7'), Decimal('1')]]
offset: 3, exp: [[Decimal('2'), Decimal('2')], [Decimal('5'), Decimal('3')]]
offset: 2, exp: [[Decimal('2'), Decimal('4')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: []
>>> decon(7828354)
offset: 7, exp: [[Decimal('2'), Decimal('6')], [Decimal('5'), Decimal('6')], [Decimal('7'), Decimal('1')]]
offset: 6, exp: [[Decimal('2'), Decimal('8')], [Decimal('5'), Decimal('5')]]
offset: 5, exp: [[Decimal('2'), Decimal('5')], [Decimal('5'), Decimal('4')]]
offset: 4, exp: [[Decimal('2'), Decimal('6')], [Decimal('5'), Decimal('3')]]
offset: 3, exp: [[Decimal('2'), Decimal('2')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 1, exp: [[Decimal('2'), Decimal('2')]]
>>> decon(d('2414682040998'))
offset: 13, exp: [[Decimal('2'), Decimal('13')], [Decimal('5'), Decimal('12')]]
offset: 12, exp: [[Decimal('2'), Decimal('13')], [Decimal('5'), Decimal('11')]]
offset: 11, exp: [[Decimal('2'), Decimal('10')], [Decimal('5'), Decimal('10')]]
offset: 10, exp: [[Decimal('2'), Decimal('11')], [Decimal('5'), Decimal('9')]]
offset: 9, exp: [[Decimal('2'), Decimal('9')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('8')]]
offset: 8, exp: [[Decimal('2'), Decimal('10')], [Decimal('5'), Decimal('7')]]
offset: 7, exp: [[Decimal('2'), Decimal('7')], [Decimal('5'), Decimal('6')]]
offset: 6, exp: []
offset: 5, exp: [[Decimal('2'), Decimal('6')], [Decimal('5'), Decimal('4')]]
offset: 4, exp: []
offset: 3, exp: [[Decimal('2'), Decimal('2')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('2'), Decimal('3')]]
the powers in the 2's column go:
1, 2, 2, 2, 3, 3, 2, 4, 6
which are predicted by:
https://oeis.org/search/...
Again, probably only a coincidence, but kinda beautiful.
-
Writing a full interpreter for a pseudo-assembly language.
It's kinda fun, but the thing's a bit of a cunt of a problem, because I only did this once before, almost a decade and a half ago.
Getting just the right format and deciding on the syntax is a slog. I just need something that's quick and sufficiently expressive, because I'm not writing the assembly myself; I'm generating the assembly code that runs through the interpreter, so it has to be valid under a lot of conditions.
-
So I promised a post after work last night, discussing the new factorization technique.
As before, I use a method called decon() that takes any number, like 697 for example, and first breaks it down into the respective digits and magnitudes.
697 becomes -> 600, 90, and 7.
It then factors *those* to give a decomposition matrix that looks something like the following when printed out:
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
Each entry is a pair of numbers representing a prime base and an exponent.
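For reference, here's a reconstruction of decon() inferred from the printed output above (sympy's factorint standing in for whatever the original uses; the original prints Decimals):

from sympy import factorint

def decon(n):
    # split n into digit * 10^(offset-1) components, most significant first,
    # and factor each one; digits 0 and 1 yield an empty exponent list
    s = str(n)
    for i, ch in enumerate(s):
        offset = len(s) - i
        component = int(ch) * 10 ** (offset - 1)
        exp = sorted(factorint(component).items()) if component > 1 else []
        print(f"offset: {offset}, exp: {exp}")

decon(697)  # 600 = 2^3 * 3 * 5^2, 90 = 2 * 3^2 * 5, 7 = 7^1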
Now the idea was that, in theory, at each magnitude of a product, we could actually search through the *range* of the product of these exponents.
So for offset three (600) here, we're looking at
2^3 * 3^1 * 5^2.
But actually we're searching:
2^3 * 3^1 * 5^2
2^3 * 3^1 * 5^1
2^3 * 3^1 * 5^0
2^3 * 3^0 * 5^2
2^3 * 3^0 * 5^1
etc.
On the basis that whatever it generates may be the digits of another magnitude in one of our target product's factors.
And the first optimization or filter we can apply is to notice that assuming our factors pq=n,
and where p <= q, it will always be more efficient to search for the digits of p (because it's under n^0.5, or the square root) than for the larger factor q.
So by implication we can filter out any product of this exponent search that is greater than the square root of n.
Writing this code was a bit of a headache because I had to deal with potentially very large lists of bases and exponents, so I couldn't just use loops within loops.
Instead I resorted to writing a three-state state machine that 'counted down' across these exponents, and it just works.
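Equivalently, itertools can stand in for the count-down state machine when the exponent lists are small (a sketch; the real version avoids materializing these loops for exactly the reason above):

from itertools import product

def exponent_search(factored, limit):
    # walk every combination of exponents from the maximums down to zero,
    # yielding each resulting value at or under limit (sqrt of the target)
    bases = [b for b, _ in factored]
    ranges = [range(e, -1, -1) for _, e in factored]
    for combo in product(*ranges):
        v = 1
        for b, e in zip(bases, combo):
            v *= b ** e
        if v <= limit:
            yield combo, v

# the offset-3 component of 697 is 600 = 2^3 * 3^1 * 5^2; sqrt(697) ~ 26
for combo, v in exponent_search([(2, 3), (3, 1), (5, 2)], 26):
    print(combo, v)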
And now, in practice this doesn't immediately give us anything useful. I had hoped it would at least give us *upper bounds* to start our search from, for any particular digit of a product's factors at a given magnitude. So the 12th digit (or pick a magnitude out of a hat) of an example product might give us an upper bound on the 2's exponent for that same digit in our lowest factor q of n.
It didn't work out that way. Sometimes there would be 'inversions', where the exponent of a factor on a magnitude of n would be *lower* than the exponent of that factor on the same digit of q.
But when I started tearing into examples and generating test data I started to see certain patterns emerge, and immediately I found a way to not just pin down these inversions, but get *tight* bounds on the 2's exponents in the corresponding digit for our product's factor itself. It was like the complications I initially saw actually became a means to *tighten* the bounds.
For example, for one particular semiprime n=pq, this was some of the data:
n - offset: 6, exp: [[Decimal('2'), Decimal('5')], [Decimal('5'), Decimal('5')]]
q - offset: 6, exp: [[Decimal('2'), Decimal('6')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('5')]]
It's almost like the base-3 exponent in [n:7] gives away the presence of 3^1 in [q:6], even though there's no subsequent presence of 3^n in [n:6] itself.
And I found this rule held each time I tested it.
Other rules, not so much, and other rules still would fail in the presence of yet other rules, almost like a giant switchboard.
I immediately realized the implications: rules had precedence, acted predictable when in isolated instances, and changed in specific instances in combination with other rules.
This was ripe for a decision tree generated through random search.
Another product n=pq, with more data:
q(4)
offset: 4, exp: [[Decimal('2'), Decimal('4')], [Decimal('5'), Decimal('3')]]
n(4)
offset: 4, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('3')]]
Suggesting that a nontrivial base-3 exponent (**2 rather than **1) suggests the exponent on the 2 in the relevant digit of [n] is one less than the same base-2 digital exponent at the same digit on [q].
And so it was clear from the get go that this approach held promise.
From there I discovered a bunch more rules and made some observations.
The bulk of the patterns, regardless of how large the product grows, should be present in the smaller bases (some bound of primes, say the first dozen), because the bulk of exponents for the factorization of any magnitude of a number overwhelmingly lean toward the lower prime bases.
It was as if the entire vulnerability had been hiding in plain sight for four-plus years, and we'd been approaching factorization all wrong from the beginning, trying to factor a number, and all its digits at all its magnitudes, all at once, when, like addition or multiplication, factorization could be done piecemeal, if we knew the patterns to look for.
-
I didn't leave, I just got busy working 60 hour weeks in between studying.
I found a new method called matrix decomposition (not the known method of the same name).
Premise is that you break a semiprime down into its component numbers and magnitudes, let's say 697 for example. It becomes 600, 90, and 7.
Then you break each of those down into their prime factorizations (with exponents).
So you get something like
>>> decon(697)
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
And it turns out that in larger numbers there are distinct patterns that act as maps at each offset (or magnitude) of the product, mapping to the respective magnitudes and digits of the factors.
For example I can pretty reliably predict from a product, where the '8's are in its factors.
Apparently theres a whole host of rules like this.
So what I've done is gone and started writing an interpreter with some pseudo-assembly I defined. This has been ongoing for maybe a month, and I've had very little time to work on it in between shifts at my job (which I'm about to be late for here if I don't start getting ready, lol).
Anyway, long and the short of it, the plan is to generate a large data set of primes and their products, and then write a rules engine to generate sets of my custom assembly language, and then fitness test and validate them, winnowing what doesn't work.
The end product should be a function that lets me map from the digits of a product to all the digits of its factors.
It technically already works; I've printed out a ton of products and eyeballed patterns to derive custom rules, it's just not the complete set yet. And instead of spending months or years doing that, I'm just gonna finish the system to automatically derive them for me. The rules I've found so far have tested out successfully every time, and whether or not the engine finds those will be the test case for whether the broader system is viable, but everything looks legit.
I wouldn't have pursued this, except when I realized the production of semiprimes *must* be non-Eulerian (long story), it occurred to me that there must be rich internal representations mapping products to factors that we were simply missing.
I'll go into more details in a later post, maybe not today, because I'm working till close tonight (won't be back till 3 am), but after 4 1/2 years the work is bearing fruit.
Also, it's good to see you all again. I fucking missed you guys.
-
God damn, the most beautiful thing I've read in probably two years. It *moved* me.
https://fortressofdoors.com/four-ma...
Good to see you all.
-
I love them shitty scammy singles dating ads.
"Hot Single Volcanos Are Looking For Dates In Your Area."
And
"once you play this game, your friends won't be seeing much of you!"
Also merry chrysler!
-
I turned down another woman who was absolutely, 100% flirting with me, because, from what I can gather, she was trying to get out of a relationship with her current boyfriend, a military veteran.
I outright ignored her, and when that failed, I made our work relationship 100% about just that: work.
Even though I'm friendly with everyone else.
I'm an absolute shit, aren't I? I feel genuinely bad.
I'm not sure if I did it out of a misplaced sense of honor for a dude who obviously has some ptsd, or because I don't feel like I'm able to connect with anyone anymore.
I feel like I'm alone in this world. Not, like, sexually or anything, but more like I don't want to burden anyone with the shit I'm going through. Like a man on a mission on a sinking ship, and it would be wrong to let anyone else on board.
Like a one-man shit-show, all singing, all dancing, driven to one end, with one purpose. And it'd be wrong to let anyone get attached, or invite anyone else in.
Fuck, I got so many irons in the fire. I have an ARG in the works, a full game, a social platform whose code and marketing plan are laid out and that I'm saving money for, two more games already planned, plus spending an inordinate amount of time with my father and sister and mother as they deal with the loss of my sister, plus volunteering to help the homeless, plus working, plus studying.
I barely sleep.
It's just me. I'm like a cruise missile heading to one destination, to some final destination, I just don't know what. And I don't let anyone in, because then they might see how fucking crazy I am, and how crazy my life is, and how crazy my goals are. That's not a humblebrag. That's more of a "holy shit, I'm so in over my head, I'm fucking drowning" type thing. But I'm not giving up, I'm just going deeper.
And it feels like drowning but somehow I'm okay with it. Like I've passed the crux of loneliness, and settled for going for it all, alone, shooting out of orbit, and saying "fuck it all' to everything and everyone. They say "if you got everything you wanted, everything you wished for, you'd wish you hadn't, which is why god isn't a genie". And lately I've been thinking god doesn't exist, or doesn't care, because he's left it all up to me, and I've fucked it up good and proper, and am on my way to either nothing, or everything I've ever wanted.
Is this what happiness feels like? Or suicide?
I don't know. I mean I really don't. I don't want to die. I think I could stop existing and be okay with it. Having achieved at least a modicum of understanding the universe, at least accomplished something small but meaningful.
Or maybe I'm delusional, driven mad with the full comprehension of human floundering against a meandering existence.
I don't fucking know.
I feel like I'm spinning my wheels, so much, that even two weeks feels like a fucking eternity. I don't sleep anymore. When I do, I escape into my dreams, where I can fly, or float, and the people in my dreams tell me I'm living in the matrix and I believe them..in my dreams. Feel it even.
And when I wake up, the feeling persists. Leaves me in wonderland, for hours after waking.
And I have visions, of going homeless, like some buddha, all the time, and then I say "wake up J, you're fucking crazy! You want to go be some couch surfing homeless bum living off other's good graces? get the fuck outa here! While others suffer, schlep it at whatever job they work, day in day out, toil. In this economy? In this inflation? What a dishonest way of thinking. What a dishonest way of dreaming."
And yet I daydream. Because its the only escape there is from all the world has become.
And I bring joy to others, earnestly, vicariously, because its the closest joy I can feel, when I've become numb.
It is this quasi-permanent sense of alienation that permeates my whole world, a sort of invisible force field that separates me from others, even as I reach out to understand them, to comfort them, to smooth the corners off their world, so that they don't become like I have, something not entirely human, but...other.
Often, when we meditate long and hard enough, at the center that emerges, at the center of ourselves, we find an abyss, a whole universe, devoid of anything, a perfect silence, mirroring back the cosmos, and other people. Observing, silent, irreducible, implacable.
Sometimes I feel like I don't exist. Sometimes I think others don't exist.
Very often I feel like nothing is real. And that I am playing some sort of game. Not like a video game per se, but that there is a bigger pattern, a hidden pattern to it all, just out of reach, and I'm reaching for it but understanding eludes me.
Not that the universe has made me for some special purpose, but merely that the universe observes me specifically, for no special purpose, other than that it can, whatever trivialities may impede or push forward my life.
As if the universe were bored.
-
I just finished designing an entire asset management pipeline and christ on a fucking pogo stick, if it isn't convoluted.
There's a lot of game engines out there, but all of them do it a little differently. They all tackle a slightly different problem, without even realizing it.
1. asset management
2. asset change management
3. behavior change management
4. data management
5. combinatorial design management.
6. Combinatorial Behavior management
7. Feature completion
ASSET MANAGEMENT is exactly what it says on the tin.
ASSET CHANGE management can be thought of as handling the import, export, formatting, platform-specific packing, and versioning (including forking) of an asset.
BEHAVIORAL CHANGE management is a subset of asset management, because code is a subset of assets (depending on how you define 'assets'). The oldest known example of this is commenting and uncommenting code.
Or worse, printf debugging.
This can be file versioning, basic undo services, graph management of forks and mergers, toggles for features or modules, etc.
DATA management is about anything that doesn't fall into the other categories: everything from mission text to NPC dialogue, quests, location names, item stats, the works. Anything you'd be tempted to put in a database falls under this category. I haven't yet seen many engines offer this as an explicit built-in tool, because the other problems are non-trivial as is, so this is a bit of low-hanging fruit that gets handled by external tools, or loaded from formats as simple as JSON.
COMBINATORIAL DESIGN management is the idea of prefabbing, blueprints of broader object design using nested prototypes of existing game objects, to create more complex, reusable set pieces. Unity did this well. GM does this in part.
COMBINATORIAL BEHAVIOR management is entity-component systems, plus tooling to make it easy to add, remove, and configure components and their values on entity blueprints (tiny sketch below); also not uncommon. Both Stencyl and Unity do this. GM has a precursor to this in the form of configurable fields, but those fields are not based on component scripts attached to objects.
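The prefab-plus-components pattern in miniature (names illustrative):

class Entity:
    def __init__(self, **components):
        # components are plain named values here; real engines attach
        # component script instances, this just shows the composition
        self.components = dict(components)

    def clone(self, **overrides):
        # prefab instantiation: copy the blueprint, override specific fields
        return Entity(**{**self.components, **overrides})

goblin_prefab = Entity(hp=10, sprite="goblin.png", brain="melee_ai")
boss = goblin_prefab.clone(hp=100, sprite="goblin_king.png")
print(boss.components)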
FEATURE COMPLETION is that set of gameplay mechanics or styles of design that an engine naturally makes easier to include or build in a game.
I don't think I'm aiming for all that, but I think at minimum a good engine has to do asset management, behavioral change management, prefabs, and entity-component systems with management tools for that. And ideally, asset change management.
-
Ideas:
1. Scrape github
2. Attach feature size estimate (an abstract scale) as examples across many projects.
3. Use this as prompt/finetunning data.
4. Train and prompt on project descriptions relative to feature size and number of contributors/changes in the changelog.
5. Package and release a model that takes descriptions of ideas and generates reasonable estimates of time and manpower.
6. Optional: sell it as an estimation service to corporate, and make money introducing some sanity to the world for a change.
-
v0.0005a (alpha)
- class support added to lua thanks to yonaba.
- rkUIs class created
- new panel class
- added drawing code for panel
- fixed bug where some sides of the UI's border were failing to draw (line rendering quirk)
v0.0014a (alpha) 11.30.2023 (~2 hours)
- successfully retrieving basic data from save folder, load text into lua from files
- added 'props' property to Entity class
- added a props table to control what gets serialized and what doesn't
- added a save() base method for instances (has to be overridden to be useful beyond the basics)
- moved the lume.serialize() call into the :save() method on the base entity class itself
- serialized and successfully saved an entities property table.
- fixed deserialization bugs involving wrong indexes (savedata[1], not savedata[2])
- moved deserialization from temp code, into line loading loop itself (assuming each item is on one line)
- deser'd test data, and init()'d new player Entity using the freshly-loaded data, and displayed the entity sprite
All in all, not a bad session. Understanding file handling and how to interact with the directory system was the biggest hurdle I was worried about for building my tools.
Next steps will be defining some basic UI elements (with overridable draw code), and then loading and initializing the UI from lua or json.
New projects can be set as subfolders in appdata, using Setidentity("appname/projectname") to keep things clean.
I'm not even dreading writing basic syntax highlighting!
Idea is to dogfood the whole process. UI is rendered in-engine, just like you might see with Godot, Unity, or GameMaker; that way I have maximum flexibility to style it the way I want. I'm familiar enough with constructing from polygons, on top of stenciling, on top of nine-slicing, on top of existing tweening and special effects, that I can achieve exactly what I want.
Idea is to build a really well-managed asset pipeline. Stencyl, as 'crappy' as it appeared, and as 'for education' as it was, was a master class in how to do things the correct way; it was just horribly bloated while doing it.
Logical tilesets that you import, can rearrange through drag-n-drop, assign custom tile shapes to, physics materials, collisions groups, name, add tag data to, all in one editor? Yes please.
Every other 2D editor is basic-bitch: has you importing images, and at most generates different scales and does the slicing for you.
Code editor? Every behavior was in a component, with custom fields. All your code goes into a list of events, which you can toggle on and off with a proper toggle button, so you can explicitly experiment instead of commenting shit out (yes, git is better, but we're talking solo amateurs here; they're not gonna be using git out of the gate unless they already know what they're doing).
Components all have an image assignable to identify them, along with a description field, and they're arranged in a 2d grid for easy browsing, copying, modifying.
The physics shape editor, the animation editor, the map editor, all of it was so bare bones and yet had things others didn't.
I want that, except without the historic ties to Flash, without the overhead of Java, and with sexier fucking in-engine rendering of the UI, plus support for modding and in-engine custom tools.
Not really doing it for anyone except myself, and doubt I'll get very far, but since I dropped looking for easy solutions, I've just been powering through all the areas I don't understand and doing the work.
I rediscovered my love of programming after 3-4 years of learning to hate it, and things are looking up.