10

Who has too much time on their hands and wants a throwaway project? :)

Say there are some poll results. Like this one. Each item has xx.yy% of all the votes. The task would be to come up with an algorythm that would read in those results and make a few predictments on how many participants were in this poll :)

I don't need it, I just came up with it and thought it might be a fun project for someone. I know I'd go for it if I had free time on my hands :)

P.S. if you're up for it, sharing a solution code is more than welcome!

Comments
  • 2
    Well if not accounting for rouding errors you could multiply the percentages by 10 until all the decimals are gone and add the numbers. Then there can be any whole n times that sum.
  • 1
    I had a similar idea a few days ago: Because youtube doesn't show accurate like/dislike numbers anymore but the bar has a percentage filled, you could more closely calculate the likes/dislikes. I don't know how well something like this works though
  • 1
    That's basically math, no?

    #!/bin/bash

    pcts="25.4 4.5"
    scale=3

    for i in $pcts
    do
    echo "scale=$scale; 87317 * $i / 100" | bc
    done

    Edit: Well, above is a small part of getting to the answer anyway, and about as much as I'll do on my phone 🙃
  • 3
    @ilikeglue the point is to account for rounding :) the more items are voted on, the more accurate the result will be, thanks to rounding :)
  • 2
    @Condor math and prolly statistics analysis

    and you got the task wrong. The idea is to calculate that 87k from the results. Not vice versa :)
  • 2
    For instance results are: 12.5, 50, 25, 12.5.

    Now if there were 100 voters it would mean that half of one voter voted for a and another half of the voter voted for d. Now that is impossible so there should have been 200 voters or a multiple of 200.
    The other possible solution would be 80 (prolly the smallest possible count?), as 12,5% of 80 is 10, which is 10 people (not 10.1, not 9.99 -- exactly 10 whole voters)

    Now what if there were more than 4 items to vote on? With more fractional results? Like 4.017% voted for a choice "x"? How many voters could there be so you would get that .017 fraction?

    Getting the exact number is unlikely due to rounding, but getting a few more likely ones - I think that's doable..
  • 1
    @12bitfloat just hover cursor over like/dislike bar at bottom to see non rounded numbers.
  • 0
    @netikras ¯\_(ツ)_/¯
  • 0
    @netikras I question the utility of this but anyway
    How about this

    1. Multiply all numbers by the smallest power of 10 such that each is transformed into an integer

    2. Multiply 100 by the same power of 10, let's call that temp (because the percentages add to 100)

    3. Calculate GCD of the integers

    4. ans = temp / gcd

    For your example: input is [12.5, 50, 25, 12.5]

    1. Multiplying by 10 gives us all integers : [125, 500, 250, 125]

    2. temp = 100 x 10 = 1000

    3. GCD is 125

    4. ans = 1000 / 125 = 8
    8 works, you get 1, 4, 2, and 1 people
  • 0
    @netikras thing is 10^(2+n) will always be sufficient number of people if your percentages have n places after the decimal point, which puts an upper bound on this method.

    So eg. To estimate something in the range of 87k people, the upper boundimg power of 10 is 100k so you'd need at least three places after decimal (100k = 10^5 = 10^(2+3)) in your percentages to even have a chance of getting an estimate in that range. With 3 decimal places you can never estimate anything above 100k. And so on.
  • 0
    @RememberMe "talk is cheap. Show me the code"

    :)
  • 1
    @netikras https://pastebin.com/AGtd5X8A
    (Accepts a line of inputs of form xs.ys, if you want to give an integer input type <integer>.0)

    Caveat is that this only works when the user can choose only one option out of many instead of multiple. For multiple one way to get _an_ answer is to replace the 100 * 10^n bit by (sum of list after converting to integers). I don't think it's the smallest possible answer though.

    I'm willing to bet that this becomes an integer linear programming problem if you allow free choices. For example if you have three choices A, B and C, corresponding percentages pA, pB, pC, numbers nA, nB, nC, and variable N which is number of people:

    minimize N subject to
    nA - pA * N = 0
    nB - pB * N = 0
    nC - pC * N = 0
    nA, nB, nC, N > 0
    nA < N, nB < N, nC < N

    I dunno if I'm constructing the ILP problem properly (sleepy af right now) but that's how I'm guessing it would be solved. Would love to see someone correct/validate this or give a better method.
  • 0
    @netikras in case you're wondering, I do all operations involving decimal digits using strings, not float because there are numbers which float can't represent, and float has precision issues, while string has no such problems. Incorrect representation would destroy this method. After manipulating I convert them to integers directly.

    Also, just realised it might be ambiguous so, the language is Haskell
  • 0
    @RememberMe not even a clue how to launch Haskel :D and also too sleepy this evening to absorb your snippet, sorry.. Tomorrow.
  • 1
    @netikras go to https://rextester.com/l/...
    Replace the example code with the contents of the pastebin
    click the "[+] show input" button at the bottom and type in your input, for example
    "12.5 50.0 25.0 12.5" (without quotes), and run
  • 1
    @RememberMe

    15.1 0.58 50.0 25.0 4.32 2.01 0.06 1.13 6.81
    10000

    I'll take your word for it :D
  • 0
    nano???
  • 1
    @Parzi not a dev env I think :)
  • 1
    @netikras you and i are enemies now
  • 1
    @Parzi bring it on! :)
Add Comment