
My god the wall looks really punchable right now. Let me tell you why.

So I’m working on a data mining project, and I’m trying to get data from Google Trends. Unfortunately, there have been a lot of roadblocks for what should have been an easy task.

First, it won’t give a raw search volume, only relative “interest”.
Fortunately it lets me compare search terms, which would work for my needs. However, it will only let me compare a few at a time. I need to compare 300.

So my solution is simple: compare all the terms relative to one term. Simple enough, but it would be time-consuming, so I figured I’d write a program to get the data.
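A minimal Python sketch of that "compare everything to one term" idea, not the actual program. The function names and the batch size of 5 are assumptions (the limit is only described as "a few" terms per comparison), and the per-batch numbers are assumed to arrive as a plain dict of term to interest value.

```python
from typing import Dict, List


def make_batches(terms: List[str], anchor: str, batch_size: int = 5) -> List[List[str]]:
    """Split terms into batches that each start with the shared anchor term.

    batch_size is an assumed per-comparison limit, not a documented one.
    """
    others = [t for t in terms if t != anchor]
    step = batch_size - 1  # one slot in every batch is reserved for the anchor
    return [[anchor] + others[i:i + step] for i in range(0, len(others), step)]


def rescale_to_anchor(batch_result: Dict[str, float], anchor: str) -> Dict[str, float]:
    """Express each term's interest as a multiple of the anchor's interest."""
    anchor_value = batch_result[anchor]
    if anchor_value == 0:
        raise ValueError("anchor has zero interest in this batch; pick a more popular anchor")
    return {term: value / anchor_value for term, value in batch_result.items()}


def merge_batches(batch_results: List[Dict[str, float]], anchor: str) -> Dict[str, float]:
    """Combine per-batch results into one table on the anchor's scale."""
    merged: Dict[str, float] = {}
    for result in batch_results:
        merged.update(rescale_to_anchor(result, anchor))
    return merged
```

Because every batch contains the same anchor, dividing by the anchor's value puts all batches on one scale. With a batch size of 5, 300 terms work out to roughly 75 comparisons.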

But then I learned that they don’t have an official API. There’s a node module for this very thing, based on a python module that reverse engineers the API endpoints. I thought as long as it works I’d use it.

It does work... But then I discovered that Google heavily rate limits the endpoints.

So... I figured I’d build a system to route the requests through different Tor nodes to get around the rate limit. Good solution, right? Well, like a slap to the face, after spending way too much time getting requests through Tor working, I discovered that THEY FUCKING BLOCKED TOR IPS.

So I gave up, and resigned myself to waiting 5 hours for my program to get the data... 1 comparison at a time... 60s interval between requests. They, of course, don’t tell you the rate limit threshold, so this is more or less a guess (I verified that a 30s interval was too short, and another person using the module suggested 60s).
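A rough sketch of that kind of throttled loop in Python, assuming a hypothetical `fetch` callable that performs one comparison and returns its result. The 60s default is the guessed interval from above, not a documented limit.

```python
import time
from typing import Callable, Dict, List


def fetch_throttled(
    batches: List[List[str]],
    fetch: Callable[[List[str]], Dict[str, float]],
    interval_s: float = 60.0,  # guessed safe interval, not a documented threshold
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, float]]:
    """Run one comparison per batch, waiting interval_s between requests."""
    results = []
    for index, batch in enumerate(batches):
        if index > 0:
            sleep(interval_s)  # no wait before the very first request
        results.append(fetch(batch))
    return results
```

Passing `sleep` in as a parameter makes it possible to test the loop without actually waiting.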

Remember when I said the discovery that they blocked Tor came like a slap to the face? This one came like a sledgehammer to the face: for some reason my program didn’t dump the data at the end. I waited 5 fucking hours to get nothing.
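One way to avoid losing a whole run like that is to write each result to disk as soon as it arrives, instead of dumping everything at the end. A minimal Python sketch, assuming one JSON object per line; the file layout and function names are made up for illustration.

```python
import json
from pathlib import Path
from typing import Dict, List


def append_result(path: str, batch: List[str], result: Dict[str, float]) -> None:
    """Append one comparison result to a JSON Lines file immediately."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"batch": batch, "result": result}) + "\n")
        f.flush()  # hand the line to the OS now rather than at exit


def load_results(path: str) -> List[dict]:
    """Read back everything saved so far, e.g. to resume after a crash."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

If the program dies at hour four, the file still holds every comparison finished up to that point, and a restart can skip the batches already recorded.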

I am so mad right now. I am so fucking mad.

Comments
  • 5
    fantastic rant intro!
    Don't they offer that as a data analysis service on https://www.cloud.google.com?

    anyways, for the lulz please enter "divorce" and "pregnant" into google trends. 🤔😆
  • 1
    @heyheni thanks I’ll look into it!

    Also add abortion and child support to your comparison!
  • 3
    @FelisPhasma okay 😄 yeah, every New Year since 2004 there is a spike in pregnancy searches, and after Christmas there is a spike in divorce.
  • 0
    If there's no API, they don't want you using it / don't support it and you're in for a bad time.