data science

Ranter

RikaroDev

391

Comments

5

NoToJavaScript

4481

5y

Is the data publicly accessible without ANY login?
If yes: no problem.
If not: it’s classified as hacking.
4

bad-frog

529

5y

@NoToJavaScript glad to have stumbled on this thread.

is it still hacking if you scrape information from your own account?
3

magicMirror

10281

5y

@NoToJavaScript the second part should be clarified:
do you have a legitimate login? if yes, you can scrape.

TOS and rate limiting are also a factor here.
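
A minimal sketch of staying under a site's rate limit from the client side, assuming a fixed minimum delay between requests is enough. The `Throttle` class and its parameters are hypothetical names for illustration; the actual limit depends on the site and its TOS.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each request keeps the request rate at or below one per `min_interval` seconds.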
0

bad-frog

529

5y

@Demolishun cool thing.
just checked. devrant has one, however its a placeholder.
apparently it has been "moved permanently"
are there alternative names under which i should look for those?
2

bad-frog

529

5y

@Demolishun :)))
curl was easier to figure out
im green as grass in networks
0

bad-frog

529

5y

@Demolishun and i have the ambition of starting to make a kinda trading bot in 2 months:)))

well, first stage at least: scrape the webz for all relevant info, in forums and actual quotes, automate everything so that it spews out the essence.

once i get decision making right then i will automate it all the way, but i have no date for that stage

fun thing is that it kinda comes together all by itself.
1

NoToJavaScript

4481

5y

@bad-frog I already have code which scrapes all TSX symbols in real time every 15 seconds haha

Good luck with the bot tho. After 2 days the best I could do is "not losing money"

I'm using https://fr.investing.com/equities/... as the source and scrape the table

I then use this data to "play" with bot settings.
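
For readers following along in Python, here is a rough sketch of the "scrape the table" step using only the standard library. It is not the linked C# code; the `TableRows` class, the `parse_table` helper and the sample HTML are made up for illustration, and a real page will have a different layout.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read
        self._cell = None  # text chunks of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table content as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample; symbols and prices are placeholders.
SAMPLE = """
<table>
  <tr><th>Symbol</th><th>Last</th></tr>
  <tr><td>AAA</td><td>12.34</td></tr>
  <tr><td>BBB</td><td>5.60</td></tr>
</table>
"""
rows = parse_table(SAMPLE)
```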
1

NoToJavaScript

4481

5y

@bad-frog it's very quick and dirty as i was mostly doing it for funzies, but if it can help :

https://pastebin.com/3GzETLab

Also: I last tested it about 7 weeks ago, so there may have been some layout / HTML changes since
1

bad-frog

529

5y

@NoToJavaScript 1000 thx

but isnt trading equity expensive?
tbh i thought more about crypto bc trading fees are basically nonexistent, and my plan for crypto was:

scrape 4chan/biz and count the occurrences of crypto names.

prolly build a sentiment analyzer, maybe tinker with it until i get a real tool

cross-reference that with crypto quotes

record statistics so as to see mooning

i should also scrape reddit and tweets and the like

extend that to names of companies so as to auto-find and verify if there is a squeeze going on. by then i should have enough monetary mass so as to ignore trading fees of playing with the big boys
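
The first step of that plan (counting how often coin names show up in scraped posts) could look roughly like this in Python. The watchlist, the `count_mentions` helper and the sample posts are invented for illustration; fetching the posts is left out.

```python
import re
from collections import Counter

# Hypothetical watchlist mapping lowercase words to a ticker symbol.
WATCHLIST = {"btc": "BTC", "bitcoin": "BTC", "eth": "ETH", "ethereum": "ETH"}


def count_mentions(posts, watchlist=WATCHLIST):
    """Count whole-word mentions of watched names across a list of post texts."""
    counts = Counter()
    for post in posts:
        # Split into lowercase alphanumeric words so "BTC," matches "btc".
        for word in re.findall(r"[a-z0-9]+", post.lower()):
            symbol = watchlist.get(word)
            if symbol:
                counts[symbol] += 1
    return counts


# Made-up posts standing in for scraped forum text.
posts = ["BTC to the moon, bitcoin never sleeps", "eth looks weak", "nothing here"]
mentions = count_mentions(posts)
```

Recording `mentions` with a timestamp on every run would give the time series to cross-reference with quotes.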
1

NoToJavaScript

4481

5y

@bad-frog Well, if the provider (let's say reddit) has an API, just use the API. It's more efficient and easier, and it doesn't depend on HTML changes.

As for trading providers, there are some that allow free trades (no fees):

Robinhood (US only I think)
Wealthsimple Trade (Canada, this is the one I use)
Questrade
and more.

Wealthsimple doesn't have trading APIs, BUT they have a website. Sending an order should not be difficult to reverse-engineer with a couple of F12 sessions in the browser.

Fair warning, HTTP is not as fast as you think it is :)

If you want to scrape all the forums and blogs, by the time your bot makes a decision it's already too late. Look at aggregated data sources
0

bad-frog

529

5y

@NoToJavaScript 1000 thanks bro, you advanced the whole project by a week at least

i supposed i had to work with js (which i dont know yet) at a certain point, and now i have a working example
0

NoToJavaScript

4481

5y

@bad-frog My example is in c# tho
1

bad-frog

529

5y

@NoToJavaScript oh, it doesnt have to happen in an instant. also my internet wouldnt allow for a tradebot in the true sense:)

i will be perfectly content if i get my analysis on a daily basis at first. then maybe increase the frequency to see where i can get, and what i can get away with...

i doubt many servers would like being submerged by requests...

but if i have a 10 second resolution, its good enough to even make statistics about the market's response to news, cross-referenced with forums etc...

the idea is to have a tool to understand trends and follow them
1

bad-frog

529

5y

@NoToJavaScript "but thats C#"
thats exactly what im saying:p
if its not C, C++ or python you got me lost

honorary title: bash
1

NoToJavaScript

4481

5y

@bad-frog /agree

Anyway it's a fun project! I don't have enough motivation to work on it daily, but every couple of months I add a brick :)

It uses the lib https://html-agility-pack.net/ which I find very good for HTML parsing. It even handles "broken" HTML (to some degree)
0

bad-frog

529

5y

@NoToJavaScript that was my intent too.

the first step will be in two months for me because it ties in with my learning curriculum

but otherwise i have a few ideas on the backburner too. also to tie in in time.
0

bad-frog

529

5y

@NoToJavaScript niiiiice
c# is also on my personal list so i might start right away

even tho parsing isnt hard with C like.
however i see that it builds requests for you and all

but then i will have to learn how to build those myself soon...

ill have to build a server in c++. only std maybe some other one or two, selected by the school

they really want us to know networking in and out for sure...
2

NoToJavaScript

4481

5y

@bad-frog And 2 rules for scraping data :
1. Always provide user agent
2. Always use cookies
Some sites will reject requests without these 2.
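
A small Python sketch of those two rules with the standard library: one opener that sends a User-Agent on every request and keeps cookies between requests. The user agent string and the `make_opener` helper are placeholders, not anything a particular site requires.

```python
import urllib.request
from http.cookiejar import CookieJar

# Placeholder identity; use something that honestly describes your client.
USER_AGENT = "my-scraper/0.1 (contact: you@example.com)"


def make_opener(user_agent=USER_AGENT):
    """Build an opener that sends a User-Agent and stores cookies in a jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replaces the default "Python-urllib/x.y" header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


opener, jar = make_opener()
# Example call, commented out so the sketch runs without network access:
# html = opener.open("https://example.com/", timeout=10).read()
```

Reusing the same `opener` for every request is what makes cookies set by one response go out with the next one.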

The most difficult one I ever did was LinkedIn. THAT SHIT Changes something in layout almost every 2 weeks.
0

NoToJavaScript

4481

5y

@Nanos I would think yes, but I don't see how to prove it.

I would do it personally
0

mrsaeeddev

2

5y

Yes, you need to check whether that site allows scraping or not.
0

Wisecrack

9197

5y

I would just build a "virtual marketplace" for "practicing trading".

Like a game.

And then x amount of fake dollars translates into y sub-percentage of real dollars.

So maybe 100k in the game market translates into $10.

and then the traders that are good, we aggregate their trades and execute them for real.

Of course the players don't need to know that and couldnt know that anyway.

Why invent effective AI when you can just crowdsource from people? I figure some small percentage of users are gonna be super predictors or naturally good at what they do.

Highly unethical of course if they're not informed.
1

mundo03

4828

5y

Have you heard about robots.txt?
That file will tell you what the site wants you to grab and what it doesn't.
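
Checking a URL against robots.txt can be sketched with Python's standard `urllib.robotparser`. The sample rules, the `allowed` helper and the "my-scraper" agent name are made up for illustration; in practice you would load the site's real file.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for the example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For a live site, `RobotFileParser` can also fetch the file itself via `set_url(...)` followed by `read()`.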

You can choose to ignore it.

Also, depending on where you are, there are copyright and privacy regulations that can get you in trouble.

Talk to a lawyer.