Race times or data scraping race times from the web

Postby MatGreenaway » Mon Jul 09, 2012 5:29 pm

Hi,

My betting is in-play horse racing using Excel triggers from Betting Assistant and has been fairly successful for the last 4 and a bit years.

At the same time as my betting, using Excel, I record when races go in-play and when they go suspended, to capture actual race times. There is a bit of data to clean up and ignore (some markets can go back in-play for several reasons) and not everything gets recorded, but I use this data to give me a rough idea of race time, which is key to my bet placing.

I was looking at my collected race time data and then wondered if it exists anywhere else. And in my search I came across data scraping which was new to me.

I suppose I am asking: does a CSV or XLS exist of UK & IRE race times over recent years? Or, as the data is available on www.gg.com (I don't think it is available on sportinglife anymore), can somebody show me what I need to do to get the data off the page, and also how to traverse all the pages to get the race times of many thousands of races?

I know it's a big ask, so I'd appreciate any help. I'm very good in Excel but only a beginner in VBA; the little bits of script I do use I have mainly picked up from elsewhere.

Thanks,
Mat

Postby negapo » Wed Jul 11, 2012 11:04 am

MatGreenaway

It's possible to scrape every single race on gg.com, but you have to write a bit of code. Here is a short example:

1. Their URLs are constructed by date, so first create a loop to go through all the dates.

Code: Select all

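Public Sub ScrapeAllRaces()
    ' Walk every date from 2001-01-01 up to today and scrape that day's page
    Dim CurrentDate As Date, URLToAsk As String
    CurrentDate = DateSerial(2001, 1, 1)
    Do Until CurrentDate >= Now
        ' Format(..., "dd") gives a zero-padded day whatever the regional date format is
        URLToAsk = "https://gg.com/racing/" & Format(CurrentDate, "dd") & "-" & MonthName(Month(CurrentDate), True) & "-" & Year(CurrentDate)
        ScrapeThis URLToAsk
        CurrentDate = DateAdd("d", 1, CurrentDate)
    Loop
End Sub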
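
One thing to watch: MonthName returns the month in the language of your Windows regional settings, so on a non-English machine the URL may come out wrong. Below is a rough sketch of a builder that avoids that. BuildRaceUrl is just a name I made up, and I am assuming the site wants English three-letter month names with a capital first letter, which I have not verified.

```vba
' Sketch only: build the dated URL without relying on regional settings.
' BuildRaceUrl is a hypothetical helper; the dd-Mmm-yyyy pattern is assumed
' from the URL format used in the loop.
Public Function BuildRaceUrl(ByVal RaceDate As Date) As String
    Dim Months() As String
    ' Split always returns a zero-based array, whatever Option Base says
    Months = Split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec", ",")
    BuildRaceUrl = "https://gg.com/racing/" & Format(RaceDate, "dd") & "-" & _
                   Months(Month(RaceDate) - 1) & "-" & Year(RaceDate)
End Function
```

For example, Debug.Print BuildRaceUrl(DateSerial(2012, 7, 9)) prints https://gg.com/racing/09-Jul-2012 in the Immediate window, so you can check a few dates in a browser before running the whole loop.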



The oldest page I can find is from 2001-01-01.
Then you can build a function to scrape every page like this:

Code: Select all
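Private Function ScrapeThis(URLToAsk As String)
    ' Needs references to a Microsoft XML library and the Microsoft HTML Object Library (Tools > References)
    Dim XMLHttpRequest As XMLHTTP, HtmlDoc As New HTMLDocument
    Set XMLHttpRequest = New MSXML2.XMLHTTP
    XMLHttpRequest.Open "GET", URLToAsk, False   ' False = wait for the response before continuing
    XMLHttpRequest.send
    If XMLHttpRequest.Status <> 200 Then Exit Function   ' skip dates that have no page
    HtmlDoc.body.innerHTML = XMLHttpRequest.responseText
    ' Here you have the HTML text, so you can search for the data you want and then dump it into a table in Access or Excel
End Function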


The part that takes more work is the parsing. I can see if I can do it when I have time, but if someone more proficient in parsing can help, it should be fairly simple, I believe.
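
As a starting point for the parsing, here is a rough sketch that just dumps the text of every table cell on the page into column A of the active sheet, so you can see where the race times sit. I have not checked the actual markup on gg.com, so the "td" tag is an assumption, and DumpTableCells is only a name for illustration.

```vba
' Sketch only: needs a reference to the Microsoft HTML Object Library.
' Assumes the data of interest lives in <td> cells, which is unverified.
Public Sub DumpTableCells(HtmlDoc As HTMLDocument)
    Dim TdCells As IHTMLElementCollection
    Dim Cell As Object
    Dim NextRow As Long
    Set TdCells = HtmlDoc.getElementsByTagName("td")
    NextRow = 1
    For Each Cell In TdCells
        ' Write each cell's visible text to the next free row in column A
        ActiveSheet.Cells(NextRow, 1).Value = Trim$(Cell.innerText)
        NextRow = NextRow + 1
    Next Cell
End Sub
```

You would call it from inside ScrapeThis once HtmlDoc has been filled, then look at one day's output by hand before deciding which cells to keep.
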
Be aware that you should pause between requests, 30 seconds or so, or else they will probably ban your IP for excessive requests. With one page per day since 2001 that is roughly 4,200 requests, so with a 30-second pause the whole process will take something like 36 hours.
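
A minimal sketch of the pause and of the run-time estimate, assuming one request per day of racing. PauseSeconds, PoliteWait and EstimatedHours are names made up for the example.

```vba
' Sketch only: pause between requests and estimate the total run time.
Public Const PauseSeconds As Long = 30

Public Sub PoliteWait()
    ' Application.Wait blocks Excel until the given clock time is reached
    Application.Wait Now + TimeSerial(0, 0, PauseSeconds)
End Sub

Public Function EstimatedHours(FirstDate As Date, LastDate As Date) As Double
    Dim Pages As Long
    Pages = DateDiff("d", FirstDate, LastDate) + 1   ' one page per day, both ends included
    ' Counts only the pauses, not the time each request itself takes
    EstimatedHours = Pages * PauseSeconds / 3600#
End Function
```

Calling PoliteWait straight after ScrapeThis in the loop gives the 30-second gap. EstimatedHours(DateSerial(2001, 1, 1), DateSerial(2012, 7, 11)) works out at about 35 hours for 4,210 pages, in line with the figure above.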

Hope it helps

Postby brumbie » Thu Jul 12, 2012 1:48 pm

I put up a race timer around February. Although it's still in its infancy and probably needs some adjusting, I've just checked and it's still on the site:

http://uploading.com/files/get/11f9cef8/RACE%2BTIMER.xl

I also have an actual copy of the Racing Post times somewhere.

Postby brumbie » Fri Jul 13, 2012 1:26 am

I think I had some problems with some courses not appearing, so this would be the latest version:

http://uploading.com/files/get/99558aca/RACE_TIMER.xls

It is very hard to get it exact because of the going, but hopefully it will give you something to work on, as there is no VBA except Gary's in AA1 and AB1.

cheers brumbie.

Postby MatGreenaway » Fri Jul 13, 2012 9:30 am

Sorry for not replying with my gratitude earlier... Negapo, thanks for your code. It has taught me some more. I understand the first bit but got a little lost in the second bit. Thanks for the code examples and I will have another look.

Brumbie, that Excel file from the RP site was very useful. I will be putting it into action soon. I noticed some courses were missing, but managed to find them as text on the web. I'm just looking at the file link in your second post, so thanks to both of you for your invaluable help.

Wishing you every success in your betting ventures. Have a good day.

Thanks,
Mat

