Scraping Racing Post Horse Prediction System Tips

December 4, 2023

bash tailwindcss

I discovered a tool on GitHub for scraping the Racing Post, and it worked. It was simple to download the entire day's racecards into a JSON file with the command:

./racecards.py today

So my weekend coding project was a PHP parser to analyse this JSON and suggest which horses are likely to win.

I installed two packages with Composer:

composer require cerbero/json-parser
composer require nunomaduro/termwind

The parser simplifies reading the JSON, and Termwind colorizes the output.

<?php

require __DIR__ . '/vendor/autoload.php';

use Cerbero\JsonParser\JsonParser;
use function Termwind\{render};

$source = '../racecards/2023-12-04.json';

$json = JsonParser::parse($source);

foreach ($json as $region => $courses) {

    // Only the British ("GB") meetings are analysed.
    if ($region !== 'GB') {
        continue;
    }

    foreach ($courses as $course => $races) {

        echo "-----------------------" . PHP_EOL;
        render('<div class="px-1 text-red">' . $course . '</div>');
        echo "-----------------------" . PHP_EOL;

        foreach ($races as $time => $race) {

            // Race header: off time, race name and prize.
            render("<p class='text-blue'>" . $time . " " . $race['race_name'] . " <span class='text-green'>" . $race['prize'] . "</span></p>");

            foreach ($race['runners'] ?? [] as $horse) {

                $form = (string) $horse['form'];

                // Positive factor: number of 1s, 2s and 3s in the form string.
                $firstOrSecondOrThird = substr_count($form, '1') + substr_count($form, '2') + substr_count($form, '3');

                // Negative factor: minus one per 'P' (pulled up) in the form and per missing value.
                $negs = -substr_count($form, 'P');
                if (!$horse['rpr']) { $negs--; }
                if (!$horse['ts']) { $negs--; }
                if (!$horse['trainer_rtf']) { $negs--; }

                // Total of the three ratings; the casts make a missing rating count as 0.
                $score = (int) $horse['ofr'] + (int) $horse['rpr'] + (int) $horse['ts'];

                // Grey out horses with any negative mark or no top-three finish.
                $darkout = "";
                if ($negs < 0 || $firstOrSecondOrThird < 1) { $darkout = "text-gray-700"; }

                render(" <p class='p-0 pl-1 m-0 " . $darkout . " '> " .
                        str_pad($horse['number'], 2, ".", STR_PAD_LEFT) . " " .
                        str_pad($form, 6, ".", STR_PAD_LEFT) . " " .
                        '<span class="text-yellow">' . str_pad($horse['name'], 20, ".", STR_PAD_RIGHT) . '</span> ' . " " .

                        str_pad($horse['age'], 2, ".", STR_PAD_LEFT) . " " .
                        str_pad($horse['lbs'], 3, ".", STR_PAD_LEFT) . " " .

                        str_pad((string) $horse['ofr'], 3, ".", STR_PAD_LEFT) . " " .
                        str_pad((string) $horse['rpr'], 3, ".", STR_PAD_LEFT) . " " .
                        str_pad((string) $horse['ts'], 3, ".", STR_PAD_LEFT) . " " .
                        str_pad($horse['last_run'], 10, ".", STR_PAD_LEFT) . " " .

                        str_pad($horse['jockey'], 22, ".", STR_PAD_RIGHT) . " " .

                        str_pad($horse['trainer'], 34, ".", STR_PAD_RIGHT) . " " .
                        str_pad($horse['trainer_location'], 28, ".", STR_PAD_RIGHT) . " " .
                        str_pad((string) $horse['trainer_rtf'], 3, ".", STR_PAD_LEFT) . " " .

                        $firstOrSecondOrThird . " " .
                        $score . " " .
                        $negs .

                        "</p>" . PHP_EOL);
            }
        }
    }
}

This produced a readable, colorized output with a basic score for picking out the best horses judging by their form.

[Screenshot: racing post horse racing tip system]

The last three columns are a positive factor (the number of firsts, seconds and thirds in the form), the total of the ratings, and a negative factor. The greyed-out horses are those that failed the tests I have coded, which are rather basic, but a start.
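To make the three columns concrete, here is a minimal sketch that computes them for a single runner. The array keys are the ones used in the script above, but the horse and its values are invented for illustration, and the function names are hypothetical helpers rather than part of the original script.

```php
<?php

// Hypothetical runner: keys match the script above, values are made up.
$horse = [
    'form'        => '1P-32',
    'ofr'         => 120,
    'rpr'         => 125,
    'ts'          => null, // missing rating
    'trainer_rtf' => 50,
];

// Positive factor: count of 1s, 2s and 3s in the form string.
function positiveFactor(array $horse): int
{
    $form = (string) $horse['form'];
    return substr_count($form, '1') + substr_count($form, '2') + substr_count($form, '3');
}

// Total ratings: ofr + rpr + ts, with a missing value counted as 0.
function totalRatings(array $horse): int
{
    return (int) $horse['ofr'] + (int) $horse['rpr'] + (int) $horse['ts'];
}

// Negative factor: minus one per 'P' in the form and per missing rpr, ts or trainer_rtf.
function negativeFactor(array $horse): int
{
    $negs = -substr_count((string) $horse['form'], 'P');
    foreach (['rpr', 'ts', 'trainer_rtf'] as $field) {
        if (empty($horse[$field])) {
            $negs--;
        }
    }
    return $negs;
}

// Prints: 3 245 -2
echo positiveFactor($horse), ' ', totalRatings($horse), ' ', negativeFactor($horse), PHP_EOL;
```

With this form string the horse has one first, one second and one third (3), ratings of 120 + 125 + 0 (245), and loses one point for the 'P' and one for the missing ts (-2), so it would be greyed out.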

The next version will load the horses for each race into an array and output them in order of score. It will also fade out non-runners and calculate the score in a function. I would also like to output a header for the columns.
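A rough sketch of how that next version could look: the runners go into an array, a function calculates the score, and usort puts the highest score first under a column header. The runners below are invented, score() is a hypothetical helper, and plain echo is used instead of render() to keep the example self-contained. Non-runners are left out because I have not checked how the JSON marks them.

```php
<?php

// Hypothetical runners for one race; names and ratings are invented.
$runners = [
    ['number' => 1, 'name' => 'Alpha',   'ofr' => 110,  'rpr' => 115, 'ts' => 90],
    ['number' => 2, 'name' => 'Bravo',   'ofr' => 125,  'rpr' => 130, 'ts' => 100],
    ['number' => 3, 'name' => 'Charlie', 'ofr' => null, 'rpr' => 120, 'ts' => 95],
];

// Total of the three ratings; a missing rating counts as 0.
function score(array $horse): int
{
    return (int) $horse['ofr'] + (int) $horse['rpr'] + (int) $horse['ts'];
}

// Attach the score to each runner.
foreach ($runners as &$horse) {
    $horse['score'] = score($horse);
}
unset($horse); // break the reference left over from the loop

// Sort by score, highest first.
usort($runners, fn (array $a, array $b): int => $b['score'] <=> $a['score']);

// Column header, then one line per runner.
echo str_pad('No', 2, ' ', STR_PAD_LEFT) . ' ' . str_pad('Name', 20) . ' Score' . PHP_EOL;
foreach ($runners as $horse) {
    echo str_pad((string) $horse['number'], 2, ' ', STR_PAD_LEFT) . ' '
        . str_pad($horse['name'], 20) . ' '
        . $horse['score'] . PHP_EOL;
}
```

Here Bravo (355) comes out ahead of Alpha (315) and Charlie (215).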

I couldn't get the str_pad function to work with spaces, so I used dots, which is OK.
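For comparison, str_pad pads with spaces by default when the line is written with plain echo, as the small sketch below shows. That suggests the spaces are being lost when the line goes through render() as HTML, but that is an untested assumption about how Termwind treats whitespace. The horse name is invented.

```php
<?php

// str_pad uses a space as the pad character unless told otherwise.
$line = str_pad('7', 2, ' ', STR_PAD_LEFT) . ' ' . str_pad('Some Horse', 20) . '|';

// The closing bar makes the trailing padding visible in the terminal.
echo $line . PHP_EOL;
```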

I am interested in how people weight the factors. At the moment every factor has a weight of 1, and each missing value counts as minus one in the negative column. I am going to test it out over the next few days and see how it works.
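One way to experiment with weights is sketched below. The weights and the penalty are placeholder numbers to tune, not recommendations, and weightedScore() is a hypothetical helper. Unlike the current script, it folds the missing-value penalty into the score itself instead of keeping a separate negative column.

```php
<?php

// Placeholder weights per rating; these values are assumptions, not tested figures.
const WEIGHTS = ['ofr' => 1.0, 'rpr' => 1.5, 'ts' => 0.5];

// Points deducted for each missing rating (also a placeholder).
const MISSING_PENALTY = 1.0;

function weightedScore(array $horse, array $weights = WEIGHTS, float $penalty = MISSING_PENALTY): float
{
    $score = 0.0;
    foreach ($weights as $field => $weight) {
        if (empty($horse[$field])) {
            // A missing rating is penalised rather than counted as 0.
            $score -= $penalty;
            continue;
        }
        $score += $weight * (float) $horse[$field];
    }
    return $score;
}

// Invented horse with a missing ts rating: 100 * 1.0 + 120 * 1.5 - 1.0 = 279
$horse = ['ofr' => 100, 'rpr' => 120, 'ts' => null];
echo weightedScore($horse), PHP_EOL;
```

Passing every weight as 1.0 reproduces the current total for horses with no missing ratings, which makes it easy to compare a new weighting against the existing one.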


If you would like to contact me, use the form on londinium.com or ilminster.net, or reach me via Twitter @andylondon.