Alright, so today I’m gonna walk you through how I tackled this “Athletic Bilbao prediction” thing. It was a bit of a rollercoaster, lemme tell ya.

First things first, I needed data. Like, a LOT of it. I started by scraping historical match results for Athletic Bilbao. Think past seasons, home and away games, goals scored, goals conceded – the whole shebang. I used Python with Beautiful Soup to pull this data from a couple of sports websites. It was messy work, cleaning up all the inconsistencies and formatting errors, but you gotta do what you gotta do.
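Here’s roughly what that scraping step looked like. Fair warning: the URL and the table structure below are made up for this sketch – every site lays out its results differently, so you’d have to adapt the selectors:

```python
import csv

import requests
from bs4 import BeautifulSoup

# Placeholder URL and HTML layout -- adapt to whatever site you're scraping
url = "https://example-sports-site.com/athletic-bilbao/results"
resp = requests.get(url, timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
rows = []
for tr in soup.select("table.match-results tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == 5:  # date, opponent, venue, goals_for, goals_against
        rows.append(cells)

# Dump to CSV so the cleaning and feature steps have something stable to work from
with open("bilbao_matches.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["date", "opponent", "venue", "goals_for", "goals_against"])
    writer.writerows(rows)
```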
Once I had the historical data, I grabbed some current season stats. Things like current form, player injuries, suspensions, and even weather forecasts for the match day. Found a few reliable sports news sites and APIs for this. Again, Python to the rescue!
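Something in this spirit for the API side. The endpoint and response fields here are invented for illustration (every sports API has its own shape), but the pattern is the same:

```python
import requests

# Hypothetical endpoint -- substitute whatever API you actually have access to
API_URL = "https://api.example-sports.com/v1/teams/athletic-bilbao"
resp = requests.get(API_URL, params={"include": "form,injuries"}, timeout=10)
resp.raise_for_status()
data = resp.json()

current_form = data.get("form", [])         # e.g. ["W", "D", "L", "W", "W"]
injured_players = data.get("injuries", [])  # list of player names
print(f"Last 5: {current_form}, out injured: {len(injured_players)} players")
```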
Next up, feature engineering. This is where I tried to make the data actually useful. I calculated things like average goals scored per game, win percentages, recent performance (last 5 games), home advantage (difference between home and away win rates), and a bunch of other stuff. This part involved a lot of trial and error, trying different combinations of features to see what seemed to correlate best with match outcomes.
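A simplified version of that feature engineering, assuming the CSV from the scraping sketch above (with venue stored as "H"/"A"). The window sizes are illustrative, not the exact ones I settled on:

```python
import pandas as pd

df = pd.read_csv("bilbao_matches.csv", parse_dates=["date"]).sort_values("date")

# Result from Bilbao's point of view: 1 = win, 0 = draw, -1 = loss
df["result"] = (df["goals_for"] > df["goals_against"]).astype(int) \
             - (df["goals_for"] < df["goals_against"]).astype(int)
df["is_home"] = (df["venue"] == "H").astype(int)

# Rolling stats over previous matches; shift(1) keeps a row from "seeing"
# its own result, which would leak the answer into the features
df["avg_goals_for"] = df["goals_for"].shift(1).rolling(10).mean()
df["avg_goals_against"] = df["goals_against"].shift(1).rolling(10).mean()
df["form_last5"] = (df["result"] == 1).astype(int).shift(1).rolling(5).mean()

# Home advantage: running home win rate minus running away win rate
home_wins = ((df["result"] == 1) & (df["is_home"] == 1)).cumsum()
away_wins = ((df["result"] == 1) & (df["is_home"] == 0)).cumsum()
df["home_advantage"] = (home_wins / df["is_home"].cumsum()
                        - away_wins / (1 - df["is_home"]).cumsum()).shift(1)

df = df.dropna()  # early rows lack enough history for the rolling windows
```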
Then came the fun part: building the prediction model. I experimented with a few different machine learning algorithms. Started with something simple like Logistic Regression, then moved on to more complex stuff like Random Forests and Support Vector Machines (SVMs). I used scikit-learn in Python for this. Each model needed training data (the historical data I prepped earlier) and validation data to see how well it was performing.
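Here’s the basic shape of that training loop. The feature names come from the sketch above, and the hyperparameters are just starting values:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

FEATURES = ["is_home", "avg_goals_for", "avg_goals_against",
            "form_last5", "home_advantage"]
X, y = df[FEATURES], df["result"]  # result: 1 win / 0 draw / -1 loss

# Time-ordered split: train on older matches, validate on newer ones,
# since shuffling match results would let the model peek at the future
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "svm": SVC(probability=True, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_val, model.predict(X_val))
    print(f"{name}: validation accuracy {acc:.3f}")
```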
I spent a good chunk of time tuning the hyperparameters of each model. This basically means fiddling with the settings of the algorithms to get the best possible accuracy. Grid search and cross-validation were my best friends here. It’s tedious, but makes a huge difference.
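A minimal grid search sketch for the random forest. The grid values are plausible starting points, not the exact ones I landed on; the one thing worth copying is TimeSeriesSplit, which keeps the cross-validation folds chronological:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Illustrative grid -- expand or shrink depending on how long you can wait
param_grid = {
    "n_estimators": [100, 200, 500],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 3, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=TimeSeriesSplit(n_splits=5),  # chronological folds, no shuffling
    scoring="accuracy",
    n_jobs=-1,
)
search.fit(X_train, y_train)
print("Best params:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))
```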

After training the models, I had to evaluate them. Accuracy wasn’t the only metric I looked at. I also checked precision, recall, and F1-score, especially because predicting a draw is tough, and you want to see how well the model is doing on that front. I even built a simple backtesting framework to simulate betting on past games using the model’s predictions to see how it would have performed in the real world.
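The evaluation step, plus a stripped-down version of the backtest. The odds argument is hypothetical here – I pulled closing odds from a separate source – so treat this as a sketch of the idea:

```python
from sklearn.metrics import classification_report

best_model = search.best_estimator_
print(classification_report(y_val, best_model.predict(X_val),
                            labels=[-1, 0, 1],
                            target_names=["loss", "draw", "win"]))

# Toy backtest: flat 1-unit stake on the model's pick for each match.
# odds_for_pick[i] is assumed to be the bookmaker price on the predicted outcome.
def backtest(model, X, y, odds_for_pick):
    bankroll = 0.0
    for pred, actual, o in zip(model.predict(X), y, odds_for_pick):
        bankroll += (o - 1.0) if pred == actual else -1.0  # win pays o-1, else lose stake
    return bankroll
```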
Finally, I combined the results. No single model is perfect, so I decided to create an ensemble model. This involved averaging the predictions from the best-performing models (weighted based on their past performance). This usually gives a more robust and reliable prediction.
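A minimal sketch of that weighted averaging, assuming the fitted models dict from earlier. The weights below are made-up numbers standing in for each model’s validation accuracy; scikit-learn’s VotingClassifier with voting="soft" does essentially the same thing:

```python
import numpy as np

# Weighted soft voting by hand: average each model's predicted class
# probabilities, weighted by how well it did on the validation set
weights = {"logreg": 0.52, "forest": 0.55, "svm": 0.50}  # illustrative values

probas = sum(w * models[name].predict_proba(X_val)
             for name, w in weights.items())
probas /= sum(weights.values())

classes = models["forest"].classes_  # same label order for all the models
ensemble_preds = classes[np.argmax(probas, axis=1)]
```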
So, did it work? Well, sort of. The model definitely had some successes, predicting the correct outcome (win, draw, or loss) more often than random chance. But predicting the exact score? That’s a whole different ballgame. Still, it was a fun project, and I learned a ton about data analysis, machine learning, and the unpredictable nature of football.
Lessons learned? Data quality is key, feature engineering is crucial, and don’t expect to get rich predicting football matches. But hey, it’s a good way to practice your coding skills and learn something new.