Accuracy of BBC Weather forecasts for Honolulu

This repository records the forecasts made by BBC Weather for the city of Honolulu, USA. A GitHub Action runs at every 30-minute mark and saves the latest forecasts. The data is stored in a separate branch called data, so every saved snapshot is versioned by git. This makes it possible to go back in time and see the forecasts that were made for any given hour in the (relative) future.

I made this after watching Git scraping, the five-minute lightning talk by Simon Willison. It blew my mind! I agree with Simon that collecting and versioning API data via git is a powerful pattern. You could use this pattern to keep a ledger of any dynamic forecasting system, such as the predicted outcomes of football games. In such systems, the forecasts are updated whenever new information becomes available, so a forecasted value depends on the point in time at which it was made. I think that it's super interesting to analyse how these forecasts evolve through time.

The build_database.py script iterates through all the commits in the data branch and consolidates the data into an SQLite database. To run the script yourself, clone this repository, open a terminal, navigate to the cloned repository, and install the necessary Python dependencies:

python -m venv .env
source .env/bin/activate
pip install -r requirements.txt

Then, run the consolidation script:

python build_database.py
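
The internals of build_database.py are not reproduced in this README. As a rough sketch of how walking the commits of a branch could work, and not the script's actual code, the following shells out to `git log` and `git show`. The file name `forecasts.json` and all function names are hypothetical; the real layout of the data branch may differ.

```python
import json
import subprocess


def list_commits(repo, branch="data"):
    """Return (sha, ISO 8601 commit date) pairs for a branch, oldest first."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--reverse", "--format=%H %cI", branch],
        capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(line.split(" ", 1)) for line in out.splitlines() if line]


def read_file_at(repo, sha, path):
    """Return the contents of `path` as stored in commit `sha`."""
    return subprocess.run(
        ["git", "-C", repo, "show", f"{sha}:{path}"],
        capture_output=True, text=True, check=True,
    ).stdout


def iter_snapshots(repo, path="forecasts.json", branch="data"):
    """Yield (commit date, parsed JSON) for every commit on the branch.

    Assumes (hypothetically) that every commit contains `path`;
    `git show` fails with CalledProcessError otherwise.
    """
    for sha, committed_at in list_commits(repo, branch):
        yield committed_at, json.loads(read_file_at(repo, sha, path))
```

Each yielded snapshot could then be inserted into SQLite, using the commit date as a stand-in for the time the forecast was recorded.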

Running the script creates a bbc_weather.sqlite file. You can load this file into your preferred database access tool (I have a personal preference for DataGrip) to analyse the data. At present, the database contains two tables:

`forecasts`

These are the predicted weather values. Each row is a prediction issued at one point in time (`issued_at`) for a future point in time (`at`).

| issued_at | at | celsius | feels_like_celsius | wind_speed_kph |
|---|---|---|---|---|
| 2021-03-10 09:00:00 | 2021-03-10 11:00:00 | 24 | 30 | 16 |
| 2021-03-10 09:00:00 | 2021-03-10 12:00:00 | 25 | 31 | 17 |
| 2021-03-10 09:00:00 | 2021-03-10 13:00:00 | 26 | 32 | 17 |
| 2021-03-10 09:00:00 | 2021-03-10 14:00:00 | 27 | 33 | 17 |
| 2021-03-10 09:00:00 | 2021-03-10 15:00:00 | 26 | 33 | 17 |
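
Because each row carries both `issued_at` and `at`, the same target hour shows up once per forecast run. The sketch below uses an in-memory SQLite database to list how the prediction for a single hour changed as the issue time approached it. The column names follow the table above, but the column types are an assumption, and only the first row is taken from the sample; the other two rows are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """
    CREATE TABLE forecasts (
        issued_at TEXT,
        "at" TEXT,
        celsius INTEGER,
        feels_like_celsius INTEGER,
        wind_speed_kph INTEGER
    )
    """
)
rows = [
    ("2021-03-10 09:00:00", "2021-03-10 15:00:00", 26, 33, 17),  # from the sample
    ("2021-03-10 11:00:00", "2021-03-10 15:00:00", 27, 33, 18),  # made up
    ("2021-03-10 13:00:00", "2021-03-10 15:00:00", 27, 34, 18),  # made up
]
con.executemany("INSERT INTO forecasts VALUES (?, ?, ?, ?, ?)", rows)


def forecast_history(con, target):
    """Return (issued_at, lead time in hours, celsius) rows for one target hour."""
    query = """
        SELECT issued_at,
               CAST(ROUND((julianday("at") - julianday(issued_at)) * 24) AS INTEGER),
               celsius
        FROM forecasts
        WHERE "at" = ?
        ORDER BY issued_at
    """
    return con.execute(query, (target,)).fetchall()


for issued_at, lead_hours, celsius in forecast_history(con, "2021-03-10 15:00:00"):
    print(issued_at, lead_hours, celsius)
```

The lead time is derived with SQLite's `julianday` function, which returns fractional days, hence the multiplication by 24.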

`observations`

These are the weather values that actually occurred, as opposed to those that were forecasted.

| at | celsius | wind_speed_kph |
|---|---|---|
| 2021-03-09 19:00:00 | 23 | 0 |
| 2021-03-09 20:00:00 | 22 | 8 |
| 2021-03-09 21:00:00 | 22 | 0 |
| 2021-03-09 22:00:00 | 21 | 9 |
| 2021-03-09 23:00:00 | 21 | 0 |

Check out measure_accuracy.sql for an example of how to evaluate the correctness of the forecasts.
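
The contents of measure_accuracy.sql are not reproduced here. As a rough illustration of the idea only, the sketch below joins forecasts to observations on the target hour and averages the absolute temperature error per lead time. The rows are made up for the example, and mean absolute error is an assumed choice of metric, not necessarily the one the repository uses.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE forecasts (issued_at TEXT, "at" TEXT, celsius INTEGER);
    CREATE TABLE observations ("at" TEXT, celsius INTEGER);
    """
)
# Made-up rows: two forecast runs covering overlapping target hours.
con.executemany("INSERT INTO forecasts VALUES (?, ?, ?)", [
    ("2021-03-10 09:00:00", "2021-03-10 11:00:00", 24),
    ("2021-03-10 09:00:00", "2021-03-10 12:00:00", 25),
    ("2021-03-10 10:00:00", "2021-03-10 12:00:00", 27),
    ("2021-03-10 10:00:00", "2021-03-10 13:00:00", 26),
])
# Made-up observations for the same target hours.
con.executemany("INSERT INTO observations VALUES (?, ?)", [
    ("2021-03-10 11:00:00", 25),
    ("2021-03-10 12:00:00", 26),
    ("2021-03-10 13:00:00", 26),
])

# Mean absolute temperature error, grouped by how far ahead the forecast was made.
MAE_BY_LEAD_TIME = """
    SELECT CAST(ROUND((julianday(f."at") - julianday(f.issued_at)) * 24) AS INTEGER) AS lead_hours,
           AVG(ABS(f.celsius - o.celsius)) AS mean_abs_error,
           COUNT(*) AS n
    FROM forecasts AS f
    JOIN observations AS o ON o."at" = f."at"
    GROUP BY lead_hours
    ORDER BY lead_hours
"""
results = con.execute(MAE_BY_LEAD_TIME).fetchall()
for lead_hours, mean_abs_error, n in results:
    print(lead_hours, mean_abs_error, n)
```

Grouping by lead time makes it possible to check whether forecasts issued closer to the target hour are more accurate than those issued further ahead.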

Owner

Max Halford
