Creating a statistical model to predict 10 year treasury yields

Overview

Predicting 10-Year Treasury Yields

Intitially, I wanted to see if the volatility in the stock market, represented by the VIX index (data source), had a tangible impact on 10-Year Treasury yields (data source). Below are the results of my exploration of the VIX's effect on 10Y yields:

Line Graph Comparing VIX Price and Yield over the last 31 years

VIX and Yield TS

As can be seen in the above graph, there doesn't seem to be much correlation off the bat, simply looking at their annual trends. Overall, yields seem to have dropped quite dramatically over the last 31 years, with not much reaction to major changes in volatility. Meanwhile, VIX has had a more dramatic journey, with plenty of large ups and downs. Although it doesn't seem like much of a correlation from this view, it would be more beneficial to look at a scatter plot and create a regression line to be sure.

VIX vs. Yield Scatter Plot

VIX vs. Yield

The red line in the scatter plot is the regression line obtained. The regression line seems to be slanted downward, indicating a negative effect. This means that when the volatility in the stock market goes up, 10Y Treasury yields go down. The regression equation: 10-Year Treasury Yield = 4.71 + -0.02(VIX Price) indicates that an increase of $1 US in the VIX price would cause the yield to go down by 0.02 percentage points. Since the VIX price will never be $0, it does not make sense to interpret the y-intercept of 4.71. Thus, based on this scatter plot, and the fact that there is a slope to regression line, there may be a significant impact on yield by the price of VIX. However, to check if it is statistically significant, the t-statistic is needed.

Stata Analysis

Thus, I decided to run some statistical analysis in stata, contained here. The first regression I ran was between VIX Price and 10Y yields to see if there was any statistically significant effect of stock volatility on yields. When checking for statistical significance in the 5% size, the t-statistic of the coefficient must be either above 1.96 or below -1.96 to be considered significant. In this case, the t-statistic was -1.46, which meant that the stock volatility was not statistically significant.

...Not so fast. One issue with trying to simplify trends in this way is that omitted variables could play a big part in the statistical significance of present variables. Thus, I decided to use 4 more key macroeconomical datasets: unemployment rate, interest rate, change in CPI, and inflationary expectations. With these 4 key parts of the economy accounted for, I ran another regression, including all of the variables against the yield.

The new data was quite interesting. I had expected the change in CPI and inflationary expectations to be really important factors, but it turns out they are statistically insignificant. The t-statistic for change in CPI was 0.12 and for inflationary expectations was -1.71, short of the 1.96 and -1.96 thresholds required respectively. On the other hand, the t-statistic for the VIX Price dropped to -3.49, meaning that some of the variables that were added to the model were in fact invisibly impacting the effects of the volatility. The unemployment rate and interest rate were both statistically significant, with t-statistics of 10.99 and 37.20 respectively. Overall, 80.19% of the variation in the 10-Year Treasury yield could be explained by my model.

Interest Rate vs. 10-Year Treasury Yield Graph

ir vs. yield

Having seen the graph of a statistically insignificant variable (pre-multiple regression), I wanted to plot a scatter plot of an extremely significant variable to see the contrast. It is clear that there is a clear positive relationship between interest rate and the 10-Year Treasury yield. The regression line: 10-Year Treasury Yield = 2.31 + 0.73(Interest Rate) indicates that an increase in interest rate of 1 percentage point leads to a 0.73 percentage point increase in the yield. It is possible for rates to come down to 0, so the y-intercept indicates that the 10Y Treasury Note yields 2.31% when the interest rate hits 0. The constrast between the two red regression lines, as well as the distribution of the dots shown in the two scatter plots is quite clear, indicating how statistically significant the two variables are comparitavely.

Project instructions

10Y Treasury data citation:

OECD, "Main Economic Indicators - complete database", Main Economic Indicators (database),http://dx.doi.org/10.1787/data-00052-en (October 23, 2021) Copyright, 2016, OECD. Reprinted with permission.

Change in CPI data citation:

OECD, "Main Economic Indicators - complete database", Main Economic Indicators (database),http://dx.doi.org/10.1787/data-00052-en (October 23, 2021) Copyright, 2016, OECD. Reprinted with permission.

Inflation Expectation data citation:

Surveys of Consumers, University of Michigan, University of Michigan: Inflation Expectation© [MICH], retrieved from FRED, Federal Reserve Bank of St. Louis https://fred.stlouisfed.org/series/MICH/, (October 23, 2021)

Programmatically access the physical and chemical properties of elements in modern periodic table.

API to fetch elements of the periodic table in JSON format. Uses Pandas for dumping .csv data to .json and Flask for API Integration. Deployed on "pyt

the techno hack 3 Oct 23, 2022
Describing statistical models in Python using symbolic formulas

Patsy is a Python library for describing statistical models (especially linear models, or models that have a linear component) and building design mat

Python for Data 866 Dec 16, 2022
Office365 (Microsoft365) audit log analysis tool

Office365 (Microsoft365) audit log analysis tool The header describes it all WHY?? The first line of code was written long time before other colleague

Anatoly 1 Jul 27, 2022
A python package which can be pip installed to perform statistics and visualize binomial and gaussian distributions of the dataset

GBiStat package A python package to assist programmers with data analysis. This package could be used to plot : Binomial Distribution of the dataset p

Rishikesh S 4 Oct 17, 2022
X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana

X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana

Nguyễn Quang Huy 5 Sep 28, 2022
OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere.

opendrift OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere. Do

OpenDrift 167 Dec 13, 2022
Py-price-monitoring - A Python price monitor

A Python price monitor This project was focused on Brazil, so the monitoring is

Samuel 1 Jan 04, 2022
Titanic data analysis for python

Titanic-data-analysis This Repo is an analysis on Titanic_mod.csv This csv file contains some assumed data of the Titanic ship after sinking This full

Hardik Bhanot 1 Dec 26, 2021
[CVPR2022] This repository contains code for the paper "Nested Collaborative Learning for Long-Tailed Visual Recognition", published at CVPR 2022

Nested Collaborative Learning for Long-Tailed Visual Recognition This repository is the official PyTorch implementation of the paper in CVPR 2022: Nes

Jun Li 65 Dec 09, 2022
Open-Domain Question-Answering for COVID-19 and Other Emergent Domains

Open-Domain Question-Answering for COVID-19 and Other Emergent Domains This repository contains the source code for an end-to-end open-domain question

7 Sep 27, 2022
yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data.

The yt Project yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data. yt supports structured, varia

The yt project 367 Dec 25, 2022
Show you how to integrate Zeppelin with Airflow

Introduction This repository is to show you how to integrate Zeppelin with Airflow. The philosophy behind the ingtegration is to make the transition f

Jeff Zhang 11 Dec 30, 2022
Single machine, multiple cards training; mix-precision training; DALI data loader.

Template Script Category Description Category script comparison script train.py, loader.py for single-machine-multiple-cards training train_DP.py, tra

2 Jun 27, 2022
Parses data out of your Google Takeout (History, Activity, Youtube, Locations, etc...)

google_takeout_parser parses both the Historical HTML and new JSON format for Google Takeouts caches individual takeout results behind cachew merge mu

Sean Breckenridge 27 Dec 28, 2022
This mini project showcase how to build and debug Apache Spark application using Python

Spark app can't be debugged using normal procedure. This mini project showcase how to build and debug Apache Spark application using Python programming language. There are also options to run Spark a

Denny Imanuel 1 Dec 29, 2021
📊 Python Flask game that consolidates data from Nasdaq, allowing the user to practice buying and selling stocks.

Web Trader Web Trader is a trading website that consolidates data from Nasdaq, allowing the user to search up the ticker symbol and price of any stock

Paulina Khew 21 Aug 30, 2022
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.

Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.

HoloViz 2.9k Jan 06, 2023
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences

Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principle Component Analysis (fPCA) are st

32 Dec 20, 2022
Active Learning demo using two small datasets

ActiveLearningDemo How to run step one put the dataset folder and use command below to split the dataset to the required structure run utils.py For ea

3 Nov 10, 2021
Utilize data analytics skills to solve real-world business problems using Humana’s big data

Humana-Mays-2021-HealthCare-Analytics-Case-Competition- The goal of the project is to utilize data analytics skills to solve real-world business probl

Yongxian (Caroline) Lun 1 Dec 27, 2021