Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

Overview

data-science-on-gcp

Source code accompanying book:

Data Science on the Google Cloud Platform, 2nd Edition
Valliappa Lakshmanan
O'Reilly, Jan 2022
Branch edition2 [being built]
Data Science on the Google Cloud Platform
Valliappa Lakshmanan
O'Reilly, Jan 2017
Branch edition1_tf2 (also: main)

Try out the code on Google Cloud Platform

Open in Cloud Shell

The code on Qwiklabs (see below) is continually tested, and this repo is kept up-to-date. The code should work as-is for you, however, there are three very common problems that readers report:

  • Ch 2: Download data fails. The Bureau of Transportation website to download the airline dataset periodically goes down or changes availability due to government furloughs and the like. Please use the instructions in 02_ingest/README.md to copy the data from my bucket. The rest of the chapters work off the data in the bucket, and will be fine.
  • Ch 3: Permission errors. These typically occur because we expect that you will copy the airline data to your bucket. You don't have write access to gs://cloud-training-demos-ml/. The instructions will tell you to change the bucket name to one that you own. Please do that.
  • Ch 4, 10: Dataflow doesn't do anything.. The real-time simulation requires that you simultaneously run simulate.py and the Dataflow pipeline. If the Dataflow pipeline is not progressing, make sure that the simulate program is still running.

If the code doesn't work for you, I recommend that you try the corresponding Qwiklab lab to see if there is some step that you missed. If you still have problems, please leave feedback in Qwiklabs, or file an issue in this repo.

Try out the code on Qwiklabs

Purchase book

Read on-line or download PDF of book

Buy on Amazon.com

Updates to book

I updated the book in Nov 2019 with TensorFlow 2.0, Cloud Functions, and BigQuery ML.

Owner
Google Cloud Platform
Google Cloud Platform
💯 Coolest snippets

nvim-snippets This was originally included in my personal Neovim setup, but I didn't like having all the snippets there so I decided to have them sepa

Eliaz Bobadilla 6 Aug 31, 2022
Create Python API documentation in Markdown format.

Pydoc-Markdown Pydoc-Markdown is a tool and library to create Python API documentation in Markdown format based on lib2to3, allowing it to parse your

Niklas Rosenstein 375 Jan 05, 2023
Explorative Data Analysis Guidelines

Explorative Data Analysis Get data into a usable format! Find out if the following predictive modeling phase will be successful! Combine everything in

Florian Rohrer 18 Dec 26, 2022
graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elliptical orbits. you can change timestamp value or scale from source code idc.

solarSystemOrbitalSimulation graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elli

Mega 3 Mar 03, 2022
Collection of Summer 2022 tech internships!

Collection of Summer 2022 tech internships!

Pitt Computer Science Club (CSC) 15.6k Jan 03, 2023
Quilt is a self-organizing data hub for S3

Quilt is a self-organizing data hub Python Quick start, tutorials If you have Python and an S3 bucket, you're ready to create versioned datasets with

Quilt Data 1.2k Dec 30, 2022
Python syntax highlighted Markdown doctest.

phmdoctest 1.3.0 Introduction Python syntax highlighted Markdown doctest Command line program and Python library to test Python syntax highlighted cod

Mark Taylor 16 Aug 09, 2022
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio

learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your

BDFD 6 Nov 05, 2022
Code for our SIGIR 2022 accepted paper : P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

P3 Ranker Implementation for our SIGIR2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-bas

14 Jan 04, 2023
A python package to import files from an adjacent folder

EasyImports About EasyImports is a python package that allows users to easily access and import files from sister folders: f.ex: - Project - Folde

1 Jun 22, 2022
step by step guide for beginners for getting started with open source

Step-by-Step Guide for beginners for getting started with Open-Source Here The Contribution Begins 💻 If you are a beginner then this repository is fo

Arpit Jain 66 Jan 03, 2023
Literate-style documentation generator.

888888b. 888 Y88b 888 888 888 d88P 888 888 .d8888b .d8888b .d88b. 8888888P" 888 888 d88P" d88P" d88""88b 888 888 888

Pycco 808 Dec 27, 2022
Generating a report CSV and send it to an email - Python / Django Rest Framework

Generating a report in CSV format and sending it to a email How to start project. Create a folder in your machine Create a virtual environment python3

alexandre Lopes 1 Jan 17, 2022
Zero configuration Airflow plugin that let you manage your DAG files.

simple-dag-editor SimpleDagEditor is a zero configuration plugin for Apache Airflow. It provides a file managing interface that points to your dag_fol

30 Jul 20, 2022
pytorch_example

pytorch_examples machine learning site map 정리자료 Resnet https://wolfy.tistory.com/243 convolution 연산 정리 https://gaussian37.github.io/dl-concept-covolut

injae hwang 1 Nov 24, 2021
A web app builds using streamlit API with python backend to analyze and pick insides from multiple data formats.

Data-Analysis-Web-App Data Analysis Web App can analysis data in multiple formates(csv, txt, xls, xlsx, ods, odt) and gives shows you the analysis in

Kumar Saksham 19 Dec 09, 2022
Uses diff command to compare expected output with student's submission output

AUTOGRADER for GRADESCOPE using diff with partial grading Description: Uses diff command to compare expected output with student's submission output U

2 Jan 11, 2022
sphinx builder that outputs markdown files.

sphinx-markdown-builder sphinx builder that outputs markdown files Please ★ this repo if you found it useful ★ ★ ★ If you want frontmatter support ple

Clay Risser 144 Jan 06, 2023
the project for the most brutal and effective language learning technique

- "The project for the most brutal and effective language learning technique" (c) Alex Kay The langflow project was created especially for language le

Alexander Kaigorodov 7 Dec 26, 2021
Build AGNOS, the operating system for your comma three

agnos-builder This is the tool to build AGNOS, our Ubuntu based OS. AGNOS runs on the comma three devkit. NOTE: the edk2_tici and agnos-firmare submod

comma.ai 21 Dec 24, 2022