This repository contains answers to the Shopify Summer 2022 Data Science Intern Challenge.

Last update: Jan 11, 2022

Data-Science-Intern-Challenge

Summer 2022 Data Science Intern Challenge

Please complete the following questions, and provide your thought process/work. You can attach your work in a text file, link, etc. on the application page. Please ensure answers are easily visible for reviewers!

Question 1: Given some sample data, write a program to answer the following: click here to access the required data set

On Shopify, we have exactly 100 sneaker shops, and each of these shops sells only one model of shoe. We want to do some analysis of the average order value (AOV). When we look at orders data over a 30 day window, we naively calculate an AOV of $3145.13. Given that we know these shops are selling sneakers, a relatively affordable item, something seems wrong with our analysis.

Think about what could be going wrong with our calculation. Think about a better way to evaluate this data.

Answer: The naive figure was calculated as the total of all order values divided by the number of orders. This is misleading for sneakers because it ignores the fact that a single order can contain multiple items, so large multi-item orders inflate the average. I have tried to explain the problem with code. Click here to view it.

What metric would you report for this dataset?

Answer: A better approach is to divide the total of all order values by the sum of total_items, which gives the average value per item sold. This accounts for the fact that an order can have multiple items.

What is its value?

Answer: $357.92
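
The difference between the two calculations in the answers above can be sketched in a few lines of Python. This is only an illustration: the four orders are invented toy data, not the challenge data set, and the field name `order_value` and the helper names `naive_aov` and `value_per_item` are assumptions made for this sketch. It does not reproduce the $3145.13 or $357.92 figures.

```python
# Toy orders, invented for illustration only (not the challenge data set).
# Every shoe costs 150 here, so the "right" per-item value is easy to check.
orders = [
    {"order_value": 300.0, "total_items": 2},
    {"order_value": 150.0, "total_items": 1},
    {"order_value": 300000.0, "total_items": 2000},  # one large bulk order
    {"order_value": 450.0, "total_items": 3},
]


def naive_aov(orders):
    """Total order value divided by the number of orders."""
    return sum(o["order_value"] for o in orders) / len(orders)


def value_per_item(orders):
    """Total order value divided by the total number of items sold."""
    total_value = sum(o["order_value"] for o in orders)
    total_items = sum(o["total_items"] for o in orders)
    return total_value / total_items


print(naive_aov(orders))       # 75225.0, dominated by the bulk order
print(value_per_item(orders))  # 150.0, the price of one pair
```

In the toy data a single bulk order pushes the naive average far above the price of any shoe, while the per-item figure stays at the actual unit price.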

Question 2: For this question you’ll need to use SQL. Follow this link to access the data set required for the challenge. Please use queries to answer the following questions. Paste your queries along with your final numerical answers below.

How many orders were shipped by Speedy Express in total?

Answer: 54

What is the last name of the employee with the most orders?

Answer: Peacock

What product was ordered the most by customers in Germany?

Answer: Boston Crab Meat. This product was ordered 160 times in total.

Click here to check the SQL queries.
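
As a rough illustration of the kind of queries the three questions call for, the sketch below runs them with Python's built-in `sqlite3` module. It rests on several assumptions: the table and column names (`Orders`, `Shippers`, `Employees`, `Customers`, `Products`, `OrderDetails`) are a guessed Northwind-style schema, the rows are a tiny invented sample, and "ordered the most" is read as the largest summed `Quantity`. It is not the author's actual queries, and the toy counts do not match the answers above.

```python
import sqlite3

# In-memory database with an ASSUMED Northwind-style schema and invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shippers (ShipperID INTEGER PRIMARY KEY, ShipperName TEXT);
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                     EmployeeID INTEGER, ShipperID INTEGER);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);

INSERT INTO Shippers VALUES (1, 'Speedy Express'), (2, 'United Package');
INSERT INTO Employees VALUES (1, 'Davolio'), (2, 'Peacock');
INSERT INTO Customers VALUES (1, 'Germany'), (2, 'France');
INSERT INTO Products VALUES (1, 'Boston Crab Meat'), (2, 'Chais');
INSERT INTO Orders VALUES (10, 1, 2, 1), (11, 1, 2, 2), (12, 2, 1, 1);
INSERT INTO OrderDetails VALUES (10, 1, 5), (10, 2, 1), (11, 1, 3), (12, 2, 9);
""")

# 1. Number of orders shipped by Speedy Express.
speedy_orders = conn.execute("""
    SELECT COUNT(*)
    FROM Orders AS o
    JOIN Shippers AS s ON s.ShipperID = o.ShipperID
    WHERE s.ShipperName = 'Speedy Express'
""").fetchone()[0]

# 2. Last name of the employee with the most orders.
top_employee = conn.execute("""
    SELECT e.LastName, COUNT(*) AS n_orders
    FROM Orders AS o
    JOIN Employees AS e ON e.EmployeeID = o.EmployeeID
    GROUP BY e.EmployeeID
    ORDER BY n_orders DESC
    LIMIT 1
""").fetchone()

# 3. Product with the largest total quantity ordered by customers in Germany.
top_product = conn.execute("""
    SELECT p.ProductName, SUM(od.Quantity) AS total_quantity
    FROM OrderDetails AS od
    JOIN Orders AS o ON o.OrderID = od.OrderID
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    JOIN Products AS p ON p.ProductID = od.ProductID
    WHERE c.Country = 'Germany'
    GROUP BY p.ProductID
    ORDER BY total_quantity DESC
    LIMIT 1
""").fetchone()

print(speedy_orders, top_employee, top_product)
```

Each question follows the same pattern: join `Orders` to the lookup table that holds the name of interest, filter or group, then aggregate. If "ordered the most" were instead read as the number of distinct orders containing the product, `SUM(od.Quantity)` would be replaced by `COUNT(DISTINCT od.OrderID)`.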
