JSC270, Winter 2020 - Prof. Chevalier
We will be using Jupyter Notebook together with GitHub throughout the course.
For this pre-lab assignment, you are required to 1) enter your answers directly in this notebook using Python and Markdown, and 2) submit your work by committing your files on GitHub Classroom and uploading a PDF of your notebook on Quercus. The following instructions walk you through the whole process, from setting up your workspace to submitting your work.
jupyter notebook
.ipynb
file) in your browser. You will be able to edit the notebook directly from your browser.
This is an example of a markdown cell. Double-click on the cell to see the markdown.
Some inline $\LaTeX$: $\alpha = 0.05, \beta = 0.2 \Rightarrow \alpha+\beta=0.25.$ If $\LaTeX$ is enclosed between double dollar signs, as in
$$ \int_{-\infty}^{\infty} \exp(-x^2/2)\,dx = \sqrt{2 \pi} $$
then it is displayed on its own line.
Some text in bold.
some text indented
Markdown cells can display HTML.
<h3> Header 3 </h3>
<p> This is a paragraph using HTML. <br>
<mark> This is important so it's marked in yellow. </mark> </p>
This is a paragraph using HTML.
This is important so it's marked in yellow.
Images can be displayed using markdown or html.
To display a picture of the Toronto skyline (Photo by Berkay Gumustekin on Unsplash):

or HTML code:
<img src='toronto.jpg' style="width:400px; height:400px;">
Answer the following questions by entering your answers in the corresponding cells.
Question 1: Fill out the information below.
Enter your answers in this cell. You may remove this text
Question 2: Explain, in a few sentences, what you expect from this course.
Enter your answer in this cell. You may remove this text
Question 3: A professional data scientist is expected to be strong in the following skills. You are just starting to learn this discipline, so it is perfectly normal that you have not yet mastered all (or indeed any!) of these skills. Take a moment to think about your goals for the term. Which skills do you feel need the most development? Mark your level in each skill with an "X" in the table, using the following ratings:
Add an "X" mark in the corresponding column for each row in this cell. You may remove this text.
Skill | Level 1 | Level 2 | Level 3 | Level 4 |
---|---|---|---|---|
Programming (Python / R) | | | | |
Statistical analysis | | | | |
Writing (technical reports) | | | | |
Data visualization | | | | |
Public speaking | | | | |
Teamwork | | | | |
We will study the Toronto Bike Share data in this first laboratory. These questions will get you started with some Python basics and data manipulations.
Download the 2017 data set from the Toronto Bike Share website. The data is stored in comma-separated values (csv) files, one for each quarter of the year.
Here is an example of a basic .csv file:
There are several options to choose from to load data from a csv file into a Python data structure. We list two common approaches here:
Option 1: Use the python csv library.
See the library documentation for how to read and write csv files. Starter code is provided below.
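A minimal sketch of what such starter code could look like, using only `csv.reader` from the standard library. The helper name `first_and_last` and the sample contents written to a temporary `data.csv` are assumptions made for illustration; they are not part of the course material.

```python
import csv
import tempfile
from pathlib import Path


def first_and_last(path):
    """Return the (first, last) field of every non-empty row in a csv file."""
    pairs = []
    # newline="" is the setting the csv module documentation recommends for files
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:  # skip blank lines
                pairs.append((row[0], row[-1]))
    return pairs


# Hypothetical sample contents, written to a temporary data.csv for this demo.
sample = "name,city,age\nAda,Toronto,36\nLin,Ottawa,29\n"
with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "data.csv"
    data_path.write_text(sample)
    pairs = first_and_last(data_path)

# Display the first and last field of each row (the header row included).
for first, last in pairs:
    print(first, last)
```

Note that `csv.reader` returns every field as a string, so numeric columns need an explicit conversion (e.g. `int(last)`) before any arithmetic.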
In the above example, we read data from the data.csv file using the Python csv library, and display the first and last fields of each row.
Option 2: Another option (recommended for this lab, and beyond) is to use the built-in csv reader of pandas (see pandas documentation and pandas cheat sheet).
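As a rough illustration of the pandas route, the sketch below loads a tiny made-up table with `pandas.read_csv` and runs a few common operations on it. The column names (`trip_id`, `trip_duration_seconds`, `from_station`) and the values are hypothetical and will not match the Bike Share files; the data is read from an in-memory string only so that the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical trip records standing in for a real csv file.
csv_text = """trip_id,trip_duration_seconds,from_station
1,95,Union Station
2,640,King St W
3,1210,Bay St
4,60,Union Station
"""

# read_csv accepts a file path or any file-like object.
trips = pd.read_csv(io.StringIO(csv_text))

column_names = list(trips.columns)   # names of all columns
n_rows = len(trips)                  # number of rows
first_rows = trips.head(2)           # first two rows as a DataFrame

# Boolean indexing: keep only the rows that satisfy a condition.
short_trips = trips[trips["trip_duration_seconds"] < 120]

# Basic summary statistics of one column.
duration = trips["trip_duration_seconds"]
summary = (duration.min(), duration.max(), duration.mean())

print(column_names)
print(n_rows)
print(short_trips)
print(summary)
```

With a file on disk, the only change would be passing its path to `pd.read_csv` instead of the `io.StringIO` object.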
Question 1. Read the Bikeshare Ridership (2017 Q1).csv file. Display the names of all of the columns.
Question 2. Display the number of rows in the dataset.
Question 3. Display the first few rows of the dataset.
Question 4. Display all of the rows where the travel time is less than two minutes.
Question 5. Calculate and display the min, max and average trip duration values for this quarter.
Question 6. Formulate three research questions that would be interesting to investigate about Toronto Bike Share usage. Explain why each question is an interesting one to explore.
Write your answer in this cell. You may remove this text.
Submission and Grading
Complete all of the following tasks to submit your work.
Use nbconvert to save this Jupyter notebook as an html document without code cells. The command line syntax is:
jupyter nbconvert --TemplateExporter.exclude_input=True myfirstnb.ipynb
Submit your files using git commands as follows (see also the git basics):
git add [all files that you want to update/add to the repository]
git commit -m "[a meaningful comment about your commit, e.g. final submission]"
git push
Important note: Commit and push both your .ipynb and .html files to your repository, as well as all of the files (e.g. your headshot photo, the dataset you downloaded) that are necessary for re-executing your notebook from a clone of your repository.
The following grading scheme will be used for this assignment.
Note that marks will be deducted for the following reasons: notebook doesn't compile; files are missing; instructions are not followed.
Questions | Criteria | Marks |
---|---|---|
Questions III.1-3 | All questions are appropriately answered | 1 |
Questions IV.1-5 | All questions are correctly answered (1pt per question) | 5 |
Question IV.6 | Research questions are clear and interesting. Explanations are clear and thoughtful. | 9 |
Total | | 15 |