JSC270, Winter 2020 - Prof. Chevalier

A3 - Practice Laboratory

Date: March 11, 12:00 - 14:00, BA3175

Work on this lab with your A3 team.
This lab is just a practice: it will not be graded.

Github demo

Go to the github classroom link https://classroom.github.com/g/Hle3fHiJ

Make sure you and your teammates all have access to the repository. Assign a number to each team member (person 1, person 2, ...).

Individually: Create a branch and pull request

A. Create a branch

Clone the repository on your computer.

In your terminal, type git branch. This tells you which branch you are currently working on locally. Because there are no other branches yet, you should see master.

Create a new branch with git branch [your-name-feature-name], where your-name is your first name or user name on Github.com, and feature-name is the feature you are working on in the branch. For this exercise feature-name can be called first-PR.

Check out the branch with git checkout [your-name-feature-name].

Person 1: Add a markdown cell before the first code cell below and add some text that describes the interactive widget in the first code cell.
Person 2: Add a markdown cell before the second code cell below and add some text that describes the interactive widget in the second code cell.
Person 3: Add a markdown cell before the third code cell below and add some text that describes the interactive plot in the third code cell.
Person 4: Add a markdown cell before the fourth code cell below and add some text that describes the interactive plot in the fourth code cell.

Each person should stage the file with git add sandbox.ipynb, then commit the changes to the branch with git commit -m "your message" (or upload the file directly to your branch on Github.com), and push the branch to Github: git push -u origin [your-name-feature-name]. The option -u sets the upstream branch and is only needed the first time you push.
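The steps in part A can be collected into one command sequence. The sketch below only assembles and prints the commands; it does not run git. The helper name `first_pr_commands`, the user name `alice`, and the commit message are illustrative assumptions, and `origin` is assumed to be the remote that `git clone` sets up.

```python
import shlex


def first_pr_commands(your_name, feature_name="first-PR", notebook="sandbox.ipynb"):
    """Return the git commands for part A as argument lists (nothing is executed)."""
    branch = f"{your_name}-{feature_name}"
    return [
        ["git", "branch", branch],                # create the branch
        ["git", "checkout", branch],              # switch to it
        ["git", "add", notebook],                 # stage the edited notebook
        ["git", "commit", "-m", f"Describe widget in {notebook}"],  # hypothetical message
        ["git", "push", "-u", "origin", branch],  # -u sets the upstream on the first push
    ]


commands = first_pr_commands("alice")  # "alice" is a placeholder user name
for cmd in commands:
    # quote each argument so the printed line can be pasted into a terminal
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as a list of arguments avoids quoting mistakes in the commit message if the list is later passed to `subprocess.run`.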

B. Create a pull request

Once you are ready to push the changes in your branch to the master repository, you need to create a pull request for the team to review your code before integration.

Go to Github.com and open a pull request (use the green button "Compare & pull request"). Give your pull request a title and a brief description. The page shows whether your pull request conflicts with the master branch.

As a group: Reviewing and merging the pull requests

Never merge your own pull request to the master. Somebody else in your team (ideally everyone as a group) should review your pull request and agree before it is merged.

Convene as a group to review the pull request of Person #1. Go to Github.com and review the Pull Request of Person #1.

Look over the changes in the diffs and make sure they're what you want to submit.

To bring the changes to the main repository, one of the team members (who is not the one making the pull request) can merge the branch into the master branch.
- Click the green Merge pull request button to merge the changes into master.
- Click Confirm merge.

Repeat these steps for all pending pull requests.

Individually: Back to your local environment

Since your pull request is now merged and closed, do not keep pushing changes to that branch; new work belongs in a new branch.

Get back to the master branch: git checkout master.

Your local master is not up to date. The changes from your own pull request need to be pulled: git pull.

Create a new branch from master to work on a new feature, and proceed as shown before.

Individually: Addressing potential conflicts before making a pull request

You should communicate extensively as a team, to avoid working on the same portions of the same files. Editing the same cells concurrently will result in conflicts.

Because everyone will be working in parallel, possibly on the same notebook, you need to make sure that your local version (master and branch) integrates the latest updates to the remote master.

The git fetch command allows your local repository (master and branches) to be aware of the changes to all the branches in the repository (including master).

To bring changes from another branch (including the master branch) into your own local branch, you can use git merge. For example, to bring the latest changes from the remote master branch into your branch: git merge origin/master. This lets you resolve, locally, any conflicts between your changes and the shared remote repository before you open a pull request.
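The fetch-then-merge routine can be wrapped in a small helper. This is a sketch under assumptions: the function name `sync_with_master` and its `dry_run` flag are hypothetical, and by default the helper only prints what it would run.

```python
import subprocess


def sync_with_master(dry_run=True):
    """Fetch remote changes, then merge the remote master into the current branch."""
    steps = [
        ["git", "fetch"],                   # learn about changes on the remote
        ["git", "merge", "origin/master"],  # bring remote master into the current branch
    ]
    for step in steps:
        if dry_run:
            print("would run:", " ".join(step))
        else:
            # check=True raises CalledProcessError if a command exits with an error
            subprocess.run(step, check=True)
    return steps


planned = sync_with_master()  # dry run: prints the two commands
```

With `dry_run=False`, a merge conflict makes `git merge` exit with an error, so the helper stops there and the conflict has to be resolved by hand before committing.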

## Person 1 should add a cell before this code cell ##

from ipywidgets import interact
import ipywidgets as widgets

def f(x):
    return x

interact(f, x=10)     # integer argument generates an IntSlider widget

<function __main__.f(x)>

## Person 2 should add a cell before this code cell ##

interact(f, x=True)     # boolean argument generates a Checkbox widget

<function __main__.f(x)>

## Person 3 should add a cell before this code cell ##

interact(f, x='Text')    # text argument generates a Text widget

<function __main__.f(x)>

## If your team has 4 members, person 4 should add a cell before this code cell ##

def f(a, b):
    print('a + b =', a + b)     # print returns None, so there is nothing to return

interact(f, a=1, b=1);

IPython Widgets and Interact

This part is an introduction to IPython widgets. These tools let us build interactivity into our notebooks, often with a single line of code. Widgets are very useful for data exploration and analysis, for example for selecting certain data or updating charts. In effect, widgets let you turn a Jupyter notebook into an interactive dashboard instead of a static document.

What are widgets?

Widgets are eventful Python objects that have a representation in the browser, often as a control like a slider or a textbox. For example, a slider widget lets the user change a value, and moving it can trigger an update of another value.
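As a minimal sketch of what "eventful" means: the callback below is registered with `observe` and runs each time the slider's value changes. The names `doubled_text`, `build_linked_widgets`, and `on_value_change`, and the doubling rule itself, are hypothetical. The ipywidgets import sits inside the builder so that the formatting function can be tried on its own.

```python
def doubled_text(new_value):
    """Text shown for a given slider value (the doubling rule is only an illustration)."""
    return f"2 x {new_value} = {2 * new_value}"


def build_linked_widgets():
    """Create a slider and a label that updates when the slider moves (use in a notebook)."""
    import ipywidgets as widgets

    slider = widgets.IntSlider(value=3, min=0, max=10, description="x")
    label = widgets.Label(value=doubled_text(slider.value))

    def on_value_change(change):
        # `change` carries the old and new values of the observed trait
        label.value = doubled_text(change["new"])

    # call on_value_change whenever the slider's `value` trait changes
    slider.observe(on_value_change, names="value")
    return widgets.VBox([slider, label])


# In a notebook cell: display(build_linked_widgets())
```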

Ref: ipywidgets docs

Using Interact

interact creates UI controls for exploring code and data interactively. We can pass arguments to interact to get a slider, dropdown, or text.

Ref: interact
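Besides single values, `interact` accepts a few other shorthand arguments. The sketch below shows a tuple producing a slider and a list producing a dropdown; the functions `square` and `label_for` and the wrapper `show_abbreviations` are hypothetical names, and the wrapper is meant to be called from a notebook cell.

```python
def square(x):
    return x * x


def label_for(choice):
    return f"You picked {choice}"


def show_abbreviations():
    """Display three interact controls built from shorthand arguments."""
    from ipywidgets import interact

    interact(square, x=(0, 10, 1))                    # (min, max, step) of ints -> IntSlider
    interact(square, x=(0.0, 1.0, 0.1))               # (min, max, step) of floats -> FloatSlider
    interact(label_for, choice=["author", "title"])   # list of options -> Dropdown


# In a notebook cell: show_abbreviations()
```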

Example

Let's load the top 10 movie lists that we scraped from the Criterion website.

import pandas as pd

df = pd.read_csv('top10-lists.csv')
df.head()

Now, let's create a slider widget that will allow us to interactively select the number of rows to be displayed.

def frows(x):
    return df.head(x)

interact(frows, x = 10);

Playing with the slider, you will notice that the default range of values is not appropriate. Let's constrain the slider to the range [0, number of rows]. To do that, we can pass an IntSlider as the keyword argument for x and set its parameters as desired.

interact(frows, x=widgets.IntSlider(min=0, max=df.shape[0], value=3));

Now, let's filter the data and display only the selected column, chosen by column ID.

def datacol(colnum):
    return df.iloc[:, colnum].head(3)     # select the column by position

interact(datacol, colnum=widgets.IntSlider(min=0, max=len(df.columns)-1, value=0));

The column ID is not very informative. A better selection widget would be to let the viewer select the desired column from a drop-down list showing the column names.

def datacol(colname):
    return df[colname].head(3)

interact(datacol, colname = list(df));   # a dropdown is generated when a list
                                       # (or a list of (label, value) tuples) is given

Here is an example where the viewer can select a movie name from the list of movies in our dataframe, without the duplicates.

import pandas as pd
import numpy as np

# dropdown menu of titles, removing the duplicates, and sorting
dd = widgets.Dropdown(options = df.title.drop_duplicates().sort_values())

# display dropdown and dataframe
display(dd)
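The dropdown above is displayed but not yet connected to the dataframe. One way to connect them, sketched here under assumptions, is to let `interact` call a filtering function with the selected title. The names `sample`, `rows_for_title`, and `show_title_picker` are hypothetical, and `sample` is a small stand-in for top10-lists.csv built from the five rows reproduced at the end of this handout.

```python
import pandas as pd

# Stand-in for top10-lists.csv (assumption: only the five rows shown in this handout)
sample = pd.DataFrame({
    "author": ["Kasi Lemmons’s Top10"] * 5,
    "director": ["Joseph L. Mankiewicz", "Bob Fosse", "Marcel Camus",
                 "Nicolas Roeg", "Spike Lee"],
    "ranks": [1, 2, 3, 4, 5],
    "title": ["All About Eve", "All That Jazz", "Black Orpheus",
              "Don’t Look Now", "Do the Right Thing"],
})


def rows_for_title(frame, title):
    """Return the rows of `frame` whose title matches the selection."""
    return frame[frame["title"] == title]


def show_title_picker(frame):
    """Show a dropdown of unique, sorted titles and the matching rows (use in a notebook)."""
    from ipywidgets import interact

    titles = sorted(frame["title"].drop_duplicates())
    interact(lambda title: rows_for_title(frame, title), title=titles)


# In a notebook cell: show_title_picker(df)
```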

interactive, interact_manual, and interactive_output, together with the continuous_update option of slider widgets, provide additional flexibility beyond interact. In the example below, we create sliders to explore different t distributions.
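Before the larger example, here is a short sketch of how some of these variants differ. The names `describe` and `show_variants` are hypothetical, `interactive_output` is not covered, and the wrapper is meant to be called from a notebook cell.

```python
def describe(n):
    return f"showing {n} rows"


def show_variants():
    """Display the same function through interactive and interact_manual."""
    import ipywidgets as widgets
    from ipywidgets import interactive, interact_manual
    from IPython.display import display

    # continuous_update=False: the function runs when the slider is released,
    # not at every intermediate position while dragging
    slider = widgets.IntSlider(value=3, min=0, max=10, continuous_update=False)

    # interactive returns the widget instead of displaying it,
    # so it can be placed in a layout or inspected later
    w = interactive(describe, n=slider)
    display(w)

    # interact_manual adds a button; the function runs only when it is clicked
    interact_manual(describe, n=(0, 10))
    return w


# In a notebook cell: w = show_variants()
```

Delaying updates this way is mostly useful when the function is slow, for example when it redraws a large plot.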

from ipywidgets import interact
import ipywidgets as widgets
from scipy.stats import t
import matplotlib.pyplot as plt
import numpy as np

def plott(mean,sd, df):
    fig, ax = plt.subplots()
    x0 = np.linspace(t.ppf(0.01, df = 1, loc = 0, scale = 1),t.ppf(0.99,df=1, loc = 0, scale = 1), 100)
    x = np.linspace(t.ppf(0.01, df = df, loc = mean, scale = sd),t.ppf(0.99, df=df, loc = mean, scale = sd), 100)
    plt.plot(x0, t.pdf(x0, df=1, loc = 0, scale =1), lw=5, alpha=0.6, label=r'$t_{1}$')
    plt.plot(x, t.pdf(x, df=df, loc = mean, scale =sd), lw=5, alpha=0.6, label=r'$t_{\nu}$')
    plt.legend()     # one call after both curves are drawn
    
    plt.axvline(x=0, color = 'red')
    plt.axvline(x=mean)
    #plt.show()


interact(plott, mean = widgets.IntSlider(value=1, min=0, max=20, step=1),
                   sd = widgets.IntSlider(value=1, min=1, max=5, step=1),
                   df = widgets.IntSlider(value=1, min=1, max=30, step=1))

widgets.HTML(
    value="<b>Drag the slider to explore different t distributions</b>",
)

Exploration of normal distributions

from ipywidgets import interact
import ipywidgets as widgets
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np

def plotnorm(mean,sd):
    fig, ax = plt.subplots()
    x0 = np.linspace(norm.ppf(0.01, loc = 0, scale = 1),norm.ppf(0.99, loc = 0, scale = 1), 100)
    x = np.linspace(norm.ppf(0.01, loc = mean, scale = sd),norm.ppf(0.99, loc = mean, scale = sd), 100)
    plt.plot(x0, norm.pdf(x0, loc = 0, scale =1), lw=5, alpha=0.6, label='N(0,1)')
    plt.plot(x, norm.pdf(x, loc = mean, scale =sd), lw=5, alpha=0.6, label=r'N($\mu, \sigma^2)$')
    plt.legend()     # one call after both curves are drawn
    
    plt.axvline(x=0, color = 'red')
    plt.axvline(x=mean)
    #plt.show()


interact(plotnorm, mean = widgets.IntSlider(value=1, min=0, max=7, step=1),
                   sd = widgets.IntSlider(value=1, min=1, max=5, step=1))

widgets.HTML(
    value="<b>Drag the slider to explore different normal distributions</b>",
)

`interactive` Exercise

Create an interactive scatter plot that shows the effect of mean, variance, and sample size on a fitted simple linear regression line. Assume that the independent variable has a $N(\mu,\sigma^2)$ distribution.

## Put your code here
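One possible starting point, not a reference solution: simulate the independent variable, generate a response from an assumed linear model, and fit a line with `np.polyfit`. The true intercept, slope, and noise level below are assumptions chosen for illustration, as are the function names. The slider controls the standard deviation, so the variance is its square.

```python
import numpy as np

# Assumed data-generating model: y = 1 + 2x + noise, with noise ~ N(0, 1)
TRUE_INTERCEPT, TRUE_SLOPE, NOISE_SD = 1.0, 2.0, 1.0


def simulate_and_fit(mean, sd, n, seed=0):
    """Draw x ~ N(mean, sd^2), generate y from the assumed model, and fit a line."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=mean, scale=sd, size=n)
    y = TRUE_INTERCEPT + TRUE_SLOPE * x + rng.normal(scale=NOISE_SD, size=n)
    slope, intercept = np.polyfit(x, y, deg=1)  # coefficients, highest degree first
    return x, y, slope, intercept


def plot_fit(mean=0, sd=1, n=50):
    """Scatter plot of the simulated data with the fitted regression line."""
    import matplotlib.pyplot as plt

    x, y, slope, intercept = simulate_and_fit(mean, sd, n)
    fig, ax = plt.subplots()
    ax.scatter(x, y, alpha=0.6)
    xs = np.linspace(x.min(), x.max(), 100)
    ax.plot(xs, intercept + slope * xs, color="red",
            label=f"fit: y = {intercept:.2f} + {slope:.2f}x")
    ax.legend()
    plt.show()


# In a notebook cell:
# from ipywidgets import interactive
# interactive(plot_fit, mean=(-5, 5), sd=(1, 5), n=(10, 500, 10))
```

With a small sd or a small n, the fitted line varies more from one sample to the next; changing `seed` makes that visible.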

The Widgets-Overview.ipynb notebook contains additional examples of interactive widgets that you can draw inspiration from.

Voila: notebook as a webpage

You can visualize your notebook as a webpage using voila.

From your notebook, click on the "Voila" icon to open a pop-up window of your notebook rendered as a webpage.

You can also type voila sandbox.ipynb from your terminal.

Numina data

The NuminaSimpleQueries.ipynb notebook includes examples of simple queries to the Numina API.

Exercise: Extend the code in the "Counts" section to plot the time series of the pedestrian count.

Output of df.head() from the example above:

	author	director	ranks	title
0	Kasi Lemmons’s Top10	Joseph L. Mankiewicz	1	All About Eve
1	Kasi Lemmons’s Top10	Bob Fosse	2	All That Jazz
2	Kasi Lemmons’s Top10	Marcel Camus	3	Black Orpheus
3	Kasi Lemmons’s Top10	Nicolas Roeg	4	Don’t Look Now
4	Kasi Lemmons’s Top10	Spike Lee	5	Do the Right Thing