Welcome to JacobFK Website’s documentation!¶
Indices and tables¶
Introduction and Review¶
This notebook is meant to introduce the Jupyter notebooks and Python computing language and to show how they make solving many traditional mathematics problems easier. After getting started with the notebook, we run through some typical operations using python including variables, lists, plots, for loops, and functions. We demonstrate the use of these tools to solve some review mathematics problems from traditional Calculus textbook reviews.
Download and Start Jupyter¶
I recommend downloading and installing the most recent version of Anaconda available `here <>`__. You should choose the appropriate installation depending on your operating system. Once downloaded and installed, you can open the application and you should see something like the screen below.

Click on the Jupyter notebook tab and a browser window should open that has a file navigator similar to the image below.

Add a new folder and rename it calc_notebooks. Now, you can go into this folder, and start a new Python 3 notebook.
Markdown Cells¶
In your first notebook, you can click on the toolbar at the top and change the cell type. If you would like to type in a cell, we use a markdown cell. Markdown is a way to format typing with some minimal commands on the computer. Markdown cells also render many HTML tags, so if you are familiar with HTML you can use this to add effects to your documents.

Typing with Markdown¶
We use some simple syntax to create text effects in markdown. For
example, to make a word bold we type **bold**
and italics are
a single asterisk *italics*
. Sections can be denoted with different
level headings using the #
key as follows:
# Header I
## Header II
### Header III
Images with Markdown and HTML tags¶
Images can be added using simple markdown sytanx. Two things should be
considered with images. First, I recommend storing them in a folder in
the same directory as your notebooks. For example, I create a subfolder
named images
and place all my images there. For a larger project I
will make image folders for each notebook.
The second important thing to remember is to use a continuous filename when saving the file. This means no spaces, i.e.
pig_pic.png
not pig pic.png
From here, we use the markdown syntax

to refer to an image in the folder images named pig pic. We can also use HTML image tags to show an image. Here, we can utilize some additional controls over the height and width of the image displayed. You can use some additional controls including the center and figure command to add a caption and center the figure on the page. This would be as follows:
<center>
<figure>
<img src = "pig_pic.png" height = 10 width = 20 >
<figcaption> This is a pig picture </figcaption>
</figure>
</center>
Python and Code Cells¶
To begin, we demonstrate the utility of Python as a calculator within code cells in Jupyter. The familiar arithmetic operations are as follows:
+
Addition-
Subtraction*
Multiplication/
Division**
Exponentiation
To execute the cell, press shift
+ enter
.
In [1]:
2 + 3
Out[1]:
5
In [2]:
1 + 3 * 5 * (2-4) - 7**2
Out[2]:
-78
In [3]:
1/2
Out[3]:
0.5
Types¶
There are different forms of information that Python understands. These
types depend on the nature of the number or whether the entry is
something other than a number like a string
of symbols. We can
investigate the type of an object using the type()
function.
In [4]:
type(1), type(1/3), type("steve")
Out[4]:
(int, float, str)
In [5]:
a = 5
In [6]:
type(a)
Out[6]:
int
In [7]:
a * (1/3)
Out[7]:
1.6666666666666665
In [8]:
a
Out[8]:
5
Additional Numbers¶
We are familiar with numbers like \(\pi\) and values for
trigonmetric functions. We will use the numpy
library for these
values. More on the numpy
library follows, but for now we
demonstrate how to import, abbreviate, and use the numpy
library to
numerate \(\pi\) and some trigonmetric functions as follows.
In [9]:
import numpy as np
np.pi, np.sin(30), np.cos(2*np.pi)
Out[9]:
(3.141592653589793, -0.98803162409286183, 1.0)
Lists¶
The last example demonstrates an important object for us in the form of
a list
. A list is an indexed list of objects. For example, the list
above is different than that of [1, 3, 2]
because the order of the
objects matters. We can reference items in a list based on their
location.
In Python, lists begin by counting with zero. This image below
demonstrates a simple list L1, its entries
np.sin(30), 1, "robot", 2, "stew"
, and the corresponding indicies
for each entry.

In [10]:
L1 = [np.sin(30), 1, "robot", 2, "stew"]
In [11]:
type(L1)
Out[11]:
list
In [12]:
L1.append(4)
In [13]:
L1
Out[13]:
[-0.98803162409286183, 1, 'robot', 2, 'stew', 4]
In [14]:
L1[1:3]
Out[14]:
[1, 'robot']
In [15]:
L1[2:]
Out[15]:
['robot', 2, 'stew', 4]
In [16]:
L1 * 5
Out[16]:
[-0.98803162409286183,
1,
'robot',
2,
'stew',
4,
-0.98803162409286183,
1,
'robot',
2,
'stew',
4,
-0.98803162409286183,
1,
'robot',
2,
'stew',
4,
-0.98803162409286183,
1,
'robot',
2,
'stew',
4,
-0.98803162409286183,
1,
'robot',
2,
'stew',
4]
In [17]:
L1 + L1
Out[17]:
[-0.98803162409286183,
1,
'robot',
2,
'stew',
4,
-0.98803162409286183,
1,
'robot',
2,
'stew',
4]
Numpy and Arrays¶
Besides offering some basic numerical operations, numpy
has
functions built in to generate lists of values. We will use this
frequently to construct domains for functions we are investigating. Two
functions we will use consistently are the arange
and linspace
functions. Both are shown below. Remember any NumPy operation will be
preceded by np.
.
In [18]:
np.arange?
In [19]:
np.arange(0, 10, 1)
Out[19]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [20]:
np.arange(0, 10, 0.5)
Out[20]:
array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ,
5.5, 6. , 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])
In [21]:
np.linspace?
In [22]:
np.linspace(0, 10, 2)
Out[22]:
array([ 0., 10.])
In [23]:
np.linspace(0, 10, 11)
Out[23]:
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
In [24]:
np.linspace(0, 100, 101)
Out[24]:
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8.,
9., 10., 11., 12., 13., 14., 15., 16., 17.,
18., 19., 20., 21., 22., 23., 24., 25., 26.,
27., 28., 29., 30., 31., 32., 33., 34., 35.,
36., 37., 38., 39., 40., 41., 42., 43., 44.,
45., 46., 47., 48., 49., 50., 51., 52., 53.,
54., 55., 56., 57., 58., 59., 60., 61., 62.,
63., 64., 65., 66., 67., 68., 69., 70., 71.,
72., 73., 74., 75., 76., 77., 78., 79., 80.,
81., 82., 83., 84., 85., 86., 87., 88., 89.,
90., 91., 92., 93., 94., 95., 96., 97., 98.,
99., 100.])
Range¶
The range()
function can perform a similar function, however we
don’t get an array of values in return. Instead, this simply generates
the ability to count through a sequence of integers by a given step.
According to the help function, range:
Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step. range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted! range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
In [25]:
range(10)
Out[25]:
range(0, 10)
In [26]:
range(1, 10, 2)
Out[26]:
range(1, 10, 2)
For Loops and Sequences¶
A for loop defines a repeated series of operations to be carried out a
finite number of times. For example, the following code demonstrates the
repeated printing of members of range(5)
.
In [27]:
for i in range(5):
print(i)
0
1
2
3
4
Combining range
, for
, and append
.¶
Together, the for loop, the list, and the ability to add elements to
lists with the append
function allows us to express repeated
operations easily. For example, suppose we want to form a sequence of
integers. We can understand this as starting at 1 and adding 1 to each
previous term to get the next. Symbolically, we would write this as:
In Python, we write this as follows.
In [28]:
a = [1]
for i in range(10):
next = a[i] + 1
a.append(next)
a
Out[28]:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
Note that the second term is the first element generated within the
for
loop. We create a variable named next
and define it to be
the current term of the sequence + 1. We then tack this term on to the
end of the list, and continue doing so while i
ticks through the
values of range(10)
. Essentially, the loop is building the list
piece by piece as follows:
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
.
.
.
In [29]:
b = [1]
for i in range(1, 10, 2):
next = np.sin(i)
b.append(next)
b
Out[29]:
[1,
0.8414709848078965,
0.14112000805986721,
-0.95892427466313845,
0.65698659871878906,
0.41211848524175659]
In [30]:
c = [300]
In [31]:
for i in range(30):
next = c[i] + 0.03*c[i]
c.append(next)
c[-5:] #the last five values
Out[31]:
[646.9773802631524,
666.386701671047,
686.3783027211784,
706.9696518028138,
728.1787413568982]
Functions¶
Most of us have seen functions in high school mathematics. For example, the function
relates every input to its square. We are quite used to seeing these expressed as plots in a Cartesian coordinate plane. We will deal more with the intricacies of functions through the class, however it is important to be able to define and use basic mathematical functions. For example, to define the function above in Python, we will write:
def f(x):
return x**2
After doing this, if we want to find the associated value with some input, we can evaluate the function using the parenthesis following its name;
f(3)
We could also evaluate the function for a list of values all at once. Suppose we want to find the value of the first ten integers squared. We can make a list of the ten integers and plug this list into the function rather than evaluating each integer individually.
In [32]:
def f(x):
return x**2
In [33]:
f(3)
Out[33]:
9
In [34]:
x = np.arange(1, 11, 1) #first ten integers
f(x)
Out[34]:
array([ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
Plotting Functions with Matplotlib¶
Often, we can understand a situation better by visualizing it. We will constantly be graphing relationships expressed as function defined as a formula and as a process. Both are demonstrated below using the sequences and functions from above.
First, we tell Jupyter to make the graphs in the notebook itself with
the magic command %matplotlib inline
. We also import and abbreviate
the plotting library PyPlot as plt
. Whenever we call something from
this library, we preface it with plt.
.
First, we walk through a basic plot of the function \(f(x) = x^2\). We define the function, the \(x\) values we will evaluate it at, and then call the plot with the input and output variables as
plt.plot(x, f(x))
In [35]:
%matplotlib inline
import matplotlib.pyplot as plt
In [36]:
x = np.arange(1, 11, 1)
def f(x):
return x**2
In [37]:
f(x)
Out[37]:
array([ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
In [38]:
plt.plot(x, f(x))
Out[38]:
[<matplotlib.lines.Line2D at 0x114f46fd0>]

Plotting Sequences with Matplotlib¶
In a similar way, we can generate the plot for a sequence. Because we
are interested in the discrete points of the sequence, we will add the
marker key o
which tells matplotlib to draw circles at each point.
The plots can be customized easily, and we then demonstrate altering the
size and appearance of the markers as well as how to add titles, labels,
and a legend to a plot.
In [39]:
seq = []
for i in range(1,11):
next = i**2
seq.append(next)
seq
Out[39]:
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
In [40]:
plt.plot(seq, 'o')
Out[40]:
[<matplotlib.lines.Line2D at 0x11555f860>]

In [41]:
plt.plot(seq, 'o', c = 'lightblue', markersize = 12, alpha = 0.4)
Out[41]:
[<matplotlib.lines.Line2D at 0x11568b828>]

In [42]:
plt.plot(seq, '^', c = 'lightpink', markersize = 12, alpha = 0.7, label = "Sequence I")
plt.title("Adding Elements to the Plot")
plt.xlabel("Index of Term")
plt.ylabel("Value of Term")
plt.legend(loc = "best", frameon = False)
Out[42]:
<matplotlib.legend.Legend at 0x1157afa90>

Sequences and Functions of Import¶
We want to focus on four primary kinds of functions to begin– Linear, Quadratic, Exponential, and Trigonometric. We want to be able to recognize situations that can be modeled by these functions as well as understand the differences between them when expressed as words, numbers, tables, and plots.
We define and plot each as closed functions and as sequences. We
introduce the subplot
function to place plots side by side. The idea
of the subplot
function is to declare the number of rows
and
columns
of the figure. For each of these examples we want one row
with two columns. Finally, our third value indicates the place of the
figure.
plt.subplot(1, 2, 1)
Means 1 row, 2 columns, and place this plot as the first of the two. It should be accompanied by another subplot:
plt.subplot(1, 2, 2)
In [43]:
x = np.linspace(1, 10, 11)
def l(x):
return x
In [44]:
arith = [1]
for i in range(9):
next = arith[i] + 1
arith.append(next)
In [45]:
plt.figure(figsize = (12, 6))
plt.subplot(1, 2, 1)
plt.plot(x, l(x), label = "Linear Function")
plt.title("A Linear function $f(x) = x$")
plt.legend(loc = "best", frameon = False)
plt.subplot(1, 2, 2)
plt.plot(arith, 'o', label = "Arithmetic Sequence")
plt.title("An Arithmetic Sequence $a_n = a_{n-1} + 1$")
plt.legend(loc = "best", frameon = False)
Out[45]:
<matplotlib.legend.Legend at 0x115804630>

Modeling with Sequences and Functions¶
A crucial consideration in working with the Calculus is the notion of a rate of change. This notebook aims to familiarize you with four important families of functions and their key characteristics. Next, we present some scenarios where these families serve as appropriate models for particular kinds of situations because of the nature of the quantities change.
Arithmetic Sequences and Linear Functions¶
An arithmetic sequence is one that exhibits a constant change between terms. Every successive term of the sequence can be determined by multiplication and/or addition by the exact same constant. When plotted, the terms of an arithmetic sequence fall on a straight line.
“In mathematics, an arithmetic progression (AP) or arithmetic sequence is a sequence of numbers such that the difference between the consecutive terms is constant. For instance, the sequence 5, 7, 9, 11, 13, 15, . . . is an arithmetic progression with common difference of 2.” – Wikipedia
In [1]:
arith_1 = [1]
for i in range(5):
next = arith_1[i] + 1
arith_1.append(next)
arith_1
Out[1]:
[1, 2, 3, 4, 5, 6]
In [2]:
arith_2 = [2]
for i in range(5):
next = arith_2[i] +3
arith_2.append(next)
arith_2
Out[2]:
[2, 5, 8, 11, 14, 17]
In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
In [4]:
plt.figure(figsize = (8,5))
plt.plot(arith_1, '--o', label = "$a_0 = 1$; \n$a_{n+1} = a_n + 1$")
plt.plot(arith_2, '--o', label = "$a_0 = 2$; \n$a_{n+1} = a_n + 3$")
plt.title("Two Arithmetic Sequences in RECURSIVE FORM")
plt.legend(loc = "best", frameon = False)
Out[4]:
<matplotlib.legend.Legend at 0x10e6b4ac8>

Linear functions exhibit the same behavior. We can recognize linear situations because of a constant rate of change. A linear function is usually defined as:
where \(m\) is the rate of change or slope, and b is the \(y\)-intercept, analagous to \(a_0\) or the starting place in our sequence work. Thus, the sequences above can be defined using our knowledge of the starting value and rate of change (how much is added every term).
We can verify this with a plot.
In [5]:
x = np.arange(0, 6, 1)
def s1(x):
return x + 1
def s2(x):
return 3*x + 2
plt.figure()
plt.plot(x, s1(x), '--o', label = '$s1(x) = x + 1$')
plt.plot(x, s2(x), '--o', label = '$s2(x) = 3x + 2$')
plt.title("Two Linear Functions in CLOSED FORM")
plt.legend(loc = "best", frameon = False)
Out[5]:
<matplotlib.legend.Legend at 0x10ef3fa90>

Geometric Sequences and Exponential Functions¶
In a Geometric Sequence we generate terms through multiplication.
“In mathematics, a geometric progression, also known as a geometric sequence, is a sequence of numbers where each term after the first is found by multiplying the previous one by a fixed, non-zero number called the common ratio. For example, the sequence 2, 6, 18, 54, ... is a geometric progression with common ratio 3. Similarly 10, 5, 2.5, 1.25, ... is a geometric sequence with common ratio 1/2.”–Wikipedia
The graphs of these sequences exhibit a constant ratio in the rate of change and produce a curve the increases increasingly fast by this multiplier.
In [6]:
geom_1 = [1]
for i in range(5):
next = geom_1[i]*2
geom_1.append(next)
geom_1
Out[6]:
[1, 2, 4, 8, 16, 32]
In [7]:
geom_2 = [3]
for i in range(5):
next = geom_2[i]*2.034
geom_2.append(next)
geom_2
Out[7]:
[3,
6.101999999999999,
12.411467999999998,
25.244925911999992,
51.34817930500798,
104.44219670638623]
In [8]:
plt.figure()
plt.plot(geom_1, '--o', label = "$a_0 = 1$\n$a_{n+1} = a_n *2$")
plt.plot(geom_2, '--o', label = "$a_0 = 3$\n$a_{n+1} = a_n *2.034$")
plt.title("Two Geometric Sequences in RECURSIVE FORM")
plt.legend(loc = "best", frameon = False)
Out[8]:
<matplotlib.legend.Legend at 0x10ed9cdd8>

Exponential functions exhibit the same behavior. Typically, exponential functions are defined as:
where \(a\) is analagous to our starting value and \(b\) to the common ration between terms. Thus, we can define the two sequences in closed form as follows:
In [9]:
x = np.arange(0, 6, 1)
def s1(x):
return 2**x
def s2(x):
return 3*(2.034)**x
plt.figure()
plt.plot(x, s1(x), '--o', label = '$s1(x) = 2^x$')
plt.plot(x, s2(x), '--o', label = '$s2(x) = 3(2.034)^x$')
plt.title("Two Exponential Functions in CLOSED FORM")
plt.legend(loc = "best", frameon = False)
Out[9]:
<matplotlib.legend.Legend at 0x10f1d0ba8>

Application to Population Growth¶
A typical application of arithmetic and geometric sequences is to investigate population growth. A very simple model for a population may be:
- Population Model I: Population is currently 500. Every year we add 37 new people.
- Population Model II: Population is currently 500. Every year the population grows by 5.1%.
Model each of these with a recursive sequence and discuss which kind of sequence and closed form is appropriate for each. Then make a side by side plot showing the results of a recursively defined sequence and plot and a closed form function and plot.
Adjusting the Model¶
Perhaps the scenarios above seem a little to simplistic. It maybe that rather than changing by the same rate every year, that this rate of change increases as the population grows. If we suppose a constant increase in the rate of change, we are looking at a sequence whose rate of change changes linearly. If we make a sequence whose change is an arithmetic sequence, we should have this.
Note the rate of change between the terms in the sequence below.
In [10]:
p3 = [500]
for i in range(10):
next = p3[i] + (2*i + 3)
p3.append(next)
p3
Out[10]:
[500, 503, 508, 515, 524, 535, 548, 563, 580, 599, 620]
In [11]:
plt.plot(p3, '--o')
Out[11]:
[<matplotlib.lines.Line2D at 0x10f31c0b8>]

We will return to the closed form of this function later. What is important to notice now is the difference in the way that the sequence is generated. Each term changes by an additively changing amount. We can look closely a the difference in the terms of the sequence.
In [12]:
p3_diff = [1]
for i in range(10):
next = p3[i+1] - p3[i]
p3_diff.append(next)
In [13]:
p3_diff
Out[13]:
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
In [14]:
plt.plot(p3_diff, '--o')
plt.title("A Plot of the Difference in Terms of Sequence p3")
Out[14]:
<matplotlib.text.Text at 0x10f4f2320>

What is \(\pi\)?¶
In his book The Historical Development of the Calculus, C.H. Edwards relates an early history of \(\pi\). Around 430 BC, Hippocrates of Chios showed that the ratio of the area of two circles is equal to the ratio of the squares of their diameters. Later, Eudoxus and Archimedes worked to deploy the method of exhaustion in determining this value.
Rough Approximations¶
For Archimedes, he began with a hexagon inscribed and circumscribed on a circle with unit radius. Of course, we’ve probably seen the relationship for the area of a circle as
Thus, if we have \(r = 1\), the area of our circle is equal to \(\pi\). We can use some rough approximations and our knowledge of square roots to determine a first approximation as Archimedes did by using a unit circle and inscribed and circumscribed hexagons as seen in the image below.
<img src = “images/exhaust.png” height=”50%” width=”50%” /img>
Here, we decompose the inscribed and circumscribed polygons using our knowledge of triangles. As we saw in class, we can understand the inscribed triangles through the Pythagorean theorem, and some elementary knowledge about equilateral triangles. To find the height of these triangles, we recognize that dropping a perpendicular to the base creates a right triangle with hypotenuse 1 and short leg \(\frac{1}{2}\) shown below.
<img src = “images/triangle.png” height=”50%” width=”50%” /img>
Thus, we can find the height using the Pythagorean theorem. This gives us:
and an area for each of the six triangles of
Thus, for all six triangles we have
We can use Python to compute this as a number using our tenth approximation for \(\sqrt{3}\).
In [1]:
a = [1.6]
for i in range(9):
next = 0.5*(a[i]+3/a[i])
a.append(next)
print("The first approximation for pi is", (3*a[-1]/2))
The first approximation for pi is 2.598076211353316
Recurrence Relationships from Antiquity¶
While we could continue to determine the successive relationships as we go, there are actually patterns that develop in both Eudoxus who started with a square and Archimedes who began with a hexagon. As Bruce Shapiro notes in his book Scientific Computation: Python Hacking for Math Junkies, the Eudoxean relationship is contingent on the length of an outer edge of a polygon \(H_n\) and the perimeter of that polygone \(P_n\), as follows:
Further, we have a recursive relationship on \(H_n\) as:
Thus, if we have an initial approximation, we can deploy this relationship to iterate and find better and better approximations.
For Archimedes, we have a similar relationship where \(I_n\) is the inscribed polygon and \(C_n\) is the circumscribed polygon, thus
and the recurrence relationships
If you need something to do, you can establish these relationships.
Introduction to Infinite Processes¶
This notebook introduces two of the earliest known examples of mathematics motivated by the difficulty of representation. Both \(\sqrt{2}\) and \(\pi\) are problematic numbers. We can easily describe them geometrically, but when it comes to actually representing them, the best we can do is approximate them.
GOALS:
- Use Babylonian Square Root Algorithm
- Use Archimedes method of Exhaustion to approximate Pi
- Represent Zeno’s Paradox of the Tortoise and Achilles
Computing Square Roots¶
Suppose we have a guess that we think is close to \(\sqrt{2}\)
Either \(x_1\) is a better guess or \(\frac{2}{x}\), but even better still would be the average of the two:
If we continue in this manner we will get better and better approximations:
We can use our familiar for loop structure to generate successive approximations for \(\sqrt{2}\) based on this formulation.
In [1]:
sqrt = [2]
for i in range(10):
next = 0.5*(sqrt[i] + 2/(sqrt[i]))
sqrt.append(next)
sqrt
Out[1]:
[2,
1.5,
1.4166666666666665,
1.4142156862745097,
1.4142135623746899,
1.414213562373095,
1.414213562373095,
1.414213562373095,
1.414213562373095,
1.414213562373095,
1.414213562373095]
In [2]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
In [3]:
plt.plot(sqrt, '--o')
plt.ylim(0,2)
plt.title("Successive Approximations to $\sqrt{2}$")
Out[3]:
<matplotlib.text.Text at 0x112313940>

Pi and Trigonometry¶
Recall that we can understand \(\pi\) as the ratio between the diameter and circumference of a circle. Thus, if we have a circle with diameter 1, the circumference would be \(\pi\). We use this to approximate the value of \(\pi\).
In a similar manner we can approximate the value of \(\pi\). Following Aristarchus, we can consider a sqaure mounted inside a circle of radius 1, with its vertices located at \((1, 0), (0, 1), (-1, 0),\) and \((0, -1)\). We can use the perimeter of the square as an approximation for the circumference of the circle.

A better approximation would be if we used a regular polygon with twice as many sides. We can understand this as placing points at the halfway between each existing vertex. To mathematize this, let’s refresh our memory of the unit circle.
Unit Circle¶
The unit circle describes the location of points around a circle of radius 1. For us, we care about the use of \(\cos\) and \(\sin\) to place coordinates on the circle. If we cut our square in half, we would have points every \(45^o\) or \(\frac{\pi}{2}\), and can use the trigonometric measures of these angles as ways to find the coordinates.

We demonstrate the first two iterations of this below using
matplotlib.patches
Circle and RegularPolygon Functions.
In [4]:
import matplotlib.pyplot as plt
import matplotlib.patches as patches
fig1 = plt.figure()
ax1 = fig1.add_subplot(111, aspect = 'equal')
ax1.add_patch(patches.RegularPolygon((0,0), 4,1, alpha = .2))
ax1.add_patch(patches.Circle((0,0), 1, fill = False))
plt.xlim(-1,1)
plt.ylim(-1,1)
Out[4]:
(-1, 1)

In [5]:
import matplotlib.pyplot as plt
import matplotlib.patches as patches
fig1 = plt.figure()
ax1 = fig1.add_subplot(111, aspect = 'equal')
ax1.add_patch(patches.RegularPolygon((0,0),8,1, alpha = 0.2)) #center with 8 sides, radius 1
ax1.add_patch(patches.Circle((0, 0), 1, fill = False))
plt.xlim(-1,1)
plt.ylim(-1,1)
Out[5]:
(-1, 1)

The coordinate values of the points can be found with some creative use of a list comprehension and our knowledge of the angles.
In [6]:
square_approx = [[np.cos(i*np.pi/2), np.sin(i*np.pi/2)] for i in range(4)]
In [7]:
square_approx
Out[7]:
[[1.0, 0.0],
[6.123233995736766e-17, 1.0],
[-1.0, 1.2246467991473532e-16],
[-1.8369701987210297e-16, -1.0]]
Extending the Approximation¶
The work above is a strict under approximation for \(\pi\). We will always get a value less than it, as we will never completely fill the circle.
We could arrive at an upper bound for \(\pi\) by constructing a square outside of the circle and continually bisecting the sides to limit an upper boundary for out guess. Archimedes performed this in writing in the second century BC.
Zeno’s Paradox¶
The philosopher Zeno of Elea is said to be the product of a series of paradoxical questions, one of which dealt with the difficulty in quantifying the continuum. Here, the problem involved a race between a tortise and Achilles (a man). The tortise got a head start. Aristotle describes the problem in his Physics as follows:
“This claims that the slowest funner will never be caught by the fastest runner, because the one behind has first to reach the point from which the one in front started, and so the slower one is obound always to be in front” Aristotle”, Physics 239b14 - 18
The problem, according to Zeno, has been interpreted to mean that it is impossible to cross any unit distance before crossing half of it. It is impossible to cross this half without having crossed half of the half. Continue the argument to infinity and how can we ever move?! The image below offers a visualization of the problem.

Zeno and Sums¶
If we examine the image above, it should make sense to us that the sum of the terms of Zeno’s sequence add to one. It may be counter intuitive to think that an infinite number of terms add to a finite number, however this is one of the fundamental problems for us. What happens in the realm of the infinitely small?
We can use our tools to investigate this. First, let us create a sequence of Zeno’s terms as follows.
Now, we will create a sequence that provides the partial sums of the terms of Zeno as:
We will also plot these side by side to compare.
\(~\)
sum
in the construction
of the partial sums.
In [8]:
zeno = [1/2**(i+1) for i in range(10)]
zeno
Out[8]:
[0.5,
0.25,
0.125,
0.0625,
0.03125,
0.015625,
0.0078125,
0.00390625,
0.001953125,
0.0009765625]
In [9]:
zeno_sums = [sum(zeno[:i]) for i in range(10)]
In [10]:
zeno_sums
Out[10]:
[0,
0.5,
0.75,
0.875,
0.9375,
0.96875,
0.984375,
0.9921875,
0.99609375,
0.998046875]
In [11]:
plt.figure(figsize = (12, 5))
plt.subplot(1, 2, 1)
plt.plot(zeno, '--o')
plt.title("Zeno's Sequence of Terms")
plt.subplot(1, 2, 2)
plt.plot(zeno_sums, '--o')
plt.title("Partial Sums of Zeno's Terms")
Out[11]:
<matplotlib.text.Text at 0x112c85898>

Sequences and Series¶
This notebook picks up on Zeno’s problem to investigate some further applications of sequences and their sums; series. We will focus on four sequences of integers and their powers:
int1 = [1, 2, 3, 4, 5, ... ]
intsq = [1**2, 2**2, 3**2, 4**2, 5**2, ... ]
intcbd = [1**3, 2**3, 3**3, 4**3, 5**3, ... ]
intfrth = [1**4, 2**4, 3**4, 4**4, 5**4, ... ]
Let’s create these sequences and then plot them all on a \(2 \times 2\) grid.
In [1]:
int1 = [(i+1) for i in range(10)]
intsq = [(i+1)**2 for i in range(10)]
intcbd = [(i+1)**3 for i in range(10)]
intfrth = [(i+1)**4 for i in range(10)]
In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
plt.figure(figsize = (11, 7))
plt.subplot(221)
plt.plot(int1, 'o')
plt.title("int1 = [1, 2, 3, 4, 5, ...]")
plt.subplot(222)
plt.plot(intsq, 'o')
plt.title("intsq = $[1^2, 2^2, 3^2, 4^2, 5^2, ...]$")
plt.subplot(223)
plt.plot(intcbd, 'o')
plt.title("intcbd = $[1^3, 2^3, 3^3, 4^3, 5^3, ...]$")
plt.subplot(224)
plt.plot(intfrth, 'o')
plt.title("intfrth = $[1^4, 2^4, 3^4, 4^4, 5^4, ...]$")
Out[2]:
<matplotlib.text.Text at 0x11303db70>

In [3]:
plt.figure(figsize = (8, 6))
plt.plot(int1, '--o', label = "Sequence I")
plt.plot(intsq, '--o', label = "Sequence II")
plt.plot(intcbd, '--o', label = "Sequence III")
plt.plot(intfrth, '--o', label = "Sequence IV")
plt.ylim(0, 1000)
plt.legend()
plt.title("All Four Sequences Together")
Out[3]:
<matplotlib.text.Text at 0x1138f4d30>

Patterns within the Sequences¶
We are interested in being able to do something similar with what we had accomplished in recognizing the patterns in sequences. We will be interested in two specific patterns– that of the sequence itself and those of its partial sums. Let’s look at these patterns for sequence I, the integers.
Partial Sums¶
With Zeno’s problem, we were also interested in the addition of all the
pieces of the picture. We will do the same for the sequences above. To
begin, we will look if there is an easily discernable pattern in the
partial sums of the sequence. This means we will form another
sequence based on the sum of the first \(n\) terms of int1
. This
would mean:
Since we already have the sequence seq1
, we can use a list
comprhension to easily form the sequence of partial sums and a side by
side plot of the sequence and its partial sums.
In [4]:
psum_s1 = [sum(int1[:i+1]) for i in range(10)]
psum_s1
Out[4]:
[1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
In [5]:
plt.figure(figsize = (10, 5))
plt.subplot(121)
plt.plot(int1, '--o')
plt.title("Sequence")
plt.subplot(122)
plt.plot(psum_s1, '--o')
plt.title("Partial Sums")
Out[5]:
<matplotlib.text.Text at 0x113f4bc18>

We have seen these patterns before with our arithmetic and quadratic sequences. Notice the constant difference and first term of the integer sequence, and the constant rate of change between terms in its partial sums, leading to the quadratic case. It may not be immediately obvious what quadratic, but this is an important relationship nonetheless. The partial sums of an arithmetic sequence form a quadratic.
A Different Visualization¶
This is a more advanced graphic to make on the computer, but it serves a nice purpose for understanding the general pattern to the partial sums. Here, we create a rectangular grid that helps us understand the triangular numbers. We create an array of values where each row has one more 1, and color these. We should recognize the connection to the Red and Blue pieces of the series. Here, the area of the figures represents the sum of the terms. In the first figure for example, we have a geometric representation of the sum of the first four terms of the sequence of integers.
In [6]:
from ipywidgets import interact, widgets, fixed
In [7]:
np.zeros((3,4))
Out[7]:
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
In [8]:
def tri_numplot(n=4):
plt.figure(figsize=(6.0, 6.0))
a=np.zeros((n,n+1))#creates an n by n+1 array of zeros
for i in range(n):
for j in range(i+1):
a[i,j]=1
a=np.flipud(a)
plt.pcolor(a, edgecolors='k', linewidths=4)
plt.axis('tight')
plt.axis('off')
plt.title("Geometry of Partial Sums")
In [9]:
tri_numplot(4)

Generalizing the Partial Sums¶
Now, we can investigate to whether this pattern will hold for larger
values of n
with a slider. Pay attention to the area of the figures.
In [10]:
interact(tri_numplot, n=widgets.IntSlider(min=1, max=100, step=1, value=1));

Notice the geometry forms a rectangle with dimensions:
This is the total for two sequences though, and we only want one so we can divide this in half and we’ve determined a formula for the partial sums as:
We should be able to verify this with a plot of this function and a plot of the partial sums together. Here, we plot a function
Be careful about the indicies to see that your plot begins with the same terms.
In [11]:
def s(n):
return 0.5*n**2 + 0.5*n
n = np.arange(1, 10)
plt.figure(figsize = (10, 6))
plt.plot(s(n), '-o', alpha = 0.7, color = 'lightblue', label = "$s(n)$")
plt.plot(psum_s1, '-x', markersize = 16, color = 'red', label = "$\sum_i ^n i$")
plt.title("The Partial Sums and Function Agree")
plt.legend(loc = "best", frameon = False)
Out[11]:
<matplotlib.legend.Legend at 0x11464cb70>

Example 2¶
Let’s look at the same patterns for our sequence III, the integers cubed and its partial sums. As a reminder, here we have
intcbd = [1**3, 2**3, 3**3, 4**3, ...]
We will form the partial sums of the sequence in a similar manner to before, and see if we recognize a pattern.
In [12]:
psum_cbd = [sum(intcbd[:i+1]) for i in range(10)]
In [13]:
intcbd, psum_cbd
Out[13]:
([1, 8, 27, 64, 125, 216, 343, 512, 729, 1000],
[1, 9, 36, 100, 225, 441, 784, 1296, 2025, 3025])
The values of cb_sums
should look familiar. They are
\(1^2, 3^2, 6^2, 10^2, ...\) the squares of the partial sums of
int1
.
In [14]:
plt.figure(figsize = (10, 5))
plt.plot(intcbd, '--o', label = "$1^3, 2^3, 3^3, ...$")
plt.plot(psum_cbd, '--o', label = "Partial Sums")
plt.title("Sequence $a_i = i^3$ and its Partial Sums")
plt.legend()
Out[14]:
<matplotlib.legend.Legend at 0x114ad2f28>

Recalling our earlier results, we have a formula for the partial sums of the integers as \(\frac{1}{2}(n^2 +n)\).
We can use this to generalize the partial sums here as:
Again, we can create a function based on the pattern in the partial sums and compare it to the sequence of partial sums as follows.
In [15]:
psum_cbd = [sum(intcbd[:i+1]) for i in range(10)]
def s(n):
return (0.5*n**2 + 0.5*n)**2
n = np.arange(1, 10)
plt.figure(figsize = (10, 6))
plt.plot(s(n), '-o', alpha = 0.7, markersize = 12, color = 'purple', label = "$s(n)$")
plt.plot(psum_cbd, '-x', markersize = 16, color = 'blue', label = "$\sum_i ^n i^3$")
plt.title("The Partial Sums and Function Agree Again")
plt.legend(loc = "best", frameon = False)
Out[15]:
<matplotlib.legend.Legend at 0x114655470>

Symbolic Solutions by Hand¶
Here, we were able to recognize the pattern and generalize this by relating it to another pattern we knew. This allows us to simplify the expression we found above and represented as a closed function as follows:
This isn’t always so easy to see, and we will often resort to other means for recognizing the general form of patterns. One way to do so is to use the SymPy library to solve these symbolic problems. We demonstrate finding these general forms using SymPy below.
SymPy and Summations¶

SymPy allows us to use the computer to perform operations on symbols like we did above. We demonstrate the solution to the above problem to begin.
First, we import the SymPy library and abbreviate it as sy
. Next, we
declare n
a symbol. This is an important step. Whenever we want to
work on something symbolically, we need to be sure to tell Python to
consider the object a symbol.
Then, we name an expression based on our above formulation.
SymPy can easily factor and expand expressions, and then we can
substitute values in for \(n\) using .subs
.
In [16]:
import sympy as sy
n = sy.Symbol('n')
exp = (.5*n**2 + .5*n)**2
In [17]:
exp
Out[17]:
(0.5*n**2 + 0.5*n)**2
In [18]:
sy.expand(exp)
Out[18]:
0.25*n**4 + 0.5*n**3 + 0.25*n**2
In [19]:
sy.factor(exp)
Out[19]:
0.25*n**2*(n + 1)**2
In [20]:
exp.subs(n, 3)
Out[20]:
36.0000000000000
Summations on Symbols¶
Besides being able to factor, expand, and evaluate known expressions, we
can use SymPy to find the general expressions using its summation
command. Here, we enter the expression we want to sum, the index value,
and the start and stop points. We can find the symbolic expression and
determine the value as n gets bigger and bigger.
We would represent the problem of the two summations as follows:
We write this similarly in Sympy. We must add the symbol i
first,
then ask the summation function to find the sum of i
from 1 to
n
.
In [21]:
i = sy.Symbol('i')
sum1 = sy.summation(i, (i, 1, n))
We can get these results to appear in nice printing with the pprint
function.
In [22]:
sy.pprint(sum1)
2
n n
── + ─
2 2
In [23]:
sum2 = sy.summation(i**2, (i, 1, n))
sy.pprint(sum2)
3 2
n n n
── + ── + ─
3 2 6
In [24]:
sum3 = sy.summation(i**3, (i, 1, n))
sy.pprint(sum3)
4 3 2
n n n
── + ── + ──
4 2 4
In [25]:
sum4 = sy.summation(i**4, (i, 1, n))
sy.pprint(sum4)
5 4 3
n n n n
── + ── + ── - ──
5 2 3 30
Do you see a pattern here?
Other Patterns in Expressions¶
We can also use sympy
to expand binomial expressions. Here we look
to see if there is a pattern in the coefficients.
In [26]:
from sympy import Symbol, expand
x= Symbol('x')
y = Symbol('y')
expr1 = (x + y)**1
expand(expr1)
Out[26]:
x + y
In [27]:
expr2= (x+y)**2
sy.pprint(expand(expr2))
2 2
x + 2⋅x⋅y + y
Continue the Work¶
Continue to expand expressions
Do you recognize the pattern in coefficients?
In [28]:
for i in range(1,5):
expr3 = (x+y)**i
sy.pprint(expand(expr3))
x + y
2 2
x + 2⋅x⋅y + y
3 2 2 3
x + 3⋅x ⋅y + 3⋅x⋅y + y
4 3 2 2 3 4
x + 4⋅x ⋅y + 6⋅x ⋅y + 4⋅x⋅y + y
Summations and Areas¶
The aim of this notebook is to put the work with summations to use in finding the area under a curve. We start with a simple function that returns integer values. From here we use our knowledge of summations to determine the area under the curve.
np.ceil
and Step Functions¶
The np.ceil
function returns the “ceiling” of values in an interval.
This means the integer value that is closest above all the values in the
interval. For example, the ceiling of numbers between 0 and 1 is 1,
between 1 and 2 is 2, etc.
These functions look like staircases, and provide a nice example for how to use rectangles to determine the area under the curve. In the example of the step function, we get an exact answer for the area because of the stairstep geometry. We will take this exact answer and use it to approximate solutions to more complicated curves.
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
import pandas as pd
In [2]:
x = np.linspace(0, 10, 1000)
def step(x):
return np.ceil(x)
In [3]:
plt.plot(x, step(x))
plt.title("A Simple Step Function")
Out[3]:
<matplotlib.text.Text at 0x11ce43e10>

In [4]:
n = np.arange(0, 10, 1)
plt.plot(x, step(x))
plt.fill_between(x, step(x), alpha = 0.7, color = "lightblue")
plt.title("Area as Rectangles")
for xc in n:
plt.axvline(x=xc)

Connection to Summation¶
Now, we see that this comes down to finding the areas of 10 rectangles. The forumla for the area of a rectangle is \(A = l \times w\), so we have the area under the curve as:
We recognize this sum from before. We know how to compute this sum, and in fact, the sum for any number \(n\)!. Hence, for \(n = 10\), the area is:
If we were asked for the area under the same curve but from \(x = 4\) to \(x = 20\), what would the area be?
Different Heights of Rectangles¶
We can easily extend the problem to involve step functions whose inputs are as follows:
def step2(x):
return x**2
In [5]:
def step2(x):
return np.ceil(x)**2
In [6]:
plt.plot(x, step2(x))
plt.fill_between(x, step2(x), color = "burlywood")
plt.title("Rectangles with Height $n^2$")
for xc in n:
plt.axvline(x=xc, color = "black")

Again, we can use our knowledge about summations to determine the area under the curve here.
In [7]:
x, n = sy.symbols('x n')
sy.summation(x**2, (x, 1, 10))
Out[7]:
385
In [8]:
sy.pprint(sy.summation(x**2, (x, 1, n)))
3 2
n n n
── + ── + ─
3 2 6
Continuous Functions¶
We can use these patterns to approximate the area underneath other functions. Both of the examples above could be used to approximate the area under the \(f(x) = x\) and \(g(x) = x^2\). By superimposing these curves we can see the connection to approximating with rectangles.
In [9]:
n = np.arange(0, 10, 1)
x = np.linspace(0, 10, 1000)
plt.figure(figsize = (12, 5))
plt.subplot(121)
plt.plot(x, step(x))
plt.plot(x, x, lw = 7, c = "black")
plt.fill_between(x, step(x), alpha = 0.7, color = "lightblue")
plt.title("Approximate Area Under $f(x) = x$")
for xc in n:
plt.axvline(x=xc)
plt.subplot(122)
plt.plot(x, step2(x))
plt.plot(x, x**2, lw = 8, c = "black")
plt.fill_between(x, step2(x), color = "burlywood")
plt.title("Approximate Area Under $q(x) = x^2$")
for xc in n:
plt.axvline(x=xc, color = "black")

Combining our above results with these images, we can say that 10 rectanges approximates the area from \(x = 0\) to \(x = 10\) for \(f(x) = x\) as 55, and for \(g(x)\) as 385. We want to improve these approximations.
Improving the Approximations¶
We can get a better approximation in each of the above examples by using more rectangles. Let’s begin by considering the case of the same functions and intervals, however now we want to use 20 rectangles. First, let us determine the width of the rectangles.
Assuming the rectangles are the same width, we will have 20 rectangles over 10 spaces, so each will be 1/2 a unit wide in both cases.
Now, the height of the rectangles. These are not all the same. For \(f(x) = x\), the first rectangle would be 1/2 tall. The second would be 2/2, third 3/2, fourth 4/2, etc.
We can make lists for each of these using Python instead of writing them
all out. Let’s use Pandas to make a table of the results, and look at
the first five rows. We do this with the pd.DataFrame()
command that
combines the list into the table that we name df
. Then, we display
the top of the table with the df.head()
command.
In [10]:
width = [0.5 for i in range(20)]
height = [(i+1)/2 for i in range(20)]
In [11]:
df = pd.DataFrame({"Width": width, "Height": height})
In [12]:
df.head()
Out[12]:
Height | Width | |
---|---|---|
0 | 0.5 | 0.5 |
1 | 1.0 | 0.5 |
2 | 1.5 | 0.5 |
3 | 2.0 | 0.5 |
4 | 2.5 | 0.5 |
Now, we can compute the areas by multiplying each element together. Below, we create a list of areas and add a column named “Areas” to the DataFrame.
In [13]:
areas = [(width[i]*height[i]) for i in range(20)]
df["Areas"] = areas
df.head()
Out[13]:
Height | Width | Areas | |
---|---|---|---|
0 | 0.5 | 0.5 | 0.25 |
1 | 1.0 | 0.5 | 0.50 |
2 | 1.5 | 0.5 | 0.75 |
3 | 2.0 | 0.5 | 1.00 |
4 | 2.5 | 0.5 | 1.25 |
Similarly, we can sum our list to calculate the updated approximation.
In [14]:
sum(areas)
Out[14]:
52.5
Let’s see what happens if we approximate this region with 40 rectangles.
In [15]:
width = [10/40 for i in range(40)]
height = [(i+1)/4 for i in range(40)]
In [16]:
areas_40 = [(width[i] * height[i]) for i in range(40)]
sum(areas_40)
Out[16]:
51.25
More Complex Curves¶
Suppose we wanted to approximate the area under the curve \(h(x) = (x-1)(x-4)(x-5)\) from \(x = 0\) to \(x = 5\) as shown below. First we use 10, then 20, 40, 100, and 1,000,000 rectangles to approximate this and compare our answers, and consider the best way to find areas using rectangles.
In [17]:
def h(x):
return (x - 1)*(x - 4)*(x - 5)
In [18]:
x = np.linspace(0, 5, 100)
plt.plot(x, h(x))
plt.fill_between(x, h(x), alpha = 0.1)
plt.axhline()
plt.title("Area Under the Curve $h(x) = (x-1)(x-4)(x-5)$")
Out[18]:
<matplotlib.text.Text at 0x11fd6de80>

10 Rectangles¶
Over an interval of width 5, with 10 rectangles each would have width 1/2. You should notice the general approach for the width of the rectangles by now as the length of the interval divided by the number of rectangles.
Also, with a more complex curve like this example, we want to find an easy way to determine the heights of the rectangles. The rectangles occur every 1/2 a unit. The height of the first rectangle would be at f(1/2), the second at f(2/2), the third at f(3/2), ... up to f(9/2), f(10/2).
There is a pattern to the heights. Every one is an integer multiple of the width. Hence, we have a loose definition for our approximation under the curve \(h(x)\) on the interval [0, 5] as:
Area = width times height
Area = (5/10) times [f(i/2) for i = 1, 2, 3, ..., 10]
We can easily implement this logic with Python and our list structures.
In [19]:
width = [5/10 for i in range(10)]
height = [h(5/10*i) for i in range(10)]
area = [(width[i] * height[i]) for i in range(10)]
sum(area)
Out[19]:
-3.4375
In [20]:
plt.figure(figsize = (10, 7))
n = np.arange(0, 5, .5)
plt.plot(x, h(x), linewidth = 4)
plt.axhline(color = 'black')
plt.bar(n, height, width = 0.5, alpha = 0.1)
plt.title("Approximating the Area under $h(x) = (x-1)(x-4)(x-5)$ with 10 Rectangles")
Out[20]:
<matplotlib.text.Text at 0x120099080>

20 Rectangles¶
In [21]:
width2 = [5/20 for i in range(20)]
height2 = [h(5/20*i) for i in range(20)]
area2 = [(width2[i] * height2[i]) for i in range(20)]
sum(area2)
Out[21]:
-0.546875
40 Rectangles¶
In [22]:
width3 = [5/40 for i in range(40)]
height3 = [h(5/40*i) for i in range(40)]
area3 = [(width3[i] * height3[i]) for i in range(40)]
sum(area3)
Out[22]:
0.80078125
100 Rectangles¶
In [23]:
width4 = [5/100 for i in range(100)]
height4 = [h(5/100*i) for i in range(100)]
area4 = [(width4[i] * height4[i]) for i in range(100)]
sum(area4)
Out[23]:
1.5781250000000033
1,000,000 Rectangles¶
In [24]:
width5 = [5/100000 for i in range(100000)]
height5 = [h(5/100000*i) for i in range(100000)]
area5 = [(width5[i] * height5[i]) for i in range(100000)]
sum(area5)
Out[24]:
2.08283332812496
Measuring Cardiac Output: Turkeys on Treadymills¶
GOALS:
- Connect Summations to Cardiac Output problems
- Use summations to solve problems from biology
This notebook applies our techniques for area approximation to the problem of measuring Cardiac Output. This is a measure of how fast the heart is circulating blood, and has obvious important consequences for humans and animals. The image below comes from a study on the cardiac output of Turkeys.

Dye Dillution Method¶
The main idea of the Dye Dillution method is to inject dye into a subjects blood, and measure how much of the dye flows through the heart over small time intervals. From Wikipedia.
The dye dilution method is done by rapidly injecting a dye, indocyanine green, into the right atrium of the heart. The dye flows with the blood into the aorta. A probe is inserted into the aorta to measure the concentration of the dye leaving the heart at equal time intervals [0, T] until the dye has cleared. Let c(t) be the concentration of the dye at time t. By dividing the time intervals from [0, T] into subintervals Δt...
In the example of the Turkey on the Treadmill, a device was implanted to monitor the dispersion of dye in the Turkey while running on a treadmill. This could then be compared to Turkey’s hear rates who were relaxing in soft armchairs. This example comes from the study Effect of Exercis on Cardiac Output and Other Cardiovascular Parameters of Heavy Turkeys and Relevance to the Sudden Death Syndrome, by Boulianne, et al in the Journal of Avian Diseases, 37, 1993.
Example I¶
Imagine that we have the following data on dye concentration over the course of 26 seconds. The intervals are equal at 1 second. Let’s create two lists for the time intervals and the concentration measurements to plot them.
In [1]:
time = [i for i in range(25)]
In [2]:
concentration = [0, 0, 0, 0.1, 0.6, 0.85, 1.38, 1.9, 2.6, 2.9, 3.4, 3.9, 4.0, 4.1, 4.0, 3.7, 2.9, 2.2, 1.5, 1.1, 0.8, 0.8, 0.9, 0.9, 0.9]
In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
In [4]:
plt.plot(time, concentration, 'o')
plt.title("Data for Dye Dillution Technique")
plt.xlabel("Time in Seconds")
plt.ylabel("Concentration in mg/liter")
Out[4]:
<matplotlib.text.Text at 0x115fbeb38>

We can consider the cardiac output as the total volume of dye measured divided by the time as follows:
Similarly, we can express this as the amount of dye(D) over the volume(CT) as
Hence, in our example above, the \(CT\) is the sum of our list
concentration
.
In [5]:
sum(concentration)
Out[5]:
45.42999999999999
Suppose, for example, that we injected 5 liters of dye in the example above. The cardiac output would be:
In [6]:
6/45.43
Out[6]:
0.13207131851199647
Or 0.1321 liters per second.
In [7]:
6/45.43*60
Out[7]:
7.924279110719788
7.9243 liters per minute.
Example II¶
Suppose we have the following information about the flow of dye as follows. Find the cardiac output for this example.
In [8]:
t2 = [i for i in range(27)]
c2 = [0, 0.1, 0.2, 0.6, 1.2, 2.0, 3.0, 4.2, 5.5, 6.3, 7.0, 7.5, 7.8, 7.9, 7.9, 7.9, 7.8, 6.9, 6.1, 5.4, 4.7, 4.1, 3.5, 2.8, 2.1, 2.1, 2.2]
In [9]:
plt.plot(t2, c2, 'o')
Out[9]:
[<matplotlib.lines.Line2D at 0x118a1da58>]

Example III¶
The table below gives measurements after an injection of 5.5 mg of dye. The readings are given in mg/L.
time | concentration in mg/L |
---|---|
0 | 0.0 |
2 | 4.1 |
4 | 8.9 |
6 | 8.5 |
8 | 6.7 |
10 | 4.3 |
12 | 2.5 |
14 | 1.2 |
16 | 0.2 |
Example IV¶
The table below demonstrates a 6 mg injection of dye into a heart. Use the data to approximate the cardiac output. What kind of a creature could this be?
time | concentration in mg/L |
---|---|
0 | 0 |
2 | 1.9 |
4 | 3.3 |
6 | 5.1 |
8 | 7.6 |
10 | 7.1 |
12 | 5.8 |
14 | 4.7 |
16 | 3.3 |
18 | 2.1 |
20 | 1.1 |
22 | 0.5 |
24 | 0 |
Accounting for End Values¶
As we noticed in the examples, the monitor may return values towards the end of the readings that do not agree with our assumptions. In order to check this, we impose values that decrease on the end of the sequence. For example, in the list below, we change the final four values to better reflect the pattern we expect.
In [12]:
time = [i for i in range(25)]
concentration = [0, 0, 0, 0.1, 0.6, 0.85, 1.38, 1.9, 2.6, 2.9, 3.4, 3.9, 4.0, 4.1, 4.0, 3.7, 2.9, 2.2, 1.5, 1.1, 0.8, 0.8, 0.9, 0.9, 0.9]
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
In [13]:
plt.figure()
plt.plot(time[:21], concentration[:21], 'o')
plt.plot(time[21:], concentration[21:], 'ro', label='Readings for Last Values')
plt.title("Data for Dye Dillution Technique")
plt.xlabel("Time in Seconds")
plt.ylabel("Concentration in mg/liter")
plt.plot([21, 22, 23, 24], [0.4, 0.24, 0.14, 0.1], 'go', alpha=0.6, label = 'Approximations for Last Values')
plt.legend()
Out[13]:
<matplotlib.legend.Legend at 0x118b17860>
Supply and Demand: Consumer Surplus¶
GOALS:
- Understand concepts of consumer and producer surplus
- Use definite integrals to solve problems involving consumer and producer surplus
Economists will often refer to supply and demand curves.
A supply curve is a cost of production function that relates some quantity of goods to a price that attracts this amount at market.
A demand curve is a function that relates a quantity of goods to a price that the market would be cleared of that quantity.
For example, suppose we have a supply curve \(S\) as:
and a demand curve \(D\) of:
We can plot these as follows.
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
In [2]:
def S(q):
return (q**2)
def D(q):
return (q - 20)**2
q = np.linspace(0, 16, 1000)
In [3]:
plt.plot(q, S(q), label = "Supply Curve")
plt.plot(q, D(q), label = "Demand Curve")
plt.title("Supply and Demand")
plt.legend(frameon = False)
plt.xlabel("Quantity $q$")
plt.ylabel("Price")
Out[3]:
<matplotlib.text.Text at 0x112d8aeb8>

Equilibrium Price¶
Where the Supply and Demand curves intersect, we have the equilibrium price for the good under consideration. In order to find this value, we need to solve the equation:
We can use Sympy to accomplish this with the Eq
and Solve
commands.
In [4]:
q = sy.Symbol('q')
eq = sy.Eq(S(q), D(q))
sy.solve(eq)
Out[4]:
[10]
In [5]:
S(10)
Out[5]:
100
Now, we can update our plot to include this point and add some text and
an arrow pointing to the equilibrium point using the matplotlib’s
annotate
function. The draws an arrow with some text to a point on
the graph.
In [6]:
plt.figure(figsize= (10, 8))#create figure and reset q to be numbers
q = np.linspace(0, 16, 1000)
plt.plot(q, S(q), label = "Supply Curve")#plot supply, demand, and equilibrium points
plt.plot(q, D(q), label = "Demand Curve")
plt.plot(10, 100, 'o', markersize = 14)
plt.title("Supply and Demand")#add titles and legend
plt.legend(frameon = False)
plt.xlabel("Quantity $q$")
plt.ylabel("Price")
ax = plt.axes()#add arrow with annotation
ax.annotate('Equilibrium at (10, 100)', xy=(10,100), xytext=(10, 250), arrowprops=dict(facecolor='black'))
Out[6]:
<matplotlib.text.Annotation at 0x11347d048>

Price Discrimination¶
Price discrimination refers to the different prices that different consumers are willing to pay for the same product. For example, some people may be willing to pay $16 for a six pack of beer while others would refuse. In order to address different consumers, a corporation might do something like market the same beer under a different logo and name in different kinds of stores.
If the goods producer is able to determine this perfectly, they will have achieved perfect price discrimination. The beer example demonstrates only a partial price discrimination. An example of perfect price discrimination might be a pay what you want approach where each consumer determine their own price for a good.
In a partial price discrimination scenario, we would have a picture like our step functions, where the width of each rectangle represents a quantity of a good and the height represents the price that this amount of goods sells for.
Let’s visualize the Partial and Perfect Price discrimination situation where we have \(\Delta q = 1\).
In [7]:
q = np.arange(0, 16, 1)
In [8]:
plt.figure(figsize = (8, 5))
plt.bar(q, D(q), alpha = 0.1, width = 1)
plt.title("Partial Price Discrimination")
plt.xlabel("Quantity of good $q$")
plt.ylabel("Price of good $p$")
ax = plt.axes()
ax.annotate('sell $\Delta q$ at price $p$', xy=(4, D(4)), xytext=(10, 250), arrowprops=dict(facecolor='black'))
Out[8]:
<matplotlib.text.Annotation at 0x113a79710>

The area of one of the rectangles then represents the revenue to the producer. Consider the first rectangle. Here, we have 1 unit of a good and sell this at $400. Similarly, we sell a second unit of this good for $300. From these sales we would have mad $700 in total.
Perfect Price Discrimination would be modeled by the continuous case and the Revenue by the Definite Integral.
In [9]:
q = np.linspace(0, 16, 1000)
plt.figure(figsize = (9, 6))
plt.plot(q, D(q))
plt.fill_between(q, D(q), alpha = 0.1)
plt.title("Perfect Price Discrimination")
plt.xlabel("Quantity of good $q$")
plt.ylabel("Price of good $p$")
ax = plt.axes()
ax.annotate('Area Under Curve is Revenue \n$\int_0^{16}D(q) dq$', xy=(4, D(4)), xytext=(10, 250), arrowprops=dict(facecolor='black'))
Out[9]:
<matplotlib.text.Annotation at 0x113328f98>

The Consumer Surplus¶
When a marketplace finds consumers paying the same price for a good, we are at the equilibrium price. Here, if you think about moving backwards from equilibrium, the price of the good rises, its suppy falls, and there are fewer transactions. Another way to interpret the area under the Demand curve, is as the value to consumers.
If there is a difference between this value and what the consumers end up paying, we have a consumer surplus. This is represented graphically as the area determined by the rectangle formed by the equilibrium price. The difference between the area under the Demand curve and this rectangle is the consumer surplus.
In [10]:
plt.figure(figsize = (12,7))
plt.plot(q, D(q))
plt.plot(q, S(q))
plt.fill_between(q, D(q), 100, where = (D(q) > 100), color = 'green', alpha = 0.5, hatch = '-')
plt.fill_between(q, 100, where = (q<10), alpha = 0.3, hatch = '|')
plt.title("Consumer Surplus")
plt.xlabel("Quantity of good $q$")
plt.ylabel("Price of good $p$")
ax = plt.axes()
ax.annotate('Consumer Surplus', xy=(4, 200), xytext=(8, 300), arrowprops=dict(facecolor='black'))
ax.annotate('Total Revenue', xy = (5, 50), xytext = (12,100), arrowprops=dict(facecolor='black'))
Out[10]:
<matplotlib.text.Annotation at 0x113f01320>

This would be determined by the area under the Demand Curve, but above the horizontal line formed by the equlibrium price(\(p^\star\)). Thus, we can represent this with the definite integral:
Example¶
Suppose the supply curve is approximated by the function
and the demand curve by the function
for \(0 \leq q \leq 65,000\). Determine the equilibrium price and Consumer Surplus.
In [2]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import sympy as sy
from sympy import oo
Increasing the Number of Rectangles¶
In the past examples, we have used rectangles that were fairly wide. This notebook seeks to move from the approximate to the exact case. To do so we relate the idea of summing rectangels to the definition of the definite integral. The video below introduces the big idea of the Riemann integral, which we can understand as an analogous description.
Definite Integral¶
We can imagine the effect of increasing the number of rectangles to infinity as the ideal way to find the area under the curve. This will be how we introduce the idea of the Definite Integral. The symbol \(\int_a^b\) is used to represent the integral, and \(a\) and \(b\) represent the lower and upper limits for integraion.
For example, the area under the function \(f(x) = x\) from $x = 0 $ to \(x = 1\) would be represented by a definite integral as:
Here, the \(\int_0^1\) tells us we’re summing small little rectangles from \(x = 0\) to \(x=1\). The \(x ~ dx\) represents the area of each rectangle, height \(x\) and infinitely small width \(dx\).
We can solve these problems easily using sympy
‘s integrate
function. The function works as follows:
sy.integrate(curve, (variable, start, stop))
The example of \(\int_0^1 x dx\) would be solved as follows.
In [3]:
x = sy.Symbol('x')
sy.integrate(x, (x, 0, 1))
Out[3]:
1/2
Our work to this point is similar to that of a standard definition for an integral from German mathematician Bernhard Riemann. His work took into consideration subdivisions of different widths, and was framed in a much more rigorous manner. Later, in your Real Analysis course, you will formalize the definition and proof of the Reimann integral, but for now we want to recognize situations that we would use a sum
More Curves¶
Now, let’s suppose that we wanted to find the area under the curve \(g(x) = x^2 + x\) from \(x = -3\) to \(x = 5\). We would represent this with the definite integral:
We can use matplotlib to represent this area with the fill_between
function, and use sympy to evaluate the integral.
In [4]:
def g(x):
return x**2 + x
In [5]:
x = np.linspace(-3, 5, 1000)
plt.plot(x, g(x))
plt.fill_between(x, g(x), alpha = 0.5)
Out[5]:
<matplotlib.collections.PolyCollection at 0x10d886dd8>

In [6]:
x = sy.Symbol('x')
sy.integrate(g(x), (x, -3, 5))
Out[6]:
176/3
Example II¶
We want to be comfortable visualizing regions and evaluating the definite integral that represents the area. Let’s represent and find the area under the curve \(y = -3x^3 + 2x + 2\) over the interval \([-2, 1]\).
First, we will plot the curve using matplotlib
.
In [8]:
x = np.linspace(-2,1,1000) #create 1000 evenly spaced values from -2 to 1
y = -3*x**3+2*x+2
plt.figure()
plt.plot(x, y, label = '$y=-3x^3 + 2x +2$')
plt.fill_between(x, y, alpha = 0.3, color = 'red', hatch = '|')
plt.legend(frameon=False)
Out[8]:
<matplotlib.legend.Legend at 0x1107a8a20>

Similarly, we can compute the integral using Sympy.
In [11]:
x, y = sy.symbols('x y')
y = -3*x**3 + 2*x + 2
area = sy.integrate(y, (x, -2,1))
print("The area is: ", area)
The area is: 57/4
We can verify this by hand applying our power rule as follows:
In [14]:
3 + 12 - 4 + 4 -3/4
Out[14]:
14.25
Area Between Two Curves¶
In the example of consumer surplus, we interpreted the surplus as an area between the demand curve and horizontal line determined by the equilibrium price. Now, we want to look at the situation with more complex curves to represent and solve area problems. Let’s begin by considering the area between the curves \(f(x) = x\) and \(g(x) = x^2\) from \(x = 0\) to \(x = 1\).
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
In [9]:
def f(x):
return x
def g(x):
return x**2
x = np.linspace(0,1,100)
plt.plot(x,f(x))
plt.plot(x,g(x))
plt.fill_between(x, f(x), color = "grey",alpha = 0.3, hatch = '|')
plt.fill_between(x, g(x), color = "yellow", alpha = 0.2, hatch = '-')
Out[9]:
<matplotlib.collections.PolyCollection at 0x113f83208>

Note that the area between the two curves is simply the difference between the area under \(f\) and the area under \(g\). We can represent this area with a definite integral:
And we can use our patterns from summations to evaluate this. We can also use sympy to check our results.
In [10]:
import sympy as sy
x = sy.Symbol('x')
sy.integrate(f(x) - g(x), (x, 0,1))
Out[10]:
1/6
We can summarize this with a plot.
In [16]:
x = np.linspace(0,1,100)
plt.plot(x, f(x))
plt.plot(x, g(x))
plt.fill_between(x, f(x), g(x), color = 'lightpink', alpha = 0.4, hatch = '-')
Out[16]:
<matplotlib.collections.PolyCollection at 0x1156669b0>

In the problem above, we knew the boundary points of the region and were able to set up the definite integral using these. We may instead have some curves and a region formed by the intersection of these curves that is of interest.
In [74]:
def f(x):
return 12 - x**2
def g(x):
return x**2 - 6
In [75]:
x = np.linspace(-10,10,1000)
plt.plot(x, f(x))
plt.plot(x, g(x))
plt.fill_between(x, f(x), g(x), alpha = 0.2)
Out[75]:
<matplotlib.collections.PolyCollection at 0x116d69fd0>

Suppose we want to find the area of the middle region here. We would need to know the points of intersection of the curves and use these as boundaries for our definite integral. We can find these with sympy, update the plot, and evaluate the integral.
In [76]:
x = sy.Symbol('x')
sy.solve(f(x) - g(x), x)
Out[76]:
[-3, 3]
In [77]:
x = np.linspace(-4,4,1000)
fill = np.linspace(-3,3,1000)
plt.plot(x, f(x))
plt.plot(x, g(x))
plt.fill_between(fill, f(fill), g(fill), color = 'lightblue', alpha = 0.2)
Out[77]:
<matplotlib.collections.PolyCollection at 0x116f96d68>

Note that we created a list based on the points of intersection to tell the plot where to fill between.
In [78]:
x = sy.Symbol('x')
sy.integrate(f(x) - g(x), (x, -3,3))
Out[78]:
72
We can have regions that are determined by more than one set of intersections as well. Here, we can determine the area over a domain where the curves switch positions. For example, consider the functions \(f(x) = \sin{x}\) and \(g(x) = \cos{x}\) from \(x=0\) to \(x = \pi/2\).
In [93]:
def f(x):
return np.sin(x)
def g(x):
return np.cos(x)
x = np.linspace(0, np.pi/2, 1000)
plt.plot(x, f(x), label = '$f(x)$')
plt.plot(x, g(x), label = '$g(x)$')
plt.fill_between(x, f(x), g(x), color = 'red', alpha = 0.2)
plt.legend(loc = 'best')
Out[93]:
<matplotlib.legend.Legend at 0x116fc3fd0>

In [94]:
x = sy.Symbol('x')
y1 = sy.sin(x)
y2 = sy.cos(x)
sy.solve(y2 - y1)
Out[94]:
[-3*pi/4, pi/4]
In [98]:
area_1 = sy.integrate(y2 - y1, (x, 0, sy.pi/4))
In [99]:
area_2 = sy.integrate(y1 - y2, (x, sy.pi/4, sy.pi/2))
In [101]:
sy.pprint(area_1 + area_2)
-2 + 2⋅√2
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
Wealth Distribution¶
One way that we might understand equality is through understanding the distribution of wealth in a society. Perfect wealth distribution would mean that all participants have the same share of wealth as everyone else. We can represent this situation mathematically with a function \(L(x) = x\) that we will call the Lorenz Curve.
Concretely, if we were to look at every 20% of the population, we would see 20% of income.
Fifths of Households | Percent of Wealth |
---|---|
Lowest Fifth | 20 |
Lowest two - Fifths | 40 |
Lowest three - Fifths | 60 |
Lowest four - Fifths | 80 |
Lowest five - Fifths | 100 |
In [2]:
percent = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
lorenz = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
In [3]:
plt.plot(percent, lorenz, '-o')
plt.title("Perfect Wealth Distribution")
Out[3]:
<matplotlib.text.Text at 0x118c4c7f0>

It is unlikely that we have perfect distribution of wealth in a society however. For example, the following table describes the cumulative distribution of income in the United States for the year 1994.
Fifths of Households | Percent of Wealth |
---|---|
Lowest Fifth | 4.2 |
Lowest two - Fifths | 14.2 |
Lowest three - Fifths | 29.9 |
Lowest four - Fifths | 53.2 |
Lowest five - Fifths | 100.0 |
In [4]:
usa_94 = [0, 0.042, 0.142, 0.299, 0.532, 1.00]
In [5]:
plt.figure(figsize = (9, 5))
plt.plot(percent, lorenz, '-o', label = 'Lorenz Curve')
plt.plot(percent, usa_94, '-o', label = 'USA 1994')
plt.title("The Difference between Perfect and Actual Wealth Equality")
plt.legend(loc = 'best', frameon = False)
Out[5]:
<matplotlib.legend.Legend at 0x119289c50>

The area between these curves can be understood to represent the discrepency between perfect wealth distribution and levels of inequality. Further, if we examine the ratio between this area and that under the Lorenz Curve we get the Gini Index.
One big issue remains however. We don’t want to use rectangles to
approximate these regions but we don’t have equations for the actual
distribution of wealth. We introduce two curve fitting techniques using
numpy
to address this problem.
Quadratic Fit¶
The curve in the figure above representing the actual distribution of
wealth in the USA in 1994 can be approximated by a polynomial function.
NumPy has a function called polyfit
that will fit a polynomial to a
set of points. Here, we use polyfit
to fit a quadratic function to
the points.
In [6]:
fit = np.poly1d(np.polyfit(percent, usa_94, 2))
In [7]:
plt.figure(figsize = (9, 5))
plt.plot(percent, usa_94, '-o', label = 'Polyfit')
plt.plot(percent, fit(percent), '-o', label = 'Actual USA 1994')
plt.title("Quality of Quadratic Fit")
plt.legend(loc = 'best', frameon = False)
Out[7]:
<matplotlib.legend.Legend at 0x119619f98>

Getting the Fit¶
Below, we return to the complete picture where we plot our fitted function and the Lorenz Curve and shade the area that represents the difference in income distribution.
In [8]:
plt.figure(figsize = (9, 5))
plt.plot(percent, lorenz, '--o', label = 'Lorenz')
plt.plot(percent, fit(percent), '--o', label = 'Polyfit')
plt.fill_between(percent, lorenz, fit(percent), alpha = 0.3, color = 'burlywood')
plt.title("Visualizing the Gini Index")
plt.legend(loc = 'best', frameon = False)
Out[8]:
<matplotlib.legend.Legend at 0x1197d9c50>

Now, we want to compute the ratio between the area between the curves to that under the Lorenz Curve. We can do this easily in Sympy but declaring \(x\) a symbol and substituting it into our fit function then integrating this.
In [9]:
import sympy as sy
In [10]:
x = sy.Symbol('x')
fit(x)
Out[10]:
x*(1.18839285714286*x - 0.241678571428574) + 0.0209285714285723
In [11]:
A_btwn = sy.integrate((x - fit(x)), (x, 0, 1))
In [12]:
A_L = sy.integrate(x, (x, 0,1))
In [13]:
A_btwn/A_L
Out[13]:
0.407559523809523
Inequality through Time¶
Now that we understand how to compute the Gini Index, we want to explore what improving the gap in wealth distribution would mean.
In [14]:
x = np.linspace(0, 1, 100)
plt.figure(figsize = (10, 7))
plt.plot(x, x)
plt.plot(x, x**2, label = "Country A")
plt.plot(x, x**4, label = "Country B")
plt.plot(x, x**8, label = "Country C")
plt.plot(x, x**16, label = "Country D")
plt.ylabel("Income Percent")
plt.xlabel("Population Fraction")
plt.title("Different Wealth Distributions")
plt.legend(loc = "best", frameon = False)
Out[14]:
<matplotlib.legend.Legend at 0x11aa4e860>

Which of the above countries do you believe is the most equitable? Why?
Census Bureau Data and Pandas¶
There are many organizations that use the Gini Index to this day. The OECD, World Bank, and US Census all track Gini Indicies. We want to investigate the real data much as we have with our smaller examples. To do so, we will use the Pandas library.
The table below gives distribution data for the years 1970, 1980, 1990, and 2000.
x | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
---|---|---|---|---|---|---|
1970 | 0.000 | 0.041 | 0.149 | 0.323 | 0.568 | 1.000 |
1980 | 0.000 | 0.042 | 0.144 | 0.312 | 0.559 | 1.000 |
1990 | 0.000 | 0.038 | 0.134 | 0.293 | 0.530 | 1.000 |
2000 | 0.000 | 0.036 | 0.125 | 0.273 | 0.503 | 1.000 |
Creating the DataFrame¶
We will begin by creating a table from this data by entering lists with these values and creating a DataFrame from these lists.
In [15]:
import pandas as pd
seventies = [0, 0.041, 0.149, 0.323, 0.568, 1.0]
eighties = [0, 0.042, 0.144, 0.312, 0.559, 1.0]
nineties = [0, 0.038, 0.134, 0.293, 0.53, 1.0]
twothou = [0, 0.036, 0.125, 0.273, 0.503, 1.0]
In [16]:
df = pd.DataFrame({'1970s': seventies, '1980s':eighties, '1990s': nineties,
'2000s': twothou, 'perfect': [0, 0.2, 0.4, 0.6, 0.8, 1.0]})
df.head()
Out[16]:
1970s | 1980s | 1990s | 2000s | perfect | |
---|---|---|---|---|---|
0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.0 |
1 | 0.041 | 0.042 | 0.038 | 0.036 | 0.2 |
2 | 0.149 | 0.144 | 0.134 | 0.125 | 0.4 |
3 | 0.323 | 0.312 | 0.293 | 0.273 | 0.6 |
4 | 0.568 | 0.559 | 0.530 | 0.503 | 0.8 |
Plotting from the DataFrame¶
We can plot directly from the dataframe. The default plot generates
lines for each decades inequality distribution. There are many plot
types available however, and we can specify them with the kind
argument as demonstrated with the density plot that follows. What do
these visualizations tell you about equality in the USA based on this
data?
In [17]:
df.plot(figsize = (10, 8))
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0x122d7a518>

In [18]:
df.plot(kind = 'kde')
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0x11a9efc88>

Center of Mass¶
Our goal is to discuss the center of mass in discrete and continuous cases. We will use the discrete examples to motivate the passage to the limit, and again see the interplay of summation and integration.
The Lever¶

Today, we will follow Polya and Archimedes to determine the law of the lever. We will move this to the two dimensional case, and push this to uniform regions in the 2-Dimensional plane. We will do all of this in symbols and with Python.
The Path¶
Axiom I: Equal weights at equal distances are in equilibrium.

Axiom II: \(W\) at each end \(\cong 2W\) in middle.

Generalizing¶
Thus, we propose that:
These values determined by weight :math:`times` distance are called moments. A SeeSaw full of robots will be in equilibrium, or balanced, when the moments to the left of the fulcrum are equal to the moments to the right.
Solving A 1-D Problem two ways¶
Suppose we have three masses distributed on a lever, as shown in the image below:

Here, the center of mass is determined by the following definition:
For example, we can choose masses 1, 3, and 2 located at distances 1, 3, and 7 from the left end of the lever respectively. Using the definition, we have
We should be able to work in reverse from the picture and distribute weights evenly as we had done with Archimedes as a method to check.
Finally, we want to be able to see the connection of the definite integral in the continuous case. We will consider things with uniform density \(\rho\) here, so we have
2-D Case: Discrete Point Masses¶
The formulas might be what we expect, however, we should note the presence of the \(M_y\) in the \(x\)-coordinate and the \(M_x\) in the \(y\)-coordinate.
Thus, if we have masses 1 and 4 located at points \((1,0)\) and \((0,1)\) respectively, we have a center of mass at
In [1]:
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
In [2]:
ax = plt.figure()
plt.plot(0, 1, 'o', color = 'blue', markersize = 20)
plt.plot(1,0, 'o', color = 'blue', markersize = 5)
plt.xlim(-0.5, 1.5)
plt.ylim(-0.5, 1.5)
plt.plot(0.2, 0.8, 'o', color = 'red')
ax.text(.3, 0.55, 'Center of Mass')
Out[2]:
<matplotlib.text.Text at 0x10f254f98>
2-D Continuous Region¶
For this example, we consider the components \(M_x, M_y, M\) and their meaning if we have a solid two dimensional region with uniform density.
mass M = area of plate \(~\)
moment \(M_x = \int\)(distance \(x\))(length of vertical strip)\(dx\)
moment \(M_y = \int\)(height \(y\))(length of horizontal strip)\(dy\)
For example, suppose a plate has sides \(x=0\), \(y=0\), and the line \(y=4-2x\)

Thus, we have:
- \(M=\int_0^2 (4-2x)dx\)
- \(M_x = \int_0^2 x(4-2x)dx\)
- \(M_y = \int_0^2 y\frac{1}{2}(4-y)dy\)
In [3]:
x,y = sy.symbols('x y')
sy.solve(4-2*x-y, x)
Out[3]:
[-y/2 + 2]
In [4]:
my = sy.integrate(x*(4-2*x), (x, 0, 2))
mx = sy.integrate(y*0.5*(4-y), (y, 0, 4))
m = sy.integrate(4-2*x, (x, 0, 2))
print('Total Mass: ', m, '\nMx: ', mx,'\nMy: ', my)
print('x = ', my/m, 'y= ', mx/m)
Total Mass: 4
Mx: 5.33333333333333
My: 8/3
x = 2/3 y= 1.33333333333333
In [5]:
x = np.linspace(0,2,1000)
y = 4-2*x
ax = plt.figure()
plt.plot(x,y)
plt.fill_between(x, y, alpha =0.2, hatch='x')
plt.plot(my/m, mx/m, 'o', markersize = 15)
plt.title("Center of Mass for region bounded \nby $x=0,2$ and $y=4-2x$")
Out[5]:
<matplotlib.text.Text at 0x10f2e8dd8>
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Problems with Motion¶
We will continue to focus on problems with motion as motivation for our investigations. Aristotelian visions of motion earlier discussed are not quantitative in our contemporary usage of the word. Alternative work is evident in scholastic philosophers and mathematicians investigations of motion, particularly that of Nicole Oresme. These kinds of problems in quantifying bodies in motion continued to be of central interest to many of the names we often hear associated with the sixteenth centuries work in European mathematics and science like Galileo, Descartes, Newton, and Leibniz.
GOALS:
- Understand relationship between distance, velocity, and time.
- Plot displacement and velocity graphs, connect these with language of sums and differences
- Use rate of change and slope to describe the relationships between displacement and velocity
Let’s begin by stating some of the quantities involved and how they are related.
- D = distance
- V = velocity
- t = time
- \(D = Vt\)
This should seem familiar and intuitive. We start by imagining an object moving at a constant velocity of 10 mi/hr for 10 hours.
We can examine the graph of the velocity defined sequentially and as a constant function.
For the sequence, we just continue to produce terms of 10, as shown below.
In [2]:
a = [10]
for i in range(10):
next = 10
a.append(next)
In [3]:
plt.plot(a, '--')
plt.title("Constant Velocity over Time")
plt.xlabel("Time")
plt.ylabel("Velocity")
Out[3]:
<matplotlib.text.Text at 0x1173cbcf8>

In [4]:
plt.axhline(10)
plt.title("Constant Velocity over Time")
plt.xlabel("Time")
plt.ylabel("Velocity")
Out[4]:
<matplotlib.text.Text at 0x117918898>

We can also consider the graph of Distance versus Time on the same interval for the constant velocity of 10mi/hr. Sequentially, this means that we add 10 miles every hour.
In [5]:
d = [0]
for i in range(10):
next = d[i]+10
d.append(next)
d
Out[5]:
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
In [6]:
plt.plot(d, '-o')
plt.title("Distance against Time")
plt.xlabel("Time")
plt.ylabel("Distance Traveled")
Out[6]:
<matplotlib.text.Text at 0x117b33470>

In [7]:
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(a, '-o')
plt.title("Constant Velocity over Time")
plt.xlabel("Time")
plt.ylabel("Velocity")
plt.subplot(1, 2, 2)
plt.plot(d, '-o')
plt.title("Distance against Time")
plt.xlabel("Time")
plt.ylabel("Distance Traveled")
Out[7]:
<matplotlib.text.Text at 0x117cc4278>

Perhaps you recognize the action of moving from the left graph to the right graph as a connection to the summation and definite integral problem. This is an incredibly important connection to make.
We have techniques for moving from the left to the right, or from velocity to distance through integration.
How can we move from the right graph to the left, or from distance to velocity?
Recall our early work connecting arithmetic sequences to linear functions. Again, we have the plot of Distance against Time formed by an arithmetic sequence. This can be representing by the closed form definition as:
Note the connection to the rate of change of the sequence and the coefficient of \(t\). This is the familiar slope of the linear function, found as:
Velocity, Distance, Slope, and Area¶
We seem to have a relationship that if we have a Velocity vs. Time graph, the area under this gives us the total distance traveled.
Similarly, if we have a graph of the Distance vs. Time graph, the slope of this line is the velocity.
This demonstrates a fundamentally important fact that we will continue to rehash.
Forward and Backwards Motion¶
Suppose now that the car goes forward with constant velocity and returns with the same velocity. We interpret the first part of this trip as positive velocity \(+V\) and the return portion as negative velocity \(-V\). If we have a constant velocity of 5 mi/hr, and a 6 hour trip with no stopping, the cars velocity is as follows:
We can define this function in two ways. First, we can iteratively
define the function with an if
then
statement. If \(t < 3\)
then \(V = 3\). Otherwise, \(V = -3\).
In [8]:
t = np.linspace(0, 6, 1000)
plt.figure(figsize = (11, 7))
plt.subplot(1, 2, 1)
y = []
for i in t:
if i <= 3:
y.append(5)
else:
y.append(-5)
plt.plot(t, y, '--k', color = 'red')
plt.axhline(color = 'black')
plt.title("Constant Positive then Negative Velocity")
plt.xlabel("Time")
plt.ylabel("Velocity")
plt.subplot(1, 2, 2)
t = np.linspace(0, 6, 1000)
y = []
for i in t:
if i <= 3:
y.append(5*i)
else:
y.append(-5*i + 30)
plt.plot(t, y)
plt.title("Displacement of Body Moving at Constant \nPositive then Negative Velocity")
plt.xlabel("Time")
plt.ylabel("Displacement from Start")
Out[8]:
<matplotlib.text.Text at 0x11815d4e0>

Changing Velocity¶
The next situation to explore is if the velocity is changing. For a simple example, let’s imagine the velocity increases by 1 mi/hr per hour over a four hour trip. Then, we would have the following velocities per time:
Time | Velocity |
---|---|
0 | 0 |
1 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
We could have the following graph of the situation.
In [9]:
t = np.arange(0, 5, 1)
v = t
plt.step(t, v, '--o')
plt.ylim(0, 5)
plt.xlim(0, 4)
plt.xlabel("Time")
plt.ylabel("Velocity")
plt.title("Increasing Velocity")
Out[9]:
<matplotlib.text.Text at 0x1184fe5f8>

Also, if we consider making the graph of the displacement based on the area under the graph, we have the following:
Now we can plot the area function based on the partial sums of the time intervals.
Time | Displacement |
---|---|
0 | 0 |
1 | 1 |
2 | 3 |
3 | 6 |
4 | 10 |
In [10]:
d = [0, 1, 3, 6, 10]
In [11]:
plt.plot(t, d, '--o')
plt.title("Displacement over Time")
plt.xlabel("Time")
plt.ylabel("Displacement")
Out[11]:
<matplotlib.text.Text at 0x1184e7da0>

Problems¶
- Suppose we have the following data on the displacements of objects. Create a plot of the accompanying velocity graph for each.
In [12]:
d1 = [0, 60, 60, 0]
In [13]:
d2 = [10, 0, 20]
In [14]:
d3 = [0, 20, 30, 0]
In [15]:
d4 = [0, 0, 1, 1]
- Now suppose we have the following information about the velocity of each body as follows. Construct a plot of the displacement.
In [16]:
v1 = [0, 30, 0, 0]
plt.step(np.arange(len(v1)), v1, where = 'post')
plt.ylim(min(v1)-5, max(v1)+5)
Out[16]:
(-5, 35)

In [17]:
v2 = [-30, 0, 30, 0]
plt.step(np.arange(len(v2)), v2, where = 'post')
plt.ylim(min(v2)-5, max(v2)+5)
plt.axhline(color = 'black')
Out[17]:
<matplotlib.lines.Line2D at 0x118a3b5c0>

In [18]:
v3 = [-40, 40, 40, 0]
plt.step(np.arange(len(v3)), v3, where = 'post')
plt.ylim(min(v3)-5, max(v3)+5)
plt.axhline(color = 'black')
Out[18]:
<matplotlib.lines.Line2D at 0x118b60b70>

In [19]:
v4 = [40, 0, 20, 0]
plt.step(np.arange(len(v4)), v4, where = 'post')
plt.ylim(min(v4)-5, max(v4)+5)
plt.axhline(color = 'black')
Out[19]:
<matplotlib.lines.Line2D at 0x118c84748>

Defining a Piecewise Function¶
We have to add a step to allow us to work with a piecewise function in
our familiar manner with the np.vectorize
command. Only then can we
substitute an array of values created from the np.linspace
command.
Below is a simple example of a piecewise function defined using the
familiar function syntax with a conditional statement.
In [20]:
def f(x):
if x <= 5:
return x
else:
return 10-x
x = np.linspace(0,10,1000)
f2 = np.vectorize(f)
y = f2(x)
plt.figure()
plt.plot(x, y)
plt.show()

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Connecting Sums and Differences¶
We first will discuss some further connections between our earlier work with summations and introduce some problems with differences. We want to note some connections between these operations, and remind ourselves on the important concept of slope.
- Recognize patterns in differences of sequences over small intervals
- Connect patterns in differences to the nature of original sequence
- Determine slope for sequences changing in linear, quadratic, exponential, and trigonometric fashion
First, we will examine some arbitrary sequences. Take, for example the numbers:
First, we want to examine the differences between terms. One way we can do this is similar to our earlier work generating seqences. To start, we will find the difference between 2 and 0, then 6 and 2, then 7 and 4, then 9 and 4. Manually this is easy, but we can also consider this as a list and operate on these numbers based on their index.
In [2]:
f = [0, 2, 6, 7, 4, 9]
In [3]:
f[1], f[0]
Out[3]:
(2, 0)
In [4]:
f[1] - f[0], f[2] - f[1]
Out[4]:
(2, 4)
In the end, we would have the following list \(v\) formed by the subractions starting with index \(i = 0\).
or
We can create this sequence with a for loop.
In [6]:
v = []
n = len(f)
n
for i in range(n-1):
next = f[i+1] - f[i]
v.append(next)
v
Out[6]:
[2, 4, 1, -3, 5]
In [7]:
sum(v)
Out[7]:
9
In [8]:
difs = np.diff(f)
difs
Out[8]:
array([ 2, 4, 1, -3, 5])
In [9]:
sum(difs)
Out[9]:
9
Now let’s examine some more familiar examples of sequences and plot the sequence formed by their difference togethter side by side.
- $ f = 2, 3, 4, 5, 6, 7$
- $ f = 0, 1, 4, 9, 16 $
- $ f = 1, 2, 4, 8, 16 $
- $ f = 1, 0, -1, -1, 0, 1 $
In [10]:
f = [2, 3, 4, 5, 6, 7]
difs = np.diff(f)
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o')
plt.title("Plot of $f$")
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post')
plt.title("Plot of differences $v$")
Out[10]:
<matplotlib.text.Text at 0x10f83cdd8>

In [11]:
f = [0, 1, 4, 9, 16]
difs = np.diff(f)
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f)
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs)
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[11]:
<matplotlib.legend.Legend at 0x1101d1f98>

In [12]:
f = [1, 2, 4, 8, 16]
difs = np.diff(f)
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f)
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs)
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[12]:
<matplotlib.legend.Legend at 0x11046ada0>

In [13]:
f = [0, 1,1, 0, -1, -1, 0, 1]
difs = np.diff(f)
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f)
plt.axhline(color = 'black')
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs)
plt.axhline(color = 'black')
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[13]:
<matplotlib.legend.Legend at 0x1108dcd30>

Connections to Slope¶
Recall from high school algebra, the notion of slope as rate of change. If we have two points \((0, 0)\) and \((2, 3)\) and a line between them, we measure the slope of this line as the ratio of change in the vertical direction to that of change in the horizontal direction or:
You can verify that the plots on the right side of the figures represent the slope of the line on the left side of the plot. We have also generated plots where the distance between points is a single unit, and can imagine that we would want to move to the the continuous analogies to our sequences above. Let’s start by making plots of twice as many points on the same domains for linear, quadratic, exponential, and trigonometric functions.
In [14]:
f = [i for i in np.arange(0, 5.5, 0.5)]
In [15]:
f
Out[15]:
[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
In [16]:
difs = np.diff(f)/0.5
difs
Out[16]:
array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In [17]:
difs/0.5
Out[17]:
array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])
In [18]:
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f[0:3])
plt.axhline(color = 'black')
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs[0:3])
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[18]:
<matplotlib.legend.Legend at 0x110b6ca20>

In [19]:
f = [i**2 for i in np.arange(0, 5.5, 0.5)]
In [20]:
difs = np.diff(f)/0.5
In [21]:
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f[0:3])
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs[0:3])
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[21]:
<matplotlib.legend.Legend at 0x111004a20>

In [22]:
f = [2**i for i in np.arange(0, 5.5, 0.5)]
difs = np.diff(f)/0.5
In [23]:
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o', label = f[0:3])
plt.title("Plot of $f$")
plt.legend(loc = 'best', frameon = False)
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post', label = difs[0:3])
plt.title("Plot of differences $v$")
plt.legend(loc = 'best', frameon = False)
Out[23]:
<matplotlib.legend.Legend at 0x110dce0b8>

In [24]:
f = [np.cos(i) for i in np.arange(0, 2*np.pi, np.pi/8)]
difs = np.diff(f)
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o')
plt.axhline(color = 'black')
plt.title("Plot of $f$")
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'pre')
plt.axhline(color = 'black')
plt.title("Plot of differences $v$")
Out[24]:
<matplotlib.text.Text at 0x1115071d0>

Adding Many Points¶
We recognize the sequences in closed form for linear, quadratic, exponential, and trigonometric functions here. Remember however, that when we were plotting these functions, we were really just using a large number of points to plug into our function. This means we can still use the plotting instructions from above.
One difference is that we will need to compute the width of the intervals based on how many points we are using. For example, if we use 1000 points on the interval [0, 5], the intervals are each
.
We can divide the entire list of differences by this with the division operation, and we will have our differences to plot. In general, for any interval \([a, b]\) with \(n\) subdivisions, we will have
In [25]:
def f(x):
return x**2
a = 0
b = 5
n = 1000
w = (b - a)/n
x = np.linspace(a, b, n)
f = f(x)
In [26]:
difs = np.diff(f)/w
In [27]:
plt.figure(figsize = (11, 5))
plt.subplot(1, 2, 1)
plt.plot(f, '--o')
plt.title("Plot of $f$")
plt.subplot(1, 2, 2)
plt.step(np.arange(len(difs)), difs, '--o', where = 'post')
plt.title("Plot of differences $v$")
Out[27]:
<matplotlib.text.Text at 0x1115e74e0>

Explore the other examples and see if you can investigate some connections between the kinds of functions \(f\) and its differences \(v\).
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Smaller and Smaller Intervals¶
GOALS:
- Understand a definition of the derivative
- Interpret derivative as a function
- Interpret derivative as slope of tangent line
In the last notebook, we were investigating the connection between a sequence of values and smaller and smaller intervals for differences between these terms. Now, we will investigate some formal definitions for this operation of successively smaller differences as a Derivative. The big idea is that we have a function \(f\), we can define it’s derivative as:
We can interpret the derivative in a few ways. As we started looking at the difference between terms, we can consider the derivative as a measure of the rate of change of some sequence of numbers.
Similarly, if we have a closed form sequence rather than a sequence we can regard the derivative as a function that represents the rate of change of our original function.
Derivative as Function with Python¶
First, we can investigate the derivative of a function using Sympy’s
diff
function. If we enter a symbolic expression \(f\) in terms
of some variable \(x\), we will be able to get the derivative of
this expression with sy.diff(f, x)
. A simple example follows.
In [2]:
x = sy.Symbol('x')
f = x**2 - 2*x + 1
sy.diff(f, x)
Out[2]:
2*x - 2
In [3]:
sy.diff(f, x, 2)
Out[3]:
2
Similarly, we can define a function that is a good approximation for the
derivative by assigning a very small change \(dx\). For example, we
define the function above in our familiar fashion, and use the
definition above for a derivative function df
.
In [4]:
def f(x):
return x**2 - 2*x + 1
In [5]:
def df(x):
h = 0.000001
return (f(x + h) - f(x))/h
In [6]:
x = np.linspace(-4, 4, 1000)
plt.plot(x, f(x), label = '$f(x)$')
plt.plot(x, df(x), label = '$f\'(x)$')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.legend(loc = 'best', frameon = False)
plt.title("Function and its Derivative")
Out[6]:
<matplotlib.text.Text at 0x117b7aef0>

In [7]:
def f(x):
return (x -2)*(x -1) * (x + 1)*(x + 3)*(x +5)
In [8]:
x = np.linspace(-5, 2, 1000)
plt.plot(x, f(x), label = '$f(x)$')
plt.plot(x, df(x), label = '$f\'(x)$')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.legend(frameon = False)
plt.title("Another function and its Derivative")
Out[8]:
<matplotlib.text.Text at 0x118135828>

We can glean important information from the plot of the derivative about the behavior of the function we are investigating, particularly the maximum and minimum values. Perhaps we would only have the plot of the derivative, can you tell where the max and min values occur?
In [9]:
x = np.linspace(-5, 2, 1000)
plt.plot(x, df(x), label = '$f\'(x)$')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.legend(frameon = False)
plt.title("Just the Derivative")
Out[9]:
<matplotlib.text.Text at 0x1182b8e10>

In [10]:
x = sy.Symbol('x')
df = sy.diff(f(x), x)
df = sy.expand(df)
df
Out[10]:
5*x**4 + 24*x**3 - 6*x**2 - 72*x + 1
Derivative as Tangent Line¶
We can also interpret the derivative as the slope of a tangent line, or better yet an approximation for this tangent line. For example, the derivative at some point \(a\) can be thought of as the slope of the line through \(a\) and some other point \(a + \Delta x\) where \(\Delta x\) is very small.
Again, we would have something like:
for some arbitrarily small value of \(\Delta x\). We can also understand the expression above as providing us a means of approximating values close to \(x = a\) using the tangent line. If we rearrange terms, notice:
What this does, is tells us the slope of the line tangent to the graph at the point \((a, f(a))\). Suppose we have the function \(f(x) = x^2\), and we want to know the slope of the tangent line at \(x = 2\). We can define a function as usual, and a function for the derivative. We can use these to write an equation of the line tangent to the graph of the function at this point.
In [11]:
def f(x):
return x**2
def df(x):
h = 0.00001
return (f(x + h) - f(x))/h
In [12]:
df(2)
Out[12]:
4.000010000027032
In [15]:
def tan_plot(a):
x = np.linspace((a-4), (a+4), 1000)
y = df(a)*(x - a) + f(a)
plt.plot(x, f(x))
plt.plot(a, f(a), 'o', markersize = 10)
plt.plot(x, y, '--k')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
In [16]:
tan_plot(2)

In [17]:
def g(x):
return x*(x+2)*(x - 3)
def dg(x):
h = 0.000001
return ((g(x+h)-g(x))/h)
In [18]:
x = np.linspace(-3, 4, 1000)
plt.plot(x, g(x))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
Out[18]:
<matplotlib.lines.Line2D at 0x1189c43c8>

In [19]:
x = sy.Symbol('x')
df = sy.diff(g(x), x)
df = sy.simplify(df)
a, b = sy.solve(df, x)
In [20]:
a, b
Out[20]:
(1/3 + sqrt(19)/3, -sqrt(19)/3 + 1/3)
In [21]:
x = np.linspace(-3, 4, 1000)
plt.plot(x, g(x))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.plot(a, g(a), 'o')
plt.plot(b, g(b), 'o')
Out[21]:
[<matplotlib.lines.Line2D at 0x118a7fe10>]

In [22]:
def tan_plot(a, b):
x = np.linspace(-5,5, 1000)
y1 = dg(a)*(x - a) + g(a)
y2 = dg(b)*(x - b) + g(b)
plt.plot(x, g(x))
plt.plot(a, g(a), 'o', markersize = 10)
plt.plot(b, g(b), 'o', markersize = 10)
plt.plot(x, y1, '--k')
plt.plot(x, y2, '--k')
plt.ylim(-15, 15)
plt.xlim(-4,4)
plt.axhline(color = 'black')
plt.axvline(color = 'black')
In [23]:
tan_plot(a, b)

In [144]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
from matplotlib import patches
Arbitrating Disputes with John Nash¶
This notebook introduces the use of max/min problems in arbitrating a dispute between two parties. For example, suppose you have a bachelor or bachelorette party to plan. The parties involved will likely have different opinions to how and where the participants think the days and nights should be spent. Is there a way to determine an outcome that is most amenable to all?
Mathematician John Nash believed there was, and he deployed a method that relied on maximizing a product of values, much like Fermat’s original problem. Now, however, Nash’s quantities represented peoples preferences for certain outcomes. This ideal solution is referred to as the Nash Equilibrium, found using the calculus of maximums and minimums of products.

First Example¶
Let’s suppose you are planning a weekend away with a friend. You have three choices:
- Stay in city and do nothing (S)
- Bike upstate and go camping (C)
- Go to Rockaways for weekend (B)
You and your friend have different preferences for these, and could measure them by assigning some numeric value to each choice. We call these utilities. For example, we can denote your utility as \(u_y\) and your friends utilities as \(u_f\). If you were to assign values to each of these choices, the result may look something like:
and for your friend
In [145]:
uy = [2, 5, 7]
uf = [1, 3, 6]
In [146]:
np.diff(uy)
Out[146]:
array([3, 2])
Thus, we have the relationship:
If you think about when these would be the same, you would have to multiply the second equation by \(3/2\). You can interpret this as saying that your preference for going camping is one and a half times greater than your preference for going to the beach.
In [147]:
np.diff(uf)
Out[147]:
array([2, 3])
Seems we have the exact opposite outcome for your friend.
We can also see from this that preferences needn’t be the exact same values for them to represent the same preference.
Accordingly, we will say that two utilities \(u\) and \(v\) are equivalent when you have numbers \(a\) and \(b\) such that \(v = au + b\).
For example, the utilities:
and
are equivalent because:
Payoff Polygon¶
We are able to visualize the preferences of two parties by assinging an axis to each participant, and plotting the corresponding utility for each choice as an ordered pair. Before we do this, we want to normalize the utilities by creating a common base value around one option. We can use the \(S\) to normalize the first utilities as:
and
For example, consider our first example with \(u_f\) and \(u_y\). We will plot \(u_f\) as the \(x\) coordinates and \(u_y\) as the \(y\) coordinates.
In [148]:
C = [-3, -2]
S = [0, 0]
B = [2, 3]
In [149]:
a = np.array([C, S, B])
In [150]:
np.shape(a)
Out[150]:
(3, 2)
In [151]:
fig1 = plt.figure(figsize = (8, 5))
ax1 = fig1.add_subplot(111)
ax1.add_patch(patches.Polygon(a, fill = False))
plt.xlim(-4, 4)
plt.ylim(-3, 4)
plt.axhline()
plt.axvline()
plt.title("Payoff Polygon")
Out[151]:
<matplotlib.text.Text at 0x115e54d30>

The ideal outcome here should be fairly obvious, as both participants showed a clear preference for the beach. Suppose however, we reverse the scenario, and now have your friends preferences inverted from yours.
In [152]:
C = [-1, 2]
S = [0,0]
B = [3, -1]
a = np.array([C, S, B])
In [153]:
fig1 = plt.figure(figsize = (8, 5))
ax1 = fig1.add_subplot(111)
ax1.add_patch(patches.Polygon(a, fill = False))
plt.xlim(-4, 4)
plt.ylim(-3, 4)
plt.axhline()
plt.axvline()
plt.title("Payoff Polygon II")
Out[153]:
<matplotlib.text.Text at 0x115ff15c0>

If we consider the solutions here, we note that for both participants we have normalized these values so that the preference for \(S\) is 0. Hence, any solution should have better outcomes than this, which would mean it has positive values of \(x\) and \(y\). This is called the Pareto condition formally, but it should make intuitive sense.
Additionally, Nash proposed other important criteria for a fair solution and determine that there is exactly one arbitration solution that satisfies these conditions:
\(~\)
Nash’s Theorem
\(~\)
if there are no points in \(P\) with \(x > 0\) or \(y > 0\), let \(N = (0, 0)\).
\(~\)
if there are no points in \(P\) with \(y > 0\), but are points with \(y = 0\) and \(x > 0\), then let \(N\) be the point \((x, 0)\) which maximizes \(x\). Handle the case with \(x\) and \(y\) interchanged similarly.
\(~\)
if there are points in \(P\) with both \(x > 0\) and \(y>0\), let \(N\) be the point with \(X > 0\) and \(y > 0\) which maximizes the product \(xy\).
Focus on the last criteria. Nash claims that the ideal solution to the problem is the maximum of \(xy\) in the first quadrant. Let’s just consider this region.
In [154]:
fig1 = plt.figure(figsize = (8, 5))
ax1 = fig1.add_subplot(111)
ax1.add_patch(patches.Polygon(a, fill = False))
plt.xlim(0, 2)
plt.ylim(0, 1.5)
plt.title("Payoff Polygon in the First Quadrant")
Out[154]:
<matplotlib.text.Text at 0x116065a20>

The question is where is the maximum of \(xy\) here?
We can use our maximization work to answer this. First however, we want to express \(y\) in terms of \(x\) or vice versa so we have an easier expression to deal with.
The line segment we see above is the result of drawing a straight line through the points \((-1, 2)\) and \((3, -1)\). We will write an equation for a line through these two points.
We will spend some time here as this is an important skill for us to remember from our work with sequences.
Arithmetic Sequences and Linear Functions¶
In the example above, we can consider this as the problem involving a sequence who’s terms are the \(y\) values and indicies are the \(x\) values. Thus, we have something like:
It seems odd to index something at \(-1\) but it’s arbitrary. What matters is the amount of terms from the start to the start. In this example we start at 2 and then:
Because this is arithmetic, we know there is a constant change between terms. Hence, we have a change of \(-3\)(from 2 to -1)$ over four intervals, hence we can define our sequence as:
Generally, we can simply divide the difference in the terms by that of the indicies to get the common difference.
This is essentially the same as our slope terminology for a linear function, and we can use the point slope form of a linear equation to get the equation.
In [155]:
x, y = sy.symbols('x y')
In [156]:
y = -3/4*(x - (-1)) + 2
In [157]:
y
Out[157]:
-0.75*x + 1.25
In [158]:
def f(x):
return -0.75*x + 1.25
In [159]:
x = np.linspace(0, 2, 100)
plt.plot(x, f(x))
plt.axhline(color = 'black')
Out[159]:
<matplotlib.lines.Line2D at 0x1161c76a0>

Thus, our problem has become maximizing \(xy\) or \(x(\frac{-3}{4}x + \frac{5}{4})\).
In [160]:
def p(x):
return x*f(x)
In [161]:
x = np.linspace(0, 2, 100)
plt.plot(x, p(x))
plt.axhline(color = 'black')
Out[161]:
<matplotlib.lines.Line2D at 0x112d17fd0>

In [162]:
x = sy.Symbol('x')
In [163]:
eq = sy.diff(p(x), x)
In [164]:
sy.solve(eq)
Out[164]:
[0.833333333333333]
In [165]:
x = np.linspace(0, 2, 100)
plt.plot(x, p(x))
plt.axhline(color = 'black')
plt.plot(0.83333, p(0.8333), 'o', markersize = 14, alpha = 0.8)
Out[165]:
[<matplotlib.lines.Line2D at 0x1164517f0>]

In [166]:
f(0.833333333)
Out[166]:
0.62500000025
In [167]:
fig1 = plt.figure(figsize = (8, 5))
ax1 = fig1.add_subplot(111)
ax1.add_patch(patches.Polygon(a, fill = False))
plt.plot(0.833333, f(0.83333333), 'o', markersize = 10, label = (0.833, 0.625))
plt.legend()
plt.xlim(-4, 4)
plt.ylim(-3, 4)
plt.axhline()
plt.axvline()
plt.title("Payoff Polygon II")
Out[167]:
<matplotlib.text.Text at 0x1164a0390>

Interpreting the Solution¶
Thus, there is not a clear choice that wins, and instead we are at a solution somewhere between the beach and mountains. This could be understood as a lottery situation where we have some mixture of possibilities beach and mountain. We deine a lottery as follows:
Lottery: Some lottery \(L\) between outcomes \(A\) and \(B\), with probability \(p\) of \(A\) and \(1-p\) for \(B\). Then, for any utility function \(u\) the utility of the lottery is
Thus, in the example above, our \(u\) is the line we see the point on, and which we defined earlier.
or
We can solve either one of these to obtain our \(p\).
In [168]:
p = sy.Symbol('p')
eq = -1*p + (1-p)*3 - 0.833
p = sy.solve(eq)
In [169]:
p
Out[169]:
[0.541750000000000]
In [170]:
type(p)
Out[170]:
list
In [171]:
p.append(1-p[0])
In [172]:
p
Out[172]:
[0.541750000000000, 0.458250000000000]
Thus, if we have a lottery where the probability of beach is near 0.54 and the mountains is 0.46, we would have the best solution.
In [44]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Maximum and Minimum Values¶
Both Pierre de Fermat and G. Wilhelm Leibniz wrote about problems involving maxima and minima of curves. Fermat was interested in the problem of cutting a line so that the product of the length of the resulting segments was maximized.
Leibniz was interested in understanding the sign change in differences as a way to notice maximum and minimum values of curves. We follow Leibniz to investigate two simple problems. First, consider sequence \(f_i = i^2\) for \(i\) in \([-2,2]\).
In [45]:
f = [i**2 for i in np.linspace(-2,2,10)]
In [46]:
plt.plot(f, '--o')
Out[46]:
[<matplotlib.lines.Line2D at 0x10d65ab00>]

Now, consider the differences of these terms. Recall that we can find
this with the np.diff()
function. We also include a horizontal axis
to note the intersection and note something about the connection to
where it seems the minimum value occurs.
In [47]:
diffs = np.diff(f)
In [48]:
plt.plot(diffs, '--o')
plt.axhline(color = 'black')
plt.title("Plot of Differences")
Out[48]:
<matplotlib.text.Text at 0x10d6ad438>

Another example is a similar parabola flipped and pushed up one unit. The result has a maximum value instead of a minimum as we saw in the first example.
In [49]:
f = [-(i**2)+ 1 for i in np.linspace(-2,2,10)]
In [50]:
plt.plot(f, '--o')
Out[50]:
[<matplotlib.lines.Line2D at 0x10d8b7d30>]

In [51]:
diffs = np.diff(f)
In [52]:
plt.plot(diffs, '--o')
plt.axhline(color = 'black')
Out[52]:
<matplotlib.lines.Line2D at 0x10d8d1278>

Smaller Intervals¶
Let’s suppose we have much smaller intervals between points. For example, suppose we take 100 points on the interval \(x = [-2,2]\). We would be interested in the change between terms, however now this is the number:
In [57]:
def f(x):
return (x+2) * (x) * (x - 1)
In [58]:
x = np.linspace(-2,2, 100)
dx = (4)/(100)
In [59]:
diffs = np.diff(f(x))/dx
In [68]:
plt.figure(figsize = (11, 6))
plt.subplot(1, 2, 1)
plt.plot(x, f(x), '--o')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.title("Original Function $f(x) = x(x+2)(x-1)$")
plt.subplot(1, 2, 2)
plt.plot(x[1:], diffs, '--o')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.title("Differences in Terms")
Out[68]:
<matplotlib.text.Text at 0x10f69d828>

We notice something about the maximum and minimum values, at least locally. When the graph of the function changes direction, the differences are zero. Also, the point considered a local maximum that occurs between \(x = -1.5\) and \(x = -1.0\), the differences change from positive to negative. Similarly for the local minimum near \(x = 0.5\), the differences change from negative to positive.
In [70]:
def f(x):
return -1*(x+2) * (x) * (x - 1)
In [71]:
x = np.linspace(-2,2, 100)
dx = (4)/(100)
diffs = np.diff(f(x))/dx
In [73]:
plt.figure(figsize = (11, 6))
plt.subplot(1, 2, 1)
plt.plot(x, f(x), '--o')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.title("Original Function $f(x) = -x(x+2)(x-1)$")
plt.subplot(1, 2, 2)
plt.plot(x[1:], diffs, '--o')
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.title("Differences in Terms")
Out[73]:
<matplotlib.text.Text at 0x10fc3c278>

Problems¶
Optics¶
In [1]:
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
import cmath as cm
from scipy.misc import derivative
Newton’s Method¶
Newton’s method is a way to use the tangent line to approximate zeros of functions. A simple example to motivate the idea is to consider the example of the function \(f(x) = x^2 - 1\). Obviously this function has zeros at \(x = -1\) and \(x = 1\).
Below, we plot the function, and a line tangent to the graph of the function at the points \((1.1, f(1.1))\) and \((0.9, f(0.9))\).
In [2]:
def f(x):
return (x**2 - 1)
def tan_line(a):
return derivative(f, a)*(x - a) + f(a)
In [3]:
x = np.linspace(-2,2,100)
In [4]:
plt.plot(x, f(x))
plt.plot(x, tan_line(1.1))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.title("Tangent at x=1.1")
Out[4]:
<matplotlib.text.Text at 0x10f289978>
In [11]:
from mpl_toolkits.axes_grid1.inset_locator import zoomed_inset_axes
from mpl_toolkits.axes_grid1.inset_locator import mark_inset
fig, ax = plt.subplots(figsize = (10, 7)) # create a new figure with a default 111 subplot
ax.plot(x, f(x))
ax.plot(x, tan_line(a = 1.2))
plt.ylim(-2,2)
ax.axhline(color = 'black')
axins = zoomed_inset_axes(ax, 3.5, loc=2)
axins.plot(x, f(x))
axins.plot(x, tan_line(a = 1.2))
x1, x2, y1, y2 = 0.8,1.2, -0.2, 0.2
axins.set_xlim(x1, x2)
axins.set_ylim(y1, y2)
plt.yticks(visible=False)
plt.xticks(visible=False)
mark_inset(ax, axins, loc1=1, loc2=3, fc="none", ec="0.5")
axins.axhline(color = 'black')
Out[11]:
<matplotlib.lines.Line2D at 0x1127fd160>
In [5]:
def df(x):
dx = 0.000001
return (f(x + dx) - f(x))/dx
In [12]:
def N(x):
return x - f(x)/df(x)
In [13]:
N(1)
Out[13]:
1.0
In [14]:
l = [1.3]
for i in range(8):
z = N(l[i])
l.append(z)
In [15]:
l
Out[15]:
[1.3,
1.0346154866690602,
1.0005790875782625,
1.0000001678633645,
1.000000000000098,
1.0,
1.0,
1.0,
1.0]
In [16]:
N(1.3478260274025091)
Out[16]:
1.0448808843824686
In [17]:
def f(x):
return x**3 - x
In [18]:
l = [1+.5j]
In [19]:
for i in range(120000):
z = N(l[i])
l.append(z)
In [20]:
X = [x.real for x in l]
Y = [x.imag for x in l]
In [21]:
X
Out[21]:
[1.0,
0.8402370247174764,
0.9215077851434575,
0.9920869754787999,
0.9992702709978157,
1.0000004025142377,
0.9999999999989508,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
1.0,
...]
In [22]:
def f(x):
return x**2 - 1
from scipy.misc import derivative
x = np.linspace(-2, 2, 1000)
df = derivative(f, x)
In [24]:
plt.figure()
plt.plot(df)
plt.plot(f(x))
Out[24]:
[<matplotlib.lines.Line2D at 0x1075716d8>]
In [28]:
def N(x):
return x - f(x)/derivative(f, x)
In [29]:
N(1)
Out[29]:
1.0
In [30]:
N(0)
/Users/NYCMath/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:2: RuntimeWarning: divide by zero encountered in double_scalars
from ipykernel import kernelapp as app
Out[30]:
inf
In [31]:
N(.4)
Out[31]:
1.4500000000000002
In [34]:
plt.figure()
plt.plot(x, f(x))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
Out[34]:
<matplotlib.lines.Line2D at 0x11397cf28>
In [33]:
N(-1)
Out[33]:
-1.0
In [35]:
N(N(-1))
Out[35]:
-1.0
In [36]:
N(-1.2)
Out[36]:
-1.0166666666666666
In [37]:
N(N(-1.2))
Out[37]:
-1.000136612021858
In [38]:
def ns(x0, N):
for i in range(N):
start = [x0]
In [39]:
def f(x):
return (x-2)**2 - 2
x = np.linspace(0, 5, 100)
In [42]:
plt.figure()
plt.plot(x, f(x))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
Out[42]:
<matplotlib.lines.Line2D at 0x113c6c2b0>
In [43]:
def df(a):
return derivative(f, a)*(x - a) + f(a)
In [49]:
fig, ax = plt.subplots()
import matplotlib.patches as patches
plt.plot(x, f(x))
plt.plot(1.4, f(1.4), 'o', markersize = 10, alpha = 0.6)
plt.plot(x, df(a = 1.3))
plt.plot(x, df(a = 1.2))
plt.plot(x, df(a = 1.1))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.xlim(0,1.5)
plt.ylim(f(1.4)-0.2, 2)
plt.plot(x, df(a = 1.4))
Out[49]:
[<matplotlib.lines.Line2D at 0x114dd3828>]
In [50]:
def f(x):
return x**2
x = np.linspace(-2,2,100)
width = (2-(-2))/100
difs = np.diff(f(x))
plt.figure(figsize=(10, 7))
plt.subplot(2, 2, 1)
plt.plot(x, f(x), '-o', markersize = 1)
plt.title("Function")
plt.subplot(2, 2, 2)
plt.plot(x[1:], difs, '-o', markersize = 1)
plt.axhline(color = 'black')
plt.title("Differences")
plt.subplot(2, 2, 3)
plt.plot(x,f(x))
plt.title("Function")
plt.subplot(2,2,4)
plt.plot(x, derivative(f, x))
plt.axhline(color = 'black')
plt.title("Derivative")
Out[50]:
<matplotlib.text.Text at 0x1150f9cf8>
In [ ]:
plt.plot(f(x), animated = True)
In [51]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation, rc
from IPython.display import HTML
In [52]:
# First set up the figure, the axis, and the plot element we want to animate
fig, ax = plt.subplots()
ax.set_xlim(( 0, 2))
ax.set_ylim((-2, 2))
line, = ax.plot([], [], lw=2)
In [53]:
def init():
line.set_data([], [])
return (line,)
In [54]:
def animate(i):
x = np.linspace(0, 2, 1000)
y = np.sin(2 * np.pi * (x - 0.01 * i))
line.set_data(x, y)
return (line,)
In [55]:
anim = animation.FuncAnimation(fig, animate, init_func=init,
frames=100, interval=20, blit=True)
HTML(anim.to_html5_video())
Out[55]:
In [56]:
rc('animation', html='html5')
anim
Out[56]:
In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Parametric Descriptions¶
To this point, we have primarily seen relationships between quantities expressed in explicit terms of one another. In other words we usually have something like \(y=\) or \(x =\), with the other side of the expression having expressions in terms of another variable.
Alternatively, we can describe the relationships between quantities in terms of their value at some additional parameter. The example of displacement of a projectile body is a sensible example, where you would want to know the position of the particle at some time \(t\). For this, we have an expression for both the \(x\) and \(y\) coordinates of the body in terms of some time \(t\) for \(0\leq t \leq 25\).
We can plot this in Python by defining the values of \(t\) and functions \(x(t)\) and \(y(t)\) as usual.
This also means, for example, that at time \(t=5\), the body is located at \((1000, 1600)\).
In [3]:
t = np.linspace(0, 25, 25)
def x(t):
return 200*t
def y(t):
return 400*t - 16*t**2
In [4]:
plt.plot(x(t), y(t), '-o')
plt.ylim(0, 2500)
plt.xlim(0,5000)
plt.title("Parametric Representation")
Out[4]:
<matplotlib.text.Text at 0x10e73cba8>

Calculus on Parametrically Defined Functions¶
We can operate on the expressions as we had earlier. Now, we are describing the rate of change independently in the \(x\) and \(y\) direction in terms of a unit of time \(t\). Consider two points \(P_0 = (x(0), y(0))\) and \(P_1 = (x(1), y(1))\).
The change in \(x\) would be given by \(x_1 - x_0\). The change in \(y\) would be found with \(y_1 - y_0\), just as we have seen before. This is simple because we are using time intervals of 1, however just like before we can imagine that shrinking these to infinitely small intervals gives us a derivative in terms of \(t\). Recall our earlier description of the tangent as:
Sympy allows us to find all of these derivatives easily.
In [5]:
x, y, t = sy.symbols('x y t')
def x(t):
return 200*t
def y(t):
return 400*t - 16*t**2
In [6]:
dxt = sy.diff(x(t), t)
In [7]:
dyt = sy.diff(y(t), t)
In [8]:
dydx = dyt/dxt
dydx
Out[8]:
-4*t/25 + 2
Tangent Lines¶
In [9]:
t = np.linspace(-3,3, 100)
def x(t):
return t**2 + 1
def y(t):
return t**3 - 4*t
plt.plot(x(t), y(t))
plt.axhline(color = 'black')
Out[9]:
<matplotlib.lines.Line2D at 0x10e730c50>

In [10]:
x, y, t = sy.symbols('x y t')
def x(t):
return t**2 + 1
def y(t):
return t**3 - 4*t
In [11]:
dx = sy.diff(x(t), t)
dy = sy.diff(y(t), t)
eq = dy/dx
In [12]:
a, b = sy.solve(eq)
In [13]:
t = np.linspace(-3,3, 100)
def x(t):
return t**2 + 1
def y(t):
return t**3 - 4*t
plt.plot(x(t), y(t), '-o', markersize = 2)
plt.plot(x(a), y(a), 'o', markersize = 8)
plt.plot(x(b), y(b), 'o', markersize = 8)
plt.axhline(color = 'black')
plt.axhline(y(a), ls = '--', color = 'grey')
plt.axhline(y(b), ls = '--', color = 'grey')
Out[13]:
<matplotlib.lines.Line2D at 0x10ee5df60>

Additional Examples¶
Some additional examples of parametric curves are as follows:
Line: Through point \((a, b)\) with slope \(m\) \(\rightarrow c(t) = (a + t, b + mt)\)
Circle: Radius \(r\) centered at \((a, b)\) \(\rightarrow x(\theta) = a + r\cos(\theta) \quad y(\theta) = b + r\sin(\theta)\)
Ellipse: An ellipse of the form \((\frac{x}{a})^2+ (\frac{y}{b})^2 = 1\) is parameterized by \(\rightarrow c(t) = (a\cos(t), b\sin(t))\)
In [14]:
t = np.linspace(0, np.pi*2, 100)
x = 4*np.cos(t)
y = 2*np.sin(t)
plt.plot(x, y, '-o', c = 'green')
plt.xlim(min(x), max(x))
plt.ylim(min(y), max(y))
plt.title("An Ellipse")
plt.axhline(color = 'black')
plt.axvline(color = 'black')
Out[14]:
<matplotlib.lines.Line2D at 0x10ef9cbe0>

In [15]:
t = np.linspace(0, np.pi*4, 24)
x = np.cos(t)
y = np.cos(t)**2
plt.plot(x, y, '-o', c = 'red')
plt.xlim(min(x), max(x))
plt.ylim(min(y), max(y))
plt.title("Back and Forth Motion")
Out[15]:
<matplotlib.text.Text at 0x10eff39b0>

In [16]:
t = np.linspace(0, 13, 40)
x = t - np.sin(t)
y = 1 - np.cos(t)
plt.plot(x, y, '-o')
plt.xlim(min(x), max(x))
plt.ylim(min(y), max(y))
plt.title("The Cycloid")
Out[16]:
<matplotlib.text.Text at 0x10f115358>

Problems¶
Interpolation¶
Let’s suppose we have two points, defined parametrically, \(P_0 = (1, 3)\) and \(P_1 = (3, 7)\). Suppose also, that we have a body in motion between the points. This body at time \(t=0\) is at \(P_0\) and at time \(t=1\) is at \(P_1\). We can intuit the parametric equations for \(x(t)\) and \(y(t)\) as:
The plot below demonstrates the simple situation.
In [16]:
x = [1, 3]
y = [3, 7]
plt.plot(x, y, '-o')
plt.xlim(min(x)-1, max(x)+1)
plt.ylim(min(y)-1, max(y)+1)
Out[16]:
(2, 8)

We can extend the problem to three points as follows. Further, we can use numpy to interpolate the segments. To do so, we need to describe the interval of \(x\) values that we want to interpolate the curve. Here, we start at \(x = 1\) and end at \(x = 6\).
In [53]:
x = np.linspace(1, 6, 100)
xs = [1, 3, 6]
ys = [3, 7, 5]
plt.plot(xs, ys, '-o')
curve = np.interp(x, xs, ys)
plt.plot(x, curve)
plt.xlim(min(xs)-1, max(xs)+1)
plt.ylim(min(ys)-1, max(ys)+1)
Out[53]:
(2, 8)

If we consider the same problem for any two points \((x_0, y_0), (x_1, y_1)\) we would have:
and more generally in terms of the points \(P_0\) and \(P_1\) we have
We can hypothesize that this relationship would hold true for further points. For example, if we add a third point \(P_2\) it seems reasonable to expect that we would have an expression combining these points with some functions \(\alpha(t)\) as follows.
We can extend the problem above and again determine that equal time intervals will be traversed between each pair of points, and that at time \(t = 0, 1, 2\) we are at points \(P_0, P_1, P_2\) respectively to form a polynomial.
EXAMPLE: Suppose we now take points \(P_0 = (-1,9), P_1 = (2, 3), P_2 = (3, 5)\). Thus \(\alpha_0(0) = 1, \alpha_0(1) = 0, \alpha_0(2) = 0\). If we assume that the curve passing through these points is only zero at \(t = 1, 2\) we have \(\alpha_0(t) = a(t - 1)(t - 2)\) for some \(a\), but we also know that at \(\alpha_0(0) = 1\) so
Through similar procedures verify that \(\alpha_1(t) = -t(t-2)\) and \(\alpha_2(t) = \frac{t(t-1)}{2}\), or
which gives
In [86]:
def x(t):
return -t**2 + 4*t -1
def y(t):
return 4*t**2 - 10*t + 9
t = np.linspace(-2,2,100)
P = [-1,2,3]
Y = [9, 3, 5]
plt.plot(x(t), y(t), '--k')
plt.plot(P, Y, '-o')
plt.xlim(-3,4)
plt.ylim(0,10)
Out[86]:
(0, 10)

And with four points we have a similar example, where we now use the
scipy.interpolate
function.
In [62]:
x = [0, 1, 3, 5]
y = [4, 2, -1, 5]
curve = np.interp(x, x, y)
In [65]:
from scipy import interpolate
i1 = interpolate.splrep(x, y, s=0)
ynews = interpolate.splev(xs, i1, der = 0)
plt.plot(x, y, 'o')
plt.plot(x, curve, label = "Linear Interpolation")
plt.plot(xs, ynews, '--k', label = "Polynomial Interpolation")
plt.xlim(min(x)-1, max(x)+1)
plt.ylim(min(y)-1, max(y)+1)
plt.legend(loc = 'best', frameon = False)
Out[65]:
<matplotlib.legend.Legend at 0x119b6a208>

Bezier Curves¶
A variant of the problem of interpolating polynomials is the use of computer graphics programs to manipulate a curve. This is the problem that him and him encountered as engineers in for automotive manufacturers in the 1960’s. The idea, rather than finding the interpolating polynomial, is to form a polynomial contained withing the framework of the linear piecewise image.
For example, we can create a simple quadratic with two sections similar to our earlier shapes. Now, suppose we have a point affixed on each of these sections joined by a bar. Also, suppose these points traverse the distance between successive vertices in equal time. We can trace out a simple quadratic by following the point \(P\) as the bar moves.
In [87]:
x = [1, 3, 6]
y = [1, 5, 4]
plt.figure(figsize = (10,6))
plt.plot(x, y, '-o')
plt.plot((x[0]+x[1])/2, (y[0]+y[1])/2, '-o')
plt.plot((x[1]+x[2])/2, (y[1]+y[2])/2, 'o')
plt.xlim(min(x)-1, max(x)+1)
plt.ylim(min(y)-1, max(y)+1)
plt.annotate('$P_0$', xy = (x[0]-0.5, y[0]))
plt.annotate('$P_1$', xy = (x[1]-0.5, y[1]))
plt.annotate('$P_2$', xy = (x[2]-0.5, y[2]+0.5))
plt.annotate('$P_{01}$', xy = ((x[0]+x[1])/2 - 0.4, (y[0]+y[1])/2))
plt.annotate('$P_{12}$', xy = ((x[2]+x[1])/2 - 0.4, (y[2]+y[1])/2 + 0.3))
a = [(x[0]+x[1])/2, (x[1]+x[2])/2]
b = [(y[0]+y[1])/2, (y[1]+y[2])/2]
plt.plot(a, b, color = 'blue')
plt.plot((a[0]+a[1])/2, (b[0]+b[1])/2, 'o')
plt.annotate('$P$', xy = ((a[0]+a[1])/2, (b[0]+b[1])/2+.2))
plt.title("A Simple Bezier Spline Generated by Point $P$")
bez = [x[0], (a[0]+a[1])/2, x[2]]
bezy = [y[0], (b[0]+b[1])/2, y[2]]
i1 = interpolate.interp1d(bez, bezy, kind = 'quadratic')
xs = np.linspace(1, 6, 1000)
plt.plot(xs, i1(xs), '--k')
Out[87]:
[<matplotlib.lines.Line2D at 0x11938e630>]

The Bezier quadratic is given by the function \(P(t)\) as \(t\) goes from 0 to 1. It takes the form:
or parametrically as
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Composition and the Chain Rule¶
As you may have guessed, once we apply the idea of differentiation to simple curves, we’d like to use this method in more and more general situations. This notebook explores the use of differentiation and derivatives to explore functions formed by composition, and to use these results to differentiate implicitly defined functions.
GOALS:
- Identify situations where chain rule is of use
- Define and Use Chain Rule
- Use Chain Rule to differentiate implicitly defined functions
- Use Descartes algorithm to explore alternative approach to differentation
Composition of Functions¶
Functions can be formed by combining other functions through famililar operations. For example, we can consider the polynomial \(h(x) = x^3 + x^2\) as formed by two simpler polynomials \(f(x) = x^3\) and \(g(x) = x^2\) combined through addition. So far, we have not had to worry about this, as differentiation and integration are linear operators that work across addition and subtraction.
If we instead have a function \(h\) given by:
we may recognize the square root function and the polynomial inside of it. This was not formed by addition, subtraction, multiplication, or division of simpler functions however. Instead, we can understand the function \(h\) as formed by composing two functions \(f\) and \(g\) where:
The operation of composition means we apply the function \(f\) to the function \(g\). We would write this as
We can use SymPy to explore a few examples and determine a general rule for differentiating functions formed by compositions. We begin with trying to generalize the situation above, where we compose some function \(g\) into a function \(f\) of the form
It seems reasonable to expect that
You need to adjust this statement to make it true. Consider the following examples, use sympy to differentiate them and determine the remaining terms.
- \((x^2 - 3x)^2\)
- \((x^2 - 3x)^3\)
- \(\sqrt{x^2 - 3x}\)
In [2]:
x = sy.Symbol('x')
y1 = (x**2 - 3*x)**2
y2 = (x**2 - 3*x)**3
y3 = (x**2 - 3*x)**(1/2)
In [3]:
dy1 = sy.diff(y1, x)
sy.factor(dy1)
Out[3]:
2*x*(x - 3)*(2*x - 3)
In [4]:
dy2 = sy.diff(y2, x)
sy.factor(dy2)
Out[4]:
3*x**2*(x - 3)**2*(2*x - 3)
In [5]:
dy3 = sy.diff(y3, x)
dy3
Out[5]:
(1.0*x - 1.5)*(x**2 - 3*x)**(-0.5)
Now consider the examples for \(f(x) = \sin(g(x))\):
- \(\sin{2x}\)
- \(\sin{(\frac{1}{2}x + 3)}\)
- \(\sin{(x^2)}\)
Implicitly Defined Functions¶
Rather than being given a relationship that can be expressed in terms of a single variable, we can also apply the technique of differentiation to implicitly defined curves. For example, consider the equation for a circle centered at the origin with radius 5, i.e.:
We are unable to express \(y\) in terms of \(x\) with a single equation. We should still be able to understand things like tangent lines however, and apply the idea of differentiation to the expression. We can plot implicitly defined functions with Sympy as shown below. These should look familiar to our parametric plots, and we could introduce parameters into these equations to describe the curves parametrically.
In [6]:
x, y = sy.symbols('x y')
In [7]:
y1 = sy.Eq(x**2 + y**2, 25)
sy.plot_implicit(y1)

Out[7]:
<sympy.plotting.plot.Plot at 0x1144b0780>
In [8]:
y2 = sy.Eq(x**3 + y**3, (9/2)*x*y)
sy.plot_implicit(y2)

Out[8]:
<sympy.plotting.plot.Plot at 0x1168e74a8>
In [9]:
y3 = sy.Eq(sy.sin(x + y), y**2*sy.cos(x))
sy.plot_implicit(y3)

Out[9]:
<sympy.plotting.plot.Plot at 0x1169b3908>
In [10]:
y4 = sy.Eq(y*sy.sin(3*x), x*sy.cos(3*y))
sy.plot_implicit(y4)

Out[10]:
<sympy.plotting.plot.Plot at 0x116a4cef0>
Algorithmic Implicit Differentiation¶
Hopefully, you recognize the problem. There are many tangents at a given value of \(x\). However, we can determine a point on the graph and recall what we know about equations of lines. Take the second example, called the Folium of Descartes, defined as:
We know the point \((2, 1)\) is on the graph. Also, we know that if we have a tangent line at this point, it would be a linear equation with some slope \(m\) that passes through \((2, 1)\). This means
These equations agree at \((2,1)\) so we can substitute the second into the first for \(y\)
There is some algebra involved here that we will call on Sympy to help us with. Here’s the idea. From the picture, we see there will be two tangent lines at \(x = 2\). This means our equation has a double root there, or that \((x-2)^2\) is a factor of this. We can eliminate one of these roots by dividing the expression by \(x-2\), then substituting \(x=2\) into the remaining expression and solve for the slope \(m\).
In [24]:
x, y, m = sy.symbols('x y m')
expr = x**3 + y**3 -(9/2)*x*y
tan = m*(x-2) + 1
In [25]:
expr = expr.subs(y, tan)
sy.pprint(expr)
3 3
x - 4.5⋅x⋅(m⋅(x - 2) + 1) + (m⋅(x - 2) + 1)
In [26]:
expr
Out[26]:
x**3 - 4.5*x*(m*(x - 2) + 1) + (m*(x - 2) + 1)**3
In [31]:
now = sy.quo(expr, x-2)
now = now.subs(x, 2)
In [32]:
sy.solve(now, m)
Out[32]:
[1.25000000000000]
Textbook Approach¶
If we want to work directly on the expression, we can recall that \(y\) can be considered as a function of \(x\), \(y=f(x)\). When we do express things this way, the equation for the Folium above becomes
The left hand side of the equation has the term \(x^3\), whose derivative with respect to \(x\) is \(3x^2\) from our familiar rules, and then our second term, \((f(x))^3\) makes use of the chain rule and we would get:
We treat the other elements of the expression similarly, and solve for
the term \(f'\). We can use Sympy to simplify these computations and
verify the result above. The idiff
function from sympy takes the
implicit derivative with respect to the second variable. We can then
evaluate the generale derivative at our point \((2, 1)\).
In [85]:
x, y = sy.symbols('x y')
foli = x**3 + y**3 - (9/2)*x*y
dx_foli = sy.idiff(foli, y, x)
In [86]:
dx_foli.subs(x, 2).subs(y, 1)
Out[86]:
1.25000000000000
Problems¶
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
from scipy import interpolate, stats
Regression and Least Squares¶
Earlier, we encountered interpolation as a means to determine a line given only a few points. Interpolation determined a polynomial that would go through each point. This is not always sensible however.
Let’s consider the following example where the data represents cigarette consumption and death rates for countries given.
Country | Cigarette Consumption | Deaths per Million |
---|---|---|
Norway | 250 | 95 |
Sweden | 300 | 120 |
Denmark | 350 | 165 |
Australia | 470 | 170 |
If we use interpolation as we had earlier, we get the following picture.
In [2]:
cigs = [250, 300, 350, 470]
death = [95, 120, 165, 170]
xs = np.linspace(240, 480, 1000)
i1 = interpolate.splrep(cigs, death, s=0)
ynews = interpolate.splev(xs, i1, der = 0)
plt.plot(cigs, death, 'o')
plt.plot(xs, ynews, '--k')
plt.title("Interpolation Polynomial of Cigarette Data")
plt.xlabel("Cigarette Consumption")
plt.ylabel("Deaths per million")
Out[2]:
<matplotlib.text.Text at 0x10fdfd400>

Now suppose that we wanted to use our polynomial to make a prediction about a country with higher cigarette consumption than that of Australia. You should notice that our polynomial would provide an estimate that is lower.
Alternatively, we can fit a straight line to the data with a simple
numpy function polyfit
where we describe the data we are fitting and
the degree polynomial we are fitting. Here we want a straight line so we
choose degree 1.
In [3]:
a, b = np.polyfit(cigs, death, 1)
In [4]:
def l(x):
return a*x + b
In [5]:
xs = np.linspace(240, 480, 1000)
i1 = interpolate.splrep(cigs, death, s=0)
ynews = interpolate.splev(xs, i1, der = 0)
plt.plot(cigs, death, 'o')
plt.plot(xs, l(xs), label = 'Linear Regression')
plt.plot(xs, ynews, '--k', label = 'Polynomial Interpolation')
plt.title("Interpolation vs. Regression")
plt.xlabel("Cigarette Consumption")
plt.ylabel("Deaths per million")
plt.legend(loc = 'best', frameon = False)
Out[5]:
<matplotlib.legend.Legend at 0x110456d30>

Determining the Line of Best Fit¶
The idea behind our line of best fit, is that it minimizes the distance between itself and all the data points. These distances are called residuals and are shown in the plot below.
In [6]:
plt.plot(cigs, death, 'o')
plt.plot(xs, l(xs), label = 'Linear Regression')
plt.plot((cigs[0], cigs[0]), (death[0], l(cigs[0])), c = 'red')
plt.plot((cigs[1], cigs[1]), (death[1], l(cigs[1])), c = 'red')
plt.plot((cigs[2], cigs[2]), (death[2], l(cigs[2])), c = 'red')
plt.plot((cigs[3], cigs[3]), (death[3], l(cigs[3])), c = 'red')
plt.title("Residuals")
Out[6]:
<matplotlib.text.Text at 0x1104bc710>

The line of best fit minimizes these distances using familiar techniques of differentiation that we have studied. First, we investigate the criteria of least squares, that says the residuals are minimized by finding the smallest ROOT MEAN SQUARE ERROR or RMSE.
In general, we see that a residual is the distance between some actual data point \((x_i, y_i)\) and the resulting point on the line of best fit \(l(x)\) at the point \((x_i, l(x_i))\).
Suppose we were deciding between the lines
We want to compare the average difference between the actual and predicted values. We can find the residuals by creating a list of differences in terms of actual and predicted values.
In [7]:
def y1(x):
return 0.3*x + 34.75
def y2(x):
return 0.4*x + 0.5
In [8]:
plt.plot(cigs, death, 'o')
plt.plot(xs, y1(xs))
plt.plot(xs, y2(xs))
plt.title("Which is the Better Fit?")
Out[8]:
<matplotlib.text.Text at 0x110609e48>

In [9]:
resid_1 = [np.sqrt((death[i] - y1(cigs[i]))**2) for i in range(len(cigs))]
resid_2 = [np.sqrt((death[i] - y2(cigs[i]))**2) for i in range(len(cigs))]
In [10]:
resid_1
Out[10]:
[14.75, 4.75, 25.25, 5.75]
In [11]:
resid_2
Out[11]:
[5.5, 0.5, 24.5, 18.5]
In [12]:
resid_1_sq = [(a**2 )/4for a in resid_1]
resid_2_sq = [(a**2)/4 for a in resid_2]
In [13]:
np.sqrt(sum(resid_1_sq))
Out[13]:
15.089317413322579
In [14]:
np.sqrt(sum(resid_2_sq))
Out[14]:
15.596473960482221
Thus, the first line \(y1\) is considered a better fit for the data.
Deriving the Equation of the Line¶
In general, we have some line of best fit \(y\) given by:
If we have some set of points \((x_1, y_1), (x_2, y_2), (x_3, y_3)...(x_n, y_n)\). We need to minimize the sum of squares of residuals here, so we would have a number of values determined by:
which we can rewrite in summation notation as
We can consider this as a function in terms of the variable \(a\) that we are seeking to minimize.
From here, we can apply our familiar strategy of differentiating the function and locating the critical values. We are looking for the derivative of a sum, which turns out to be equivalent to the sum of the derivatives, hence we have
Setting this equal to zero and solving for \(a\) we get
The terms should be familiar as averages, and we can rewrite our equation as
We now use this to investigate a similar function in terms of \(b\) to complete our solution.
We end up with
Let’s return to the problem of cigarette consumption and test our work out by manually computing \(a\) and \(b\).
In [15]:
cigs = [250, 300, 350, 470]
death = [95, 120, 165, 170]
plt.scatter(cigs, death)
plt.title("Cigarette Consumption vs. Deaths")
Out[15]:
<matplotlib.text.Text at 0x110677710>

In [16]:
ybar = np.mean(death)
xbar = np.mean(cigs)
ydiff = (death - ybar)
xdiff = (cigs - xbar)
b = np.sum(ydiff*xdiff)/np.sum(xdiff**2)
a = ybar - b*xbar
a, b
Out[16]:
(21.621368322399249, 0.33833177132146203)
In [17]:
l = [a + b*i for i in cigs]
plt.scatter(cigs, death)
plt.plot(cigs, l, '--k', label = 'Line of Best Fit')
plt.title("Manual Fit of Data")
plt.legend(loc = 'best', frameon = False)
Out[17]:
<matplotlib.legend.Legend at 0x110898860>

We can check with numpy and see if our values for \(a\) and \(b\) agree with the computer model.
In [18]:
b2, a2 = np.polyfit(cigs, death, 1)
a2, b2
Out[18]:
(21.621368322399242, 0.33833177132146203)
Finally, we can write a simple function to compute the RMSE.
In [19]:
l
Out[19]:
[106.20431115276476, 123.12089971883786, 140.03748828491098, 180.6373008434864]
In [20]:
death
Out[20]:
[95, 120, 165, 170]
In [21]:
diff = [(l[i] - death[i])**2 for i in range(len(l))]
diff
Out[21]:
[125.53658840796878,
9.7400150550422442,
623.12699112595669,
113.15216923483649]
In [22]:
np.sqrt(np.sum(diff)/len(l))
Out[22]:
14.761061647318972
Other Situations¶
Our goal with regression is to identify situations where regression makes sense, fit models and discuss the reasonableness of the model for describing the data. Data does not always come in linear forms however.
We can easily generate sample data for familiar curves. First, we can
make some lists of polynomial form, then we will add some noise to
these, fit models with np.polyfit()
, and plot the results.
Non-Linear Functions¶
Plotting and fitting non-linear functions follows a similar pattern, however we need to take into consideration the nature of the function. First, if we see something following a polynomial pattern, we can just use whatever degree polynomial fit we believe is relevant. The derivation of these formulas follows the same structure as the linear case, except you are replacing the line \(a - bx_i\) with a polynomial \(a + bx_i + cx_i^2...\).
If we believe there to be an exponential fit, we can transform this into a linear situation using the logarithm. For example, suppose we have the following population data.
Decade \(t\) | Year | Population |
---|---|---|
0 | 1780 | 2.8 |
1 | 1790 | 3.9 |
2 | 1800 | 5.3 |
3 | 1810 | 7.2 |
If we examine the data, we see an exponential like trend. If we use NumPy to find the logarithm of the population values and plot the result, we note the transformed datas similarity to a linear function.
In [23]:
t = np.arange(0,13)
year = np.arange(1780,1910,10)
P = [2.8, 3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 39.8, 50.2, 62.9, 76.0]
In [24]:
plt.figure(figsize = (8,5))
plt.subplot(1, 2, 1)
plt.scatter(year, P,color = 'red', alpha = 0.7)
plt.xlabel('Year')
plt.ylabel('Population')
plt.title("Original Data")
plt.subplot(1, 2, 2)
lnP = np.log(P)
plt.scatter(year, lnP, color = 'green', alpha = 0.4)
plt.xlabel('Year')
plt.ylabel('Logarithm of Population')
plt.title("Transformed Data")
Out[24]:
<matplotlib.text.Text at 0x110a026d8>

Symbolically, we would imagine the original function as an exponential of the form
The expression can be explored in a similar manner, where we use Sympy to find the effect of the logarithm.
In [25]:
y, a, b, x = sy.symbols('y a b x')
In [26]:
eq = sy.Eq(y, a*sy.exp(b*x))
In [27]:
sy.expand_log(sy.log(b**x))
Out[27]:
log(b**x)
In [28]:
sy.expand_log(sy.log(a*sy.exp(b*x)), force = True)
Out[28]:
b*x + log(a)
Hence, we have that
which should look like our familiar linear equations. Here, we can find \(a\) and \(b\), then convert the equation back to its original form by undoing the logarithm with the exponential.
For kicks, we introduce the SciPy linregress
function. Feel free to
examine the help documentation for the function. This gives a little
more information about the model than the polyfit
function. Further,
we add text to the plot to display information about the model.
In [29]:
line = np.polyfit(year, lnP, 1)
fit = np.polyval(line, year)
alpha, beta, r_value, p_value, std_err = stats.linregress(year, lnP) #
alpha, beta, r_value
Out[29]:
(0.027906119028040688, -48.55083021891685, 0.99815691149842678)
In [30]:
fig = plt.figure(figsize = (10,5))
plt.subplot(1, 2, 2)
plt.plot(year, np.exp(fit))
plt.plot(year, P, 'o', markersize = 7, alpha = 0.8)
ax = fig.add_subplot(121)
text_string = "\nCorrelation coefficient: %f" % (r_value)
ax.text(0.022, 0.972, text_string, transform=ax.transAxes, verticalalignment='top', fontsize=8)
plt.title("Results of Linear Regression on ln P", fontsize = 9)
plt.xlabel("Year", fontsize = 8)
plt.ylabel("ln Population")
plt.subplot(1, 2, 1)
plt.plot(year, fit)
plt.plot(year, lnP, 'o', markersize = 7, alpha = 0.8)
ax = fig.add_subplot(122)
text_string = "\nCorrelation coefficient: %f" % (r_value)
ax.text(0.022, 0.972, text_string, transform=ax.transAxes, verticalalignment='top', fontsize=8)
plt.title("Results after returning to exponential form", fontsize = 9)
plt.xlabel("Year", fontsize = 8)
plt.ylabel("Population")
Out[30]:
<matplotlib.text.Text at 0x110c1c748>

Logistic Example¶
As you see above, towards the end of our model the actual and predicted values seem to diverge. Considering the context, this makes sense. A population should reach some maximum levels due to physical resources. A more S shaped curve is the logistic function which is given by
As an example, consider the Inter Continental Ballistic Missle Data for 1960 - 1969.
Year | Number of ICBM’s |
---|---|
1960 | 18 |
1961 | 63 |
1962 | 294 |
1963 | 424 |
1964 | 834 |
1965 | 854 |
1966 | 904 |
1967 | 1054 |
1968 | 1054 |
1969 | 1054 |
In [31]:
year = [i for i in np.arange(1960, 1970, 1)]
icbm = [18, 63, 294, 424, 834, 854, 904, 1054, 1054, 1054]
In [32]:
plt.scatter(year, icbm)
Out[32]:
<matplotlib.collections.PathCollection at 0x1107ab748>

In [33]:
L, y, a, b, x = sy.symbols('L y a b x')
In [34]:
exp = sy.Eq(y, L/(1 + sy.exp(a + b*x)))
In [35]:
sy.solve(exp, (a + b*x), force = True)
Out[35]:
[log((L - y)/y)]
This means that the tranformation that linearizes our data is
The value \(L\) is defined as the carrying capacity of the model. Here, it seems something like \(L = 1060\) would be a reasonable value to try.
In [36]:
t_icbm = [np.log((1060 - i)/i) for i in icbm]
In [37]:
plt.scatter(year, t_icbm)
Out[37]:
<matplotlib.collections.PathCollection at 0x111183cf8>

In [38]:
b, a = np.polyfit(year, t_icbm, 1)
In [39]:
a, b
Out[39]:
(2091.7866057847828, -1.0653944179379069)
In [40]:
def l(x):
return b*x + a
l(1960), l(1969)
Out[40]:
(3.6135466264850038, -5.9750031349558412)
In [41]:
fit = [l(i) for i in year]
In [42]:
plt.scatter(year, t_icbm)
plt.plot(year, fit)
plt.title("Fitting ICBM Data")
plt.xlabel("Year")
plt.ylabel("Transformed ICMB Data")
Out[42]:
<matplotlib.text.Text at 0x1111ed320>

Much like the last example, we can return everything to its original form with the exponential. We arrive at the equation
In [43]:
def y(x):
return 1060/(1 + np.exp(2092 - 1.0654*x))
o_fit = [y(i) for i in year]
In [44]:
plt.scatter(year, icbm)
plt.plot(year, o_fit, '--k')
plt.title("ICBM Data and Logistic Fit")
plt.xlabel("Year")
plt.ylabel("ICMB's")
Out[44]:
<matplotlib.text.Text at 0x1111c0d30>

Machine Learning Example¶
In [46]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
# Load the diabetes dataset
diabetes = datasets.load_diabetes()
# Use only one feature
diabetes_X = diabetes.data[:, np.newaxis, 2]
In [61]:
diabetes_X[:5]
Out[61]:
array([[ 0.06169621],
[-0.05147406],
[ 0.04445121],
[-0.01159501],
[-0.03638469]])
In [64]:
datasets?
In [53]:
# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]
# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]
In [60]:
plt.figure(figsize = (10,5))
plt.scatter(diabetes_X_train, diabetes_y_train, color = 'blue', alpha = 0.4, label = "Training Data")
plt.scatter(diabetes_X_test, diabetes_y_test, color = 'red', alpha = 0.5, label = "Testing Data")
plt.legend(frameon = False)
Out[60]:
<matplotlib.legend.Legend at 0x1123e16a0>

In [65]:
# Create linear regression object
regr = linear_model.LinearRegression()
# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)
# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)
In [66]:
# The coefficients
print('Coefficients: \n', regr.coef_)
# The mean squared error
print("Mean squared error: %.2f"
% mean_squared_error(diabetes_y_test, diabetes_y_pred))
# Explained variance score: 1 is perfect prediction
print('Variance score: %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))
Coefficients:
[ 938.23786125]
Mean squared error: 2548.07
Variance score: 0.47
In [75]:
# Plot outputs
plt.figure(figsize = (7,4))
plt.scatter(diabetes_X_test, diabetes_y_test, color='black', alpha = 0.5, s = 9)
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue')
plt.title("The fit against the Testing Data")
Out[75]:
<matplotlib.text.Text at 0x112c6b748>

Problems¶
In [123]:
from sklearn.datasets import load_iris
iris = load_iris()
data = iris.data
column_names = iris.feature_names
In [134]:
print(iris.DESCR)
Iris Plants Database
Notes
-----
Data Set Characteristics:
:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class:
- Iris-Setosa
- Iris-Versicolour
- Iris-Virginica
:Summary Statistics:
============== ==== ==== ======= ===== ====================
Min Max Mean SD Class Correlation
============== ==== ==== ======= ===== ====================
sepal length: 4.3 7.9 5.84 0.83 0.7826
sepal width: 2.0 4.4 3.05 0.43 -0.4194
petal length: 1.0 6.9 3.76 1.76 0.9490 (high!)
petal width: 0.1 2.5 1.20 0.76 0.9565 (high!)
============== ==== ==== ======= ===== ====================
:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988
This is a copy of UCI ML iris datasets.
http://archive.ics.uci.edu/ml/datasets/Iris
The famous Iris database, first used by Sir R.A Fisher
This is perhaps the best known database to be found in the
pattern recognition literature. Fisher's paper is a classic in the field and
is referenced frequently to this day. (See Duda & Hart, for example.) The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant. One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.
References
----------
- Fisher,R.A. "The use of multiple measurements in taxonomic problems"
Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
Mathematical Statistics" (John Wiley, NY, 1950).
- Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
(Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
- Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
Structure and Classification Rule for Recognition in Partially Exposed
Environments". IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-2, No. 1, 67-71.
- Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions
on Information Theory, May 1972, 431-433.
- See also: 1988 MLC Proceedings, 54-64. Cheeseman et al"s AUTOCLASS II
conceptual clustering system finds 3 classes in the data.
- Many, many more ...
In [125]:
data
Out[125]:
array([[ 5.1, 3.5, 1.4, 0.2],
[ 4.9, 3. , 1.4, 0.2],
[ 4.7, 3.2, 1.3, 0.2],
[ 4.6, 3.1, 1.5, 0.2],
[ 5. , 3.6, 1.4, 0.2],
[ 5.4, 3.9, 1.7, 0.4],
[ 4.6, 3.4, 1.4, 0.3],
[ 5. , 3.4, 1.5, 0.2],
[ 4.4, 2.9, 1.4, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 5.4, 3.7, 1.5, 0.2],
[ 4.8, 3.4, 1.6, 0.2],
[ 4.8, 3. , 1.4, 0.1],
[ 4.3, 3. , 1.1, 0.1],
[ 5.8, 4. , 1.2, 0.2],
[ 5.7, 4.4, 1.5, 0.4],
[ 5.4, 3.9, 1.3, 0.4],
[ 5.1, 3.5, 1.4, 0.3],
[ 5.7, 3.8, 1.7, 0.3],
[ 5.1, 3.8, 1.5, 0.3],
[ 5.4, 3.4, 1.7, 0.2],
[ 5.1, 3.7, 1.5, 0.4],
[ 4.6, 3.6, 1. , 0.2],
[ 5.1, 3.3, 1.7, 0.5],
[ 4.8, 3.4, 1.9, 0.2],
[ 5. , 3. , 1.6, 0.2],
[ 5. , 3.4, 1.6, 0.4],
[ 5.2, 3.5, 1.5, 0.2],
[ 5.2, 3.4, 1.4, 0.2],
[ 4.7, 3.2, 1.6, 0.2],
[ 4.8, 3.1, 1.6, 0.2],
[ 5.4, 3.4, 1.5, 0.4],
[ 5.2, 4.1, 1.5, 0.1],
[ 5.5, 4.2, 1.4, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 5. , 3.2, 1.2, 0.2],
[ 5.5, 3.5, 1.3, 0.2],
[ 4.9, 3.1, 1.5, 0.1],
[ 4.4, 3. , 1.3, 0.2],
[ 5.1, 3.4, 1.5, 0.2],
[ 5. , 3.5, 1.3, 0.3],
[ 4.5, 2.3, 1.3, 0.3],
[ 4.4, 3.2, 1.3, 0.2],
[ 5. , 3.5, 1.6, 0.6],
[ 5.1, 3.8, 1.9, 0.4],
[ 4.8, 3. , 1.4, 0.3],
[ 5.1, 3.8, 1.6, 0.2],
[ 4.6, 3.2, 1.4, 0.2],
[ 5.3, 3.7, 1.5, 0.2],
[ 5. , 3.3, 1.4, 0.2],
[ 7. , 3.2, 4.7, 1.4],
[ 6.4, 3.2, 4.5, 1.5],
[ 6.9, 3.1, 4.9, 1.5],
[ 5.5, 2.3, 4. , 1.3],
[ 6.5, 2.8, 4.6, 1.5],
[ 5.7, 2.8, 4.5, 1.3],
[ 6.3, 3.3, 4.7, 1.6],
[ 4.9, 2.4, 3.3, 1. ],
[ 6.6, 2.9, 4.6, 1.3],
[ 5.2, 2.7, 3.9, 1.4],
[ 5. , 2. , 3.5, 1. ],
[ 5.9, 3. , 4.2, 1.5],
[ 6. , 2.2, 4. , 1. ],
[ 6.1, 2.9, 4.7, 1.4],
[ 5.6, 2.9, 3.6, 1.3],
[ 6.7, 3.1, 4.4, 1.4],
[ 5.6, 3. , 4.5, 1.5],
[ 5.8, 2.7, 4.1, 1. ],
[ 6.2, 2.2, 4.5, 1.5],
[ 5.6, 2.5, 3.9, 1.1],
[ 5.9, 3.2, 4.8, 1.8],
[ 6.1, 2.8, 4. , 1.3],
[ 6.3, 2.5, 4.9, 1.5],
[ 6.1, 2.8, 4.7, 1.2],
[ 6.4, 2.9, 4.3, 1.3],
[ 6.6, 3. , 4.4, 1.4],
[ 6.8, 2.8, 4.8, 1.4],
[ 6.7, 3. , 5. , 1.7],
[ 6. , 2.9, 4.5, 1.5],
[ 5.7, 2.6, 3.5, 1. ],
[ 5.5, 2.4, 3.8, 1.1],
[ 5.5, 2.4, 3.7, 1. ],
[ 5.8, 2.7, 3.9, 1.2],
[ 6. , 2.7, 5.1, 1.6],
[ 5.4, 3. , 4.5, 1.5],
[ 6. , 3.4, 4.5, 1.6],
[ 6.7, 3.1, 4.7, 1.5],
[ 6.3, 2.3, 4.4, 1.3],
[ 5.6, 3. , 4.1, 1.3],
[ 5.5, 2.5, 4. , 1.3],
[ 5.5, 2.6, 4.4, 1.2],
[ 6.1, 3. , 4.6, 1.4],
[ 5.8, 2.6, 4. , 1.2],
[ 5. , 2.3, 3.3, 1. ],
[ 5.6, 2.7, 4.2, 1.3],
[ 5.7, 3. , 4.2, 1.2],
[ 5.7, 2.9, 4.2, 1.3],
[ 6.2, 2.9, 4.3, 1.3],
[ 5.1, 2.5, 3. , 1.1],
[ 5.7, 2.8, 4.1, 1.3],
[ 6.3, 3.3, 6. , 2.5],
[ 5.8, 2.7, 5.1, 1.9],
[ 7.1, 3. , 5.9, 2.1],
[ 6.3, 2.9, 5.6, 1.8],
[ 6.5, 3. , 5.8, 2.2],
[ 7.6, 3. , 6.6, 2.1],
[ 4.9, 2.5, 4.5, 1.7],
[ 7.3, 2.9, 6.3, 1.8],
[ 6.7, 2.5, 5.8, 1.8],
[ 7.2, 3.6, 6.1, 2.5],
[ 6.5, 3.2, 5.1, 2. ],
[ 6.4, 2.7, 5.3, 1.9],
[ 6.8, 3. , 5.5, 2.1],
[ 5.7, 2.5, 5. , 2. ],
[ 5.8, 2.8, 5.1, 2.4],
[ 6.4, 3.2, 5.3, 2.3],
[ 6.5, 3. , 5.5, 1.8],
[ 7.7, 3.8, 6.7, 2.2],
[ 7.7, 2.6, 6.9, 2.3],
[ 6. , 2.2, 5. , 1.5],
[ 6.9, 3.2, 5.7, 2.3],
[ 5.6, 2.8, 4.9, 2. ],
[ 7.7, 2.8, 6.7, 2. ],
[ 6.3, 2.7, 4.9, 1.8],
[ 6.7, 3.3, 5.7, 2.1],
[ 7.2, 3.2, 6. , 1.8],
[ 6.2, 2.8, 4.8, 1.8],
[ 6.1, 3. , 4.9, 1.8],
[ 6.4, 2.8, 5.6, 2.1],
[ 7.2, 3. , 5.8, 1.6],
[ 7.4, 2.8, 6.1, 1.9],
[ 7.9, 3.8, 6.4, 2. ],
[ 6.4, 2.8, 5.6, 2.2],
[ 6.3, 2.8, 5.1, 1.5],
[ 6.1, 2.6, 5.6, 1.4],
[ 7.7, 3. , 6.1, 2.3],
[ 6.3, 3.4, 5.6, 2.4],
[ 6.4, 3.1, 5.5, 1.8],
[ 6. , 3. , 4.8, 1.8],
[ 6.9, 3.1, 5.4, 2.1],
[ 6.7, 3.1, 5.6, 2.4],
[ 6.9, 3.1, 5.1, 2.3],
[ 5.8, 2.7, 5.1, 1.9],
[ 6.8, 3.2, 5.9, 2.3],
[ 6.7, 3.3, 5.7, 2.5],
[ 6.7, 3. , 5.2, 2.3],
[ 6.3, 2.5, 5. , 1.9],
[ 6.5, 3. , 5.2, 2. ],
[ 6.2, 3.4, 5.4, 2.3],
[ 5.9, 3. , 5.1, 1.8]])
In [126]:
df = pd.DataFrame(data)
In [127]:
df.columns = column_names
In [128]:
df.head()
Out[128]:
sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | |
---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 |
1 | 4.9 | 3.0 | 1.4 | 0.2 |
2 | 4.7 | 3.2 | 1.3 | 0.2 |
3 | 4.6 | 3.1 | 1.5 | 0.2 |
4 | 5.0 | 3.6 | 1.4 | 0.2 |
In [129]:
df.iloc[:, [1]].head()
Out[129]:
sepal width (cm) | |
---|---|
0 | 3.5 |
1 | 3.0 |
2 | 3.2 |
3 | 3.1 |
4 | 3.6 |
In [130]:
plt.figure()
plt.scatter(df.iloc[:,1], df.iloc[:,2])
plt.xlabel("Sepal Width")
plt.ylabel("Sepal Length")
plt.title("Relationship between Sepal Length and Width \nFrom the sklearn Iris Data", loc = 'right', fontsize = 12)
Out[130]:
<matplotlib.text.Text at 0x11b63c080>

In [131]:
df.describe()
Out[131]:
sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | |
---|---|---|---|---|
count | 150.000000 | 150.000000 | 150.000000 | 150.000000 |
mean | 5.843333 | 3.054000 | 3.758667 | 1.198667 |
std | 0.828066 | 0.433594 | 1.764420 | 0.763161 |
min | 4.300000 | 2.000000 | 1.000000 | 0.100000 |
25% | 5.100000 | 2.800000 | 1.600000 | 0.300000 |
50% | 5.800000 | 3.000000 | 4.350000 | 1.300000 |
75% | 6.400000 | 3.300000 | 5.100000 | 1.800000 |
max | 7.900000 | 4.400000 | 6.900000 | 2.500000 |
In [132]:
from sklearn.datasets import load_boston
boston = load_boston()
bdata = boston.data
bcolumn_names = boston.feature_names
In [133]:
print(boston.DESCR)
Boston House Prices dataset
Notes
------
Data Set Characteristics:
:Number of Instances: 506
:Number of Attributes: 13 numeric/categorical predictive
:Median Value (attribute 14) is usually the target
:Attribute Information (in order):
- CRIM per capita crime rate by town
- ZN proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS proportion of non-retail business acres per town
- CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX nitric oxides concentration (parts per 10 million)
- RM average number of rooms per dwelling
- AGE proportion of owner-occupied units built prior to 1940
- DIS weighted distances to five Boston employment centres
- RAD index of accessibility to radial highways
- TAX full-value property-tax rate per $10,000
- PTRATIO pupil-teacher ratio by town
- B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT % lower status of the population
- MEDV Median value of owner-occupied homes in $1000's
:Missing Attribute Values: None
:Creator: Harrison, D. and Rubinfeld, D.L.
This is a copy of UCI ML housing dataset.
http://archive.ics.uci.edu/ml/datasets/Housing
This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980. N.B. Various transformations are used in the table on
pages 244-261 of the latter.
The Boston house-price data has been used in many machine learning papers that address regression
problems.
**References**
- Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
- Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
- many more! (see http://archive.ics.uci.edu/ml/datasets/Housing)
In [117]:
df2 = pd.DataFrame(bdata)
In [118]:
df2.columns = bcolumn_names
In [119]:
df2.head()
Out[119]:
CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.0900 | 1.0 | 296.0 | 15.3 | 396.90 | 4.98 |
1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.90 | 9.14 |
2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 |
3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 |
4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.90 | 5.33 |
In [18]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
from math import atan2
Visualizing Differential Equations¶
Returning to the example from our last notebook, we had the differential equation
Interpreting this geometrically we are looking at a relationship where the slope of the tangent line at some point \((x,y)\) is given by \(2x+3\). For example, at any point where \(x = 1\), we have the slope of the tangent line equal to \(2(1) + 3 = 5\). We can draw a small short line at each of these points to represent this behavior. Similarly, we could repeat this for a small number of \(x\) and \(y\) values. Most important, is that we understand we are not finding a single line solution, but rather representing tangent lines of a family of functions just like our initial general solutions to the ODE.
In [2]:
def dy_dx(x):
return 2*x + 3
In [3]:
x = np.arange(-3,4, 1)
y = np.arange(-3,4, 1)
X,Y = np.meshgrid(x, y)
In [4]:
X
Out[4]:
array([[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3],
[-3, -2, -1, 0, 1, 2, 3]])
In [5]:
Y
Out[5]:
array([[-3, -3, -3, -3, -3, -3, -3],
[-2, -2, -2, -2, -2, -2, -2],
[-1, -1, -1, -1, -1, -1, -1],
[ 0, 0, 0, 0, 0, 0, 0],
[ 1, 1, 1, 1, 1, 1, 1],
[ 2, 2, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 3, 3]])
In [122]:
fig = plt.figure()
plt.plot(X,Y, 'o', color = 'black', markersize = 6)
plt.axhline(color = 'black')
plt.axvline(color = 'black')
ax = fig.add_subplot(111)
ax.quiver(0,0,2,2)
ax.quiver(1,1,1.1,2*(1.1) + 3)
ax.quiver(-1,1,-1.1, 2*(-1.1) + 3)
Out[122]:
<matplotlib.quiver.Quiver at 0x11fcd9160>

In [127]:
atan2(dy_dx(2), 1)
Out[127]:
1.4288992721907328
In [134]:
theta = [atan2(dy_dx(i), 1) for i in x]
In [140]:
theta
Out[140]:
[-1.2490457723982544,
-0.7853981633974483,
0.7853981633974483,
1.2490457723982544,
1.373400766945016,
1.4288992721907328,
1.460139105621001]
In [147]:
plt.quiver(X,Y, np.cos(theta), np.sin(theta))
plt.axhline(color = 'black')
plt.axvline(color = 'black')
plt.plot(X, Y, 'o', color = 'black')
x2 = np.linspace(-3,3,1000)
plt.plot(x2, x2**2 + 3*x2 + 1, '-k', color = 'red', label = 'Particular Solution through $(0,1)$')
plt.xlim(-3,3)
plt.ylim(-3,3)
plt.legend(loc = 'best')
Out[147]:
<matplotlib.legend.Legend at 0x120e02438>

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
Approximating Values¶
In this notebook, we approximate solutions to differential equations. This is an important idea, as most differential equations are not easily solved with classical techniques. Our solutions with be numerical values that approximate the exact solutions to the differential equation. Given the initial value problem:
we want to determine other values of the function near \(x = 0\). To start, we want to consider how to approximate the following values without solving the differential equation, as we already know this is \(y = x^2 + 3x + 1\).
- \(y(0.1)\)
- \(y(0.2)\)
- \(y(0.3)\)
- \(y(1.1)\)
For \(y(0.1)\), consider using the tangent line to approximate this value. We have a line with slope \(2(0.1) + 3 = 3.2\) through the point \((0,1)\). We can write an equation for this line as
and investigate this line and the solution around \(x = 0.1\)
In [2]:
plt.figure(figsize = (11,5))
plt.subplot(121)
x = np.linspace(0,0.2,10)
plt.plot(x, x**2 + 3*x + 1, label = 'solution')
plt.plot(x, 3*x + 1, label = 'tangent line')
plt.plot(0.1,1.3,'o', label = '$(0.1, 1.3)$')
plt.legend(loc = 'best')
plt.title('Zooming In around Approximation')
plt.subplot(122)
x = np.linspace(-3,3,100)
plt.plot(x, x**2 + 3*x + 1, label = 'solution')
plt.plot(x, 3*x + 1, label = 'tangent line')
plt.plot(0.1,1.3,'o')
plt.legend(loc = 'best')
plt.title('Zoomed Out view of Approximation')
Out[2]:
<matplotlib.text.Text at 0x11312a320>

As we notice in the graph on the right, while the tangent line serves as a nice approximation in the near vicinity of our initial value, as we move further away it diverges from the solution.
Euler’s Method¶
Also, recognize that the differential equation suggests an alternative equation for the tangent line at different \(x\) values. At \(x = 1.1\) the tangent line has a different slope than when \(x=1\). We can adjust every \(0.1\) and write a new equation to traverse for a short period of time. Here, we started at \(x=0\), moved along the tangent line for \(0.1\) units and now write the equation for the tangent line at \(x = 0.1\).
When \(x=0.1\), we have \(3(0.1) + 1 = 1.3\), so
We repeat this process at \((0.1,1.3)\) in order to find \(y_2\).
to find \(y_3\) or an approximation for \(y(0.3)\), we simply repeat
In [3]:
(2*(0.1) + 3)*(0.2 - 0.1)+ 1.3
Out[3]:
1.62
In [4]:
(2*(0.2) + 3)*(0.3 - 0.2)+ 1.62
Out[4]:
1.96
Perhaps you recognize a more general pattern. We are evaluating the derivative at each \(h\) step, multiplying this by \(h\) and adding this to the previous approximation to get our new approximation. Let’s generate more terms and visualize our solutions. Formally, we can define a relationship between successive \(y\) terms as follows, where \(x_i = h*i\) and \(f = \frac{dy}{dx}\):
In [5]:
def df(x):
return 2*x+3
h = 0.1
y = [1]
for i in range(10):
next = df(i*h)*h + y[i]
y.append(next)
y
Out[5]:
[1,
1.3,
1.62,
1.9600000000000002,
2.3200000000000003,
2.7,
3.1,
3.52,
3.96,
4.42,
4.9]
In [6]:
x = np.linspace(0,1.0,11)
plt.plot(x,y,'--o', label = 'Approximations')
plt.plot(x, x**2 + 3*x + 1, '--o', color = 'red', label = 'Actual Solutions')
plt.legend(loc = 'best')
Out[6]:
<matplotlib.legend.Legend at 0x113916dd8>

In [7]:
solutions = [(i**2 + 3*i + 1) for i in x]
error = [(solutions[i] - y[i]) for i in range(len(x))]
In [8]:
plt.plot(x, error, '--o')
Out[8]:
[<matplotlib.lines.Line2D at 0x113b1e3c8>]

In [9]:
h = 0.01
In [10]:
y2 = [1]
for i in range(100):
next = df(i*h)*h + y2[i]
y2.append(next)
x2 = [i for i in np.linspace(0,1.0,101)]
plt.plot(x2, y2, '-k', color = 'red', label = 'Approximation with h = 0.01')
plt.plot(x, y, '--k', color = 'green', label = 'Approximation with h = 0.1')
plt.legend(loc = 'best')
Out[10]:
<matplotlib.legend.Legend at 0x113c40b38>

You may not feel this is a big difference, however if we examine the error in these approximations you should notice the divergence in the values depending on the size of \(h\).
In [11]:
solutions2 = [(i**2 + 3*i + 1) for i in np.linspace(0,1,101)]
error2 = [(solutions2[i] - y2[i]) for i in range(100)]
x2 = [i for i in np.linspace(0,1,100)]
In [12]:
plt.plot(x2, error2, '--', color = 'blue', label = 'Error for h = 0.01')
plt.plot(x, error, '--k', color = 'red', label = 'Error for h = 0.1')
plt.legend()
plt.xlim(0,0.3)
plt.ylim(0,0.05)
Out[12]:
(0, 0.05)

Population Models¶
Another familiar differential equation is the exponential growth/decay model. Here, we have a rate of change of a population \(A\) at some time \(t\) for growth rate \(r\) as
Also, suppose we have the initial condition that at \(t=0\) the population is \(P(0) = 100\), with \(r=2\). Use Euler’s method to approximate solutions at time \(t = .1, .2, .4, 1.1\) with \(h = 0.1, 0.01,\) and \(0.001\). Compare the errors in these approximations to the exact solution to the differential equation.
Populations II¶
Now suppose we have the differential equation below that models a biological population in time given in hours. Determine an appropriate step size to provide reasonable approximations to the differential equation.
Comparing Exact and Approximates¶
As mentioned earlier, we cannot always find a function who has a derivative expressed by the differential equation and instead need to use an approximation to find values of a solution. Below, two of the three involve precise solutions. Determine approximations for each, and those that you can find an exact solution using any method please do so.
- \(\frac{dy}{dt} = \cos(t) - y\tan(t) \quad y(0) = 0\)
- \(\frac{dy}{dt} = t\cos(yt) \quad y(0) = 0\)
- \(\frac{dy}{dt} = t^2 \quad y(0) = 0\)
In [1]:
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
from scipy.integrate import odeint
size = (12, 9)
from ipywidgets import interact, widgets, fixed
from math import atan2
Population Models¶
A simple example of a model involving a differential equation could be the basic additive population growth model. Here, suppose we have a constant rate of change \(k\). As a differential equation we would have:
We are familiar with the solution
In this notebook, we want to add complexity to this model and explore the results. Obviously, assuming a constant rate of change in a population model is a fairly naive assumption. Regardless, let’s quickly revisit our methods for visualizing and solving a differential equation expressed in this form.
In [2]:
x, k, t, P = sy.symbols('x k t P')
In [3]:
def dp_dt(x,k):
return k
In [4]:
sy.integrate(dp_dt(x,k), t)
Out[4]:
k*t
In [5]:
x = np.linspace(0,10,11)
theta = [atan2(dp_dt(i, 1.1), 1) for i in x]
In [6]:
x = np.linspace(0,10,11)
y = np.linspace(0,10,11)
X, Y = np.meshgrid(x, y)
plt.figure()
plt.plot(X, Y, 'o', color = 'black', alpha = 0.6)
plt.quiver(X,Y, np.cos(theta), np.sin(theta))
plt.plot(x, 1.1*x + 4, '-k', color = 'red', linewidth = 4, label = 'Solution Curve for $k = 1.1$ and $c=4$')
plt.ylim(0,10)
plt.legend(loc = 'best')
Out[6]:
<matplotlib.legend.Legend at 0x119402ef0>
Exponential Growth¶
Last notebook, we saw the basic population model expressed as a differential equation. Here, we should recall our earlier work with discrete sequences and the passage to the limit that determines the change from \(\Delta P\) to \(dP\). We will rely heavily on this idea when modeling with differential equations. For the simple exponential population model, as a differential equation we have
whereas in the discrete case we have
Returning to a basic example, suppose we know a population has size \(P = 100\) at time \(t=0\). The population grows at a 0.24% growth rate. If we consider this in terms of our abover relationship as a way to approximate solutions every \(\Delta t = 0.1\), we get the following populations \(p_i\).
Problem: A certain city had a population of 25,000 and a growth rate of \(r = 1.8%\). Assume that its population will continue to grow exponentially at a constant rate, what population can its city planners expect in the year 2000?
In [7]:
r = 0.018
p0 = 25000
dt = 1
P = [p0]
for i in range(10):
next = P[i]*(1 + r)*dt
P.append(next)
In [8]:
P
Out[8]:
[25000,
25450.0,
25908.100000000002,
26374.4458,
26849.185824400003,
27332.471169239205,
27824.455650285512,
28325.295851990653,
28835.151177326487,
29354.183898518364,
29882.559208691695]
In [9]:
plt.figure()
plt.plot(np.linspace(1,10,11), P, '-o')
Out[9]:
[<matplotlib.lines.Line2D at 0x11946fd68>]
We are interested in the effect of varying some of the parameters of the differential equation. Here, we really only have two, the inital population and the growth rate. Let’s just vary the growth rate and describe the change in the resulting solutions.
In [10]:
def exp_grow(p0, r):
P = [p0]
for i in range(10):
next = p0*r**i
P.append(next)
plt.figure()
plt.plot(P, '-o')
interact(exp_grow, r=(1.00,2.0,0.01), p0 = (100,500,25));
The Logistic Differential Equation¶
The logistic equation was introduced by Pierre-Francois Verhulst in a discussion on the United States population growth. In the middle of the 19th century, the United States was still expanding, and the population had followed exponential growth. This could not continue however, as “the difficulty of finding good land has begun to make itself felt.”

“the population tends to grow in geometric progression while the production of food follows a more or less arithmetic progression”–Verhulst 1845
This translates mathematically to the rate of births against those of deaths as a linear rate of change. We can examine this through an example.
Suppose we have a fish population that will stop growing at a population of 50,000. We also know that when the population is 10,000, the population doubles. Accordingly, if we plot population against growth rate we have two points \((10,000, 2)\) and \((50,000, 1)\). We assume this is linear, and have the equation:
In [11]:
plt.figure(figsize=(8,6))
p = np.linspace(0,74,2000)
r = 2.25 - .025*p
plt.plot(p,r)
plt.plot(50,1,'ro')
plt.plot(10,2,'ro')
plt.xlim(0,72)
plt.ylabel('Growth Rate r')
plt.xlabel('Population in Thousands')
plt.title('A linear change in growth rate')
Out[11]:
<matplotlib.text.Text at 0x119502b38>
Now we want to find some values. We can approximate these values using the earlier general population model. In this situation we end up with:
We will define a function to generate successive values of the sequence.
In [12]:
def logistic_example(N=100, x0 = 1000):
x = np.zeros(N)
x[0] = x0
for n in range(N-1):
x[n+1] = (2.25 - 0.000025*x[n])*x[n]
return x
x = logistic_example(N=100, x0 = 1000)
In [13]:
n = np.linspace(0,10,10)
plt.figure()
plt.plot(n,x[:10],marker='o')
plt.xlabel('Year')
plt.ylabel('Population')
plt.title('Logistic Fish Population')
Out[13]:
<matplotlib.text.Text at 0x119431828>
In [14]:
def plot_logistic(i,x):
plt.figure(figsize=(9,6))
plt.plot(x[:i],'-ok', linewidth=2)
plt.ylim(0,60000); plt.xlim(0,len(x))
plt.xlabel('r'); plt.ylabel('x')
plt.xlabel('Year')
plt.ylabel('Population')
plt.title('Logistic Fish Population')
In [15]:
x = logistic_example(N=20)
plot_logistic(len(x), x)
Experimenting with different rates¶
In general, we are dealing with the recurrence relationship
We’ve seen what happened in the fish population with a certain rate. Now we would like to see some other behavior by playing with values of \(r\).
In [16]:
def logistic_example2(r=1.0, N=100, x0 = 0.6):
x = np.zeros(N)
x[0] = x0
for n in range(N-1):
x[n+1] = r*x[n]*(1. - x[n])
return x
In [17]:
c = logistic_example2(r=1.0, N=30)
print(c[:10])
[ 0.6 0.24 0.1824 0.14913024 0.12689041 0.11078923
0.09851498 0.08880978 0.0809226 0.07437413]
In [18]:
def plot_logistic2(i,x):
plt.figure(figsize=(9,5))
plt.plot(c[:i], '-ok', linewidth=2)
In [19]:
interact(plot_logistic2, i = widgets.IntSlider(min=0, max=len(c), step=1, value=0), x=fixed(x));
/Users/NYCMath/anaconda/lib/python3.5/site-packages/traitlets/traitlets.py:567: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
silent = bool(old_value == new_value)
Changing r¶
We can see some different kinds of behavior depending on altering the parameters \(r\) and \(x_0\). Three things may happen. Recall our equation:
- The values of \(x_n\) get closer and closer to some limit value \(x_\infty\) as \(n \to \infty\).
- The values of \(x_n\) oscillate periodically among two or more values, repeating those values forever.
- The values \(x_n\) get larger and larger without bound.
In [20]:
x = logistic_example2(r=1.5,N=30,x0=1./10)
def plot_logistic2(i,x):
plt.figure(figsize=(9,5))
plt.plot(x[:i], '-ok', linewidth=2)
plot_logistic2(len(x),x)
Exploration¶
Playing with the sliders below, answer the following:
- For what values of \(r\) does \(x_n \to 0\)?
- What happens for slightly bigger values of \(r\)?
- Can you find values of \(r\) that generate periodic orbits (i.e., situation 2 from above list)?
- How many different points are visited in the periodic case?
- What happens for even bigger values of \(r\)?
- Make sure to try the value \(r = 3.83\). What do you observe?
In [21]:
def logistic(r=1.0, N = 100, x0=0.2):
x = np.zeros(N)
x[0] = x0
for n in range(N-1):
x[n+1] = r * x[n] * (1. - x[n])
plt.figure(figsize=size)
ax1 = plt.subplot2grid((1,8), (0,0), colspan=7)
ax2 = plt.subplot2grid((1,8), (0,7), colspan=1)
ax1.plot(x, '-ok', linewidth=2)
ax1.set_ylim(0,1)
n = int(round(N/5))
ax2.plot([0]*n,x[-n:],'or',markersize=10,alpha=0.1)
ax2.set_ylim(0,1)
ax2.axis('off')
ax1.set_xlabel('r'); ax1.set_ylabel('x')
interact(logistic,r=(0,4,0.01),x0=(0.01,1,0.1));
In [22]:
def bifurcation_diagram(r=(0.8,4),N=2000,k=2000,m=200,x0=0.2):
"""
r: Pair of numbers (rmin,rmax) indicating parameter range
k: Number of samples in r
N: Number of iterations per sequence
m: keep just the last m iterates
"""
x = np.zeros((k,N))
vals = np.zeros((k,m))
rs = np.linspace(r[0],r[1],k)
x[:,0] = x0
for n in range(N-1):
x[:,n+1] = rs * x[:,n] * (1. - x[:,n])
return rs, x[:,-m:]
plotargs = {'markersize':0.5, 'alpha':0.4}
rs, vals = bifurcation_diagram()
plt.figure(figsize=size)
plt.plot(rs,vals,'ok',**plotargs);
plt.xlim(rs.min(),rs.max());
plt.xlabel('r'); plt.ylabel('x');
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import sympy as sy
from ipywidgets import interact, widgets, fixed
Population Models¶
The simplest population model we examined was that of exponential growth. Here, we claimed a population that grows in direct proportion to itself can be represented both in terms of discrete time steps and a continuous differential equation model, given below.
Additionally, we looked at a population model that considered the limiting behavior of decreased resources through the logistic equation. Again, we could describe this either based on a discrete model or a continuous model as show below.
This model was interesting because of some unexpected behavior that we saw in the form of chaos. Today, we extend our discussion to population dynamics where we investigate predator prey relationships. Given some population of prey \(x(t)\) and some population of prey \(y(t)\) we aim to investigate the changes in these populations over time depending on the values of certain population parameters much like we did with the logistic model. In general, we will be exploring variations on a theme. The general system of equations given by:
Can be understood as describing the change in populations based on the difference between the increase in the populations (\(f_1, f_3\)) and the decrease in the populations (\(f_2, f_4\)).
Lotka Voltera Model¶
An early example of a predator prey model is the Lotka Volterra model. We can follow Lotka’s argument for the construction of the model. First, he assumed:
- In the absence of predators, prey should grow exponentially without bound
- In the absence of prey, predators population should decrease similarly
Thus, we have the following:
where \(a,d\) are some constants. Next, Lotka called on the Law of Mass Action from chemistry. When a reaction occurs by mixing chemicals, the law of mass action states that the rate of the reaction is proportional to the product of the quantities of the reactants. Lotka argued that prey should decrease and that predators should increase at rates proportional to the product of numbers present. Hence, we would have:
For some other constants \(b\) and \(c\).
In [2]:
def predator_prey(a=2,b=1,c=1,d=5):
# Initial populations
rabbits = [10]
foxes = [20]
a = a/1000.
b = b/10000.
c = c/10000.
d = d/1000.
dt = 0.1
N = 20000
for i in range(N):
R = rabbits[-1]
F = foxes[-1]
change_rabbits = dt * (a*R - b*R*F)
change_foxes = dt * (c*R*F - d*F)
rabbits.append(rabbits[-1] + change_rabbits)
foxes.append(foxes[-1] + change_foxes)
plt.figure(figsize=(12,6))
plt.plot(np.arange(N+1),rabbits)
plt.hold(True)
plt.plot(np.arange(N+1),foxes)
plt.ylim(0,200); plt.xlabel('Time'); plt.ylabel('Population')
plt.legend(('Rabbits','Foxes'),loc='best');
Questions¶
Examine the following Lotka Volterra model:
with \(x(0) = 4\) and \(y(0) = 2\).
- Experiment with changing the rate constants to answer the following questions:
- Describe the effect of increasing the prey birth rate. Is this what you expected to happen? Explain.
- Describe the effect of increasing the prey death rate. Is this what you expected to happen? Explain.
- The predator birth rate is currently much less than the prey birth rate. What effect does having the same birth rate for both species have on the solution?
- Experiment with the initial populations for the predators and prey. Experiment with incorporating harvesting terms into both of the equations. What happens when you change the harvesting rate? Can you harvest and have the populations increase rather than decrease? Is this what you expected? Explain.
In [3]:
interact(predator_prey,a=widgets.FloatSlider(min=0.01,max=30,step=1,value=0.7),
b=widgets.FloatSlider(min=0.01,max=10,step=1,value=0.3),
c=widgets.FloatSlider(min=0.01,max=10,step=1,value=0.08),
d=widgets.FloatSlider(min=0.01,max=30,step=1,value=0.44));

Now let’s look at a particular example. We will set
\(a,b,c,d=1, 0.1, 1.5,\) and \(0.75\) respectively. We can use
the scipy integrate
function to solve the differential equations
with the initial conditions as populations of 10 rabbits and 5 foxes.
In [4]:
a = 1.
b = 0.1
c = 1.5
d = 0.75
def dX_dt(X, t=0):
""" Return the growth rate of fox and rabbit populations. """
return array([ a*X[0] - b*X[0]*X[1] ,
-c*X[1] + d*b*X[0]*X[1] ])
In [5]:
from scipy import integrate
from numpy import *
t = linspace(0, 15, 1000) # time
X0 = array([10, 5]) # initials conditions: 10 rabbits and 5 foxes
X = integrate.odeint(dX_dt, X0, t)
In [6]:
rabbits, foxes = X.T
f1 = plt.figure(figsize=(12,6))
plt.plot(t, rabbits, 'r-', label='Rabbits')
plt.plot(t, foxes , 'b-', label='Foxes')
plt.grid()
plt.legend(loc='best')
plt.xlabel('time')
plt.ylabel('population')
plt.title('Evolution of fox and rabbit populations')
f1.savefig('rabbits_and_foxes_1.png')

In [7]:
X_f0 = array([ 0. , 0.])
X_f1 = array([ c/(d*b), a/b])
all(dX_dt(X_f0) == zeros(2) ) and all(dX_dt(X_f1) == zeros(2))
values = linspace(0.3, 0.9, 5) # position of X0 between X_f0 and X_f1
vcolors = plt.cm.autumn_r(linspace(0.3, 1., len(values))) # colors for each trajectory
f2 = plt.figure(figsize=(12,6))
#-------------------------------------------------------
# plot trajectories
for v, col in zip(values, vcolors):
X0 = v * X_f1 # starting point
X = integrate.odeint( dX_dt, X0, t) # we don't need infodict here
plt.plot( X[:,0], X[:,1], lw=3.5*v, color=col, label='X0=(%.f, %.f)' % ( X0[0], X0[1]) )
#-------------------------------------------------------
# define a grid and compute direction at each point
ymax = plt.ylim(ymin=0)[1] # get axis limits
xmax = plt.xlim(xmin=0)[1]
nb_points = 20
x = linspace(0, xmax, nb_points)
y = linspace(0, ymax, nb_points)
X1 , Y1 = meshgrid(x, y) # create a grid
DX1, DY1 = dX_dt([X1, Y1]) # compute growth rate on the gridt
M = (hypot(DX1, DY1)) # Norm of the growth rate
M[ M == 0] = 1. # Avoid zero division errors
DX1 /= M # Normalize each arrows
DY1 /= M
#-------------------------------------------------------
# Drow direction fields, using matplotlib 's quiver function
# I choose to plot normalized arrows and to use colors to give information on
# the growth speed
plt.title('Trajectories and direction fields')
Q = plt.quiver(X1, Y1, DX1, DY1, M, pivot='mid', cmap=plt.cm.jet)
plt.xlabel('Number of rabbits')
plt.ylabel('Number of foxes')
plt.legend()
plt.grid()
plt.xlim(0, xmax)
plt.ylim(0, ymax)
f2.savefig('rabbits_and_foxes_2.png')

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sympy as sy
Discrete Dynamical Systems¶
Below, we will see a few examples of discrete dynamical systems. We will start with the example of medication dosage. Suppose we have some amount of medication in our system upon taking a dose one day later. We would have a fraction plus the dose in our system at this time. Below, the example is first examined numerically by iterating a number of terms of the equation. Symbolically, this means suppose we start with \(\alpha = 0.8\), and \(n = 1\):
In python, we can index how many iterations we want to carry out with
the range()
function, and embed the iteration in a for
loop as
follows.
In [2]:
n = 1
for i in range(1,10):
n = 0.8*n + 1
print(n)
1.8
2.4400000000000004
2.9520000000000004
3.3616000000000006
3.6892800000000006
3.9514240000000007
4.161139200000001
4.328911360000001
4.4631290880000005
We can look at some additional terms and see that these are heading towards 5.
In [3]:
n = 1
for i in range(1,50):
n = 0.8*n + 1
print(n)
1.8
2.4400000000000004
2.9520000000000004
3.3616000000000006
3.6892800000000006
3.9514240000000007
4.161139200000001
4.328911360000001
4.4631290880000005
4.570503270400001
4.656402616320001
4.725122093056001
4.780097674444801
4.824078139555841
4.859262511644673
4.887410009315738
4.90992800745259
4.927942405962073
4.942353924769659
4.953883139815727
4.9631065118525814
4.9704852094820655
4.976388167585652
4.981110534068522
4.984888427254818
4.987910741803855
4.990328593443085
4.992262874754468
4.993810299803575
4.99504823984286
4.996038591874289
4.996830873499431
4.997464698799545
4.997971759039636
4.998377407231709
4.998701925785367
4.998961540628294
4.999169232502635
4.999335386002109
4.999468308801687
4.999574647041349
4.9996597176330795
4.999727774106464
4.999782219285171
4.999825775428137
4.99986062034251
4.999888496274008
4.999910797019206
4.9999286376153655
In [4]:
def f(n):
return 0.8*n+1
In [5]:
n = 1; y = [n]
x = range(0,20)
for i in x[1:]:
n = f(n)
y.append(n)
plt.figure(figsize=(10,8))
plt.title('Iterations Approaching Fixed Point')
plt.scatter(x,y)
plt.plot(x,y)
plt.xlim(0,16)
plt.ylim(0,6)
Out[5]:
(0, 6)

Fixed Point¶
Here, we would say 5 is our fixed point. This is born out by solving the equation:
Similarly, let’s examine the behavior of the system created by iterating:
With a starting value of \(n = 1.4\).
In [6]:
n = 1.4
for i in range(1,10):
n = n - n*n + 2
print(n)
1.4400000000000002
1.3663999999999998
1.4993510400000003
1.2512974988509178
1.685552068220355
0.8444662935384386
2.13134297261589
-0.41127989430324874
1.4195689542386598
Hmm... Something interesting is happening here. Let’s look a little further down the line...
In [7]:
n = 1.4
for i in range(1,50):
n = n - n*n + 2
print(n)
1.4400000000000002
1.3663999999999998
1.4993510400000003
1.2512974988509178
1.685552068220355
0.8444662935384386
2.13134297261589
-0.41127989430324874
1.4195689542386598
1.4043929384004177
1.4320734129714583
1.3812391528317374
1.4734175555164017
1.3024582626124728
1.6060607367649717
1.026629646586928
1.9726612153357272
0.08126894484589875
2.074664303449533
-0.22956766855820243
1.717731016994549
0.7671311702494212
2.1786409378811746
-0.5678353983305899
1.1097275620721503
1.8782323000495522
0.3504757271001211
2.2276424918137625
-0.734748579520466
0.7253959453721914
2.199196667809776
-0.6372693158958462
0.956618503121794
2.0414995426068123
-0.12622083985701193
1.8578474597287786
0.40625027610810305
2.241210989270193
-0.7818157091552842
0.606948487762736
2.238562020965264
-0.7725979007428236
0.6304945830249584
2.2329711638011425
-0.7531890545662865
0.6795171935152571
2.2177735772324056
-0.7007460626378159
0.8082088930597822
In [8]:
def f(n):
return n - n*n + 2
n=1.4
y=[n]
x = range(0,40)
for i in x[1:]:
n = f(n)
y.append(n)
plt.figure(figsize=(12,10))
plt.scatter(x,y)
plt.plot(x,y)
plt.title('$n_i = n_i + n_i^2 + 2$', fontsize=18)
Out[8]:
<matplotlib.text.Text at 0x10abc56d8>

The terms seem to bouncing around in an organized manner. If we solve for the fixed points, we see that \(n = \pm \sqrt{2}\). Let’s look at these examples from a different perspective with a cobweb diagram.
Cobweb Plot¶
A different way to visualize the behavior of the iterations is to look at what is called a cobweb diagram. The idea is to plot the line \(y = x\) on the same axes as a continuous representation of the discrete system. Here, this results in graphing:
We then start at \(x=0\) on \(y1\) draw a vertical line to \(y2\), a horizontal line to \(y1\), a vertical line to \(y2\), repeat...
The results demonstrate not only where the fixed point is based on the intersection, but also give use insight into whether or not these points are repelling or attracting. For example, our fixed point of 5 is an attracting one, as it draws our pencil towards it no matter where we start our iterations.
In [9]:
def f(n):
return 0.8*n + 1
# return f^n(x)
def func_n(x,n):
for i in range(0,n):
x = f(x)
return x
def plot_graphical(x0,n):
xv = np.linspace(0.0,10.0,2*n) # create array for points xvalue
yv = np.linspace(0.0,10.0,2*n) # create array for points yvalue
x =x0
for i in range(0,n): #iterate
xv[2*i] = x # first point is (x,f(x))
x = f(x)
yv[2*i] = x
xv[2*i+1] = x #second point is (f(x),f(x))
yv[2*i+1] = x
plt.plot(xv,yv,'b') # connect up all these points blue
plt.figure(figsize=(12,10))
plt.xlabel('x')
plt.ylabel('f(x)')
plt.plot(5,5, marker = 'o', markersize=10, c='grey',label = 'Fixed Point')
xcon = np.arange(0,7, 0.1) # to plot function
plt.plot(xcon,xcon, 'g', label = 'y = x')#y=x plotted gree
alpha=0.81
ycon = f(xcon) # function computed
plt.plot(xcon,ycon, 'r', label='y = 0.8n + 1') # function plotted red
plot_graphical(0.1,500)
plt.legend(loc='best')
plt.xlim(0,6)
plt.ylim(0,6)# cobweb plot, 0.3 is initial condition
plt.title('Cobweb Diagram')
Out[9]:
<matplotlib.text.Text at 0x10a751278>

Different Behavior¶
We find more interesting examples from the earlier work with the sequences that demonstrated period 2 and period 3 behavior. Let’s examine the cobweb diagram for
In [10]:
def f(n):
return n-n*n + 2
# return f^n(x)
def func_n(x,n):
for i in range(0,n):
x = f(x)
return x
def plot_graphical(x0,n):
xv = np.linspace(0.0,10.0,2*n) # create array for points xvalue
yv = np.linspace(0.0,10.0,2*n) # create array for points yvalue
x =x0
for i in range(0,n): #iterate
xv[2*i] = x # first point is (x,f(x))
x = f(x)
yv[2*i] = x
xv[2*i+1] = x #second point is (f(x),f(x))
yv[2*i+1] = x
plt.plot(xv,yv) # connect up all these points blue
plt.figure(figsize=(12,10))
plt.xlabel('x')
plt.ylabel('f(x)')
xcon = np.arange(0,7, 0.1) # to plot function
plt.plot(xcon,xcon, 'g')#y=x plotted gree
alpha=0.81
ycon = f(xcon) # function computed
plt.plot(xcon,ycon, 'r') # function plotted red
plot_graphical(0.1,5000)
plt.xlim(-1,2.5)
plt.ylim(-1,2.5)# cobweb plot, 0.3 is initial condition
Out[10]:
(-1, 2.5)
