Course Notes¶

Jupyter Lab Overview¶

Start Jupyter Lab

client server architecture
browser connects to server
kernels execute commands
different kernels available
terminals, text editor, notebook

You can arrange panels as you like them.

pwd

'/home/jet08013/GitHub/Carpentries/2020-01-13/notes'

Jupyter Image

Notebook blends text (markdown) and code (python, in our case)

click in a box to activate it
esc gives you access to the notebook commands (menu, etc)
m makes a cell markdown, y makes it code -- or use the menu at the top

Simple Markdown¶

for lists
lists
list items with hyphen

emphasis bold

links

Latex¶

For those who know TeX, the notebook can render math:

$$\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}$$

Python¶

Python is an interpreted language and the blocks of the notebook can be used as a calculator (a fancy one!)

Variables¶

name = "Jeremy"
age = 60
probability = .35

print(name, age, probability)

Jeremy 60 0.35

print(name, 'is', age,'years old with probability',probability)

Jeremy is 60 years old with probability 0.35

Variables must be created before being used, by having a value assigned to them.

temperature=50

temperature

50

temperature*2

100

print("the temperature is:", temperature)

the temperature is: 50
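
To illustrate the rule that a variable must be assigned before it is used, here is a minimal sketch. The name humidity is made up for this example and does not appear elsewhere in these notes.

```python
# 'humidity' is a hypothetical name used only for this illustration.
try:
    print(humidity)              # not assigned yet, so Python raises NameError
except NameError as err:
    message = str(err)
    print("error:", message)

humidity = 40                    # assignment creates the variable
print("the humidity is:", humidity)
```

In a notebook the same thing can happen when cells are run out of order, so it is worth re-running from the top if a name seems to be missing.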

full_name = "Jeremy Teitelbaum"

full_name[3]

'e'

In python, indexing starts from zero!!!

full_name[0]

'J'

len(full_name)

17

Types¶

strings
integers
floats
lists

a=123
b='123'

a[1]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-8bc71255a22e> in <module>
----> 1 a[1]

TypeError: 'int' object is not subscriptable

b[1]

'2'

L = ['jeremy', 'kendra', 'pariksheet']

L[0]

'jeremy'

Slicing¶

full_name[1:3]

'er'

full_name[5:]

'y Teitelbaum'

full_name[:5]

'Jerem'

full_name[3:5:3]

'e'

full_name[:-1]

'Jeremy Teitelbau'

full_name[-1]

'm'

full_name[-1::-1]

'muabletieT ymereJ'

L = ['a','b','c','d']

L[1:3]

['b', 'c']

L[-1]

'd'

L[::2]

['a', 'c']

magic_word = "thanos_must_die"

What command will determine how many letters are in the magic_word?
How would you print the fifth through eighth letters of the magic word (inclusive)?

print(magic_word[4:8])
print(magic_word[7:])
print(magic_word[:3])
print(magic_word[:])
print(magic_word[2:-3])
print(magic_word[-3:2:-1])
print(magic_word[0:100])

os_m
must_die
tha
thanos_must_die
anos_must_
d_tsum_son
thanos_must_die
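
One possible way to answer the two questions above, written as a short sketch. The variable names n_chars, middle and backwards are chosen here for illustration only.

```python
magic_word = "thanos_must_die"

# Number of characters (letters plus underscores) in the string.
n_chars = len(magic_word)
print(n_chars)

# Fifth through eighth characters: positions 5..8 are indices 4..7,
# and the stop index 8 is excluded.
middle = magic_word[4:8]
print(middle)

# A negative step walks backwards; this reverses the whole string.
backwards = magic_word[::-1]
print(backwards)
```

The general pattern is [start:stop:step], where start is included, stop is excluded, and any of the three can be left out.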

Types¶

In python, every variable has a type, but the language figures out the appropriate type by itself.

The most important types are:

strings
integers
floating point

x="this is a string"
a=137.50
b=12e-1
c=133

print(a)

137.5

print(b)

1.2

print(c)

133

type(a)

float

type(b)

float

type(c)

int

type(x)

str

Operations on numbers and strings¶

They're what you'd expect, with some caveats.

x=3.5
y=x/5
z=x*22.4
w=x*1e-1
print("x=",x,"y=",y,"z=",z,"w=",w)

x= 3.5 y= 0.7 z= 78.39999999999999 w= 0.35000000000000003
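
The long decimals above are one of the caveats: floats are stored in binary, so results can be off in the last digits. A small sketch of how one might cope with that, using the standard math module (the tolerance used by math.isclose is its default, which is an assumption that suits this example):

```python
import math

z = 3.5 * 22.4
print(z)                          # may differ slightly from 78.4

# Direct equality is fragile for floats; compare with a tolerance instead.
print(z == 78.4)
print(math.isclose(z, 78.4))

# Round for display only, not for further computation.
print(round(z, 2))
```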

String addition is concatenation.

x="Jeremy"+"Teitelbaum"

print(x)

JeremyTeitelbaum

Mixing floats and integers is ok (you get a float) but mixing strings and numbers is a problem.

1+3.5

4.5

"Jeremy"+3

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-51-8b1195841f62> in <module>
----> 1 "Jeremy"+3

TypeError: can only concatenate str (not "int") to str

Adding a number directly to a string fails, but you can convert floats or ints to strings with str() and then combine them:

"Jeremy Teitelbaum is "+60+" years old"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-55-1d48ca8ed8ea> in <module>
----> 1 "Jeremy Teitelbaum is "+60+" years old"

TypeError: can only concatenate str (not "int") to str

"Jeremy Teitelbaum is "+str(60)+" years old"

'Jeremy Teitelbaum is 60 years old'
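
As an alternative to calling str() by hand, Python 3.6 and later offer f-strings, which convert the values for you. A brief sketch comparing the two approaches (the names sentence_a and sentence_b are illustrative):

```python
name = "Jeremy Teitelbaum"
age = 60

# Explicit conversion with str(), as in the cell above.
sentence_a = name + " is " + str(age) + " years old"

# An f-string substitutes and converts the values inside the braces.
sentence_b = f"{name} is {age} years old"

print(sentence_a)
print(sentence_b)
```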

3*"a"

'aaa'

type("hello")

str

type(60)

int

type(3.6)

float

len("Jeremy")

6

len(3.5)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-63-5c30567bf151> in <module>
----> 1 len(3.5)

TypeError: object of type 'float' has no len()

Operations on lists¶

L = ['jeremy','kendra', 'pariksheet','dyanna']

S = ['mouse', 'cat']

L+S

['jeremy', 'kendra', 'pariksheet', 'dyanna', 'mouse', 'cat']

sorted(L)

['dyanna', 'jeremy', 'kendra', 'pariksheet']

['a']*3

['a', 'a', 'a']

L = L + ['x']
print(L)

['jeremy', 'kendra', 'pariksheet', 'dyanna', 'x']
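
A short sketch of which list operations make a new list and which change the existing one. It uses the same names as above; the extra item 'y' is made up for the example.

```python
L = ['jeremy', 'kendra', 'pariksheet', 'dyanna']

# sorted() returns a new list and leaves L unchanged.
alphabetical = sorted(L)

# L + [...] builds a new list; assigning it back to L keeps the result.
L = L + ['x']

# append() changes the existing list in place.
L.append('y')

print(alphabetical)
print(L)
```

Because L = L + ['x'] extends the list each time it runs, re-running that cell in a notebook adds another 'x' every time.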

strings to lists and back¶

list('abcdefg')

['a', 'b', 'c', 'd', 'e', 'f', 'g']

''.join(['a','b','c'])

'abc'
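
The string methods split and join are the usual way to go between strings and lists of words. A minimal sketch, with a made-up example sentence:

```python
sentence = "now is the time"      # hypothetical example text

words = sentence.split()          # split on whitespace -> list of words
letters = list("abc")             # one list item per character

rejoined = "-".join(words)        # the string before .join is the separator
print(words)
print(letters)
print(rejoined)
```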

Dictionaries¶

f = {}
f['jeremy']=15
f['kendra']=33
f['marshmallow']=100
f['french_fries']='hello'

f['jeremy']

15

f['french_fries']

'hello'

f[0]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-286-d8711d29ed7d> in <module>
----> 1 f[0]

KeyError: 0

f.keys()

dict_keys(['jeremy', 'kendra', 'marshmallow', 'french_fries'])

f.values()

dict_values([15, 33, 100, 'hello'])

a = dict(name=['jeremy','kendra','pariksheet'],gender=['m','f','m'])

a

{'name': ['jeremy', 'kendra', 'pariksheet'], 'gender': ['m', 'f', 'm']}
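
A few more dictionary operations that help avoid the KeyError shown above, as a small sketch. The default value 0 passed to get is an arbitrary choice for this example.

```python
f = {'jeremy': 15, 'kendra': 33}

# Membership tests look at the keys, not the values.
print('jeremy' in f, 0 in f)

# get() returns a default instead of raising KeyError.
missing = f.get('marshmallow', 0)
print(missing)

# items() yields (key, value) pairs.
for key, value in f.items():
    print(key, value)
```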

Comments¶

# this goes in a code cell, but doesn't get executed

Built-in Functions¶

f(x,y,z) returns a value¶

len
int, str, float
print
max
min
round
help - function or shift-Tab

round(3.12131,3)

3.121

max('hello')

'o'

min("hello")

'e'

round("hello")

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-73-06fcaad98a98> in <module>
----> 1 round("hello")

TypeError: type str doesn't define __round__ method

1/2

0.5

result = 60

report = "the result is "+str(result)

print(report)

the result is 60

#help(type)

help(round)

Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.
    
    The return value is an integer if ndigits is omitted or None.  Otherwise
    the return value has the same type as the number.  ndigits may be negative.

round(123,-2)

100

round(1351,-1)

1350

#help()

report = "Now is the time to flee'

  File "<ipython-input-467-51897dfb7c04>", line 1
    report = "Now is the time to flee'
                                      ^
SyntaxError: EOL while scanning string literal

1/0

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-468-9e1622b385b6> in <module>
----> 1 1/0

ZeroDivisionError: division by zero

y = 15-23
x = y+8
z= 3/x

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-469-23b9d1a37b92> in <module>
      1 y = 15-23
      2 x = y+8
----> 3 z= 3/x

ZeroDivisionError: division by zero

max()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-470-f870ea12a3fc> in <module>
----> 1 max()

TypeError: max expected 1 arguments, got 0

min()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-471-9c9b7bdfea4e> in <module>
----> 1 min()

TypeError: min expected 1 arguments, got 0
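
Errors like the ones above stop the cell. If a failure is expected and can be handled, one option is try/except. The helper safe_divide below is hypothetical, written only for this sketch, and returning None on failure is just one possible design.

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper: return None instead of raising on division by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None

y = 15 - 23
x = y + 8                 # x is 0, as in the traceback above
z = safe_divide(3, x)
print(z)
```

When reading a traceback, start from the last line: it names the error type and message, and the arrow marks the line that failed.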

Libraries¶

Python is a very simple language¶

There are only 33 "keywords" in the python language as of version 3.6 (version 3.7 added async and await, for a total of 35):

False, class, finally, is, return, None, continue, for, lambda, try, True, def, from, nonlocal, while, and, del, global, not, with, as, elif, if, or, yield, assert, else, import, pass, break, except, in, raise
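
The keyword list can be checked from Python itself with the standard keyword module. The count printed depends on the Python version that runs the code, so no fixed number is assumed in this sketch.

```python
import keyword

# kwlist holds the reserved words of the running interpreter.
print(len(keyword.kwlist))
print(keyword.kwlist)

# iskeyword() checks a single name; print is a function, not a keyword.
print(keyword.iskeyword("for"), keyword.iskeyword("print"))
```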

All of the interesting capabilities of the language come from extensions to the basic language; these extensions are called "libraries" or "modules".

The language also includes about 70 built-in functions. The functions

int, len, print, str, type, max, min

are all built in. The built-in functions are fairly primitive.

To get access to more interesting functions, you need to

import

them by importing the library that defines them. Many interesting capabilities are added by the standard library, which ships with python.

Two important libraries¶

numpy
pandas

import numpy
import pandas

print('pi is ',numpy.pi)

pi is  3.141592653589793

print('e is ',numpy.exp(1))

e is  2.718281828459045

Notice that we refer to the function exp, for example, by first naming the library it came from.

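Calling exp(1) by itself at this point fails with a NameError, because no bare name exp has been imported yet. The standard library's math module also provides exp:

import math

math.exp(1)

2.718281828459045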

You can get help on a library/module using the help command.

The import command also allows you to adopt abbreviations.

import numpy as np

np.exp(1)

2.718281828459045

from numpy import exp

exp(1)

2.718281828459045
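
The three import styles side by side, as a sketch. It uses the standard math module instead of numpy so that it runs without any extra installation; that substitution and the alias m are choices made for this example.

```python
import math                 # refer to names as math.<name>
import math as m            # same module under a shorter alias
from math import exp        # bring one name in directly

values = (math.exp(1), m.exp(1), exp(1))
print(values)
```

All three calls reach the same function; the styles differ only in how much you type and how clear it is where a name came from.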

import numpy.random

numpy.random.choice([1,2,3,4,5])

3

np.random.choice(list('ACTTGCTTGAC'))

'T'

import math 
import random
bases = "ACTTGCTTGAC" 
n_bases = len(bases)
idx = np.random.randint(n_bases)
print("random base", bases[idx], "base index", idx)

random base C base index 10

Data frames are "like" spreadsheets
- Columns
- Index
manipulation is through "methods"

import numpy.random as rnd

rnd.randint(20)

11

Numpy arrays¶

a=np.arange(0,10,.1)

numpy.cos(a)

array([ 1.        ,  0.99500417,  0.98006658,  0.95533649,  0.92106099,
        0.87758256,  0.82533561,  0.76484219,  0.69670671,  0.62160997,
        0.54030231,  0.45359612,  0.36235775,  0.26749883,  0.16996714,
        0.0707372 , -0.02919952, -0.12884449, -0.22720209, -0.32328957,
       -0.41614684, -0.5048461 , -0.58850112, -0.66627602, -0.73739372,
       -0.80114362, -0.85688875, -0.90407214, -0.94222234, -0.97095817,
       -0.9899925 , -0.99913515, -0.99829478, -0.98747977, -0.96679819,
       -0.93645669, -0.89675842, -0.84810003, -0.79096771, -0.7259323 ,
       -0.65364362, -0.57482395, -0.49026082, -0.40079917, -0.30733287,
       -0.2107958 , -0.11215253, -0.01238866,  0.08749898,  0.18651237,
        0.28366219,  0.37797774,  0.46851667,  0.55437434,  0.63469288,
        0.70866977,  0.77556588,  0.83471278,  0.88551952,  0.92747843,
        0.96017029,  0.98326844,  0.9965421 ,  0.99985864,  0.99318492,
        0.97658763,  0.95023259,  0.91438315,  0.86939749,  0.8157251 ,
        0.75390225,  0.68454667,  0.60835131,  0.52607752,  0.43854733,
        0.34663532,  0.25125984,  0.15337386,  0.05395542, -0.04600213,
       -0.14550003, -0.24354415, -0.33915486, -0.43137684, -0.51928865,
       -0.6020119 , -0.67872005, -0.74864665, -0.81109301, -0.86543521,
       -0.91113026, -0.9477216 , -0.97484362, -0.99222533, -0.99969304,
       -0.99717216, -0.98468786, -0.96236488, -0.93042627, -0.88919115])

3*a

array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8,  2.1,  2.4,  2.7,  3. ,
        3.3,  3.6,  3.9,  4.2,  4.5,  4.8,  5.1,  5.4,  5.7,  6. ,  6.3,
        6.6,  6.9,  7.2,  7.5,  7.8,  8.1,  8.4,  8.7,  9. ,  9.3,  9.6,
        9.9, 10.2, 10.5, 10.8, 11.1, 11.4, 11.7, 12. , 12.3, 12.6, 12.9,
       13.2, 13.5, 13.8, 14.1, 14.4, 14.7, 15. , 15.3, 15.6, 15.9, 16.2,
       16.5, 16.8, 17.1, 17.4, 17.7, 18. , 18.3, 18.6, 18.9, 19.2, 19.5,
       19.8, 20.1, 20.4, 20.7, 21. , 21.3, 21.6, 21.9, 22.2, 22.5, 22.8,
       23.1, 23.4, 23.7, 24. , 24.3, 24.6, 24.9, 25.2, 25.5, 25.8, 26.1,
       26.4, 26.7, 27. , 27.3, 27.6, 27.9, 28.2, 28.5, 28.8, 29.1, 29.4,
       29.7])

a*a

array([0.000e+00, 1.000e-02, 4.000e-02, 9.000e-02, 1.600e-01, 2.500e-01,
       3.600e-01, 4.900e-01, 6.400e-01, 8.100e-01, 1.000e+00, 1.210e+00,
       1.440e+00, 1.690e+00, 1.960e+00, 2.250e+00, 2.560e+00, 2.890e+00,
       3.240e+00, 3.610e+00, 4.000e+00, 4.410e+00, 4.840e+00, 5.290e+00,
       5.760e+00, 6.250e+00, 6.760e+00, 7.290e+00, 7.840e+00, 8.410e+00,
       9.000e+00, 9.610e+00, 1.024e+01, 1.089e+01, 1.156e+01, 1.225e+01,
       1.296e+01, 1.369e+01, 1.444e+01, 1.521e+01, 1.600e+01, 1.681e+01,
       1.764e+01, 1.849e+01, 1.936e+01, 2.025e+01, 2.116e+01, 2.209e+01,
       2.304e+01, 2.401e+01, 2.500e+01, 2.601e+01, 2.704e+01, 2.809e+01,
       2.916e+01, 3.025e+01, 3.136e+01, 3.249e+01, 3.364e+01, 3.481e+01,
       3.600e+01, 3.721e+01, 3.844e+01, 3.969e+01, 4.096e+01, 4.225e+01,
       4.356e+01, 4.489e+01, 4.624e+01, 4.761e+01, 4.900e+01, 5.041e+01,
       5.184e+01, 5.329e+01, 5.476e+01, 5.625e+01, 5.776e+01, 5.929e+01,
       6.084e+01, 6.241e+01, 6.400e+01, 6.561e+01, 6.724e+01, 6.889e+01,
       7.056e+01, 7.225e+01, 7.396e+01, 7.569e+01, 7.744e+01, 7.921e+01,
       8.100e+01, 8.281e+01, 8.464e+01, 8.649e+01, 8.836e+01, 9.025e+01,
       9.216e+01, 9.409e+01, 9.604e+01, 9.801e+01])

a+numpy.exp(a)

array([1.00000000e+00, 1.20517092e+00, 1.42140276e+00, 1.64985881e+00,
       1.89182470e+00, 2.14872127e+00, 2.42211880e+00, 2.71375271e+00,
       3.02554093e+00, 3.35960311e+00, 3.71828183e+00, 4.10416602e+00,
       4.52011692e+00, 4.96929667e+00, 5.45519997e+00, 5.98168907e+00,
       6.55303242e+00, 7.17394739e+00, 7.84964746e+00, 8.58589444e+00,
       9.38905610e+00, 1.02661699e+01, 1.12250135e+01, 1.22741825e+01,
       1.34231764e+01, 1.46824940e+01, 1.60637380e+01, 1.75797317e+01,
       1.92446468e+01, 2.10741454e+01, 2.30855369e+01, 2.52979513e+01,
       2.77325302e+01, 3.04126389e+01, 3.33641000e+01, 3.66154520e+01,
       4.01982344e+01, 4.41473044e+01, 4.85011845e+01, 5.33024491e+01,
       5.85981500e+01, 6.44402876e+01, 7.08863310e+01, 7.79997937e+01,
       8.58508687e+01, 9.45171313e+01, 1.04084316e+02, 1.14647172e+02,
       1.26310418e+02, 1.39189780e+02, 1.53413159e+02, 1.69121907e+02,
       1.86472242e+02, 2.05636810e+02, 2.26806416e+02, 2.50191932e+02,
       2.76026407e+02, 3.04567401e+02, 3.36099560e+02, 3.70937468e+02,
       4.09428793e+02, 4.51957770e+02, 4.98949041e+02, 5.50871910e+02,
       6.08245038e+02, 6.71641633e+02, 7.41695189e+02, 8.19105825e+02,
       9.04647292e+02, 9.99174716e+02, 1.10363316e+03, 1.21906707e+03,
       1.34663076e+03, 1.48759993e+03, 1.64338443e+03, 1.81554241e+03,
       2.00579590e+03, 2.21604799e+03, 2.44840198e+03, 2.70518233e+03,
       2.98895799e+03, 3.30256808e+03, 3.64915031e+03, 4.03217239e+03,
       4.45546675e+03, 4.92326884e+03, 5.44025959e+03, 6.01161222e+03,
       6.64304401e+03, 7.34087354e+03, 8.11208393e+03, 8.96439270e+03,
       9.90632906e+03, 1.09473192e+04, 1.20977807e+04, 1.33692268e+04,
       1.47743816e+04, 1.63273072e+04, 1.80435449e+04, 1.99402704e+04])
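
The outputs above show that arithmetic on a numpy array acts on every element at once. A smaller sketch of the same idea, with a four-element array so the results are easy to check by eye (the names tripled, squared and squared_list are illustrative):

```python
import numpy as np

a = np.arange(0, 1, 0.25)        # start 0, stop before 1, step 0.25

# Arithmetic acts on every element at once.
tripled = 3 * a
squared = a * a

# The same computation with a plain list needs an explicit loop.
squared_list = [v * v for v in a.tolist()]

print(tripled)
print(squared)
print(squared_list)
```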

DataFrames and Pandas¶

import pandas as pd

simple dataframes and dictionaries¶

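a = dict(name=['jeremy','kendra','pariksheet'],gender=['m','f','m'])  # a was reused for an array above, so rebuild the dictionary
df = pd.DataFrame.from_dict(a)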

df

data = pd.read_csv('../gapminder_data.csv')

data.columns

Index(['country', 'year', 'pop', 'continent', 'lifeExp', 'gdpPercap'], dtype='object')

data.head()

data = pd.read_csv('../gapminder_data.csv', index_col='country')

data.columns

Index(['year', 'pop', 'continent', 'lifeExp', 'gdpPercap'], dtype='object')

data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1704 entries, Afghanistan to Zimbabwe
Data columns (total 5 columns):
year         1704 non-null int64
pop          1704 non-null float64
continent    1704 non-null object
lifeExp      1704 non-null float64
gdpPercap    1704 non-null float64
dtypes: float64(3), int64(1), object(1)
memory usage: 79.9+ KB

selecting elements from dataframes¶

columns¶

data['pop']

country
Afghanistan     8425333.0
Afghanistan     9240934.0
Afghanistan    10267083.0
Afghanistan    11537966.0
Afghanistan    13079460.0
                  ...    
Zimbabwe        9216418.0
Zimbabwe       10704340.0
Zimbabwe       11404948.0
Zimbabwe       11926563.0
Zimbabwe       12311143.0
Name: pop, Length: 1704, dtype: float64

data[['year','pop']]

rows¶

from the index¶

data.loc['Afghanistan']

boolean selection on a column¶

data['continent']=='Asia'

country
Afghanistan     True
Afghanistan     True
Afghanistan     True
Afghanistan     True
Afghanistan     True
               ...  
Zimbabwe       False
Zimbabwe       False
Zimbabwe       False
Zimbabwe       False
Zimbabwe       False
Name: continent, Length: 1704, dtype: bool

data[data['continent']=='Asia']

data[data['pop']>1e8]

data[data['lifeExp']<60]
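
How boolean selection works, sketched on a tiny made-up table so it runs without the gapminder file. The DataFrame toy, its index labels and its values are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the values are made up for illustration.
toy = pd.DataFrame(
    {"continent": ["Asia", "Asia", "Europe"],
     "pop": [5.0e8, 2.0e7, 6.0e7]},
    index=["A", "B", "C"],
)

mask = toy["continent"] == "Asia"      # Series of True/False, one per row
asia = toy[mask]                       # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); each needs its own parentheses.
big_asia = toy[(toy["continent"] == "Asia") & (toy["pop"] > 1e8)]

print(asia)
print(big_asia)
```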

numerical indexing¶

data.iloc[33]

year               1997
pop          2.9072e+07
continent        Africa
lifeExp          69.152
gdpPercap        4797.3
Name: Algeria, dtype: object

data.iloc[33,3]

69.152

rows and columns¶

data.loc['Afghanistan','pop']

country
Afghanistan     8425333.0
Afghanistan     9240934.0
Afghanistan    10267083.0
Afghanistan    11537966.0
Afghanistan    13079460.0
Afghanistan    14880372.0
Afghanistan    12881816.0
Afghanistan    13867957.0
Afghanistan    16317921.0
Afghanistan    22227415.0
Afghanistan    25268405.0
Afghanistan    31889923.0
Name: pop, dtype: float64

statistics on numerical columns¶

data.loc['Afghanistan','pop'].describe()

count    1.200000e+01
mean     1.582372e+07
std      7.114583e+06
min      8.425333e+06
25%      1.122025e+07
50%      1.347371e+07
75%      1.779529e+07
max      3.188992e+07
Name: pop, dtype: float64

Grouping: optional¶

s=data.groupby(['continent']).mean()
s.head()

averages_by_continent=data.groupby(['continent','year']).mean()
averages_by_continent.loc['Asia'].round()

Pivot table¶

lifeExp_over_time = pd.pivot_table(data,index='country',columns='year',values='lifeExp')

lifeExp_over_time.loc['Afghanistan',:]

year
1952    28.801
1957    30.332
1962    31.997
1967    34.020
1972    36.088
1977    38.438
1982    39.854
1987    40.822
1992    41.674
1997    41.763
2002    42.129
2007    43.828
Name: Afghanistan, dtype: float64

stats_over_time = pd.pivot_table(data,index=['continent','country'],columns='year',values=['lifeExp','gdpPercap','pop'])
stats_over_time_by_continent=stats_over_time.groupby('continent').mean()

stats_over_time_by_continent['pop'].round(-3)

Plotting¶

plotting libraries:¶

import matplotlib.pyplot as plt
plt.style.use('ggplot')

plt.plot([1,2,3],[1,4,9],c='blue',linewidth=3,linestyle='solid')
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.suptitle('A demo plot')

Text(0.5, 0.98, 'A demo plot')

plt.plot([1,2,3],[1,4,9],c='orange',linewidth=1,linestyle='dashed')
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.suptitle('A demo plot')

Text(0.5, 0.98, 'A demo plot')

plt.scatter([1,2,3],[1,4,9],c=[0,1,2],s=100)
plt.plot([1,2,3],[1,4,9])
plt.xticks([1,2,3])
plt.yticks([1,4,9])
plt.xlim(0,5)
plt.ylim(0,10)
plt.grid(False)

Plotting from pandas¶

the index is the x axis and the values in the column are plotted against that
or you can specify x and y

data.head()

lifeExp_over_time.head()

lifeExp_over_time.T['Afghanistan'].plot()

<matplotlib.axes._subplots.AxesSubplot at 0x1a24217a50>

transpose = lifeExp_over_time.T
transpose[['Afghanistan','Germany']].plot(kind='bar')

<matplotlib.axes._subplots.AxesSubplot at 0x1a215b9b50>

_=stats_over_time.groupby('continent').mean()['gdpPercap'].T.plot()

stats_over_time.groupby('continent').mean()['lifeExp'].T.plot()
_=plt.suptitle('Mean Life Expectancy over Time')

_=stats_over_time['gdpPercap'].loc[['Africa']].mean().T.plot(kind='bar',legend=None)

data.plot(kind='scatter',y='gdpPercap',x='lifeExp',c='year',logy=True,figsize=(10,10))
plt.savefig('scatter.png')

pwd

'/Users/swc/2020-01-13/Course Notes'

_=data[data['year']==2002].groupby('continent').sum()['pop'].plot(kind='pie',figsize=(10,10),label='Population')

_=data[(data['year']==2002) & (data['continent']=='Asia')].groupby('country').sum()['pop'].sort_values().plot(kind='bar',figsize=(10,10),)

Table outputs from earlier cells follow. Output of data[data['pop']>1e8]:

	year	pop	continent	lifeExp	gdpPercap
country
Bangladesh	1987	103764241.0	Asia	52.819	751.979403
Bangladesh	1992	113704579.0	Asia	56.018	837.810164
Bangladesh	1997	123315288.0	Asia	59.412	972.770035
Bangladesh	2002	135656790.0	Asia	62.013	1136.390430
Bangladesh	2007	150448339.0	Asia	64.062	1391.253792
...	...	...	...	...	...
United States	1987	242803533.0	Americas	75.020	29884.350410
United States	1992	256894189.0	Americas	76.090	32003.932240
United States	1997	272911760.0	Americas	76.810	35767.433030
United States	2002	287675526.0	Americas	77.310	39097.099550
United States	2007	301139947.0	Americas	78.242	42951.653090

Output of s.head(), where s=data.groupby(['continent']).mean():

	year	pop	lifeExp	gdpPercap
continent
Africa	1979.5	9.916003e+06	48.865330	2193.754578
Americas	1979.5	2.450479e+07	64.658737	7136.110356
Asia	1979.5	7.703872e+07	60.064903	7902.150428
Europe	1979.5	1.716976e+07	71.903686	14469.475533
Oceania	1979.5	8.874672e+06	74.326208	18621.609223

Output of averages_by_continent.loc['Asia'].round():

	pop	lifeExp	gdpPercap
year
1952	42283556.0	46.0	5195.0
1957	47356988.0	49.0	5788.0
1962	51404763.0	52.0	5729.0
1967	57747361.0	55.0	5971.0
1972	65180977.0	57.0	8187.0
1977	72257987.0	60.0	7791.0
1982	79095018.0	63.0	7434.0
1987	87006690.0	65.0	7608.0
1992	94948248.0	67.0	8640.0
1997	102523803.0	68.0	9834.0
2002	109145521.0	69.0	10174.0
2007	115513752.0	71.0	12473.0

Output of stats_over_time_by_continent['pop'].round(-3):

year	1952	1957	1962	1967	1972	1977	1982	1987	1992	1997	2002	2007
continent
Africa	4570000.0	5093000.0	5702000.0	6448000.0	7305000.0	8328000.0	9603000.0	11055000.0	12675000.0	14304000.0	16033000.0	17876000.0
Americas	13806000.0	15478000.0	17331000.0	19230000.0	21175000.0	23123000.0	25212000.0	27310000.0	29571000.0	31876000.0	33991000.0	35955000.0
Asia	42284000.0	47357000.0	51405000.0	57747000.0	65181000.0	72258000.0	79095000.0	87007000.0	94948000.0	102524000.0	109146000.0	115514000.0
Europe	13937000.0	14596000.0	15345000.0	16039000.0	16688000.0	17239000.0	17709000.0	18103000.0	18605000.0	18965000.0	19274000.0	19537000.0
Oceania	5343000.0	5971000.0	6642000.0	7300000.0	8053000.0	8620000.0	9197000.0	9787000.0	10460000.0	11121000.0	11727000.0	12275000.0

Output of lifeExp_over_time.head():

year	1952	1957	1962	1967	1972	1977	1982	1987	1992	1997	2002	2007
country
Afghanistan	28.801	30.332	31.997	34.020	36.088	38.438	39.854	40.822	41.674	41.763	42.129	43.828
Albania	55.230	59.280	64.820	66.220	67.690	68.930	70.420	72.000	71.581	72.950	75.651	76.423
Algeria	43.077	45.685	48.303	51.407	54.518	58.014	61.368	65.799	67.744	69.152	70.994	72.301
Angola	30.015	31.999	34.000	35.985	37.928	39.483	39.942	39.906	40.647	40.963	41.003	42.731
Argentina	62.485	64.399	65.142	65.634	67.065	68.481	69.942	70.774	71.868	73.275	74.340	75.320

Output of data.head(), before country was made the index:

	country	year	pop	continent	lifeExp	gdpPercap
0	Afghanistan	1952	8425333.0	Asia	28.801	779.445314
1	Afghanistan	1957	9240934.0	Asia	30.332	820.853030
2	Afghanistan	1962	10267083.0	Asia	31.997	853.100710
3	Afghanistan	1967	11537966.0	Asia	34.020	836.197138
4	Afghanistan	1972	13079460.0	Asia	36.088	739.981106

Output of df, the dataframe built from the name/gender dictionary:

	name	gender
0	jeremy	m
1	kendra	f
2	pariksheet	m