DataCamp - Intro to Python

DataCamp course notes on basic calculation, variables and types, functions, and Numpy.

Basic Calculation

# Addition
2 + 3
'ab' + 'cd'  # concatenates the two strings into 'abcd'

# Exponentiation
4 ** 2

# Modulo
18 % 7

Variable & Types

Variable:

  • Specific, case-sensitive name
  • Call up value through variable name
  • Can help to make your code reproducible

Type:

# To know the type of a variable
type(variable)
  1. Float, a number with both an integer and a fractional part
  2. Int, a whole number
  3. Str, text. Can use both double and single quotes.
  4. Bool, True or False (case-sensitive). In arithmetic, True behaves as 1 and False as 0.

Note: Different types have different behavior
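
As a small illustration of this note, the same + operator does different things depending on the types of its operands. The values below are arbitrary examples, not taken from the course.

```python
# The + operator depends on the types of its operands.
int_sum = 2 + 3            # integer addition
str_sum = 'ab' + 'cd'      # string concatenation
bool_sum = True + False    # booleans act as 1 and 0 in arithmetic

print(type(int_sum), int_sum)
print(type(str_sum), str_sum)
print(type(bool_sum), bool_sum)
```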

To concatenate strings with numbers, convert the numbers to strings with str() first:

print("I started with $" + str(savings) + " and now have $" + str(result) + ". Awesome!")

print("I said " + ("Hey " * 2) + "Hey!")

Other conversion functions: int(), float(), bool()
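
A minimal sketch of these conversions. The print statement above relies on savings and result already being defined; the values used here are assumed purely for illustration.

```python
savings = 100      # assumed example value
result = 194.87    # assumed example value

# str() turns a number into text so it can be concatenated with +
message = "I started with $" + str(savings) + " and now have $" + str(result) + "."
print(message)

# Converting in the other direction
pi_float = float("3.1415")  # string to float
count = int("42")           # string to int
truncated = int(3.99)       # float to int truncates toward zero
flag = bool(0)              # 0 converts to False, nonzero numbers to True
print(pi_float, count, truncated, flag)
```

Without str(), adding a string and a number raises a TypeError, which is why the explicit conversion is needed.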

List

Characteristics

  • Names a collection of values: d = [a, b, c]
  • Can contain any type, including a mix of different types
  • List of lists: d2 = [[a, 1], [b, 2], [c, 3]]

Subsetting lists
Lists use zero-based indexing: the first element has index 0.

  • x[0]: returns the first element
  • x[-1]: returns the last element
  • x[-2]: returns the second-to-last element
  • x[3:5]: returns the fourth and fifth elements. The element at index 5 is not included, so a slice behaves like the half-open interval [start, end)
  • x[:4]: returns the first four elements (indices 0 to 3)
  • x[5:]: returns everything from the sixth element to the last
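
The indexing rules above can be checked on a small list. The list contents are an arbitrary example.

```python
x = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

first = x[0]          # 'a'
last = x[-1]          # 'g'
second_last = x[-2]   # 'f'
middle = x[3:5]       # ['d', 'e'], index 5 is excluded
head = x[:4]          # ['a', 'b', 'c', 'd']
tail = x[5:]          # ['f', 'g']

print(first, last, second_last, middle, head, tail)
```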

List Manipulation

  • Replace part of the list by assigning new values to an index or slice

    x[0:2] = [a,b]
  • Adding and removing elements

    # adding
    x = x + [a,b]

    # removing
    del x[2]
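
A runnable version of the three operations, with concrete string values standing in for the undefined names a and b. The values are assumptions chosen only to make the result easy to follow.

```python
x = ['a', 'b', 'c', 'd']

x[0:2] = ['y', 'z']   # replace the first two elements
x = x + ['e', 'f']    # add two elements at the end
del x[2]              # remove the element at index 2 ('c')

print(x)  # ['y', 'z', 'd', 'e', 'f']
```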

Important note: when you create a list, Python stores the list in memory and stores its "address" in the variable. The variable therefore does not contain the list elements themselves, only a reference to them. This distinction matters most when you try to copy a list:

x = ['a', 'b', 'c']
y = x
y[1] = 'z'

If you now print y, you will see ['a', 'z', 'c']. Interestingly, x has also changed to ['a', 'z', 'c']. That is because y = x copies the reference, not the list elements themselves. Both x and y refer to the same list in memory, so updating an element through either name changes what both return.

To make y a separate list with the same values as x, use y = list(x) or y = x[:], which copies all the elements explicitly. Updating elements in y then leaves x unchanged.
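
The difference between sharing a reference and copying can be sketched as follows. The variable names alias, copy_a and copy_b are illustrative only.

```python
x = ['a', 'b', 'c']

alias = x           # second name for the same list object
copy_a = list(x)    # new list with the same values
copy_b = x[:]       # slicing the whole list also creates a new list

alias[1] = 'z'      # visible through x, because alias and x are one list
copy_a[0] = 'q'     # x is unaffected

print(x)            # ['a', 'z', 'c']
print(copy_a)       # ['q', 'b', 'c']
print(copy_b)       # ['a', 'b', 'c']
print(alias is x, copy_a is x)  # True False
```

Both list(x) and x[:] make shallow copies: if x contains other lists, those inner lists are still shared between the copy and the original.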

Functions

help(function) displays a function's documentation, including its arguments. An argument shown in square brackets is optional.

Objects and Methods

All values and data structures, such as strings, floats, and lists, are Python objects, and methods are functions that belong to objects. Which methods are available, and how they behave, depends on the type of the object.
To call a method, use dot notation. For example, with a list x: x.index(a), x.count(a)
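
As a short sketch, index() and count() exist on both lists and strings but operate on elements in one case and on substrings in the other. The example data is made up.

```python
letters = ['a', 'b', 'c', 'b']
word = 'banana'

list_pos = letters.index('b')    # position of the first 'b' in the list
list_count = letters.count('b')  # how many elements equal 'b'
str_pos = word.index('n')        # position of the first 'n' in the string
str_count = word.count('an')     # how many times the substring 'an' occurs

print(list_pos, list_count, str_pos, str_count)  # 1 2 2 2
```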

Useful tips and functions:

  1. To place two commands on the same line: command1; command2
  2. max(), min(): largest and smallest value
  3. round(x, digits): round x to the given number of decimal places. If digits is omitted, x is rounded to the nearest integer
  4. len(x): number of elements in x (for a string, the number of characters)
  5. sorted(x, reverse=True/False, key=function): return a new list sorted in ascending (default) or descending order. To sort a list in place, use x.sort(); x.reverse() only reverses the current order
  6. capitalize(): uppercase the first letter of a string and lowercase the rest; upper(): uppercase all the letters
  7. replace('a', 'b'): replace every 'a' with 'b'
  8. append(): append a value to the end of a list
  9. count('a'): count the number of occurrences of 'a' in a string
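
A few of these functions in action, as a minimal sketch with made-up values:

```python
values = [3.14159, 2.71828, 1.41421]

rounded_int = round(values[0])       # no digits given: nearest integer, 3
rounded_two = round(values[0], 2)    # two decimal places, 3.14
descending = sorted(values, reverse=True)         # new list, values is unchanged
by_length = sorted(['ccc', 'a', 'bb'], key=len)   # key is a function applied to each item

name = 'data camp'
capitalized = name.capitalize()      # 'Data camp'
uppercased = name.upper()            # 'DATA CAMP'
replaced = name.replace('a', 'o')    # 'doto comp'

values.append(1.0)                   # modifies the list in place
print(rounded_int, rounded_two, descending, by_length)
print(capitalized, uppercased, replaced, len(values))
```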

Packages

Install packages from the terminal with pip, after downloading get-pip.py:

python3 get-pip.py
pip3 install numpy

Import the package

import numpy as np
np.array([1, 2, 3])

# If you only need one function from the package
from numpy import array
array([1, 2, 3])
# With this form, readers of your code cannot tell at a glance that array() comes from numpy,
# so the first form is usually preferred.

Numpy

  • Great for vector arithmetic
  • Supports element-wise calculation on arrays, which plain lists do not
  • Fast and easy
    np.array(list) converts a list into a numpy array.
  • Remarks:
  1. numpy arrays contain only one type. If the input contains different types, some values are coerced to produce a homogeneous array.
    np.array([True, 1, 2]) will become array([1, 1, 2])
  2. Adding arrays together performs element-wise addition, while adding lists together simply concatenates them without any calculation.
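
Both remarks can be checked with a short sketch. It assumes numpy is installed; the height and weight values are example data.

```python
import numpy as np

heights = [1.73, 1.68, 1.71]   # example values
weights = [65.4, 59.2, 63.6]   # example values

list_sum = heights + weights                        # concatenation: 6 elements
array_sum = np.array(heights) + np.array(weights)   # element-wise: 3 elements

coerced = np.array([True, 1, 2])   # the bool is coerced to an integer
mixed = np.array([1, 2.5, 'a'])    # everything is coerced to strings

print(len(list_sum), array_sum)
print(coerced, mixed.dtype)
```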

Subsetting

x[1]      # second element
x[x > 1]  # all elements greater than 1 (boolean mask)
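
The second form works in two steps: the comparison produces a boolean array, and indexing with that array keeps only the positions marked True. A minimal sketch with made-up values, assuming numpy is installed:

```python
import numpy as np

x = np.array([0.5, 1.5, 3.0, 1.0])

mask = x > 1         # element-wise comparison gives a boolean array
selected = x[mask]   # keeps the elements where mask is True

print(mask)      # [False  True  True False]
print(selected)  # [1.5 3. ]
```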

2D Numpy Arrays

  • ndarray = N-dimensional array
  • needs homogeneous type
    np_2d = np.array([[1.73, 1.68, 1.71], [65.4, 59.2, 63.6]])
    np_2d.shape      # returns the shape as (rows, columns), here (2, 3)
    np_2d[0]         # selects the first row
    np_2d[0][2]      # selects the third element in the first row
    np_2d[0, 2]      # same element: row 1, column 3
    np_2d[:, 1:3]    # selects columns 2 and 3 (index 3 is excluded)

Basic Statistics

  • Numpy's functions are typically much faster on large arrays than the built-in equivalents on lists
  1. np.mean(), np.median(), np.corrcoef(a, b), np.std()
  2. np.sum(), np.sort()

To generate normally distributed random data: np.random.normal(mean, sd, number)

To stack 1-D arrays as the columns of a 2-D array: np.column_stack((a, b))
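
Putting these pieces together, a minimal sketch that generates two columns of random data and summarizes them. The means, standard deviations, sample size and seed are assumed values for illustration; numpy must be installed.

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same numbers

# 1000 samples each; the parameters are illustrative assumptions
heights = np.random.normal(1.75, 0.20, 1000)
weights = np.random.normal(60.0, 15.0, 1000)

# Combine the two 1-D arrays into a 2-D array with one column each
people = np.column_stack((heights, weights))

mean_height = np.mean(people[:, 0])
median_height = np.median(people[:, 0])
std_weight = np.std(people[:, 1])
corr = np.corrcoef(people[:, 0], people[:, 1])  # 2x2 correlation matrix

print(people.shape)  # (1000, 2)
print(mean_height, median_height, std_weight)
print(corr)
```

Because the two columns are generated independently, the off-diagonal entries of corr should be close to zero.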