CMU 15-110: Principles of Computing
Data Representation and Algorithms

Part 1: Data Representation

What is Data Representation?

How to store and transmit data
Digital: using only 0's and 1's (bits) (real!)
Abstraction: Everything else must be represented with 0's and 1's
Type: the class or kind of a data value, such as integer (42), string ('hello'), image (...), video (...), etc.
Data Abstraction Hierarchy: Represent complex types using simpler types
For example:

Ways to Represent Some Types

integers

unsigned (not negative) integers

binary (base 2)
- See here
- Example: Why 22 in (unsigned) base 2 is 10110:
Issues
- word size (how many bits should we use?)
- overflow (what happens when a number is too big for that many bits?)
Examples:
- +13 in an unsigned 8-bit word is 0000 1101
- +13 in an unsigned 4-bit word is 1101
- +13 in an unsigned 2-bit word is impossible due to overflow
- -13 in any-sized unsigned word is impossible (unsigned means "not negative")

signed (possibly-negative) integers

sign-magnitude
- This is not really used in practice, but it's similar to what is used, and it's very clear, so we will use this to digitize signed integers.
- Use the leftmost (high order) bit for the sign: 0 is positive, 1 is negative
- Examples:
  - +13 in a (sign-magnitude) signed 8-bit word is 0000 1101
  - -13 in a (sign-magnitude) signed 8-bit word is 1000 1101
    - Note: 1000 1101 in an unsigned 8-bit word is 141 (128+8+4+1)
  - -13 in a (sign-magnitude) signed 4-bit word is impossible due to overlow
- Issue: What is the (sign-magnitude) signed 4-bit value 1000? (negative 0!)
two's complement
- The real way in practice, similar to 10's complement in algorithms section below. Optionally read about it here.

characters

ASCII
- 8-bit unsigned values
  - Technically, 7-bits but we add a leading (leftmost) 0 to get 8 bits.
- See here
- Examples:
  - 'A' is 65 which in unsigned 8 bits is 0100 0001
  - 'a' is 97 which in unsigned 8 bits is 0110 0001
  - '{' is 123 which in unsigned 8 bits is 0111 1011
Unicode
- Can use more than 8 bits
- Can represent all languages, not just English
- For example, UTF-8 uses up to 32 bits and can represent over 1 million characters!
- For simplicity, in 110 we'll use 8-bit ASCII to digitize characters

strings (sequences of characters)

0-terminated

Example: Encode 'Dog' using 0-terminated ASCII and 8-bit words:

        'D' is 68, 'o' is 111, 'g' is 103
        -->    68          111         103          0
        --> 0100 0100   0110 1111   0110 0111   0000 0000
        Final answer: 01000100011011110110011100000000

length-prefixed

Example: Encode 'Dog' using length-prefixed ASCII and 8-bit words:

        'D' is 68, 'o' is 111, 'g' is 103
        -->     3          68          111         103
        --> 0000 0011   0100 0100   0110 1111   0110 0111
        Final answer: 00000011010001000110111101100111

everything else!

floats (like 3.14), lists, 2d lists, images, audio, video, maps, music, art, math, even ideas, ...

The Big Idea:
- To store, transmit, or compute over any data, you first must represent it! And then, only using 1's and 0's!

Part 2: Algorithms

What is an Algorithm?

A precise way to solve a problem

Input (bits) → Computation (precise steps) → Output (bits)
a "step" is defined by the language and desired level of abstraction

Pseudocode: an informal English-like code-like language
Abstraction: all algorithms must be reduced to these basic steps (or primitives)
Hierarchy: Build complex algorithms out of simpler algorithms (using Top-Down Design)
Analysis: how correct, how fast, how clear, how general,...?

Arithmetic

counting on fingers!
- in base 1
- in base 2
subtraction
- standard subtraction
- 10's complement (subtraction by addition!)
  - Example: Compute 238 - 147 using 10's Complement
    - The 9's Complement of 147 is 852
    - So the 10's Complement of 147 is 852+1, which is 853
    - 238 + 853 is 1091
    - Ignoring the last carry, that's 91.
    - So 238 - 147 is 91

multiplication

standard multiplication
lattice multiplication
- Example: Show that 87 * 24 is 2088 using lattice multiplication:

Egyptian multiplication

Example: Show that 87 * 24 is 2088 using Egyptian multiplication (in just 5 additions!):

                              87 is  1 * 87
                 87 +  87 =  174 is  2 * 87   (Addition #1)
                174 + 174 =  348 is  4 * 87   (Addition #2)
                348 + 348 =  696 is  8 * 87   (Addition #3)
                696 + 696 = 1392 is 16 * 87   (Addition #4)
                ----------------------------
                Note: 24 = 16 + 8
                So:   87 * 24 = 87 * (16 + 8) = 87*16 + 87*8
                ----------------------------
                So the answer is:
                1392 + 696 = 2088             (Addition #5)

Repeated addition

Example: Show that 87 * 24 is 2088 using repeated additions:

                              87 is  1 * 87
                  87 + 87 =  174 is  2 * 87   (Addition #1)
                 174 + 87 =  261 is  3 * 87   (Addition #2)
                 261 + 87 =  348 is  4 * 87   (Addition #3)
                 348 + 87 =  435 is  5 * 87   (Addition #4)
                                 ...
                2001 + 87 = 2088 is 24 * 87   (Addition #23) (yuck)

Card Tricks

The Parity Card Trick
- See here.
- Can provide "110-level" error detection and correction
  - Example: Transmit 10111 using the Parity Card Trick:
    1. We have 5 bits, so cannot use 2x2, so use 3x3 and pad with 4 0's, so we have 101110000
    2. Place these in a 3x3 grid, row-by-row, like so:
```
                1 0 1
                1 1 0
                0 0 0
```
    3. Add bottom row and right col to get even 1's in each, like so:
```
                1 0 1 0
                1 1 0 0
                0 0 0 0
                0 1 1 0
```
    4. Read these back into a single bit stream, like so:
```
                1 0 1 0 1 1 0 0 0 0 0 0 0 1 1 0
```
    5. Send those bits!
    6. Now, optionally, the transmission medium might flip one bit (and hopefully not more than one!), like so:
```
                1 0 1 0 1 1 0 0 0 0 1 0 0 1 1 0
```
    7. Now, the receiver receives these (erroneous) bits!
    8. The receiver recreates the square grid, like so:
```
                1 0 1 0
                1 1 0 0
                0 0 1 0
                0 1 1 0
```
    9. The receiver uses the Parity Card Trick to detect and correct the error, like so:
```
                1 0 1 0
                1 1 0 0
                0 0 0 0
                0 1 1 0
```
    10. The receiver ignores the last row and column to produce the final (correct!) result, like so:
```
                1 0 1 1 1 0 0 0 0
```

Coin Flips

N-Person Coin Flips
- Note: Try this random coin flipper
- Example: use a coin to choose fairly among 5 people:
  1. One flip has 2 outcomes, 2 flips have 4 (2²) outcomes, but we need at least 5 outcomes, so we need 3 flips, which have 8 (2³) outcomes.
  2. Number the 5 people from 0 to 4.
  3. Now, flip the coin 3 times. Say we get H-H-T.
  4. Convert the H/T outcomes into binary, H is 1, T is 0. So H-H-T is 110.
  5. Convert the binary into decimal. 110 is 6.
  6. The matching person wins. If there is no match, keep doing it again until there is.
  7. In fact, with an outcome of 6, there is no match, so we do it again, and flip a coin 3 times. Say we get T-H-T.
  8. Convert to binary, 010, then to decimal, 2.
  9. So Person 2 just won! (Remember that we numbered the people 0, 1, 2, 3, and 4)

Game Playing
- Nim
  - Goal: pick up the last straw!
  - Rule: you must pick up 1, 2, or 3 straws on your turn.
  - Strategy: go for multiples of 4.
  - The Big Idea: we can use an algorithm for optimal play of a game. That's great!

CMU 15-110: Principles of Computing Data Representation and Algorithms

CMU 15-110: Principles of Computing
Data Representation and Algorithms