list comprehension and generators

Nested list comprehensions

  • [[output expression] for iterator variable in iterable]
  • Collapse for loops for building lists into a single line
    • Components
      • Iterable
      • Iterator variable (represent members of iterable)
      • Output expression
In [1]:
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
In [7]:
pair_2=[(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]
pair_2
Out[7]:
[(0, 6), (0, 7), (1, 6), (1, 7)]
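
A comprehension with two for clauses reads left to right like nested loops: the first for is the outer loop and the second is the inner loop. As a rough sketch (the names pairs_loop and pairs_comp are made up for this illustration), the following builds the same list both ways:

```python
# Build the pairs with explicit nested loops.
pairs_loop = []
for num1 in range(0, 2):        # outer loop: the first "for" clause
    for num2 in range(6, 8):    # inner loop: the second "for" clause
        pairs_loop.append((num1, num2))

# Build the same pairs with a single comprehension.
pairs_comp = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]

# Both approaches should produce identical lists.
print(pairs_loop == pairs_comp)
```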

Using conditionals in comprehensions

  • [ output expression for iterator variable in iterable if predicate expression ].
In [2]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)
['samwise', 'aragorn', 'legolas', 'boromir']
In [3]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)
['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']
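
The two cells above put the condition in different places: a trailing if filters which members are kept, while an if/else in the output expression changes what each kept member becomes. A small sketch that combines both, assuming an arbitrary rule (keep names starting with f, s, a, or g; upper-case the long ones) chosen only for illustration:

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# The if/else before "for" chooses the output value for each kept member.
# The trailing "if" decides which members are kept at all.
shouted = [member.upper() if len(member) >= 7 else member
           for member in fellowship
           if member.startswith(('f', 's', 'a', 'g'))]

print(shouted)
```

The result has fewer items than fellowship because of the trailing filter, whereas the if/else form alone always returns one item per input.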

Dict comprehensions

  • Recall that the main difference between a list comprehension and a dict comprehension is the use of curly braces {} instead of []. Additionally, members of the dictionary are created using a colon :, as in key:value
    • Create dictionaries
    • Use curly braces {} instead of brackets []
In [4]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create dict comprehension: new_fellowship
new_fellowship = {member:len(member) for member in fellowship}

# Print the new dict
print(new_fellowship)
{'aragorn': 7, 'frodo': 5, 'samwise': 7, 'merry': 5, 'gimli': 5, 'boromir': 7, 'legolas': 7}
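
A dict comprehension accepts a trailing if in the same way a list comprehension does. A minimal sketch, where the name long_names is an assumption made for this example:

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Map each name to its length, keeping only names of 7 or more characters.
long_names = {member: len(member) for member in fellowship if len(member) >= 7}

print(long_names)
```

On Python 3.7 and later, dicts keep insertion order, so the printed keys follow the order of fellowship rather than the shuffled order shown in the output above.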

Generator expressions

  • Recall list comprehension
    • Use ( ) instead of [ ]
In [9]:
g = (2 * num for num in range(10))
g
Out[9]:
<generator object <genexpr> at 0x0000000004335A20>

List comprehensions vs. generators

  • List comprehension - returns a list
  • Generator expression - returns a generator object
  • Both can be iterated over
In [13]:
(num for num in range(10*1000000) if num % 2 == 0)
Out[13]:
<generator object <genexpr> at 0x0000000004335E10>
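
The practical difference is that a list comprehension builds every element up front, while a generator expression produces elements one at a time and can only be consumed once. A hedged sketch of that behaviour (the names evens_list and evens_gen are illustrative, and the exact sizes reported by sys.getsizeof vary by Python version):

```python
import sys

# The list comprehension materializes all 50,000 even numbers immediately.
evens_list = [num for num in range(100000) if num % 2 == 0]

# The generator expression stores only its state, not the values.
evens_gen = (num for num in range(100000) if num % 2 == 0)

# The list object is much larger than the generator object.
print(sys.getsizeof(evens_list) > sys.getsizeof(evens_gen))

# Values are produced lazily, one per call to next().
first = next(evens_gen)
second = next(evens_gen)
print(first, second)

# Consuming the generator exhausts it; a second pass yields nothing.
remaining_count = len(list(evens_gen))
leftover = list(evens_gen)
print(remaining_count, leftover)
```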

Generator functions

Generator functions are functions that, like generator expressions, yield a series of values instead of returning a single value. A generator function is defined like a regular function, but whenever it produces a value it uses the keyword yield instead of return.

  • Produces generator objects when called
  • Defined like a regular function - def
  • Yields a sequence of values instead of returning a single value
  • Generates a value with yield keyword
In [15]:
def num_sequence(n):
    
    """Generate values from 0 to n - 1."""
    i = 0
    while i < n:
        yield i
        i += 1
In [17]:
test=num_sequence(7)
print(type(test))
<type 'generator'>
In [21]:
next(test)
Out[21]:
3
In [22]:
test.next()  # Python 2 only; in Python 3 use next(test)
Out[22]:
4
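
Each call to next() resumes the function body right after the last yield, which is why the outputs above keep counting upward. Once the loop in the function ends, the generator raises StopIteration. A small self-contained sketch of that life cycle, written for Python 3 (the variable names are illustrative):

```python
def num_sequence(n):
    """Yield the integers 0, 1, ..., n - 1."""
    i = 0
    while i < n:
        yield i
        i += 1

gen = num_sequence(3)

# The first call runs the body up to the first yield.
first = next(gen)

# list() keeps calling next() until the generator is exhausted.
rest = list(gen)

# A further call on an exhausted generator raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, rest, exhausted)
```

A for loop handles StopIteration automatically, so "for value in num_sequence(3):" is usually the simplest way to consume a generator.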

List comprehensions for time-stamped data

the pandas Series

  • A Series is a one-dimensional labeled array
  • Extract the column 'created_at' from df and assign the result to tweet_time. Fun fact: the extracted column in tweet_time here is a Series data structure!
  • Create a list comprehension that extracts the time from each row in tweet_time. Each row is a string that represents a timestamp, and the slice entry[11:19] (indices 11 through 18) holds the clock time. Use entry as the iterator variable and assign the result to tweet_clock_time.
In [27]:
import pandas as pd

df = pd.read_csv('tweets.csv')
    
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time]

# Print the extracted times
print(tweet_clock_time[:100])
['05:24:51', '05:24:57', '05:25:38', '05:25:42', '05:25:48', '05:25:53', '05:25:58', '05:26:12', '05:26:27', '05:26:30', '05:26:35', '05:26:48', '05:27:56', '05:28:28', '05:28:28', '05:28:40', '05:28:55', '05:30:06', '05:30:18', '05:30:20', '05:30:53', '05:30:55', '05:31:41', '05:32:20', '05:32:23', '05:32:32', '05:34:11', '05:34:17', '05:36:07', '05:38:17', '05:38:26', '05:39:39', '05:39:48', '05:40:07', '05:40:19', '05:40:58', '05:41:06', '05:41:21', '05:41:34', '05:41:51', '05:42:13', '05:42:51', '05:43:20', '05:43:24', '05:43:34', '05:44:36', '05:45:16', '05:45:40', '05:46:38', '05:46:40', '05:46:56', '05:47:07', '05:47:36', '05:47:44', '05:47:50', '05:48:01', '05:48:19', '05:49:10', '05:49:31', '05:49:36', '05:49:39', '05:49:39', '05:49:48', '05:49:52', '05:49:54', '05:50:04', '05:50:07', '05:50:16', '05:50:21', '05:50:35', '05:50:46', '05:50:49', '05:50:49', '05:50:56', '05:51:15', '05:51:26', '05:51:28', '05:51:43', '05:52:27', '05:52:32', '05:52:35', '05:52:45', '05:53:00', '05:53:33', '05:53:37', '05:53:55', '05:53:59', '05:54:14', '05:54:26', '05:54:55', '05:54:59', '05:55:25', '05:55:31', '05:55:39', '05:55:53', '05:55:57', '05:56:02', '05:56:14', '05:56:17', '05:56:29']

Conditional list comprehensions for time-stamped data

  • Add a condition to the list comprehension so that you only select the times in which entry[17:19], the seconds field, is equal to '19'
In [28]:
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

# Print the extracted times
print(tweet_clock_time)
['05:40:19', '05:48:19', '06:02:19', '06:03:19', '04:56:19', '05:40:19', '05:48:19', '06:02:19', '06:03:19', '03:31:19', '03:54:19', '04:23:19']
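
To see why those slice indices work without loading tweets.csv, here is a sketch on a few made-up timestamp strings. The sample values and their layout are assumptions modelled on the output above, not rows from the actual file:

```python
# Hypothetical timestamps; the real 'created_at' values are assumed to share this layout.
sample_times = [
    'Tue Mar 29 23:40:17 +0000 2016',
    'Tue Mar 29 23:40:19 +0000 2016',
    'Tue Mar 29 23:41:19 +0000 2016',
]

# Indices 11 through 18 hold HH:MM:SS.
clock_times = [entry[11:19] for entry in sample_times]

# Indices 17 and 18 hold the seconds; keep only times ending in 19 seconds.
at_19_seconds = [entry[11:19] for entry in sample_times if entry[17:19] == '19']

print(clock_times)
print(at_19_seconds)
```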