How to Remove Duplicates From a Comma-Separated List Using Python itertools

Itertools in Python 3, By Example

What Is Itertools and Why Should You Use It?

According to the itertools docs, it is a “module that implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML… Together, they form an ‘iterator algebra’ making it possible to construct specialized tools succinctly and efficiently in pure Python.”

In simpler terms, itertools provides functions that can manipulate iterators to create more complex iterators. The functions in itertools can be used to compose code that is fast, memory-efficient, and visually appealing.
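As a minimal sketch of what composing iterators looks like in practice, the snippet below stacks three lazy steps on top of an infinite counter. The variable names are illustrative only.

```python
from itertools import count, islice, takewhile

# count() is an infinite iterator; nothing is computed until values are requested.
squares = (n * n for n in count(1))

# takewhile() ends the stream at the first square that is 50 or larger.
small_squares = takewhile(lambda sq: sq < 50, squares)

# islice() keeps only the first five of the remaining values.
first_five = list(islice(small_squares, 5))
print(first_five)  # [1, 4, 9, 16, 25]
```

No intermediate list is built here: each value flows through the whole pipeline only when list() asks for it.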

This article takes a hands-on approach to learning itertools by constructing practical examples. The examples start simple and gradually increase in complexity to encourage you to think iteratively.

Before diving into the examples, it is important to have a solid understanding of iterators and generators in Python 3, as well as concepts such as multiple assignment and tuple unpacking. If you need to brush up on these topics, review them before continuing.

Now let’s dive into the practical examples!

The grouper Recipe

One useful recipe from the itertools documentation is the grouper function, which splits an iterable into groups of a specified length. Here’s an example that demonstrates how to use grouper:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

fruits = ['apple', 'banana', 'cherry', 'date', 'elderberry', 'fig', 'grape']
for group in grouper(fruits, 3, fillvalue=''):
    print(list(group))

This code takes a list of fruits and groups them into chunks of size 3. If the number of fruits is not divisible by 3, the fillvalue parameter specifies what should be used to fill the missing elements in the last group.

The output of this code will be:

['apple', 'banana', 'cherry']
['date', 'elderberry', 'fig']
['grape', '', '']

In this case, the last group has only one fruit, so the fillvalue ('') is used to fill the empty slots.

Using the grouper function, you can easily split any iterable into fixed-size chunks.
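The recipe works because `[iter(iterable)] * n` holds n references to one and the same iterator, so each output tuple is filled with consecutive items. The short sketch below makes that visible with the built-in zip; the variable names are chosen for illustration.

```python
letters = iter("ABCDEF")

# Three references to the *same* iterator object, not three copies.
args = [letters] * 3
same_object = all(arg is letters for arg in args)

# zip() asks each argument for one item in turn; since every argument is
# the same iterator, each tuple receives three consecutive items.
chunks = list(zip(*args))
print(same_object)  # True
print(chunks)       # [('A', 'B', 'C'), ('D', 'E', 'F')]
```

Unlike zip_longest, plain zip would silently drop a trailing partial group, which is why the recipe uses zip_longest with a fillvalue.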

Sequences of Numbers

Another common use case for itertools is generating sequences of numbers. Let’s explore a few examples.

Evens and Odds

To generate a sequence of even numbers, you can use the count and islice functions from itertools:

from itertools import count, islice

def even_numbers():
    return islice(count(start=0, step=2), 10)

print(list(even_numbers()))

This code will generate a list of the first 10 even numbers starting from 0. The output will be:

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Similarly, you can generate a sequence of odd numbers:

from itertools import count, islice

def odd_numbers():
    return islice(count(start=1, step=2), 10)

print(list(odd_numbers()))

The output will be:

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
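The two functions differ only in their start value, so they could be folded into a single helper. The sketch below assumes a hypothetical function named arithmetic_sequence, which is not part of itertools.

```python
from itertools import count, islice

def arithmetic_sequence(start, step, length):
    """Return an iterator over the first `length` terms of start, start + step, ..."""
    return islice(count(start, step), length)

evens = list(arithmetic_sequence(0, 2, 5))
odds = list(arithmetic_sequence(1, 2, 5))
print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]
```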

Recurrence Relations

itertools can also be used to generate sequences based on recurrence relations. For example, let’s say you want to generate the Fibonacci sequence:

from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib_sequence = islice(fibonacci(), 10)
print(list(fib_sequence))

This code will generate the first 10 numbers in the Fibonacci sequence:

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

You can use this technique to generate various other sequences based on recurrence relations.
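As one possible generalization, the generator below handles any recurrence of the form s(n) = p*s(n-1) + q*s(n-2). The name second_order and its parameters are hypothetical, chosen for this sketch.

```python
from itertools import islice

def second_order(p, q, initial):
    """Yield terms of s(n) = p*s(n-1) + q*s(n-2), starting from the pair `initial`."""
    a, b = initial
    while True:
        yield a
        a, b = b, p * b + q * a

# Lucas numbers: same rule as Fibonacci, different starting values.
lucas = list(islice(second_order(1, 1, (2, 1)), 8))
# Pell numbers: s(n) = 2*s(n-1) + s(n-2).
pell = list(islice(second_order(2, 1, (0, 1)), 8))
print(lucas)  # [2, 1, 3, 4, 7, 11, 18, 29]
print(pell)   # [0, 1, 2, 5, 12, 29, 70, 169]
```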

Dealing a Deck of Cards

Another practical example is dealing a deck of cards. itertools provides the product function, which generates the Cartesian product of two or more iterables, that is, every possible pairing of one item from each. Here’s an example that generates and prints a deck of cards:

from itertools import product

suits = ['♠', '♥', '♦', '♣']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
# product yields every (rank, suit) pair, with the suit varying fastest.
deck = list(product(ranks, suits))
for card in deck:
    print(card)

This code will generate and print all 52 cards in a standard deck. Each card is represented as a tuple of the rank and suit. The output will be:

('A', '♠')
('A', '♥')
('A', '♦')
('A', '♣')
('2', '♠')
('2', '♥')
('2', '♦')
('2', '♣')
...
('K', '♠')
('K', '♥')
('K', '♦')
('K', '♣')

Using the product function, you can easily generate every combination of items drawn from multiple iterables.
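Building the deck is only half of dealing it. A minimal sketch of the dealing step, assuming a hypothetical deal helper and an arbitrary fixed shuffle seed, might look like this:

```python
import random
from itertools import islice, product

suits = ['♠', '♥', '♦', '♣']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

def deal(deck, num_hands, hand_size):
    """Deal `num_hands` hands of `hand_size` cards each from the top of `deck`."""
    cards = iter(deck)
    # Each islice call resumes where the previous one stopped, because all
    # of them read from the same iterator.
    return [tuple(islice(cards, hand_size)) for _ in range(num_hands)]

deck = list(product(ranks, suits))
random.Random(0).shuffle(deck)  # fixed seed (an assumption) so repeated runs match
hands = deal(deck, num_hands=4, hand_size=5)
for hand in hands:
    print(' '.join(rank + suit for rank, suit in hand))
```

Because every hand is sliced from one shared iterator, no card can be dealt twice.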

Intermission: Flattening A List of Lists

Sometimes you may encounter a list of lists and want to flatten it into a single list. itertools provides the chain.from_iterable function, which can be used to flatten the list. Here’s an example:

from itertools import chain
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat_list = list(chain.from_iterable(nested_list))
print(flat_list)

This code will flatten the nested_list into a single flat list:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

The chain.from_iterable function takes an iterable of iterables and “chains” them together into a single iterable. By converting the result to a list, you get a flattened list.
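Two properties are worth seeing in code: chain.from_iterable reads the outer iterable lazily, and it removes only one level of nesting. The rows generator below is a made-up data source for this sketch.

```python
from itertools import chain, islice

def rows():
    """Hypothetical endless source of rows: [0, 1], [2, 3], [4, 5], ..."""
    start = 0
    while True:
        yield [start, start + 1]
        start += 2

# chain.from_iterable consumes the outer iterable lazily, so it copes with
# an endless source; chain(*rows()) would never finish unpacking it.
first_seven = list(islice(chain.from_iterable(rows()), 7))
print(first_seven)  # [0, 1, 2, 3, 4, 5, 6]

# Only one level of nesting is removed.
partly_flat = list(chain.from_iterable([[1, [2, 3]], [4]]))
print(partly_flat)  # [1, [2, 3], 4]
```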

Analyzing the S&P500

itertools can also be used for data analysis tasks. Let’s look at an example of analyzing the S&P500 stock market index.

Maximum Gain and Loss

Suppose you have a list of daily closing prices of the S&P500. You can use itertools to calculate the largest single-day gain and loss over a given period. Here’s an example:

from itertools import islice

closing_prices = [2800.71, 2798.36, 2783.02, 2752.06, 2752.10, 2740.69, 2779.60]
# Pair each close with the following day's close: (prev_close, close).
prices_pairs = zip(closing_prices, islice(closing_prices, 1, None))
# Build a list, not a generator, so that both max() and min() can read it.
price_changes = [close - prev_close for prev_close, close in prices_pairs]
max_gain = max(price_changes)
max_loss = min(price_changes)
print(f"Maximum gain: {max_gain:.2f}")
print(f"Maximum loss: {max_loss:.2f}")

This code calculates the largest single-day gain and loss by subtracting the previous day’s closing price from each day’s closing price. The zip function is used with islice to pair each closing price with the one that follows it. The daily changes are collected in a list rather than a generator expression, because a generator would be exhausted by max and leave nothing for min. Finally, the max and min functions find the maximum gain and loss, respectively, and the results are printed to two decimal places.

The output will be:

Maximum gain: 38.91
Maximum loss: -30.96
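Pairing with zip and islice relies on closing_prices being a list that can be read twice. For an arbitrary iterator, one option is a tee-based helper along the lines of the pairwise recipe. The sketch below is one way to write it. Python 3.10 and later also provide itertools.pairwise directly.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one item
    return zip(first, second)

closing_prices = [2800.71, 2798.36, 2783.02, 2752.06, 2752.10, 2740.69, 2779.60]
# iter() simulates a one-shot data source that cannot be read twice.
changes = [
    round(close - prev_close, 2)
    for prev_close, close in pairwise(iter(closing_prices))
]
print(changes)  # [-2.35, -15.34, -30.96, 0.04, -11.41, 38.91]
```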

Longest Growth Streak

Another interesting analysis is finding the longest streak of consecutive days with positive price changes. Here’s an example:

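from itertools import groupby, islice

closing_prices = [2800.71, 2798.36, 2783.02, 2752.06, 2752.10, 2740.69, 2779.60]
price_pairs = zip(closing_prices, islice(closing_prices, 1, None))
price_changes = (close - prev_close for prev_close, close in price_pairs)
# Group consecutive changes by sign: the key is True for a gain, False otherwise.
by_sign = groupby(price_changes, key=lambda change: change > 0)
streaks = (sum(1 for _ in group) for is_gain, group in by_sign if is_gain)
longest_streak = max(streaks, default=0)
print(f"Longest growth streak: {longest_streak}")

This code calculates the longest streak of consecutive days with positive price changes. The groupby function is given a key that tests whether a change is positive, so consecutive changes with the same sign fall into one group; without that key, groupby would only group changes that are exactly equal. The sum function counts the number of elements in each group of gains. Finally, the max function finds the longest streak, falling back to 0 if there are no gains at all.

In this data no two gains fall on consecutive days, so the output will be:

Longest growth streak: 1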
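The same idea can be packaged as a reusable helper. The sketch below assumes a hypothetical function longest_run that takes any predicate, and it uses made-up daily changes so that a multi-day streak actually occurs.

```python
from itertools import groupby

def longest_run(values, predicate):
    """Length of the longest run of consecutive items satisfying `predicate`."""
    runs = (
        sum(1 for _ in group)
        for matches, group in groupby(values, key=predicate)
        if matches
    )
    return max(runs, default=0)

changes = [1.5, 0.2, -0.7, 3.1, 0.4, 2.2, -1.0]  # made-up daily changes
print(longest_run(changes, lambda change: change > 0))  # 3
print(longest_run([], lambda change: change > 0))       # 0
```

Passing the predicate as the groupby key is what makes consecutive gains land in one group regardless of their size.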

Building Relay Teams From Swimmer Data

Let’s now look at an example that involves processing and organizing data. Suppose you have a list of swimmers, each with a name and average swimming time. You want to organize them into relay teams based on their average times. Here’s an example:

from itertools import groupby

swimmers = [
    {'name': 'Alice', 'avg_time': 30.5},
    {'name': 'Bob', 'avg_time': 33.2},
    {'name': 'Charlie', 'avg_time': 31.8},
    {'name': 'Dave', 'avg_time': 29.7},
    {'name': 'Eve', 'avg_time': 32.1},
    {'name': 'Frank', 'avg_time': 28.4},
]
teams = []
by_time = sorted(swimmers, key=lambda x: x['avg_time'])
for _, group in groupby(by_time, key=lambda x: x['avg_time'] // 2):
    teams.append(list(group))
for team in teams:
    print([swimmer['name'] for swimmer in team])

This code organizes the swimmers into relay teams based on their average swimming times. The swimmers are first sorted by their average times. Then, the groupby function groups the sorted swimmers by the floor division of their average time by 2, which places swimmers whose times fall in the same 2-second band (28 to 30 seconds, 30 to 32 seconds, and so on) on the same team. Finally, the names of the swimmers in each team are printed.

The output will be:

['Frank', 'Dave']
['Alice', 'Charlie']
['Eve', 'Bob']

The swimmers with the fastest average times are assigned to the first team, and so on.
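The sorting step matters because groupby only merges items that are adjacent. The sketch below runs the same 2-second banding on the bare times, with and without sorting. The helper name band is an assumption made for this illustration.

```python
from itertools import groupby

times = [30.5, 33.2, 31.8, 29.7, 32.1, 28.4]

def band(avg_time):
    """Bucket a time into a 2-second band using floor division."""
    return avg_time // 2

# Without sorting, neighbouring times never share a band, so every group has one item.
unsorted_groups = [list(group) for _, group in groupby(times, key=band)]
# After sorting, times in the same band sit next to each other and are merged.
sorted_groups = [list(group) for _, group in groupby(sorted(times), key=band)]
print(unsorted_groups)  # [[30.5], [33.2], [31.8], [29.7], [32.1], [28.4]]
print(sorted_groups)    # [[28.4, 29.7], [30.5, 31.8], [32.1, 33.2]]
```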

Where to Go From Here

Congratulations! You have learned some practical examples of how to use itertools in Python. This powerful module can greatly simplify your code and make it more efficient.

If you want to explore more functions from itertools, check out the official documentation. You can also experiment with the examples provided in this article and modify them to suit your needs.

Keep in mind that itertools is just one tool in your Python toolkit. There are many other libraries and techniques available for data manipulation, analysis, and visualization. Continue building your Python skills and exploring new tools to become a proficient Python programmer.

Happy coding!