# Python Concatenate Lists

## Concatenate Two Lists in Python

Problem: You have two lists and you’d like to join them into a new list. Solution:

``````Python 3.8.2
>>> one = ["one","two", "three"]
>>> two = ["four","five"]
>>> one + two
['one', 'two', 'three', 'four', 'five']``````

### 📢 TLDR: Use `+`

In almost all simple situations, using `list1 + list2` is the way you want to concatenate lists.

The edge cases below are better in some situations, but `+` is generally the best choice. All options covered work in Python 2.3, Python 2.7, and all versions of Python 3¹ (the one exception is `chain.from_iterable`, which requires Python 2.6 or later).
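As a small illustration of what `+` does, the sketch below shows that it builds a new list and leaves both operands untouched. The name `combined` is only for illustration.

```python
one = ["one", "two", "three"]
two = ["four", "five"]

combined = one + two  # builds a brand-new list object

# The operands are unchanged, and the result is a separate list.
print(combined)         # ['one', 'two', 'three', 'four', 'five']
print(one)              # ['one', 'two', 'three']
print(combined is one)  # False
```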

## Combine Lists In Place In Python

Problem: You have a huge list, and you want to add a smaller list on the end while minimizing memory usage.

In this case, it may be best to append to the existing list, reusing it rather than creating a new one.

``````>>> longlist = ["one","two", "three"] * 1000
>>> shortlist = ["four","five"]
>>> longlist.extend(shortlist)
>>> longlist
['one', 'two', 'three', 'one', ..., 'four', 'five']``````

As with any optimization, you should verify that this reduces memory thrash in your specific case and stick to the simple idiomatic `x + y` otherwise.
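One way to check the memory side is the standard library's `tracemalloc` module. The sketch below assumes CPython; `peak_growth` is a hypothetical helper, and the list size is arbitrary. It reports how far peak traced memory rises above the starting point while an operation runs, so treat the numbers as a rough guide rather than a precise measurement.

```python
import tracemalloc

def peak_growth(operation, size=100_000):
    """Return how many bytes peak traced memory grew while running operation(x, y)."""
    tracemalloc.start()
    try:
        # Build the lists while tracing so their allocations are tracked.
        x = ["one", "two", "three"] * size
        y = ["four", "five"]
        before, _ = tracemalloc.get_traced_memory()
        operation(x, y)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak - before

concat_growth = peak_growth(lambda x, y: x + y)        # allocates a second large list
extend_growth = peak_growth(lambda x, y: x.extend(y))  # grows the existing list

print(f"x + y       peak growth: {concat_growth} bytes")
print(f"x.extend(y) peak growth: {extend_growth} bytes")
```

If the two figures are close for your data, the simpler `x + y` is probably the better choice.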

Let’s use the `timeit` module to check some performance numbers.

``````# Performance Check
>>> import timeit
>>> setup = """\
x = ["one","two","three"] * 1000
y = ["four","five","six"]
"""
# x + y with large x
>>> timeit.timeit('x + y', setup=setup, number=1000000)
3.6260274310000113
# x.extend(y) with large x
>>> timeit.timeit('x.extend(y)', setup=setup, number=1000000)
0.06857255800002804``````

In this example, where `x` has 3,000 elements, `extend` is around 50x faster.
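The speedup comes with a trade-off: `extend` changes the existing list instead of producing a new one, and it returns `None`. A minimal sketch of that behavior, where `alias` is just an illustrative second name for the same list:

```python
longlist = ["one", "two", "three"] * 1000
shortlist = ["four", "five"]
alias = longlist  # second name for the same list object

result = longlist.extend(shortlist)  # mutates longlist in place

print(result)             # None
print(len(longlist))      # 3002
print(alias is longlist)  # True: no new list was created
print(alias[-2:])         # ['four', 'five']
```

Any other code holding a reference to the list will see the added items, so reach for `extend` only when that is what you want.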

### ❗ Concatenating Lists With Huge Elements is Fine

If the elements in your list are huge (million character strings), but the list size is less than a thousand elements, the previous solution `x + y` will work just fine. This is because Python stores references to the values in the list, not the values themselves. Thus, the element size makes no difference to the runtime complexity.

``````>>> x = ["one" * 1000, "two" * 1000, "three" * 1000]
>>> y = ["four" * 1000, "five" * 1000]
>>> #This is fine
>>> z = x + y
>>> # Performance testing (extend has no advantage when the lists are short)
>>> setup = """\
x = ["one" * 1000, "two" * 1000, "three" * 1000]
y = ["four" * 1000, "five" * 1000]
"""
>>> timeit.timeit('x + y', setup=setup, number=1000000)
0.05397573999994165
>>> timeit.timeit('x.extend(y)', setup=setup, number=1000000)
0.06511967799997365``````

In this case, `extend` does not have an advantage.
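The point about references can be checked directly. In this sketch (the string sizes are arbitrary), the concatenated list contains the very same string objects as the originals, so no string data is copied:

```python
x = ["one" * 1_000_000, "two" * 1_000_000]
y = ["four" * 1_000_000]

z = x + y

# z holds references to the same string objects as x and y.
print(z[0] is x[0])  # True
print(z[2] is y[0])  # True
print(len(z))        # 3
```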

### Avoid Chain From `itertools` For Two Lists

It is possible to use `chain` from `itertools` to create an iterable of two lists.

``````>>> longlist = ["one","two", "three"] * 1000
>>> shortlist = ["four","five"]
>>> from itertools import chain
>>> z = list(chain(longlist, shortlist))
>>> z
['one', 'two', 'three', 'one', ..., 'four', 'five']``````

We can check the performance of using chain:

``````>>> setup = """\
from itertools import chain
x = ["one","two","three"] * 1000
y = ["four","five","six"]
"""
# x.extend(y) with large x
>>> timeit.timeit('x.extend(y)', setup=setup, number=1000000)
0.06857255800002804
# list(chain(x, y)) with large x
>>> timeit.timeit('list(chain(x, y))', setup=setup, number=1000000)
16.810488051999982``````

Using `chain` with two lists is slower in all cases tested, and `x + y` is easier to understand.
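For completeness, here is a short sketch of what `chain` returns before `list()` is applied: a lazy iterator that walks both lists without copying them, and that can only be consumed once. This does not change the recommendation above; it only shows the behavior.

```python
from itertools import chain

longlist = ["one", "two", "three"] * 1000
shortlist = ["four", "five"]

# chain returns a lazy iterator; nothing is copied until you iterate.
combined = chain(longlist, shortlist)

count = 0
for item in combined:
    count += 1

print(count)           # 3002
print(list(combined))  # []: the iterator is already exhausted
```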

## Combining N Lists in Python

If you need to add three or even ten lists together and the lists are statically known, then concatenating with `+` works great.

``````>>> one = ["one","two", "three"]
>>> two = ["four","five"]
>>> three = []
>>> z = one + two + three``````

## Flatten a List of Lists in Python

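However, if the number of lists is dynamic and unknown until runtime, `chain` from `itertools` becomes a great option. `chain.from_iterable` takes a list of lists (or any iterable of iterables) and flattens it by one level into a single iterator, which `list()` then turns into a list.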

``````>>> from itertools import chain
>>> l = [["one","two", "three"],["four","five"],[]] * 99
>>> l
[['one', 'two', 'three'], ['four', 'five'], [], ...
>>> list(chain.from_iterable(l))
['one', 'two', 'three', 'four', 'five', 'one', 'two', ... ]``````

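`chain` accepts any iterable, making it an excellent choice for combining lists, sets, dictionaries, and other iterable structures. Keep in mind that iterating over a dictionary yields its keys, and that sets have no guaranteed order.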

``````>>> from itertools import chain
>>> one = [1,2,3]
>>> two = {1,2,3}
>>> list(chain(one, two, one))
[1, 2, 3, 1, 2, 3, 1, 2, 3]``````
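The same idea applies to dictionaries. In this sketch the `ages` dictionary is made-up data; chaining the dictionary itself gives its keys, and chaining `ages.values()` gives its values.

```python
from itertools import chain

numbers = [1, 2, 3]
ages = {"alice": 30, "bob": 25}  # hypothetical data for illustration

keys_combined = list(chain(numbers, ages))             # iterating a dict yields keys
values_combined = list(chain(numbers, ages.values()))  # ask for values explicitly

print(keys_combined)    # [1, 2, 3, 'alice', 'bob']
print(values_combined)  # [1, 2, 3, 30, 25]
```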

### Performance of Flattening a List of Lists

Performance doesn’t always matter, but readability always does, and the chain method is a straightforward way to combine lists of lists. That said, let’s put readability aside for a moment and try to find the fastest way to flatten lists.

One option is iterating ourselves:

``````result = []
for nestedlist in l:
    result.extend(nestedlist)``````
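If you want to reuse the loop, it can be wrapped in a small function. `flatten_with_extend` is a hypothetical name chosen for this sketch, which also checks that the result matches `chain.from_iterable`.

```python
from itertools import chain

def flatten_with_extend(list_of_lists):
    """Flatten one level of nesting using a manual loop and extend."""
    result = []
    for nested in list_of_lists:
        result.extend(nested)
    return result

l = [["one", "two", "three"], ["four", "five"], []] * 2
flat = flatten_with_extend(l)

print(flat[:5])                              # ['one', 'two', 'three', 'four', 'five']
print(flat == list(chain.from_iterable(l)))  # True
```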

Let’s check its performance vs chain:

``````>>> setup = """\
from itertools import chain
l = [["one","two", "three"],["four","five"],[]] * 99
"""
>>> # Add Nested Lists using chain.from_iterable
>>> timeit.timeit('list(chain.from_iterable(l))', setup=setup, number=100000)
1.0384087909997106
>>> ### Add using our own iteration
>>> run = """\
result = []
for nestedlist in l:
    result.extend(nestedlist)
"""
>>> timeit.timeit(run, setup=setup, number=100000)
1.8619721710001613``````

This shows that `chain.from_iterable` is faster than our own loop with `extend`: roughly 1.0 seconds versus 1.9 seconds for 100,000 runs.
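To rerun this comparison as a script instead of in the REPL, something like the following sketch should work. The function names and the smaller `number` and `repeat` values are choices made here to keep the run short; your timings will differ by machine and Python version.

```python
import timeit
from itertools import chain

l = [["one", "two", "three"], ["four", "five"], []] * 99

def with_chain():
    return list(chain.from_iterable(l))

def with_extend():
    result = []
    for nested in l:
        result.extend(nested)
    return result

# Take the best of several repeats to reduce noise from other processes.
chain_time = min(timeit.repeat(with_chain, number=1000, repeat=3))
extend_time = min(timeit.repeat(with_extend, number=1000, repeat=3))

print(f"chain.from_iterable: {chain_time:.4f}s")
print(f"extend loop:         {extend_time:.4f}s")
```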

## Flattening and Merging Lists With One Big List

What about adding a list of lists to an existing, large list? We saw that `extend` can be faster with two lists when one is significantly longer than the other, so let’s test the performance of `extend` with N lists.

First, we use our standard `chain.from_iterable`.

``````>>> # Method 1 - chain.from_iterable
>>> longlist = ["one","two", "three"] * 1000
>>> nestedlist = [longlist, ["one","two", "three"],["four","five"],[]]
>>> list(chain.from_iterable(nestedlist))``````

We then test its performance:

``````>>> setup = """\
from itertools import chain
longlist = ["one","two", "three"] * 1000
combinedlist = [longlist, ["one","two", "three"],["four","five"],[]]
"""
>>> timeit.timeit('list(chain.from_iterable(combinedlist))', setup=setup, number=100000)
1.8676087710009597``````

Next, let’s try concatenating by adding everything onto the long list:

``````>>> # Method 2 - extend
>>> longlist = ["one","two", "three"] * 1000
>>> nestedlist = [["one","two", "three"],["four","five"],[]]
>>> for item in nestedlist:
...     longlist.extend(item)``````

Performance Test:

``````>>> setup = """\
from itertools import chain
longlist = ["one","two", "three"] * 1000
nestedlist = [["one","two", "three"],["four","five"],[]]
"""
>>> run = """\
for item in nestedlist:
    longlist.extend(item)
"""
>>> timeit.timeit(run, setup=setup, number=100000)
0.02403609199973289``````

There we go: `extend` is much faster when flattening lists or concatenating many lists with one long list, because it only appends the new items instead of copying the long list. If you encounter this situation, using `extend` to add the smaller lists to the long list reduces the work that has to be done and improves performance.
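As a sanity check, this sketch confirms that both methods produce the same items. The variable `via_chain` is introduced here for illustration. Note that the `extend` version modifies `longlist` itself, while the `chain` version builds a separate list.

```python
from itertools import chain

longlist = ["one", "two", "three"] * 1000
nestedlist = [["one", "two", "three"], ["four", "five"], []]

# Method 1: builds a new list, copying all 3000 existing items plus the new ones.
via_chain = list(chain.from_iterable([longlist] + nestedlist))

# Method 2: only the 5 new items are appended to the existing list.
for item in nestedlist:
    longlist.extend(item)

print(via_chain == longlist)  # True
print(len(longlist))          # 3005
```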

## Summary

These are the main variants of combining lists in Python. Use this table to guide you in the future.


| Condition | Solution | Performance Optimization² |
| --- | --- | --- |
| 2 lists | `x + y` | No |
| 1 large list, 1 small list | `x.extend(y)` | Yes |
| Known number of N lists | `x + y + z` | No |
| Unknown number of N lists | `list(chain.from_iterable(l))` | No |
| List of Lists | `list(chain.from_iterable(l))` | No |
| 1 large list, many small lists | `for l1 in l: x.extend(...)` | Yes |

1. I did all the performance testing using Python 3.9.5 on macOS Big Sur.

2. If you don’t have a performance bottleneck, clarity trumps performance, and you should ignore the performance suggestions.
