---
jupyter:
  jupytext:
    text_representation:
      extension: .Rmd
      format_name: rmarkdown
      format_version: '1.1'
      jupytext_version: 1.1.1
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
---

```{python nbsphinx=hidden}
import pandas as pd
pd.set_option("display.max_rows", 20)
```

## Nest

```{python}
from siuba import _, nest, unnest, group_by
from gapminder import gapminder
```

### Specifying column to exclude

```{python}
gap_country = nest(gapminder, -_.country)
gap_country
```

```{python}
# unnest is its inverse (except for some sorting!)
unnest(gap_country, "data")
```

### Specifying column to include

```{python}
# specifying columns to nest directly
df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b'],
    'value': [1,2,3,4]
})

df >> nest(_.value)
```

### Group by and nesting

```{python}
# equivalent to
# gapminder >> nest(-_.country, -_.continent)

(gapminder
  >> group_by(_.country, _.continent)
  >> nest()
  )
```

### Unnesting lists


For context, see [this Stack Overflow post](https://stackoverflow.com/questions/30885005/pandas-series-of-lists-to-one-series).

```{python}
from siuba import _, unnest, mutate

sent = pd.DataFrame({
    'id': ['1', '2'],
    'sentence': ['a b c d e', 'x y z']
})

sent
```

```{python}
split_sent = sent >> mutate(data = _.sentence.str.split(" "))

split_sent
```

```{python}
split_sent >> unnest()
```