正文

3.1 pandas 基本介绍

如果把 numpy 比作列表，pandas 可以比作字典

1 2	`import pandas as pd import numpy as np`

创建 pandas 序列

1	`pd.Series([1, 3, 6, np.nan, 44, 1]) # np.nan 相当于 None`

0     1.0
1     3.0
2     6.0
3     NaN
4    44.0
5     1.0
dtype: float64

创建一个日期序列

1 2	`dates = pd.date_range('20160101', periods=6) dates`

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06'],
              dtype='datetime64[ns]', freq='D')

生成数据表格, pd.DataFrame(数据, 索引, 标题列表)

1	`pd.DataFrame(np.random.randn(6, 4), index=dates, columns=['a', 'b', 'c', 'd'])`

	a	b	c	d
2016-01-01	0.014513	0.490584	-1.985363	-1.734158
2016-01-02	1.694216	0.383375	-1.260541	-0.126581
2016-01-03	0.475547	1.050239	0.897093	1.155942
2016-01-04	0.126726	-1.169920	-1.876652	-1.245558
2016-01-05	0.606119	0.648469	-1.367697	-0.822617
2016-01-06	-0.615075	0.557680	-2.104794	0.114070

默认生成的数据表格

1	`pd.DataFrame(np.arange(12).reshape((3, 4)))`

	0	1	2	3
0	0	1	2	3
1	4	5	6	7
2	8	9	10	11

用字典代替要输入的值

df = pd.DataFrame({'A': 1,
                   'B': pd.Timestamp('20130102'),
                   'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                   'D': np.array([3] * 4, dtype='int32'),
                   'E': pd.Categorical(["test", "train", "test", "train"]),
                   'F': 'foo'})
df

	A	B	C	D	E	F
0	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
2	1	2013-01-02	1.0	3	test	foo
3	1	2013-01-02	1.0	3	train	foo

输出表格的类型

1	`df.dtypes`

A             int64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object

输出表格的索引

1	`df.index`

Int64Index([0, 1, 2, 3], dtype='int64')

输出表格的行名

1	`df.columns`

Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')

输出表格每行的数据

1	`df.values`

array([[1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo']],
      dtype=object)

输出表格每行的数据特征

1	`df.describe()`

	A	C	D
count	4.0	4.0	4.0
mean	1.0	1.0	3.0
std	0.0	0.0	0.0
min	1.0	1.0	3.0
25%	1.0	1.0	3.0
50%	1.0	1.0	3.0
75%	1.0	1.0	3.0
max	1.0	1.0	3.0

转置表格

df.T

	0	1	2	3
A	1	1	1	1
B	2013-01-02 00:00:00	2013-01-02 00:00:00	2013-01-02 00:00:00	2013-01-02 00:00:00
C	1.0	1.0	1.0	1.0
D	3	3	3	3
E	test	train	test	train
F	foo	foo	foo	foo

排序表格

1	`df.sort_index(axis=1, ascending=False) # 以行排序，倒序`

	F	E	D	C	B	A
0	foo	test	3	1.0	2013-01-02	1
1	foo	train	3	1.0	2013-01-02	1
2	foo	test	3	1.0	2013-01-02	1
3	foo	train	3	1.0	2013-01-02	1

1	`df.sort_index(axis=0, ascending=False)`

	A	B	C	D	E	F
3	1	2013-01-02	1.0	3	train	foo
2	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
0	1	2013-01-02	1.0	3	test	foo

1	`df.sort_values(by='E')`

	A	B	C	D	E	F
0	1	2013-01-02	1.0	3	test	foo
2	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
3	1	2013-01-02	1.0	3	train	foo

3.2 pandas 选择数据

import pandas as pd
import numpy as np

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])
df

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

选择表格中 A 列的所有数据

df.A

2013-01-01     0
2013-01-02     4
2013-01-03     8
2013-01-04    12
2013-01-05    16
2013-01-06    20
Freq: D, Name: A, dtype: int32

df['A']

2013-01-01     0
2013-01-02     4
2013-01-03     8
2013-01-04    12
2013-01-05    16
2013-01-06    20
Freq: D, Name: A, dtype: int32

选择表格中某些行中所有的数据

df[0:3]

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11

1	`df['20130102':'20130104']`

	A	B	C	D
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15

根据标签选择数据

1	`df.loc['20130102']`

A    4
B    5
C    6
D    7
Name: 2013-01-02 00:00:00, dtype: int32

1	`df.loc[:, ['A', 'B']]`

	A	B
2013-01-01	0	1
2013-01-02	4	5
2013-01-03	8	9
2013-01-04	12	13
2013-01-05	16	17
2013-01-06	20	21

1	`df.loc['20130102', ['A', 'B']]`

A    4
B    5
Name: 2013-01-02 00:00:00, dtype: int32

根据位置选择数据

1	`df.iloc[3]`

A    12
B    13
C    14
D    15
Name: 2013-01-04 00:00:00, dtype: int32

1	`df.iloc[3, 1]`

1	`df.iloc[3:5, 1:3]`

	B	C
2013-01-04	13	14
2013-01-05	17	18

1	`df.iloc[[1, 3, 5], 1:3]`

	B	C
2013-01-02	5	6
2013-01-04	13	14
2013-01-06	21	22

混合选择表格中的数据

1	`df.ix[:3, ['A', 'C']] # 高版本中已不推荐使用`

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

Input In [27], in <cell line: 1>()
----> 1 df.ix[:3, ['A', 'C']]


File ~\anaconda3\lib\site-packages\pandas\core\generic.py:5575, in NDFrame.__getattr__(self, name)
   5568 if (
   5569     name not in self._internal_names_set
   5570     and name not in self._metadata
   5571     and name not in self._accessors
   5572     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5573 ):
   5574     return self[name]
-> 5575 return object.__getattribute__(self, name)


AttributeError: 'DataFrame' object has no attribute 'ix'

是或否的筛选

1	`df[df.A > 8] # 只对比 A，但是 ABCD 都会显示`

	A	B	C	D
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

3.3pandas 设置值

import pandas as pd
import numpy as np

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])

dates

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

df

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

修改表格中的值

df.iloc[2, 2] = 1111
df.loc['20130101', 'B'] = 2222
df[df.A > 4] = 0
df.B[df.A > 4] = 0
df

	A	B	C	D
2013-01-01	0	2222	2	3
2013-01-02	4	5	6	7
2013-01-03	0	0	1111	0
2013-01-04	0	0	0	0
2013-01-05	0	0	0	0
2013-01-06	0	0	0	0

给表格中添加空的行

1 2	`df['F'] = np.nan df`

	A	B	C	D	F
2013-01-01	0	2222	2	3	NaN
2013-01-02	4	5	6	7	NaN
2013-01-03	0	0	1111	0	NaN
2013-01-04	0	0	0	0	NaN
2013-01-05	0	0	0	0	NaN
2013-01-06	0	0	0	0	NaN

1 2	`df['E'] = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range('20130101', periods=6)) df`

	A	B	C	D	F	E
2013-01-01	0	2222	2	3	NaN	1
2013-01-02	4	5	6	7	NaN	2
2013-01-03	0	0	1111	0	NaN	3
2013-01-04	0	0	0	0	NaN	4
2013-01-05	0	0	0	0	NaN	5
2013-01-06	0	0	0	0	NaN	6

3.4 pandas 处理丢失数据

import pandas as pd
import numpy as np

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])
df.iloc[0, 1] = np.nan
df.iloc[1, 2] = np.nan
df

	A	B	C	D
2013-01-01	0	NaN	2.0	3
2013-01-02	4	5.0	NaN	7
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

丢弃 nan 数据

1	`df.dropna(axis=0, how='any') # how = {'any', 'all'} 出现 nan/所有数据都是 nan 是才处理`

	A	B	C	D
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

将 nan 数据更改为其他值

1	`df.fillna(value=0)`

	A	B	C	D
2013-01-01	0	0.0	2.0	3
2013-01-02	4	5.0	0.0	7
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

判断表格中是否有缺失数据

1	`df.isna()`

	A	B	C	D
2013-01-01	False	True	False	False
2013-01-02	False	False	True	False
2013-01-03	False	False	False	False
2013-01-04	False	False	False	False
2013-01-05	False	False	False	False
2013-01-06	False	False	False	False

1	`df.isnull()`

	A	B	C	D
2013-01-01	False	True	False	False
2013-01-02	False	False	True	False
2013-01-03	False	False	False	False
2013-01-04	False	False	False	False
2013-01-05	False	False	False	False
2013-01-06	False	False	False	False

1	`np.any(df.isnull() is True) # 若返回 True，则说明表格中的确缺失数据`

False

3.5 3D 数据

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()  # 新建一个 figure 窗口
ax = Axes3D(fig, auto_add_to_figure=False)  # 添加三维坐标轴
fig.add_axes(ax)

# 输入数据
X = np.arange(-4, 4, 0.25)
Y = np.arange(-4, 4, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X ** 2 + Y ** 2)
Z = np.sin(R)

ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=plt.get_cmap('rainbow'))
ax.contourf(X, Y, Z, zdir='z', offset=-2, cmap='rainbow')
ax.set_zlim(-2, 2)

plt.show()

png

3.6 pandas 合并 concat

1 2	`import pandas as pd import numpy as np`

concatenating

1
2
3

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['a', 'b', 'c', 'd'])
df3 = pd.DataFrame(np.ones((3, 4)) * 2, columns=['a', 'b', 'c', 'd'])

df1

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0

df2

	a	b	c	d
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0

df3

	a	b	c	d
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

上下合并

1	`pd.concat([df1, df2, df3], axis=0) # axis=0 竖向划分操作行，1 横向划分操作列`

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

1	`pd.concat([df1, df2, df3], axis=0, ignore_index=True) # 重新排序索引`

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

join, [‘inner’, ‘outer’]

1
2
3

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'])
df1

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0

df2

	b	c	d	e
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0

直接合并出现的结果，会用 nan 填充

1	`pd.concat([df1, df2], join='outer')`

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

1	`pd.concat([df1, df2], join='inner') # 只会合并相同的部分`

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

1	`pd.concat([df1, df2], join='inner', ignore_index=True)`

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

按照索引进行合并

1	`pd.concat([df1, df2], axis=1, join_axes=[df1.index])`

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

Input In [23], in <cell line: 1>()
----> 1 pd.concat([df1, df2], axis=1, join_axes=[df1.index])


File ~\anaconda3\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    305 if len(args) > num_allow_args:
    306     warnings.warn(
    307msg.format(arguments=arguments),
    308         FutureWarning,
    309stacklevel=stacklevel,
    310     )
--> 311 return func(*args, **kwargs)


TypeError: concat() got an unexpected keyword argument 'join_axes'

append 将被移除，不推荐使用

1	`df1.append([df2, df3])`

C:\Users\gzjzx\AppData\Local\Temp\ipykernel_10380\266511466.py:1: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  df1.append([df2, df3])

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

3.7pandas 合并 merge

import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A':   ['A0', 'A1', 'A2', 'A3'],
                     'B':   ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                      'C':   ['C0', 'C1', 'C2', 'C3'],
                      'D':   ['D0', 'D1', 'D2', 'D3']})

left

	key	A	B
0	K0	A0	B0
1	K1	A1	B1
2	K2	A2	B2
3	K3	A3	B3

right

	key	C	D
0	K0	C0	D0
1	K1	C1	D1
2	K2	C2	D2
3	K3	C3	D3

1	`pd.merge(left, right, on='key')`

	key	A	B	C	D
0	K0	A0	B0	C0	D0
1	K1	A1	B1	C1	D1
2	K2	A2	B2	C2	D2
3	K3	A3	B3	C3	D3

在有两个 key 的情况下

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A':   ['A0', 'A1', 'A2', 'A3'],
                     'B':   ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C':   ['C0', 'C1', 'C2', 'C3'],
                      'D':   ['D0', 'D1', 'D2', 'D3']})
left

	key1	key2	A	B
0	K0	K0	A0	B0
1	K0	K1	A1	B1
2	K1	K0	A2	B2
3	K2	K1	A3	B3

right

	key1	key2	C	D
0	K0	K0	C0	D0
1	K1	K0	C1	D1
2	K1	K0	C2	D2
3	K2	K0	C3	D3

1	`pd.merge(left, right, on=['key1', 'key2'], how='inner')`

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K1	K0	A2	B2	C1	D1
2	K1	K0	A2	B2	C2	D2

1	`pd.merge(left, right, on=['key1', 'key2'], how='outer')`

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

1	`pd.merge(left, right, on=['key1', 'key2'], how='left')`

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

1	`pd.merge(left, right, on=['key1', 'key2'], how='right', indicator=True) # indicator 显示合并的方式`

	key1	key2	A	B	C	D	_merge
0	K0	K0	A0	B0	C0	D0	both
1	K1	K0	A2	B2	C1	D1	both
2	K1	K0	A2	B2	C2	D2	both
3	K2	K0	NaN	NaN	C3	D3	right_only

1	`pd.merge(left, right, on=['key1', 'key2'], how='right', indicator="indicator_column") # indicator 显示合并的方式`

	key1	key2	A	B	C	D	indicator_column
0	K0	K0	A0	B0	C0	D0	both
1	K1	K0	A2	B2	C1	D1	both
2	K1	K0	A2	B2	C2	D2	both
3	K2	K0	NaN	NaN	C3	D3	right_only

1	`pd.merge(left, right, left_index=True, right_index=True, how='outer')`

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

处理合并造成的重复问题

1
2
3

boys = pd.DataFrame({'k': ['K0', 'K1', 'K2'], 'age': [1, 2, 3]})
girls = pd.DataFrame({'k': ['K0', 'K1', 'K2'], 'age': [4, 5, 6]})
boys

	k	age
0	K0	1
1	K1	2
2	K2	3

girls

	k	age
0	K0	4
1	K1	5
2	K2	6

1	`pd.merge(boys, girls, on='k', suffixes=['_boy', '_girl'], how='inner')`

	k	age_boy	age_girl
0	K0	1	4
1	K1	2	5
2	K2	3	6

3.8 pandas plot 画图

1
2
3

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

0       0.682404
1       1.833947
2       2.639884
3       1.843226
4       3.675730
         ...    
995    72.472419
996    70.507220
997    70.897704
998    71.912970
999    71.542112
Length: 1000, dtype: float64

Series 折线图

1 2	`data = pd.Series(np.random.randn(1000), index=np.arange(1000)).cumsum() data`

0      -1.287330
1      -1.293898
2      -0.166798
3      -1.177414
4      -1.890751
         ...    
995    10.204522
996    10.308410
997     7.887580
998     8.327458
999     9.105820
Length: 1000, dtype: float64

1 2	`data.plot() plt.show()`

png

DataFrame 折线图

1 2	`data = pd.DataFrame(np.random.randn(1000, 4), index=np.arange(1000), columns=list('ABCD')).cumsum() data`

	A	B	C	D
0	0.020454	-1.172689	1.843716	1.544707
1	-0.097744	0.989506	1.068073	2.404901
2	-0.273196	0.758510	-1.905761	3.836764
3	0.401259	1.408281	-2.043875	3.762166
4	1.821346	0.941836	-1.549852	4.347704
...	...	...	...	...
995	-4.775217	-18.948179	-47.410079	-53.719370
996	-3.830409	-19.661865	-46.040664	-54.201339
997	-3.061518	-20.368645	-46.068266	-54.682348
998	-3.447858	-21.113475	-46.808006	-54.889304
999	-2.270221	-21.544267	-46.512128	-54.953465

1000 rows × 4 columns

1 2	`data.plot() plt.show()`

png

plot 的方法：bar 条形图，hist，box，kde，area，scatter，hexbin，pie……

1
2
3

ax = data.plot.scatter(x='A', y='B', color='DarkBlue', label='Class1')
data.plot.scatter(x='A', y='C', color='DarkGreen', label='Class2', ax=ax)
plt.show()

png

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3