正文

3.1 pandas 基本介绍

如果把 numpy 比作列表，pandas 可以比作字典

python

import pandas as pd
import numpy as np

创建 pandas 序列

python

pd.Series([1, 3, 6, np.nan, 44, 1]) # np.nan 相当于 None

0     1.0
1     3.0
2     6.0
3     NaN
4    44.0
5     1.0
dtype: float64

创建一个日期序列

python

dates = pd.date_range('20160101', periods=6)
dates

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06'],
              dtype='datetime64[ns]', freq='D')

生成数据表格, pd.DataFrame(数据, 索引, 标题列表)

python

pd.DataFrame(np.random.randn(6, 4), index=dates, columns=['a', 'b', 'c', 'd'])

	a	b	c	d
2016-01-01	0.014513	0.490584	-1.985363	-1.734158
2016-01-02	1.694216	0.383375	-1.260541	-0.126581
2016-01-03	0.475547	1.050239	0.897093	1.155942
2016-01-04	0.126726	-1.169920	-1.876652	-1.245558
2016-01-05	0.606119	0.648469	-1.367697	-0.822617
2016-01-06	-0.615075	0.557680	-2.104794	0.114070

默认生成的数据表格

python

pd.DataFrame(np.arange(12).reshape((3, 4)))

	0	1	2	3
0	0	1	2	3
1	4	5	6	7
2	8	9	10	11

用字典代替要输入的值

python

df = pd.DataFrame({'A': 1,
                   'B': pd.Timestamp('20130102'),
                   'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                   'D': np.array([3] * 4, dtype='int32'),
                   'E': pd.Categorical(["test", "train", "test", "train"]),
                   'F': 'foo'})
df

	A	B	C	D	E	F
0	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
2	1	2013-01-02	1.0	3	test	foo
3	1	2013-01-02	1.0	3	train	foo

输出表格的类型

python

df.dtypes

A             int64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object

输出表格的索引

python

df.index

Int64Index([0, 1, 2, 3], dtype='int64')

输出表格的行名

python

df.columns

Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')

输出表格每行的数据

python

df.values

array([[1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'test', 'foo'],
       [1, Timestamp('2013-01-02 00:00:00'), 1.0, 3, 'train', 'foo']],
      dtype=object)

输出表格每行的数据特征

python

df.describe()

	A	C	D
count	4.0	4.0	4.0
mean	1.0	1.0	3.0
std	0.0	0.0	0.0
min	1.0	1.0	3.0
25%	1.0	1.0	3.0
50%	1.0	1.0	3.0
75%	1.0	1.0	3.0
max	1.0	1.0	3.0

转置表格

python

df.T

	0	1	2	3
A	1	1	1	1
B	2013-01-02 00:00:00	2013-01-02 00:00:00	2013-01-02 00:00:00	2013-01-02 00:00:00
C	1.0	1.0	1.0	1.0
D	3	3	3	3
E	test	train	test	train
F	foo	foo	foo	foo

排序表格

python

df.sort_index(axis=1, ascending=False) # 以行排序，倒序

	F	E	D	C	B	A
0	foo	test	3	1.0	2013-01-02	1
1	foo	train	3	1.0	2013-01-02	1
2	foo	test	3	1.0	2013-01-02	1
3	foo	train	3	1.0	2013-01-02	1

python

df.sort_index(axis=0, ascending=False)

	A	B	C	D	E	F
3	1	2013-01-02	1.0	3	train	foo
2	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
0	1	2013-01-02	1.0	3	test	foo

python

df.sort_values(by='E')

	A	B	C	D	E	F
0	1	2013-01-02	1.0	3	test	foo
2	1	2013-01-02	1.0	3	test	foo
1	1	2013-01-02	1.0	3	train	foo
3	1	2013-01-02	1.0	3	train	foo

3.2 pandas 选择数据

python

import pandas as pd
import numpy as np
 
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])
df

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

选择表格中 A 列的所有数据

python

df.A

2013-01-01     0
2013-01-02     4
2013-01-03     8
2013-01-04    12
2013-01-05    16
2013-01-06    20
Freq: D, Name: A, dtype: int32

python

df['A']

2013-01-01     0
2013-01-02     4
2013-01-03     8
2013-01-04    12
2013-01-05    16
2013-01-06    20
Freq: D, Name: A, dtype: int32

选择表格中某些行中所有的数据

python

df[0:3]

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11

python

df['20130102':'20130104']

	A	B	C	D
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15

根据标签选择数据

python

df.loc['20130102']

A    4
B    5
C    6
D    7
Name: 2013-01-02 00:00:00, dtype: int32

python

df.loc[:, ['A', 'B']]

	A	B
2013-01-01	0	1
2013-01-02	4	5
2013-01-03	8	9
2013-01-04	12	13
2013-01-05	16	17
2013-01-06	20	21

python

df.loc['20130102', ['A', 'B']]

A    4
B    5
Name: 2013-01-02 00:00:00, dtype: int32

根据位置选择数据

python

df.iloc[3]

A    12
B    13
C    14
D    15
Name: 2013-01-04 00:00:00, dtype: int32

python

df.iloc[3, 1]

python

df.iloc[3:5, 1:3]

	B	C
2013-01-04	13	14
2013-01-05	17	18

python

df.iloc[[1, 3, 5], 1:3]

	B	C
2013-01-02	5	6
2013-01-04	13	14
2013-01-06	21	22

混合选择表格中的数据

python

df.ix[:3, ['A', 'C']] # 高版本中已不推荐使用

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

Input In [27], in <cell line: 1>()
----> 1 df.ix[:3, ['A', 'C']]


File ~\anaconda3\lib\site-packages\pandas\core\generic.py:5575, in NDFrame.__getattr__(self, name)
   5568 if (
   5569     name not in self._internal_names_set
   5570     and name not in self._metadata
   5571     and name not in self._accessors
   5572     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5573 ):
   5574     return self[name]
-> 5575 return object.__getattribute__(self, name)


AttributeError: 'DataFrame' object has no attribute 'ix'

是或否的筛选

python

df[df.A > 8] # 只对比 A，但是 ABCD 都会显示

	A	B	C	D
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

3.3pandas 设置值

python

import pandas as pd
import numpy as np
 
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])

python

dates

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

python

df

	A	B	C	D
2013-01-01	0	1	2	3
2013-01-02	4	5	6	7
2013-01-03	8	9	10	11
2013-01-04	12	13	14	15
2013-01-05	16	17	18	19
2013-01-06	20	21	22	23

修改表格中的值

python

df.iloc[2, 2] = 1111
df.loc['20130101', 'B'] = 2222
df[df.A > 4] = 0
df.B[df.A > 4] = 0
df

	A	B	C	D
2013-01-01	0	2222	2	3
2013-01-02	4	5	6	7
2013-01-03	0	0	1111	0
2013-01-04	0	0	0	0
2013-01-05	0	0	0	0
2013-01-06	0	0	0	0

给表格中添加空的行

python

df['F'] = np.nan
df

	A	B	C	D	F
2013-01-01	0	2222	2	3	NaN
2013-01-02	4	5	6	7	NaN
2013-01-03	0	0	1111	0	NaN
2013-01-04	0	0	0	0	NaN
2013-01-05	0	0	0	0	NaN
2013-01-06	0	0	0	0	NaN

python

df['E'] = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range('20130101', periods=6))
df

	A	B	C	D	F	E
2013-01-01	0	2222	2	3	NaN	1
2013-01-02	4	5	6	7	NaN	2
2013-01-03	0	0	1111	0	NaN	3
2013-01-04	0	0	0	0	NaN	4
2013-01-05	0	0	0	0	NaN	5
2013-01-06	0	0	0	0	NaN	6

3.4 pandas 处理丢失数据

python

import pandas as pd
import numpy as np
 
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['A', 'B', 'C', 'D'])
df.iloc[0, 1] = np.nan
df.iloc[1, 2] = np.nan
df

	A	B	C	D
2013-01-01	0	NaN	2.0	3
2013-01-02	4	5.0	NaN	7
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

丢弃 nan 数据

python

df.dropna(axis=0, how='any') # how = {'any', 'all'} 出现 nan/所有数据都是 nan 是才处理

	A	B	C	D
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

将 nan 数据更改为其他值

python

df.fillna(value=0)

	A	B	C	D
2013-01-01	0	0.0	2.0	3
2013-01-02	4	5.0	0.0	7
2013-01-03	8	9.0	10.0	11
2013-01-04	12	13.0	14.0	15
2013-01-05	16	17.0	18.0	19
2013-01-06	20	21.0	22.0	23

判断表格中是否有缺失数据

python

df.isna()

	A	B	C	D
2013-01-01	False	True	False	False
2013-01-02	False	False	True	False
2013-01-03	False	False	False	False
2013-01-04	False	False	False	False
2013-01-05	False	False	False	False
2013-01-06	False	False	False	False

python

df.isnull()

	A	B	C	D
2013-01-01	False	True	False	False
2013-01-02	False	False	True	False
2013-01-03	False	False	False	False
2013-01-04	False	False	False	False
2013-01-05	False	False	False	False
2013-01-06	False	False	False	False

python

np.any(df.isnull() is True) # 若返回 True，则说明表格中的确缺失数据

False

3.5 3D 数据

python

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
 
fig = plt.figure()  # 新建一个 figure 窗口
ax = Axes3D(fig, auto_add_to_figure=False)  # 添加三维坐标轴
fig.add_axes(ax)
 
# 输入数据
X = np.arange(-4, 4, 0.25)
Y = np.arange(-4, 4, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X ** 2 + Y ** 2)
Z = np.sin(R)
 
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=plt.get_cmap('rainbow'))
ax.contourf(X, Y, Z, zdir='z', offset=-2, cmap='rainbow')
ax.set_zlim(-2, 2)
 
plt.show()

3.6 pandas 合并 concat

python

import pandas as pd 
import numpy as np

concatenating

python

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['a', 'b', 'c', 'd'])
df3 = pd.DataFrame(np.ones((3, 4)) * 2, columns=['a', 'b', 'c', 'd'])

python

df1

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0

python

df2

	a	b	c	d
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0

python

df3

	a	b	c	d
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

上下合并

python

pd.concat([df1, df2, df3], axis=0) # axis=0 竖向划分操作行，1 横向划分操作列

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

python

pd.concat([df1, df2, df3], axis=0, ignore_index=True) # 重新排序索引

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

join, ['inner', 'outer']

python

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'])
df1

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0

python

df2

	b	c	d	e
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0

直接合并出现的结果，会用 nan 填充

python

pd.concat([df1, df2], join='outer')

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

python

pd.concat([df1, df2], join='inner') # 只会合并相同的部分

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

python

pd.concat([df1, df2], join='inner', ignore_index=True)

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

按照索引进行合并

python

pd.concat([df1, df2], axis=1, join_axes=[df1.index])

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

Input In [23], in <cell line: 1>()
----> 1 pd.concat([df1, df2], axis=1, join_axes=[df1.index])


File ~\anaconda3\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    305 if len(args) > num_allow_args:
    306     warnings.warn(
    307msg.format(arguments=arguments),
    308         FutureWarning,
    309stacklevel=stacklevel,
    310     )
--> 311 return func(*args, **kwargs)


TypeError: concat() got an unexpected keyword argument 'join_axes'

append 将被移除，不推荐使用

python

df1.append([df2, df3])

C:\Users\gzjzx\AppData\Local\Temp\ipykernel_10380\266511466.py:1: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  df1.append([df2, df3])

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

3.7pandas 合并 merge

python

import pandas as pd
 
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A':   ['A0', 'A1', 'A2', 'A3'],
                     'B':   ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                      'C':   ['C0', 'C1', 'C2', 'C3'],
                      'D':   ['D0', 'D1', 'D2', 'D3']})

python

left

	key	A	B
0	K0	A0	B0
1	K1	A1	B1
2	K2	A2	B2
3	K3	A3	B3

python

right

	key	C	D
0	K0	C0	D0
1	K1	C1	D1
2	K2	C2	D2
3	K3	C3	D3

python

pd.merge(left, right, on='key')

	key	A	B	C	D
0	K0	A0	B0	C0	D0
1	K1	A1	B1	C1	D1
2	K2	A2	B2	C2	D2
3	K3	A3	B3	C3	D3

在有两个 key 的情况下

python

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A':   ['A0', 'A1', 'A2', 'A3'],
                     'B':   ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C':   ['C0', 'C1', 'C2', 'C3'],
                      'D':   ['D0', 'D1', 'D2', 'D3']})
left

	key1	key2	A	B
0	K0	K0	A0	B0
1	K0	K1	A1	B1
2	K1	K0	A2	B2
3	K2	K1	A3	B3

python

right

	key1	key2	C	D
0	K0	K0	C0	D0
1	K1	K0	C1	D1
2	K1	K0	C2	D2
3	K2	K0	C3	D3

python

pd.merge(left, right, on=['key1', 'key2'], how='inner')

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K1	K0	A2	B2	C1	D1
2	K1	K0	A2	B2	C2	D2

python

pd.merge(left, right, on=['key1', 'key2'], how='outer')

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

python

pd.merge(left, right, on=['key1', 'key2'], how='left')

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

python

pd.merge(left, right, on=['key1', 'key2'], how='right', indicator=True) # indicator 显示合并的方式

	key1	key2	A	B	C	D	_merge
0	K0	K0	A0	B0	C0	D0	both
1	K1	K0	A2	B2	C1	D1	both
2	K1	K0	A2	B2	C2	D2	both
3	K2	K0	NaN	NaN	C3	D3	right_only

python

pd.merge(left, right, on=['key1', 'key2'], how='right', indicator="indicator_column") # indicator 显示合并的方式

	key1	key2	A	B	C	D	indicator_column
0	K0	K0	A0	B0	C0	D0	both
1	K1	K0	A2	B2	C1	D1	both
2	K1	K0	A2	B2	C2	D2	both
3	K2	K0	NaN	NaN	C3	D3	right_only

python

pd.merge(left, right, left_index=True, right_index=True, how='outer')

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

处理合并造成的重复问题

python

boys = pd.DataFrame({'k': ['K0', 'K1', 'K2'], 'age': [1, 2, 3]})
girls = pd.DataFrame({'k': ['K0', 'K1', 'K2'], 'age': [4, 5, 6]})
boys

	k	age
0	K0	1
1	K1	2
2	K2	3

python

girls

	k	age
0	K0	4
1	K1	5
2	K2	6

python

pd.merge(boys, girls, on='k', suffixes=['_boy', '_girl'], how='inner')

	k	age_boy	age_girl
0	K0	1	4
1	K1	2	5
2	K2	3	6

3.8 pandas plot 画图

python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

0       0.682404
1       1.833947
2       2.639884
3       1.843226
4       3.675730
         ...    
995    72.472419
996    70.507220
997    70.897704
998    71.912970
999    71.542112
Length: 1000, dtype: float64

Series 折线图

python

data = pd.Series(np.random.randn(1000), index=np.arange(1000)).cumsum()
data

0      -1.287330
1      -1.293898
2      -0.166798
3      -1.177414
4      -1.890751
         ...    
995    10.204522
996    10.308410
997     7.887580
998     8.327458
999     9.105820
Length: 1000, dtype: float64

python

data.plot()
plt.show()

DataFrame 折线图

python

data = pd.DataFrame(np.random.randn(1000, 4), index=np.arange(1000), columns=list('ABCD')).cumsum()
data

	A	B	C	D
0	0.020454	-1.172689	1.843716	1.544707
1	-0.097744	0.989506	1.068073	2.404901
2	-0.273196	0.758510	-1.905761	3.836764
3	0.401259	1.408281	-2.043875	3.762166
4	1.821346	0.941836	-1.549852	4.347704
...	...	...	...	...
995	-4.775217	-18.948179	-47.410079	-53.719370
996	-3.830409	-19.661865	-46.040664	-54.201339
997	-3.061518	-20.368645	-46.068266	-54.682348
998	-3.447858	-21.113475	-46.808006	-54.889304
999	-2.270221	-21.544267	-46.512128	-54.953465

1000 rows × 4 columns

python

data.plot()
plt.show()

plot 的方法：bar 条形图，hist，box，kde，area，scatter，hexbin，pie……

python

ax = data.plot.scatter(x='A', y='B', color='DarkBlue', label='Class1')
data.plot.scatter(x='A', y='C', color='DarkGreen', label='Class2', ax=ax)
plt.show()

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
0	1.0	1.0	1.0	1.0
1	1.0	1.0	1.0	1.0
2	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0
1	2.0	2.0	2.0	2.0
2	2.0	2.0	2.0	2.0

	a	b	c	d
0	0.0	0.0	0.0	0.0
1	0.0	0.0	0.0	0.0
2	0.0	0.0	0.0	0.0
3	1.0	1.0	1.0	1.0
4	1.0	1.0	1.0	1.0
5	1.0	1.0	1.0	1.0
6	2.0	2.0	2.0	2.0
7	2.0	2.0	2.0	2.0
8	2.0	2.0	2.0	2.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
0	1.0	1.0	1.0
1	1.0	1.0	1.0
2	1.0	1.0	1.0

	b	c	d
0	0.0	0.0	0.0
1	0.0	0.0	0.0
2	0.0	0.0	0.0
3	1.0	1.0	1.0
4	1.0	1.0	1.0
5	1.0	1.0	1.0

	a	b	c	d	e
0	0.0	0.0	0.0	0.0	NaN
1	0.0	0.0	0.0	0.0	NaN
2	0.0	0.0	0.0	0.0	NaN
0	NaN	1.0	1.0	1.0	1.0
1	NaN	1.0	1.0	1.0	1.0
2	NaN	1.0	1.0	1.0	1.0
0	2.0	2.0	2.0	2.0	NaN
1	2.0	2.0	2.0	2.0	NaN
2	2.0	2.0	2.0	2.0	NaN

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN
5	K2	K0	NaN	NaN	C3	D3

	key1	key2	A	B	C	D
0	K0	K0	A0	B0	C0	D0
1	K0	K1	A1	B1	NaN	NaN
2	K1	K0	A2	B2	C1	D1
3	K1	K0	A2	B2	C2	D2
4	K2	K1	A3	B3	NaN	NaN

	key1_x	key2_x	A	B	key1_y	key2_y	C	D
0	K0	K0	A0	B0	K0	K0	C0	D0
1	K0	K1	A1	B1	K1	K0	C1	D1
2	K1	K0	A2	B2	K1	K0	C2	D2
3	K2	K1	A3	B3	K2	K0	C3	D3