Comparing pedestrian counts with Wi-Fi data

Pedestrian counts (survey_data) and Wi-Fi address counts (wifi_data) are compared using a hierarchical linear model.

Procedure:

  • Use two days of data as training data and estimate per-point regression coefficients (intercept and slope) with a mixed linear model (statsmodels' mixedlm, the counterpart of R's lmer)
  • Using those coefficients, compute predicted values per point and hour from the remaining day's Wi-Fi data
  • Draw a scatter plot of the predictions against the observations

Manual page: http://www.statsmodels.org/stable/mixed_linear.html
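Before running on the actual CSVs, the modelling step can be illustrated in isolation. The sketch below fits the same kind of model (a random intercept and slope per point) to synthetic data; the group count and all data values here are made up for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for point in range(5):                      # 5 hypothetical observation points
    a = 50 + rng.normal(0, 10)              # per-point intercept
    b = 0.3 + rng.normal(0, 0.05)           # per-point slope
    for _ in range(40):
        wifi = rng.uniform(0, 500)
        rows.append({"point": point,
                     "wifi_data": wifi,
                     "survey_data": a + b * wifi + rng.normal(0, 5)})
df = pd.DataFrame(rows)

# Random intercept and slope for each point, as in the notebook
md = smf.mixedlm("survey_data ~ wifi_data", df, groups=df["point"],
                 re_formula="~wifi_data")
mdf = md.fit()
print(mdf.fe_params)            # common (fixed) intercept and slope
print(len(mdf.random_effects))  # one random-effect entry per point
```

The `re_formula="~wifi_data"` argument is what allows each point to deviate from the common slope as well as the common intercept.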

In [1]:
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
# Test data: one CSV per day
df = {}
df[1]= pd.read_csv('/home/tamada/survey_wifi20181130.csv', names = ('point','date','time','wifi_data','survey_data'))
df[2] = pd.read_csv('/home/tamada/survey_wifi20181201.csv', names = ('point','date','time','wifi_data','survey_data'))
df[3] = pd.read_csv('/home/tamada/survey_wifi20181202.csv', names = ('point','date','time','wifi_data','survey_data')) 
plt.scatter(df[1]['wifi_data'], df[1]['survey_data'], label = "20181130")
plt.scatter(df[2]['wifi_data'], df[2]['survey_data'], label = "20181201")
plt.scatter(df[3]['wifi_data'], df[3]['survey_data'], label = "20181202")
plt.xlabel("wifi_data")
plt.ylabel("survey_data")
plt.legend()
plt.show()
In [2]:
# Use the days other than the test day as training data (train_df) and estimate coefficients (OLS)
train_df = {}
train_df[1] = pd.concat([df[2],df[3]])
train_df[2] = pd.concat([df[3],df[1]])
train_df[3] = pd.concat([df[1],df[2]])
import statsmodels.api as sm
result = {}
for i in train_df.keys():
    Y = train_df[i]['survey_data'].values
    df_X = train_df[i][["wifi_data"]]
    X = df_X.values
    X1 = sm.add_constant(X)
    model = sm.OLS(Y,X1)
    result[i] = model.fit()
    print(result[i].summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.237
Model:                            OLS   Adj. R-squared:                  0.235
Method:                 Least Squares   F-statistic:                     123.6
Date:                Mon, 10 Jun 2019   Prob (F-statistic):           3.43e-25
Time:                        19:12:59   Log-Likelihood:                -2506.8
No. Observations:                 400   AIC:                             5018.
Df Residuals:                     398   BIC:                             5026.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         88.7806     10.024      8.857      0.000      69.073     108.488
x1             0.2431      0.022     11.118      0.000       0.200       0.286
==============================================================================
Omnibus:                      112.728   Durbin-Watson:                   0.318
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              240.417
Skew:                           1.477   Prob(JB):                     6.22e-53
Kurtosis:                       5.388   Cond. No.                         719.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.310
Model:                            OLS   Adj. R-squared:                  0.309
Method:                 Least Squares   F-statistic:                     179.1
Date:                Mon, 10 Jun 2019   Prob (F-statistic):           5.48e-34
Time:                        19:12:59   Log-Likelihood:                -2518.4
No. Observations:                 400   AIC:                             5041.
Df Residuals:                     398   BIC:                             5049.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         80.2520      9.912      8.096      0.000      60.766      99.738
x1             0.2841      0.021     13.383      0.000       0.242       0.326
==============================================================================
Omnibus:                      113.122   Durbin-Watson:                   0.334
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              280.737
Skew:                           1.391   Prob(JB):                     1.09e-61
Kurtosis:                       6.018   Cond. No.                         704.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.244
Model:                            OLS   Adj. R-squared:                  0.242
Method:                 Least Squares   F-statistic:                     128.3
Date:                Mon, 10 Jun 2019   Prob (F-statistic):           5.76e-26
Time:                        19:12:59   Log-Likelihood:                -2545.7
No. Observations:                 400   AIC:                             5095.
Df Residuals:                     398   BIC:                             5103.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         99.2634     11.264      8.812      0.000      77.118     121.409
x1             0.2382      0.021     11.326      0.000       0.197       0.280
==============================================================================
Omnibus:                      126.608   Durbin-Watson:                   0.381
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              327.007
Skew:                           1.547   Prob(JB):                     9.80e-72
Kurtosis:                       6.170   Cond. No.                         856.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In [3]:
# Estimate coefficients with a mixed linear model (mixedlm)
import statsmodels.formula.api as smf
for i in train_df.keys():
    md = smf.mixedlm("survey_data ~ wifi_data", train_df[i], groups=train_df[i]["point"], re_formula="~wifi_data")
    mdf = md.fit()
    # print(mdf.fe_params)
    
In [4]:
# Compute the per-point mixedlm coefficients for each (test day, point) pair
# coef[test-day ID][point ID]; the training data are the two days excluding the test day
import statsmodels.formula.api as smf
coef = {}
for i in df.keys():
    md = smf.mixedlm("survey_data ~ wifi_data", train_df[i], groups=train_df[i]["point"], re_formula="~wifi_data")
    mdf = md.fit()
    Intercept_common = mdf.fe_params.Intercept # common (fixed) intercept
    coef_common = mdf.fe_params.wifi_data      # common slope
    random_coef = mdf.random_effects           # random effects
    coef[i] = {}
    for j in random_coef.keys(): # sum the common and random components
        coef[i][j] = [random_coef[j].Group + Intercept_common, random_coef[j].wifi_data + coef_common]
# import pprint
# pprint.pprint(coef) # dump the coefficients
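As a sanity check on this combination step, the fixed effects plus each point's random effects should reproduce the model's own fitted values. A minimal sketch with synthetic data (the notebook's CSVs are not assumed; all values are made up):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "point": np.repeat(np.arange(4), 30),
    "wifi_data": rng.uniform(0, 300, 120),
})
df["survey_data"] = 60 + 0.25 * df["wifi_data"] + rng.normal(0, 8, 120)

mdf = smf.mixedlm("survey_data ~ wifi_data", df, groups=df["point"],
                  re_formula="~wifi_data").fit()

# Per-point coefficient = fixed effect + that point's random effect
coef = {g: (mdf.fe_params["Intercept"] + re["Group"],
            mdf.fe_params["wifi_data"] + re["wifi_data"])
        for g, re in mdf.random_effects.items()}

# Predictions from the combined coefficients should match mdf.fittedvalues,
# since fittedvalues are conditional on the estimated random effects
manual = np.array([coef[p][0] + coef[p][1] * w
                   for p, w in zip(df["point"], df["wifi_data"])])
print(np.allclose(manual, mdf.fittedvalues))
```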
In [5]:
# Use the coefficients from the training days to predict pedestrian counts from the test day's Wi-Fi data
dfa = {} # DataFrames to hold the results
for i in df.keys():
    predict_dict = {}
    for j in df[i].index:
        item = df[i].loc[j]
        this_coef = coef[i][item.point] # per-point coefficients
        predict_dict[j] = {"predict_val": item.wifi_data*this_coef[1] + this_coef[0]}
    #print(predict_dict)
    predict_fr = pd.DataFrame(predict_dict).T
    dfa[i] = pd.concat([df[i],predict_fr], axis=1)

# Display and plot the results
print(dfa[1][['predict_val','survey_data']].corr())
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(17, 5))
dfa[1]['color'] = dfa[1]['point'].astype(int) # specify color as an integer
dfa[1].plot(kind="scatter", ax=axes[0], x="predict_val", y="survey_data",
            title="Train: 12/1, 12/2  Test: 11/30",
            c="color", colormap='Accent', colorbar=False)

print(dfa[2][['predict_val','survey_data']].corr())
dfa[2]['color'] = dfa[2]['point'].astype(int) # specify color as an integer
dfa[2].plot(kind="scatter", ax=axes[1], x="predict_val", y="survey_data",
            title="Train: 11/30, 12/2  Test: 12/1",
            c="color", colormap='Accent', colorbar=False)

print(dfa[3][['predict_val','survey_data']].corr())
dfa[3]['color'] = dfa[3]['point'].astype(int) # specify color as an integer
dfa[3].plot(kind="scatter", ax=axes[2], x="predict_val", y="survey_data",
            title="Train: 11/30, 12/1  Test: 12/2",
            c="color", colormap='Accent', colorbar=False)
plt.savefig("comp_kofu_2018.png")
             predict_val  survey_data
predict_val       1.0000       0.9315
survey_data       0.9315       1.0000
             predict_val  survey_data
predict_val     1.000000     0.901211
survey_data     0.901211     1.000000
             predict_val  survey_data
predict_val     1.000000     0.930131
survey_data     0.930131     1.000000
In [6]:
# Sort points by the (relative) difference between predicted and surveyed counts, largest first

# Add point names to each day's DataFrame
name_data= pd.read_csv('/home/tamada/kofupointname.csv', encoding='cp932',names = ('point','name'))
point_name = {}
for i,n in name_data.iterrows():
    point_name[n['point']] = n['name']
name_ser = [point_name[val['point']] for i,val in dfa[1].iterrows()]

for i in dfa:
    # add the point names
    dfa[i]['point_name'] = name_ser
    # difference between prediction and survey
    dfa[i]['diff'] = dfa[i]['predict_val'] - dfa[i]['survey_data']
    # relative difference
    dfa[i]['diff_ratio'] = abs(dfa[i]['predict_val'] - dfa[i]['survey_data']) / dfa[i]['predict_val']
    # sort (sort_values returns a new frame, so assign it back)
    dfa[i] = dfa[i].sort_values('diff_ratio', ascending=False)

# dfa[1] # display
In [7]:
# Total error per point
# using groupby; see http://sinhrks.hatenablog.com/entry/2014/10/13/005327
daily_sum = {}
for i in dfa:
    daily_sum[i] = dfa[i].groupby('point_name')[['survey_data','predict_val','diff']].sum().sort_values('diff', ascending=False)

# Rename the columns
daily_sum[1].columns = ["survey1130", "predict1130","diff1130"]
daily_sum[2].columns = ["survey1201", "predict1201","diff1201"]
daily_sum[3].columns = ["survey1202", "predict1202","diff1202"]

# Merge the three DataFrames
three_day_sum = pd.merge(daily_sum[1], daily_sum[2], on=['point_name'])
three_day_sum = pd.merge(three_day_sum, daily_sum[3], on=['point_name'])

# Per-day error rates (not computed here)

# Three-day totals
three_day_sum['歩行合計'] = three_day_sum['survey1130'] + three_day_sum['survey1201'] + three_day_sum['survey1202']
three_day_sum['推計合計'] = three_day_sum['predict1130'] + three_day_sum['predict1201'] + three_day_sum['predict1202']
three_day_sum['差合計'] = three_day_sum['diff1130'] + three_day_sum['diff1201'] + three_day_sum['diff1202']
three_day_sum['誤差率'] = three_day_sum['差合計'] / three_day_sum['歩行合計']
three_day_sum
Out[7]:
survey1130 predict1130 diff1130 survey1201 predict1201 diff1201 survey1202 predict1202 diff1202 歩行合計 推計合計 差合計 誤差率
point_name
松木呉服店前 1464 1978.942241 514.942241 2019 1445.842848 -573.157152 1452 1481.848963 29.848963 4935 4906.634052 -28.365948 -0.005748
セブンイレブン前 1342 1619.749097 277.749097 1498 1017.183412 -480.816588 905 1231.387727 326.387727 3745 3868.320235 123.320235 0.032929
風月堂前 816 967.443190 151.443190 864 818.266584 -45.733416 610 496.290588 -113.709412 2290 2282.000363 -7.999637 -0.003493
桜通り北交差点西 2506 2641.576872 135.576872 3049 3202.095384 153.095384 2751 2629.806101 -121.193899 8306 8473.478356 167.478356 0.020164
河野スポーツ前 1101 1146.713240 45.713240 1302 1644.886304 342.886304 1122 1038.135624 -83.864376 3525 3829.735168 304.735168 0.086450
永田楽器 442 433.854975 -8.145025 395 383.547349 -11.452651 294 311.778808 17.778808 1131 1129.181132 -1.818868 -0.001608
防災新館南 2125 2101.621324 -23.378676 2415 2615.389612 200.389612 1907 1888.542919 -18.457081 6447 6605.553855 158.553855 0.024593
三枝豆店前 701 664.522336 -36.477664 818 689.334943 -128.665057 569 780.247297 211.247297 2088 2134.104576 46.104576 0.022081
きぬや前 1911 1853.242973 -57.757027 2227 2082.413474 -144.586526 1856 2040.544644 184.544644 5994 5976.201090 -17.798910 -0.002969
内藤セイビドー眼鏡店 1021 933.311735 -87.688265 1103 900.006671 -202.993329 682 998.536166 316.536166 2806 2831.854571 25.854571 0.009214
玉屋前 701 567.783000 -133.217000 690 673.976865 -16.023135 459 653.969506 194.969506 1850 1895.729372 45.729372 0.024719
オスカー前 1189 1006.132225 -182.867775 1141 1231.692098 90.692098 641 719.537873 78.537873 2971 2957.362196 -13.637804 -0.004590
ブラザー前 1448 1190.609349 -257.390651 1350 1473.547767 123.547767 1122 1334.084175 212.084175 3920 3998.241292 78.241292 0.019960
小林動物病院前 1828 1405.107591 -422.892409 1374 1389.251069 15.251069 991 1263.433776 272.433776 4193 4057.792435 -135.207565 -0.032246
KoKoriオリオン通り入り口南 4392 3936.482679 -455.517321 4254 4352.796448 98.796448 4435 4778.849975 343.849975 13081 13068.129102 -12.870898 -0.000984
ファミリーマート前 2533 2052.623926 -480.376074 2006 2414.372516 408.372516 1479 1199.292732 -279.707268 6018 5666.289175 -351.710825 -0.058443
奥藤本店前 7015 6449.701619 -565.298381 6537 6005.086730 -531.913270 5558 6543.671242 985.671242 19110 18998.459591 -111.540409 -0.005837
KoKori紅梅南入り口西 2318 1672.666962 -645.333038 1824 2585.178982 761.178982 1469 1989.541570 520.541570 5611 6247.387514 636.387514 0.113418
古名屋ホテル前 2455 1807.587651 -647.412349 1585 1649.882399 64.882399 1139 1391.008958 252.008958 5179 4848.479008 -330.520992 -0.063819
ライフテクトナカゴミ前 3423 2293.173574 -1129.826426 2348 3379.488366 1031.488366 1617 2108.911517 491.911517 7388 7781.573457 393.573457 0.053272
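The per-point aggregation above relies on selecting several columns from a groupby with a list (double brackets); a toy sketch with hypothetical values:

```python
import pandas as pd

d = pd.DataFrame({"point_name": ["A", "B", "A", "B"],
                  "survey_data": [10, 20, 30, 40],
                  "predict_val": [12, 18, 28, 44]})
d["diff"] = d["predict_val"] - d["survey_data"]

# Select the columns as a list (double brackets), then sum per point
daily = d.groupby("point_name")[["survey_data", "predict_val", "diff"]].sum()
print(daily.loc["A", "diff"], daily.loc["B", "diff"])  # 0 2
```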
In [8]:
three_day_sum[['歩行合計', '推計合計', '差合計', '誤差率']]
Out[8]:
歩行合計 推計合計 差合計 誤差率
point_name
松木呉服店前 4935 4906.634052 -28.365948 -0.005748
セブンイレブン前 3745 3868.320235 123.320235 0.032929
風月堂前 2290 2282.000363 -7.999637 -0.003493
桜通り北交差点西 8306 8473.478356 167.478356 0.020164
河野スポーツ前 3525 3829.735168 304.735168 0.086450
永田楽器 1131 1129.181132 -1.818868 -0.001608
防災新館南 6447 6605.553855 158.553855 0.024593
三枝豆店前 2088 2134.104576 46.104576 0.022081
きぬや前 5994 5976.201090 -17.798910 -0.002969
内藤セイビドー眼鏡店 2806 2831.854571 25.854571 0.009214
玉屋前 1850 1895.729372 45.729372 0.024719
オスカー前 2971 2957.362196 -13.637804 -0.004590
ブラザー前 3920 3998.241292 78.241292 0.019960
小林動物病院前 4193 4057.792435 -135.207565 -0.032246
KoKoriオリオン通り入り口南 13081 13068.129102 -12.870898 -0.000984
ファミリーマート前 6018 5666.289175 -351.710825 -0.058443
奥藤本店前 19110 18998.459591 -111.540409 -0.005837
KoKori紅梅南入り口西 5611 6247.387514 636.387514 0.113418
古名屋ホテル前 5179 4848.479008 -330.520992 -0.063819
ライフテクトナカゴミ前 7388 7781.573457 393.573457 0.053272

Because the training and test sets are rotated cyclically, differences with opposite signs end up being combined, so the errors may be partially cancelling each other out.
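This cancellation can be made concrete with rounded values from the 松木呉服店前 row above: the signed per-day differences nearly cancel, while their absolute values reveal the actual error magnitude.

```python
# Signed per-day differences for one point (rounded values from the table above)
errors = [514.9, -573.2, 29.8]

signed_sum = sum(errors)               # opposite signs cancel
abs_sum = sum(abs(e) for e in errors)  # no cancellation

print(round(signed_sum, 1))  # -28.5
print(round(abs_sum, 1))     # 1117.9
```

Summing absolute (or squared) differences per point would give a cancellation-free view of the fit quality.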

The cells below are left as in the original.

In [102]:
mdf.fittedvalues[:5]
Out[102]:
0    63.844070
1    54.110198
2    61.062964
3    59.672411
4    59.116189
dtype: float64
In [103]:
df_all['fitted_lmer'] = mdf.fittedvalues
In [104]:
df_all['handcount'] = d2['survey_data']
In [105]:
df_all.head()
Out[105]:
point date time wifi_data survey_data fitted_olm fitted_lmer handcount
0 2 20181130 10 59 51 97.011055 63.844070 60
1 2 20181130 11 24 49 87.069267 54.110198 87
2 2 20181130 12 49 77 94.170544 61.062964 82
3 2 20181130 13 44 67 92.750289 59.672411 78
4 2 20181130 14 42 58 92.182187 59.116189 110
In [141]:
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))
df_all['color'] = df_all['point'].astype(int) # 色は整数値で指定
df_all.plot(kind="scatter", ax=axes[0], x="fitted_olm", y="handcount", c="color", colormap='Accent', colorbar=False)
df_all.plot(kind="scatter", ax=axes[1], x="fitted_lmer", y="handcount", c="color",colormap='tab20', colorbar=False)
print(df_all[['fitted_olm','survey_data']].corr())
print(df_all[['fitted_lmer','handcount']].corr())
             fitted_olm  survey_data
fitted_olm     1.000000     0.557101
survey_data    0.557101     1.000000
             fitted_lmer  handcount
fitted_lmer     1.000000   0.900756
handcount       0.900756   1.000000
In [151]:
name_data= pd.read_csv('/home/tamada/kofupointname.csv', encoding='cp932',names = ('point','name'))
name_data
Out[151]:
point name
0 2 三枝豆店前
1 3 風月堂前
2 4 永田楽器
3 5 桜通り北交差点西
4 7 ライフテクトナカゴミ前
5 8 オスカー前
6 9 防災新館南
7 10 河野スポーツ前
8 11 内藤セイビドー眼鏡店
9 12 ファミリーマート前
10 13 ブラザー前
11 16 KoKoriオリオン通り入り口南
12 17 古名屋ホテル前
13 18 KoKori紅梅南入り口西
14 19 きぬや前
15 20 セブンイレブン前
16 21 玉屋前
17 22 奥藤本店前
18 23 小林動物病院前
19 24 松木呉服店前
In [152]:
pd.merge(df_all,name_data, on='point')
Out[152]:
point date time wifi_data survey_data fitted_olm fitted_lmer handcount color name
0 2 20181130 10 59 51 97.011055 63.844070 60 2 三枝豆店前
1 2 20181130 11 24 49 87.069267 54.110198 87 2 三枝豆店前
2 2 20181130 12 49 77 94.170544 61.062964 82 2 三枝豆店前
3 2 20181130 13 44 67 92.750289 59.672411 78 2 三枝豆店前
4 2 20181130 14 42 58 92.182187 59.116189 110 2 三枝豆店前
5 2 20181130 15 40 53 91.614084 58.559968 57 2 三枝豆店前
6 2 20181130 16 56 46 96.158901 63.009738 59 2 三枝豆店前
7 2 20181130 17 57 99 96.442953 63.287849 66 2 三枝豆店前
8 2 20181130 18 54 91 95.590799 62.453517 102 2 三枝豆店前
9 2 20181130 19 79 110 102.692076 69.406282 117 2 三枝豆店前
10 2 20181202 10 20 41 85.933063 52.997756 60 2 三枝豆店前
11 2 20181202 11 52 49 95.022697 61.897296 87 2 三枝豆店前
12 2 20181202 12 57 58 96.442953 63.287849 82 2 三枝豆店前
13 2 20181202 13 65 50 98.715361 65.512734 78 2 三枝豆店前
14 2 20181202 14 96 67 107.520944 74.134163 110 2 三枝豆店前
15 2 20181202 15 63 65 98.147259 64.956512 57 2 三枝豆店前
16 2 20181202 16 50 65 94.454595 61.341074 59 2 三枝豆店前
17 2 20181202 17 143 71 120.871344 87.205362 66 2 三枝豆店前
18 2 20181202 18 75 44 101.555872 68.293840 102 2 三枝豆店前
19 2 20181202 19 68 59 99.567514 66.347066 117 2 三枝豆店前
20 3 20181130 10 658 40 267.157644 66.419328 18 3 風月堂前
21 3 20181130 11 696 55 277.951585 70.446003 44 3 風月堂前
22 3 20181130 12 663 68 268.577900 66.949154 63 3 風月堂前
23 3 20181130 13 660 46 267.725747 66.631258 63 3 風月堂前
24 3 20181130 14 595 59 249.262427 59.743525 58 3 風月堂前
25 3 20181130 15 659 54 267.441696 66.525293 35 3 風月堂前
26 3 20181130 16 776 77 300.675671 78.923214 78 3 風月堂前
27 3 20181130 17 1009 100 366.859569 103.613090 120 3 風月堂前
28 3 20181130 18 1313 157 453.211094 135.826491 201 3 風月堂前
29 3 20181130 19 1628 160 542.687181 169.205508 184 3 風月堂前
... ... ... ... ... ... ... ... ... ... ...
370 23 20181202 10 273 86 157.797983 75.375742 100 23 小林動物病院前
371 23 20181202 11 323 108 172.000537 87.578001 100 23 小林動物病院前
372 23 20181202 12 657 106 266.873593 169.089091 136 23 小林動物病院前
373 23 20181202 13 463 79 211.767686 121.744326 131 23 小林動物病院前
374 23 20181202 14 319 92 170.864332 86.601820 139 23 小林動物病院前
375 23 20181202 15 413 101 197.565133 109.542067 92 23 小林動物病院前
376 23 20181202 16 499 81 221.993525 130.529953 125 23 小林動物病院前
377 23 20181202 17 524 117 229.094801 136.631082 151 23 小林動物病院前
378 23 20181202 18 555 125 237.900384 144.196483 218 23 小林動物病院前
379 23 20181202 19 568 96 241.593048 147.369070 182 23 小林動物病院前
380 24 20181130 10 286 93 161.490647 126.321404 124 24 松木呉服店前
381 24 20181130 11 327 85 173.136741 137.581225 161 24 松木呉服店前
382 24 20181130 12 393 188 191.884111 155.706791 223 24 松木呉服店前
383 24 20181130 13 337 110 175.977252 140.327523 202 24 松木呉服店前
384 24 20181130 14 348 104 179.101813 143.348451 169 24 松木呉服店前
385 24 20181130 15 384 101 189.327652 153.235123 169 24 松木呉服店前
386 24 20181130 16 310 110 168.307873 132.912519 163 24 松木呉服店前
387 24 20181130 17 468 162 213.187942 176.304025 225 24 松木呉服店前
388 24 20181130 18 569 238 241.877099 204.041632 330 24 松木呉服店前
389 24 20181130 19 726 273 286.473117 247.158508 253 24 松木呉服店前
390 24 20181202 10 206 97 138.766562 104.351022 124 24 松木呉服店前
391 24 20181202 11 265 134 155.525575 120.554179 161 24 松木呉服店前
392 24 20181202 12 325 130 172.568639 137.031966 223 24 松木呉服店前
393 24 20181202 13 297 152 164.615209 129.342332 202 24 松木呉服店前
394 24 20181202 14 195 198 135.642000 101.330094 169 24 松木呉服店前
395 24 20181202 15 353 159 180.522069 144.721600 169 24 松木呉服店前
396 24 20181202 16 312 125 168.875975 133.461779 163 24 松木呉服店前
397 24 20181202 17 305 180 166.887617 131.539370 225 24 松木呉服店前
398 24 20181202 18 384 116 189.327652 153.235123 330 24 松木呉服店前
399 24 20181202 19 382 161 188.759550 152.685863 253 24 松木呉服店前

400 rows × 10 columns

In [156]:
point_num = df_all["point"].unique()
fig, axes = plt.subplots(nrows=25, ncols=1, figsize=(5, 100))
for point in point_num:
    df_kofu = df_all[df_all["point"] == point].reset_index(drop=True)
    df_kofu.plot(kind="scatter", ax=axes[point], x="fitted_lmer", y="handcount", c='color', colormap='tab20', colorbar=False, label=point)
In [107]:
df_all.plot(kind = "scatter" , y = "fitted_lmer", x ="wifi_data")
Out[107]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f54d26c97b8>