Conclusion: the linear model should not be used, because its explanatory power is lower than that of the semilogarithmic model.
We run the Box-Cox test to compare the logarithmic and linear models.
. . means priceperm
Variable | Type | Obs | Mean | [95% Conf. | Interval] |
priceperm | Arithmetic | | 5469.564 | 5407.998 | 5531.13 |
| Geometric | | 5402.319 | 5342.295 | 5463.018 |
| Harmonic | | 5336.337 | 5277.501 | 5396.5 |
. . g priceperm_star=priceperm/5402.319
. . g lgpriceperm_star=log(priceperm_star)
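The two `g` commands above divide the dependent variable by its geometric mean, which is what makes the RSS of the linear and logarithmic models comparable. A minimal sketch of the key property follows. The `prices` list is hypothetical and not taken from the data set. After scaling, the mean of the logs is zero up to rounding.

```python
import math

# Hypothetical prices per square meter (illustration only, not the study data).
prices = [4800.0, 5200.0, 5600.0, 6100.0]

# Geometric mean = exp(mean of logs); Stata's `means` reports it as "Geometric".
geo_mean = math.exp(sum(math.log(p) for p in prices) / len(prices))

# Scaled variable and its log, mirroring priceperm_star and lgpriceperm_star.
scaled = [p / geo_mean for p in prices]
log_scaled = [math.log(s) for s in scaled]

mean_log_scaled = sum(log_scaled) / len(log_scaled)
print(f"geometric mean = {geo_mean:.3f}")
print(f"mean of log(scaled) = {mean_log_scaled:.2e}")  # zero up to rounding
```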
We estimate the models with the new dependent variables.
. . reg priceperm_star totsp kitsp dist metrdist walk brick tel floor new floors nfloor sw
Source | SS | df | MS |
Model | 9.59000974 | 12 | .799167479 |
Residual | 10.004223 | 750 | .013338964 |
Total | 19.5942328 | 762 | .025714216 |
Number of obs = 763
F( 12, 750) = 59.91
Prob > F = 0.0000
R-squared = 0.4894
Adj R-squared = 0.4813
Root MSE = .11549
priceperm_~r | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
totsp | -.0040999 | .0009803 | -4.18 | 0.000 | -.0060243 | -.0021754 |
kitsp | .0216728 | .0040109 | 5.40 | 0.000 | .0137988 | .0295467 |
dist | -.0209141 | .0012389 | -16.88 | 0.000 | -.0233462 | -.0184821 |
metrdist | -.0058656 | .0010097 | -5.81 | 0.000 | -.0078478 | -.0038835 |
walk | .0948849 | .0092563 | 10.25 | 0.000 | .0767136 | .1130561 |
brick | .0451822 | .0103083 | 4.38 | 0.000 | .0249456 | .0654187 |
tel | .0242942 | .0119652 | 2.03 | 0.043 | .0008051 | .0477834 |
floor | .0724951 | .0101346 | 7.15 | 0.000 | .0525995 | .0923907 |
new | -.0747356 | .0288712 | -2.59 | 0.010 | -.1314136 | -.0180575 |
floors | .007405 | .001156 | 6.41 | 0.000 | .0051356 | .0096743 |
nfloor | .0047846 | .0010727 | 4.46 | 0.000 | .0026789 | .0068904 |
sw | .0433601 | .0088736 | 4.89 | 0.000 | .02594 | .0607802 |
_cons | 1.063269 | .0377433 | 28.17 | 0.000 | .9891742 | 1.137364 |
. reg lgpriceperm_star lntotsp lnkitsp lndist lnmetrdist lnfloors lnnfloor walk brick tel floor new sw
Source | SS | df | MS |
Model | 9.112745 | 12 | .759395416 |
Residual | 9.72142331 | 750 | .012961898 |
Total | 18.8341683 | 762 | .024716756 |
Number of obs = 763
F( 12, 750) = 58.59
Prob > F = 0.0000
R-squared = 0.4838
Adj R-squared = 0.4756
Root MSE = .11385
lgpriceper~r | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
lntotsp | -.1994132 | .0491942 | -4.05 | 0.000 | -.2959879 | -.1028385 |
lnkitsp | .1488611 | .0338481 | 4.40 | 0.000 | .0824127 | .2153094 |
lndist | -.2114004 | .0131866 | -16.03 | 0.000 | -.2372874 | -.1855133 |
lnmetrdist | -.0388459 | .008135 | -4.78 | 0.000 | -.0548159 | -.022876 |
lnfloors | .0979816 | .0127182 | 7.70 | 0.000 | .0730141 | .122949 |
lnnfloor | .0251988 | .005752 | 4.38 | 0.000 | .0139068 | .0364908 |
walk | .0992604 | .0090582 | 10.96 | 0.000 | .0814781 | .1170428 |
brick | .055766 | .0103708 | 5.38 | 0.000 | .0354067 | .0761253 |
tel | .0268293 | .0117936 | 2.27 | 0.023 | .0036769 | .0499816 |
floor | .0569536 | .0105399 | 5.40 | 0.000 | .0362624 | .0776448 |
new | -.0656276 | .0283623 | -2.31 | 0.021 | -.1213064 | -.0099487 |
sw | .0445313 | .0087298 | 5.10 | 0.000 | .0273935 | .0616691 |
_cons | .6331904 | .1493441 | 4.24 | 0.000 | .3400082 | .9263725 |
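We compare the RSS of the two models:
$\chi^2 = \frac{n}{2} \ln\left(\frac{RSS_{max}}{RSS_{min}}\right) \sim \chi^2_1$
$\chi^2 = \frac{763}{2} \ln\left(\frac{10.004223}{9.72142331}\right) \approx 10.94$
$10.94 > 6.63$, the 1% critical value of $\chi^2_1$
At the 1% significance level the difference is significant, so the logarithmic model, which has the smaller RSS, fits better.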
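The comparison above can be reproduced from the two Residual SS values in the regression output. The sketch below uses the unrounded RSS, which gives a statistic of about 10.94 (rounding the RSS to 10 and 9.7 gives 11.62). The variable names are illustrative. The p-value relies on the identity P(chi2 > x) = erfc(sqrt(x/2)), which holds for one degree of freedom.

```python
import math

n = 763                      # number of observations in both regressions
rss_linear = 10.004223       # Residual SS, regression of priceperm_star
rss_log = 9.72142331         # Residual SS, regression of lgpriceperm_star

# Test statistic: (n/2) * ln(RSS_max / RSS_min), chi-square with 1 df under H0.
rss_max, rss_min = max(rss_linear, rss_log), min(rss_linear, rss_log)
stat = n / 2 * math.log(rss_max / rss_min)

# For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

CRIT_1PCT = 6.635            # standard chi-square(1) critical value at the 1% level
print(f"statistic = {stat:.2f}, p-value = {p_value:.4f}")
print("reject equal fit at 1%:", stat > CRIT_1PCT)
```

Either way the statistic exceeds the 1% critical value, so the conclusion does not depend on the rounding.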
Comparing the adjusted R-squared and the Root MSE of the logarithmic and semilogarithmic models (adjusted R-squared 0.4756 against 0.4878, Root MSE .11385 against .11252), we conclude that the semilogarithmic model has the highest descriptive power.
. reg lnpriceperm totsp kitsp dist metrdist walk brick tel floor new floors nfloor sw
Source | SS | df | MS |
Model | 9.33934623 | 12 | .778278852 |
Residual | 9.49482307 | 750 | .012659764 |
Total | 18.8341693 | 762 | .024716758 |
Number of obs = 763
F( 12, 750) = 61.48
Prob > F = 0.0000
R-squared = 0.4959
Adj R-squared = 0.4878
Root MSE = .11252
lnpriceperm | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
totsp | -.0039887 | .000955 | -4.18 | 0.000 | -.0058635 | -.002114 |
kitsp | .0208337 | .0039075 | 5.33 | 0.000 | .0131629 | .0285046 |
dist | -.0202985 | .0012069 | -16.82 | 0.000 | -.0226679 | -.0179292 |
metrdist | -.0064124 | .0009837 | -6.52 | 0.000 | -.0083435 | -.0044814 |
walk | .0943859 | .0090175 | 10.47 | 0.000 | .0766833 | .1120885 |
brick | .0437063 | .0100424 | 4.35 | 0.000 | .0239917 | .0634209 |
tel | .0226295 | .0116565 | 1.94 | 0.053 | -.0002538 | .0455129 |
floor | .0747079 | .0098732 | 7.57 | 0.000 | .0553255 | .0940904 |
new | -.0714038 | .0281266 | -2.54 | 0.011 | -.12662 | -.0161876 |
floors | .007333 | .0011262 | 6.51 | 0.000 | .0051222 | .0095438 |
nfloor | .0043739 | .001045 | 4.19 | 0.000 | .0023225 | .0064254 |
sw | .0427449 | .0086448 | 4.94 | 0.000 | .0257741 | .0597157 |
_cons | 8.648569 | .0367698 | 235.21 | 0.000 | 8.576385 | 8.720753 |
We check the model for multicollinearity.
. vif
Variable | VIF | 1/VIF |
kitsp | 3.16 | 0.316009 |
totsp | 2.72 | 0.367271 |
floors | 2.28 | 0.438399 |
brick | 1.40 | 0.714848 |
nfloor | 1.35 | 0.739590 |
dist | 1.22 | 0.818237 |
walk | 1.14 | 0.878354 |
metrdist | 1.13 | 0.884390 |
floor | 1.08 | 0.926682 |
sw | 1.06 | 0.947458 |
new | 1.04 | 0.962782 |
tel | 1.01 | 0.989702 |
Mean VIF | 1.55 |
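All VIF values are far below the common rule-of-thumb threshold of 10 (the largest is 3.16, for kitsp), so there is no evidence of multicollinearity in the model.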
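The `1/VIF` column has a direct interpretation: since VIF_j = 1 / (1 - R2_j), it equals one minus the R-squared of the auxiliary regression of regressor j on the other regressors. A minimal sketch with the four largest VIF values from the table above follows. The threshold of 10 is a rule of thumb assumed here for illustration, not a formal test.

```python
# VIF values copied from the `vif` output above (four largest).
vif = {"kitsp": 3.16, "totsp": 2.72, "floors": 2.28, "brick": 1.40}

# VIF_j = 1 / (1 - R2_j), so R2_j = 1 - 1 / VIF_j.
aux_r2 = {name: 1 - 1 / v for name, v in vif.items()}

THRESHOLD = 10  # common rule of thumb, an assumption rather than a formal test
for name, v in vif.items():
    print(f"{name}: VIF = {v:.2f}, auxiliary R^2 = {aux_r2[name]:.3f}, "
          f"flagged = {v > THRESHOLD}")
```

Even for kitsp, the other regressors explain only about two thirds of its variation.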
We run White's test for heteroskedasticity.
. . estat imtest, white
White's test for Ho: homoskedasticity
against Ha: unrestricted heteroskedasticity
chi2(83) = 151.49
Prob > chi2 = 0.0000
Cameron & Trivedi's decomposition of IM-test
Source | chi2 | df | p |
Heteroskedasticity | 151.49 | | 0.0000 |
Skewness | 30.57 | | 0.0023 |
Kurtosis | 0.05 | | 0.8173 |
Total | 182.11 | | 0.0000 |
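The null hypothesis of homoskedasticity is rejected (p = 0.0000): heteroskedasticity is present in the model.
We therefore re-estimate the model with robust standard errors.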
. . reg lnpriceperm totsp kitsp dist metrdist walk brick tel floor new floors nfloor sw,robust
Linear regression Number of obs = 763
F( 12, 750) = 64.30
Prob > F = 0.0000
R-squared = 0.4959
Root MSE = .11252
Robust | ||||||
lnpriceperm | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
totsp | -.0039887 | .001076 | -3.71 | 0.000 | -.0061011 | -.0018764 |
kitsp | .0208337 | .0042951 | 4.85 | 0.000 | .0124019 | .0292656 |
dist | -.0202985 | .001275 | -15.92 | 0.000 | -.0228016 | -.0177955 |
metrdist | -.0064124 | .0011026 | -5.82 | 0.000 | -.008577 | -.0042478 |
walk | .0943859 | .0091225 | 10.35 | 0.000 | .0764773 | .1122946 |
brick | .0437063 | .0100798 | 4.34 | 0.000 | .0239183 | .0634942 |
tel | .0226295 | .0117326 | 1.93 | 0.054 | -.0004032 | .0456622 |
floor | .0747079 | .0098475 | 7.59 | 0.000 | .055376 | .0940398 |
new | -.0714038 | .0312315 | -2.29 | 0.023 | -.1327153 | -.0100923 |
floors | .007333 | .0011424 | 6.42 | 0.000 | .0050903 | .0095756 |
nfloor | .0043739 | .0010661 | 4.10 | 0.000 | .002281 | .0064669 |
sw | .0427449 | .0083814 | 5.10 | 0.000 | .0262911 | .0591987 |
_cons | 8.648569 | .0417 | 207.40 | 0.000 | 8.566707 | 8.730432 |
We run the Ramsey RESET test for omitted variables.
. ovtest
Ramsey RESET test using powers of the fitted values of lnpriceperm
Ho: model has no omitted variables
F(3, 747) = 5.03
Prob > F = 0.0019
Since p = 0.0019, the null hypothesis of no omitted variables is rejected at the 1% significance level. The specification is likely incomplete; for example, nonlinear terms may be missing.
We add the square of the variable kitsp.
. . gen sq_kitsp=kitsp*kitsp
. . reg lnpriceperm totsp kitsp dist metrdist walk brick tel floor new floors nfloor sw sq_kitsp,robust
Linear regression Number of obs = 763
F( 13, 749) = 59.53
Prob > F = 0.0000
R-squared = 0.5051
Root MSE = .11156
Robust | ||||||
lnpriceperm | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
totsp | -.0041778 | .0010803 | -3.87 | 0.000 | -.0062987 | -.002057 |
kitsp | .089703 | .0215536 | 4.16 | 0.000 | .0473903 | .1320157 |
dist | -.0199529 | .0012799 | -15.59 | 0.000 | -.0224656 | -.0174402 |
metrdist | -.0064519 | .0010951 | -5.89 | 0.000 | -.0086017 | -.0043022 |
walk | .0941075 | .0090892 | 10.35 | 0.000 | .0762642 | .1119509 |
brick | .0452121 | .0100253 | 4.51 | 0.000 | .025531 | .0648932 |
tel | .0207995 | .0118707 | 1.75 | 0.080 | -.0025044 | .0441034 |
floor | .0765689 | .0097534 | 7.85 | 0.000 | .0574217 | .0957161 |
new | -.065622 | .030214 | -2.17 | 0.030 | -.1249361 | -.0063079 |
floors | .0059485 | .0011412 | 5.21 | 0.000 | .0037083 | .0081888 |
nfloor | .0044057 | .0010547 | 4.18 | 0.000 | .0023351 | .0064762 |
sw | .0415754 | .0082676 | 5.03 | 0.000 | .0253451 | .0578058 |
sq_kitsp | -.0038924 | .0012314 | -3.16 | 0.002 | -.0063098 | -.001475 |
_cons | 8.381565 | .0896462 | 93.50 | 0.000 | 8.205578 | 8.557553 |
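There is a saturation effect in kitchen area: the coefficient on kitsp is positive and the coefficient on sq_kitsp is negative, and both are significant. Up to a certain point each additional square meter of kitchen raises the price per square meter, but by less and less. The estimated effect peaks at a kitchen area of about 11.5 (0.089703 / (2 * 0.0038924)); beyond that point the fitted marginal effect turns negative.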
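The saturation point follows directly from the two coefficients in the output above. The sketch below sets the derivative of the fitted quadratic to zero. The helper `marginal_effect` and the sample areas are illustrative, not part of the original analysis.

```python
# Robust-regression coefficients from the output above.
b_kitsp = 0.089703
b_sq_kitsp = -0.0038924

def marginal_effect(kitsp):
    """Derivative of ln(priceperm) with respect to kitsp at a given kitchen area."""
    return b_kitsp + 2 * b_sq_kitsp * kitsp

# The marginal effect is zero where kitsp = -b1 / (2 * b2).
turning_point = -b_kitsp / (2 * b_sq_kitsp)
print(f"turning point = {turning_point:.2f}")
for area in (6, 9, 12):  # illustrative kitchen areas
    print(area, f"{marginal_effect(area):+.4f}")
```

The marginal effect is positive for the smaller areas and slightly negative at 12, which is past the turning point.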
We test the hypothesis that the floor on which the apartment is located and the number of floors in the building have the same effect on the price per square meter.
. . test(floors-nfloor=0)
( 1) floors - nfloor = 0
F( 1, 749) = 0.79
Prob > F = 0.3749
The hypothesis is not rejected (p = 0.3749).
We add the dummy variables h13, h7, h3, f13, f7 and f3 to the model. They correspond to house numbers 13, 7 and 3 and to floor numbers 13, 7 and 3 respectively. This lets us test whether the unlucky number 13 and the lucky numbers 7 and 3, as a house or floor number, affect the apartment price.
. . reg lnpriceperm totsp kitsp dist metrdist walk brick tel floor new floors nfloor sw sq_kitsp h13 f13 h7 f7 h3 f3, robust
Linear regression Number of obs = 763
F( 19, 743) = 41.63
Prob > F = 0.0000
R-squared = 0.5105
Root MSE = .11139
Robust | ||||||
lnpriceperm | Coef. | Std. Err. | t | P>t | [95% Conf. | Interval] |
totsp | -.0042591 | .0010852 | -3.92 | 0.000 | -.0063895 | -.0021288 |
kitsp | .0924931 | .0216969 | 4.26 | 0.000 | .0498986 | .1350877 |
dist | -.0196424 | .0013037 | -15.07 | 0.000 | -.0222018 | -.017083 |
metrdist | -.0065022 | .0010943 | -5.94 | 0.000 | -.0086505 | -.0043539 |
walk | .0940259 | .0091193 | 10.31 | 0.000 | .0761233 | .1119285 |
brick | .0467104 | .0102049 | 4.58 | 0.000 | .0266766 | .0667443 |
tel | .0208754 | .0118558 | 1.76 | 0.079 | -.0023994 | .0441503 |
floor | .0770408 | .0100223 | 7.69 | 0.000 | .0573653 | .0967163 |
new | -.0718142 | .0306718 | -2.34 | 0.019 | -.1320279 | -.0116004 |
floors | .0058446 | .0011731 | 4.98 | 0.000 | .0035416 | .0081476 |
nfloor | .0049364 | .0011397 | 4.33 | 0.000 | .0026989 | .0071738 |
sw | .0405921 | .0090265 | 4.50 | 0.000 | .0228716 | .0583125 |
sq_kitsp | -.004027 | .0012346 | -3.26 | 0.001 | -.0064508 | -.0016032 |
h13 | -.0263359 | .0151353 | -1.74 | 0.082 | -.0560489 | .0033771 |
f13 | -.0498502 | .0180887 | -2.76 | 0.006 | -.0853613 | -.0143392 |
h7 | -.0216254 | .0172138 | -1.26 | 0.209 | -.0554189 | .0121681 |
f7 | .0099288 | .0208067 | 0.48 | 0.633 | -.0309181 | .0507756 |
h3 | -.0092845 | .0164746 | -0.56 | 0.573 | -.0416267 | .0230577 |
f3 | .0015009 | .0180583 | 0.08 | 0.934 | -.0339504 | .0369522 |
_cons | 8.370999 | .0898383 | 93.18 | 0.000 | 8.194632 | 8.547366 |
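Because the dependent variable is a logarithm, a dummy coefficient b corresponds to a change of 100 * (exp(b) - 1) percent in the price per square meter. The sketch below applies this to the six dummies, with coefficients and p-values copied from the output above. The dictionary layout is illustrative.

```python
import math

# Coefficients and p-values of the dummies, copied from the output above.
dummies = {
    "h13": (-0.0263359, 0.082),
    "f13": (-0.0498502, 0.006),
    "h7": (-0.0216254, 0.209),
    "f7": (0.0099288, 0.633),
    "h3": (-0.0092845, 0.573),
    "f3": (0.0015009, 0.934),
}

# In a log-linear model a dummy changes the price by 100 * (exp(b) - 1) percent.
pct_effect = {k: 100 * (math.exp(b) - 1) for k, (b, _) in dummies.items()}

for name, (b, p) in dummies.items():
    print(f"{name}: {pct_effect[name]:+.2f}% (p = {p:.3f})")
```

In this output only f13 is significant at the 1% level (p = 0.006). House number 13 is significant only at the 10% level (p = 0.082), and the dummies for 7 and 3 are not significant at conventional levels.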