Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -25297.77    8329.05  -3.037  0.00784 **
X1           -1354.60     770.79  -1.757  0.09796 .
X2           17835.37    2165.53   8.236 3.80e-07 ***
X3             -50.31      40.57  -1.240  0.23287
X4              19.74       3.49   5.658 3.56e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9053 on 16 degrees of freedom
Multiple R-squared: 0.9128, Adjusted R-squared: 0.8911
F-statistic: 41.89 on 4 and 16 DF, p-value: 2.765e-08
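In this summary the key quantity is the coefficient of determination. The multiple R-squared is 0.9128 and the adjusted R-squared is 0.8911, so the model explains about 91% of the variation in Y. However, X3 is not significant (p = 0.233) and X1 is significant only at the 0.1 level (p = 0.098), so some improvements are still necessary.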
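The adjusted R-squared and the F-statistic in the summary follow directly from the multiple R-squared, the number of observations and the number of predictors. The short Python sketch below reproduces that arithmetic. It assumes n = 21 observations, which is inferred here from the 16 residual degrees of freedom plus the 5 estimated coefficients; the function names are illustrative and not part of the original analysis.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def overall_f_statistic(r2, n, p):
    """Overall F-statistic for the hypothesis that all p slopes are zero."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Values taken from the summary of the basic model (X1, X2, X3, X4).
# n = 21 is an assumption: 16 residual df + 5 estimated coefficients.
n, p, r2 = 21, 4, 0.9128

adj_r2 = adjusted_r_squared(r2, n, p)
f_stat = overall_f_statistic(r2, n, p)

print(f"adjusted R-squared: {adj_r2:.4f}")
print(f"F-statistic:        {f_stat:.2f}")
```

Because the inputs are the rounded values printed in the summary, the results can differ from the reported 0.8911 and 41.89 in the last digit.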
3.3 Improved Regression Model
Section 3.1.5 showed that X3 (diameter at breast height) is not significant at the 0.05 level, so removing X3 from the basic model may improve the regression. In addition, our review of the literature on leaf mass found that leaf masses have also been calculated using the allometric equations developed by Nowak (1996), which apply to a wide variety of species. This suggests modelling leaf mass in logarithmic form.
3.3.1 Solution and Results Analysis
After deleting X3 and using the logarithmic form we get a new regression model:
Call:
lm(formula = log(Y) ~ X1 + X2 + X4)
Coefficients:
(Intercept)          X1          X2          X4
   6.388219   -0.026668    0.814148    0.001004
The new regression equation is:
ln Y = 6.388219 - 0.026668 × X1 + 0.814148 × X2 + 0.001004 × X4
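The variance inflation factors (VIFs) of the independent variables are:
      X1       X2       X4
2.184419 1.044081 2.166960
All VIFs are well below the commonly used thresholds of 5 or 10, so multicollinearity is not a concern. The ANOVA table is given below: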
Analysis of Variance Table
Response: log(Y)
          Df  Sum Sq Mean Sq F value    Pr(>F)
X1         1  3.6649  3.6649  39.287 8.498e-06 ***
X2         1 19.8924 19.8924 213.243 4.736e-11 ***
X4         1  6.8531  6.8531  73.465 1.410e-07 ***
Residuals 17  1.5858  0.0933
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
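According to the p-values in this table, every term is highly significant (p < 0.001). These are sequential tests, in which each variable is tested in the order it enters the model, so they should be read together with the marginal t-tests in the model summary, where X1 is not significant once X2 and X4 are included.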
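The F values in the ANOVA table are each term's mean square divided by the residual mean square, and the same sums of squares also give the R-squared and the residual standard error reported in the model summary. The Python sketch below redoes this arithmetic from the printed table; the variable names are illustrative only, and the inputs are the rounded values shown above.

```python
import math

# Sums of squares copied from the ANOVA table; each term has 1 degree of freedom.
sum_sq = {"X1": 3.6649, "X2": 19.8924, "X4": 6.8531}
resid_ss, resid_df = 1.5858, 17

# Residual mean square: the denominator of every F value in the table.
resid_ms = resid_ss / resid_df

# F value of each term = (Sum Sq / 1 df) / residual mean square.
f_values = {term: ss / resid_ms for term, ss in sum_sq.items()}

# R-squared = 1 - residual SS / total SS.
total_ss = sum(sum_sq.values()) + resid_ss
r_squared = 1.0 - resid_ss / total_ss

# Residual standard error = square root of the residual mean square.
resid_se = math.sqrt(resid_ms)

for term, f in f_values.items():
    print(f"{term}: F = {f:.3f}")
print(f"R-squared = {r_squared:.4f}, residual standard error = {resid_se:.4f}")
```

Only the F value of the last term entered (X4) equals the square of its t value in the summary (8.571 squared is about 73.46). This is what one would expect if the table reports sequential sums of squares.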
Finally we give a summary of this model:
Call:
lm(formula = log(Y) ~ X1 + X2 + X4)
Residuals:
     Min       1Q   Median       3Q      Max
-0.46763 -0.19975 -0.00825  0.17168  0.54094
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.3882189  0.1915638  33.348  < 2e-16 ***
X1          -0.0266680  0.0259150  -1.029    0.318
X2           0.8141478  0.0634590  12.830 3.60e-10 ***
X4           0.0010043  0.0001172   8.571 1.41e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3054 on 17 degrees of freedom
Multiple R-squared: 0.9504, Adjusted R-squared: 0.9417
F-statistic: 108.7 on 3 and 17 DF, p-value: 2.72e-11
The multiple R-squared is 0.9504 and the adjusted R-squared is 0.9417, both greater than in the basic model (0.9128 and 0.8911). Although R-squared values computed for Y and for log(Y) are not strictly comparable, the higher values suggest that this is an improved regression model.
According to this final model, the logarithm of leaf mass is well explained by crown height, crown radius and leaf mass per area of crown projection. The regression equation
ln Y = 6.388219 - 0.026668 × X1 + 0.814148 × X2 + 0.001004 × X4
can be applied to estimate the leaf mass of a whole tree.
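To show how the fitted equation might be used for estimation, the Python sketch below evaluates it and back-transforms the result with the exponential function. The function names and the example predictor values are hypothetical, chosen only for illustration, and the simple exp() back-transform applies no correction for retransformation bias.

```python
import math

# Coefficients copied from the fitted model log(Y) ~ X1 + X2 + X4.
COEF = {"intercept": 6.388219, "X1": -0.026668, "X2": 0.814148, "X4": 0.001004}


def predict_log_leaf_mass(x1, x2, x4):
    """Predicted ln(Y) from the regression equation."""
    return (COEF["intercept"]
            + COEF["X1"] * x1
            + COEF["X2"] * x2
            + COEF["X4"] * x4)


def predict_leaf_mass(x1, x2, x4):
    """Predicted Y via a plain exp() back-transform (no bias correction)."""
    return math.exp(predict_log_leaf_mass(x1, x2, x4))


def multiplicative_effect(name):
    """Factor by which predicted Y changes when one predictor rises by one unit."""
    return math.exp(COEF[name])


# Hypothetical predictor values for illustration; they are not from the data set.
example = predict_leaf_mass(x1=5.0, x2=2.0, x4=100.0)
print(f"predicted leaf mass for the example tree: {example:.1f}")

for name in ("X1", "X2", "X4"):
    print(f"one-unit increase in {name} multiplies Y by {multiplicative_effect(name):.4f}")
```

Because the response is on the log scale, each coefficient acts multiplicatively on Y: with the other variables held fixed, a one-unit increase in X2 multiplies the predicted leaf mass by exp(0.814148), which is slightly more than 2.25.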
IV. Hierarchical Clustering in Leaves
4.1 Data Description
Source: Ü. NIINEMETS, A. PORTSMUTH and M. TOBIAS

