CM Box User GuideMain Page | About | Special pages | Log in

Printable version | Disclaimer | Privacy policy | Current revision

Revision as of 14:21, 22 September 2006; view current revision
←Older revision | Newer revision→

7.2. Calibrate crop yields against water balance outputs and other variables.


The yield function is a statistically derived function relating the water balance parameters (which constitute the outputs of the water balance model) and the other factors (farm inputs, trend) or NDVI with station yield. Once this function has been established, it can be used for early crop yield forecasting.

Although many different equations are possible, the most widely used one is the outcome of a multiple linear regression procedure:

Y = a + b1X1 + b2X2 + b3X3

where b1 to b3 are the corresponding X coefficients.

Using an example for Malawi, first it will be established which water balance parameters are good predictors for yield. Then the multiple linear regression will be performed.

The input data file for Malawi contains multiple lines for stations and years. In every line, besides yield, all possibly relevant water balance output parameters are stored. The file can be downloaded here


Step1. Finding the relevant parameters with a principle component analysis


Activate the Tools - Principle Component Analysis function. A settings window appears.
All parameters are selected.graph94.jpg
The output log file gives the most valuable results. The first, second and third component explains over 50% of the variance. The equation will be based on those 3 components.
Taking all parameters with correlation above 0.70 or below -0.70 the following list of regression parameters is obtained:
  • WRSIfin (final index)
  • DEFtot (total deficit)
  • DEFflow (total deficit during flowering)
  • DEFrip (total deficit during ripening)
  • ETAveg (ETA during vegetative stage)
  • ETArip (ETA during ripening)
  • WEXTot (Total Water Excess)
  • WEXveg (Water excess during vegetative stage)



Step 2. Performing the Linear Regression Analysis


The multiple regression is done in Excel. First the columns that are no longer needed are deleted. The file is saved under another name to prevent data loss (sample1.xls).
A second worksheet is created to hold the regression parameters. An array formula is entered :
=LINEST(sample!B3:B1367,sample!C3:J1367, TRUE, TRUE)

The regression parameters are calculated.

The regression formula is now:
yield = WRSIfin*10.95832936 + 
        DEFtot*-19.25393401 + 
        DEFflow*20.00715655 + 
        DEFrip*15.96960614 + 
        ETAveg*-1.232280314 + 
        ETArip*1.210208146 +
        WEXtot*0.14383726 + 
        WEXveg*-0.629481819 + 
        73.68884436

The predected yield for the first data row (third line) is:

=Regression!A$1*J3+Regression!B$1*I3+Regression!C$1*H3+Regression!D$1*G3+Regression!E$1*F3+Regression!F$1*E3+Regression!G$1*D3+Regression!H$1*C3+Regression!I$1

A graph is created in which measured and predicted yield are combined. This gives an idea of the accurracy of the regression formula calculated. The sample workbook can de downloaded here







0 Page generated in 0.395718 seconds.