Examples of Moment-Based Estimation and Hypothesis Testing

In addition to maximum likelihood estimation (MLE), stochastic frontier models may be estimated by other methods, including the method of moments (MoM). We provide sfmodel_MoMTest(), which uses the MoM to test and estimate the normal half-normal and the normal-exponential models. It is based on Chen and Wang (2012).

The MoM estimator has advantages over the MLE in general, and sfmodel_MoMTest() adds some of its own.

MoM estimators usually run much faster and are less prone to numerical issues. In fact, sfmodel_MoMTest() uses closed-form solutions for all of the model parameters, so no numerical optimization or root-finding procedure is needed.
sfmodel_MoMTest() provides formal hypothesis tests of the joint distributional assumptions on $v$ and $u$ in the model's composed error (e.g., $\epsilon = v-u$). The test may be used to formally justify the distributional assumptions of the model, or for data exploration. The result is valid regardless of whether you proceed with the MLE or the MoM for subsequent parameter estimation.

Because of its simplicity, the MoM estimation and test are carried out with a single command, sfmodel_MoMTest(). In contrast, the MLE approach requires several commands (e.g., sfmodel_spec(), sfmodel_init(), sfmodel_opt(), sfmodel_fit()).

Currently, sfmodel_MoMTest() does not support panel data, and it assumes that all observations are i.i.d.

Normal Half-Normal Model

A general setup of the model is:

\[\begin{aligned} y_i & = x_i \beta + \epsilon_i,\\ \epsilon_i & = v_i - u_i,\\ v_i \sim N(0, \sigma_v^2), & \quad u_i \sim N^+(0, \sigma_u^2), \end{aligned} \]

where $\sigma_v^2$ and $\sigma_u^2$ are both constant. Here $N^+(0, \sigma_u^2)$ is a half-normal distribution obtained by truncating the normal distribution $N(0, \sigma_u^2)$ from below at 0. There are no inefficiency determinants ($z_i$) in the model.
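To illustrate how closed-form moment estimates can arise in this model, the sketch below applies the textbook moment conditions for the normal half-normal case to OLS residuals. The third central moment identifies $\sigma_u$, the second identifies $\sigma_v^2$, and the OLS intercept is shifted up by $E(u_i) = \sigma_u\sqrt{2/\pi}$. The function names `halfnormal_mom` and `ols_mom_halfnormal` and the simulated data are hypothetical. These are the classical formulas, not a transcription of what sfmodel_MoMTest() does internally, which may differ in detail.

```julia
using LinearAlgebra, Statistics, Random

# Map the 2nd and 3rd central moments of the composed error to (σᵥ², σᵤ²), using
#   Var(ε) = σᵥ² + (1 - 2/π) σᵤ²   and   E[(ε - Eε)³] = -√(2/π) (4/π - 1) σᵤ³.
function halfnormal_mom(m2, m3)
    m3 < 0 || error("residuals are not negatively skewed; moment estimate undefined")
    sigma_u = cbrt(-m3 / (sqrt(2 / π) * (4 / π - 1)))
    sigma_u2 = sigma_u^2
    sigma_v2 = m2 - (1 - 2 / π) * sigma_u2
    return (sigma_v2 = sigma_v2, sigma_u2 = sigma_u2)
end

# OLS on X (last column assumed to be the constant), then the moment corrections.
function ols_mom_halfnormal(y, X)
    b = X \ y                          # least-squares coefficients
    e = y - X * b                      # OLS residuals (mean zero, since X has a constant)
    est = halfnormal_mom(mean(e .^ 2), mean(e .^ 3))
    b[end] += sqrt(2 / π) * sqrt(est.sigma_u2)   # intercept correction: add E(u)
    return (coef = b, sigma_v2 = est.sigma_v2, sigma_u2 = est.sigma_u2)
end

# Simulated production-frontier data (illustrative values only).
Random.seed!(1)
n = 100_000
x = randn(n)
u = abs.(0.6 .* randn(n))              # half-normal with σᵤ = 0.6
v = 0.3 .* randn(n)                    # normal with σᵥ = 0.3
y = 1.0 .+ 0.5 .* x .+ v .- u
fit = ols_mom_halfnormal(y, hcat(x, ones(n)))
println(fit)
```

In small samples the residual skewness can have the wrong sign, or the implied $\sigma_v^2$ can be negative. The sketch raises an error in the first case and returns the negative value unchanged in the second.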

Normal Exponential Model

This model assumes $u_i$ follows an exponential distribution.

\[\begin{aligned} u_i \sim \mathrm{Exp}(\sigma_u^2), \end{aligned} \]

where $\sigma_u^2$ is the scale parameter such that $E(u_i) = \sigma_u$ and $Var(u_i) = \sigma_u^2$. As in the normal half-normal model, $\sigma_v^2$ and $\sigma_u^2$ are both constant, and there are no inefficiency determinants ($z_i$) in the model.
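With this parameterization the moment conditions are particularly simple: $Var(\epsilon_i) = \sigma_v^2 + \sigma_u^2$ and $E[(\epsilon_i - E\epsilon_i)^3] = -2\sigma_u^3$ for a production frontier. The following minimal sketch inverts these two equations. The function name `expo_mom` is hypothetical, and the package's internal computation may differ in detail.

```julia
# Recover (σᵥ², σᵤ²) from the 2nd and 3rd central moments of ε = v - u,
# assuming u is exponential with mean σᵤ (so its third central moment is 2σᵤ³).
function expo_mom(m2, m3)
    m3 < 0 || error("residuals are not negatively skewed; moment estimate undefined")
    sigma_u = cbrt(-m3 / 2)            # from E[(ε - Eε)³] = -2σᵤ³
    sigma_v2 = m2 - sigma_u^2          # from Var(ε) = σᵥ² + σᵤ²
    return (sigma_v2 = sigma_v2, sigma_u2 = sigma_u^2, mean_u = sigma_u)
end

# Round trip on population moments implied by σᵥ² = 0.05 and σᵤ² = 0.25.
m2 = 0.05 + 0.25
m3 = -2 * 0.25^1.5
println(expo_mom(m2, m3))
```

The `mean_u` field is the amount by which an OLS intercept would be shifted up to obtain the frontier intercept, since $E(u_i) = \sigma_u$.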

Test the Distribution Assumptions

Suppose we want to conduct a hypothesis test on the assumption that $v_i$ follows a normal distribution and $u_i$ follows an exponential distribution in the data.

julia> using SFrontiers        # main package
julia> using DataFrames, CSV   # handling data

julia> df = DataFrame(CSV.File("sampledata.csv"))
julia> df[!, :_cons] .= 1.0;         # add a constant column _cons (all ones)

julia> sfmodel_MoMTest(sftype(prod), sfdist(expo),
               @depvar(yvar), @frontier( Lland, PIland, Llabor, Lbull, Lcost, _cons),
               data=df, 
               ω=(0.5, 1, 2),
               testonly = true
               )

sftype(prod) indicates a production-frontier type of model. The alternative is cost for a cost frontier, where the composed error is $v_i + u_i$.
sfdist(expo) specifies the exponential distribution assumption on $u_i$. An alternative is half for half-normal.
@depvar(.) specifies the dependent variable.
@frontier(.) specifies the list of variables used in the frontier equation. The variables are assumed to be linear in the equation.
data=df specifies the dataset, which has to be in the DataFrame format.
ω=(0.5, 1, 2) or omega=(0.5, 1, 2) specifies the test parameter, which may be a single scalar (e.g., ω=1) or a list of scalars (e.g., ω=(0.5, 1, 2)). Chen and Wang (2012) suggest that ω=1 usually works well.
testonly=true shows only the test results. The default is false, which prints both the test and the estimation results on the screen.
Other options include level= to set the significance level of the confidence intervals.

Here is the result.

****************************************
** Moment Based Tests and Estimations **
****************************************

* Null Hypothesis: v is normal AND u is exponential.

  Test Statistics (χ² distribution)
┌─────┬─────────┬─────────┐
│   ω │    sine │  cosine │
├─────┼─────────┼─────────┤
│ 0.5 │ 3.78229 │ 4.95490 │
│   1 │ 3.94102 │ 4.99217 │
│   2 │ 4.41290 │ 3.59071 │
└─────┴─────────┴─────────┘
  Note: Chen and Wang (2012 EReviews) indicates that cosine test 
with ω=1 has good overall performance.


  Critical Values (χ²(1))
┌─────────┬─────────┬─────────┐
│      1% │      5% │     10% │
├─────────┼─────────┼─────────┤
│ 6.63490 │ 3.84146 │ 2.70554 │
└─────────┴─────────┴─────────┘

If we take the cosine test at $\omega=1$, the test statistic is $4.992$, which is larger than the 5% critical value of $3.841$ (though smaller than the 1% critical value of $6.635$). We therefore reject the null hypothesis ($v$ is normal and $u$ is exponential) at the 5% level, but not at the 1% level.
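The same comparison can be reproduced, and a p-value attached to it, with a few lines of code. The sketch below assumes the Distributions.jl package is installed (it is not part of the original example) and takes the statistic from the table above.

```julia
using Distributions   # assumed to be installed; not part of the original example

chi2 = Chisq(1)                 # reference distribution under the null
stat = 4.99217                  # cosine test statistic at ω = 1, from the table above

# Critical values at the 1%, 5%, and 10% levels, and the p-value of the statistic.
crit = Dict(α => quantile(chi2, 1 - α) for α in (0.01, 0.05, 0.10))
pval = ccdf(chi2, stat)         # P(χ²(1) > stat)

for α in (0.10, 0.05, 0.01)
    verdict = stat > crit[α] ? "reject" : "do not reject"
    println("α = ", α, ": critical value = ", round(crit[α], digits = 5), " -> ", verdict)
end
println("p-value = ", round(pval, digits = 4))
```

Reporting the p-value alongside the critical values avoids having to pick a single significance level in advance.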

Test the Distribution Assumptions, Estimate Model Parameters, Obtain Inefficiency Index

What if $u_i$ is assumed to follow a half-normal distribution? This can be tested in a similar way. In the following example, we do not set testonly=true, so we see the results of both the test and the parameter estimation. We also assign the output of the command to res so that the returned results are available for later analysis. The returned results include the Jondrow et al. (1982) inefficiency index (JLMS), the Battese and Coelli (1988) efficiency index, and others.

julia> res = sfmodel_MoMTest(sftype(prod), sfdist(half),
                     @depvar(yvar), @frontier( Lland, PIland, Llabor, Lbull, Lcost, _cons),
                     data = df,
                     ω = (0.5, 1, 2)
                     )


****************************************
** Moment Based Tests and Estimations **
****************************************

* Null Hypothesis: v is normal AND u is half-normal.

  Test Statistics (χ² distribution)
┌─────┬─────────┬─────────┐
│   ω │    sine │  cosine │
├─────┼─────────┼─────────┤
│ 0.5 │ 0.03301 │ 1.25844 │
│   1 │ 0.05283 │ 1.37552 │
│   2 │ 0.16846 │ 1.81301 │
└─────┴─────────┴─────────┘
  Note: Chen and Wang (2012 EReviews) indicates that cosine test with ω=1 has good overall performance.


  Critical Values (χ²(1))
┌─────────┬─────────┬─────────┐
│      1% │      5% │     10% │
├─────────┼─────────┼─────────┤
│ 6.63490 │ 3.84146 │ 2.70554 │
└─────────┴─────────┴─────────┘


* Method of Moments Estimates of the Model (Chen and Wang 2012 EReviews)
  ** Model type: normal and half-normal
  ** The constant variable (for intercept) in the model: _cons.
  ** Number of observations: 271
  ** Log-likelihood value: -114.87749

┌────────┬──────────┬───────────┬──────────┬─────────┬──────────┬──────────┐
│        │    Coef. │ Std. Err. │        z │   P>|z| │  95%CI_l │  95%CI_u │
├────────┼──────────┼───────────┼──────────┼─────────┼──────────┼──────────┤
│  Lland │  0.34560 │   0.07801 │  4.43006 │ 0.00001 │  0.19270 │  0.49851 │
│ PIland │  0.37290 │   0.20402 │  1.82772 │ 0.06872 │ -0.02698 │  0.77278 │
│ Llabor │  1.10329 │   0.08261 │ 13.35565 │ 0.00000 │  0.94138 │  1.26520 │
│  Lbull │ -0.43908 │   0.06459 │ -6.79833 │ 0.00000 │ -0.56567 │ -0.31249 │
│  Lcost │  0.01876 │   0.01514 │  1.23940 │ 0.21630 │ -0.01091 │  0.04842 │
│  _cons │  2.18283 │   0.35326 │  6.17901 │ 0.00000 │  1.49044 │  2.87521 │
│    σᵥ² │  0.00621 │   0.01441 │     n.a. │    n.a. │  0.00528 │  0.00741 │
│    σᵤ² │  0.37867 │   0.06534 │     n.a. │    n.a. │  0.32211 │  0.45164 │
└────────┴──────────┴───────────┴──────────┴─────────┴──────────┴──────────┘
  Note: CI of σᵥ² and σᵤ² is calculated based on the χ² distribution.


***** Additional Information *********
* OLS (frontier-only) log-likelihood: -121.79668
* Skewness of OLS residuals: -0.90075
* The sample mean of the JLMS inefficiency index: 0.49424
* The sample mean of the BC efficiency index: 0.64701

* Use `name.list` to see saved results (keys and values) where `name` is the return specified in `name = sfmodel_MoMTest(..)`. Values may be retrieved using the keys. For instance:
   ** `name.MoM_loglikelihood`: the log-likelihood value of the model;
   ** `name.jlms`: Jondrow et al. (1982) inefficiency index;
   ** `name.bc`: Battese and Coelli (1988) efficiency index;
* Use `keys(name)` to see available keys.
**************************************
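
For reference, the JLMS and BC indices are the conditional expectations $E(u_i \mid \epsilon_i)$ and $E(\exp(-u_i) \mid \epsilon_i)$. The sketch below writes out the standard normal half-normal formulas for a production frontier, as given in Jondrow et al. (1982) and Battese and Coelli (1988). The function name `jlms_bc_halfnormal` and the residual values are hypothetical, Distributions.jl is assumed to be installed, and the code is not taken from the package source.

```julia
using Distributions   # assumed to be installed; used only for the normal pdf and cdf

# Production frontier (ε = v - u) with v ~ N(0, σᵥ²) and u ~ N⁺(0, σᵤ²):
#   u | ε ~ N⁺(μ*, σ*²),  μ* = -ε σᵤ²/σ²,  σ*² = σᵤ² σᵥ²/σ²,  σ² = σᵤ² + σᵥ².
function jlms_bc_halfnormal(e, sigma_v2, sigma_u2)
    s2 = sigma_v2 + sigma_u2
    mu = -e * sigma_u2 / s2
    sd = sqrt(sigma_u2 * sigma_v2 / s2)
    z = mu / sd
    N = Normal()
    jlms = mu + sd * pdf(N, z) / cdf(N, z)                  # E(u | ε)
    bc = exp(-mu + sd^2 / 2) * cdf(N, z - sd) / cdf(N, z)   # E(exp(-u) | ε)
    return (jlms = jlms, bc = bc)
end

# Variance estimates taken from the table above; the residual values are made up.
for e in (-0.5, 0.0, 0.2)
    r = jlms_bc_halfnormal(e, 0.00621, 0.37867)
    println("ε = ", e, ": JLMS = ", round(r.jlms, digits = 4), ", BC = ", round(r.bc, digits = 4))
end
```

A more negative residual implies a larger JLMS value and a smaller BC value. In this sketch the ratio of the normal pdf to the cdf is computed directly, which may lose precision for large positive residuals.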