Article

Estimating the Confidence Interval for the Common Coefficient of Variation for Multiple Inverse Gaussian Distributions

Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(7), 886; https://doi.org/10.3390/sym16070886
Submission received: 30 May 2024 / Revised: 9 July 2024 / Accepted: 9 July 2024 / Published: 11 July 2024

Abstract

The inverse Gaussian distribution is a two-parameter continuous probability distribution with positive support, which is used to account for the asymmetry of the positively skewed data that are often seen when modeling environmental phenomena, such as PM2.5 levels. The coefficient of variation is often used to assess variability within datasets, and the common coefficient of variation of several independent samples can be used to draw inferences between them. Herein, we provide estimation methods for the confidence interval for the common coefficient of variation of multiple inverse Gaussian distributions by using the generalized confidence interval (GCI), the fiducial confidence interval (FCI), the adjusted method of variance estimates recovery (MOVER), and the Bayesian credible interval (BCI) and highest posterior density (HPD) methods using the Jeffreys prior rule. The estimation methods were evaluated based on their coverage probabilities and average lengths, using a Monte Carlo simulation study. The findings indicate the superiority of the GCI over the other methods for nearly all of the scenarios considered. This was confirmed for a real-world scenario involving PM2.5 data from three provinces in northeastern Thailand that followed inverse Gaussian distributions.

1. Introduction

Air pollution poses a significant environmental threat with dire consequences for public health, a concern that is likely to escalate in the coming years. Despite growing awareness, our understanding of the intricate relationship between air pollution and human health remains incomplete. In the present study, we endeavor to bridge this gap by examining and quantifying the impact of particulate matter with a diameter less than or equal to 2.5 µm (PM2.5) in conjunction with meteorological variables and incidences of circulatory system diseases. Empirical evidence concerning the extremely high PM2.5 concentrations in Thailand’s northeastern and northern regions during certain parts of the year highlights the seriousness of the air pollution situation. Indeed, the PM2.5 levels in these regions were four times higher in April 2022 than the World Health Organization’s annual air quality standard, which indicated hazardous atmospheric conditions. Forecasting PM2.5 levels presents a formidable challenge, due to their inherent variability, with daily PM2.5 data within a given region typically conforming to an inverse Gaussian distribution. Moreover, statistical metrics, such as the coefficient of variation, can be used to test for significance in this context.
The inverse Gaussian distribution involves two positive parameters, the mean ($\mu$) and the scale ($\lambda$), and it maintains a strong connection with the normal distribution. It has been applied to financial modeling, survival analysis, and reliability theory, among others. Its versatility makes it particularly well-suited for analyzing data that deviate from normality. The inverse Gaussian distribution has been utilized across multiple disciplines, including biology (Hsu et al. [1]), pharmacokinetics (Weiss [2]), survival analysis (Khan et al. [3]), demography (Ewbanks [4]), and finance (Balakrishna and Rahul [5], Punzo [6]). Characterized by its ability to model processes with delay and right-skewed data, it provides a suitable framework for analyzing PM2.5 concentrations. The distribution of the data tends to be right-skewed, due to occasional high-pollution events, resulting in a longer tail of extreme values on the higher end. This skewness indicates that PM2.5 data are not symmetric. Utilizing the inverse Gaussian distribution to model PM2.5 concentrations allows for more accurate statistical analyses and predictions. It effectively captures the asymmetry and heavy tail of PM2.5 data, providing insights into the likelihood of extreme pollution events and aiding in assessing its impact on public health. While data transformation using techniques such as logarithms can sometimes be applied to achieve symmetry in the data, the inherent characteristics of PM2.5 data make the inverse Gaussian distribution a more effective choice for modeling and analyzing these environmental data.
Statistical inference can be obtained by testing hypotheses or parameter estimation. The confidence interval comprising an estimator’s minimum and maximum values is the most widely utilized interval estimation method for a parameter. Hsieh [7] calculated the confidence interval for the inverse Gaussian distribution’s coefficient of variation and used it to examine actual data pertaining to runoff volumes at Jug Bridge, Maryland. Non-informative priors for the confidence interval of the common coefficient of variation across two inverse Gaussian distributions were developed by Kang et al. [8]. Chankham et al. [9] expanded upon this concept by offering estimators for the coefficients of variation across various inverse Gaussian distributions.
The coefficient of variation, a unit-free metric, can be used to assess the dispersion within data. The confidence intervals for the coefficient of variation and its many functional derivatives have been estimated by a multitude of authors using various approaches (Pang et al. [10], Hayter [11], Nam and Kwon [12]). Nonetheless, the current study aimed to estimate the confidence interval for the common coefficient of variation of many inverse Gaussian distributions, which had not been done previously. This was accomplished through rigorous analysis and empirical investigation. In previous studies, researchers have devised confidence intervals for the common coefficient of variation of both normal and non-normal distributions. Gupta et al. [13] calculated the asymptotic variance of the common coefficient of variation of normal distributions and then formulated confidence intervals for it. A method of estimating the common coefficient of variation for multiple zero-inflated lognormal distributions was presented by Tian [14], which used the idea of the generalized confidence interval (GCI). Behboodian and Jafari [15] utilized generalized p-values and the GCI in a similar endeavor. Ng [16] employed the generalized variables methodology to estimate confidence intervals for the common coefficient of variation across multiple lognormal distributions. Liu and Xu [17] introduced a technique for constructing confidence interval estimates for the common coefficient of variation across multiple normal populations, using a confidence distribution interval method. In order to estimate the confidence interval for the weighted coefficient of variation in two-parameter exponential distributions, Thangjai and Niwitpong [18] suggested employing the adjusted method of variance estimates recovery (MOVER) methodology. They contrasted the effectiveness of this strategy with that of the high-sample-size method and the generalized confidence interval (GCI) approach. 
According to their findings, positive coefficient of variation values are a good fit for the adjusted MOVER approach. They also found that the weighted coefficient of variation in two-parameter exponential distributions has a best-fit confidence interval, which can be estimated using the GCI approach. The adjusted GCI was used by Thangjai et al. [19] in their recent work, to estimate the confidence interval for the common coefficient of variation of normal distributions using computational techniques. In a comparative analysis with the GCI and adjusted MOVER methods, the adjusted GCI proved effective with small sample sizes, while the computational approach showed efficacy with larger sample sizes. Enhancements in computational methodology and the MOVER framework have been made, to build upon the work of Thangjai et al. [20]. The fiducial GCI methodology is the most effective method for estimating the confidence interval for the common coefficient of variation of several lognormal distributions. The study by Thangjai et al. [20] was restricted in scope, though, as it only looked at positively skewed lognormal distributions.
As mentioned earlier, many researchers have developed confidence intervals for the common coefficient of variation of several normal and non-normal distributions. However, there has not yet been an investigation of statistical inference using the common coefficient of variation of several inverse Gaussian distributions. In the present study, our primary aim was to estimate the confidence interval for the common coefficient of variation of several inverse Gaussian distributions. We achieved this by employing various methodologies: the GCI, the adjusted MOVER, the Bayesian credible interval (BCI) and its highest posterior density counterpart (HPD.BCI), and the fiducial confidence interval (FCI) and its highest posterior density counterpart (HPD.FCI). We evaluated the effectiveness of these methods through rigorous analysis of their coverage probabilities and average lengths for various scenarios in a simulation study. We also applied them in a real-world scenario to analyze PM2.5 data from various areas in northeastern Thailand.

2. Methods

Let $X_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, be a random sample from $k$ inverse Gaussian distributions, denoted as $X_{ij} \sim IG(\mu_i, \lambda_i)$. The probability density function of $X_{ij}$ is defined as
$$f(x_{ij}, \mu_i, \lambda_i) = \left(\frac{\lambda_i}{2\pi x_{ij}^{3}}\right)^{1/2} \exp\left\{-\frac{\lambda_i (x_{ij} - \mu_i)^2}{2\mu_i^2 x_{ij}}\right\}, \quad x_{ij} > 0,\ \mu_i > 0,\ \lambda_i > 0, \tag{1}$$
where $\mu_i$ and $\lambda_i$ are the mean and scale parameters, respectively. Following the method of Ye et al. [21], the respective mean and variance of $X_{ij}$ are
$$E(X_{ij}) = \mu_i, \tag{2}$$
and
$$\mathrm{Var}(X_{ij}) = \frac{\mu_i^3}{\lambda_i}. \tag{3}$$
Hence, the coefficient of variation of $X_{ij}$ can be represented as
$$\mathrm{CV}(X_{ij}) = \tau_i = \frac{\sqrt{\mathrm{Var}(X_{ij})}}{E(X_{ij})} = \sqrt{\frac{\mu_i}{\lambda_i}}. \tag{4}$$
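The moments in Equations (2) to (4) can be checked numerically. The following is a minimal sketch, assuming the transformation-with-multiple-roots sampler of Michael, Schucany, and Haas for drawing inverse Gaussian variates; the function name `sample_inverse_gaussian`, the seed, and the parameter values $\mu = 5$, $\lambda = 10$ are illustrative choices, not part of the original study.

```python
import math
import random
import statistics


def sample_inverse_gaussian(mu, lam, rng):
    """Draw one IG(mu, lam) variate (Michael-Schucany-Haas transformation)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + (mu * mu * y) / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * y + mu * mu * y * y
    )
    # Keep the smaller root with probability mu / (mu + x), else use mu^2 / x.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


rng = random.Random(2024)  # illustrative seed
mu, lam = 5.0, 10.0        # illustrative parameter values
draws = [sample_inverse_gaussian(mu, lam, rng) for _ in range(200_000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
sample_cv = math.sqrt(sample_var) / sample_mean

# Compare Monte Carlo moments with Equations (2), (3), and (4).
print("mean:", sample_mean, "theory:", mu)
print("variance:", sample_var, "theory:", mu**3 / lam)
print("CV:", sample_cv, "theory:", math.sqrt(mu / lam))
```

With a large number of draws, the sample mean, variance, and coefficient of variation should settle near $\mu$, $\mu^3/\lambda$, and $\sqrt{\mu/\lambda}$, respectively.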
For a random sample $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ from $IG(\mu_i, \lambda_i)$, we obtain $\hat\mu_i = \bar X_i$ and $U_i = (n_i - 1)^{-1} \sum_{j=1}^{n_i} \left(X_{ij}^{-1} - \bar X_i^{-1}\right)$. Therefore, $\bar X_i \sim IG(\mu_i, n_i \lambda_i)$ and $(n_i - 1)\lambda_i U_i \sim \chi^2_{n_i - 1}$. First, we consider the square of the coefficient of variation, denoted as
$$\phi_i = \tau_i^2 = \frac{\mu_i}{\lambda_i}. \tag{5}$$
Since the random variables $\bar X_i$ and $U_i$ are independent, we can obtain an unbiased estimator for $\phi_i$ as follows:
$$\hat\phi_i = \hat\mu_i \hat\lambda_i^{-1} = \bar X_i U_i. \tag{6}$$
Using the distributional properties of $\bar X_i$ and $U_i$, we obtain
$$\hat\phi_i = \frac{Z_i Y_i}{v_i}, \tag{7}$$
where $Z_i \sim IG(\phi_i, n_i)$, $Y_i \sim \chi^2_{v_i}$, and $v_i = n_i - 1$. We can use $\phi_i$ instead of $\lambda_i$ and reparametrize the probability density function of $X_{ij}$ as follows:
$$f(x_{ij}, \mu_i, \phi_i) = \left(\frac{\mu_i}{2\pi \phi_i x_{ij}^{3}}\right)^{1/2} \exp\left\{-\frac{1}{2\phi_i}\left(\frac{x_{ij}}{\mu_i} - 2 + \frac{\mu_i}{x_{ij}}\right)\right\}, \quad x_{ij} > 0,\ \mu_i > 0,\ \phi_i > 0. \tag{8}$$
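Using the moments of the inverse Gaussian distribution from Chhikara and Folks [22], we obtain
$$E(\hat\phi_i) = \phi_i, \tag{9}$$
$$E(\hat\phi_i^2) = \phi_i^2 \left(1 + \frac{\phi_i}{n_i}\right)\left(1 + \frac{2}{v_i}\right). \tag{10}$$
Therefore, the variance of $\hat\phi_i$, which is needed later, is given by
$$E(\hat\phi_i - \phi_i)^2 = \mathrm{Var}(\hat\phi_i) = \phi_i^2 \left[\frac{2}{v_i} + \left(1 + \frac{2}{v_i}\right)\frac{\phi_i}{n_i}\right]. \tag{11}$$
The approximately unbiased variance estimate of $\hat\phi_i$ is
$$\widehat{\mathrm{Var}}(\hat\phi_i) = \hat\phi_i^2 \left[\frac{2}{v_i} + \left(1 + \frac{2}{v_i}\right)\frac{\hat\phi_i}{n_i}\right]. \tag{12}$$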
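The estimator of the common squared coefficient of variation, $\phi$, is the weighted average of the estimators $\hat\phi_i$ from the $k$ individual samples:
$$\tilde\phi = \frac{\sum_{i=1}^{k} w_i \hat\phi_i}{\sum_{i=1}^{k} w_i}, \tag{13}$$
where $w_i = 1/\widehat{\mathrm{Var}}(\hat\phi_i)$. Accordingly, the common coefficient of variation can be defined as
$$\tilde\tau = \sqrt{\tilde\phi} = \sqrt{\frac{\sum_{i=1}^{k} w_i \hat\phi_i}{\sum_{i=1}^{k} w_i}}. \tag{14}$$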
Now, we derive the methods to estimate the confidence interval for the common coefficient of variation of multiple inverse Gaussian distributions.
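Before turning to the interval methods, the point estimator itself can be sketched in a few lines. The block below is an illustrative implementation of Equations (6), (12), (13), and (14); the function names and the three small samples are hypothetical and are not the data analyzed in this paper.

```python
import math


def phi_hat(sample):
    """Estimate phi = mu / lambda as xbar * U, following Equation (6)."""
    n = len(sample)
    xbar = sum(sample) / n
    u = sum(1.0 / x - 1.0 / xbar for x in sample) / (n - 1)
    return xbar * u


def var_phi_hat(phi, n):
    """Plug-in variance estimate of phi_hat, following Equation (12)."""
    v = n - 1
    return phi**2 * (2.0 / v + (1.0 + 2.0 / v) * phi / n)


def common_cv(samples):
    """Inverse-variance weighted pooled CV, following Equations (13)-(14)."""
    phis = [phi_hat(s) for s in samples]
    weights = [1.0 / var_phi_hat(p, len(s)) for p, s in zip(phis, samples)]
    phi_tilde = sum(w * p for w, p in zip(weights, phis)) / sum(weights)
    return math.sqrt(phi_tilde)


# Hypothetical positive, right-skewed observations (not the study data).
samples = [
    [21.4, 35.0, 18.2, 44.7, 27.9, 15.3, 31.6, 24.8],
    [19.5, 28.1, 40.3, 16.7, 23.9, 33.2, 12.8],
    [25.6, 17.4, 38.9, 29.3, 14.1, 22.7, 46.5, 20.0, 30.8],
]
tau_tilde = common_cv(samples)
print("pooled coefficient of variation:", tau_tilde)
```

Because the weights are reciprocal variances, samples whose $\hat\phi_i$ is estimated more precisely contribute more to $\tilde\phi$.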

2.1. The Generalized Confidence Interval (GCI)

Weerahandi [23] pioneered the development of the generalized pivotal quantity (GPQ) concept and exploited it to provide the framework for the GCI. This approach provides flexibility as it does not require the assumption of normality, making it well-suited for the inherently skewed and asymmetric nature of the inverse Gaussian distribution. Additionally, the GCI method accounts for the uncertainty of multiple parameters simultaneously, leading to more accurate confidence interval estimation.
Let $X_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, be a random sample with density function $f(X_{ij}, x_{ij}, \mu_i, \lambda_i)$, where $\mu_i$ and $\lambda_i$ are the parameters of interest and $\delta_i$ is a nuisance parameter. Let $x_{ij}$ be the observed values of $X_{ij}$. The GPQ is needed to satisfy the following two properties:
  • The probability distribution of the function $R(X_{ij}, x_{ij}, \mu_i, \lambda_i, \delta_i)$ does not depend on the nuisance parameter.
  • The observed value of $R(X_{ij}, x_{ij}, \mu_i, \lambda_i, \delta_i)$ at $X_{ij} = x_{ij}$ does not depend on the nuisance parameter.
Given that $R_{\gamma}$ is the $100\gamma$-th percentile of $R(X_{ij}, x_{ij}, \mu_i, \lambda_i, \delta_i)$ at $X_{ij} = x_{ij}$, then $(R_{\gamma/2}, R_{1-\gamma/2})$ becomes the $100(1-\gamma)\%$ two-sided GCI for $\mu_i$ and $\lambda_i$. Therefore, it is essential to use the GPQs for $\mu_i$ and $\lambda_i$ to estimate the confidence interval for the common coefficient of variation $\tilde\tau$ for several inverse Gaussian distributions.
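For $k$ individual random samples from inverse Gaussian distributions, the GPQs for $\lambda_i$ and $\mu_i$ can, respectively, be defined following Ye et al. [21] as
$$R_{\lambda_i} = \frac{n_i \lambda_i V_i}{n_i \upsilon_i} \sim \frac{\chi^2_{n_i - 1}}{n_i \upsilon_i}, \tag{15}$$
and
$$R_{\mu_i} = \frac{\bar x_i}{\left|1 + \dfrac{\sqrt{n_i \lambda_i}\,(\bar X_i - \mu_i)}{\mu_i \sqrt{\bar X_i}} \sqrt{\dfrac{\bar x_i}{n_i R_{\lambda_i}}}\right|} \overset{d}{\sim} \frac{\bar x_i}{\left|1 + Z_i \sqrt{\dfrac{\bar x_i}{n_i R_{\lambda_i}}}\right|}, \tag{16}$$
where $V_i = n_i^{-1} \sum_{j=1}^{n_i} \left(X_{ij}^{-1} - \bar X_i^{-1}\right)$, so that $n_i \lambda_i V_i \sim \chi^2_{n_i - 1}$, $\upsilon_i$ denotes the observed value of $V_i$, and $Z_i$ is a standard normal random variable. Accordingly, the GPQ for $\hat\phi_i$ becomes
$$R_{\hat\phi_i} = \frac{R_{\mu_i}}{R_{\lambda_i}}. \tag{17}$$
By using Equations (13) and (14), the GPQ for $\tilde\tau$ can be formulated as
$$R_{\tilde\tau} = \sqrt{\frac{\sum_{i=1}^{k} R_{w_i} R_{\hat\phi_i}}{\sum_{i=1}^{k} R_{w_i}}}, \tag{18}$$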
where $R_{w_i} = 1/R_{\mathrm{Var}(\hat\phi_i)}$ and $R_{\mathrm{Var}(\hat\phi_i)}$ is given by
$$R_{\mathrm{Var}(\hat\phi_i)} = R_{\hat\phi_i}^2 \left[\frac{2}{R_{v_i}} + \left(1 + \frac{2}{R_{v_i}}\right)\frac{R_{\hat\phi_i}}{n_i - 1}\right], \tag{19}$$
where $R_{v_i} = n_i - 1 - 1$. Therefore, the $100(1-\gamma)\%$ two-sided confidence interval for $\tilde\tau$ based on the GCI is
$$CI_{\tilde\tau(GCI)} = [L_{\tilde\tau(GCI)}, U_{\tilde\tau(GCI)}] = [R_{\tilde\tau}(\gamma/2), R_{\tilde\tau}(1 - \gamma/2)], \tag{20}$$
where $R_{\tilde\tau}(\gamma/2)$ and $R_{\tilde\tau}(1-\gamma/2)$ denote the $100(\gamma/2)$-th and $100(1-\gamma/2)$-th percentiles of the distribution of $R_{\tilde\tau}$, respectively. Algorithm 1 delineates the step-by-step computational process for constructing the GCI:
Algorithm 1: The GCI method
  • Generate $x_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, from an inverse Gaussian distribution.
  • Calculate $\hat\mu_i$ and $\hat\lambda_i$.
  • Generate $\chi^2_{n_i - 1}$ and $Z_i$ from the chi-square and standard normal distributions, respectively.
  • Calculate $R_{\lambda_i}$, $R_{\mu_i}$, and $R_{\hat\phi_i}$, using Equations (15), (16), and (17), respectively.
  • Compute $R_{\tilde\tau}$, using Equation (18).
  • Repeat Steps 2–5 5000 times.
  • Compute the $100(1-\gamma)\%$ confidence interval for $\tilde\tau$ based on the GCI.
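The steps of Algorithm 1 can be sketched as follows for a single observed data set. This is an illustrative reading rather than the authors' code: the name `gci_common_cv` and the sample values are hypothetical, the chi-square variate is drawn as a gamma variate with shape $(n_i - 1)/2$ and scale 2, and the variance GPQ reuses the form of Equation (12) with $v_i = n_i - 1$ and $R_{\hat\phi_i}$ plugged in, which is an assumption made here for simplicity.

```python
import math
import random


def gci_common_cv(samples, gamma=0.05, draws=5000, seed=1):
    """Percentile interval built from simulated GPQs of the common CV."""
    rng = random.Random(seed)
    summaries = []
    for s in samples:
        n = len(s)
        xbar = sum(s) / n
        v_obs = sum(1.0 / x - 1.0 / xbar for x in s) / n  # observed V_i
        summaries.append((n, xbar, v_obs))

    r_tau = []
    for _ in range(draws):
        num = 0.0
        den = 0.0
        for n, xbar, v_obs in summaries:
            chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)  # chi-square, n-1 df
            z = rng.gauss(0.0, 1.0)
            r_lam = chi2 / (n * v_obs)                                  # Eq. (15)
            r_mu = xbar / abs(1.0 + z * math.sqrt(xbar / (n * r_lam)))  # Eq. (16)
            r_phi = r_mu / r_lam                                        # Eq. (17)
            dof = n - 1  # assumption: same variance form as Equation (12)
            r_var = r_phi**2 * (2.0 / dof + (1.0 + 2.0 / dof) * r_phi / n)
            num += r_phi / r_var
            den += 1.0 / r_var
        r_tau.append(math.sqrt(num / den))                              # Eq. (18)

    r_tau.sort()
    lo_idx = math.floor((gamma / 2.0) * (draws - 1))
    hi_idx = math.ceil((1.0 - gamma / 2.0) * (draws - 1))
    return r_tau[lo_idx], r_tau[hi_idx]


# Hypothetical observations (not the study data).
samples = [
    [21.4, 35.0, 18.2, 44.7, 27.9, 15.3, 31.6, 24.8],
    [19.5, 28.1, 40.3, 16.7, 23.9, 33.2, 12.8],
    [25.6, 17.4, 38.9, 29.3, 14.1, 22.7, 46.5, 20.0, 30.8],
]
lower, upper = gci_common_cv(samples)
print("95% GCI for the common CV:", (lower, upper))
```

In a simulation study, this function would be called once per generated data set, and the resulting intervals would be scored for coverage and length.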

2.2. The Bayesian Methods

Bayesian inference involves revising initial beliefs by considering fresh evidence, leading to the derivation of a posterior distribution. For random samples $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ from $IG(\mu_i, \lambda_i)$, the joint likelihood function can be expressed as
$$L(\mu_i, \lambda_i \mid X_{ij}) \propto \left(\frac{\lambda_i}{2\pi}\right)^{n_i/2} \prod_{j=1}^{n_i} X_{ij}^{-3/2} \exp\left\{-\lambda_i \sum_{j=1}^{n_i} \frac{(X_{ij} - \mu_i)^2}{2\mu_i^2 X_{ij}}\right\}. \tag{21}$$
By utilizing Bayes’ theorem to obtain the posterior distribution, we derive
$$\pi(\mu_i, \lambda_i \mid X_{ij}) \propto L(\mu_i, \lambda_i \mid X_{ij}) \times \pi(\mu_i) \times \pi(\lambda_i), \tag{22}$$
where $\pi(\mu_i)$ and $\pi(\lambda_i)$ constitute the prior distributions for $\mu_i$ and $\lambda_i$, respectively. To formulate the Fisher information matrix for the unknown parameters, we employ the second-order partial derivatives of the log-likelihood function with respect to the unknown parameters, which gives
$$I(\mu_i, \lambda_i) = \mathrm{diag}\left[\frac{n_1 \lambda_1}{\mu_1^{3}},\ \frac{1}{2\lambda_1^{2}},\ \ldots,\ \frac{n_k \lambda_k}{\mu_k^{3}},\ \frac{1}{2\lambda_k^{2}}\right]. \tag{23}$$
In the ensuing sections, we cover the application of the Jeffreys prior rule to constructing both BCI and HPD intervals. Within the Bayesian framework, the methodology pertaining to the inverse Gaussian distribution hinges significantly on the choice of parameterization [24]. Instead of directly using the mean, a more beneficial method involves using the reciprocal of the mean and considering $(\delta, \lambda)$, where $\delta = \mu^{-1}$. This approach aids in deriving manageable expressions for both the joint and marginal posterior distributions. The Jeffreys prior rule can be used to generate the posterior distribution when both parameters are unknown, thereby eliminating the need to assume an informative prior. Although choosing a natural conjugate prior is a viable alternative, it presents challenges in selecting hyperparameter values, which can introduce bias in the inference. By using the Jeffreys prior rule, the marginal posterior distributions for both $\lambda_i$ and $\delta_i$ can, respectively, be derived as
f ( λ i | x i j ) G a m m a ( n i j 2 , β i ) ,
and
f ( δ i | x i j ) = 1 Φ ( n i 1 2 λ i 1 2 x ¯ i 1 2 ) ( 2 n i j λ i x ¯ i π ) 1 2 e x p n i λ i x ¯ i 1 2 + x ¯ i 1 ,
where $\beta_i = \frac{1}{2\hat\mu_i^2} \sum_{j=1}^{n_i} \frac{(x_{ij} - \hat\mu_i)^2}{x_{ij}}$; $\Phi$ is the cumulative distribution function of the standard normal distribution; and $\hat\mu_i$ and $\hat\lambda_i$ are the maximum likelihood estimators of $\mu_i$ and $\lambda_i$, given by $\hat\mu_i = \bar x_i$ and $\hat\lambda_i = n_i / \sum_{j=1}^{n_i} \left(x_{ij}^{-1} - \bar x_i^{-1}\right)$, respectively. In the present work, we assume that both $\mu_i$ and $\lambda_i$ are unknown. Within the Markov chain Monte Carlo (MCMC) framework, Gibbs sampling was employed to draw from the posterior and fiducial distributions of the parameters (Gelfand and Smith [25]). The Gibbs sampler draws from the joint posterior distribution by cycling through the variables one at a time, sampling each from its conditional distribution while keeping the other variables fixed. Convergence of the sampler was checked using both numerical and graphical summaries. Through iteration, the sampler progressively refines the samples, ultimately yielding a representative approximation of the posterior distribution, which is particularly useful in complex Bayesian models where direct sampling may be infeasible. The posterior draws of $\mu_i$ and $\lambda_i$ are substituted into Equations (4), (5), (12), and (14) to obtain draws of the common coefficient of variation, from which the BCIs and HPD intervals are generated. The $100(1-\gamma)\%$ two-sided confidence interval for the common coefficient of variation based on the BCI method becomes
$$CI_{\tilde\tau(BCI)} = [L_{\tilde\tau(BCI)}, U_{\tilde\tau(BCI)}], \tag{26}$$
where $L_{\tilde\tau(BCI)}$ and $U_{\tilde\tau(BCI)}$ are the lower and upper bounds of the $100(1-\gamma)\%$ equal-tailed credible interval or of the HPD interval of $\tilde\tau$, respectively. The highest posterior density (HPD) interval is the shortest interval that contains the specified posterior probability. Every value inside it has a higher posterior density than any value outside it. Consequently, the HPD interval offers a concise summary of the most probable values given the data and model.
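The difference between the two interval types can be illustrated on simulated draws. The sketch below is a generic illustration, not the routine used in this study: `equal_tailed_interval` takes empirical percentiles, while `hpd_interval` scans all windows of sorted draws that span roughly the requested probability and keeps the shortest one, which approximates the HPD interval for a unimodal distribution. The gamma-distributed draws merely stand in for a right-skewed posterior.

```python
import math
import random


def equal_tailed_interval(draws, prob=0.95):
    """Empirical percentile interval with (1 - prob) / 2 in each tail."""
    xs = sorted(draws)
    n = len(xs)
    alpha = 1.0 - prob
    lo = xs[math.floor(alpha / 2.0 * (n - 1))]
    hi = xs[math.ceil((1.0 - alpha / 2.0) * (n - 1))]
    return lo, hi


def hpd_interval(draws, prob=0.95):
    """Shortest window of sorted draws spanning about `prob` of the sample."""
    xs = sorted(draws)
    n = len(xs)
    gap = max(1, min(n - 1, int(round(prob * n))))
    widths = [xs[i + gap] - xs[i] for i in range(n - gap)]
    start = min(range(n - gap), key=widths.__getitem__)
    return xs[start], xs[start + gap]


rng = random.Random(11)  # illustrative seed
# Stand-in for right-skewed posterior draws (illustrative only).
posterior_draws = [rng.gammavariate(2.0, 1.0) for _ in range(20_000)]

et = equal_tailed_interval(posterior_draws)
hpd = hpd_interval(posterior_draws)
print("equal-tailed:", et, "length:", et[1] - et[0])
print("HPD:", hpd, "length:", hpd[1] - hpd[0])
```

For a right-skewed distribution, the HPD interval shifts toward the mode and is no longer than the equal-tailed interval, which is why the two are reported separately.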

2.3. The Fiducial Confidence Interval (FCI)

Due to the pioneering work by Fisher [26], fiducial inference has emerged as a pivotal concept that is a significant departure from conventional statistical methods. Parameters in fiducial inference are regarded as random variables. Thus, their distributions (called fiducial distributions) are based only on the observed data and do not depend on any prior distributions. Fiducial intervals can be interpreted similarly to Bayesian credible intervals, thereby providing a direct probabilistic interpretation of parameter estimates. This can be more intuitive for practitioners who prefer understanding uncertainty in terms of probability. The FCI offers a blend of frequentist and Bayesian features, thereby providing an interpretable, flexible, and computationally efficient method for parameter estimation and uncertainty quantification. It is particularly beneficial in settings where traditional methods are challenged or where prior information is unavailable or undesirable. In addition, random samples are produced by utilizing point and interval estimates of the unknown parameters along with maximum likelihood estimation. Despite its complexity, the application of the fiducial method to the parameters of the inverse Gaussian distribution, especially when coupled with MCMC, can be accomplished in the following manner:
$$\mu_i^{(FCI)} \sim IG(\hat\mu_i, n_i \hat\lambda_i), \tag{27}$$
and
$$\lambda_i^{(FCI)} \sim \frac{\hat\lambda_i}{n_i} \chi^2_{n_i - 1}, \tag{28}$$
where $\hat\mu_i$ and $\hat\lambda_i$ are the maximum likelihood estimators for $\mu_i$ and $\lambda_i$, respectively.
In the present study, we employed the Gibbs sampler technique to draw samples from the fiducial distribution. In addition, we implemented a simultaneous procedure for estimating the fiducial values, wherein we substituted the Bayesian posterior distribution with the fiducial distribution. Thus, by inserting the fiducial draws of $\mu_i$ and $\lambda_i$ into Equations (4), (5), (12), and (14), we were able to apply the FCI to estimate the confidence interval for the common coefficient of variation of multiple inverse Gaussian distributions.
Therefore, the $100(1-\gamma)\%$ two-sided confidence interval for the common coefficient of variation based on the FCI method becomes
$$CI_{\tilde\tau(FCI)} = [L_{\tilde\tau(FCI)}, U_{\tilde\tau(FCI)}], \tag{29}$$
where $L_{\tilde\tau(FCI)}$ and $U_{\tilde\tau(FCI)}$ are the lower and upper bounds of the $100(1-\gamma)\%$ equal-tailed FCI or of the HPD interval of $\tilde\tau$, respectively.
The value of $\tilde\tau$ for the BCI, HPD.BCI, FCI, and HPD.FCI can be estimated by applying Algorithm 2:
Algorithm 2: The BCI, HPD.BCI, FCI, and HPD.FCI methods
  • Generate $x_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, from an inverse Gaussian distribution.
  • Compute the MLEs $\hat\mu_{MLE}$ and $\hat\lambda_{MLE}$ and set the initial values $\mu_i^{0} = \hat\mu_{MLE}$ and $\lambda_i^{0} = \hat\lambda_{MLE}$.
  • Generate $\mu_i^{1}$ and $\lambda_i^{1}$ from their respective posterior distributions, as detailed in Equations (24) and (25), utilizing the updated sample observations.
  • Repeat Steps 2 and 3, starting from the current values of $\mu_i^{1}$ and $\lambda_i^{1}$, for $t = 20{,}000$ MCMC replications, ending with $\mu_i^{t}$ and $\lambda_i^{t}$.
  • Calculate the desired parameters after discarding the first 1000 samples as burn-in.
  • Compute the $100(1-\gamma)\%$ confidence interval for $\tilde\tau$ based on the BCI and FCI.
  • Compute HPD.BCI and HPD.FCI, using the HPDinterval function in the R software, version 4.2.2.
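The fiducial branch of Algorithm 2 can be sketched as below. This is a simplified illustration under stated assumptions: it draws $\mu_i$ and $\lambda_i$ independently from Equations (27) and (28) by direct Monte Carlo instead of running a Gibbs sampler (so no burn-in is needed), it samples the inverse Gaussian with the Michael-Schucany-Haas transformation, and the function names and data are hypothetical.

```python
import math
import random


def rinvgauss(mu, lam, rng):
    """One IG(mu, lam) draw via the Michael-Schucany-Haas transformation."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu * mu * y / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * y + mu * mu * y * y
    )
    return x if rng.random() <= mu / (mu + x) else mu * mu / x


def fiducial_common_cv_draws(samples, draws=5000, seed=7):
    """Draws of the common CV from the fiducial quantities (27)-(28)."""
    rng = random.Random(seed)
    mles = []
    for s in samples:
        n = len(s)
        mu_hat = sum(s) / n
        lam_hat = n / sum(1.0 / x - 1.0 / mu_hat for x in s)
        mles.append((n, mu_hat, lam_hat))

    out = []
    for _ in range(draws):
        num = 0.0
        den = 0.0
        for n, mu_hat, lam_hat in mles:
            mu = rinvgauss(mu_hat, n * lam_hat, rng)                  # Eq. (27)
            lam = lam_hat / n * rng.gammavariate((n - 1) / 2.0, 2.0)  # Eq. (28)
            phi = mu / lam                                            # Eq. (5)
            v = n - 1
            w = 1.0 / (phi**2 * (2.0 / v + (1.0 + 2.0 / v) * phi / n))  # Eq. (12)
            num += w * phi
            den += w
        out.append(math.sqrt(num / den))                              # Eq. (14)
    return out


# Hypothetical observations (not the study data).
samples = [
    [21.4, 35.0, 18.2, 44.7, 27.9, 15.3, 31.6, 24.8],
    [19.5, 28.1, 40.3, 16.7, 23.9, 33.2, 12.8],
    [25.6, 17.4, 38.9, 29.3, 14.1, 22.7, 46.5, 20.0, 30.8],
]
tau_draws = sorted(fiducial_common_cv_draws(samples))
m = len(tau_draws)
fci = (tau_draws[math.floor(0.025 * (m - 1))], tau_draws[math.ceil(0.975 * (m - 1))])
print("95% equal-tailed FCI for the common CV:", fci)
```

An HPD.FCI can be obtained from the same draws by replacing the percentile step with a shortest-interval search.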

2.4. Adjusted MOVER

We use the large sample technique to compute the adjusted MOVER, building on the foundation for MOVER first proposed by Donner and Zou [27]. Again, Equation (13) can be used to define the pooled estimator of the common parameter. For the two parameters of interest, $\phi_1$ and $\phi_2$, we can employ their estimators, $\hat\phi_1$ and $\hat\phi_2$, which are independent, to define the lower limit $L$ and the upper limit $U$ for $\phi_1 + \phi_2$ as
$$[L, U] = \hat\phi_1 + \hat\phi_2 \pm z_{\gamma/2} \sqrt{\mathrm{Var}(\hat\phi_1) + \mathrm{Var}(\hat\phi_2)}, \tag{30}$$
where $z_{\gamma/2}$ is the $100(\gamma/2)$-th percentile of the standard normal distribution. Through the application of the central limit theorem, the variance estimates for $\hat\phi_i$ at $\phi_i = l_i$, $i = 1, 2$, can be, respectively, defined as
$$\mathrm{Var}(\hat\phi_{l_1}) = \frac{(\hat\phi_1 - l_1)^2}{z_{\gamma/2}^2} \tag{31}$$
and
$$\mathrm{Var}(\hat\phi_{l_2}) = \frac{(\hat\phi_2 - l_2)^2}{z_{\gamma/2}^2}, \tag{32}$$
where $l_1$ and $l_2$ are the lower limits of $\phi_1$ and $\phi_2$, respectively. Furthermore, the variance estimates for $\hat\phi_i$ at $\phi_i = u_i$, $i = 1, 2$, can be, respectively, defined as
$$\mathrm{Var}(\hat\phi_{u_1}) = \frac{(u_1 - \hat\phi_1)^2}{z_{\gamma/2}^2} \tag{33}$$
and
$$\mathrm{Var}(\hat\phi_{u_2}) = \frac{(u_2 - \hat\phi_2)^2}{z_{\gamma/2}^2}, \tag{34}$$
where $u_1$ and $u_2$ are the upper limits of $\phi_1$ and $\phi_2$, respectively. Based on Equation (30), the $100(1-\gamma)\%$ confidence limits for $\phi_1 + \phi_2$ can be expressed as
$$L = \hat\phi_1 + \hat\phi_2 - \sqrt{(\hat\phi_1 - l_1)^2 + (\hat\phi_2 - l_2)^2} \tag{35}$$
and
$$U = \hat\phi_1 + \hat\phi_2 + \sqrt{(u_1 - \hat\phi_1)^2 + (u_2 - \hat\phi_2)^2}. \tag{36}$$
For $k$ independent samples to which the adjusted MOVER is applied, $L$ and $U$ for the sum of the $\phi_i$ can be written as
$$[L, U] = (\hat\phi_1 + \cdots + \hat\phi_k) \pm z_{\gamma/2} \sqrt{\mathrm{Var}(\hat\phi_1) + \cdots + \mathrm{Var}(\hat\phi_k)}. \tag{37}$$
The variance estimates of $\hat\phi_i$ at $\phi_i = l_i$ and $\phi_i = u_i$, where $i = 1, 2, \ldots, k$, are provided by
$$\mathrm{Var}(\hat\phi_{l_i}) = \frac{(\hat\phi_i - l_i)^2}{z_{\gamma/2}^2} \tag{38}$$
and
$$\mathrm{Var}(\hat\phi_{u_i}) = \frac{(u_i - \hat\phi_i)^2}{z_{\gamma/2}^2}. \tag{39}$$
In this study, the lower and upper bounds of $\phi_i$ were established using the Wald confidence interval as follows:
$$[l_i, u_i] = \left[\frac{(n_i - 1)\hat\phi_i}{\chi^2_{1-\gamma/2,\, n_i-1}},\ \frac{(n_i - 1)\hat\phi_i}{\chi^2_{\gamma/2,\, n_i-1}}\right]. \tag{40}$$
When utilizing the large sample concept for the interval estimation of $\phi$, the variance estimate for $\hat\phi_i$ can be defined as
$$\mathrm{Var}_w(\hat\phi_i) = \frac{1}{2}\left[\frac{(\hat\phi_i - l_i)^2}{z_{\gamma/2}^2} + \frac{(u_i - \hat\phi_i)^2}{z_{\gamma/2}^2}\right], \quad i = 1, 2, \ldots, k. \tag{41}$$
Therefore, the $100(1-\gamma)\%$ two-sided confidence interval for $\tilde\tau$, using the adjusted MOVER with the Wald confidence interval, becomes
$$CI_{\tilde\tau(A.MOVER)} = [L_{\tilde\tau(A.MOVER)}, U_{\tilde\tau(A.MOVER)}], \tag{42}$$
where
$$L_{\tilde\tau(A.MOVER)} = \tilde\phi - z_{\gamma/2} \sqrt{\frac{1}{\sum_{i=1}^{k} 1/\widehat{\mathrm{Var}}(\hat\phi_{l_i})}} = \tilde\phi - \sqrt{\frac{1}{\sum_{i=1}^{k} 1/(\hat\phi_i - l_i)^2}} \tag{43}$$
and
$$U_{\tilde\tau(A.MOVER)} = \tilde\phi + z_{\gamma/2} \sqrt{\frac{1}{\sum_{i=1}^{k} 1/\widehat{\mathrm{Var}}(\hat\phi_{u_i})}} = \tilde\phi + \sqrt{\frac{1}{\sum_{i=1}^{k} 1/(u_i - \hat\phi_i)^2}}, \tag{44}$$
where $\tilde\phi$ is the pooled estimator and $\hat\phi_i$ is defined as in Equation (6). The confidence interval derived from the adjusted MOVER method can be readily constructed using Algorithm 3.
Algorithm 3: The adjusted MOVER method
  • Generate $x_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, from an inverse Gaussian distribution.
  • Compute $\hat\mu_i$ and $\hat\lambda_i$.
  • Compute $\hat\phi_i$ and $\mathrm{Var}_w(\hat\phi_i)$.
  • Compute $l_i$ and $u_i$.
  • Compute the $100(1-\gamma)\%$ confidence interval for $\tilde\tau$.
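The steps of Algorithm 3 can be sketched as follows. Several choices here are assumptions made for illustration only: chi-square quantiles are approximated with the Wilson-Hilferty formula (reasonable for about 3 or more degrees of freedom), the pooled center weights each $\hat\phi_i$ by the reciprocal of $\mathrm{Var}_w(\hat\phi_i)$, and the limits on the $\phi$ scale are mapped to the coefficient-of-variation scale by a square root. The function names and data are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (assumption)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


def adjusted_mover_common_cv(samples, gamma=0.05):
    """Adjusted MOVER limits, mapped to the CV scale by a square root."""
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    phis, lowers, uppers, weights = [], [], [], []
    for s in samples:
        n = len(s)
        xbar = sum(s) / n
        u_stat = sum(1.0 / x - 1.0 / xbar for x in s) / (n - 1)
        phi = xbar * u_stat                                       # Eq. (6)
        l = (n - 1) * phi / chi2_quantile(1.0 - gamma / 2.0, n - 1)
        u = (n - 1) * phi / chi2_quantile(gamma / 2.0, n - 1)
        var_w = 0.5 * ((phi - l) ** 2 + (u - phi) ** 2) / z**2
        phis.append(phi)
        lowers.append(l)
        uppers.append(u)
        weights.append(1.0 / var_w)  # assumption: weights from Var_w

    center = sum(w * p for w, p in zip(weights, phis)) / sum(weights)
    half_lo = math.sqrt(1.0 / sum(1.0 / (p - l) ** 2 for p, l in zip(phis, lowers)))
    half_hi = math.sqrt(1.0 / sum(1.0 / (u - p) ** 2 for p, u in zip(phis, uppers)))
    # Guard against a negative lower limit before taking the square root.
    return math.sqrt(max(center - half_lo, 0.0)), math.sqrt(center + half_hi)


# Hypothetical observations (not the study data).
samples = [
    [21.4, 35.0, 18.2, 44.7, 27.9, 15.3, 31.6, 24.8],
    [19.5, 28.1, 40.3, 16.7, 23.9, 33.2, 12.8],
    [25.6, 17.4, 38.9, 29.3, 14.1, 22.7, 46.5, 20.0, 30.8],
]
lower, upper = adjusted_mover_common_cv(samples)
print("95% adjusted MOVER interval for the common CV:", (lower, upper))
```

Unlike the simulation-based methods, this interval is available in closed form once the per-sample limits $l_i$ and $u_i$ are known.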

3. The Simulation Study and Results

We employed the R statistical software in conjunction with Monte Carlo simulation techniques to calculate coverage probabilities and average lengths for the different confidence interval estimation methods: GCI, BCI, HPD.BCI, FCI, HPD.FCI, and the adjusted MOVER. The most effective method for the given scenario achieved a coverage probability that met or exceeded the nominal confidence level of 0.95 and had the shortest average length. In each simulation, 10,000 random samples from an inverse Gaussian distribution along with 5000 pivotal quantities for the GCI, BCI, and FCI methods were generated. The number of populations was $k = 3$ or $5$. The sample sizes were $n = 30$, $50$, or $100$; the means were $\mu = 5$ or $7$; and the scale parameters were $\lambda = 10$, $20$, $30$, or $40$.
Plots of the coverage probabilities and average lengths for the six confidence interval estimation methods across various sample sizes are provided in Figure 1, Figure 2, Figure 3 and Figure 4. The simulation results for $k = 3$ are reported in Table 1. It can be seen that the coverage probabilities for the GCI were greater than or close to the nominal confidence level of 0.95 for most scenarios. One noteworthy result from the study was that the FCI performed well in situations where the sample sizes of each group were unequal. However, the coverage probabilities for the BCI, HPD.BCI, HPD.FCI, and the adjusted MOVER were below the 0.95 threshold in all the scenarios. As the sample sizes were increased, the coverage probabilities for the BCI, HPD.BCI, HPD.FCI, and the adjusted MOVER improved but remained under the nominal confidence level of 0.95. When examining the average lengths, the adjusted MOVER typically provided the shortest, with the BCI and HPD.BCI following closely. From the findings for $k = 5$ in Table 2, it can be observed that the GCI method provided coverage probabilities of at least 0.95 only when the sample size was 100. The other methods yielded similar results to those for $k = 3$. When considering the average lengths for both $k = 3$ and $5$, it was found that an increase in sample size and scale resulted in shorter average lengths in all cases. Overall, the GCI outperformed the others in the various simulation study scenarios by meeting the criteria for both efficiency and accuracy.
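The two evaluation criteria used above can be computed with a small generic harness. The sketch below is illustrative only: `coverage_and_length` is a hypothetical helper, and the interval plugged into it is a textbook z-interval for a normal mean with known standard deviation, used purely as a stand-in so that the expected coverage is known; any of the interval methods in Section 2 could be passed in its place.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_and_length(draw_sample, make_interval, true_value, reps, rng):
    """Monte Carlo coverage probability and average interval length."""
    hits = 0
    lengths = []
    for _ in range(reps):
        lo, hi = make_interval(draw_sample(rng))
        hits += lo <= true_value <= hi
        lengths.append(hi - lo)
    return hits / reps, fmean(lengths)


# Stand-in example: z-interval for a normal mean with known sd = 1.
N_OBS = 30
Z_975 = NormalDist().inv_cdf(0.975)


def draw_normal_sample(rng):
    return [rng.gauss(0.0, 1.0) for _ in range(N_OBS)]


def z_interval(sample):
    center = fmean(sample)
    half = Z_975 / math.sqrt(len(sample))
    return center - half, center + half


rng = random.Random(123)  # illustrative seed
coverage, avg_length = coverage_and_length(
    draw_normal_sample, z_interval, true_value=0.0, reps=5000, rng=rng
)
print("coverage probability:", coverage)
print("average length:", avg_length)
```

A method is preferred when its estimated coverage reaches the nominal level and, among such methods, its average length is smallest.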

4. An Empirical Example with Real PM2.5 Data

For this part of the study, we used daily PM2.5 data from October to December 2023 from the Nakhon Ratchasima, Nong Khai, and Ubon Ratchathani provinces in northeastern Thailand (Table 3 [28]). The Q–Q plots in Figure 5, Figure 6 and Figure 7 illustrate that the positive data conformed to inverse Gaussian distributions, as also evidenced by the lowest Akaike information criterion (AIC) and Bayesian information criterion (BIC) values in Table 4 and Table 5, respectively. We first utilized the Kolmogorov–Smirnov (KS) test to determine whether the PM2.5 data from the three provinces followed inverse Gaussian distributions. This test evaluates whether the data adhere to an inverse Gaussian distribution by comparing the p-value with a chosen significance level, commonly set as 0.05. If the p-value is below this threshold, the null hypothesis is rejected, indicating that the data do not follow the specified distribution. For our analysis, the KS test produced p-values of 0.2172 for Nakhon Ratchasima, 0.2146 for Nong Khai, and 0.2812 for Ubon Ratchathani. Since all these p-values were above the 0.05 significance level, we concluded that the PM2.5 data from these provinces fitted the inverse Gaussian distribution model.
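The goodness-of-fit check described above can be sketched as follows. The block uses the closed-form inverse Gaussian distribution function and the asymptotic Kolmogorov series for the p-value; both helper names and the data are hypothetical, and the p-value is only approximate (and tends to be too large) when $\mu$ and $\lambda$ are estimated from the same sample, which is the situation assumed here.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf


def invgauss_cdf(x, mu, lam):
    """CDF of IG(mu, lam); exp(2 * lam / mu) may overflow for very large lam / mu."""
    a = math.sqrt(lam / x)
    return _PHI(a * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * _PHI(-a * (x / mu + 1.0))


def ks_statistic(sample, mu, lam):
    """Largest gap between the empirical CDF and the IG(mu, lam) CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = invgauss_cdf(x, mu, lam)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue_asymptotic(d, n, terms=100):
    """Asymptotic Kolmogorov p-value: 2 * sum (-1)^(k-1) exp(-2 k^2 n d^2)."""
    t = math.sqrt(n) * d
    if t < 0.2:  # the series converges slowly here and the p-value is ~1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t) for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical daily concentrations (not the study data).
data = [21.4, 35.0, 18.2, 44.7, 27.9, 15.3, 31.6, 24.8, 19.5, 28.1,
        40.3, 16.7, 23.9, 33.2, 12.8, 25.6, 17.4, 38.9, 29.3, 14.1]
n = len(data)
mu_hat = sum(data) / n
lam_hat = n / sum(1.0 / x - 1.0 / mu_hat for x in data)  # MLE of lambda

d_stat = ks_statistic(data, mu_hat, lam_hat)
p_value = ks_pvalue_asymptotic(d_stat, n)
print("KS statistic:", d_stat, "approximate p-value:", p_value)
```

A p-value above the chosen significance level means the inverse Gaussian model is not rejected for that sample.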
Table 6 provides the summary statistics derived from the three PM2.5 datasets. The common coefficient of variation for these three datasets was calculated as 0.4732. We subsequently employed the various methods detailed herein to estimate the 95% confidence interval for the common coefficient of variation of these three inverse Gaussian distributions, as reported in Table 7. Consistent with our simulation findings for $k = 3$, in which the GCI provided a coverage probability close to the nominal confidence level of 0.95, the GCI yielded the shortest interval for these data.

5. Discussion

Chankham et al. [29] proposed the simultaneous confidence interval for the ratios of the coefficients of variation of multiple inverse Gaussian distributions and its application to PM2.5 data. We extended this idea to construct estimators for the confidence interval for the common coefficient of variation of several inverse Gaussian distributions. In our case, the findings from the simulation study imply that the GCI method is superior to the FCI, HPD.FCI, BCI, HPD.BCI, and adjusted MOVER methods for almost all cases. However, the FCI method also performed well, particularly when the sample sizes were not equal, by providing coverage probabilities that met or exceeded the nominal confidence level. The GCI method, while not always providing the shortest interval, balanced precision and reliability well, offering a reasonably short interval with high coverage probability. The Bayesian method was not the most effective approach, likely owing to the configuration of the Jeffreys prior distribution. Moreover, other algorithms, such as Lindley’s approximation, could be used to address the issue of a non-closed-form posterior distribution. Inverse Gaussian distributions are inherently asymmetrical, and this asymmetry affects the performance of any applied statistical method, so understanding it is crucial for choosing an effective one. Transforming data from an inverse Gaussian distribution to a normal distribution can potentially make the data more symmetric, depending on the transformation method and the nature of the data [30]. Common transformations used to normalize data include the Box–Cox transformation and the Yeo–Johnson transformation.
The simulation results were supported by the results for the real-world example of analyzing P M 2.5 levels in northeastern Thailand; data from Nakhon Ratchasima, Nong Khai, and Ubon Ratchathani provinces were examined using the inverse Gaussian distribution to assess the variability in P M 2.5 levels in these provinces. Once again, the GCI method proved to be the most effective by providing the most accurate confidence interval with the shortest average length, thus confirming its better precision and reliability in the context of P M 2.5 data analysis. These findings have practical implications for public health policies and pollution control strategies. Accurate statistical modeling of P M 2.5 levels could enhance the capability of forecasting pollution episodes, allowing for timely public health warnings and better air quality management. Understanding and managing P M 2.5 is crucial, due to its significant health and environmental impacts, and statistical methods provide valuable tools for analyzing and interpreting P M 2.5 data, leading to informed decisions and policies aimed at reducing pollution and protecting public health.

6. Conclusions

In this paper, we employed the GCI, BCI, HPD.BCI, FCI, HPD.FCI, and adjusted MOVER methods to estimate the confidence interval for the common coefficient of variation of multiple inverse Gaussian distributions. We evaluated their performances across various simulation scenarios, using coverage probability and average length as metrics. The simulation study findings revealed that the coverage probabilities of the GCI, FCI, and HPD.FCI methods met or exceeded the nominal confidence level of 0.95. Notably, the GCI method emerged as the most suitable approach for both k = 3 and 5. As an empirical example, PM2.5 datasets from the Nakhon Ratchasima, Nong Khai, and Ubon Ratchathani provinces in northeastern Thailand were utilized to assess the efficacy of the various methods. For these data, the GCI method yielded the widest of the six intervals in Table 7, which is consistent with the simulation results for k = 3, where it traded somewhat longer intervals for coverage close to the nominal level. Hence, the GCI method is recommended for estimating the confidence interval for the common coefficient of variation of multiple inverse Gaussian distributions, with the FCI and HPD.FCI methods also being suitable under certain circumstances.
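The two evaluation metrics are simple to compute once the simulated intervals are available. The sketch below shows one way to do it, assuming each Monte Carlo replication has already produced a (lower, upper) pair; the function name `coverage_and_length` and the toy intervals are hypothetical and are not taken from the simulation study.

```python
from typing import Iterable, Tuple


def coverage_and_length(
    intervals: Iterable[Tuple[float, float]], true_value: float
) -> Tuple[float, float]:
    """Return (coverage probability, average length) over replicated intervals."""
    pairs = list(intervals)
    if not pairs:
        raise ValueError("at least one interval is required")
    # Coverage probability: share of intervals that contain the true value.
    hits = sum(1 for lower, upper in pairs if lower <= true_value <= upper)
    # Average length: mean of upper minus lower.
    total_length = sum(upper - lower for lower, upper in pairs)
    return hits / len(pairs), total_length / len(pairs)


# Toy intervals (hypothetical, for illustration only); true common CV set to 0.50.
toy_intervals = [(0.40, 0.60), (0.45, 0.70), (0.52, 0.66), (0.38, 0.49)]
cp, al = coverage_and_length(toy_intervals, true_value=0.50)
print(f"CP = {cp:.4f}, AL = {al:.4f}")
```

In a full study the list would hold one interval per replication for a given method and parameter setting, and the resulting pair would correspond to one cell of Table 1 or Table 2.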
The GCI method was chosen due to its flexibility and suitability for skewed distributions, making it an appropriate choice for inverse Gaussian distributions. Its inclusion was further supported by extensive documentation and widespread recognition in statistical inference. The adjusted MOVER method is known for providing robust interval estimates even with small sample sizes, enhancing its practical applicability and reinforcing its acceptance in the statistical literature. The BCI method leverages Bayesian inference to incorporate prior information, producing credible intervals that reflect posterior distributions. The distinct methodological foundation of the Bayesian framework justifies the BCI method’s inclusion. The FCI method combines elements of both frequentist and Bayesian inference, providing a unique approach to interval estimation, and its theoretical appeal makes it a compelling choice for comparative analysis. However, while these four approaches are robust and well recognized, a more exhaustive comparative study is needed to strengthen the contribution. The analysis could be enriched by including additional methods, such as bootstrap methods, which provide insights into the robustness and performance of interval estimation in various scenarios, and percentile intervals, which offer a non-parametric alternative to the selected methods. Exploring different Bayesian methods with varying priors would also add depth to the study.
We used various approaches to estimate the confidence interval for the common coefficient of variation of several inverse Gaussian distributions. The findings from an analysis of PM2.5 concentrations from three pollution monitoring stations in northeastern Thailand aligned well with the outcomes of a simulation study, with the GCI method performing the best in most scenarios, thereby confirming the validity of our approach. Our research will be extended in the future to determine the simultaneous confidence intervals for the difference between the percentiles of multiple inverse Gaussian distributions.
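As an indication of what a bootstrap percentile interval would involve, the sketch below computes one for the coefficient of variation of a single sample. It is not one of the methods evaluated in this paper. The function names, the number of resamples, and the seed are our own choices, and the ten observations are the first row of the Nakhon Ratchasima data in Table 3.

```python
import random
import statistics


def sample_cv(x):
    """Sample coefficient of variation: standard deviation over mean."""
    return statistics.stdev(x) / statistics.fmean(x)


def bootstrap_percentile_ci(x, stat, level=0.95, n_boot=2000, seed=1):
    """Percentile bootstrap interval for stat(x), resampling with replacement."""
    rng = random.Random(seed)
    n = len(x)
    reps = sorted(stat(rng.choices(x, k=n)) for _ in range(n_boot))
    alpha = 1.0 - level
    lower_idx = int((alpha / 2.0) * (n_boot - 1))
    upper_idx = int((1.0 - alpha / 2.0) * (n_boot - 1))
    return reps[lower_idx], reps[upper_idx]


# First ten daily PM2.5 readings for Nakhon Ratchasima (Table 3).
readings = [13.5, 15.8, 15.2, 10.1, 11.7, 9.9, 10.3, 11.5, 14.3, 15.0]
lower, upper = bootstrap_percentile_ci(readings, sample_cv)
print(f"sample CV = {sample_cv(readings):.4f}")
print(f"95% percentile bootstrap interval: ({lower:.4f}, {upper:.4f})")
```

Extending this to the common coefficient of variation of k samples would require resampling within each sample and recomputing the pooled estimator, which is left outside this sketch.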

Author Contributions

Conceptualization, S.-A.N.; methodology, W.C., S.-A.N. and S.N.; software, W.C.; formal analysis, W.C. and S.N.; investigation, S.-A.N. and S.N.; project administration, S.-A.N.; resources, S.-A.N.; data curation, W.C.; writing—original draft, W.C.; writing—review and editing, S.-A.N. and S.N.; supervision, S.-A.N. and S.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by King Mongkut’s University of Technology North Bangkok, contract number: KMUTNB-PHD-63-01.

Data Availability Statement

The Pollution Control Department provided the PM2.5 concentration data [28].

Acknowledgments

The authors wish to extend their thanks to King Mongkut’s University of Technology North Bangkok for supporting their research and offering a space for programming.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIC: Akaike information criterion
AL: average length
A.MOVER: adjusted method of variance estimates recovery
BCI: Bayesian credible interval
BIC: Bayesian information criterion
CI: confidence interval
CP: coverage probability
CV: coefficient of variation
FCI: fiducial confidence interval
GCI: generalized confidence interval
GPQ: generalized pivotal quantity
HPD.BCI: highest posterior density based on Bayesian method
HPD.FCI: highest posterior density based on fiducial method
MCMC: Markov chain Monte Carlo
MOVER: method of variance estimates recovery

References

  1. Hsu, A.; Ferrage, F.; Palmer, A.G. Analysis of NMR Spin Relaxation Data Using an Inverse Gaussian Distribution Function. Biophys. J. 2018, 115, 2301–2309.
  2. Weiss, M. A note on the role of generalized inverse Gaussian distributions of circulatory transit times in pharmacokinetics. J. Math. Biol. 1984, 20, 95–102.
  3. Khan, N.; Akhtar, T.; Ali Khan, A. A Bayesian Approach to Survival Analysis of Inverse Gaussian Model with Laplace Approximation. Int. J. Stat. Appl. 2016, 6, 391–398.
  4. Ewbank, D.C. Mortality differences by APOE genotype estimated from demographic synthesis. Genet. Epidemiol. 2002, 22, 146–155.
  5. Balakrishna, N.; Rahul, T. Inverse Gaussian Distribution for Modeling Conditional Durations in Finance. Commun. Stat.-Simul. Comput. 2002, 43, 476–486.
  6. Punzo, A. A new look at the inverse Gaussian distribution with applications to insurance and economic data. J. Appl. Stat. 2018, 46, 1260–1287.
  7. Hsieh, H.K. Inferences on the coefficient of variation of an inverse gaussian distribution. Commun. Stat.—Theory Methods 1990, 19, 1589–1605.
  8. Kang, S.G.; Kim, D.H.; Lee, W.D. Noninformative Priors for the Coefficient of variation in Two Inverse Gaussian Distributions. Commun. Korean Stat. Soc. 2008, 15, 429–440.
  9. Chankham, W.; Niwitpong, S.-A.; Niwitpong, S. Confidence Intervals for the Difference Between the Coefficients of Variation of Inverse Gaussian Distributions. In Proceedings of the Integrated Uncertainty in Knowledge Modelling and Decision Making, Ishikawa, Japan, 2 November 2022; Springer: Cham, Switzerland; pp. 372–383.
  10. Pang, C.K.; Leung, P.K.; Huang, W.K.; Liu, W. On interval estimation of the coefficient of variation for the three-parameter Weibull, lognormal and gamma distribution: A simulation-based approach. Eur. J. Oper. Res. 2005, 164, 367–377.
  11. Hayter, A.J. Confidence bounds on the coefficient of variation of a normal distribution with applications to win-probabilities. J. Stat. Comput. Simul. 2015, 85, 3778–3791.
  12. Nam, J.; Kwon, D. Inference on the ratio of two coefficients of variation of two lognormal distributions. Commun. Stat. Theory Methods 2016, 46, 8575–8587.
  13. Gupta, R.C.; Ramakrishnan, S.; Zhou, X. Point and interval estimation of P(X>Y): The normal case with common coefficient of variation. Ann. Inst. Stat. Math. 1999, 51, 571–584.
  14. Tian, L. Inference on the mean of zero-inflated lognormal data: The generalized variable approach. Stat. Med. 2005, 24, 3223–3232.
  15. Behboodian, J.; Jafari, A. Generalized confidence interval for the common coefficient of variation. J. Stat. Adv. Theory Appl. 2008, 7, 349–363.
  16. Ng, C.K. Inference on the common coefficient of variation when populations are lognormal: A simulation-based approach. J. Stat. Adv. Theory Appl. 2014, 11, 117–134.
  17. Liu, X.; Xu, X.A. A note on combined inference on the common coefficient of variation using confidence distributions. Electron. J. Stat. 2018, 9, 219–233.
  18. Thangjai, W.; Niwitpong, S.-A. Confidence intervals for the weighted coefficients of variation of two-parameter exponential distributions. Cogent Math. 2017, 4, 131588.
  19. Thangjai, W.; Niwitpong, S.-A.; Niwitpong, S. Adjusted generalized confidence intervals for the common coefficient of variation of several normal populations. Commun. Stat.-Simul. Comput. 2020, 49, 194–206.
  20. Thangjai, W.; Niwitpong, S.-A.; Niwitpong, S. Confidence intervals for the common coefficient of variation of rainfall in Thailand. PeerJ 2020, 8, e10004.
  21. Ye, R.-D.; Ma, T.-F.; Wang, S.-G. Inferences on the common mean of several inverse Gaussian populations. Comput. Stat. Data Anal. 2010, 54, 906–915.
  22. Chhikara, R.S.; Folks, J.L. The Inverse Gaussian Distribution; Marcel Dekker: New York, NY, USA, 1989.
  23. Weerahandi, S. Generalized confidence intervals. J. Am. Stat. Assoc. 1993, 88, 899–905.
  24. Amry, Z. Bayes Estimator for inverse Gaussian Distribution with Jeffrey’s Prior. SCIREA J. Math. 2021, 6, 44–50.
  25. Gelfand, A.E.; Smith, A.F.M. Sampling-based approaches to calculating marginal densities. J. Am. Stat. Assoc. 1990, 85, 398–409.
  26. Fisher, R.A. Statistical Methods and Scientific Inference; Hafner Publishing Co.: New York, NY, USA, 1973.
  27. Zou, G.; Donner, A. Construction of confidence limits about effect measures: A general approach. Stat. Med. 2008, 27, 1693–1702.
  28. Report on Regional Air Quality and Situation. Available online: http://air4thai.pcd.go.th/webV3/ (accessed on 3 January 2024).
  29. Chankham, W.; Niwitpong, S.-A.; Niwitpong, S. The Simultaneous Confidence Interval for the Ratios of the Coefficients of Variation of Multiple Inverse Gaussian Distributions and Its Application to PM2.5 Data. Symmetry 2024, 16, 331.
  30. Whitmore, G.A.; Yalovsky, M. A normalizing logarithmic transformation for inverse Gaussian random variables. Technometrics 1978, 20, 207–208.
Figure 1. Coverage probabilities for the 95% CI obtained using different methods for various sample sizes when k = 3.
Figure 2. Average lengths for the 95% CI obtained using different methods for various sample sizes when k = 3.
Figure 3. Coverage probabilities for the 95% CI obtained using different methods for various sample sizes when k = 5.
Figure 4. Average lengths for the 95% CI obtained using different methods for various sample sizes when k = 5.
Figure 5. Q–Q plots of PM2.5 data from Nakhon Ratchasima.
Figure 6. Q–Q plots of PM2.5 data from Nong Khai.
Figure 7. Q–Q plots of PM2.5 data from Ubon Ratchathani.
Table 1. The results of the 95% confidence interval, two-sided for coverage probability and average length, for the common coefficient of variation of inverse Gaussian for k = 3.
Entries are coverage probability (average length).
| n_i | μ_i | λ_i | CI_GCI | CI_BCI | CI_HPD.BCI | CI_FCI | CI_HPD.FCI | CI_A.MOVER |
|---|---|---|---|---|---|---|---|---|
| (30, 30, 30) | (5, 5, 5) | (10, 10, 10) | 0.9502 (0.2808) | 0.8364 (0.2286) | 0.8247 (0.2273) | 0.8946 (0.2577) | 0.8886 (0.2564) | 0.8896 (0.2137) |
| | | (20, 20, 20) | 0.9521 (0.1837) | 0.8765 (0.1607) | 0.8632 (0.1560) | 0.9263 (0.1743) | 0.9015 (0.1734) | 0.9166 (0.1517) |
| | | (30, 30, 30) | 0.9125 (0.1436) | 0.8233 (0.1291) | 0.8014 (0.1284) | 0.8942 (0.1381) | 0.8625 (0.1373) | 0.8666 (0.1226) |
| | | (40, 40, 40) | 0.9351 (0.1229) | 0.8621 (0.1137) | 0.8541 (0.1131) | 0.9125 (0.1206) | 0.8954 (0.1198) | 0.9047 (0.1074) |
| | (7, 7, 7) | (10, 10, 10) | 0.9520 (0.3545) | 0.7512 (0.2752) | 0.7368 (0.2736) | 0.8623 (0.3171) | 0.8312 (0.3156) | 0.7962 (0.2499) |
| | | (20, 20, 20) | 0.9842 (0.2241) | 0.8741 (0.1914) | 0.8621 (0.1903) | 0.9251 (0.2111) | 0.9002 (0.2101) | 0.9306 (0.1803) |
| | | (30, 30, 30) | 0.9524 (0.1754) | 0.9008 (0.1555) | 0.8741 (0.1546) | 0.9235 (0.1679) | 0.9145 (0.1670) | 0.9162 (0.1464) |
| | | (40, 40, 40) | 0.9362 (0.1492) | 0.8945 (0.1340) | 0.8642 (0.1331) | 0.9212 (0.1433) | 0.8952 (0.1425) | 0.8941 (0.1269) |
| (50, 50, 50) | (5, 5, 5) | (10, 10, 10) | 0.9604 (0.2145) | 0.8941 (0.1764) | 0.8712 (0.1757) | 0.9251 (0.1984) | 0.9265 (0.1977) | 0.8994 (0.1641) |
| | | (20, 20, 20) | 0.9651 (0.1382) | 0.8602 (0.1229) | 0.8641 (0.1224) | 0.9254 (0.1323) | 0.9174 (0.1318) | 0.9152 (0.1156) |
| | | (30, 30, 30) | 0.9522 (0.1082) | 0.8621 (0.0990) | 0.8642 (0.0986) | 0.9356 (0.1045) | 0.8945 (0.1041) | 0.8912 (0.0933) |
| | | (40, 40, 40) | 0.9425 (0.0933) | 0.8921 (0.0869) | 0.8841 (0.0864) | 0.9123 (0.0910) | 0.9142 (0.0906) | 0.9213 (0.0825) |
| | (7, 7, 7) | (10, 10, 10) | 0.9781 (0.2692) | 0.8621 (0.2083) | 0.8544 (0.2074) | 0.8824 (0.2422) | 0.8712 (0.2414) | 0.8863 (0.1932) |
| | | (20, 20, 20) | 0.9621 (0.1689) | 0.8362 (0.1460) | 0.8341 (0.1453) | 0.9325 (0.1598) | 0.9004 (0.1592) | 0.9152 (0.1367) |
| | | (30, 30, 30) | 0.9642 (0.1331) | 0.9152 (0.1186) | 0.9142 (0.1180) | 0.9254 (0.1270) | 0.9263 (0.1264) | 0.8952 (0.1120) |
| | | (40, 40, 40) | 0.9542 (0.1113) | 0.9012 (0.1017) | 0.8962 (0.1012) | 0.9421 (0.1075) | 0.9254 (0.1070) | 0.8942 (0.0965) |
| (100, 100, 100) | (5, 5, 5) | (10, 10, 10) | 0.9752 (0.1475) | 0.8641 (0.1206) | 0.8754 (0.1203) | 0.9241 (0.1354) | 0.9263 (0.1350) | 0.8742 (0.1146) |
| | | (20, 20, 20) | 0.9452 (0.0955) | 0.8741 (0.0845) | 0.8824 (0.0842) | 0.9378 (0.0904) | 0.9152 (0.0921) | 0.8952 (0.0812) |
| | | (30, 30, 30) | 0.9452 (0.0741) | 0.8941 (0.0683) | 0.8851 (0.0680) | 0.9162 (0.0716) | 0.9266 (0.0714) | 0.8942 (0.0658) |
| | | (40, 40, 40) | 0.9452 (0.0634) | 0.9004 (0.0594) | 0.9128 (0.0592) | 0.9362 (0.0617) | 0.9351 (0.0615) | 0.9152 (0.0571) |
| | (7, 7, 7) | (10, 10, 10) | 0.9452 (0.1845) | 0.7952 (0.1427) | 0.7541 (0.1422) | 0.8614 (0.1653) | 0.8321 (0.1649) | 0.8024 (0.1341) |
| | | (20, 20, 20) | 0.9652 (0.1171) | 0.9275 (0.1001) | 0.9132 (0.0998) | 0.9241 (0.1094) | 0.9265 (0.1090) | 0.9004 (0.0963) |
| | | (30, 30, 30) | 0.9652 (0.0903) | 0.8641 (0.0806) | 0.8652 (0.0804) | 0.9148 (0.0859) | 0.9058 (0.0856) | 0.8932 (0.0781) |
| | | (40, 40, 40) | 0.9742 (0.0765) | 0.9124 (0.0699) | 0.8651 (0.0696) | 0.9485 (0.0736) | 0.9521 (0.0733) | 0.9452 (0.0675) |
| (30, 50, 100) | (5, 5, 5) | (10, 10, 10) | 0.9541 (0.1967) | 0.8841 (0.1613) | 0.8862 (0.1607) | 0.8941 (0.1812) | 0.8952 (0.1805) | 0.8541 (0.1483) |
| | | (20, 20, 20) | 0.9521 (0.1295) | 0.8962 (0.1149) | 0.8851 (0.1144) | 0.9263 (0.1229) | 0.9247 (0.1225) | 0.9238 (0.1054) |
| | | (30, 30, 30) | 0.9652 (0.0996) | 0.9241 (0.0909) | 0.9184 (0.0905) | 0.9452 (0.0957) | 0.9378 (0.0953) | 0.9352 (0.0855) |
| | | (40, 40, 40) | 0.9584 (0.0848) | 0.9248 (0.0792) | 0.9128 (0.0789) | 0.9563 (0.0826) | 0.9384 (0.0823) | 0.9385 (0.0744) |
| | (7, 7, 7) | (10, 10, 10) | 0.9752 (0.2474) | 0.8257 (0.1923) | 0.8614 (0.1916) | 0.8962 (0.2218) | 0.8812 (0.2211) | 0.8004 (0.1732) |
| | | (20, 20, 20) | 0.9652 (0.1558) | 0.8759 (0.1342) | 0.8621 (0.1337) | 0.9015 (0.1467) | 0.8942 (0.1462) | 0.8541 (0.1240) |
| | | (30, 30, 30) | 0.9652 (0.1212) | 0.8941 (0.1085) | 0.8874 (0.1081) | 0.9541 (0.1158) | 0.9452 (0.1154) | 0.9251 (0.1017) |
| | | (40, 40, 40) | 0.9521 (0.1033) | 0.9158 (0.0645) | 0.9147 (0.0642) | 0.9435 (0.0996) | 0.9261 (0.0993) | 0.9111 (0.0884) |
The coverage probabilities exceed the nominal confidence level of 0.95, and the shortest average lengths are highlighted in bold.
Table 2. The results of the 95% confidence interval, two-sided for coverage probability and average length, for the common coefficient of variation of inverse Gaussian for k = 5.
Entries are coverage probability (average length).
| n_i | μ_i | λ_i | CI_GCI | CI_BCI | CI_HPD.BCI | CI_FCI | CI_HPD.FCI | CI_A.MOVER |
|---|---|---|---|---|---|---|---|---|
| (30, 30, 30, 30, 30) | (5, 5, 5, 5, 5) | (10, 10, 10, 10, 10) | 0.9215 (0.2249) | 0.8241 (0.1853) | 0.8354 (0.1847) | 0.9012 (0.2079) | 0.9005 (0.2073) | 0.8465 (0.1643) |
| | | (20, 20, 20, 20, 20) | 0.9362 (0.1476) | 0.8641 (0.1303) | 0.8561 (0.1298) | 0.9024 (0.1408) | 0.9152 (0.1404) | 0.8962 (0.1175) |
| | | (30, 30, 30, 30, 30) | 0.9058 (0.1164) | 0.8517 (0.1048) | 0.8941 (0.1045) | 0.9314 (0.1120) | 0.9251 (0.1116) | 0.9041 (0.0961) |
| | | (40, 40, 40, 40, 40) | 0.9152 (0.0978) | 0.8214 (0.0903) | 0.8325 (0.0900) | 0.9215 (0.0955) | 0.9321 (0.0952) | 0.8962 (0.0817) |
| | (7, 7, 7, 7, 7) | (10, 10, 10, 10, 10) | 0.9354 (0.2829) | 0.8214 (0.2251) | 0.8421 (0.2245) | 0.8990 (0.2590) | 0.8975 (0.2583) | 0.8562 (0.1956) |
| | | (20, 20, 20, 20, 20) | 0.9452 (0.1792) | 0.8412 (0.1553) | 0.8241 (0.1548) | 0.9041 (0.1705) | 0.9025 (0.1700) | 0.8962 (0.1381) |
| | | (30, 30, 30, 30, 30) | 0.9201 (0.1407) | 0.8574 (0.1245) | 0.8923 (0.1241) | 0.9052 (0.1342) | 0.9152 (0.1338) | 0.8962 (0.1128) |
| | | (40, 40, 40, 40, 40) | 0.9233 (0.1170) | 0.8462 (0.1057) | 0.8672 (0.1053) | 0.9041 (0.1130) | 0.9154 (0.1126) | 0.8962 (0.0976) |
| (50, 50, 50, 50, 50) | (5, 5, 5, 5, 5) | (10, 10, 10, 10, 10) | 0.9325 (0.1702) | 0.8674 (0.1403) | 0.8942 (0.1399) | 0.9074 (0.1572) | 0.9286 (0.1568) | 0.9042 (0.1201) |
| | | (20, 20, 20, 20, 20) | 0.9152 (0.1091) | 0.8425 (0.0971) | 0.8635 (0.0968) | 0.9124 (0.1042) | 0.9152 (0.1039) | 0.8752 (0.0899) |
| | | (30, 30, 30, 30, 30) | 0.9384 (0.0862) | 0.8542 (0.0792) | 0.8641 (0.0789) | 0.9214 (0.0836) | 0.9257 (0.0833) | 0.8965 (0.0728) |
| | | (40, 40, 40, 40, 40) | 0.9236 (0.0733) | 0.8425 (0.0683) | 0.8562 (0.0681) | 0.9132 (0.0715) | 0.9263 (0.0713) | 0.9025 (0.0629) |
| | (7, 7, 7, 7, 7) | (10, 10, 10, 10, 10) | 0.9352 (0.2145) | 0.8574 (0.1681) | 0.8462 (0.1676) | 0.9021 (0.1944) | 0.9185 (0.1939) | 0.8862 (0.1492) |
| | | (20, 20, 20, 20, 20) | 0.9462 (0.1349) | 0.8851 (0.1164) | 0.8751 (0.1160) | 0.8962 (0.1271) | 0.9041 (0.1268) | 0.9063 (0.1056) |
| | | (30, 30, 30, 30, 30) | 0.9231 (0.1050) | 0.8963 (0.0942) | 0.8854 (0.0939) | 0.9024 (0.1008) | 0.9152 (0.1005) | 0.9063 (0.0865) |
| | | (40, 40, 40, 40, 40) | 0.9321 (0.0892) | 0.8851 (0.0814) | 0.8542 (0.0811) | 0.9035 (0.0861) | 0.8962 (0.0859) | 0.8966 (0.0750) |
| (100, 100, 100, 100, 100) | (5, 5, 5, 5, 5) | (10, 10, 10, 10, 10) | 0.9563 (0.1152) | 0.9062 (0.0945) | 0.9163 (0.0942) | 0.9257 (0.1060) | 0.9653 (0.1056) | 0.9258 (0.1152) |
| | | (20, 20, 20, 20, 20) | 0.9452 (0.0742) | 0.9152 (0.0663) | 0.9062 (0.0661) | 0.9321 (0.0409) | 0.9251 (0.0707) | 0.9064 (0.0627) |
| | | (30, 30, 30, 30, 30) | 0.9325 (0.0582) | 0.9041 (0.0535) | 0.9063 (0.0534) | 0.9254 (0.0561) | 0.9365 (0.0560) | 0.9152 (0.0510) |
| | | (40, 40, 40, 40, 40) | 0.9263 (0.0495) | 0.9124 (0.0464) | 0.9354 (0.0463) | 0.9452 (0.0483) | 0.9462 (0.0481) | 0.9263 (0.0442) |
| | (7, 7, 7, 7, 7) | (10, 10, 10, 10, 10) | 0.9653 (0.1459) | 0.8652 (0.1136) | 0.8962 (0.1133) | 0.9215 (0.1318) | 0.9362 (0.1314) | 0.9035 (0.1046) |
| | | (20, 20, 20, 20, 20) | 0.9563 (0.0914) | 0.9241 (0.0789) | 0.9152 (0.0787) | 0.9065 (0.0860) | 0.9263 (0.0858) | 0.9284 (0.0742) |
| | | (30, 30, 30, 30, 30) | 0.9362 (0.0708) | 0.8652 (0.0636) | 0.8752 (0.0634) | 0.9349 (0.0676) | 0.9465 (0.0674) | 0.9052 (0.0603) |
| | | (40, 40, 40, 40, 40) | 0.9562 (0.0600) | 0.9045 (0.0551) | 0.9154 (0.0549) | 0.9365 (0.0579) | 0.9452 (0.0577) | 0.9055 (0.0522) |
| (30, 30, 50, 100, 100) | (5, 5, 5, 5, 5) | (10, 10, 10, 10, 10) | 0.9362 (0.1542) | 0.8421 (0.1273) | 0.8325 (0.1270) | 0.9052 (0.1453) | 0.9421 (0.1448) | 0.8962 (0.1120) |
| | | (20, 20, 20, 20, 20) | 0.9263 (0.0993) | 0.8542 (0.0887) | 0.8541 (0.0880) | 0.9063 (0.0957) | 0.9152 (0.0954) | 0.9251 (0.0794) |
| | | (30, 30, 30, 30, 30) | 0.9245 (0.0785) | 0.8896 (0.0722) | 0.8942 (0.0720) | 0.9062 (0.0764) | 0.9163 (0.0762) | 0.9005 (0.0650) |
| | | (40, 40, 40, 40, 40) | 0.9365 (0.0662) | 0.8852 (0.0622) | 0.8964 (0.0620) | 0.9152 (0.0651) | 0.9047 (0.0649) | 0.8990 (0.0562) |
| | (7, 7, 7, 7, 7) | (10, 10, 10, 10, 10) | 0.9452 (0.1949) | 0.8542 (0.1528) | 0.8635 (0.1529) | 0.9042 (0.1814) | 0.9062 (0.1809) | 0.8992 (0.1329) |
| | | (20, 20, 20, 20, 20) | 0.9321 (0.1214) | 0.8841 (0.1056) | 0.8762 (0.1053) | 0.8952 (0.1165) | 0.8475 (0.1162) | 0.9025 (0.0937) |
| | | (30, 30, 30, 30, 30) | 0.9251 (0.0950) | 0.8952 (0.0855) | 0.8842 (0.0853) | 0.8942 (0.0920) | 0.9025 (0.0917) | 0.9066 (0.0770) |
| | | (40, 40, 40, 40, 40) | 0.9321 (0.0799) | 0.8624 (0.0734) | 0.8651 (0.0732) | 0.9005 (0.0779) | 0.9041 (0.0776) | 0.9065 (0.0667) |
The coverage probabilities exceed the nominal confidence level of 0.95, and the shortest average lengths are highlighted in bold.
Table 3. The daily PM2.5 data for each province in northeastern Thailand from May to June 2022.
| Area | Daily PM2.5 |
|---|---|
| Nakhon Ratchasima | 13.5, 15.8, 15.2, 10.1, 11.7, 9.9, 10.3, 11.5, 14.3, 15 |
| | 12.3, 12.9, 14.8, 14, 19.1, 27.6, 31.7, 36.6, 34.7, 35.8 |
| | 20.1, 19, 23.1, 17.4, 13.5, 11.7, 13.5, 14.5, 15.1, 15.3 |
| | 13.8, 20.7, 28.1, 31.7, 30.8, 23.7, 17.2, 15.1, 22.9, 27 |
| | 21.8, 18.1, 15.8, 17.6, 10.5, 12.8, 13.5, 17.1, 25.7, 28.1 |
| | 33, 34.1, 38, 29.2, 17.3, 28, 25.3, 26.4, 36.4, 39.2 |
| | 31.7, 19.2, 13.8, 15.8, 22.5, 22.2, 29.7, 17.9, 33.2, 41.7 |
| | 32.6, 32, 41.3, 27.8, 30.9, 38.3, 36.5, 19.6, 22.3, 26.9 |
| | 15.3, 18.3, 21.6, 30.1, 28.7, 39.7, 39.6, 33.7, 30.9, 28.8 |
| | 25.6, 24.6 |
| Nong Khai | 12.4, 10.5, 12.4, 19.3, 15.4, 10.1, 6.4, 6, 6.5, 18.1 |
| | 18.5, 20.7, 14.3, 23, 15.4, 22.5, 31.2, 40.5, 44.9, 33.5 |
| | 24.3, 27.6, 22.8, 17.5, 17.8, 16.1, 15.1, 18.2, 25.8, 7.4 |
| | 11.9, 20.4, 24.3, 29.2, 75.2, 52.6, 24.5, 18.1, 15.5, 34 |
| | 35.1, 31.9, 25, 9.9, 10.6, 13.6, 10.1, 18.5, 32, 36.2 |
| | 55.6, 39.4, 26.2, 23.7, 13.4, 19.1, 15.4, 12.1, 22.6, 31.4 |
| | 27.6, 9.1, 11.8, 18.4, 46.3, 46.8, 46.2, 38.8, 58.9, 80.3 |
| | 86.5, 83.9, 57.5, 59.5, 59.3, 48.2, 55.1, 14.9, 24.8, 39.7 |
| | 16.5, 14.9, 33.2, 33.2, 38.8, 52.2, 80.4, 72.1, 63.7, 47.6 |
| | 48.8, 44.3 |
| Ubon Ratchathani | 15.9, 13.9, 9.2, 8.8, 11.1, 8.4, 5, 6, 8.9, 13.2 |
| | 7.4, 7.6, 6.9, 9.9, 17.3, 28.9, 36, 38.6, 43.1, 39.1 |
| | 11.9, 8.9, 8.6, 10, 10.5, 11.9, 25.1, 19.6, 16.1, 9.4 |
| | 11, 18.3, 20.7, 15.7, 15.8, 20.2, 16.7, 9.4, 11.5, 15.1 |
| | 13.6, 15.7, 9.9, 6.4, 6.1, 6.4, 7.9, 17.5, 24.3, 26.2 |
| | 26.1, 26.2, 28.7, 16.9, 14.6, 17.3, 7.6, 19, 34.3, 35.5 |
| | 20.8, 7.5, 7, 7.8, 13.2, 13.6, 12.2, 17.9, 34.3, 38.4 |
| | 60.3, 41.7, 30.5, 19.7, 24.8, 24, 23.9, 5.7, 6.7, 9.1 |
| | 4.7, 9.8, 18.6, 21.4, 23.4, 31.8, 33.3, 20.2, 12.5, 10.5 |
| | 12.9, 10.5 |
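Parameter estimates such as those in Table 6 can be obtained from data of this kind. The sketch below assumes the usual maximum likelihood estimators for the inverse Gaussian distribution, μ̂ = x̄ and λ̂ = n / Σ(1/xᵢ − 1/x̄); the estimator actually used in this paper is defined in an earlier section and may differ. For brevity it uses only the first ten Ubon Ratchathani readings from Table 3, so its output is not expected to match Table 6. The function name `ig_mle` is ours.

```python
import math


def ig_mle(x):
    """Maximum likelihood estimates (mu_hat, lambda_hat) for an IG sample."""
    n = len(x)
    if n < 2 or min(x) <= 0:
        raise ValueError("need at least two positive observations")
    mu_hat = sum(x) / n
    # lambda_hat = n / sum(1/x_i - 1/mu_hat); the sum is positive unless
    # all observations are equal (harmonic mean <= arithmetic mean).
    denom = sum(1.0 / xi - 1.0 / mu_hat for xi in x)
    if denom <= 0:
        raise ValueError("observations must not all be equal")
    return mu_hat, n / denom


# First ten daily PM2.5 readings for Ubon Ratchathani (Table 3); a subset only.
subset = [15.9, 13.9, 9.2, 8.8, 11.1, 8.4, 5, 6, 8.9, 13.2]
mu_hat, lam_hat = ig_mle(subset)
tau_hat = math.sqrt(mu_hat / lam_hat)  # plug-in coefficient of variation
print(f"mu_hat = {mu_hat:.4f}, lambda_hat = {lam_hat:.4f}, tau_hat = {tau_hat:.4f}")
```

Running the same function on all 92 readings for a province would give the full-sample estimates under this maximum likelihood assumption.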
Table 4. The AIC values for assessing the distribution of daily PM2.5 data.
| Area | Normal | Exponential | Cauchy | Inverse Gaussian |
|---|---|---|---|---|
| Nakhon Ratchasima | 666.6932 | 765.1523 | 719.4743 | 659.1754 |
| Nong Khai | 809.3776 | 814.5335 | 824.5560 | 770.5842 |
| Ubon Ratchathani | 699.9272 | 712.7021 | 713.192 | 657.7329 |
Table 5. The BIC values for assessing the distribution of daily PM2.5 data.
| Area | Normal | Exponential | Cauchy | Inverse Gaussian |
|---|---|---|---|---|
| Nakhon Ratchasima | 671.7368 | 767.6740 | 724.5179 | 664.2190 |
| Nong Khai | 814.4212 | 817.0553 | 829.5996 | 775.6277 |
| Ubon Ratchathani | 704.9708 | 715.2239 | 718.2356 | 662.7765 |
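Tables 4 and 5 are linked by a simple identity. With AIC = 2p − 2 ln L and BIC = p ln n − 2 ln L for a model with p parameters fitted to n observations, BIC = AIC + p(ln n − 2). The sketch below checks this for the Nakhon Ratchasima row, assuming n = 92 (Table 6) and the usual parameter counts (two for the normal, Cauchy, and inverse Gaussian distributions, one for the exponential); these counts and the function name `bic_from_aic` are our assumptions.

```python
import math


def bic_from_aic(aic: float, n_params: int, n_obs: int) -> float:
    """BIC implied by an AIC value for the same fitted model."""
    return aic + n_params * (math.log(n_obs) - 2.0)


N_OBS = 92  # sample size per province (Table 6)
N_PARAMS = {"Normal": 2, "Exponential": 1, "Cauchy": 2, "Inverse Gaussian": 2}

# Nakhon Ratchasima rows copied from Tables 4 and 5.
aic_nr = {"Normal": 666.6932, "Exponential": 765.1523,
          "Cauchy": 719.4743, "Inverse Gaussian": 659.1754}
bic_nr_reported = {"Normal": 671.7368, "Exponential": 767.6740,
                   "Cauchy": 724.5179, "Inverse Gaussian": 664.2190}

bic_nr = {name: bic_from_aic(aic, N_PARAMS[name], N_OBS)
          for name, aic in aic_nr.items()}
for name, value in bic_nr.items():
    print(f"{name}: implied BIC = {value:.4f}, reported = {bic_nr_reported[name]:.4f}")
```

Both criteria select the inverse Gaussian distribution for this province, since it has the smallest value in each table.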
Table 6. Estimated parameters for the three PM2.5 datasets.
| Area | n_i | μ̂_i | λ̂_i | τ̂_i |
|---|---|---|---|---|
| Nakhon Ratchasima | 92 | 23.2794 | 138.3865 | 0.4101 |
| Nong Khai | 92 | 30.4457 | 65.3413 | 0.6826 |
| Ubon Ratchathani | 92 | 17.5054 | 45.4419 | 0.6207 |
Table 7. The 95% confidence intervals for the common coefficient of variation of the PM2.5 data from the provinces of Nakhon Ratchasima, Nong Khai, and Ubon Ratchathani in Thailand.
| Method | Lower | Upper | Length |
|---|---|---|---|
| GCI | 0.3953 | 0.5619 | 0.1666 |
| BCI | 0.4110 | 0.5437 | 0.1327 |
| HPD.BCI | 0.4088 | 0.5408 | 0.1320 |
| FCI | 0.4097 | 0.5474 | 0.1377 |
| HPD.FCI | 0.4094 | 0.5467 | 0.1373 |
| A.MOVER | 0.4344 | 0.5251 | 0.0907 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chankham, W.; Niwitpong, S.-A.; Niwitpong, S. Estimating the Confidence Interval for the Common Coefficient of Variation for Multiple Inverse Gaussian Distributions. Symmetry 2024, 16, 886. https://doi.org/10.3390/sym16070886
