EXPLORING THE MODIFIED GAMMA FRAILTY DISTRIBUTION: AN OPTIMAL DESIGN APPROACH USING PYTHON

Survival analysis is pivotal in understanding the effects of covariates on potentially censored failure times and in the joint modelling of clustered data. It is used in the context of incomplete repeated measures and failure times in longitudinal studies. Survival data are often subject to right censoring and to a subsequent loss of information about the effect of explanatory variables. Frailty models are one common approach to handling such data. Three frailty models are used to analyze bivariate time-to-event data. All approaches accommodate right-censored lifetime data and account for heterogeneity in the study population. A Modified Gamma Frailty (MGF) model is compared with two existing frailty models. The survival analysis was performed using Python. The newly derived MGF model was analyzed using Python and is more robust when the sample size is greater than forty. The MGF model performs better than the existing models in the presence of clustering. However, the Correlated Gamma Frailty (CGF) model is preferable in the absence of clusters in a given data set.


INTRODUCTION
The design of optimal experiments is crucial in the analysis of survival data. When studying time-to-event or survival data, frailty models, including the gamma frailty model, are essential for capturing unobserved heterogeneity among subjects (Abdulazeez, 2020). Hazard models have become widespread in the analysis of duration time data in many scientific disciplines, including biology and medicine (Cox, 1972; Kalbfleisch & Prentice, 1980), sociology (Petersen, 1998; Vermunt, 1996), marketing research (Vilcassim & Jain, 1991; Wedel et al., 1995; Getachew & Bekele, 2016) and economics (Kiefer, 1988; Lancaster, 1990). These models overcome the problems of accounting for censored observations of duration and time-varying explanatory variables, which arise in applying standard regression-type models to duration data. The basic concept in hazard models is the probability of the occurrence of an event during a certain time interval, say $t$ to $t + \Delta t$, given that it has not occurred before $t$. Writing $T$ for the event time, the corresponding hazard rate is specified as:
$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \quad (1)$$
The Cox proportional hazards model (Cox, 1972) is commonly used in the analysis of survival time data. An often unstated assumption of the proportional hazards model and of traditional frailty models (with the exception of those that use the compound Poisson distribution; Rakhmawati et al., 2021) is that all individuals will experience the event of interest. However, in some situations a fraction of individuals is not expected to experience the event of interest; that is, these individuals are not at risk (Anthony et al., 2019). The terminology used to describe the never-at-risk group varies from field to field, but includes 'long-term survivors' or 'cured' in epidemiology, 'non-susceptibles' in toxicology, 'stayers' in finite Markov transition models of occupational mobility, the 'non-fecundable' in fertility models, and 'non-recidivists' among convicted criminals. In epidemiology and medicine, researchers may be interested in analyzing the occurrence of a disease. Many individuals may never experience that disease; therefore, there exists a fraction in the population that is protected.
Cure models are survival models which allow for a cured fraction in the study population. These models extend the understanding of time-to-event data by allowing for the formulation of more accurate and informative conclusions than previously made. These conclusions would otherwise be unobtainable from an analysis that fails to account for a cured fraction in the population. If a cured component is not present, the analysis reduces to standard approaches of survival analysis. In cure models, the population is divided into two subpopulations so that an individual is either cured with probability $1 - \phi$, or has a proper survival function $S(t)$ with probability $\phi$. Here, a proper survival function means $\lim_{t \to \infty} S(t) = 0$. Individuals regarded as cured will never experience the event of interest and their survival time is defined as infinity. Therefore, the hazard and survival functions of cured individuals are set to zero and one, respectively, for all finite values of $t$.
Longini and Halloran (1996) have proposed frailty cure models that extend standard frailty models. The frailty random variable in the former has point mass at zero with probability $1 - \phi$, while heterogeneity among those experiencing the event of interest is modelled via a continuous distribution with probability $\phi$. Price and Manatunga (2001) gave an excellent introduction to this area and applied different cure, frailty and frailty cure models to leukaemia remission data. They found that frailty models are useful in modelling data with a cured fraction and that the gamma frailty cure model provides a better fit to their remission data than the standard cure model.
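The two-subpopulation structure of a cure model can be illustrated with a short Python sketch. It assumes, purely for illustration, an exponential survival function for the susceptible group; the function name `population_survival` and the parameter `rate` are hypothetical and do not come from the study.

```python
import math


def population_survival(t, phi, rate):
    """Mixture cure survival: (1 - phi) * 1 + phi * S(t).

    phi  : probability of being susceptible (1 - phi is the cured fraction)
    rate : hypothetical exponential hazard for susceptibles (an assumption)
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if t < 0 or rate <= 0:
        raise ValueError("t must be non-negative and rate positive")
    susceptible = math.exp(-rate * t)  # proper survival function: tends to 0
    return (1.0 - phi) + phi * susceptible  # cured individuals have S = 1


# The population curve levels off at the cured fraction 1 - phi.
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, round(population_survival(t, phi=0.7, rate=0.5), 4))
```

With `phi = 1` the cured component disappears and the sketch reduces to the ordinary survival function, which mirrors the remark that the analysis then reduces to standard survival analysis.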
In the next section we describe the existing models and the proposed model, and then apply the models to an existing data set on occupational exposure, referred to as the Iranian data.

MATERIALS AND METHODS

Cox PH models
The notation used for Cox PH models (Cox, 1972; Lee & Song, 2001) is generalized with one more subscript to capture multiple events. Let $T_{ik}$ be the total time of the $k$-th event for the $i$-th subject, and $C_{ik}$ be the censoring time of the $k$-th event for the $i$-th subject. Let $X_{ik}$ be the observation time, that is, $X_{ik} = \min(T_{ik}, C_{ik})$ (2), and let $\delta_{ik} = I(T_{ik} \le C_{ik})$ (3) be an indicator of an observed $k$-th failure time for subject $i$. $Z_{ik} = (Z_{1ik}, Z_{2ik}, \ldots, Z_{pik})$ is the covariate vector for the $i$-th subject with respect to the $k$-th event, and $Z_i = (Z_{i1}', Z_{i2}', \ldots, Z_{iK}')$ denotes the covariate vector for the $i$-th subject, where $K$ is the maximum number of events within a subject. $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$ is a $p \times 1$ vector of unknown parameters. Denote by $h_k(t \mid Z_{ik}(t))$ the hazard function for the $k$-th event of the $i$-th subject at time $t$. This is in the context of competing risks.
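As a minimal sketch of relations (2) and (3), the observation time and the event indicator can be built from latent event and censoring times as follows. The helper name `observe` and the numerical values are hypothetical and serve only to illustrate the definitions.

```python
def observe(event_times, censor_times):
    """Return (X, delta) with X = min(T, C) and delta = 1 if T <= C else 0."""
    if len(event_times) != len(censor_times):
        raise ValueError("event_times and censor_times must have equal length")
    observed = [min(t, c) for t, c in zip(event_times, censor_times)]
    delta = [1 if t <= c else 0 for t, c in zip(event_times, censor_times)]
    return observed, delta


# Hypothetical latent event times T and censoring times C for four subjects.
T = [2.0, 5.5, 3.1, 7.0]
C = [4.0, 5.0, 3.1, 9.0]
X, delta = observe(T, C)
print(X)      # observation times
print(delta)  # 1 = event observed, 0 = right censored
```

In practice only `X` and `delta` are available to the analyst; the latent times appear here solely to show how right censoring arises.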
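In general, the hazard function at time $t$ for a subject is defined as the instantaneous probability of failure at time $t$ given survivorship prior to time $t$ and the covariates:
$$h(t \mid Z(t)) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t, Z(t))}{\Delta t}$$
Note that the Cox PH model for the $k$-th event time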

Correlated Gamma Frailty (CGF) Model
This model was introduced by Yashin & Iachine (1995a,b, 1997, 1999a,b) and applied to related lifetimes in many different settings. Examples are found in Rakhmawati et al. (2021), Pickles et al. (1994), Yashin et al. (1996), Iachine et al. (1998), Iachine (2002), Petersen (1998), Rueten-Budde et al. (2019), Wienke et al. (2000, 2001, 2002, 2003a,b, 2005), Zdravkovic et al. (2002, 2004) and Zhu & Kosorok (2012).
Let $k_0, k_1, k_2$ be some real positive values. Set $\lambda_1 = k_0 + k_1$ and $\lambda_2 = k_0 + k_2$. Let $Y_0, Y_1, Y_2$ be independently gamma distributed random variables (shape, rate) with
$$Y_0 \sim \Gamma(k_0, \lambda_0), \quad Y_1 \sim \Gamma(k_1, \lambda_1), \quad Y_2 \sim \Gamma(k_2, \lambda_2),$$
and define the frailties
$$Z_1 = \frac{\lambda_0}{\lambda_1} Y_0 + Y_1 \sim \Gamma(k_0 + k_1, \lambda_1), \quad Z_2 = \frac{\lambda_0}{\lambda_2} Y_0 + Y_2 \sim \Gamma(k_0 + k_2, \lambda_2),$$
so that $E(Z_j) = 1$ and $V(Z_j) = 1/\lambda_j = \sigma_j^2$ ($j = 1, 2$). The following relation holds:
$$\mathrm{cov}(Z_1, Z_2) = \frac{\lambda_0^2}{\lambda_1 \lambda_2} V(Y_0) = \frac{k_0}{\lambda_1 \lambda_2}.$$
This leads to the correlation
$$\rho = \frac{k_0}{\sqrt{(k_0 + k_1)(k_0 + k_2)}}.$$
Consequently, because of the relation $\lambda_j = 1/\sigma_j^2$,
$$k_0 = \frac{\rho}{\sigma_1 \sigma_2} \quad \text{and} \quad k_j = \frac{1}{\sigma_j^2} - \frac{\rho}{\sigma_1 \sigma_2} \quad (j = 1, 2),$$
with $0 \le \rho \le \min(\sigma_1/\sigma_2, \sigma_2/\sigma_1)$. To derive the unconditional model, the Laplace transform of gamma distributed random variables is applied. Hence,
$$S(t_1, t_2) = S_1(t_1)^{1 - \frac{\sigma_1}{\sigma_2}\rho} \, S_2(t_2)^{1 - \frac{\sigma_2}{\sigma_1}\rho} \left( S_1(t_1)^{-\sigma_1^2} + S_2(t_2)^{-\sigma_2^2} - 1 \right)^{-\frac{\rho}{\sigma_1 \sigma_2}},$$
where $S_1$ and $S_2$ are the marginal survival functions.

The Proposed Model - Modified Gamma Frailty (MGF) Model
In order to include heterogeneity in the model, we assume a correlated gamma frailty model. Let $Z_j$ ($j = 1, 2$) be the frailties and $X_j$ ($j = 1, 2$) the vectors of observable covariates of the two individuals of a twin pair. Assume that their individual hazards are represented by the proportional hazards model
$$\mu(t \mid Z_j, X_j) = Z_j \, \mu_0(t) \exp\{\beta^{T} X_j\} \quad (j = 1, 2) \quad (18)$$
with a baseline hazard function $\mu_0(t)$ describing the risk of respiratory infection as a function of age, where $\beta$ denotes the vector of regression parameters. Let the lifetimes of the two twin partners be conditionally independent given their frailties $Z_1$ and $Z_2$. Because the frailties $Z_j$ ($j = 1, 2$) are usually unobservable, their correlation coefficient cannot be estimated directly from the empirical data, so a bivariate lifetime model which allows indirect calculation of the parameters is needed. The unconditional bivariate survival function of the correlated gamma frailty model with observed covariates (19) is written in terms of $S(t \mid X)$, which denotes the marginal univariate survival function, assumed to be equal for both partners in a twin pair. Using a parametric approach, we fit a model to the data (20), where $a$, $b$, $\sigma_1^2$, $\sigma_2^2$, $\beta$ and $\rho$ are the parameters to be estimated.
The lifetimes are assumed to be independently censored from the right by independent and identically distributed pairs of non-negative random variables, which are independent of the lifetimes. Thus, we observe $(T_{i1}, T_{i2}, \Delta_{i1}, \Delta_{i2}, X_{i1}, X_{i2})$, with $\Delta_{ij}$ ($i = 1, 2, \ldots$; $j = 1, 2$) a binary variable taking the values 1 (event) and 0 (no event). Let the lifetimes follow a distribution (dependent on the covariates $X_1$, $X_2$) given by this bivariate survival function. Starting from this model, the likelihood function is derived using the bivariate survival function and the partial derivatives of the marginal survival functions. The model is called the Modified Gamma Frailty (MGF) Model.
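The additive construction behind the correlated gamma frailty model can be checked numerically. The sketch below assumes the standard construction, in which each frailty is a scaled shared gamma component plus an individual gamma component, with gamma distributions parameterized by shape and rate. It does not reproduce the MGF modification, and all function names are hypothetical.

```python
import math
import random


def simulate_frailty_pairs(sigma1_sq, sigma2_sq, rho, n, seed=0):
    """Draw n pairs (Z1, Z2) with Z_j = (lam0 / lam_j) * Y0 + Y_j."""
    lam1, lam2 = 1.0 / sigma1_sq, 1.0 / sigma2_sq
    k0 = rho * math.sqrt(lam1 * lam2)  # shape of the shared component
    k1, k2 = lam1 - k0, lam2 - k0      # shapes of the individual components
    if min(k0, k1, k2) <= 0:
        raise ValueError("need 0 < rho < min(sigma1/sigma2, sigma2/sigma1)")
    lam0 = 1.0  # arbitrary positive rate; it cancels after rescaling
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate
        y0 = rng.gammavariate(k0, 1.0 / lam0)
        y1 = rng.gammavariate(k1, 1.0 / lam1)
        y2 = rng.gammavariate(k2, 1.0 / lam2)
        pairs.append((lam0 / lam1 * y0 + y1, lam0 / lam2 * y0 + y2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    v2 = sum((b - m2) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(v1 * v2)


def cgf_survival(s1, s2, sigma1_sq, sigma2_sq, rho):
    """Unconditional bivariate survival from marginal survivals s1 and s2."""
    if not (0.0 < s1 <= 1.0 and 0.0 < s2 <= 1.0):
        raise ValueError("marginal survival values must lie in (0, 1]")
    sigma1, sigma2 = math.sqrt(sigma1_sq), math.sqrt(sigma2_sq)
    shared = (s1 ** -sigma1_sq + s2 ** -sigma2_sq - 1.0) ** (-rho / (sigma1 * sigma2))
    return (s1 ** (1.0 - sigma1 / sigma2 * rho)
            * s2 ** (1.0 - sigma2 / sigma1 * rho)
            * shared)


pairs = simulate_frailty_pairs(0.5, 0.8, 0.4, n=20000, seed=1)
print("mean of Z1:", sum(a for a, _ in pairs) / len(pairs))
print("sample correlation:", pearson(pairs))
```

Setting `rho = 0` in `cgf_survival` returns the product of the marginals, that is, independent lifetimes, which is a convenient sanity check before fitting.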

Numerical Illustration
An application of the models to an existing data set on occupational exposure, referred to as the Iranian data, is demonstrated here. Relationships between occupational exposures and morbidity, and between morbidity and job category, were analyzed using proportional hazards analysis, allowing for exposure status (never exposed, ever smoked and ever exposed) until the time of carrying out the study. The survival analysis was performed using Python. The output provides parameter estimates and p-values. The discrete algorithm was used, since the time scale (person-years) was discrete. All exposures were first analyzed separately, allowing for age and smoking habits. Two-sided p-values < 0.05 were considered statistically significant.
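To make the proportional hazards fit concrete, the following sketch evaluates and maximizes a Cox partial log-likelihood for a single covariate in plain Python. It is not the routine used in the study: it assumes no special handling of tied event times (so it does not implement the discrete algorithm mentioned above), and the toy data and function names are hypothetical.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate (no tie correction)."""
    ll = 0.0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # censored subjects contribute only through risk sets
        risk_sum = sum(math.exp(beta * x[j])
                       for j in range(len(times)) if times[j] >= times[i])
        ll += beta * x[i] - math.log(risk_sum)
    return ll


def fit_beta(times, events, x, lo=-5.0, hi=5.0, iters=100):
    """Maximize the concave partial log-likelihood by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cox_partial_loglik(m1, times, events, x) < cox_partial_loglik(m2, times, events, x):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical toy data: follow-up time, event indicator, binary exposure.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
exposed = [1, 0, 1, 0, 0, 1]
beta_hat = fit_beta(times, events, exposed)
print("estimated log hazard ratio:", beta_hat)
print("estimated hazard ratio:", math.exp(beta_hat))
```

A ternary search is used only because it keeps the sketch dependency-free; a production analysis would normally rely on Newton-Raphson iterations in an established survival analysis package.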
The relationship between occupational exposures and morbidity was also analyzed simultaneously. Using a stepwise selection procedure in Python, and allowing for age and smoking habits, specific exposures were included and excluded until the following conditions were met: the significance of the residual chi-squared was less than 0.25, and the significance of the relative risks was less than 0.10. The 95% confidence intervals were estimated using the standard error of the regression coefficient.
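The confidence interval calculation described above can be sketched as follows, assuming the usual Wald interval on the log scale, $\exp(\hat{\beta} \pm 1.96\,\mathrm{SE})$. The coefficient and standard error in the example are hypothetical values chosen for illustration, not estimates from the Iranian data.

```python
import math


def hazard_ratio_ci(beta, se, z=1.959964):
    """Return (HR, lower, upper) for a Wald interval built on the log scale."""
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Hypothetical coefficient and standard error (not from the paper).
hr, lower, upper = hazard_ratio_ci(beta=0.05, se=0.02)
print(f"HR = {hr:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Building the interval on the coefficient scale and exponentiating afterwards keeps the bounds positive, which is why the interval is not symmetric around the hazard ratio.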
Python was also used to fit the Correlated Gamma Frailty model and the Modified Gamma Frailty model. Hazard and survival functions for the exposure data were estimated for large and small samples.

RESULTS AND DISCUSSION
Table 1 shows the results of the analysis of the Iranian data and the goodness-of-fit table. The exponentiated coefficients in the third column of each table of the output are interpretable as multiplicative effects on the hazard. In Table 1, for example, holding the other covariates constant, one additional year of age increases the yearly hazard of exposure of a worker by a factor of $e^{\beta} = 1.053376$ on average, that is, by 5.3 percent. Similarly, each Forced Ventilatory Function (FVC) factor increases the hazard by a factor of 1.059079, or 5.9 percent. The fifth column gives the result of the test of significance of $\beta$ using the Wald statistic, which is the ratio of the coefficient to its standard error. The obtained value is compared with the critical Z value and a decision is made.
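The interpretation of the exponentiated coefficients and of the Wald statistic can be reproduced with a few lines of Python. The hazard ratios below are those quoted for Table 1; the coefficient and standard error passed to `wald_test` are hypothetical, since the standard errors are not restated in the text, and both function names are illustrative.

```python
import math


def percent_change(hazard_ratio):
    """Percentage change in the hazard implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0


def wald_test(beta, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


print(round(percent_change(1.053376), 1))  # age effect quoted for Table 1
print(round(percent_change(1.059079), 1))  # FVC effect quoted for Table 1

# Hypothetical standard error, used only to show the mechanics of the test.
z, p = wald_test(beta=math.log(1.053376), se=0.02)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

The decision rule stated earlier, two-sided p-values below 0.05, corresponds to rejecting when the absolute value of z exceeds roughly 1.96.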
In Table 2, holding the other covariates constant, an additional year of age increases the yearly hazard of exposure of a worker by a factor of $e^{\beta} = 1.034585$ on average, that is, by 3.5 percent. Similarly, each FVC factor increases the hazard by a factor of 1.001301, or 0.1 percent.
In Table 3, holding the other covariates constant, an additional year of age increases the yearly hazard of exposure of a worker by a factor of $e^{\beta} = 1.053481$ on average, that is, by 5.3 percent. Similarly, each FVC factor increases the hazard by a factor of 1.062155, or 6.2 percent. The exposure status (never exposed, exposed and ever smoked), job category and pack-years smoked are found to be insignificant for the Iranian data using the Cox model.
Under the CGF model, exposure status and job category are insignificant for the Iranian data, while under the proposed MGF model all the variables are significant.

Abdulazeez FJS
of the correlated compound Poisson frailty model, where the baseline hazard functions are not specified. Part of future research is envisaged in this direction. Another aspect of interest for further research is the problem of identifiability. The identifiability problem grows with increased censoring, but is reduced by the parametric modelling of the baseline hazard. This study furnishes a structured approach to optimal experiment design using the modified gamma frailty distribution, supported by a demonstrative Python-based simulation.

Figure 1: Survival Function at mean of covariates - Iranian Study
Figure 2: Hazard Function at mean covariates - Iranian Study