Statistics Seminar at Georgia State University

Fall 2009, Fridays 3:00-4:00pm, Paul Erdos Conference room (796) COE

Organizer: Yichuan Zhao

If you would like to give a talk in Statistics Seminar, please send an email to Yichuan Zhao at matyiz@langate.gsu.edu



December 4, 3:00-4:00pm, 796 COE, Professor Jiawei Liu, Georgia State University

Abstract:

November 20, 2:00-3:00pm, 796 COE (Colloquium), Professor Lijian Yang, Department of Statistics and Probability, Michigan State University
Simultaneous Confidence Band for Sparse Longitudinal Regression Curve

Abstract: Recently, functional data analysis has received considerable attention in statistics research and a number of successful applications have been reported, but there have been no results on inference for the global shape of the mean regression curve. In this paper, an asymptotically simultaneous confidence band is obtained for the mean trajectory curve based on sparse longitudinal data, using piecewise constant spline estimation. Simulation experiments corroborate the asymptotic theory.

November 13, 2:00-3:00pm, 796 COE (Colloquium), Professor Junhui Wang, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago
On Margin Based Semisupervised Learning

Abstract: In classification, semi-supervised learning occurs when a large amount of unlabeled data is available with only a small amount of labeled data. This poses a great challenge in that it is difficult to achieve good classification performance through labeled data alone. To leverage unlabeled data for enhancing classification, we introduce a margin based semisupervised learning method within the framework of regularization, based on an efficient margin loss for unlabeled data, which seeks efficient extraction of the information from unlabeled data for estimating the Bayes rule for classification. In particular, I will discuss three aspects: (1) the idea and methodology development; (2) computational tools; (3) a statistical learning theory. Numerical examples will be provided to demonstrate the advantage of our proposed methodology against other existing competitors. An application to gene function prediction will be discussed.

November 6, 3:00-4:00pm, 796 COE, Professor Yuanhui Xiao, Georgia State University
On Intraclass Correlation Coefficients

Abstract: The intraclass correlation coefficient (ICC) rho is widely used to measure the degree of family resemblance with respect to characteristics such as blood pressure, weight, and height. In this talk the author will discuss several statistical problems regarding ICCs. In particular, the author will present several resampling methods for computing confidence intervals for the common ICC and for testing the homogeneity of ICCs across several populations. The author will also propose a few research topics regarding ICCs.
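
As a rough illustration of what a resampling interval for an ICC can look like, the sketch below applies a percentile bootstrap over families to the one-way ANOVA estimator. This is not the speaker's method, which the abstract does not specify; the balanced-family assumption, the choice of estimator, and the function names icc_oneway and bootstrap_icc_ci are all assumptions made for the example.

```python
import random
from statistics import mean


def icc_oneway(families):
    """One-way ANOVA estimate of the ICC for balanced families (equal size k)."""
    n = len(families)
    k = len(families[0])
    grand = mean([x for fam in families for x in fam])
    fam_means = [mean(fam) for fam in families]
    # Between-family and within-family mean squares.
    msb = k * sum((m - grand) ** 2 for m in fam_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for fam, m in zip(families, fam_means) for x in fam
    ) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    # A zero denominator means every value is identical; report NaN.
    return (msb - msw) / denom if denom > 0 else float("nan")


def bootstrap_icc_ci(families, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap interval, resampling whole families with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(families) for _ in families]
        est = icc_oneway(sample)
        if est == est:  # skip degenerate resamples that returned NaN
            estimates.append(est)
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper
```

Resampling whole families, rather than individual measurements, keeps the within-family dependence intact in each bootstrap sample.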

October 30, 2:00-3:00pm, 796 COE (Colloquium), Professor Hongtu Zhu, University of North Carolina at Chapel Hill
Intrinsic Regression Models for Medial Representation and Diffusion Tensor Data

Abstract: In medical imaging analysis and computer vision, there is a growing interest in analyzing various manifold-valued data including 3D rotations, planar shapes, oriented or directed directions, the Grassmann manifold, deformation field, symmetric positive definite (SPD) matrices and medial shape representations (m-rep) of subcortical structures. Particularly, the scientific interests of most population studies focus on establishing the associations between a set of covariates (e.g., diagnostic status, age, and gender) and manifold-valued data for characterizing brain structure and shape differences, thus requiring a regression modeling framework for manifold-valued data. The aim of this talk is to develop an intrinsic regression model for the analysis of manifold-valued data as responses in a Riemannian manifold and their associations with a set of covariates, such as age and gender, in Euclidean space. Because manifold-valued data do not form a vector space, directly applying classical multivariate regression may be inadequate in establishing the relationship between manifold-valued data and covariates of interest, such as age and gender, in real applications. Our intrinsic regression model, which is a semiparametric model, uses a link function to map from the Euclidean space of covariates to the Riemannian manifold of manifold data. We develop an estimation procedure to calculate an intrinsic least square estimator and establish its limiting distribution. We develop score statistics to test linear hypotheses on unknown parameters. We apply our methods to the detection of the difference in the morphological changes of the left and right hippocampi between schizophrenia patients and healthy controls using medial shape description.

October 16, 3:00-4:00pm, 796 COE, Zhouping Li, School of Mathematics, Georgia Institute of Technology
Empirical Likelihood Method For Conditional Value-at-Risk

Abstract: Value-at-Risk is a simple, but useful measure in risk management. When some volatility model is employed, conditional Value-at-Risk is of importance. As ARCH/GARCH models are widely used in modeling volatilities, in this talk, we first propose empirical likelihood methods to construct confidence intervals for the conditional Value-at-Risk with the volatility model being an ARCH/GARCH model. We further consider an empirical likelihood-based estimation of the conditional Value-at-Risk in the nonparametric regression model.

October 9, 3:00-4:00pm, 796 COE, Dr. Zhipeng Cai, Mississippi State University
Association Study on Pedigree SNP Data

Abstract: Most association study methods become either ineffective or inefficient when dealing with increasing numbers of SNPs. Suggested by the block-like structure of the human genome, a popular strategy is to use haplotypes to try to capture the correlation structure of SNPs in regions of little recombination. This haplotype based association study would have significantly reduced degrees of freedom and be able to capture the combined effects of tightly linked causal variants. An efficient rule-based algorithm is presented for haplotype inference from pedigree genotype data, with the assumption of no recombination. This zero-recombination haplotyping algorithm is extended to a maximum parsimony haplotyping algorithm in one whole genome scan to minimize the total number of breakpoint sites. We show that such a whole genome scan haplotyping algorithm can be implemented in O(m^3 n^3) time in a novel incremental fashion, where m denotes the total number of SNP loci on the chromosome. Extensive simulation experiments using eight pedigree structures that were used previously for association studies showed that the haplotype allele sharing status among the members can be deterministically, efficiently, and accurately determined, even for very small pedigrees.

October 2, 3:00-4:00pm, 796 COE, Professor Yixin Fang, Georgia State University
Some discussion on variable selection in mixed-effects models

Abstract: For model selection in mixed-effects models, Vaida and Blanchard (2005) demonstrated that the marginal Akaike information criterion is appropriate for questions regarding the population, and the conditional Akaike information criterion is appropriate for questions regarding the particular clusters in the data. This paper shows that the marginal Akaike information criterion is asymptotically equivalent to leave-one-cluster-out cross-validation and the conditional Akaike information criterion is asymptotically equivalent to leave-one-observation-out cross-validation.

September 25, 2:00-3:00pm, 796 COE (Colloquium), James L. Kepner, PhD, Vice-President, Statistics and Evaluation, American Cancer Society, and Adjunct Professor, Department of Biostatistics, Rollins School of Public Health, Emory University
Survey of Exact Methods in Sample Size Determination

Abstract: Discussed are exact one-stage and group-sequential sample size determination methods for one- and two-sample binomial proportions testing problems, methods for the corresponding finite population tests, and simultaneous tests for correlated binomial proportions. Design properties are discussed and new/unpublished results are described. The exact group sequential methods allow early stops only for efficacy, only for futility, or for either efficacy or futility. Sample sizes, levels of significance and power at fixed points in the research hypothesis parameter space are compared among competing designs, including those derived using asymptotic normal theory methods. Documents provided will include a description of how sample points are placed in the rejection region, simple proofs for each of the 3 one-sample theorems, tables demonstrating the efficiency of the two-sample designs, a table showing how close the one-sample designs can get to the one-stage uniformly most powerful test in terms of significance and power, and a table demonstrating the remarkable sample size savings if two or more binomial endpoints are tested simultaneously.
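
For readers unfamiliar with exact sample size determination, the simplest case mentioned above (a one-stage, one-sample binomial test) can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's designs: it assumes a one-sided test of H0: p = p0 against p = p1 > p0, and the function names and default values are hypothetical.

```python
from math import comb


def upper_tail(n, c, p):
    """P(X >= c) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(n, p0, alpha):
    """Smallest c with P(X >= c | p0) <= alpha; c = n + 1 means never reject."""
    for c in range(n + 2):
        if upper_tail(n, c, p0) <= alpha:
            return c
    return n + 1


def exact_sample_size(p0, p1, alpha=0.05, power=0.80, n_max=500):
    """Smallest n whose exact level-alpha test reaches the target power at p1.

    Returns (n, c), where H0 is rejected when X >= c, or None if no
    n <= n_max qualifies.
    """
    for n in range(1, n_max + 1):
        c = critical_value(n, p0, alpha)
        if upper_tail(n, c, p1) >= power:
            return n, c
    return None
```

Because the binomial distribution is discrete, exact power is not monotone in n, so the scan returns the first qualifying n rather than relying on a closed-form formula from normal theory.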

September 23, 2:30-3:30pm, 796 COE (Colloquium), Professor Yufeng Liu, Department of Statistics & Operations Research, Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill
Estimation of Multiple Noncrossing Quantile Regression Functions

Abstract: Quantile regression is a very useful statistical tool to learn the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating noncrossing constraints on the quantile regression functions. In this talk, I will present a new multiple noncrossing quantile regression estimation technique. Both asymptotic properties and finite sample performance will be presented to illustrate the usefulness of the proposed method.

August 28, 2:00-3:00pm, 654 COE (Colloquium), Professor Dabao Zhang, Department of Statistics, Purdue University
Penalized orthogonal-components regression for large p small n data

Abstract: We propose a penalized orthogonal-components regression (POCRE) for large p small n data. Orthogonal components are sequentially constructed to maximize, upon standardization, their correlation to the response residuals. A new penalization framework, implemented via empirical Bayes thresholding, is presented to effectively identify sparse predictors of each component. POCRE is computationally efficient owing to its sequential construction of leading sparse principal components. In addition, such construction offers other properties such as grouping highly correlated predictors and allowing for collinear or nearly collinear predictors. With multivariate responses, POCRE can construct common components and thus build up latent-variable models for large p small n data. This is joint work with Yanzhu Lin and Min Zhang.