Mechanistic Interpretation through Contextual Decomposition in Transformers

Authors: Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Anobel Y. Odisho, Peter R. Carroll, Bin Yu

Abstract: Transformers exhibit impressive capabilities but are often regarded as black boxes due to challenges in understanding the complex nonlinear relationships between features. Interpreting machine learning models is of paramount importance to mitigate risks, and mechanistic interpretability is in particular of current interest as it opens up a window for guiding manual modifications and reverse-engineering solutions. In this work, we introduce contextual decomposition for transformers (CD-T), extending a prior work on CD for RNNs and CNNs, to address mechanistic interpretation in a computationally efficient way. CD-T is a flexible interpretation method for transformers. It can capture contributions of combinations of input features or source internal components (e.g. attention heads, feed-forward networks) to (1) final predictions or (2) the output of any target internal component. Using CD-T, we propose a novel algorithm for circuit discovery. On a real-world pathology report classification task, we show CD-T distills a more faithful circuit of attention heads with improved computational efficiency (speed up 2x) than a prior benchmark, path patching. As a versatile interpretation method, CD-T also exhibits exceptional capabilities for local interpretations. CD-T is shown to reliably find words and phrases of contrasting sentiment/topic on SST-2 and AGNews datasets.
Through human experiments, we demonstrate CD-T enables users to identify the more accurate of two models and to better trust a model's outputs compared to alternative interpretation methods such as SHAP and LIME.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2206.11907 [pdf, other]

doi 10.1093/mnras/stac2329

HI Properties of Satellite Galaxies around Local Volume Hosts

Authors: Ananthan Karunakaran, Kristine Spekkens, Rhys Carroll, David J. Sand, Paul Bennet, Denija Crnojević, Michael G. Jones, Burçin Mutlu-Pakdil

Abstract: We present neutral atomic hydrogen (HI) observations using the Robert C. Byrd Green Bank Telescope (GBT) along the lines of sight to 49 dwarf satellite galaxy candidates around eight Local Volume systems (M104, M51, NGC1023, NGC1156, NGC2903, NGC4258, NGC4565, NGC4631). We detect the HI reservoirs of two candidates (dw0934+2204 and dw1238-1122) and confirm them as background sources relative to their nearest foreground host systems. The remaining 47 satellite candidates are not detected in HI, and we place stringent $5σ$ upper limits on their HI mass. We note that some (15/47) of our non-detections stem from satellites being occluded by their putative host's HI emission. In addition to these new observations, we compile literature estimates on the HI mass for an additional 17 satellites. We compare the HI properties of these satellites to those within the Local Group, finding broad agreement between them. Crucially, these observations probe a "transition" region between $-10\gtrsim M_V \gtrsim -14$ where we see a mixture of gas-rich and gas-poor satellites and where quenching processes shift from longer timescales (i.e. via starvation) to shorter ones (i.e. via stripping). While there are many gas-poor satellites within this region, some are gas rich and this suggests that the transition towards predominantly gas-rich satellites occurs at $L_{V}\sim10^{7}L_{\odot}$, in line with simulations.
The observations presented here are a key step toward characterizing the properties of dwarf satellite galaxies around Local Volume systems and future wide-field radio surveys with higher angular resolution (e.g. WALLABY) will vastly improve upon the study of such systems.

Submitted 23 June, 2022; originally announced June 2022.

Comments: 12 pages, 5 Figures, 3 Tables; Submitted to MNRAS; Comments welcome!

arXiv:2203.13400 [pdf]

doi 10.1088/1538-3873/ac5b87

Wayne State University's Dan Zowada Memorial Observatory: Characterization and Pipeline of a 0.5 Meter Robotic Telescope

Authors: Robert Carr, David Cinabro, Edward Cackett, David Moutard, Russell Carroll

Abstract: Wayne State University's Dan Zowada Memorial Observatory is a fully robotic 0.5m telescope and imaging system located under the dark skies of New Mexico. The observatory is particularly suited to time domain astronomy: the observation of variable objects, such as tidal disruption events, supernovae, and active galactic nuclei. We have developed a software suite for image reduction, alignment and stacking, and calculation of absolute photometry in the Sloan filters used at the telescope. Our pipeline also performs image subtraction to enable photometry of objects embedded in bright backgrounds such as galaxies. The 5 sigma detection limit of the Zowada Observatory for integration of 16 x 90 second exposures is 19.0 magnitude in g-band, 18.1 magnitude in r-band, 17.9 magnitude in i-band, and 16.6 magnitude in z-band. For a 3 sigma detection limit, measurements may be performed with greater uncertainties as deep as 19.9, 19.1, 18.9 and 17.5 magnitude in griz bands, respectively.

Submitted 24 March, 2022; originally announced March 2022.

Comments: Accepted by Publications of the Astronomical Society of the Pacific. 11 pages, 7 figures

arXiv:2201.10208 [pdf, other]

Semi-Supervised Quantile Estimation: Robust and Efficient Inference in High Dimensional Settings

Authors: Abhishek Chakrabortty, Guorong Dai, Raymond J. Carroll

Abstract: We consider quantile estimation in a semi-supervised setting, characterized by two available data sets: (i) a small or moderate sized labeled data set containing observations for a response and a set of possibly high dimensional covariates, and (ii) a much larger unlabeled data set where only the covariates are observed. We propose a family of semi-supervised estimators for the response quantile(s) based on the two data sets, to improve the estimation accuracy compared to the supervised estimator, i.e., the sample quantile from the labeled data. These estimators use a flexible imputation strategy applied to the estimating equation along with a debiasing step that allows for full robustness against misspecification of the imputation model. Further, a one-step update strategy is adopted to enable easy implementation of our method and handle the complexity from the non-linear nature of the quantile estimating equation. Under mild assumptions, our estimators are fully robust to the choice of the nuisance imputation model, in the sense of always maintaining root-n consistency and asymptotic normality, while having improved efficiency relative to the supervised estimator. They also attain semi-parametric optimality if the relation between the response and the covariates is correctly specified via the imputation model.
As an illustration of estimating the nuisance imputation function, we consider kernel smoothing type estimators on lower dimensional and possibly estimated transformations of the high dimensional covariates, and we establish novel results on their uniform convergence rates in high dimensions, involving responses indexed by a function class and usage of dimension reduction techniques. These results may be of independent interest. Numerical results on both simulated and real data confirm our semi-supervised approach's improved performance, in terms of both estimation and inference.

Submitted 25 January, 2022; originally announced January 2022.

Comments: 31 pages, 6 tables. arXiv admin note: text overlap with arXiv:2201.00468

arXiv:2107.07257 [pdf, other]

Nonparametric, tuning-free estimation of S-shaped functions

Authors: Oliver Y. Feng, Yining Chen, Qiyang Han, Raymond J. Carroll, Richard J. Samworth

Abstract: We consider the nonparametric estimation of an S-shaped regression function. The least squares estimator provides a very natural, tuning-free approach, but results in a non-convex optimisation problem, since the inflection point is unknown. We show that the estimator may nevertheless be regarded as a projection onto a finite union of convex cones, which allows us to propose a mixed primal-dual bases algorithm for its efficient, sequential computation. After developing a projection framework that demonstrates the consistency and robustness to misspecification of the estimator, our main theoretical results provide sharp oracle inequalities that yield worst-case and adaptive risk bounds for the estimation of the regression function, as well as a rate of convergence for the estimation of the inflection point. These results reveal not only that the estimator achieves the minimax optimal rate of convergence for both the estimation of the regression function and its inflection point (up to a logarithmic factor in the latter case), but also that it is able to achieve an almost-parametric rate when the true regression function is piecewise affine with not too many affine pieces. Simulations and a real data application to air pollution modelling also confirm the desirable finite-sample properties of the estimator, and our algorithm is implemented in the R package Sshaped.

Submitted 15 July, 2021; originally announced July 2021.

Comments: 79 pages, 10 figures

arXiv:2103.12846 [pdf, ps, other]

On the global identifiability of logistic regression models with misclassified outcomes

Authors: Rui Duan, Yang Ning, Jiasheng Shi, Raymond J Carroll, Tianxi Cai, Yong Chen

Abstract: In the last decade, the secondary use of large data from health systems, such as electronic health records, has demonstrated great promise in advancing biomedical discoveries and improving clinical decision making. However, there is an increasing concern about biases in association studies caused by misclassification in the binary outcomes derived from electronic health records. We revisit the classical logistic regression model with misclassified outcomes. Although local identification conditions in some related settings have been previously established, the global identification of such models remains largely unknown and is an important question yet to be answered. We derive necessary and sufficient conditions for global identifiability of logistic regression models with misclassified outcomes, using a novel approach termed submodel analysis, and a technique adapted from the Picard-Lindelöf existence theorem in ordinary differential equations. In particular, our results are applicable to logistic models with discrete covariates, which is a common situation in biomedical studies. The conditions are easy to verify in practice. In addition to model identifiability, we propose a hypothesis testing procedure for regression coefficients in the misclassified logistic regression model when the model is not identifiable under the null.

Submitted 23 March, 2021; originally announced March 2021.

arXiv:2008.10563 [pdf]

The vimentin cytoskeleton: When polymer physics meets cell biology

Authors: Alison E. Patteson, Robert J. Carroll, Daniel V. Iwamoto, Paul A. Janmey

Abstract: The proper functions of tissues depend on the ability of cells to withstand stress and maintain shape. Central to this process is the cytoskeleton, comprised of three polymeric networks: F-actin, microtubules, and intermediate filaments. Intermediate filament proteins are among the most abundant cytoskeletal proteins in cells; yet they remain one of the least understood. Their structure and function deviate from those of their cytoskeletal partners, F-actin and microtubules. Intermediate filament networks show a unique combination of extensibility, flexibility and toughness that confers mechanical resilience to the cell. Vimentin is an intermediate filament protein expressed in mesenchymal cells. This review highlights exciting new results on the physical biology of vimentin intermediate filaments and their role in allowing whole cells and tissues to cope with stress.

Submitted 24 August, 2020; originally announced August 2020.

arXiv:2006.16357 [pdf, ps, other]

Data integration in high dimension with multiple quantiles

Authors: Guorong Dai, Ursula U. Müller, Raymond J. Carroll

Abstract: This article deals with the analysis of high dimensional data that come from multiple sources (experiments) and thus have different possibly correlated responses, but share the same set of predictors. The measurements of the predictors may be different across experiments. We introduce a new regression approach with multiple quantiles to select those predictors that affect any of the responses at any quantile level and estimate the nonzero parameters. Our estimator is a minimizer of a penalized objective function, which aggregates the data from the different experiments. We establish model selection consistency and asymptotic normality of the estimator. In addition we present an information criterion, which can also be used for consistent model selection. Simulations and two data applications illustrate the advantages of our method, which takes the group structure induced by the predictors across experiments and quantile levels into account.

Submitted 29 June, 2020; originally announced June 2020.

arXiv:2006.15384 [pdf, ps, other]

Optimal Asset Allocation For Outperforming A Stochastic Benchmark Target

Authors: Chendi Ni, Yuying Li, Peter Forsyth, Ray Carroll

Abstract: We propose a data-driven Neural Network (NN) optimization framework to determine the optimal multi-period dynamic asset allocation strategy for outperforming a general stochastic target. We formulate the problem as an optimal stochastic control with an asymmetric, distribution shaping, objective function. The proposed framework is illustrated with the asset allocation problem in the accumulation phase of a defined contribution pension plan, with the goal of achieving a higher terminal wealth than a stochastic benchmark. We demonstrate that the data-driven approach is capable of learning an adaptive asset allocation strategy directly from historical market returns, without assuming any parametric model of the financial market dynamics. Following the optimal adaptive strategy, investors can make allocation decisions simply depending on the current state of the portfolio. The optimal adaptive strategy outperforms the benchmark constant proportion strategy, achieving a higher terminal wealth with a 90% probability, a 46% higher median terminal wealth, and a significantly more right-skewed terminal wealth distribution. We further demonstrate the robustness of the optimal adaptive strategy by testing the performance of the strategy on bootstrap resampled market data, which has different distributions compared to the training data.

Submitted 27 June, 2020; originally announced June 2020.

Comments: 33 pages

arXiv:2005.03685 [pdf, other]

doi 10.3847/1538-4357/ab91b5

Supermassive black holes with high accretion rates in active galactic nuclei. XI. Accretion disk reverberation mapping of Mrk 142

Authors: Edward M. Cackett, Jonathan Gelbord, Yan-Rong Li, Keith Horne, Jian-Min Wang, Aaron J. Barth, Jin-Ming Bai, Wei-Hao Bian, Russell W. Carroll, Pu Du, Rick Edelson, Michael R. Goad, Luis C. Ho, Chen Hu, Viraja C. Khatu, Bin Luo, Jake Miller, Ye-Fei Yuan

Abstract: We performed an intensive accretion disk reverberation mapping campaign on the high accretion rate active galactic nucleus Mrk 142 in early 2019. Mrk 142 was monitored with the Neil Gehrels Swift Observatory for 4 months in X-rays and 6 UV/optical filters. Ground-based photometric monitoring was obtained from the Las Cumbres Observatory, Liverpool Telescope and Dan Zowada Memorial Observatory in ugriz filters and the Yunnan Astronomical Observatory in V. Mrk 142 was highly variable throughout, displaying correlated variability across all wavelengths. We measure significant time lags between the different wavelength light curves, finding that through the UV and optical the wavelength-dependent lags, $τ(λ)$, generally follow the relation $τ(λ) \propto λ^{4/3}$, as expected for the $T \propto R^{-3/4}$ profile of a steady-state optically-thick, geometrically-thin accretion disk, though can also be fit by $τ(λ) \propto λ^{2}$, as expected for a slim disk. The exceptions are the u and U band, where an excess lag is observed, as has been observed in other AGN and attributed to continuum emission arising in the broad-line region. Furthermore, we perform a flux-flux analysis to separate the constant and variable components of the spectral energy distribution, finding that the flux-dependence of the variable component is consistent with the $f_ν \propto ν^{1/3}$ spectrum expected for a geometrically-thin accretion disk.
Moreover, the X-ray to UV lag is significantly offset from an extrapolation of the UV/optical trend, with the X-rays showing a poorer correlation with the UV than the UV does with the optical. The magnitude of the UV/optical lags is consistent with a highly super-Eddington accretion rate.

Submitted 7 May, 2020; originally announced May 2020.

Comments: 15 pages, 5 figures, 4 tables, accepted for publication in ApJ

arXiv:2002.07255 [pdf, other]

Nonparametric Bayesian Deconvolution of a Symmetric Unimodal Density

Authors: Ya Su, Anirban Bhattacharya, Yan Zhang, Nilanjan Chatterjee, Raymond J. Carroll

Abstract: We consider nonparametric measurement error density deconvolution subject to heteroscedastic measurement errors as well as symmetry about zero and shape constraints, in particular unimodality. The problem is motivated by applications where the observed data are estimated effect sizes from regressions on multiple factors, where the target is the distribution of the true effect sizes. We exploit the fact that any symmetric and unimodal density can be expressed as a mixture of symmetric uniform densities, and model the mixing density in a new way using a Dirichlet process location-mixture of Gamma distributions. We do the computations within a Bayesian context, describe a simple scalable implementation that is linear in the sample size, and show that the estimate of the unknown target density is consistent. Within our application context of regression effect sizes, the target density is likely to have a large probability near zero (the near null effects) coupled with a heavy-tailed distribution (the actual effects). Simulations show that unlike standard deconvolution methods, our Constrained Bayesian Deconvolution method does a much better job of reconstruction of the target density. Applications to a genome-wide association study (GWAS) and microarray data reveal similar results.

Submitted 17 February, 2020; originally announced February 2020.

arXiv:1912.05084 [pdf, other]

Bayesian Copula Density Deconvolution for Zero-Inflated Data in Nutritional Epidemiology

Authors: Abhra Sarkar, Debdeep Pati, Bani K. Mallick, Raymond J. Carroll

Abstract: Estimating the marginal and joint densities of the long-term average intakes of different dietary components is an important problem in nutritional epidemiology. Since these variables cannot be directly measured, data are usually collected in the form of 24-hour recalls of the intakes, which show marked patterns of conditional heteroscedasticity. Significantly compounding the challenges, the recalls for episodically consumed dietary components also include exact zeros. The problem of estimating the density of the latent long-time intakes from their observed measurement error contaminated proxies is then a problem of deconvolution of densities with zero-inflated data. We propose a Bayesian semiparametric solution to the problem, building on a novel hierarchical latent variable framework that translates the problem to one involving continuous surrogates only. Crucial to accommodating important aspects of the problem, we then design a copula-based approach to model the involved joint distributions, adopting different modeling strategies for the marginals of the different dietary components. We design efficient Markov chain Monte Carlo algorithms for posterior inference and illustrate the efficacy of the proposed method through simulation experiments. Applied to our motivating nutritional epidemiology problems, compared to other approaches, our method provides more realistic estimates of the consumption patterns of episodically consumed dietary components.

Submitted 10 December, 2019; originally announced December 2019.

arXiv:1910.06235 [pdf, other]

Gaussian Processes with Errors in Variables: Theory and Computation

Authors: Shuang Zhou, Debdeep Pati, Tianying Wang, Yun Yang, Raymond J. Carroll

Abstract: Covariate measurement error in nonparametric regression is a common problem in nutritional epidemiology, geostatistics, and other fields. Over the last two decades, this problem has received substantial attention in the frequentist literature. Bayesian approaches for handling measurement error have only been explored recently and are surprisingly successful, despite the lack of a proper theoretical justification regarding the asymptotic performance of the estimators. By specifying a Gaussian process prior on the regression function and a Dirichlet process Gaussian mixture prior on the unknown distribution of the unobserved covariates, we show that the posterior distribution of the regression function and the unknown covariates density attain optimal rates of contraction adaptively over a range of Hölder classes, up to logarithmic terms. This improves upon the existing classical frequentist results which require knowledge of the smoothness of the underlying function to deliver optimal risk bounds. We also develop a novel surrogate prior for approximating the Gaussian process prior that leads to efficient computation and preserves the covariance structure, thereby facilitating easy prior elicitation. We demonstrate the empirical performance of our approach and compare it with competitors in a wide range of simulation experiments and a real data example.

Submitted 26 January, 2023; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1908.03968 [pdf, ps, other]

Finite Sample Hypothesis Tests for Stacked Estimating Equations

Authors: Eli S. Kravitz, Raymond J. Carroll, David Ruppert

Abstract: Suppose there are two unknown parameters, each parameter is the solution to an estimating equation, and the estimating equation of one parameter depends on the other parameter. The parameters can be jointly estimated by "stacking" their estimating equations and solving for both parameters simultaneously. Asymptotic confidence intervals are readily available for stacked estimating equations. We introduce a bootstrap-based hypothesis test for stacked estimating equations which does not rely on asymptotic approximations. Test statistics are constructed by splitting the sample in two, estimating the first parameter on a portion of the sample then plugging the result into the second estimating equation to solve for the next parameter using the remaining sample. To reduce simulation variability from a single split, we repeatedly split the sample and take the sample mean of all the estimates. For parametric models, we derive the limiting distribution of the sample splitting estimator and show that it is equivalent to stacked estimating equations.

Submitted 11 August, 2019; originally announced August 2019.

Comments: preprint. arXiv admin note: text overlap with arXiv:1908.03967

arXiv:1908.03967 [pdf, ps, other]

Sample Splitting as an M-Estimator with Application to Physical Activity Scoring

Authors: Eli S. Kravitz, Raymond J. Carroll, David Ruppert

Abstract: Sample splitting is widely used in statistical applications, including classically in classification and more recently for inference post model selection. Motivated by problems in the study of diet, physical activity, and health, we consider a new application of sample splitting. Physical activity researchers wanted to create a scoring system to quickly assess physical activity levels. A score is created using a large cohort study. Then, using the same data, this score serves as a covariate in a model for the risk of disease or mortality. Since the data are used twice in this way, standard errors and confidence intervals from fitting the second model are not valid. To allow for proper inference, sample splitting can be used. One builds the score with a random half of the data and then uses the score when fitting a model to the other half of the data. We derive the limiting distribution of the estimators. An obvious question is what happens if multiple sample splits are performed. We show that as the number of sample splits increases, the combination of multiple sample splits is effectively equivalent to solving a set of estimating equations.

Submitted 11 August, 2019; originally announced August 2019.

Comments: preprint. arXiv admin note: text overlap with arXiv:1908.03968

arXiv:1812.11040 [pdf]

doi 10.1127/ejm/2018/0030-2701

Sulfur and REE zoning in apatite: The example of the Colli Albani magmatic system

Authors: Alessandro Fabbrizio, Mario Gaeta, Michael R. Carroll, Maurizio Petrelli

Abstract: We investigate the distribution of major and trace elements in apatite crystals hosted in granular alkaline rocks composed mainly of leucite and clinopyroxene, representative of the hypabyssal crystallization of a magma body in the Quaternary ultra-potassic Colli Albani Volcanic District (CAVD), which was emplaced into thick limestone units along the Tyrrhenian margin of Italy. Results show that the analyzed crystals are the SrO-richest (up to 4.6 wt%) fluorapatite (F = 2.6-3.7 wt%) of the Italian alkaline rocks. The strontium enrichment is caused by the lack of other Sr-compatible mineral phases, such as plagioclase, alkali feldspar and melilite, in these leucite- and clinopyroxene-bearing rocks. The studied samples show core-rim zoning with rims enriched in Si, S, and REE whereas the cores are enriched in Ca and P. The LREE-oxides contents of apatite, reaching 4.2 wt%, represent more than 95% of the total REE budget; SiO2 contents range from 1.3 to 3.6 wt%, and SO3 concentrations range from 0.6 to 1.4 wt%.

Submitted 28 December, 2018; originally announced December 2018.

arXiv:1807.05274 [pdf, ps, other]

Sparse semiparametric canonical correlation analysis for data of mixed types

Authors: Grace Yoon, Raymond J. Carroll, Irina Gaynanova

Abstract: Canonical correlation analysis investigates linear relationships between two sets of variables, but often works poorly on modern data sets due to high-dimensionality and mixed data types such as continuous, binary and zero-inflated. To overcome these challenges, we propose a semiparametric approach for sparse canonical correlation analysis based on Gaussian copula. Our main contribution is a truncated latent Gaussian copula model for data with excess zeros, which allows us to derive a rank-based estimator of the latent correlation matrix for mixed variable types without the estimation of marginal transformation functions. The resulting canonical correlation analysis method works well in high-dimensional settings as demonstrated via numerical studies, as well as in application to the analysis of association between gene expression and micro RNA data of breast cancer patients.

Submitted 10 October, 2019; v1 submitted 13 July, 2018; originally announced July 2018.

Comments: Accepted to Biometrika. Main text: 19 pages and 3 figures. Supplementary material: 28 pages and 9 figures

Journal ref: Biometrika 2020, Vol. 107, No. 3, 609-625

arXiv:1806.04171 [pdf, other]

doi 10.1145/3197517.3201329

Synthetic Depth-of-Field with a Single-Camera Mobile Phone

Authors: Neal Wadhwa, Rahul Garg, David E. Jacobs, Bryan E. Feldman, Nori Kanazawa, Robert Carroll, Yair Movshovitz-Attias, Jonathan T. Barron, Yael Pritch, Marc Levoy

Abstract: Shallow depth-of-field is commonly used by photographers to isolate a subject from a distracting background. However, standard cell phone cameras cannot produce such images optically, as their short focal lengths and small apertures capture nearly all-in-focus images. We present a system to computationally synthesize shallow depth-of-field images with a single mobile camera and a single button press. If the image is of a person, we use a person segmentation network to separate the person and their accessories from the background. If available, we also use dense dual-pixel auto-focus hardware, effectively a 2-sample light field with an approximately 1 millimeter baseline, to compute a dense depth map. These two signals are combined and used to render a defocused image. Our system can process a 5.4 megapixel image in 4 seconds on a mobile phone, is fully automatic, and is robust enough to be used by non-experts. The modular nature of our system allows it to degrade naturally in the absence of a dual-pixel sensor or a human subject.

Submitted 11 June, 2018; originally announced June 2018.

Comments: Accepted to SIGGRAPH 2018. Basis for Portrait Mode on Google Pixel 2 and Pixel 2 XL

arXiv:1804.00793 [pdf, other]

A spline-assisted semiparametric approach to non-parametric measurement error models

Authors: Fei Jiang, Yanyuan Ma, Raymond J. Carroll

Abstract: It is well known that the minimax rates of convergence of nonparametric density and regression function estimation of a random variable measured with error are much slower than the rates in the error free case. Surprisingly, we show that if one is willing to impose a relatively mild assumption in requiring that the error-prone variable has a compact support, then the results can be greatly improved. We describe new and constructive methods to take full advantage of the compact support assumption via spline-assisted semiparametric methods. We further prove that the new estimator achieves the usual nonparametric rate in estimating both the density and regression functions as if there were no measurement error. The proof involves linear and bilinear operator theories, semiparametric theory, asymptotic analysis regarding B-splines, as well as integral equation treatments. The performance of the new methods is demonstrated through several simulations and a data example.

Submitted 19 August, 2019; v1 submitted 2 April, 2018; originally announced April 2018.

Comments: 30 pages

arXiv:1712.02327 [pdf, other]

Burst Denoising with Kernel Prediction Networks

Authors: Ben Mildenhall, Jonathan T. Barron, Jiawen Chen, Dillon Sharlet, Ren Ng, Robert Carroll

Abstract: We present a technique for jointly denoising bursts of images taken from a handheld camera. In particular, we propose a convolutional neural network architecture for predicting spatially varying kernels that can both align and denoise frames, a synthetic data generation approach based on a realistic noise formation model, and an optimization guided by an annealed loss function to avoid undesirable local minima. Our model matches or outperforms the state-of-the-art across a wide range of noise levels on both real and synthetic data.

Submitted 29 March, 2018; v1 submitted 6 December, 2017; originally announced December 2017.

Comments: To appear in CVPR 2018 (spotlight). Project page: http://people.eecs.berkeley.edu/~bmild/kpn/

arXiv:1701.05230 [pdf, other]

Surrogate Aided Unsupervised Recovery of Sparse Signals in Single Index Models for Binary Outcomes

Authors: Abhishek Chakrabortty, Matey Neykov, Raymond Carroll, Tianxi Cai

Abstract: We consider the recovery of regression coefficients, denoted by $\boldsymbol{\beta}_0$, for a single index model (SIM) relating a binary outcome $Y$ to a set of possibly high dimensional covariates $\boldsymbol{X}$, based on a large but 'unlabeled' dataset $\mathcal{U}$, with $Y$ never observed. On $\mathcal{U}$, we fully observe $\boldsymbol{X}$ and additionally, a surrogate $S$ which, while not being strongly predictive of $Y$ throughout the entirety of its support, can forecast it with high accuracy when it assumes extreme values. Such datasets arise naturally in modern studies involving large databases such as electronic medical records (EMR) where $Y$, unlike $(\boldsymbol{X}, S)$, is difficult and/or expensive to obtain. In EMR studies, an example of $Y$ and $S$ would be the true disease phenotype and the count of the associated diagnostic codes respectively. Assuming another SIM for $S$ given $\boldsymbol{X}$, we show that under sparsity assumptions, we can recover $\boldsymbol{\beta}_0$ proportionally by simply fitting a least squares LASSO estimator to the subset of the observed data on $(\boldsymbol{X}, S)$ restricted to the extreme sets of $S$, with $Y$ imputed using the surrogacy of $S$. We obtain sharp finite sample performance bounds for our estimator, including deterministic deviation bounds and probabilistic guarantees. We demonstrate the effectiveness of our approach through multiple simulation studies, as well as by application to real data from an EMR study conducted at the Partners HealthCare Systems.

Submitted 30 June, 2018; v1 submitted 18 January, 2017; originally announced January 2017.

Comments: 50 pages, 3 tables, 1 figure

MSC Class: 62J12; 62J07; 62H30; 62G32; 62F10; 62F30

arXiv:1610.00667 [pdf, ps, other]

Data Integration with High Dimensionality

Authors: Xin Gao, Raymond J. Carroll

Abstract: We consider a problem of data integration. Consider determining which genes affect a disease. The genes, which we call predictor objects, can be measured in different experiments on the same individual. We address the question of finding which genes are predictors of disease by any of the experiments. Our formulation is more general. In a given data set, there are a fixed number of responses for each individual, which may include a mix of discrete, binary and continuous variables. There is also a class of predictor objects, which may differ within a subject depending on how the predictor object is measured, i.e., depend on the experiment. The goal is to select which predictor objects affect any of the responses, where the number of such informative predictor objects or features tends to infinity as sample size increases. There are marginal likelihoods for each way the predictor object is measured, i.e., for each experiment. We specify a pseudolikelihood combining the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for the pseudolikelihood information criterion with unbounded true model size, which includes a Bayesian information criterion with appropriate penalty term as a special case. Simulations indicate that data integration improves upon, sometimes dramatically, using only one of the data sources.

Submitted 3 October, 2016; originally announced October 2016.

arXiv:1606.03775 [pdf, ps, other]

Additive Function-on-Function Regression

Authors: Janet S. Kim, Ana-Maria Staicu, Arnab Maity, Raymond J. Carroll, David Ruppert

Abstract: We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory, propose an inference procedure that accounts for total variability in the predicted response curves, and construct pointwise prediction intervals. The estimation/inferential procedure accommodates realistic scenarios such as correlated error structure as well as sparse and/or irregular designs. We investigate our methodology in finite sample size through simulations and two real data applications.

Submitted 14 December, 2016; v1 submitted 12 June, 2016; originally announced June 2016.

Comments: 26 pages, 4 figures

arXiv:1603.08056 [pdf, other]

doi 10.1016/j.physletb.2016.06.065

The mutable nature of particle-core excitations with spin in the one-valence-proton nucleus 133Sb

Authors: G. Bocchi, S. Leoni, B. Fornal, G. Colo', P. F. Bortignon, S. Bottoni, A. Bracco, C. Michelagnoli, D. Bazzacco, A. Blanc, G. De France, M. Jentschel, U. Koster, P. Mutti, J. -M. Regis, G. Simpson, T. Soldner, C. A. Ur, W. Urban, L. M. Fraile, R. Lozeva, B. Belvito, G. Benzoni, A. Bruce, R. Carroll, et al. (21 additional authors not shown)

Abstract: The gamma-ray decay of excited states of the one-valence-proton nucleus 133Sb has been studied using cold-neutron induced fission of 235U and 241Pu targets, during the EXILL campaign at the ILL reactor in Grenoble. By using a highly efficient HPGe array, coincidences between gamma-rays prompt with the fission event and those delayed up to several tens of microseconds were investigated, making it possible to observe, for the first time, high-spin excited states above the 16.6 μs isomer. Lifetime analysis, performed by fast-timing techniques with LaBr3(Ce) scintillators, reveals a difference of almost two orders of magnitude in B(M1) strength for transitions between positive-parity medium-spin yrast states. The data are interpreted by a newly developed microscopic model which takes into account couplings between core excitations (both collective and non-collective) of the doubly magic nucleus 132Sn and the valence proton, using the Skyrme effective interaction in a consistent way. The results point to a fast change in the nature of particle-core excitations with increasing spin.

Submitted 25 March, 2016; originally announced March 2016.

arXiv:1510.04027 [pdf, ps, other]

doi 10.1214/15-AOS1344

Estimation and inference in generalized additive coefficient models for nonlinear interactions with high-dimensional covariates

Authors: Shujie Ma, Raymond J. Carroll, Hua Liang, Shizhong Xu

Abstract: In the low-dimensional case, the generalized additive coefficient model (GACM) proposed by Xue and Yang [Statist. Sinica 16 (2006) 1423-1446] has been demonstrated to be a powerful tool for studying nonlinear interaction effects of variables. In this paper, we propose estimation and inference procedures for the GACM when the dimension of the variables is high. Specifically, we propose a groupwise penalization based procedure to distinguish significant covariates for the "large $p$ small $n$" setting. The procedure is shown to be consistent for model structure identification. Further, we construct simultaneous confidence bands for the coefficient functions in the selected model based on a refined two-step spline estimator. We also discuss how to choose the tuning parameters. To estimate the standard deviation of the functional estimator, we adopt the smoothed bootstrap method. We conduct simulation experiments to evaluate the numerical performance of the proposed methods and analyze an obesity data set from a genome-wide association study as an illustration.

Submitted 14 October, 2015; originally announced October 2015.

Comments: Published at http://dx.doi.org/10.1214/15-AOS1344 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1344

Journal ref: Annals of Statistics 2015, Vol. 43, No. 5, 2102-2131

arXiv:1405.5055 [pdf, ps, other]

doi 10.1214/14-STS466

Reply to the Discussion of "Estimating the Distribution of Dietary Consumption Patterns"

Authors: Raymond J. Carroll

Abstract: Reply to the "Discussion of "Estimating the Distribution of Dietary Consumption Patterns" by Raymond J. Carroll [arXiv:1405.4667]" by Stephen E. Fienberg and Rebecca C. Steorts [arXiv:1403.0566].

Submitted 20 May, 2014; originally announced May 2014.

Comments: Published at http://dx.doi.org/10.1214/14-STS466 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-STS-STS466

Journal ref: Statistical Science 2014, Vol. 29, No. 1, 103-103

arXiv:1405.4667 [pdf, ps, other]

doi 10.1214/12-STS413

Estimating the Distribution of Dietary Consumption Patterns

Authors: Raymond J. Carroll

Abstract: In the United States the preferred method of obtaining dietary intake data is the 24-hour dietary recall, yet the measure of most interest is usual or long-term average daily intake, which is impossible to measure. Thus, usual dietary intake is assessed with considerable measurement error. We were interested in estimating the population distribution of the Healthy Eating Index-2005 (HEI-2005), a multi-component dietary quality index involving ratios of interrelated dietary components to energy, among children aged 2-8 in the United States, using a national survey and incorporating survey weights. We developed a highly nonlinear, multivariate zero-inflated data model with measurement error to address this question. Standard nonlinear mixed model software such as SAS NLMIXED cannot handle this problem. We found that taking a Bayesian approach, and using MCMC, resolved the computational issues and doing so enabled us to provide a realistic distribution estimate for the HEI-2005 total score. While our computation and thinking in solving this problem was Bayesian, we relied on the well-known close relationship between Bayesian posterior means and maximum likelihood, the latter not computationally feasible, and thus were able to develop standard errors using balanced repeated replication, a survey-sampling approach.

Submitted 19 May, 2014; originally announced May 2014.

Comments: Published at http://dx.doi.org/10.1214/12-STS413 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: substantial text overlap with arXiv:1107.4868

Report number: IMS-STS-STS413

Journal ref: Statistical Science 2014, Vol. 29, No. 1, 2-8

arXiv:1404.6462 [pdf, other]

Bayesian Semiparametric Multivariate Density Deconvolution

Authors: Abhra Sarkar, Debdeep Pati, Bani K. Mallick, Raymond J. Carroll

Abstract: We consider the problem of multivariate density deconvolution when the interest lies in estimating the distribution of a vector-valued random variable but precise measurements of the variable of interest are not available, observations being contaminated with additive measurement errors. The existing sparse literature on the problem assumes the density of the measurement errors to be completely known. We propose robust Bayesian semiparametric multivariate deconvolution approaches when the measurement error density is not known but replicated proxies are available for each unobserved value of the random vector. Additionally, we allow the variability of the measurement errors to depend on the associated unobserved value of the vector of interest through unknown relationships which also automatically includes the case of multivariate multiplicative measurement errors. Basic properties of finite mixture models, multivariate normal kernels and exchangeable priors are exploited in many novel ways to meet the modeling and computational challenges. Theoretical results that show the flexibility of the proposed methods are provided. We illustrate the efficiency of the proposed methods in recovering the true density of interest through simulation experiments. The methodology is applied to estimate the joint consumption pattern of different dietary components from contaminated 24 hour recalls.

Submitted 5 December, 2016; v1 submitted 25 April, 2014; originally announced April 2014.

arXiv:1401.5030 [pdf, other]

doi 10.1103/PhysRevC.89.024307

Shape coexistence in neutron-deficient Hg isotopes studied via lifetime measurements in $^{184,186}$Hg and two-state mixing calculations

Authors: L. P. Gaffney, M. Hackstein, R. D. Page, T. Grahn, M. Scheck, P. A. Butler, P. F. Bertone, N. Bree, R. J. Carroll, M. P. Carpenter, C. J. Chiara, A. Dewald, F. Filmer, C. Fransen, M. Huyse, R. V. F. Janssens, D. T. Joss, R. Julin, F. G. Kondev, P. Nieminen, J. Pakarinen, S. V. Rigby, W. Rother, P. Van Duppen, H. V. Watkins, et al. (2 additional authors not shown)

Abstract: The neutron-deficient mercury isotopes, $^{184,186}$Hg, were studied with the Recoil Distance Doppler Shift (RDDS) method using the Gammasphere array and the Köln Plunger device. The Differential Decay Curve Method (DDCM) was employed to determine the lifetimes of the yrast states in $^{184,186}$Hg. An improvement on previously measured values of yrast states up to $8^{+}$ is presented as well as first values for the $9_{3}$ state in $^{184}$Hg and $10^{+}$ state in $^{186}$Hg. $B(E2)$ values are calculated and compared to a two-state mixing model which utilizes the variable moment of inertia (VMI) model, allowing for extraction of spin-dependent mixing strengths and amplitudes.

Submitted 11 February, 2014; v1 submitted 20 January, 2014; originally announced January 2014.

Comments: 8 pages, 9 figures, 2 tables

Journal ref: Phys. Rev. C 89, 024307 (2014)

arXiv:1312.5082 [pdf, ps, other]

doi 10.1214/13-AOS1158

Unexpected properties of bandwidth choice when smoothing discrete data for constructing a functional data classifier

Authors: Raymond J. Carroll, Aurore Delaigle, Peter Hall

Abstract: The data functions that are studied in the course of functional data analysis are assembled from discrete data, and the level of smoothing that is used is generally that which is appropriate for accurate approximation of the conceptually smooth functions that were not actually observed. Existing literature shows that this approach is effective, and even optimal, when using functional data methods for prediction or hypothesis testing. However, in the present paper we show that this approach is not effective in classification problems. There a useful rule of thumb is that undersmoothing is often desirable, but there are several surprising qualifications to that approach. First, the effect of smoothing the training data can be more significant than that of smoothing the new data set to be classified; second, undersmoothing is not always the right approach, and in fact in some cases using a relatively large bandwidth can be more effective; and third, these perverse results are the consequence of very unusual properties of error rates, expressed as functions of smoothing parameters. For example, the orders of magnitude of optimal smoothing parameter choices depend on the signs and sizes of terms in an expansion of error rate, and those signs and sizes can vary dramatically from one setting to another, even for the same classifier.

Submitted 18 December, 2013; originally announced December 2013.

Comments: Published at http://dx.doi.org/10.1214/13-AOS1158 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1158

Journal ref: Annals of Statistics 2013, Vol. 41, No. 6, 2739-2767

arXiv:1310.8176 [pdf, other]

Bayesian Regression Analysis of Data with Random Effects Covariates from Nonlinear Longitudinal Measurements

Authors: Rolando De la Cruz, Cristian Meza, Ana Arribas-Gil, Raymond J. Carroll

Abstract: Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals and may help in understanding, for instance, the mechanisms of recovery of a certain disease or the efficacy of a given therapy. The most common joint model in this framework is based on a linear mixed model for the longitudinal data. However, for complex datasets the linearity assumption may be too restrictive. Some works have considered generalizing this setting with the use of a nonlinear mixed-effects model for the longitudinal trajectories but the proposed estimation procedures based on likelihood approximations have been shown (De la Cruz et al., 2011) to exhibit some computational efficiency problems. In this article we propose an MCMC-based estimation procedure in the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification.

Submitted 2 July, 2014; v1 submitted 30 October, 2013; originally announced October 2013.

Comments: 20 pages, 3 figures

arXiv:1308.5427 [pdf, ps, other]

Adaptive Posterior Convergence Rates in Bayesian Density Deconvolution with Supersmooth Errors

Authors: Abhra Sarkar, Debdeep Pati, Bani K. Mallick, Raymond J. Carroll

Abstract: Bayesian density deconvolution using nonparametric prior distributions is a useful alternative to the frequentist kernel based deconvolution estimators due to its potentially wide range of applicability, straightforward uncertainty quantification and generalizability to more sophisticated models. This article is the first substantive effort to theoretically quantify the behavior of the posterior in this recent line of research. In particular, assuming a known supersmooth error density, a Dirichlet process mixture of Normals on the true density leads to a posterior convergence rate that is the same as the minimax rate $(\log n)^{-\eta/\beta}$ adaptively over the smoothness $\eta$ of an appropriate Hölder space of densities, where $\beta$ is the degree of smoothness of the error distribution. Our main contribution is achieving adaptive minimax rates with respect to the $L_p$ norm for $2 \leq p \leq \infty$ under mild regularity conditions on the true density. En route, we develop tight concentration bounds for a class of kernel based deconvolution estimators which might be of independent interest.

Submitted 9 September, 2013; v1 submitted 25 August, 2013; originally announced August 2013.

arXiv:1212.4852 [pdf, ps, other]

doi 10.1103/PhysRevC.86.064315

Characterizing the atomic mass surface beyond the proton drip line via α-decay measurements of the s1/2 ground state of 165Re and the h11/2 isomer in 161Ta

Authors: D. O'Donnell, R. D. Page, C. Scholey, L. Bianco, L. Capponi, R. J. Carroll, I. G. Darby, L. Donosa, M. Drummond, F. Ertugral, P. T. Greenlees, T. Grahn, K. Hauschild, A. Herzan, U. Jakobsson, P. Jones, D. T. Joss, R. Julin, S. Juutinen, S. Ketelhut, M. Labiche, M. Leino, A. Lopez-Martens, K. Mullholland, P. Nieminen, et al. (11 additional authors not shown)

Abstract: The α-decay chains originating from the s1/2 and h11/2 states in 173Au have been investigated following fusion-evaporation reactions. Four generations of α radioactivities have been correlated with 173Au^m leading to a measurement of the α decay of 161Ta^m. It has been found that the known α decay of 161Ta, which was previously associated with the decay of the ground state, is in fact the decay of an isomeric state. This work also reports on the first observation of prompt γ rays feeding the ground state of 173Au. This prompt radiation was used to aid the study of the α-decay chain originating from the s1/2 state in 173Au. Three generations of α decays have been correlated with this state leading to the observation of a previously unreported activity which is assigned as the decay of 165Re^g. This work also reports the excitation energy of an α-decaying isomer in 161Ta and the Q-value of the decay of 161Ta^g.

Submitted 19 December, 2012; originally announced December 2012.

Comments: 7 pages, 6 figures

Journal ref: Physical Review C 86, 064315 (2012)

arXiv:1211.6702 [pdf, ps, other]

On deformed quantum potentials

Authors: Robert Carroll

Abstract: We describe some analogues of quantum potentials arising in fractional or deformed Schroedinger equations.

Submitted 28 November, 2012; originally announced November 2012.

Comments: 17 pages

arXiv:1211.1898 [pdf, ps, other]

Some topics in thermodynamics and quantum mechanics

Authors: Robert Carroll

Abstract: We sketch some connecting relations involving fractional and quantum calculi, fractal structure, thermodynamics, and quantum mechanics.

Submitted 17 November, 2012; v1 submitted 29 October, 2012; originally announced November 2012.

Comments: 28 pages. A few misprints are corrected. arXiv admin note: text overlap with arXiv:0911.5392 by other authors

arXiv:1210.7052 [pdf, other]

Testing Hardy-Weinberg equilibrium with a simple root-mean-square statistic

Authors: Rachel Ward, Raymond J. Carroll

Abstract: We provide evidence that a root-mean-square test of goodness-of-fit can be significantly more powerful than state-of-the-art exact tests in detecting deviations from Hardy-Weinberg equilibrium. Unlike Pearson's chi-square test, the log-likelihood-ratio test, and Fisher's exact test, which are sensitive to relative discrepancies between genotypic frequencies, the root-mean-square test is sensitive to absolute discrepancies. This can increase statistical power, as we demonstrate using benchmark datasets and through asymptotic analysis. With the aid of computers, exact P-values for the root-mean-square statistic can be calculated effortlessly, and can be easily implemented using the author's freely available code.

Submitted 30 May, 2013; v1 submitted 26 October, 2012; originally announced October 2012.

Comments: 29 pages, 6 figures

MSC Class: 62P10

arXiv:1206.0900 [pdf, ps, other]

On a fractional quantum potential

Authors: Robert Carroll

Abstract: We use the fractional calculus of Kobelev to produce a fractional quantum potential for a corresponding Schrodinger type equation.

Submitted 20 March, 2012; originally announced June 2012.

Comments: 10 pages

arXiv:1201.6239 [pdf, ps, other]

doi 10.1103/PhysRevC.85.054315

First observation of excited states in 173Hg

Authors: D. O'Donnell, R. D. Page, C. Scholey, L. Bianco, L. Capponi, R. J. Carroll, I. G. Darby, L. Donosa, M. Drummond, F. Ertugral, P. T. Greenlees, T. Grahn, K. Hauschild, A. Herzan, U. Jakobsson, P. Jones, D. T. Joss, R. Julin, S. Juutinen, S. Ketelhut, M. Labiche, M. Leino, A. Lopez-Martens, K. Mullholland, P. Nieminen, et al. (11 additional authors not shown)

Abstract: The neutron-deficient nucleus 173Hg has been studied following fusion-evaporation reactions. The observation of gamma rays decaying from excited states is reported for the first time and a tentative level scheme is proposed. The proposed level scheme is discussed within the context of the systematics of neighbouring neutron-deficient Hg nuclei. In addition to the gamma-ray spectroscopy, the alpha decay of this nucleus has been measured yielding superior precision to earlier measurements.

Submitted 5 July, 2012; v1 submitted 30 January, 2012; originally announced January 2012.

Comments: 5 pages, 4 figures

Journal ref: Physical Review C 85, 054315 (2012)

arXiv:1112.2502 [pdf, ps, other]

doi 10.1214/11-AOS885

Estimation and variable selection for generalized additive partial linear models

Authors: Li Wang, Xiang Liu, Hua Liang, Raymond J. Carroll

Abstract: We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration.

Submitted 12 December, 2011; originally announced December 2011.

Comments: Published at http://dx.doi.org/10.1214/11-AOS885 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS885

Journal ref: Annals of Statistics 2011, Vol. 39, No. 4, 1827-1851

arXiv:1111.6283 [pdf, other]

Feature selection for high-dimensional integrated data

Authors: Charles Zheng, Scott Schwartz, Robert Chapkin, Raymond Carroll, Ivan Ivanov

Abstract: Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of \emph{feature selection} in which only a subset of the predictors $X_t$ are dependent on the multidimensional variate $Y$, and the remainder of the predictors constitute a "noise set" $X_u$ independent of $Y$. Using Monte Carlo simulations, we investigated the relative performance of two methods: thresholding and singular-value decomposition, in combination with stochastic optimization to determine "empirical bounds" on the small-sample accuracy of an asymptotic approximation. We demonstrate the utility of the thresholding and SVD feature selection methods with respect to a recent infant intestinal gene expression and metagenomics dataset.

Submitted 27 November, 2011; originally announced November 2011.

Comments: Submitted

arXiv:1110.3059 [pdf, ps, other]

Thermodynamics and scale relativity

Authors: Robert Carroll

Abstract: It is shown how the fractal paths of scale relativity (following Nottale) can be introduced into a thermodynamical context (following Asadov-Kechkin).

Submitted 13 October, 2011; originally announced October 2011.

Comments: 5 pages

arXiv:1107.4868 [pdf, ps, other]

doi 10.1214/10-AOAS446

A new multivariate measurement error model with zero-inflated dietary data, and its application to dietary assessment

Authors: Saijuan Zhang, Raymond J. Carroll, Douglas Midthune, Patricia M. Guenther, Susan M. Krebs-Smith, Victor Kipnis, Kevin W. Dodd, Dennis W. Buckman, Janet A. Tooze, Laurence Freedman

Abstract: In the United States the preferred method of obtaining dietary intake data is the 24-hour dietary recall, yet the measure of most interest is usual or long-term average daily intake, which is impossible to measure. Thus, usual dietary intake is assessed with considerable measurement error. Also, diet represents numerous foods, nutrients and other components, each of which has distinctive attributes. Sometimes, it is useful to examine intake of these components separately, but increasingly nutritionists are interested in exploring them collectively to capture overall dietary patterns. Consumption of these components varies widely: some are consumed daily by almost everyone on every day, while others are episodically consumed so that 24-hour recall data are zero-inflated. In addition, they are often correlated with each other. Finally, it is often preferable to analyze the amount of a dietary component relative to the amount of energy (calories) in a diet because dietary recommendations often vary with energy level. The quest to understand overall dietary patterns of usual intake has to this point reached a standstill. There are no statistical methods or models available to model such complex multivariate data with its measurement error and zero inflation. This paper proposes the first such model, and it proposes the first workable solution to fit such a model. After describing the model, we use survey-weighted MCMC computations to fit the model, with uncertainty estimation coming from balanced repeated replication.

Submitted 25 July, 2011; originally announced July 2011.

Comments: Published at http://dx.doi.org/10.1214/10-AOAS446 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS446

Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 2B, 1456-1487

arXiv:1107.2980 [pdf, other]

doi 10.1088/0266-5611/27/11/115009

A Bayesian Approach to Detection of Small Low Emission Sources

Authors: Xiaolei Xun, Bani Mallick, Raymond J. Carroll, Peter Kuchment

Abstract: The article addresses the problem of detecting presence and location of a small low emission source inside of an object, when the background noise dominates. This problem arises, for instance, in some homeland security applications. The goal is to reach the signal-to-noise ratio (SNR) levels on the order of $10^{-3}$. A Bayesian approach to this problem is implemented in 2D. The method allows inference not only about the existence of the source, but also about its location. We derive Bayes factors for model selection and estimation of location based on Markov Chain Monte Carlo (MCMC) simulation. A simulation study shows that with sufficiently high total emission level, our method can effectively locate the source.

Submitted 14 July, 2011; originally announced July 2011.

MSC Class: 65C60; 82Dxx

Journal ref: Inverse Problems 27 (2011), 115009 (11pp)

arXiv:1104.0383 [pdf, ps, other]

doi 10.1088/1742-6596/361/1/012010

Remarks on osmosis, quantum mechanics, and gravity

Authors: Robert Carroll

Abstract: Some relations of the quantum potential to Weyl geometry are indicated with applications to the Friedmann equations for a toy quantum cosmology. Osmotic velocity and pressure are briefly discussed in terms of quantum mechanics and superfluids with connections to gravity.

Submitted 3 April, 2011; originally announced April 2011.

Comments: 16 pages

arXiv:1010.4700 [pdf, ps, other]

doi 10.1214/09-STS297

Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

Authors: Nilanjan Chatterjee, Yi-Hau Chen, Sheng Luo, Raymond J. Carroll

Abstract: Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications: (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data.

Submitted 22 October, 2010; originally announced October 2010.

Comments: Published at http://dx.doi.org/10.1214/09-STS297 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-STS-STS297

Journal ref: Statistical Science 2009, Vol. 24, No. 4, 489-502

arXiv:1010.1732 [pdf, ps, other]

On some toy quantum cosmology

Authors: Robert Carroll

Abstract: Some connections of the quantum potential to gravitation are discussed.

Submitted 8 October, 2010; originally announced October 2010.

arXiv:1009.5750 [pdf, ps, other]

doi 10.1214/09-AOAS253

Use of multiple singular value decompositions to analyze complex intracellular calcium ion signals

Authors: Josue G. Martinez, Jianhua Z. Huang, Robert C. Burghardt, Rola Barhoumi, Raymond J. Carroll

Abstract: We compare calcium ion signaling ($\mathrm{Ca}^{2+}$) between two exposures; the data are presented as movies, or, more prosaically, time series of images. This paper describes novel uses of singular value decompositions (SVD) and weighted versions of them (WSVD) to extract the signals from such movies, in a way that is semi-automatic and tuned closely to the actual data and their many complexities. These complexities include the following. First, the images themselves are of no interest: all interest focuses on the behavior of individual cells across time, and thus, the cells need to be segmented in an automated manner. Second, the cells themselves have 100$+$ pixels, so that they form 100$+$ curves measured over time, so that data compression is required to extract the features of these curves. Third, some of the pixels in some of the cells are subject to image saturation due to bit depth limits, and this saturation needs to be accounted for if one is to normalize the images in a reasonably unbiased manner. Finally, the $\mathrm{Ca}^{2+}$ signals have oscillations or waves that vary with time and these signals need to be extracted. Thus, our aim is to show how to use multiple weighted and standard singular value decompositions to detect, extract and clarify the $\mathrm{Ca}^{2+}$ signals. Our signal extraction methods then lead to simple although finely focused statistical methods to compare $\mathrm{Ca}^{2+}$ signals across experimental conditions.

Submitted 28 September, 2010; originally announced September 2010.

Comments: Published at http://dx.doi.org/10.1214/09-AOAS253 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS253

Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 4, 1467-1492

arXiv:1007.4744 [pdf, ps, other]

Remarks on gravity and quantum mechanics

Authors: Robert Carroll

Abstract: Some relations between conformal relativity and Weyl-Dirac theory are rephrased and clarified.

Submitted 2 August, 2010; v1 submitted 27 July, 2010; originally announced July 2010.

Comments: 7 Pages

arXiv:0808.2965 [pdf, ps, other]

On stability, fluctuations, and quantum mechanics

Authors: Robert Carroll

Abstract: We review a stability approach to quantization by Rusov and Vlasenko and indicate possible comparisons of fluctuations to standard situations involving a quantum potential.

Submitted 31 August, 2008; v1 submitted 21 August, 2008; originally announced August 2008.

Comments: 13 pages, Latex, a few comments added

arXiv:0807.4158 [pdf, ps, other]

Remarks on Fisher information

Authors: Robert Carroll

Abstract: Some situations are discussed where subquantum oscillations in momentum arise in connection with Fisher information and the quantum potential.

Submitted 25 July, 2008; originally announced July 2008.

Comments: 15 pages, Latex

Showing 1–50 of 97 results for author: Carroll, R