-
Terrain Classification Enhanced with Uncertainty for Space Exploration Robots from Proprioceptive Data
Authors:
Mariela De Lucas Álvarez,
Jichen Guo,
Raul Domínguez,
Matias Valdenegro-Toro
Abstract:
Terrain Classification is an essential task in space exploration, where unpredictable environments are difficult to observe using only exteroceptive sensors such as vision. Implementing Neural Network classifiers can have high performance but can be deemed untrustworthy as they lack transparency, which makes them unreliable for taking high-stakes decisions during mission planning. We address this by proposing Neural Networks with Uncertainty Quantification in Terrain Classification. We enable our Neural Networks with Monte Carlo Dropout, DropConnect, and Flipout in time series-capable architectures using only proprioceptive data as input. We use Bayesian Optimization with Hyperband for efficient hyperparameter optimization to find optimal models for trustworthy terrain classification.
Submitted 3 July, 2024;
originally announced July 2024.
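The abstract above names Monte Carlo Dropout as one of its uncertainty-quantification methods. The following is a minimal sketch of the general idea only, not the authors' architecture or code. It assumes a hypothetical two-layer classifier with made-up weights `w1` and `w2`: dropout stays active at prediction time, class probabilities are averaged over several stochastic passes, and the entropy of the averaged distribution serves as an uncertainty score.

```python
import numpy as np


def mc_dropout_predict(x, w1, w2, rate=0.5, n_samples=100, seed=0):
    """Average softmax outputs over stochastic forward passes with dropout on.

    x: (batch, features); w1: (features, hidden); w2: (hidden, classes).
    All names and shapes are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ w1, 0.0)                # ReLU hidden layer
        mask = rng.random(h.shape) >= rate         # keep units with prob 1 - rate
        h = h * mask / (1.0 - rate)                # inverted-dropout scaling
        logits = h @ w2
        logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=-1, keepdims=True)         # softmax
        probs.append(p)
    probs = np.stack(probs)                        # (n_samples, batch, classes)
    mean = probs.mean(axis=0)                      # predictive distribution
    entropy = -(mean * np.log(mean + 1e-12)).sum(axis=-1)  # uncertainty score
    return mean, entropy


# Usage with random placeholder data (no real sensor data is implied).
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 6))
w1 = rng.normal(size=(6, 8))
w2 = rng.normal(size=(8, 3))
mean, entropy = mc_dropout_predict(x, w1, w2, n_samples=20)
print(mean.shape, entropy.shape)
```

A higher entropy marks inputs on which the stochastic passes disagree, which is the kind of signal a planner could use to decide when not to trust a terrain label.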
-
Scalable Multi-Output Gaussian Processes with Stochastic Variational Inference
Authors:
Xiaoyu Jiang,
Sokratia Georgaka,
Magnus Rattray,
Mauricio A. Alvarez
Abstract:
The Multi-Output Gaussian Process (MOGP) is a popular tool for modelling data from multiple sources. A typical choice to build a covariance function for a MOGP is the Linear Model of Coregionalization (LMC), which parametrically models the covariance between outputs. The Latent Variable MOGP (LV-MOGP) generalises this idea by modelling the covariance between outputs using a kernel applied to latent variables, one per output, leading to a flexible MOGP model that allows efficient generalization to new outputs with few data points. Computational complexity in LV-MOGP grows linearly with the number of outputs, which makes it unsuitable for problems with a large number of outputs. In this paper, we propose a stochastic variational inference approach for the LV-MOGP that allows mini-batches for both inputs and outputs, making computational complexity per training iteration independent of the number of outputs.
Submitted 2 July, 2024;
originally announced July 2024.
-
Adaptive RKHS Fourier Features for Compositional Gaussian Process Models
Authors:
Xinxing Shi,
Thomas Baldwin-McDonald,
Mauricio A. Álvarez
Abstract:
Deep Gaussian Processes (DGPs) leverage a compositional structure to model non-stationary processes. DGPs typically rely on local inducing point approximations across intermediate GP layers. Recent advances in DGP inference have shown that incorporating global Fourier features from Reproducing Kernel Hilbert Space (RKHS) can enhance the DGPs' capability to capture complex non-stationary patterns. This paper extends the use of these features to compositional GPs involving linear transformations. In particular, we introduce Ordinary Differential Equation (ODE) -based RKHS Fourier features that allow for adaptive amplitude and phase modulation through convolution operations. This convolutional formulation relates our work to recently proposed deep latent force models, a multi-layer structure designed for modelling nonlinear dynamical systems. By embedding these adjustable RKHS Fourier features within a doubly stochastic variational inference framework, our model exhibits improved predictive performance across various regression tasks.
Submitted 1 July, 2024;
originally announced July 2024.
-
Galactic Rotation Curves of LSB Galaxies using core-halo FDM configurations
Authors:
Ivan Alvarez-Rios,
Tula Bernal,
Pierre-Henri Chavanis,
Francisco S. Guzman
Abstract:
In this work, we construct galactic halos in order to fit the rotation curves (RCs) of a sample of low surface brightness (LSB) galaxies. These halos are made of Fuzzy Dark Matter (FDM) with a multimode expansion of non-spherical modes that on average contribute to the appropriate density profile consisting of a core and an envelope needed to fit the rotation curves. These halos are constructed assuming a solitonic core at the center and two types of envelopes, Navarro-Frenk-White and Pseudo-Isothermal density profiles. The resulting FDM configurations are then evolved in order to show how the average density changes in time due to the secular dynamical evolution, along with a condensation process that leads to the growth of the solitonic core.
Submitted 25 June, 2024;
originally announced June 2024.
-
Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation
Authors:
Zhenxin Li,
Kailin Li,
Shihao Wang,
Shiyi Lan,
Zhiding Yu,
Yishen Ji,
Zhiqi Li,
Ziyue Zhu,
Jan Kautz,
Zuxuan Wu,
Yu-Gang Jiang,
Jose M. Alvarez
Abstract:
We propose Hydra-MDP, a novel paradigm employing multiple teachers in a teacher-student model. This approach uses knowledge distillation from both human and rule-based teachers to train the student model, which features a multi-head decoder to learn diverse trajectory candidates tailored to various evaluation metrics. With the knowledge of rule-based teachers, Hydra-MDP learns how the environment influences the planning in an end-to-end manner instead of resorting to non-differentiable post-processing. This method achieves the $1^{st}$ place in the Navsim challenge, demonstrating significant improvements in generalization across diverse driving environments and conditions. Code will be available at \url{https://github.com/NVlabs/Hydra-MDP}.
Submitted 19 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Step Out and Seek Around: On Warm-Start Training with Incremental Data
Authors:
Maying Shen,
Hongxu Yin,
Pavlo Molchanov,
Lei Mao,
Jose M. Alvarez
Abstract:
Data often arrives in sequence over time in real-world deep learning applications such as autonomous driving. When new training data is available, training the model from scratch undermines the benefit of leveraging the learned knowledge, leading to significant training costs. Warm-starting from a previously trained checkpoint is the most intuitive way to retain knowledge and advance learning. However, existing literature suggests that this warm-starting degrades generalization. In this paper, we advocate for warm-starting but stepping out of the previous converging point, thus allowing a better adaptation to new data without compromising previous knowledge. We propose Knowledge Consolidation and Acquisition (CKCA), a continuous model improvement algorithm with two novel components. First, we introduce a novel feature regularization (FeatReg) to retain and refine knowledge from existing checkpoints. Second, we propose adaptive knowledge distillation (AdaKD), a novel approach to forgetting mitigation and knowledge transfer. We tested our method on ImageNet using multiple splits of the training data. Our approach achieves up to $8.39\%$ higher top-1 accuracy than vanilla warm-starting and consistently outperforms the prior art by a large margin.
Submitted 6 June, 2024;
originally announced June 2024.
-
A Causal Framework for Evaluating Deferring Systems
Authors:
Filippo Palomba,
Andrea Pugnana,
José Manuel Alvarez,
Salvatore Ruggieri
Abstract:
Deferring systems extend supervised Machine Learning (ML) models with the possibility to defer predictions to human experts. However, evaluating the impact of a deferring strategy on system accuracy is still an overlooked area. This paper fills this gap by evaluating deferring systems through a causal lens. We link the potential outcomes framework for causal inference with deferring systems. This allows us to identify the causal impact of the deferring strategy on predictive accuracy. We distinguish two scenarios. In the first one, we can access both the human and the ML model predictions for the deferred instances. In such a case, we can identify the individual causal effects for deferred instances and aggregates of them. In the second scenario, only human predictions are available for the deferred instances. In this case, we can resort to regression discontinuity design to estimate a local causal effect. We empirically evaluate our approach on synthetic and real datasets for seven deferring systems from the literature.
Submitted 29 May, 2024;
originally announced May 2024.
-
Memorize What Matters: Emergent Scene Decomposition from Multitraverse
Authors:
Yiming Li,
Zehong Wang,
Yue Wang,
Zhiding Yu,
Zan Gojcic,
Marco Pavone,
Chen Feng,
Jose M. Alvarez
Abstract:
Humans naturally retain memories of permanent elements, while ephemeral moments often slip through the cracks of memory. This selective retention is crucial for robotic perception, localization, and mapping. To endow robots with this capability, we introduce 3D Gaussian Mapping (3DGM), a self-supervised, camera-only offline mapping framework grounded in 3D Gaussian Splatting. 3DGM converts multitraverse RGB videos from the same region into a Gaussian-based environmental map while concurrently performing 2D ephemeral object segmentation. Our key observation is that the environment remains consistent across traversals, while objects frequently change. This allows us to exploit self-supervision from repeated traversals to achieve environment-object decomposition. More specifically, 3DGM formulates multitraverse environmental mapping as a robust differentiable rendering problem, treating pixels of the environment and objects as inliers and outliers, respectively. Using robust feature distillation, feature residuals mining, and robust optimization, 3DGM jointly performs 2D segmentation and 3D mapping without human intervention. We build the Mapverse benchmark, sourced from the Ithaca365 and nuPlan datasets, to evaluate our method in unsupervised 2D segmentation, 3D reconstruction, and neural rendering. Extensive results verify the effectiveness and potential of our method for self-driving and robotics.
Submitted 29 May, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
Uncovering Algorithmic Discrimination: An Opportunity to Revisit the Comparator
Authors:
Jose M. Alvarez,
Salvatore Ruggieri
Abstract:
Causal reasoning, in particular counterfactual reasoning, plays a central role in testing for discrimination. Counterfactual reasoning materializes in what is known as the counterfactual model of discrimination, in which we compare the discrimination comparator with the discrimination complainant, where the comparator is a profile similar (or similarly situated) to that of the complainant and is used for testing the complainant's discrimination claim. In this paper, we revisit the comparator by presenting two kinds of comparators based on the sort of causal intervention we want to represent. We present the ceteris paribus and the mutatis mutandis comparator, where the former is the standard and the latter is a new kind of comparator. We argue for the use of the mutatis mutandis comparator, which is built on the fairness given the difference notion, for testing future algorithmic discrimination cases.
Submitted 22 May, 2024;
originally announced May 2024.
-
Deciphering public attention to geoengineering and climate issues using machine learning and dynamic analysis
Authors:
Ramit Debnath,
Pengyu Zhang,
Tianzhu Qin,
R. Michael Alvarez,
Shaun D. Fitzgerald
Abstract:
As the conversation around using geoengineering to combat climate change intensifies, it is imperative to engage the public and deeply understand their perspectives on geoengineering research, development, and potential deployment. Through a comprehensive data-driven investigation, this paper explores the types of news that captivate public interest in geoengineering. We delved into 30,773 English-language news articles from the BBC and the New York Times, combined with Google Trends data spanning 2018 to 2022, to explore how public interest in geoengineering fluctuates in response to news coverage of broader climate issues. Using BERT-based topic modeling, sentiment analysis, and time-series regression models, we found that positive sentiment in energy-related news serves as a good predictor of heightened public interest in geoengineering, a trend that persists over time. Our findings suggest that public engagement with geoengineering and climate action is not uniform, with some topics being more potent in shaping interest over time, such as climate news related to energy, disasters, and politics. Understanding these patterns is crucial for scientists, policymakers, and educators aiming to craft effective strategies for engaging with the public and fostering dialogue around emerging climate technologies.
Submitted 11 May, 2024;
originally announced May 2024.
-
Physics-based deep learning reveals rising heating demand heightens air pollution in Norwegian cities
Authors:
Cong Cao,
Ramit Debnath,
R. Michael Alvarez
Abstract:
Policymakers frequently analyze air quality and climate change in isolation, disregarding their interactions. This study explores the influence of specific climate factors on air quality by contrasting a regression model with K-Means Clustering, Hierarchical Clustering, and Random Forest techniques. We employ Physics-based Deep Learning (PBDL) and Long Short-Term Memory (LSTM) to examine the air pollution predictions. Our analysis utilizes ten years (2009-2018) of daily traffic, weather, and air pollution data from three major cities in Norway. Findings from feature selection reveal a correlation between rising heating degree days and heightened air pollution levels, suggesting increased heating activities in Norway are a contributing factor to worsening air quality. PBDL demonstrates superior accuracy in air pollution predictions compared to LSTM. This paper contributes to the growing literature on PBDL methods for more accurate air pollution predictions using environmental variables, aiding policymakers in formulating effective data-driven climate policies.
Submitted 7 May, 2024;
originally announced May 2024.
-
OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning
Authors:
Shihao Wang,
Zhiding Yu,
Xiaohui Jiang,
Shiyi Lan,
Min Shi,
Nadine Chang,
Jan Kautz,
Ying Li,
Jose M. Alvarez
Abstract:
The advances in multimodal large language models (MLLMs) have led to growing interest in LLM-based autonomous driving agents to leverage their strong reasoning capabilities. However, capitalizing on MLLMs' strong reasoning capabilities for improved planning behavior is challenging since planning requires full 3D situational awareness beyond 2D reasoning. To address this challenge, our work proposes a holistic framework for strong alignment between agent models and 3D driving tasks. Our framework starts with a novel 3D MLLM architecture that uses sparse queries to lift and compress visual representations into 3D before feeding them into an LLM. This query-based representation allows us to jointly encode dynamic objects and static map elements (e.g., traffic lanes), providing a condensed world model for perception-action alignment in 3D. We further propose OmniDrive-nuScenes, a new visual question-answering dataset challenging the true 3D situational awareness of a model with comprehensive visual question-answering (VQA) tasks, including scene description, traffic regulation, 3D grounding, counterfactual reasoning, decision making and planning. Extensive studies show the effectiveness of the proposed architecture as well as the importance of the VQA tasks for reasoning and planning in complex 3D scenes.
Submitted 2 May, 2024;
originally announced May 2024.
-
Mining Supervision for Dynamic Regions in Self-Supervised Monocular Depth Estimation
Authors:
Hoang Chuong Nguyen,
Tianyu Wang,
Jose M. Alvarez,
Miaomiao Liu
Abstract:
This paper focuses on self-supervised monocular depth estimation in dynamic scenes trained on monocular videos. Existing methods jointly estimate pixel-wise depth and motion, relying mainly on an image reconstruction loss. Dynamic regions remain a critical challenge for these methods due to the inherent ambiguity in depth and motion estimation, resulting in inaccurate depth estimation. This paper proposes a self-supervised training framework exploiting pseudo depth labels for dynamic regions from training data. The key contribution of our framework is to decouple depth estimation for static and dynamic regions of images in the training data. We start with an unsupervised depth estimation approach, which provides reliable depth estimates for static regions and motion cues for dynamic regions and allows us to extract moving object information at the instance level. In the next stage, we use an object network to estimate the depth of those moving objects assuming rigid motions. Then, we propose a new scale alignment module to address the scale ambiguity between estimated depths for static and dynamic regions. We can then use the depth labels generated to train an end-to-end depth estimation network and improve its performance. Extensive experiments on the Cityscapes and KITTI datasets show that our self-training strategy consistently outperforms existing self/unsupervised depth estimation methods.
Submitted 23 April, 2024;
originally announced April 2024.
-
BoLD: Fast and Cheap Dispute Resolution
Authors:
Mario M. Alvarez,
Henry Arneson,
Ben Berger,
Lee Bousfield,
Chris Buckland,
Yafah Edelman,
Edward W. Felten,
Daniel Goldman,
Raul Jordan,
Mahimna Kelkar,
Akaki Mamageishvili,
Harry Ng,
Aman Sanghi,
Victor Shoup,
Terence Tsao
Abstract:
BoLD is a new dispute resolution protocol that is designed to replace the originally deployed Arbitrum dispute resolution protocol. Unlike that protocol, BoLD is resistant to delay attacks. It achieves this resistance without a significant increase in onchain computation costs and with reduced staking costs.
Submitted 16 April, 2024;
originally announced April 2024.
-
Discovery of a dormant 33 solar-mass black hole in pre-release Gaia astrometry
Authors:
Gaia Collaboration,
P. Panuzzo,
T. Mazeh,
F. Arenou,
B. Holl,
E. Caffau,
A. Jorissen,
C. Babusiaux,
P. Gavras,
J. Sahlmann,
U. Bastian,
Ł. Wyrzykowski,
L. Eyer,
N. Leclerc,
N. Bauchet,
A. Bombrun,
N. Mowlavi,
G. M. Seabroke,
D. Teyssier,
E. Balbinot,
A. Helmi,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne
, et al. (390 additional authors not shown)
Abstract:
Gravitational waves from black-hole merging events have revealed a population of extra-galactic BHs residing in short-period binaries with masses that are higher than expected based on most stellar evolution models - and also higher than known stellar-origin black holes in our Galaxy. It has been proposed that those high-mass BHs are the remnants of massive metal-poor stars. Gaia astrometry is expected to uncover many Galactic wide-binary systems containing dormant BHs, which may not have been detected before. The study of this population will provide new information on the BH-mass distribution in binaries and shed light on their formation mechanisms and progenitors. As part of the validation efforts in preparation for the fourth Gaia data release (DR4), we analysed the preliminary astrometric binary solutions, obtained by the Gaia Non-Single Star pipeline, to verify their significance and to minimise false-detection rates in high-mass-function orbital solutions. The astrometric binary solution of one source, Gaia BH3, implies the presence of a $32.70 \pm 0.82\,M_\odot$ BH in a binary system with a period of 11.6 yr. Gaia radial velocities independently validate the astrometric orbit. Broad-band photometric and spectroscopic data show that the visible component is an old, very metal-poor giant of the Galactic halo, at a distance of 590 pc. The BH in the Gaia BH3 system is more massive than any other Galactic stellar-origin BH known thus far. The low metallicity of the star companion supports the scenario that metal-poor massive stars are progenitors of the high-mass BHs detected by gravitational-wave telescopes. The Galactic orbit of the system and its metallicity indicate that it might belong to the Sequoia halo substructure. Alternatively, and more plausibly, it could belong to the ED-2 stream, which likely originated from a globular cluster that had been disrupted by the Milky Way.
Submitted 19 April, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Evaluating the Quality of Answers in Political Q&A Sessions with Large Language Models
Authors:
R. Michael Alvarez,
Jacob Morrier
Abstract:
This paper presents a new approach to evaluating the quality of answers in political question-and-answer sessions. We propose to measure an answer's quality based on the degree to which it allows us to infer the initial question accurately. This conception of answer quality inherently reflects their relevance to initial questions. Drawing parallels with semantic search, we argue that this measurement approach can be operationalized by fine-tuning a large language model on the observed corpus of questions and answers without additional labeled data. We showcase our measurement approach within the context of the Question Period in the Canadian House of Commons. Our approach yields valuable insights into the correlates of the quality of answers in the Question Period. We find that answer quality varies significantly based on the party affiliation of the members of Parliament asking the questions and uncover a meaningful correlation between answer quality and the topics of the questions.
Submitted 12 April, 2024;
originally announced April 2024.
-
DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
B. Bahr-Kalus,
S. Bailey,
C. Baltay,
A. Bault,
J. Behera,
S. BenZvi,
A. Bera,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum
, et al. (178 additional authors not shown)
Abstract:
We present cosmological results from the measurement of baryon acoustic oscillations (BAO) in galaxy, quasar and Lyman-$α$ forest tracers from the first year of observations from the Dark Energy Spectroscopic Instrument (DESI), to be released in the DESI Data Release 1. DESI BAO provide robust measurements of the transverse comoving distance and Hubble rate, or their combination, relative to the sound horizon, in seven redshift bins from over 6 million extragalactic objects in the redshift range $0.1<z<4.2$. DESI BAO data alone are consistent with the standard flat $Λ$CDM cosmological model with a matter density $Ω_\mathrm{m}=0.295\pm 0.015$. Paired with a BBN prior and the robustly measured acoustic angular scale from the CMB, DESI requires $H_0=(68.52\pm0.62)$ km/s/Mpc. In conjunction with CMB anisotropies from Planck and CMB lensing data from Planck and ACT, we find $Ω_\mathrm{m}=0.307\pm 0.005$ and $H_0=(67.97\pm0.38)$ km/s/Mpc. Extending the baseline model with a constant dark energy equation of state parameter $w$, DESI BAO alone require $w=-0.99^{+0.15}_{-0.13}$. In models with a time-varying dark energy equation of state parametrized by $w_0$ and $w_a$, combinations of DESI with CMB or with SN~Ia individually prefer $w_0>-1$ and $w_a<0$. This preference is 2.6$σ$ for the DESI+CMB combination, and persists or grows when SN~Ia are added in, giving results discrepant with the $Λ$CDM model at the $2.5σ$, $3.5σ$ or $3.9σ$ levels for the addition of Pantheon+, Union3, or DES-SN5YR datasets respectively. For the flat $Λ$CDM model with the sum of neutrino mass $\sum m_ν$ free, combining the DESI and CMB data yields an upper limit $\sum m_ν< 0.072$ $(0.113)$ eV at 95% confidence for a $\sum m_ν>0$ $(\sum m_ν>0.059)$ eV prior. These neutrino-mass constraints are substantially relaxed in models beyond $Λ$CDM. [Abridged.]
Submitted 24 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman Alpha Forest
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
S. Bailey,
C. Baltay,
A. Bault,
J. Bautista,
J. Behera,
S. BenZvi,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum,
S. Brieden
, et al. (174 additional authors not shown)
Abstract:
We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-$α$ (Ly$α$) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over $420\,000$ Ly$α$ forest spectra and their correlation with the spatial distribution of more than $700\,000$ quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon ($r_d$), we measure the expansion at $z_{\rm eff}=2.33$ with 2\% precision, $H(z_{\rm eff}) = (239.2 \pm 4.8) (147.09~{\rm Mpc} /r_d)$ km/s/Mpc. Similarly, we present a 2.4\% measurement of the transverse comoving distance to the same redshift, $D_M(z_{\rm eff}) = (5.84 \pm 0.14) (r_d/147.09~{\rm Mpc})$ Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.
Submitted 12 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
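The precisions quoted in the abstract above (2% on $H(z_{\rm eff})$ and 2.4% on $D_M(z_{\rm eff})$) follow from the ratio of each uncertainty to its central value. The sketch below checks that arithmetic and shows how both results scale with the sound horizon $r_d$. The function names are illustrative and are not taken from the DESI analysis code.

```python
# Values quoted in the abstract, valid for r_d = 147.09 Mpc.
H_EFF, H_ERR = 239.2, 4.8      # km/s/Mpc
DM_EFF, DM_ERR = 5.84, 0.14    # Gpc
RD_FID = 147.09                # Mpc, reference sound horizon


def relative_precision(value, sigma):
    """Fractional uncertainty sigma / value."""
    return sigma / value


def rescale(rd):
    """Return (H, D_M) for a sound horizon rd in Mpc.

    H scales as 147.09 / r_d and D_M as r_d / 147.09, as in the abstract.
    """
    return H_EFF * (RD_FID / rd), DM_EFF * (rd / RD_FID)


print(f"H precision:   {relative_precision(H_EFF, H_ERR):.1%}")    # 2.0%
print(f"D_M precision: {relative_precision(DM_EFF, DM_ERR):.1%}")  # 2.4%
```

Calling `rescale(147.09)` returns the quoted central values unchanged. Any other `rd` is a hypothetical input chosen by the reader.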
-
DESI 2024 III: Baryon Acoustic Oscillations from Galaxies and Quasars
Authors:
DESI Collaboration,
A. G. Adame,
J. Aguilar,
S. Ahlen,
S. Alam,
D. M. Alexander,
M. Alvarez,
O. Alves,
A. Anand,
U. Andrade,
E. Armengaud,
S. Avila,
A. Aviles,
H. Awan,
S. Bailey,
C. Baltay,
A. Bault,
J. Behera,
S. BenZvi,
F. Beutler,
D. Bianchi,
C. Blake,
R. Blum,
S. Brieden,
A. Brodzeller, et al. (171 additional authors not shown)
Abstract:
We present the DESI 2024 galaxy and quasar baryon acoustic oscillations (BAO) measurements using over 5.7 million unique galaxy and quasar redshifts in the range 0.1<z<2.1. Divided by tracer type, we utilize 300,017 galaxies from the magnitude-limited Bright Galaxy Survey with 0.1<z<0.4, 2,138,600 Luminous Red Galaxies with 0.4<z<1.1, 2,432,022 Emission Line Galaxies with 0.8<z<1.6, and 856,652 quasars with 0.8<z<2.1, over a ~7,500 square degree footprint. The analysis was blinded at the catalog-level to avoid confirmation bias. All fiducial choices of the BAO fitting and reconstruction methodology, as well as the size of the systematic errors, were determined on the basis of the tests with mock catalogs and the blinded data catalogs. We present several improvements to the BAO analysis pipeline, including enhancing the BAO fitting and reconstruction methods in a more physically-motivated direction, and also present results using combinations of tracers. We present a re-analysis of SDSS BOSS and eBOSS results applying the improved DESI methodology and find scatter consistent with the level of the quoted SDSS theoretical systematic uncertainties. With the total effective survey volume of ~ 18 Gpc$^3$, the combined precision of the BAO measurements across the six different redshift bins is ~0.52%, marking a 1.2-fold improvement over the previous state-of-the-art results using only first-year data. We detect the BAO in all of these six redshift bins. The highest significance of BAO detection is $9.1σ$ at the effective redshift of 0.93, with a constraint of 0.86% placed on the BAO scale. We find our measurements are systematically larger than the prediction of Planck-2018 LCDM model at z<0.8. We translate the results into transverse comoving distance and radial Hubble distance measurements, which are used to constrain cosmological models in our companion paper [abridged].
Submitted 3 April, 2024;
originally announced April 2024.
-
What is Point Supervision Worth in Video Instance Segmentation?
Authors:
Shuaiyi Huang,
De-An Huang,
Zhiding Yu,
Shiyi Lan,
Subhashree Radhakrishnan,
Jose M. Alvarez,
Abhinav Shrivastava,
Anima Anandkumar
Abstract:
Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment, and track objects in videos. Conventional VIS methods rely on densely-annotated object masks which are expensive. We reduce the human annotations to only one point for each object in a video frame during training, and obtain high-quality mask predictions close to fully supervised models. Our proposed training method consists of a class-agnostic proposal generation module to provide rich negative samples and a spatio-temporal point-based matcher to match the object queries with the provided point annotations. Comprehensive experiments on three VIS benchmarks demonstrate competitive performance of the proposed framework, nearly matching fully supervised methods.
Submitted 1 April, 2024;
originally announced April 2024.
-
Thalamocortical interactions shape hierarchical neural variability during stimulus perception
Authors:
Adrià Tauste Campo,
Antonio Zainos,
Yuriria Vázquez,
Raul Adell Segarra,
Manuel Álvarez,
Gustavo Deco,
Héctor Díaz,
Sergio Parra,
Ranulfo Romo,
Román Rossi-Pool
Abstract:
The brain is hierarchically organized to process sensory signals. But, to what extent do functional connections within and across areas shape this hierarchical order? We addressed this problem in the thalamocortical network, while monkeys judged the presence or absence of a vibrotactile stimulus. We quantified the variability by means of intrinsic timescales and Fano factor, and functional connectivity by means of a directionality measure in simultaneously recorded neurons sharing the same cutaneous receptive field from the somatosensory thalamus (VPL) and areas 3b and 1 from the somatosensory cortex. During the pre-stimulus periods, VPL and area 3b exhibited similarly fast dynamics while area 1 showed much slower timescales. Furthermore, during the stimulus presence, the Fano factor increased along the network VPL-3b-1. In parallel, VPL established two separate main feedforward pathways with areas 3b and 1 to process stimulus information. While feedforward interactions from VPL and area 3b were favored by neurons within specific Fano factor ranges, neural variability in area 1 was invariant to the incoming pathways. In contrast to VPL and area 3b, during the stimulus arrival, area 1 showed significant intra-area interactions, which mainly pointed to neurons with slow intrinsic timescales. Overall, our results suggest that the lower variability of VPL and area 3b regulates feedforward thalamocortical communication, while the higher variability of area 1 supports intra-cortical interactions during sensory processing. These results provide evidence of a hierarchical order along the thalamocortical network.
Submitted 19 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
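The abstract above quantifies neural variability with the Fano factor, which is conventionally the variance of spike counts across trials divided by their mean. A minimal sketch follows. It assumes the sample variance and made-up spike counts, and the study's actual counting windows and estimator are not given in the abstract.

```python
from statistics import mean, variance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across trials.

    Uses the sample variance (n - 1 denominator); this choice is an
    assumption, not something stated in the abstract.
    """
    m = mean(spike_counts)
    if m == 0:
        raise ValueError("Fano factor is undefined for a zero mean count")
    return variance(spike_counts) / m


# Hypothetical spike counts from four trials of one neuron.
counts = [2, 4, 4, 6]
print(fano_factor(counts))
```

A Poisson process gives a Fano factor of 1, so values below 1 indicate counts that are more regular than Poisson and values above 1 indicate higher trial-to-trial variability.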
-
Improving Distant 3D Object Detection Using 2D Box Supervision
Authors:
Zetong Yang,
Zhiding Yu,
Chris Choy,
Renhao Wang,
Anima Anandkumar,
Jose M. Alvarez
Abstract:
Improving the detection of distant 3D objects is an important yet challenging task. For camera-based 3D perception, the annotation of 3D bounding boxes relies heavily on LiDAR for accurate depth information. As such, the distance of annotation is often limited due to the sparsity of LiDAR points on distant objects, which hampers the capability of existing detectors for long-range scenarios. We address this challenge by considering only 2D box supervision for distant objects, since 2D boxes are easy to annotate. We propose LR3D, a framework that learns to recover the missing depth of distant objects. LR3D adopts an implicit projection head to learn a mapping between 2D boxes and depth using the 3D supervision on close objects. This mapping allows the depth estimation of distant objects conditioned on their 2D boxes, making long-range 3D detection with 2D supervision feasible. Experiments show that without distant 3D annotations, LR3D allows camera-based methods to detect distant objects (over 200 m) with comparable accuracy to full 3D supervision. Our framework is general and could broadly benefit 3D detection methods.
Submitted 14 March, 2024;
originally announced March 2024.
-
Monte Carlo algorithm for calculating pre-neutron fragment mass and kinetic energy distributions from measurements using the 2E Technique for reaction 235U(nth, f)
Authors:
Modesto Montoya,
Miguel Roca,
Mario Alvarez
Abstract:
An algorithm is proposed that enables the calculation of pre-neutron mass and energy distributions of fission fragments from uranium-235, induced by thermal neutrons, utilizing measurements obtained through the 2E technique. This algorithm facilitates the inference of the curve representing the average multiplicity of prompt neutrons relative to the pre-neutron mass of the fission fragments.
Submitted 8 January, 2024;
originally announced March 2024.
-
Triple Roman Domination in Graphs
Authors:
Hossein Abdollahzadeh Ahangar,
M. Pilar Alvarez,
Mustapha Chellali,
Seyed Mahmoud Sheikholeslami,
Juan Carlos Valenzuela-Tripodoro
Abstract:
Roman domination in graphs is well studied in graph theory. The topic is related to a defensive strategy problem in which the Roman legions are settled in some secure cities of the Roman Empire. The deployment of the legions around the Empire is designed in such a way that a sudden attack on any undefended city could be quelled by a legion from a strong neighbour. There is an additional condition: no legion can move if doing so leaves its base city defenceless. In this manuscript we start the study of a variant of Roman domination in graphs: the triple Roman domination. We consider that any city of the Roman Empire must be able to be defended by at least three legions. These legions should be either in the attacked city or in one of its neighbours. We determine various bounds on the triple Roman domination number for general graphs, and we give exact values for some graph families. Moreover, complexity results are also obtained.
Submitted 10 February, 2024;
originally announced February 2024.
-
Prospective Prediction of Body Mass Index Trajectories using Multi-task Gaussian Processes
Authors:
Arthur Leroy,
Varsha Gupta,
Mya Thway Tint,
Delicia Ooi Shu Qin,
Keith M. Godfrey,
Fabian Yap,
Leck Ngee,
Yung Seng Lee,
Johan G. Eriksson,
Navin Michael,
Mauricio A. Alvarez,
Dennis Wang
Abstract:
Clinicians often investigate the body mass index (BMI) trajectories of children to assess their growth with respect to their peers, as well as to anticipate future growth and disease risk. While retrospective modelling of BMI trajectories has been an active area of research, prospective prediction of continuous BMI trajectories from historical growth data has not been well investigated. Using weight and height measurements from birth to age 10 years from a longitudinal mother-offspring cohort, we leveraged a multi-task Gaussian processes model, called MagmaClust, to derive probabilistic predictions for BMI trajectories over various forecasting periods. Experiments were conducted to evaluate the accuracy, sensitivity to missing values, and number of clusters. The results were compared with cubic B-spline regression and a parametric Jenss-Bayley mixed effects model. A downstream tool computing individual overweight probabilities was also proposed and evaluated. In all experiments, MagmaClust outperformed conventional models in prediction accuracy while correctly calibrating uncertainty regardless of the missing data amount (up to 90\% missing) or the forecasting period (from 2 to 8 years in the future). Moreover, the overweight probabilities computed from MagmaClust's uncertainty quantification exhibited high specificity ($0.94$ to $0.96$) and accuracy ($0.86$ to $0.94$) in predicting the 10-year overweight status even from age 2 years. MagmaClust provides a probabilistic non-parametric framework to prospectively predict BMI trajectories, which is robust to missing values and outperforms conventional BMI trajectory modelling approaches. It also clusters individuals to identify typical BMI patterns (early peak, adiposity rebounds) during childhood. Overall, we demonstrated its potential to anticipate BMI evolution throughout childhood, allowing clinicians to implement prevention strategies.
Submitted 4 February, 2024;
originally announced February 2024.
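The abstract above mentions a downstream tool that turns probabilistic BMI predictions into individual overweight probabilities. One simple way to picture that step is sketched below. It assumes a single Gaussian predictive distribution at the target age and a user-supplied BMI cut-off, both of which are illustrative assumptions and may differ from what MagmaClust actually does.

```python
import math


def overweight_probability(pred_mean, pred_std, threshold):
    """P(BMI > threshold) under an assumed Gaussian prediction.

    pred_mean, pred_std: predictive mean and standard deviation of BMI.
    threshold: overweight cut-off in the same units (hypothetical input).
    """
    if pred_std <= 0:
        raise ValueError("pred_std must be positive")
    z = (threshold - pred_mean) / pred_std
    # Upper-tail probability of a standard normal, via erfc.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical numbers: predicted BMI 18.0 +/- 1.5 against a cut-off of 19.5.
print(round(overweight_probability(18.0, 1.5, 19.5), 3))
```

With these illustrative numbers the cut-off sits one standard deviation above the predicted mean, so the result is the upper tail beyond one sigma, roughly 0.159.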
-
Causal Perception
Authors:
Jose M. Alvarez,
Salvatore Ruggieri
Abstract:
Perception occurs when two individuals interpret the same information differently. Despite being a known phenomenon with implications for bias in decision-making, as individual experience determines interpretation, perception remains largely overlooked in machine learning (ML) research. Modern decision flows, whether partially or fully automated, involve human experts interacting with ML applications. How might we then, e.g., account for two experts who interpret a deferred instance or an explanation from an ML model differently? To account for perception, we first need to formulate it. In this work, we define perception under causal reasoning using structural causal models (SCM). Our framework formalizes individual experience as additional causal knowledge that comes with and is used by a human expert (read, decision maker). We present two kinds of causal perception, unfaithful and inconsistent, based on the SCM properties of faithfulness and consistency. Further, we motivate the importance of perception within fairness problems. We illustrate our framework through a series of decision flow examples involving ML applications and human experts.
Submitted 22 May, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
Physics-informed Meta-instrument for eXperiments (PiMiX) with applications to fusion energy
Authors:
Zhehui Wang,
Shanny Lin,
Miles Teng-Levy,
Pinghan Chu,
Bradley T. Wolfe,
Chun-Shang Wong,
Christopher S. Campbell,
Xin Yue,
Liyuan Zhang,
Derek Aberle,
Mariana Alvarado Alvarez,
David Broughton,
Ray T. Chen,
Baolian Cheng,
Feng Chu,
Eric R. Fossum,
Mark A. Foster,
Chengkun Huang,
Velat Kilic,
Karl Krushelnick,
Wenting Li,
Eric Loomis,
Thomas Schmidt Jr.,
Sky K. Sjue,
Chris Tomkins, et al. (2 additional authors not shown)
Abstract:
Data-driven methods (DDMs), such as deep neural networks, offer a generic approach to integrated data analysis (IDA), integrated diagnostic-to-control (IDC) workflows through data fusion (DF), which includes multi-instrument data fusion (MIDF), multi-experiment data fusion (MXDF), and simulation-experiment data fusion (SXDF). These features make DDMs attractive to nuclear fusion energy and power plant applications, leveraging accelerated workflows through machine learning and artificial intelligence. Here we describe Physics-informed Meta-instrument for eXperiments (PiMiX) that integrates X-ray (including high-energy photons such as $γ$-rays from nuclear fusion), neutron and others (such as proton radiography) measurements for nuclear fusion. PiMiX solves multi-domain high-dimensional optimization problems and integrates multi-modal measurements with multiphysics modeling through neural networks. Super-resolution for neutron detection and energy resolved X-ray detection have been demonstrated. Multi-modal measurements through MIDF can extract more information than individual or uni-modal measurements alone. Further optimization schemes through DF are possible towards empirical fusion scaling laws discovery and new fusion reactor designs.
Submitted 16 January, 2024;
originally announced January 2024.
-
EMG subspace alignment and visualization for cross-subject hand gesture classification
Authors:
Martin Colot,
Cédric Simar,
Mathieu Petieau,
Ana Maria Cebolla Alvarez,
Guy Cheron,
Gianluca Bontempi
Abstract:
Electromyogram (EMG)-based hand gesture recognition systems are a promising technology for human/machine interfaces. However, one of their main limitations is the long calibration time that is typically required to handle new users. The paper discusses and analyses the challenge of cross-subject generalization using an original dataset containing the EMG signals of 14 human subjects during hand gestures. The experimental results show that, though an accurate generalization based on pooling multiple subjects is hardly achievable, it is possible to improve the cross-subject estimation by identifying a robust low-dimensional subspace for multiple subjects and aligning it to a target subject. A visualization of the subspace enables us to provide insights for the improvement of cross-subject generalization with EMG signals.
Submitted 18 December, 2023;
originally announced January 2024.
-
Fully Attentional Networks with Self-emerging Token Labeling
Authors:
Bingyin Zhao,
Zhiding Yu,
Shiyi Lan,
Yutao Cheng,
Anima Anandkumar,
Yingjie Lao,
Jose M. Alvarez
Abstract:
Recent studies indicate that Vision Transformers (ViTs) are robust against out-of-distribution scenarios. In particular, the Fully Attentional Network (FAN), a family of ViT backbones, has achieved state-of-the-art robustness. In this paper, we revisit the FAN models and improve their pre-training with a self-emerging token labeling (STL) framework. Our method uses a two-stage training framework. Specifically, we first train a FAN token labeler (FAN-TL) to generate semantically meaningful patch token labels, followed by a FAN student model training stage that uses both the token labels and the original class label. With the proposed STL framework, our best model based on FAN-L-Hybrid (77.3M parameters) achieves 84.8% Top-1 accuracy and 42.1% mCE on ImageNet-1K and ImageNet-C, and sets a new state-of-the-art for ImageNet-A (46.1%) and ImageNet-R (56.6%) without using extra data, outperforming the original FAN counterpart by significant margins. The proposed framework also demonstrates significantly enhanced performance on downstream tasks such as semantic segmentation, with up to 1.7% improvement in robustness over the counterpart model. Code is available at https://github.com/NVlabs/STL.
Submitted 8 January, 2024;
originally announced January 2024.
-
Longitudinal prediction of DNA methylation to forecast epigenetic outcomes
Authors:
Arthur Leroy,
Ai Ling Teh,
Frank Dondelinger,
Mauricio A. Alvarez,
Dennis Wang
Abstract:
Interrogating the evolution of biological changes at early stages of life requires longitudinal profiling of molecules, such as DNA methylation, which can be challenging with children. We introduce a probabilistic and longitudinal machine learning framework based on multi-mean Gaussian processes (GPs), accounting for individual and gene correlations across time. This method provides future predictions of DNA methylation status at different individual ages while accounting for uncertainty. Our model is trained on a birth cohort of children with methylation profiled at ages 0-4, and we demonstrated that the status of methylation sites for each child can be accurately predicted at ages 5-7. We show that methylation profiles predicted by multi-mean GPs can be used to estimate other phenotypes, such as epigenetic age, and enable comparison to other health measures of interest. This approach encourages epigenetic studies to move towards longitudinal design for investigating epigenetic changes during development, ageing and disease progression.
Submitted 19 December, 2023;
originally announced December 2023.
-
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
Authors:
Zhiqi Li,
Zhiding Yu,
Shiyi Lan,
Jiahan Li,
Jan Kautz,
Tong Lu,
Jose M. Alvarez
Abstract:
End-to-end autonomous driving recently emerged as a promising research direction to target autonomy from a full-stack perspective. Along this line, many of the latest works follow an open-loop evaluation setting on nuScenes to study the planning behavior. In this paper, we delve deeper into the problem by conducting thorough analyses and demystifying more devils in the details. We initially observed that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models incorporating ego status, such as the ego vehicle's velocity. These models tend to rely predominantly on the ego vehicle's status for future path planning. Beyond the limitations of the dataset, we also note that current metrics do not comprehensively assess the planning quality, leading to potentially biased conclusions drawn from existing benchmarks. To address this issue, we introduce a new metric to evaluate whether the predicted trajectories adhere to the road. We further propose a simple baseline able to achieve competitive results without relying on perception annotations. Given the current limitations on the benchmark and metrics, we suggest the community reassess relevant prevailing research and be cautious about whether the continued pursuit of state-of-the-art would yield convincing and universal conclusions. Code and models are available at https://github.com/NVlabs/BEV-Planner.
Submitted 2 June, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection
Authors:
Zhenxin Li,
Shiyi Lan,
Jose M. Alvarez,
Zuxuan Wu
Abstract:
Recently, the rise of query-based Transformer decoders is reshaping camera-based 3D object detection. These query-based decoders are surpassing the traditional dense BEV (Bird's Eye View)-based methods. However, we argue that dense BEV frameworks remain important due to their outstanding abilities in depth estimation and object localization, depicting 3D scenes accurately and comprehensively. This paper aims to address the drawbacks of the existing dense BEV-based 3D object detectors by introducing our proposed enhanced components, including a CRF-modulated depth estimation module enforcing object-level consistencies, a long-term temporal aggregation module with extended receptive fields, and a two-stage object decoder combining perspective techniques with CRF-modulated depth embedding. These enhancements lead to a "modernized" dense BEV framework dubbed BEVNeXt. On the nuScenes benchmark, BEVNeXt outperforms both BEV-based and query-based frameworks under various settings, achieving a state-of-the-art result of 64.2 NDS on the nuScenes test set. Code will be available at https://github.com/woxihuanjiangguo/BEVNeXt.
Submitted 24 March, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning
Authors:
Thomas Baldwin-McDonald,
Mauricio A. Álvarez
Abstract:
Modelling the behaviour of highly nonlinear dynamical systems with robust uncertainty quantification is a challenging task which typically requires approaches specifically designed to address the problem at hand. We introduce a domain-agnostic model to address this issue termed the deep latent force model (DLFM), a deep Gaussian process with physics-informed kernels at each layer, derived from ordinary differential equations using the framework of process convolutions. Two distinct formulations of the DLFM are presented which utilise weight-space and variational inducing points-based Gaussian process approximations, both of which are amenable to doubly stochastic variational inference. We present empirical evidence of the capability of the DLFM to capture the dynamics present in highly nonlinear real-world multi-output time series data. Additionally, we find that the DLFM is capable of achieving comparable performance to a range of non-physics-informed probabilistic models on benchmark univariate regression tasks. We also empirically assess the negative impact of the inducing points framework on the extrapolation capabilities of LFM-based models.
Submitted 24 January, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation
Authors:
Lingchen Meng,
Shiyi Lan,
Hengduo Li,
Jose M. Alvarez,
Zuxuan Wu,
Yu-Gang Jiang
Abstract:
In-context segmentation aims at segmenting novel images using a few labeled example images, termed "in-context examples", exploring content similarities between examples and the target. The resulting models can be generalized seamlessly to novel segmentation tasks, significantly reducing the labeling and training costs compared with conventional pipelines. However, in-context segmentation is more challenging than classic segmentation, as it requires the model to learn segmentation rules conditioned on a few samples. Unlike previous work with ad-hoc or non-end-to-end designs, we propose SEGIC, an end-to-end segment-in-context framework built upon a single vision foundation model (VFM). In particular, SEGIC leverages the emergent correspondence within VFM to capture dense relationships between target images and in-context samples. As such, information from in-context samples is then extracted into three types of instructions, i.e. geometric, visual, and meta instructions, serving as explicit conditions for the final mask prediction. SEGIC is a straightforward yet effective approach that yields state-of-the-art performance on one-shot segmentation benchmarks. Notably, SEGIC can be easily generalized to diverse tasks, including video object segmentation and open-vocabulary segmentation. Code will be available at https://github.com/MengLcool/SEGIC.
Submitted 29 March, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Unbiased and Multilevel Methods for a Class of Diffusions Partially Observed via Marked Point Processes
Authors:
Miguel Alvarez,
Ajay Jasra,
Hamza Ruzayqat
Abstract:
In this article we consider the filtering problem associated with partially observed diffusions, with observations following a marked point process. In the model, the data form a point process with observation times whose intensity is driven by a diffusion, with the associated marks also depending upon the diffusion process. We assume that one must resort to time-discretizing the diffusion process and develop particle and multilevel particle filters to recursively approximate the filter. In particular, we prove that our multilevel particle filter can achieve a mean square error (MSE) of $\mathcal{O}(ε^2)$ ($ε>0$ and arbitrary) with a cost of $\mathcal{O}(ε^{-2.5})$ versus using a particle filter which has a cost of $\mathcal{O}(ε^{-3})$ to achieve the same MSE. We then show how this methodology can be extended to give unbiased (that is, with no time-discretization error) estimators of the filter, which are proved to have finite variance and, with high probability, finite cost. Finally, we extend our methodology to the problem of online static-parameter estimation.
Submitted 16 November, 2023;
originally announced November 2023.
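To make the cost rates quoted in the abstract above concrete, here is a minimal sketch that compares the two orders of growth. It assumes all constants hidden in the $\mathcal{O}(\cdot)$ notation equal one, which the abstract does not state; the function names are hypothetical and chosen only for illustration.

```python
import math


def particle_filter_cost(eps: float) -> float:
    """Cost rate eps^-3 for a target MSE of eps^2 (constants assumed to be 1)."""
    return eps ** -3.0


def multilevel_cost(eps: float) -> float:
    """Cost rate eps^-2.5 for the same target MSE (constants assumed to be 1)."""
    return eps ** -2.5


def cost_ratio(eps: float) -> float:
    """Ratio of the two rates; algebraically this is eps^-0.5."""
    return particle_filter_cost(eps) / multilevel_cost(eps)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(f"eps={eps:g}  ratio={cost_ratio(eps):.2f}")
```

Under this assumption the ratio grows like $ε^{-1/2}$, so the advantage of the multilevel rate widens as the target error shrinks; with real constants the crossover point may differ.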
-
ViR: Towards Efficient Vision Retention Backbones
Authors:
Ali Hatamizadeh,
Michael Ranzinger,
Shiyi Lan,
Jose M. Alvarez,
Sanja Fidler,
Jan Kautz
Abstract:
Vision Transformers (ViTs) have gained considerable popularity in recent years, due to their exceptional capabilities in modeling long-range spatial dependencies and scalability for large-scale training. Although the training parallelism of the self-attention mechanism plays an important role in retaining great performance, its quadratic complexity hinders the application of ViTs in many scenarios that demand fast inference. This effect is even more pronounced in applications in which autoregressive modeling of input features is required. In Natural Language Processing (NLP), a new stream of efforts has proposed parallelizable models with a recurrent formulation that allows for efficient inference in generative applications. Inspired by this trend, we propose a new class of computer vision models, dubbed Vision Retention Networks (ViR), with dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance. In particular, ViR scales favorably for image throughput and memory consumption in tasks that require higher-resolution images due to its flexible formulation in processing large sequence lengths. The ViR is the first attempt to realize dual parallel and recurrent equivalency in a general vision backbone for recognition tasks. We have validated the effectiveness of ViR through extensive experiments with different dataset sizes and various image resolutions and achieved competitive performance. Code: https://github.com/NVlabs/ViR
Submitted 26 January, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Thin and Deep Gaussian Processes
Authors:
Daniel Augusto de Souza,
Alexander Nikitin,
ST John,
Magnus Ross,
Mauricio A. Álvarez,
Marc Peter Deisenroth,
João P. P. Gomes,
Diego Mesquita,
César Lincoln C. Mattos
Abstract:
Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer-by-layer but lose all the interpretability of shallow GPs. An alternative construction is to successively parameterize the lengthscale of a kernel, improving the interpretability but ultimately giving away the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies which may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer defines locally linear transformations of the original input data maintaining the concept of latent embeddings while also retaining the interpretation of lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well in standard benchmark datasets.
Submitted 17 October, 2023;
originally announced October 2023.
-
Gaia Focused Product Release: Sources from Service Interface Function image analysis -- Half a million new sources in omega Centauri
Authors:
Gaia Collaboration,
K. Weingrill,
A. Mints,
J. Castañeda,
Z. Kostrzewa-Rutkowska,
M. Davidson,
F. De Angeli,
J. Hernández,
F. Torra,
M. Ramos-Lerate,
C. Babusiaux,
M. Biermann,
C. Crowley,
D. W. Evans,
L. Lindegren,
J. M. Martín-Fleitas,
L. Palaversa,
D. Ruz Mieres,
K. Tisanić,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne,
F. Arenou,
A. Barbier
, et al. (378 additional authors not shown)
Abstract:
Gaia's readout window strategy is challenged by very dense fields in the sky. Therefore, in addition to standard Gaia observations, full Sky Mapper (SM) images were recorded for nine selected regions in the sky. A new software pipeline exploits these Service Interface Function (SIF) images of crowded fields (CFs), making use of the availability of the full two-dimensional (2D) information. This new pipeline produced half a million additional Gaia sources in the region of the omega Centauri ($ω$ Cen) cluster, which are published with this Focused Product Release. We discuss the dedicated SIF CF data reduction pipeline, validate its data products, and introduce their Gaia archive table. Our aim is to improve the completeness of the Gaia source inventory in a very dense region in the sky, $ω$ Cen. An adapted version of Gaia's Source Detection and Image Parameter Determination software located sources in the 2D SIF CF images. We validated the results by comparing them to the public Gaia DR3 catalogue and external Hubble Space Telescope data. With this Focused Product Release, 526 587 new sources have been added to the Gaia catalogue in $ω$ Cen. Apart from positions and brightnesses, the additional catalogue contains parallaxes and proper motions, but no meaningful colour information. While SIF CF source parameters generally have a lower precision than nominal Gaia sources, in the cluster centre they increase the depth of the combined catalogue by three magnitudes and improve the source density by a factor of ten. This first SIF CF data publication already adds great value to the Gaia catalogue. It demonstrates what to expect for the fourth Gaia catalogue, which will contain additional sources for all nine SIF CF regions.
Submitted 8 November, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Gaia Focused Product Release: A catalogue of sources around quasars to search for strongly lensed quasars
Authors:
Gaia Collaboration,
A. Krone-Martins,
C. Ducourant,
L. Galluccio,
L. Delchambre,
I. Oreshina-Slezak,
R. Teixeira,
J. Braine,
J. -F. Le Campion,
F. Mignard,
W. Roux,
A. Blazere,
L. Pegoraro,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne,
F. Arenou,
C. Babusiaux,
A. Barbier,
M. Biermann,
O. L. Creevey,
D. W. Evans,
L. Eyer,
R. Guerra
, et al. (376 additional authors not shown)
Abstract:
Context. Strongly lensed quasars are fundamental sources for cosmology. The Gaia space mission covers the entire sky with the unprecedented resolution of $0.18$" in the optical, making it an ideal instrument to search for gravitational lenses down to the limiting magnitude of 21. Nevertheless, the previous Gaia Data Releases are known to be incomplete for small angular separations such as those expected for most lenses. Aims. We present the Data Processing and Analysis Consortium GravLens pipeline, which was built to analyse all Gaia detections around quasars and to cluster them into sources, thus producing a catalogue of secondary sources around each quasar. We analysed the resulting catalogue to produce scores that indicate source configurations that are compatible with strongly lensed quasars. Methods. GravLens uses the DBSCAN unsupervised clustering algorithm to detect sources around quasars. The resulting catalogue of multiplets is then analysed with several methods to identify potential gravitational lenses. We developed and applied an outlier scoring method, a comparison between the average BP and RP spectra of the components, and we also used an extremely randomised tree algorithm. These methods produce scores to identify the most probable configurations and to establish a list of lens candidates. Results. We analysed the environment of 3 760 032 quasars. A total of 4 760 920 sources, including the quasars, were found within 6" of the quasar positions. This list is given in the Gaia archive. In 87% of cases, the quasar remains a single source, and in 501 385 cases neighbouring sources were detected. We propose a list of 381 lensed candidates, of which we identified 49 as the most promising. Beyond these candidates, the associated tables in this Focused Product Release allow the entire community to explore the unique Gaia data for strong lensing studies further.
Submitted 10 October, 2023;
originally announced October 2023.
-
Gaia Focused Product Release: Radial velocity time series of long-period variables
Authors:
Gaia Collaboration,
M. Trabucchi,
N. Mowlavi,
T. Lebzelter,
I. Lecoeur-Taibi,
M. Audard,
L. Eyer,
P. García-Lario,
P. Gavras,
B. Holl,
G. Jevardat de Fombelle,
K. Nienartowicz,
L. Rimoldini,
P. Sartoretti,
R. Blomme,
Y. Frémat,
O. Marchal,
Y. Damerdji,
A. G. A. Brown,
A. Guerrier,
P. Panuzzo,
D. Katz,
G. M. Seabroke,
K. Benson
, et al. (382 additional authors not shown)
Abstract:
The third Gaia Data Release (DR3) provided photometric time series of more than 2 million long-period variable (LPV) candidates. Anticipating the publication of full radial-velocity (RV) in DR4, this Focused Product Release (FPR) provides RV time series for a selection of LPVs with high-quality observations. We describe the production and content of the Gaia catalog of LPV RV time series, and the methods used to compute variability parameters published in the Gaia FPR. Starting from the DR3 LPVs catalog, we applied filters to construct a sample of sources with high-quality RV measurements. We modeled their RV and photometric time series to derive their periods and amplitudes, and further refined the sample by requiring compatibility between the RV period and at least one of the $G$, $G_{\rm BP}$, or $G_{\rm RP}$ photometric periods. The catalog includes RV time series and variability parameters for 9 614 sources in the magnitude range $6\lesssim G/{\rm mag}\lesssim 14$, including a flagged top-quality subsample of 6 093 stars whose RV periods are fully compatible with the values derived from the $G$, $G_{\rm BP}$, and $G_{\rm RP}$ photometric time series. The RV time series contain a mean of 24 measurements per source taken unevenly over a duration of about three years. We identify the great majority of sources (88%) as genuine LPVs, with about half of them showing a pulsation period and the other half displaying a long secondary period. The remaining 12% consists of candidate ellipsoidal binaries. Quality checks against RVs available in the literature show excellent agreement. We provide illustrative examples and cautionary remarks. The publication of RV time series for almost 10 000 LPVs constitutes, by far, the largest such database available to date in the literature. The availability of simultaneous photometric measurements gives a unique added value to the Gaia catalog (abridged)
Submitted 9 October, 2023;
originally announced October 2023.
-
Magnetocaloric effect of nanostructured La0.6Sr0.4CoO3
Authors:
Fabiana Morales Alvarez,
María Belén Vigna,
Mariano Quintero,
Diego G. Lamas,
Joaquín Sacanell
Abstract:
In this study, we investigate the magnetic and magnetocaloric properties of nanostructured La0.6Sr0.4CoO3 (LSC) samples synthesized under confinement conditions within porous templates. Using this method, we obtained de-agglomerated nanoparticles, which makes it feasible to apply them in nanoparticle films that can be tailored to intricate geometries. We specifically explored the impact of the template pore size on key parameters including saturation magnetization (MS), Curie temperature (TC), maximum entropy change (ΔS), and relative cooling power (RCP). Our findings reveal enhancements in those quantities that are likely related to the nanostructure of the samples, indicating the potential of nanostructured LSC as an active material for magnetic refrigeration devices. Our alternative approach of synthesizing magnetocaloric materials under confinement conditions presents an exciting prospect for future research and development in the field.
Submitted 9 October, 2023;
originally announced October 2023.
-
Quartic rigid systems in the plane and in the Poincaré sphere
Authors:
M. J. Álvarez,
J. L. Bravo,
L. A. Calderón
Abstract:
We consider the planar family of rigid systems of the form $x'=-y+xP(x,y), y'=x+yP(x,y)$, where $P$ is any polynomial with monomials of degree one and three. This is the simplest non-trivial family of rigid systems with no rotatory parameters.
The family can be compactified to the Poincaré sphere such that the vector field along the equator is not identically null. We study the centers, singular points and limit cycles of that family on the plane and on the sphere.
Submitted 9 October, 2023;
originally announced October 2023.
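As a short worked step (added here as a sketch, not taken from the paper), the system $x'=-y+xP(x,y)$, $y'=x+yP(x,y)$ in the abstract above can be rewritten in polar coordinates $x=r\cos θ$, $y=r\sin θ$, which shows why the angular speed is constant:

```latex
\begin{align*}
r\,r' &= x\,x' + y\,y' = x(-y + xP) + y(x + yP) = (x^2 + y^2)\,P = r^2 P
  &&\Longrightarrow\quad r' = r\,P,\\
r^2\,θ' &= x\,y' - y\,x' = x(x + yP) - y(-y + xP) = x^2 + y^2 = r^2
  &&\Longrightarrow\quad θ' = 1 .
\end{align*}
```

Every orbit away from the origin therefore rotates with the same angular velocity, and all radial dynamics are carried by $r' = rP$.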
-
Single-Element Dual-Interferometer for Precision Inertial Sensing: Sub-picometer Structural Stability and Performance as a Reference for Laser Frequency Stabilization
Authors:
Victor Huarcaya,
Miguel Dovale Álvarez,
Kohei Yamamoto,
Yichao Yang,
Stefano Gozzo,
Pablo Martínez Cano,
Moritz Mehmet,
Juan José Esteban Delgado,
Jianjun Jia,
Gerhard Heinzel
Abstract:
To reach sub-picometer sensitivity in the millihertz range, displacement sensors based on laser interferometry require suppression of laser-frequency noise by several orders of magnitude. Many optical frequency stabilization methods exist with varying levels of complexity, size, and performance. In this paper, we describe the performance of a compact Mach-Zehnder interferometer based on a monolithic optic. The setup consists of a commercial fiber injector, a custom-designed pentaprism used to split and recombine the laser beam, and two photoreceivers placed at the complementary output ports of the interferometer. The structural stability of the prism is transferred to the laser frequency via amplification, integration, and feedback of the balanced-detection signal, achieving a fractional frequency instability better than 6 parts in $10^{13}$, corresponding to an interferometer pathlength stability better than $10^{-12}$ m$/\sqrt{\mathrm{Hz}}$.
△ Less
Submitted 1 December, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Towards Viewpoint Robustness in Bird's Eye View Segmentation
Authors:
Tzofi Klinghoffer,
Jonah Philion,
Wenzheng Chen,
Or Litany,
Zan Gojcic,
Jungseock Joo,
Ramesh Raskar,
Sanja Fidler,
Jose M. Alvarez
Abstract:
Autonomous vehicles (AV) require that neural networks used for perception be robust to different viewpoints if they are to be deployed across many types of vehicles without the repeated cost of data collection and labeling for each. AV companies typically focus on collecting data from diverse scenarios and locations, but not camera rig configurations, due to cost. As a result, only a small number of rig variations exist across most fleets. In this paper, we study how AV perception models are affected by changes in camera viewpoint and propose a way to scale them across vehicle types without repeated data collection and labeling. Using bird's eye view (BEV) segmentation as a motivating task, we find through extensive experiments that existing perception models are surprisingly sensitive to changes in camera viewpoint. When trained with data from one camera rig, small changes to pitch, yaw, depth, or height of the camera at inference time lead to large drops in performance. We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs, allowing us to train BEV segmentation models for diverse target rigs without any additional data collection or labeling cost. To analyze the impact of viewpoint changes, we leverage synthetic data to mitigate other gaps (content, ISP, etc). Our approach is then trained on real data and evaluated on synthetic data, enabling evaluation on diverse target rigs. We release all data for use in future work. Our method is able to recover an average of 14.7% of the IoU that is otherwise lost when deploying to new rigs.
Submitted 10 September, 2023;
originally announced September 2023.
-
Impact of Self-shielding Minihalos on the Ly$α$ Forest at High Redshift
Authors:
Hyunbae Park,
Zarija Lukić,
Jean Sexton,
Marcelo Alvarez,
Paul R. Shapiro
Abstract:
Dense gas in minihalos with masses of $10^6-10^8~M_\odot$ can shield itself from reionization for $\sim100$ Myr after being exposed to the UV background. These self-shielded systems, often unresolved in cosmological simulations, can introduce strong absorption in quasar spectra. This paper is the first systematic study of the impact of these systems on the Ly$α$ forest. We first derive the HI column density profile of photoevaporating minihalos by conducting 1D radiation-hydrodynamics simulations. We utilize these results to estimate the Ly$α$ opacity from minihalos in a large-scale simulation that cannot resolve self-shielding. When the ionization rate of the background radiation is $0.03\times10^{-12}~{\rm s}^{-1}$, as expected near the end of reionization at $z\sim5.5$, we find that the incidence rate of damped Ly$α$ absorbers increases by a factor of $\sim2-4$ compared with its value at $z=4.5$. The Ly$α$ flux is, on average, suppressed by $\sim 3\%$ of its mean due to minihalos. The absorption features enhance the 1D power spectrum by up to $\sim5\%$ at $k\sim0.1~h~{\rm Mpc}^{-1}~({\rm or}~10^{-3}~{\rm km}^{-1}~{\rm s})$, which is comparable to the enhancement caused by inhomogeneous reionization. The flux is particularly suppressed in the vicinity of large halos along the line-of-sight direction at separations of up to $10~h^{-1}~{\rm Mpc}$ at $r_\perp\lesssim2~h^{-1}~{\rm Mpc}$. However, these effects become much smaller for higher ionizing rates ($\gtrsim0.3\times10^{-12}~{\rm s}^{-1}$) expected in the post-reionization Universe. Our findings highlight the need to consider minihalo absorption when interpreting the Ly$α$ forest at $z\gtrsim5.5$. Moreover, the sensitivity of these quantities to the ionizing background intensity can be exploited to constrain the intensity itself.
Submitted 15 June, 2024; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Latent Variable Multi-output Gaussian Processes for Hierarchical Datasets
Authors:
Chunchao Ma,
Arthur Leroy,
Mauricio Alvarez
Abstract:
Multi-output Gaussian processes (MOGPs) have been introduced to deal with multiple tasks by exploiting the correlations between different outputs. Generally, MOGP models assume a flat correlation structure between the outputs. However, such a formulation does not account for more elaborate relationships, for instance, if several replicates were observed for each output (which is a typical setting in biological experiments). This paper proposes an extension of MOGPs for hierarchical datasets (i.e. datasets for which the relationships between observations can be represented within a tree structure). Our model defines a tailored kernel function accounting for hierarchical structures in the data to capture different levels of correlations while leveraging the introduction of latent variables to express the underlying dependencies between outputs through a dedicated kernel. This latter feature is expected to significantly improve scalability as the number of tasks increases. An extensive experimental study involving both synthetic and real-world data from genomics and motion capture is presented to support our claims.
Submitted 31 August, 2023;
originally announced August 2023.
-
$2\cdot 10^{-13}$ fractional laser frequency stability with a 7-cm unequal-arm Mach-Zehnder interferometer
Authors:
Victor Huarcaya,
Miguel Dovale Álvarez,
Daniel Penkert,
Stefano Gozzo,
Pablo Martínez Cano,
Kohei Yamamoto,
Juan José Esteban Delgado,
Moritz Mehmet,
Karsten Danzmann,
Gerhard Heinzel
Abstract:
To achieve sub-picometer sensitivities in the millihertz band, laser interferometric inertial sensors rely on some form of reduction of the laser frequency noise, typically by locking the laser to a stable frequency reference, such as the narrow-linewidth resonance of an ultra-stable optical cavity or an atomic or molecular transition. In this paper we report on a compact laser frequency stabilization technique based on an unequal-arm Mach-Zehnder interferometer that is sub-nanometer stable at $10\,μ$Hz, sub-picometer at $0.5\,$mHz, and reaches a noise floor of $7\,\mathrm{fm}/\!\sqrt{\mathrm{Hz}}$ at 1 Hz. The interferometer is used in conjunction with a DC servo to stabilize the frequency of a laser down to a fractional instability below $4 \times 10^{-13}$ at averaging times from 0.1 to 100 seconds. The technique offers a wide operating range, does not rely on complex lock acquisition procedures, and can be readily integrated as part of the optical bench in future gravity missions.
Submitted 29 August, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
FB-BEV: BEV Representation from Forward-Backward View Transformations
Authors:
Zhiqi Li,
Zhiding Yu,
Wenhai Wang,
Anima Anandkumar,
Tong Lu,
Jose M. Alvarez
Abstract:
The View Transformation Module (VTM), where transformations happen between multi-view image features and Bird-Eye-View (BEV) representation, is a crucial step in camera-based BEV perception systems. Currently, the two most prominent VTM paradigms are forward projection and backward projection. Forward projection, represented by Lift-Splat-Shoot, leads to sparsely projected BEV features without post-processing. Backward projection, with BEVFormer being an example, tends to generate false-positive BEV features from incorrect projections because it does not utilize depth. To address the above limitations, we propose a novel forward-backward view transformation module. Our approach compensates for the deficiencies in both existing methods, allowing them to enhance each other and obtain higher-quality BEV representations. We instantiate the proposed module with FB-BEV, which achieves a new state-of-the-art result of 62.4% NDS on the nuScenes test set. Code and models are available at https://github.com/NVlabs/FB-BEV.
Submitted 17 August, 2023; v1 submitted 4 August, 2023;
originally announced August 2023.
-
The Initial Screening Order Problem
Authors:
Jose M. Alvarez,
Antonio Mastropietro,
Salvatore Ruggieri
Abstract:
We investigate the role of the initial screening order (ISO) in candidate screening processes, such as employee hiring and academic admissions. The ISO refers to the order in which the screener evaluates the candidate pool. It has been largely overlooked in the literature, despite its potential impact on the optimality and fairness of the chosen set, especially under a human screener. We define two problem formulations: the best-$k$, where the screener selects the $k$ best candidates, and the good-$k$, where the screener selects the $k$ first good-enough candidates. To study the impact of the ISO, we introduce a human-like screener and compare it to its algorithmic counterpart. The human-like screener is conceived to be inconsistent over time due to fatigue. Our analysis shows that the ISO, particularly under a human-like screener, hinders individual fairness despite meeting group-level fairness. This is due to position bias, where a candidate's evaluation is affected by its position within the ISO. We report extensive simulated experiments exploring the parameters of the best-$k$ and good-$k$ problem formulations for both the algorithmic and human-like screeners. This work is motivated by a real-world candidate screening problem studied in collaboration with a large European company.
Submitted 24 April, 2024; v1 submitted 28 July, 2023;
originally announced July 2023.
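The two problem formulations named in the abstract above, best-$k$ and good-$k$, can be illustrated with a minimal sketch. It models only a consistent (algorithmic) screener; the fatigue-driven human-like screener is not modeled. The candidate scores, the `threshold` parameter defining "good enough", and the function names are all hypothetical choices made for this illustration, not the paper's definitions.

```python
from typing import List, Sequence


def best_k(scores: Sequence[float], order: Sequence[int], k: int) -> List[int]:
    """Screen every candidate in `order`, then keep the k highest scores.

    Python's sort is stable, so ties are broken by earlier position
    in the screening order.
    """
    ranked = sorted(order, key=lambda c: -scores[c])
    return ranked[:k]


def good_k(scores: Sequence[float], order: Sequence[int], k: int,
           threshold: float) -> List[int]:
    """Walk through `order` and stop once k candidates meet the threshold."""
    selected: List[int] = []
    for c in order:
        if scores[c] >= threshold:
            selected.append(c)
            if len(selected) == k:
                break
    return selected


if __name__ == "__main__":
    scores = [0.9, 0.6, 0.8, 0.7]   # hypothetical candidate scores
    iso_a = [3, 1, 0, 2]            # one initial screening order
    iso_b = [2, 0, 1, 3]            # a different order over the same pool
    print(best_k(scores, iso_a, 2), best_k(scores, iso_b, 2))
    print(good_k(scores, iso_a, 2, 0.65), good_k(scores, iso_b, 2, 0.65))
```

In this toy pool, best-$k$ returns the same set for both orders, while good-$k$ returns different sets, so under good-$k$ the order alone can change who is selected even when the screener is perfectly consistent.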
-
Parametric Depth Based Feature Representation Learning for Object Detection and Segmentation in Bird's Eye View
Authors:
Jiayu Yang,
Enze Xie,
Miaomiao Liu,
Jose M. Alvarez
Abstract:
Recent vision-only perception models for autonomous driving achieved promising results by encoding multi-view image features into Bird's-Eye-View (BEV) space. A critical step and the main bottleneck of these methods is transforming image features into the BEV coordinate frame. This paper focuses on leveraging geometry information, such as depth, to model such feature transformation. Existing works either rely on non-parametric depth distribution modeling, leading to significant memory consumption, or ignore the geometry information to address this problem. In contrast, we propose to use parametric depth distribution modeling for feature transformation. We first lift the 2D image features to the 3D space defined for the ego vehicle via a predicted parametric depth distribution for each pixel in each view. Then, we aggregate the 3D feature volume based on the 3D space occupancy derived from depth to the BEV frame. Finally, we use the transformed features for downstream tasks such as object detection and semantic segmentation. Existing semantic segmentation methods also suffer from a hallucination problem, as they do not take visibility information into account. This hallucination can be particularly problematic for subsequent modules such as control and planning. To mitigate the issue, our method provides depth uncertainty and reliable visibility-aware estimations. We further leverage our parametric depth modeling to present a novel visibility-aware evaluation metric that, when taken into account, can mitigate the hallucination problem. Extensive experiments on object detection and semantic segmentation on the nuScenes datasets demonstrate that our method outperforms existing methods on both tasks.
Submitted 11 July, 2023; v1 submitted 9 July, 2023;
originally announced July 2023.