-
Structure preserving schemes for a class of Wasserstein gradient flows
Authors:
Shiheng Zhang,
Jie Shen
Abstract:
We introduce in this paper two time discretization schemes tailored for a range of Wasserstein gradient flows. These schemes are designed to preserve mass and positivity and to be uniquely solvable. In addition, they ensure energy dissipation in many typical scenarios. Through extensive numerical experiments, we demonstrate the schemes' robustness, accuracy and efficiency.
Submitted 12 July, 2024;
originally announced July 2024.
-
CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias
Authors:
Jiacheng Shen,
Lihan Feng
Abstract:
In human decision-making tasks, individuals learn through trials and prediction errors. When individuals learn the task, some are more influenced by good outcomes, while others weigh bad outcomes more heavily. Such confirmation bias can lead to different learning effects. In this study, we propose a new algorithm in Deep Reinforcement Learning, CM-DQN, which applies the idea of different update strategies for positive or negative prediction errors, to simulate the human decision-making process when the task's states are continuous while the actions are discrete. We test in the Lunar Lander environment with confirmatory bias, disconfirmatory bias, and no bias to observe the learning effects. Moreover, we apply the confirmation model in a multi-armed bandit problem (an environment with discrete states and discrete actions), which utilizes the same idea as our proposed algorithm, as a contrast experiment to algorithmically simulate the impact of different confirmation biases in the decision-making process. In both experiments, confirmatory bias indicates a better learning effect. Our code can be found at https://github.com/Patrickhshs/CM-DQN.
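The idea of using different update strategies for positive and negative prediction errors can be illustrated on the multi-armed bandit case. The sketch below is only an illustration under stated assumptions, not the authors' CM-DQN code: the function names, the epsilon-greedy policy, the initial value of 0.5, and the two learning rates `lr_pos` and `lr_neg` are all hypothetical choices.

```python
import random


def biased_q_update(q, reward, lr_pos, lr_neg):
    """Return the updated value estimate for one arm.

    The learning rate depends on the sign of the prediction error:
    lr_pos for a positive error, lr_neg otherwise (assumed rule).
    """
    delta = reward - q  # prediction error
    lr = lr_pos if delta > 0 else lr_neg
    return q + lr * delta


def run_bandit(probs, lr_pos, lr_neg, steps=1000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy action selection.

    probs holds the (hypothetical) reward probability of each arm.
    Returns the final value estimates and the average reward.
    """
    rng = random.Random(seed)
    q = [0.5] * len(probs)  # neutral initial estimates (assumption)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(probs))  # explore
        else:
            arm = max(range(len(probs)), key=lambda a: q[a])  # exploit
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        q[arm] = biased_q_update(q[arm], reward, lr_pos, lr_neg)
        total += reward
    return q, total / steps
```

With `lr_pos > lr_neg` the learner weighs good outcomes more heavily, `lr_pos < lr_neg` does the opposite, and `lr_pos == lr_neg` gives the unbiased baseline, so the three conditions can be compared by calling `run_bandit` with different rate pairs.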
Submitted 10 July, 2024;
originally announced July 2024.
-
Approximating and preconditioning the stiffness matrix in the GoFD approximation of the fractional Laplacian
Authors:
Weizhang Huang,
Jinye Shen
Abstract:
In the finite difference approximation of the fractional Laplacian the stiffness matrix is typically dense and needs to be approximated numerically. The effect of the accuracy in approximating the stiffness matrix on the accuracy in the whole computation is analyzed and shown to be significant. Four such approximations are discussed. While they are shown to work well with the recently developed grid-over finite difference method (GoFD) for the numerical solution of boundary value problems of the fractional Laplacian, they differ in accuracy, economics to compute, performance of preconditioning, and asymptotic decay away from the diagonal line. In addition, two preconditioners based on sparse and circulant matrices are discussed for the iterative solution of linear systems associated with the stiffness matrix. Numerical results in two and three dimensions are presented.
Submitted 9 July, 2024;
originally announced July 2024.
-
Iron Snails: non-equilibrium dynamics and spiral abundance patterns
Authors:
Neige Frankel,
David W. Hogg,
Scott Tremaine,
Adrian Price-Whelan,
Jeff Shen
Abstract:
Galaxies are not in a dynamical steady state. They continually undergo perturbations, e.g., from infalling dwarf galaxies and dark-matter substructure. After a dynamical perturbation, stars phase mix towards a new steady state; in so doing they generally form spiral structures, such as spiral density waves in galaxy disks and the Gaia Snail observed in the vertical phase-space density in the solar neighborhood. Structures in phase-space density can be hard to measure accurately, because spatially varying selection effects imprint their own patterns on the density. However, stellar labels such as metallicity, or other element abundances, or stellar masses and ages, can be measured even in the face of complex or unknown spatial selection functions. We show that if the equilibrium galaxy has phase-space gradients in these labels, any perturbation that could raise a spiral wave in the phase-space density will raise a spiral wave in the distribution of labels as well. We work out the relationship between the spiral patterns in the density and in the labels. As an example, we analyze the Gaia Snail and show that its amplitude and dynamical age as derived from elemental abundances (mainly [Mg/Fe]) follow similar patterns to those derived from the phase-space density. Our best model dates the Snail's perturbation to about 400 Myr ago although we find significant variations with angular momentum in the best-fit age. Conceptually, the ideas presented here are related to Orbital Torus Imaging, chemical tagging, and other methods that use stellar labels to trace dynamics.
Submitted 9 July, 2024;
originally announced July 2024.
-
Algebraic cycles and Hitchin systems
Authors:
Davesh Maulik,
Junliang Shen,
Qizheng Yin
Abstract:
The purpose of this paper is to study motivic aspects of the Hitchin system for $\mathrm{GL}_n$. Our results include the following. (a) We prove the motivic decomposition conjecture of Corti-Hanamura for the Hitchin system; in particular, the decomposition theorem associated with the Hitchin system is induced by algebraic cycles. This yields an unconditional construction of the motivic perverse filtration for the Hitchin system, which lifts the cohomological/sheaf-theoretic perverse filtration. (b) We prove that the inverse of the relative Hard Lefschetz symmetry is induced by a relative algebraic correspondence, confirming the relative Lefschetz standard conjecture for the Hitchin system. (c) We show a strong perversity bound for the normalized Chern classes of a universal bundle with respect to the motivic perverse filtration; this specializes to the sheaf-theoretic result obtained earlier by Maulik-Shen. (d) We prove a $χ$-independence result for the relative Chow motives associated with Hitchin systems.
Our methods combine Fourier transforms for compactified Jacobian fibrations associated with integral locally planar curves, nearby and vanishing cycle techniques, and a Springer-theoretic interpretation of parabolic Hitchin moduli spaces.
Submitted 6 July, 2024;
originally announced July 2024.
-
Vortex under Ripplet: An Empirical Study of RAG-enabled Applications
Authors:
Yuchen Shao,
Yuheng Huang,
Jiawei Shen,
Lei Ma,
Ting Su,
Chengcheng Wan
Abstract:
Large language models (LLMs) enhanced by retrieval-augmented generation (RAG) provide effective solutions in various application scenarios. However, developers face challenges in integrating RAG-enhanced LLMs into software systems, due to lack of interface specification, requirements from software context, and complicated system management. In this paper, we manually studied 100 open-source applications that incorporate RAG-enhanced LLMs, and their issue reports. We have found that more than 98% of applications contain multiple integration defects that harm software functionality, efficiency, and security. We have also generalized 19 defect patterns and proposed guidelines to tackle them. We hope this work could aid LLM-enabled software development and motivate future research.
Submitted 6 July, 2024;
originally announced July 2024.
-
Supernova Shocks Cannot Explain the Inflated State of Hypervelocity Runaways from White Dwarf Binaries
Authors:
Aakash Bhat,
Evan B. Bauer,
Rüdiger Pakmor,
Ken J. Shen,
Ilaria Caiazzo,
Abinaya Swaruba Rajamuthukumar,
Kareem El-Badry,
Wolfgang E. Kerzendorf
Abstract:
Recent observations have found a growing number of hypervelocity stars with speeds of $\approx 1500-2500\,$km\,s$^{-1}$ which could have only been produced through thermonuclear supernovae in white dwarf binaries. Most of the observed hypervelocity runaways in this class display a surprising inflated structure: their current radii are roughly an order of magnitude greater than they would have been as white dwarfs filling their Roche lobe. While many simulations exist studying the dynamical phase leading to supernova detonation in these systems, no detailed calculations of the long-term structure of the runaways have yet been performed. We use an existing \textsc{Arepo} hydrodynamical simulation of a supernova in a white dwarf binary as a starting point for the evolution of these stars with the 1 dimensional stellar evolution code MESA. We show that the supernova shock is not enough to inflate the white dwarf over timescales longer than a few thousand years, significantly shorter than the $10^{5-6}$ year lifetimes inferred for observed hypervelocity runaways. Despite experiencing a shock from a supernova less than $\approx 0.02\,R_\odot$ away, our models do not experience significant interior heating, and all contract back to radii around $0.01\,R_\odot$ within about $10^4$\,years. Explaining the observed inflated states requires either an additional source of significant heating or some other physics that is not yet accounted for in the subsequent evolution.
Submitted 3 July, 2024;
originally announced July 2024.
-
AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction
Authors:
Dubing Chen,
Wencheng Han,
Jin Fang,
Jianbing Shen
Abstract:
In this technical report, we present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ Dataset Challenge at CVPR 2024. Our innovative approach involves a dual-stage framework that enhances 3D occupancy and flow predictions by incorporating adaptive forward view transformation and flow modeling. Initially, we independently train the occupancy model, followed by flow prediction using sequential frame integration. Our method combines regression with classification to address scale variations in different scenes, and leverages predicted flow to warp current voxel features to future frames, guided by future frame ground truth. Experimental results on the nuScenes dataset demonstrate significant improvements in accuracy and robustness, showcasing the effectiveness of our approach in real-world scenarios. Our single model based on Swin-Base ranks second on the public leaderboard, validating the potential of our method in advancing autonomous car perception systems.
Submitted 1 July, 2024;
originally announced July 2024.
-
A slightly oblate dark matter halo revealed by a retrograde precessing Galactic disk warp
Authors:
Yang Huang,
Qikang Feng,
Tigran Khachaturyants,
Huawei Zhang,
Jifeng Liu,
Juntai Shen,
Timothy C. Beers,
Youjun Lu,
Song Wang,
Haibo Yuan
Abstract:
The shape of the dark matter (DM) halo is key to understanding the hierarchical formation of the Galaxy. Despite extensive efforts in recent decades, however, its shape remains a matter of debate, with suggestions ranging from strongly oblate to prolate. Here, we present a new constraint on its present shape by directly measuring the evolution of the Galactic disk warp with time, as traced by accurate distance estimates and precise age determinations for about 2,600 classical Cepheids. We show that the Galactic warp is mildly precessing in a retrograde direction at a rate of $ω= -2.1 \pm 0.5 ({\rm statistical}) \pm 0.6 ({\rm systematic})$ km s$^{-1}$ kpc$^{-1}$ for the outer disk over the Galactocentric radius [$7.5, 25$] kpc, decreasing with radius. This constrains the shape of the DM halo to be slightly oblate with a flattening (minor axis to major axis ratio) in the range $0.84 \le q_Φ \le 0.96$. Given the young nature of the disk warp traced by Cepheids (less than 200 Myr), our approach directly measures the shape of the present-day DM halo. This measurement, combined with other measurements from older tracers, could provide vital constraints on the evolution of the DM halo and the assembly history of the Galaxy.
Submitted 29 June, 2024;
originally announced July 2024.
-
Interference Cancellation Based Neural Receiver for Superimposed Pilot in Multi-Layer Transmission
Authors:
Han Xiao,
Wenqiang Tian,
Shi Jin,
Wendong Liu,
Jia Shen,
Zhihua Shi,
Zhi Zhang
Abstract:
In this paper, an interference cancellation based neural receiver for superimposed pilot (SIP) in multi-layer transmission is proposed, where the data and pilot are non-orthogonally superimposed in the same time-frequency resource. Specifically, to deal with the intra-layer and inter-layer interference of SIP under multi-layer transmission, the interference cancellation with superimposed symbol aided channel estimation is leveraged in the neural receiver, accompanied by the pre-design of pilot code-division orthogonal mechanism at transmitter. In addition, to address the complexity issue for inter-vendor collaboration and the generalization problem in practical deployments, respectively, this paper also provides a fixed SIP (F-SIP) design based on constant pilot power ratio and scalable mechanisms for different modulation and coding schemes (MCSs) and transmission layers. Simulation results demonstrate the superiority of the proposed schemes on the performance of block error rate and throughput compared with existing counterparts.
Submitted 27 June, 2024;
originally announced June 2024.
-
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo Labeling
Authors:
Haoran Li,
Xingjian Li,
Jiahua Shi,
Huaming Chen,
Bo Du,
Daisuke Kihara,
Johan Barthelemy,
Jun Shen,
Min Xu
Abstract:
Cryo-Electron Tomography (cryo-ET) is a 3D imaging technology facilitating the study of macromolecular structures at near-atomic resolution. Recent volumetric segmentation approaches on cryo-ET images have drawn widespread interest in the biological sector. However, existing methods heavily rely on manually labeled data, which requires highly professional skills, thereby hindering the adoption of fully-supervised approaches for cryo-ET images. Some unsupervised domain adaptation (UDA) approaches have been designed to enhance the segmentation network performance using unlabeled data. However, applying these methods directly to cryo-ET image segmentation tasks remains challenging due to two main issues: 1) the source data, usually obtained through simulation, contain a certain level of noise, while the target data, directly collected as raw data from real-world scenarios, have unpredictable noise levels; 2) the source data used for training typically consist of known macromolecules, while the target domain data are often unknown, causing the model's segmenter to be biased towards these known macromolecules, leading to a domain shift problem. To address these challenges, in this work, we introduce the first voxel-wise unsupervised domain adaptation approach, termed Vox-UDA, specifically for cryo-ET subtomogram segmentation. Vox-UDA incorporates a noise generation module to simulate target-like noise in the source dataset for cross-noise level adaptation. Additionally, we propose a denoised pseudo-labeling strategy based on an improved Bilateral Filter to alleviate the domain shift problem. Experimental results on both simulated and real cryo-ET subtomogram datasets demonstrate the superiority of our proposed approach compared to state-of-the-art UDA methods.
Submitted 30 June, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Flat bands and distinct density wave orders in correlated Kagome superconductor CsCr$_3$Sb$_5$
Authors:
Shuting Peng,
Yulei Han,
Yongkai Li,
Jianchang Shen,
Yu Miao,
Yang Luo,
Linwei Huai,
Zhipeng Ou,
Hongyu Li,
Ziji Xiang,
Zhengtai Liu,
Dawei Shen,
Makoto Hashimoto,
Donghui Lu,
Yugui Yao,
Zhenhua Qiao,
Zhiwei Wang,
Junfeng He
Abstract:
Kagome metal CsV$_3$Sb$_5$ has attracted much recent attention due to the coexistence of multiple exotic orders and the associated proposals to mimic unconventional high temperature superconductors. Nevertheless, magnetism and strong electronic correlations -- two essential ingredients for unconventional superconductivity, are absent in this V-based Kagome metal. CsCr$_3$Sb$_5$ is a newly discovered Cr-based parallel of CsV$_3$Sb$_5$, in which magnetism appears with charge density wave and superconductivity at different temperature and pressure regions. Enhanced electronic correlations are also suggested by theoretical proposals due to the calculated flat bands. Here, we report angle-resolved photoemission measurements and first-principles calculations on this new material system. Electron energy bands and the associated orbitals are resolved. Flat bands are observed near the Fermi level. Doping dependent measurements on Cs(Cr$_x$V$_{1-x}$)$_3$Sb$_5$ reveal a gradually enhanced band renormalization from CsV$_3$Sb$_5$ to CsCr$_3$Sb$_5$, accompanied by distinct spatial symmetry breaking states in the phase diagram.
Submitted 26 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Importance of Initial Condition on Bar Secular Evolution: Role of Halo Angular Momentum Distribution Discontinuity
Authors:
Sandeep Kumar Kataria,
Juntai Shen
Abstract:
The dark matter halo properties, for example, mass, spin and concentration, play a significant role in the formation and evolution of bars in disk galaxies. This study highlights the importance of a new parameter: the dark matter halo angular momentum distribution in the central region of the disk. We experiment with N-body galaxy models having a disk and dark matter similar to Milky Way-type galaxies. In these models, we vary the discontinuity of the angular momentum distribution of the halo (the total spin is the same for all models). Our N-body experiments suggest that a bar forms in all models after a few Gyr of disk evolution. However, in the secular evolution of the bar, as we evolve these models until 9.78 Gyr, the bar gains strength in the model with the most continuous halo angular momentum distribution, and loses strength for the most discontinuous halo angular momentum distribution. The secular evolution of the bar suggests that box/peanut/x-shaped bulges similar to those found in the Milky Way disk should be more pronounced in halos with continuous halo angular momentum distributions. This study demonstrates the importance of the initial condition setup of galaxy systems, namely the discontinuity in the dark matter halo angular momentum distribution for a given density distribution, on the bar secular evolution in disk galaxy simulations. Further, this study helps reconcile the conflicting results on bar secular evolution in a high-spinning halo in the recent literature.
Submitted 24 June, 2024;
originally announced June 2024.
-
Multi-threshold Deep Metric Learning for Facial Expression Recognition
Authors:
Wenwu Yang,
Jinyi Yu,
Tuo Chen,
Zhenguang Liu,
Xun Wang,
Jianbing Shen
Abstract:
Effective expression feature representations generated by a triplet-based deep metric learning are highly advantageous for facial expression recognition (FER). The performance of triplet-based deep metric learning is contingent upon identifying the best threshold for triplet loss. Threshold validation, however, is tough and challenging, as the ideal threshold changes among datasets and even across classes within the same dataset. In this paper, we present the multi-threshold deep metric learning technique, which not only avoids the difficult threshold validation but also vastly increases the capacity of triplet loss learning to construct expression feature representations. We find that each threshold of the triplet loss intrinsically determines a distinctive distribution of inter-class variations and corresponds, thus, to a unique expression feature representation. Therefore, rather than selecting a single optimal threshold from a valid threshold range, we thoroughly sample thresholds across the range, allowing the representation characteristics manifested by thresholds within the range to be fully extracted and leveraged for FER. To realize this approach, we partition the embedding layer of the deep metric learning network into a collection of slices and model training these embedding slices as an end-to-end multi-threshold deep metric learning problem. Each embedding slice corresponds to a sample threshold and is learned by enforcing the corresponding triplet loss, yielding a set of distinct expression features, one for each embedding slice. It makes the embedding layer, which is composed of a set of slices, a more informative and discriminative feature, hence enhancing the FER accuracy. Extensive evaluations demonstrate the superior performance of the proposed approach on both posed and spontaneous facial expression datasets.
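The slicing idea described in this abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: it assumes the thresholds act as triplet-loss margins, that the embedding is split into equal-width contiguous slices, that distances are Euclidean, and that the per-slice losses are simply summed.

```python
import math


def triplet_loss(anchor, positive, negative, margin):
    """Standard hinge-style triplet loss with Euclidean distances."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def multi_threshold_triplet_loss(anchor, positive, negative, margins):
    """Sum of triplet losses over embedding slices, one margin per slice.

    The embedding is cut into len(margins) equal contiguous slices
    (an assumed partitioning); slice i is trained against margins[i].
    """
    k = len(margins)
    if k == 0 or len(anchor) % k != 0:
        raise ValueError("embedding length must be a multiple of len(margins)")
    width = len(anchor) // k
    total = 0.0
    for i, margin in enumerate(margins):
        part = slice(i * width, (i + 1) * width)
        total += triplet_loss(anchor[part], positive[part], negative[part], margin)
    return total
```

For example, a 4-dimensional embedding with `margins=[0.5, 4.0]` is treated as two 2-dimensional slices, each penalized against its own threshold, so no single threshold has to be selected by validation.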
Submitted 24 June, 2024;
originally announced June 2024.
-
A new flow dynamic approach for Wasserstein gradient flows
Authors:
Qing Cheng,
Qianqian Liu,
Wenbin Chen,
Jie Shen
Abstract:
We develop in this paper a new regularized flow dynamic approach to construct efficient numerical schemes for Wasserstein gradient flows in Lagrangian coordinates. Instead of approximating the Wasserstein distance, which requires solving constrained minimization problems, we reformulate the problem using the Benamou-Brenier flow dynamic approach, leading to algorithms which only need to solve unconstrained minimization problems in $L^2$ distance. Our schemes automatically inherit some essential properties of Wasserstein gradient systems such as positivity preservation, mass conservation and energy dissipation. We present ample numerical simulations of Porous-Medium equations, Keller-Segel equations and Aggregation equations to validate the accuracy and stability of the proposed schemes. Compared to numerical schemes in Eulerian coordinates, our new schemes can capture sharp interfaces for various Wasserstein gradient flows using a relatively smaller number of unknowns.
Submitted 21 June, 2024;
originally announced June 2024.
-
Do Multimodal Foundation Models Understand Enterprise Workflows? A Benchmark for Business Process Management Tasks
Authors:
Michael Wornow,
Avanika Narayan,
Ben Viggiano,
Ishan S. Khare,
Tathagat Verma,
Tibor Thompson,
Miguel Angel Fuentes Hernandez,
Sudharsan Sundar,
Chloe Trujillo,
Krrish Chawla,
Rongfei Lu,
Justin Shen,
Divya Nagaraj,
Joshua Martinez,
Vardhan Agrawal,
Althea Hudson,
Nigam H. Shah,
Christopher Re
Abstract:
Existing ML benchmarks lack the depth and diversity of annotations needed for evaluating models on business process management (BPM) tasks. BPM is the practice of documenting, measuring, improving, and automating enterprise workflows. However, research has focused almost exclusively on one task - full end-to-end automation using agents based on multimodal foundation models (FMs) like GPT-4. This focus on automation ignores the reality of how most BPM tools are applied today - simply documenting the relevant workflow takes 60% of the time of the typical process optimization project. To address this gap we present WONDERBREAD, the first benchmark for evaluating multimodal FMs on BPM tasks beyond automation. Our contributions are: (1) a dataset containing 2928 documented workflow demonstrations; (2) 6 novel BPM tasks sourced from real-world applications ranging from workflow documentation to knowledge transfer to process improvement; and (3) an automated evaluation harness. Our benchmark shows that while state-of-the-art FMs can automatically generate documentation (e.g. recalling 88% of the steps taken in a video demonstration of a workflow), they struggle to re-apply that knowledge towards finer-grained validation of workflow completion (F1 < 0.3). We hope WONDERBREAD encourages the development of more "human-centered" AI tooling for enterprise applications and furthers the exploration of multimodal FMs for the broader universe of BPM tasks. We publish our dataset and experiments here: https://github.com/HazyResearch/wonderbread
Submitted 19 June, 2024;
originally announced June 2024.
-
ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection
Authors:
Junhao Lin,
Lei Zhu,
Jiaxing Shen,
Huazhu Fu,
Qing Zhang,
Liansheng Wang
Abstract:
With the rapid development of depth sensors, more and more RGB-D videos can be obtained. Identifying the foreground in RGB-D videos is a fundamental and important task. However, existing salient object detection (SOD) works focus only on either static RGB-D images or RGB videos, ignoring the combination of RGB-D and video information. In this paper, we first collect a new annotated RGB-D video SOD (ViDSOD-100) dataset, which contains 100 videos with a total of 9,362 frames, acquired from diverse natural scenes. All the frames in each video are manually annotated with high-quality saliency annotations. Moreover, we propose a new baseline model, named attentive triple-fusion network (ATF-Net), for RGB-D video salient object detection. Our method aggregates the appearance information from an input RGB image, spatio-temporal information from an estimated motion map, and the geometry information from the depth map by devising three modality-specific branches and a multi-modality integration branch. The modality-specific branches extract the representation of different inputs, while the multi-modality integration branch combines the multi-level modality-specific features by introducing the encoder feature aggregation (MEA) modules and decoder feature aggregation (MDA) modules. Experiments conducted on both our newly introduced ViDSOD-100 dataset and the well-established DAVSOD dataset highlight the superior performance of the proposed ATF-Net. This performance enhancement is demonstrated both quantitatively and qualitatively, surpassing current state-of-the-art techniques across various domains, including RGB-D saliency detection, video saliency detection, and video object segmentation. Our data and our code are available at github.com/jhl-Det/RGBD_Video_SOD.
Submitted 18 June, 2024;
originally announced June 2024.
-
CodeGemma: Open Code Models Based on Gemma
Authors:
CodeGemma Team,
Heri Zhao,
Jeffrey Hui,
Joshua Howland,
Nam Nguyen,
Siqi Zuo,
Andrea Hu,
Christopher A. Choquette-Choo,
Jingyue Shen,
Joe Kelley,
Kshitij Bansal,
Luke Vilnis,
Mateo Wirth,
Paul Michel,
Peter Choy,
Pratik Joshi,
Ravin Kumar,
Sarmad Hashmi,
Shubham Agrawal,
Zhitao Gong,
Jane Fine,
Tris Warkentin,
Ale Jakse Hartman,
Bin Ni,
Kathy Korevec
, et al. (2 additional authors not shown)
Abstract:
This paper introduces CodeGemma, a collection of specialized open code models built on top of Gemma, capable of a variety of code and natural language generation tasks. We release three model variants. CodeGemma 7B pretrained (PT) and instruction-tuned (IT) variants have remarkably resilient natural language understanding, excel in mathematical reasoning, and match code capabilities of other open models. CodeGemma 2B is a state-of-the-art code completion model designed for fast code infilling and open-ended generation in latency-sensitive settings.
Submitted 18 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
On the $P=C$ conjecture and refined BPS invariants for local $\mathbb{P}^2$
Authors:
Weite Pi,
Junliang Shen
Abstract:
We prove that the refined BPS invariants for local $\mathbb{P}^2$ satisfy an asymptotic product formula as conjectured by Kononov--Pi--Shen. Combined with the $P\supset C$ result of Maulik--Shen--Yin obtained from a theory of Fourier transform, we prove the $P=C$ conjecture for degree $d$ curves on $\mathbb{P}^2$ in cohomological degrees $\leq d+1$.
Submitted 14 June, 2024;
originally announced June 2024.
-
A way to identify whether a DFT gap is from right reasons or error cancellations: The case of copper chalcogenides
Authors:
Jiale Shen,
Haitao Liu,
Yuanchang Li
Abstract:
Gap opening remains elusive in copper chalcogenides (Cu$_{2}X$, $X$ = S, Se and Te), not least because Hubbard + $U$, hybrid functional and ${GW}$ methods have also failed. In this work, we elucidate that their failure originates from a severe underestimation of the 4$s$-3$d$ orbital splitting of the Cu atom, which leads to a band-order inversion in the presence of an anionic crystal field. As a result, the Fermi energy is pinned due to symmetry, yielding an invariant zero gap. Utilizing the hybrid pseudopotentials to correct the underestimation on the atomic side opens up gaps of experimental magnitude in Cu$_{2}X$, suggesting their predominantly electronic nature. Our work not only clarifies the debate about the Cu$_{2}X$ gap, but also provides a way to identify which of the different methods really captures the physical essence and which is the result of error cancellation.
Submitted 10 June, 2024;
originally announced June 2024.
-
Sparse two-stage Bayesian meta-analysis for individualized treatments
Authors:
Junwei Shen,
Erica E. M. Moodie,
Shirin Golchi
Abstract:
Individualized treatment rules tailor treatments to patients based on clinical, demographic, and other characteristics. Estimation of individualized treatment rules requires the identification of individuals who benefit most from the particular treatments and thus the detection of variability in treatment effects. To develop an effective individualized treatment rule, data from multisite studies may be required due to the low power provided by smaller datasets for detecting the often small treatment-covariate interactions. However, sharing of individual-level data is sometimes constrained. Furthermore, sparsity may arise in two senses: different data sites may recruit from different populations, making it infeasible to estimate identical models or all parameters of interest at all sites, and the number of non-zero parameters in the model for the treatment rule may be small. To address these issues, we adopt a two-stage Bayesian meta-analysis approach to estimate individualized treatment rules which optimize expected patient outcomes using multisite data without disclosing individual-level data beyond the sites. Simulation results demonstrate that our approach can provide consistent estimates of the parameters which fully characterize the optimal individualized treatment rule. We estimate the optimal Warfarin dose strategy using data from the International Warfarin Pharmacogenetics Consortium, where data sparsity and small treatment-covariate interaction effects pose additional statistical challenges.
Submitted 5 June, 2024;
originally announced June 2024.
-
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
Authors:
Rongzhi Zhang,
Jiaming Shen,
Tianqi Liu,
Haorui Wang,
Zhen Qin,
Feng Han,
Jialu Liu,
Simon Baumgartner,
Michael Bendersky,
Chao Zhang
Abstract:
Large Language Models (LLMs) have exhibited impressive capabilities in various tasks, yet their vast parameter sizes restrict their applicability in resource-constrained settings. Knowledge distillation (KD) offers a viable solution by transferring expertise from large teacher models to compact student models. However, traditional KD techniques face specific challenges when applied to LLMs, including restricted access to LLM outputs, significant teacher-student capacity gaps, and the inherited mis-calibration issue. In this work, we present PLaD, a novel preference-based LLM distillation framework. PLaD exploits the teacher-student capacity discrepancy to generate pseudo-preference pairs where teacher outputs are preferred over student outputs. Then, PLaD leverages a ranking loss to re-calibrate student's estimation of sequence likelihood, which steers the student's focus towards understanding the relative quality of outputs instead of simply imitating the teacher. PLaD bypasses the need for access to teacher LLM's internal states, tackles the student's expressivity limitations, and mitigates the student mis-calibration issue. Through extensive experiments on two sequence generation tasks and with various LLMs, we demonstrate the effectiveness of our proposed PLaD framework.
Submitted 6 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Context Gating in Spiking Neural Networks: Achieving Lifelong Learning through Integration of Local and Global Plasticity
Authors:
Jiangrong Shen,
Wenyao Ni,
Qi Xu,
Gang Pan,
Huajin Tang
Abstract:
Humans learn multiple tasks in succession with minimal mutual interference, through the context gating mechanism in the prefrontal cortex (PFC). The brain-inspired models of spiking neural networks (SNN) have drawn massive attention for their energy efficiency and biological plausibility. To overcome catastrophic forgetting when learning multiple tasks in sequence, current SNN models for lifelong learning focus on memory reserving or regularization-based modification, but SNNs that replicate human experimental behavior are lacking. Inspired by biological context-dependent gating mechanisms found in PFC, we propose SNN with context gating trained by the local plasticity rule (CG-SNN) for lifelong learning. The iterative training between global and local plasticity for task units is designed to strengthen the connections between task neurons and hidden neurons and preserve the multi-task relevant information. The experiments show that the proposed model is effective in maintaining the past learning experience and has better task-selectivity than other methods during lifelong learning. Our results provide new insights that the CG-SNN model can extend context gating with good scalability on different SNN architectures with different spike-firing mechanisms. Thus, our models have good potential for parallel implementation on neuromorphic hardware and for modeling human behavior.
Submitted 3 June, 2024;
originally announced June 2024.
-
Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning
Authors:
Yaxin Li,
Qi Xu,
Jiangrong Shen,
Hongming Xu,
Long Chen,
Gang Pan
Abstract:
The emergence of deep and large-scale spiking neural networks (SNNs) exhibiting high performance across diverse complex datasets has led to a need for compressing network models due to the presence of a significant number of redundant structural units, aiming to more effectively leverage their low-power consumption and biological interpretability advantages. Currently, most model compression techniques for SNNs are based on unstructured pruning of individual connections, which requires specific hardware support. Hence, we propose a structured pruning approach based on the activity levels of convolutional kernels named Spiking Channel Activity-based (SCA) network pruning framework. Inspired by synaptic plasticity mechanisms, our method dynamically adjusts the network's structure by pruning and regenerating convolutional kernels during training, enhancing the model's adaptation to the current target task. While maintaining model performance, this approach refines the network architecture, ultimately reducing computational load and accelerating the inference process. This indicates that structured dynamic sparse learning methods can better facilitate the application of deep SNNs in low-power and high-efficiency scenarios.
Submitted 3 June, 2024;
originally announced June 2024.
-
Ta2Pd3Te5 topological thermometer
Authors:
Yupeng Li,
Anqi Wang,
Senyang Pan,
Dayu Yan,
Guang Yang,
Xingchen Guo,
Yu Hong,
Guangtong Liu,
Fanming Qu,
Zhijun Wang,
Tian Qian,
Jinglei Zhang,
Youguo Shi,
Li Lu,
Jie Shen
Abstract:
In recent decades, there has been a persistent pursuit of applications for surface/edge states in topological systems, driven by their dissipationless transport effects. However, there have been limited tangible breakthroughs in this field. This work demonstrates the remarkable properties of the topological insulator Ta2Pd3Te5, as a thermometer. This material exhibits a power-law correlation in temperature-dependent resistance at low temperatures, stemming from its Luttinger liquid behavior of edge states, while exhibiting semiconductor behavior at high temperatures. The power-law behavior effectively addresses the issue of infinite resistance in semiconductor thermometers at ultra-low temperatures, thereby playing a crucial role in enabling efficient thermometry in refrigerators supporting millikelvin temperatures or below. By employing chemical doping, adjusting thickness, and controlling gate voltage, its power-law behavior and semiconductor behavior can be effectively modulated. This enables efficient thermometry spanning from millikelvin temperatures to room temperature, and allows for precise local temperature measurement. Furthermore, this thermometer exhibits excellent temperature sensitivity and resolution, and can be fine-tuned to show small magnetoresistance. In summary, the Ta2Pd3Te5 thermometer, also referred to as a topological thermometer, exhibits outstanding performance and significant potential for measuring a wider range of temperatures compared to conventional low-temperature thermometers.
Submitted 2 June, 2024;
originally announced June 2024.
-
A Survey on Large Language Models for Code Generation
Authors:
Juyong Jiang,
Fan Wang,
Jiasi Shen,
Sungju Kim,
Sunghun Kim
Abstract:
Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This burgeoning field has captured significant interest from both academic researchers and industry professionals due to its practical significance in software development, e.g., GitHub Copilot. Despite the active exploration of LLMs for a variety of code tasks, either from the perspective of natural language processing (NLP) or software engineering (SE) or both, there is a noticeable absence of a comprehensive and up-to-date literature review dedicated to LLM for code generation. In this survey, we aim to bridge this gap by providing a systematic literature review that serves as a valuable reference for researchers investigating the cutting-edge progress in LLMs for code generation. We introduce a taxonomy to categorize and discuss the recent developments in LLMs for code generation, covering aspects such as data curation, latest advances, performance evaluation, and real-world applications. In addition, we present a historical overview of the evolution of LLMs for code generation and offer an empirical comparison using the widely recognized HumanEval and MBPP benchmarks to highlight the progressive enhancements in LLM capabilities for code generation. We identify critical challenges and promising opportunities regarding the gap between academia and practical development. Furthermore, we have established a dedicated resource website (https://codellm.github.io) to continuously document and disseminate the most recent advances in the field.
Submitted 1 June, 2024;
originally announced June 2024.
-
Distributed Simulation for Digital Twins of Large-Scale Real-World DiffServ-Based Networks
Authors:
Zhuoyao Huang,
Nan Zhang,
Jingran Shen,
Georgios Diamantopoulos,
Zhengchang Hua,
Nikos Tziritas,
Georgios Theodoropoulos
Abstract:
Digital Twin technology facilitates the monitoring and online analysis of large-scale communication networks. Faster predictions of network performance thus become imperative, especially for analysing Quality of Service (QoS) parameters in large-scale city networks. Discrete Event Simulation (DES) is a standard network analysis technology, and can be further optimised with parallel and distributed execution for speedup, referred to as Parallel Discrete Event Simulation (PDES). However, modelling detailed QoS mechanisms such as DiffServ requires complex event handling for each network router, which can involve excessive simulation events. In addition, current PDES for network analysis mostly adopts conservative scheduling, which suffers from excessive global synchronisation to avoid causality problems. The performance analysis of optimistic PDES for real-world large-scale network topology and complex QoS mechanisms is still inadequate. To address these gaps, this paper proposes a simulation toolkit, Quaint, which leverages an optimistic PDES engine ROSS, for detailed modelling of DiffServ-based networks. A novel event-handling model for each network router is also proposed to significantly reduce the number of events in complex QoS modelling. Quaint has been evaluated using a real-world metropolitan-scale network topology with 5,000 routers/switches. Results show that compared to the conventional simulator OMNeT++/INET, even the sequential mode of Quaint can achieve a speedup of 53 times, and the distributed mode has a speedup of 232 times. Scalability characterisation is conducted to portray the efficiency of distributed execution, and the results indicate the future direction for workload-aware model partitioning.
Submitted 31 May, 2024;
originally announced May 2024.
-
A Pixel Is Worth More Than One 3D Gaussians in Single-View 3D Reconstruction
Authors:
Jianghao Shen,
Nan Xue,
Tianfu Wu
Abstract:
Learning 3D scene representation from a single-view image is a long-standing fundamental problem in computer vision, with the inherent ambiguity in predicting contents unseen from the input view. Built on the recently proposed 3D Gaussian Splatting (3DGS), the Splatter Image method has made promising progress on fast single-image novel view synthesis via learning a single 3D Gaussian for each pixel based on the U-Net feature map of an input image. However, it has limited expressive power to represent occluded components that are not observable in the input view. To address this problem, this paper presents a Hierarchical Splatter Image method in which a pixel is worth more than one 3D Gaussians. Specifically, each pixel is represented by a parent 3D Gaussian and a small number of child 3D Gaussians. Parent 3D Gaussians are learned as done in the vanilla Splatter Image. Child 3D Gaussians are learned via a lightweight Multi-Layer Perceptron (MLP) which takes as input the projected image features of a parent 3D Gaussian and the embedding of a target camera view. Both parent and child 3D Gaussians are learned end-to-end in a stage-wise way. The joint condition of input image features from eyes of the parent Gaussians and the target camera position facilitates learning to allocate child Gaussians to ``see the unseen'', recovering the occluded details that are often missed by parent Gaussians.
In experiments, the proposed method is tested on the ShapeNet-SRN and CO3D datasets with state-of-the-art performance obtained, especially showing promising capabilities of reconstructing occluded contents in the input view.
Submitted 3 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Almost All Carbon/Oxygen White Dwarfs Can Support Double Detonations
Authors:
Ken J. Shen,
Samuel J. Boos,
Dean M. Townsley
Abstract:
Double detonations of sub-Chandrasekhar-mass white dwarfs (WDs) in unstably mass-transferring double WD binaries have become a leading contender to explain most, if not all, Type Ia supernovae. However, past theoretical studies of the explosion process have assumed relatively ad hoc initial conditions for the helium shells in which the double detonations begin. In this work, we construct realistic C/O WDs to use as the starting points for multidimensional double detonation simulations. We supplement these with simplified one-dimensional detonation calculations to gain a physical understanding of the conditions under which shell detonations can propagate successfully. We find that C/O WDs <= 1.0 Msol, which make up the majority of C/O WDs, are born with structures that can support double detonations. More massive C/O WDs require ~1e-3 Msol of accretion before detonations can successfully propagate in their shells, but such accretion may be common in the double WD binaries that host massive WDs. Our findings strongly suggest that if the direct impact accretion stream reaches high enough temperatures and densities during mass transfer from one WD to another, the accreting WD will undergo a double detonation. Furthermore, if the companion is also a C/O WD <= 1.0 Msol, it will undergo its own double detonation when impacted by the ejecta from the first explosion. Exceptions to this outcome may explain the newly discovered class of hypervelocity supernova survivors.
Submitted 29 May, 2024;
originally announced May 2024.
-
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?
Authors:
Yifan Bai,
Dongming Wu,
Yingfei Liu,
Fan Jia,
Weixin Mao,
Ziheng Zhang,
Yucheng Zhao,
Jianbing Shen,
Xing Wei,
Tiancai Wang,
Xiangyu Zhang
Abstract:
Rapid advancements in Autonomous Driving (AD) tasks have driven a significant shift toward end-to-end approaches, particularly in the utilization of vision-language models (VLMs) that integrate robust logical reasoning and cognitive abilities to enable comprehensive end-to-end planning. However, these VLM-based approaches tend to integrate 2D vision tokenizers and a large language model (LLM) for ego-car planning, which lack 3D geometric priors as a cornerstone of reliable planning. Naturally, this observation raises a critical concern: Can a 2D-tokenized LLM accurately perceive the 3D environment? Our evaluation of current VLM-based methods across 3D object detection, vectorized map construction, and environmental captioning suggests that the answer is, unfortunately, NO. In other words, a 2D-tokenized LLM fails to provide reliable autonomous driving. In response, we introduce DETR-style 3D perceptrons as 3D tokenizers, which connect to the LLM through a one-layer linear projector. This simple yet elegant strategy, termed Atlas, harnesses the inherent priors of the 3D physical world, enabling it to simultaneously process high-resolution multi-view images and employ spatiotemporal modeling. Despite its simplicity, Atlas demonstrates superior performance in both 3D detection and ego planning tasks on the nuScenes dataset, proving that a 3D-tokenized LLM is the key to reliable autonomous driving. The code and datasets will be released.
Submitted 28 May, 2024;
originally announced May 2024.
-
HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs
Authors:
Jocelyn Shen,
Joel Mire,
Hae Won Park,
Cynthia Breazeal,
Maarten Sap
Abstract:
Empathy serves as a cornerstone in enabling prosocial behaviors, and can be evoked through sharing of personal experiences in stories. While empathy is influenced by narrative content, intuitively, people respond to the way a story is told as well, through narrative style. Yet the relationship between empathy and narrative style is not fully understood. In this work, we empirically examine and quantify this relationship between style and empathy using LLMs and large-scale crowdsourcing studies. We introduce a novel, theory-based taxonomy, HEART (Human Empathy and Narrative Taxonomy) that delineates elements of narrative style that can lead to empathy with the narrator of a story. We establish the performance of LLMs in extracting narrative elements from HEART, showing that prompting with our taxonomy leads to reasonable, human-level annotations beyond what prior lexicon-based methods can do. To show empirical use of our taxonomy, we collect a dataset of empathy judgments of stories via a large-scale crowdsourcing study with N=2,624 participants. We show that narrative elements extracted via LLMs, in particular, vividness of emotions and plot volume, can elucidate the pathways by which narrative style cultivates empathy towards personal stories. Our work suggests that such models can be used for narrative analyses that lead to human-centered social and behavioral insights.
Submitted 27 May, 2024;
originally announced May 2024.
-
High-Resolution Observation and Magnetic Modeling of a Solar Minifilament: the Formation, Eruption and Failing Mechanisms
Authors:
Weilin Teng,
Yingna Su,
Rui Liu,
Jialin Chen,
Yanjie Liu,
Jun Dai,
Wenda Cao,
Jinhua Shen,
Haisheng Ji
Abstract:
Minifilaments are widespread small-scale structures in the solar atmosphere. To better understand their formation and eruption mechanisms, we investigate the entire life of a sigmoidal minifilament located below a large quiescent filament observed by BBSO/GST on 2015 August 3. The Hα structure initially appears as a group of arched threads, then transforms into two J-shaped arcades, and finally forms a sigmoidal shape. SDO/AIA observations in 171Å show that two coronal jets occur around the southern footpoint of the minifilament before the minifilament eruption. The minifilament eruption starts from the southern footpoint, then interacts with the overlying filament and fails. The aforementioned observational changes correspond to three episodes of flux cancellation observed by SDO/HMI. Unlike in previous studies, the flux cancellation occurs between the polarity in which the southern footpoint of the minifilament is rooted and an external polarity. We construct two magnetic field models before the eruption using the flux rope insertion method, and find a hyperbolic flux tube (HFT) above the flux cancellation site. The observation and modeling results suggest that the eruption is triggered by the external magnetic reconnection between the core field of the minifilament and the external fields due to flux cancellation. This study reveals a new triggering mechanism for minifilament eruptions and a new relationship between minifilament eruptions and coronal jets.
Submitted 27 May, 2024;
originally announced May 2024.
-
Learning phase transitions by siamese neural network
Authors:
Jianmin Shen,
Shiyang Chen,
Feiyi Liu,
Youju Liu,
Wei Li
Abstract:
The wide application of machine learning (ML) techniques in statistical physics has opened new avenues for research in this field. In this paper, we introduce a semi-supervised learning method based on Siamese Neural Networks (SNN) to explore the potential of neural networks (NN) in the study of critical behaviors beyond the approaches of supervised and unsupervised learning. Focusing on the (1+1)-dimensional bond directed percolation (DP) model of nonequilibrium phase transition, we use the SNN to predict the critical values and critical exponents of the system. Unlike in traditional ML methods, the input of the SNN is a set of configuration data pairs and the output prediction is a similarity, which requires finding an anchor point of data for pair comparison during testing. In our study, during testing we set different bond probabilities $p$ as anchors, and discuss the impact of the configurations at these anchors on predictions. Moreover, we use an iterative method to find the optimal training interval to make the algorithm more efficient, and the prediction results are comparable to those of other ML methods.
Submitted 26 May, 2024;
originally announced May 2024.
-
The Singlet-Triplet Gap of Cyclobutadiene: The CIPSI-Driven CC($P$;$Q$) Study
Authors:
Swati S. Priyadarsini,
Karthik Gururangan,
Jun Shen,
Piotr Piecuch
Abstract:
An accurate determination of singlet-triplet gaps in biradicals, including cyclobutadiene in the automerization barrier region where one has to balance the substantial nondynamical many-electron correlation effects characterizing the singlet ground state with the predominantly dynamical correlations of the lowest-energy triplet, remains a challenge for many quantum chemistry methods. High-level coupled-cluster (CC) approaches, such as the CC method with a full treatment of singly, doubly, and triply excited clusters (CCSDT), are often capable of providing reliable results, but the routine application of such methods is hindered by their high computational costs. We have recently proposed a practical alternative to converging the CCSDT energetics at small fractions of the computational effort, even when electron correlations become stronger and connected triply excited clusters are larger and nonperturbative, by merging the CC($P$;$Q$) moment expansions with the selected configuration interaction methodology abbreviated as CIPSI. We demonstrate that one can accurately approximate the highly accurate CCSDT potential surfaces characterizing the lowest singlet and triplet states of cyclobutadiene along the automerization coordinate and the gap between them using tiny fractions of triply excited cluster amplitudes identified with the help of relatively inexpensive CIPSI Hamiltonian diagonalizations.
Submitted 24 May, 2024;
originally announced May 2024.
-
EmpathicStories++: A Multimodal Dataset for Empathy towards Personal Experiences
Authors:
Jocelyn Shen,
Yubin Kim,
Mohit Hulse,
Wazeer Zulfikar,
Sharifa Alghowinem,
Cynthia Breazeal,
Hae Won Park
Abstract:
Modeling empathy is a complex endeavor that is rooted in interpersonal and experiential dimensions of human interaction, and remains an open problem within AI. Existing empathy datasets fall short in capturing the richness of empathy responses, often being confined to in-lab or acted scenarios, lacking longitudinal data, and missing self-reported labels. We introduce a new multimodal dataset for empathy during personal experience sharing: the EmpathicStories++ dataset (https://mitmedialab.github.io/empathic-stories-multimodal/) containing 53 hours of video, audio, and text data of 41 participants sharing vulnerable experiences and reading empathically resonant stories with an AI agent. EmpathicStories++ is the first longitudinal dataset on empathy, collected over a month-long deployment of social robots in participants' homes, as participants engage in natural, empathic storytelling interactions with AI agents. We then introduce a novel task of predicting individuals' empathy toward others' stories based on their personal experiences, evaluated in two contexts: participants' own personal shared story context and their reflections on stories they read. We benchmark this task using state-of-the-art models to pave the way for future improvements in contextualized and longitudinal empathy modeling. Our work provides a valuable resource for further research in developing empathetic AI systems and understanding the intricacies of human empathy within genuine, real-world settings.
Submitted 24 May, 2024;
originally announced May 2024.
-
Large Language Models for Explainable Decisions in Dynamic Digital Twins
Authors:
Nan Zhang,
Christian Vergara-Marcillo,
Georgios Diamantopoulos,
Jingran Shen,
Nikos Tziritas,
Rami Bahsoon,
Georgios Theodoropoulos
Abstract:
Dynamic data-driven Digital Twins (DDTs) can enable informed decision-making and provide an optimisation platform for the underlying system. By leveraging principles of Dynamic Data-Driven Applications Systems (DDDAS), DDTs can formulate computational modalities for feedback loops, model updates and decision-making, including autonomous ones. However, understanding autonomous decision-making often requires technical and domain-specific knowledge. This paper explores using large language models (LLMs) to provide an explainability platform for DDTs, generating natural language explanations of the system's decision-making by leveraging domain-specific knowledge bases. A case study from smart agriculture is presented.
Submitted 23 May, 2024;
originally announced May 2024.
-
Edge Zeta Functions and Eigenvalues for Buildings of Finite Groups of Lie Type
Authors:
Jianhao Shen
Abstract:
We study the edge zeta functions of buildings associated to a finite group of Lie type, and prove that all the edge eigenvalues of these buildings are certain roots of powers of q. This work vastly generalizes the type A case, and generalizes Brouwer's work on oppositeness graph of these buildings.
Submitted 23 May, 2024;
originally announced May 2024.
-
Wrinkling of differentially growing bilayers with similar film and substrate moduli
Authors:
Jiajia Shen,
Yibin Fu,
Alberto Pirrera,
Rainer M. J. Groh
Abstract:
The study of growth-induced surface wrinkling in constrained bilayers comprising a thin film attached to a thick substrate is a canonical model for understanding pattern formation in many biological systems. While the bilayer model has received much prior attention, the nonlinear behaviour for arrangements with similar film and substrate properties, or substrate growth that outpaces film growth, remains poorly understood. This paper therefore focuses on these cases in which the substrate's elasticity dominates surface wrinkling. We study the critical states, and the initial and advanced post-critical behaviour of growing bilayers with film-to-substrate modulus ratios in the region of $2.5$--$50$, and cases where the substrate grows faster than the film. Based on nonlinear elasticity, we formulate analytical models for linear buckling analyses and asymptotic projections around the critical point, and use finite element (FE) models coupled to continuation and branch-switching algorithms to uncover the deep post-critical regime. It is shown that a rapidly growing substrate may change the critical mode from film-governed sinusoidal wrinkling to substrate-governed Biot wrinkling depending on the stiffness ratio and growth ratio. We present a phase change diagram of the post-critical modal landscape split into sinusoidal wrinkling, period doubling, period quadrupling, and creasing regimes in terms of the stiffness ratio and growth ratio. While the post-critical regime of film- and substrate-dominated bilayers (either in terms of dominant elasticity or growth rate) is governed by sinusoidal wrinkling and Biot creasing, respectively, the intermediate regions allow for period doubling and quadrupling bifurcations. Finally, we demonstrate the existence of multi-stability in the advanced post-buckling regimes for growing bilayers where growth in the substrate surpasses that of the film.
Submitted 20 May, 2024;
originally announced May 2024.
-
NeRTCAM: CAM-Based CMOS Implementation of Reference Frames for Neuromorphic Processors
Authors:
Harideep Nair,
William Leyman,
Agastya Sampath,
Quinn Jacobson,
John Paul Shen
Abstract:
Neuromorphic architectures mimicking biological neural networks have been proposed as a much more efficient alternative to conventional von Neumann architectures for the exploding compute demands of AI workloads. Recent neuroscience theory on intelligence suggests that Cortical Columns (CCs) are the fundamental compute units in the neocortex and that intelligence arises from CCs' ability to store, predict and infer information via structured Reference Frames (RFs). Based on this theory, recent works have demonstrated brain-like visual object recognition using software simulation. Our work is the first attempt towards direct CMOS implementation of Reference Frames for building CC-based neuromorphic processors. We propose NeRTCAM (Neuromorphic Reverse Ternary Content Addressable Memory), a CAM-based building block that supports the key operations (store, predict, infer) required to perform inference using RFs. The NeRTCAM architecture is presented in detail, including its key components. All designs are implemented in SystemVerilog and synthesized in 7nm CMOS, and hardware complexity scaling is evaluated for varying storage sizes. A NeRTCAM system for biologically motivated MNIST inference with a storage size of 1024 entries incurs just 0.15 mm^2 area, 400 mW power and 9.18 us critical path latency, demonstrating the feasibility of direct CMOS implementation of CAM-based Reference Frames.
Submitted 20 May, 2024;
originally announced May 2024.
-
Interpretable Machine Learning Enhances Disease Prognosis: Applications on COVID-19 and Onward
Authors:
Jinzhi Shen,
Ke Ma
Abstract:
In response to the COVID-19 pandemic, the integration of interpretable machine learning techniques has garnered significant attention, offering transparent and understandable insights crucial for informed clinical decision making. This literature review delves into the applications of interpretable machine learning in predicting the prognosis of respiratory diseases, particularly focusing on COVID-19 and its implications for future research and clinical practice. We reviewed various machine learning models that are not only capable of incorporating existing clinical domain knowledge but also have the learning capability to explore new information from the data. These models and experiences not only aid in managing the current crisis but also hold promise for addressing future disease outbreaks. By harnessing interpretable machine learning, healthcare systems can enhance their preparedness and response capabilities, thereby improving patient outcomes and mitigating the impact of respiratory diseases in the years to come.
Submitted 20 May, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
Private Data Leakage in Federated Human Activity Recognition for Wearable Healthcare Devices
Authors:
Kongyang Chen,
Dongping Zhang,
Sijia Guan,
Bing Mi,
Jiaxing Shen,
Guoqing Wang
Abstract:
Wearable data serves various health monitoring purposes, such as determining activity states based on user behavior and providing tailored exercise recommendations. However, the individual data perception and computational capabilities of wearable devices are limited, often necessitating the joint training of models across multiple devices. Federated Human Activity Recognition (HAR) presents a viable research avenue, allowing for global model training without the need to upload users' local activity data. Nonetheless, recent studies have revealed significant privacy concerns persisting within federated learning frameworks. To address this gap, we focus on investigating privacy leakage issues within federated user behavior recognition modeling across multiple wearable devices. Our proposed system entails a federated learning architecture comprising $N$ wearable device users and a parameter server, which may exhibit curiosity in extracting sensitive user information from model parameters. Consequently, we consider a membership inference attack based on a malicious server, leveraging differences in model generalization across client data. Experimentation conducted on five publicly available HAR datasets demonstrates an accuracy rate of 92\% for malicious server-based membership inference. Our study provides preliminary evidence of substantial privacy risks associated with federated training across multiple wearable devices, offering a novel research perspective within this domain.
Submitted 20 June, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Charge-Transfer Hyperbolic Polaritons in $α$-MoO$_3$/graphene heterostructures
Authors:
J. Shen,
M. Chen,
V. Korostelev,
H. Kim,
P. Fathi-Hafshejani,
M. Mahjouri-Samani,
K. Klyukin,
G-H. Lee,
S. Dai
Abstract:
Charge transfer is a fundamental interface process that can be harnessed for light detection, photovoltaics, and photosynthesis. Recently, charge transfer was exploited in nanophotonics to alter plasmon polaritons by involving additional non-polaritonic materials to activate the charge transfer. Yet, direct charge transfer between polaritonic materials hasn't been demonstrated. We report the direct charge transfer in pure polaritonic van der Waals (vdW) heterostructures of $α$-MoO$_3$/graphene. We extracted the Fermi energy of 0.6 eV for graphene by infrared nano-imaging of charge transfer hyperbolic polaritons in the vdW heterostructure. This unusually high Fermi energy is attributed to the charge transfer between graphene and $α$-MoO$_3$. Moreover, we have observed charge transfer hyperbolic polaritons in multiple energy-momentum dispersion branches with a wavelength elongation of up to 150%. With support from the DFT calculation, we find that the charge transfer between graphene and $α$-MoO$_3$, absent in mechanically assembled vdW heterostructures, is attributed to the relatively pristine heterointerface preserved in the epitaxially grown vdW heterostructure. The direct charge transfer and charge transfer hyperbolic polaritons demonstrated in our work hold great promise for developing nano-optical circuits, computational devices, communication systems, and light and energy manipulation devices.
Submitted 14 May, 2024;
originally announced May 2024.
-
"Community Guidelines Make this the Best Party on the Internet": An In-Depth Study of Online Platforms' Content Moderation Policies
Authors:
Brennan Schaffner,
Arjun Nitin Bhagoji,
Siyuan Cheng,
Jacqueline Mei,
Jay L. Shen,
Grace Wang,
Marshini Chetty,
Nick Feamster,
Genevieve Lakier,
Chenhao Tan
Abstract:
Moderating user-generated content on online platforms is crucial for balancing user safety and freedom of speech. Particularly in the United States, platforms are not subject to legal constraints prescribing permissible content. Each platform has thus developed bespoke content moderation policies, but there is little work towards a comparative understanding of these policies across platforms and topics. This paper presents the first systematic study of these policies from the 43 largest online platforms hosting user-generated content, focusing on policies around copyright infringement, harmful speech, and misleading content. We build a custom web-scraper to obtain policy text and develop a unified annotation scheme to analyze the text for the presence of critical components. We find significant structural and compositional variation in policies across topics and platforms, with some variation attributable to disparate legal groundings. We lay the groundwork for future studies of ever-evolving content moderation policies and their impact on users.
Submitted 8 May, 2024;
originally announced May 2024.
-
Tunable superconductivity in electron- and hole-doped Bernal bilayer graphene
Authors:
Chushan Li,
Fan Xu,
Bohao Li,
Jiayi Li,
Guoan Li,
Kenji Watanabe,
Takashi Taniguchi,
Bingbing Tong,
Jie Shen,
Li Lu,
Jinfeng Jia,
Fengcheng Wu,
Xiaoxue Liu,
Tingxin Li
Abstract:
Graphene-based, high-quality two-dimensional electronic systems have emerged as a highly tunable platform for studying superconductivity. Specifically, superconductivity has been observed in both electron-doped and hole-doped twisted graphene moiré systems, whereas in crystalline graphene systems, superconductivity has so far only been observed in hole-doped rhombohedral trilayer and hole-doped Bernal bilayer graphene (BBG). Recently, enhanced superconductivity has been demonstrated in BBG due to the proximity with a monolayer WSe2. Here, we report the observation of superconductivity and a series of flavor-symmetry-breaking phases in both electron- and hole-doped BBG/WSe2 devices by electrostatic doping. The strength of the observed superconductivity is tunable by applied vertical electric fields. The maximum Berezinskii-Kosterlitz-Thouless (BKT) transition temperature for the electron- and hole-doped superconductivity is about 210 mK and 400 mK, respectively. Superconductivity emerges only when applied electric fields drive BBG electron or hole wavefunctions toward the WSe2 layer, underscoring the importance of the WSe2 layer in the observed superconductivity. We find that the hole-doped superconductivity violates the Pauli paramagnetic limit, consistent with an Ising-like superconductor. In contrast, the electron-doped superconductivity obeys the Pauli limit, even though the proximity-induced Ising spin-orbit coupling is also notable in the conduction band. Our findings highlight the rich physics associated with the conduction band in BBG, paving the way for further studies into the superconducting mechanisms of crystalline graphene and the development of novel superconductor devices based on BBG.
Submitted 7 May, 2024;
originally announced May 2024.
-
Robust Optimization for Spot Scanning Proton Therapy based on Dose-Linear Energy Transfer (LET) Volume Constraints
Authors:
Jingyuan Chen,
Yunze Yang,
Hongying Feng,
Lian Zhang,
Carlos E. Vargas,
Nathan Y. Yu,
Jean-Claude M. Rwigema,
Sameer R. Keole,
Sujay A. Vora,
Jiajian Shen,
Wei Liu
Abstract:
Purpose: Historically, spot scanning proton therapy (SSPT) treatment planning utilizes dose volume constraints and linear-energy-transfer (LET) volume constraints separately to balance tumor control and organs-at-risk (OARs) protection. We propose a novel dose-LET volume constraint (DLVC)-based robust optimization (DLVCRO) method for SSPT in treating prostate cancer to obtain a desirable joint dose and LET distribution to minimize adverse events (AEs).
Methods: DLVCRO treats DLVC as soft constraints controlling the joint distribution of dose and LET. Ten prostate cancer patients were included with rectum and bladder as OARs. DLVCRO was compared with the conventional robust optimization (RO) method using the worst-case analysis method. Besides the dose-volume histogram (DVH) indices, the analogous LETVH and extra-biological-dose (xBD)-volume histogram indices were also used. The Wilcoxon signed rank test was used to measure statistical significance.
Results: In the nominal scenario, DLVCRO significantly improved dose, LET and xBD distributions to protect OARs (rectum: V70Gy: 3.07\% vs. 2.90\%, p = .0063, RO vs. DLVCRO; $\text{LET}_{\max}$ (keV/um): 11.53 vs. 9.44, p = .0101; $\text{xBD}_{\max}$ (Gy$\cdot$keV/um): 420.55 vs. 398.79, p = .0086; bladder: V65Gy: 4.82\% vs. 4.61\%, p = .0032; $\text{LET}_{\max}$ 8.97 vs. 7.51, p = .0047; $\text{xBD}_{\max}$ 490.11 vs. 476.71, p = .0641). The physical dose distributions in targets are comparable (D2%: 98.57\% vs. 98.39\%; p = .0805; CTV D2% - D98%: 7.10\% vs. 7.75\%, p = .4624). In the worst-case scenario, DLVCRO robustly enhanced OAR protection while maintaining similar plan robustness in target dose coverage and homogeneity.
Conclusion: DLVCRO upgrades 2D DVH-based to 3D DLVH-based treatment planning to adjust dose/LET distributions simultaneously and robustly. DLVCRO is potentially a powerful tool to improve patient outcomes in SSPT.
Submitted 6 May, 2024;
originally announced May 2024.
-
Unique solvability and error analysis of the Lagrange multiplier approach for gradient flows
Authors:
Qing Cheng,
Jie Shen,
Cheng Wang
Abstract:
The unique solvability and error analysis of the original Lagrange multiplier approach proposed in [8] for gradient flows are studied in this paper. We identify a necessary and sufficient condition that must be satisfied for the nonlinear algebraic equation arising from the original Lagrange multiplier approach to admit a unique solution in the neighborhood of its exact solution, and propose a modified Lagrange multiplier approach so that the computation can continue even if the aforementioned condition is not satisfied. Using the Cahn-Hilliard equation as an example, we rigorously prove the unique solvability and establish optimal error estimates of a second-order Lagrange multiplier scheme, assuming this condition holds and that the time step is sufficiently small. We also present numerical results to demonstrate that the modified Lagrange multiplier approach is much more robust and can use much larger time steps than the original Lagrange multiplier approach.
Submitted 6 May, 2024;
originally announced May 2024.
-
Multigroup Robustness
Authors:
Lunjia Hu,
Charlotte Peale,
Judy Hanwen Shen
Abstract:
To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions about people from a rich collection of overlapping subpopulations, we initiate the study of multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation. When the data corruption is not distributed uniformly over subpopulations, our algorithms provide more meaningful robustness guarantees than standard guarantees that are oblivious to how the data corruption and the affected subpopulations are related. Our techniques establish a new connection between multigroup fairness and robustness.
Submitted 1 May, 2024;
originally announced May 2024.
-
On a new class of BDF and IMEX schemes for parabolic type equations
Authors:
Fukeng Huang,
Jie Shen
Abstract:
When applying the classical multistep schemes for solving differential equations, one often faces the dilemma that smaller time steps are needed with higher-order schemes, making it impractical to use high-order schemes for stiff problems. We construct in this paper a new class of BDF and implicit-explicit (IMEX) schemes for parabolic type equations based on the Taylor expansions at time $t^{n+β}$ with $β> 1$ being a tunable parameter. These new schemes, with a suitable $β$, allow larger time steps at higher order for stiff problems than are allowed by a usual higher-order scheme. For parabolic type equations, we identify an explicit uniform multiplier for the new second- to fourth-order schemes, and conduct rigorous stability and error analysis by using the energy argument. We also present ample numerical examples to validate our findings.
Submitted 30 April, 2024;
originally announced May 2024.
-
The Local Dark Matter Kinematic Substructure Based on LAMOST K Giants
Authors:
Hai Zhu,
Rui Guo,
Juntai Shen,
Jianglai Liu,
Chao Liu,
Xiang-Xiang Xue,
Lan Zhang,
Shude Mao
Abstract:
Numerical simulations indicate that correlations exist between the velocity distributions of stars and dark matter (DM). We study the local DM velocity distribution based on these correlations. We select K giants from LAMOST DR8 cross-matched with Gaia DR3, which has robust measurements of three-dimensional velocity and metallicity, and separate them into the disk, halo substructure and main halo components in the chemo-dynamical space utilizing the Gaussian Mixture Model. The substructure component is highly radially anisotropic, and possibly related to the Gaia-Enceladus-Sausage (GES) merger event, while the halo component is isotropic and accreted from the earliest mergers following the Maxwell-Boltzmann Distribution (Standard Halo Model, SHM). We find that the GES-like substructure contributes $\sim85\%$ of the local non-disk stars in the Solar neighbourhood, which is nearly invariant when applying different volume cuts or additional angular momentum constraints. Utilizing the metallicity-stellar-mass relation and the stellar-mass-halo-mass relation, we find that $\sim25_{-15}^{+24}\%$ of local DM is in the kinematic substructure. Combined with the stellar distributions of non-disk components, we compute the velocity distribution of local DM. The modified heliocentric velocity distribution of local DM shifts to a lower speed and has a sharper peak compared to the SHM, which yields updated detection limits for the DM direct detection experiments. Our work confirms that the local DM velocity distribution deviates from the SHM, and needs to be properly accounted for in the DM detection experiments.
Submitted 30 April, 2024;
originally announced April 2024.
-
MACO: Exploring GEMM Acceleration on a Loosely-Coupled Multi-core Processor
Authors:
Bingcai Sui,
Junzhong Shen,
Caixia Sun,
Junhui Wang,
Zhong Zheng,
Wei Guo
Abstract:
General-purpose processor vendors have integrated customized accelerators in their products due to the widespread use of General Matrix-Matrix Multiplication (GEMM) kernels. However, it remains a challenge to further improve the flexibility and scalability of these GEMM-enhanced processors to cater to the emerging large-scale GEMM workloads. In this paper we propose MACO, a novel loosely-coupled multi-core general-purpose architecture optimized for GEMM-related applications. To enhance the programmability and flexibility of MACO, the paper introduces a tile-based instruction set architecture. Additionally, the paper presents techniques such as hardware-assisted data prefetching and locking, and predictive address translation to further enhance the computational efficiency of MACO for GEMM workloads. The experimental results demonstrate that MACO exhibits good scalability, achieving an average computational efficiency of 90% across multiple cores. Furthermore, evaluations on state-of-the-art deep neural networks show that MACO can achieve up to 1.1 TFLOPS with 88% computational efficiency, indicating its adaptability to deep learning workloads.
Submitted 29 April, 2024;
originally announced April 2024.