
F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data

Authors: Zexing Xu, Linjun Zhang, Sitan Yang, Rasoul Etesami, Hanghang Tong, Huan Zhang, Jiawei Han

Abstract: Demand prediction is a crucial task for e-commerce and physical retail businesses, especially during high-stake sales events. However, the limited availability of historical data from these peak periods poses a significant challenge for traditional forecasting methods. In this paper, we propose a novel approach that leverages strategically chosen proxy data reflective of potential sales patterns from similar entities during non-peak periods, enriched by features learned from a graph neural networks (GNNs)-based forecasting model, to predict demand during peak events. We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm that leverages proxy data from non-peak periods and GNN-generated relational metadata to learn feature-specific layer parameters, thereby adapting to demand forecasts for peak events. Theoretically, we show that by considering domain similarities through task-specific metadata, our model achieves improved generalization, where the excess risk decreases as the number of training tasks increases. Empirical evaluations on large-scale industrial datasets demonstrate the superiority of our approach. Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.

Submitted 23 June, 2024; originally announced June 2024.

MSC Class: 68T07; 68T05; 62M10; 62M20; 90C90; 91B84

arXiv:2406.16170 [pdf, other]

SimCE: Simplifying Cross-Entropy Loss for Collaborative Filtering

Authors: Xiaodong Yang, Huiyuan Chen, Yuchen Yan, Yuxin Tang, Yuying Zhao, Eric Xu, Yiwei Cai, Hanghang Tong

Abstract: The learning objective is integral to collaborative filtering systems, where the Bayesian Personalized Ranking (BPR) loss is widely used for learning informative backbones. However, BPR often experiences slow convergence and suboptimal local optima, partially because it only considers one negative item for each positive item, neglecting the potential impacts of other unobserved items. To address this issue, the recently proposed Sampled Softmax Cross-Entropy (SSM) compares one positive sample with multiple negative samples, leading to better performance. Our comprehensive experiments confirm that recommender systems consistently benefit from multiple negative samples during training. Furthermore, we introduce a \underline{Sim}plified Sampled Softmax \underline{C}ross-\underline{E}ntropy Loss (SimCE), which simplifies the SSM using its upper bound. Our validation on 12 benchmark datasets, using both MF and LightGCN backbones, shows that SimCE significantly outperforms both BPR and SSM.

Submitted 23 June, 2024; originally announced June 2024.
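
Illustrative note on the SimCE entry above: the contrast between BPR (one negative per positive) and sampled softmax (several negatives per positive) can be sketched with their textbook forms. This is a minimal sketch, not the paper's code; the function names and the example scores are assumptions, and SimCE itself is not reproduced because the abstract does not state its formula.

```python
import math


def bpr_loss(pos_score, neg_score):
    """BPR loss for one (positive, negative) pair: -log(sigmoid(pos - neg))."""
    # -log(sigmoid(x)) equals log(1 + exp(-x)); here x = pos_score - neg_score.
    return math.log1p(math.exp(neg_score - pos_score))


def sampled_softmax_loss(pos_score, neg_scores):
    """Sampled softmax cross-entropy: -log of the positive's softmax
    probability among the positive and the sampled negatives."""
    # -log(e^p / (e^p + sum_j e^{n_j})) equals log(1 + sum_j e^{n_j - p}).
    return math.log1p(sum(math.exp(s - pos_score) for s in neg_scores))


# Hypothetical scores for one positive item and three sampled negatives.
pos, negs = 2.0, [0.5, 1.0, -0.3]
print("BPR, first negative only:", bpr_loss(pos, negs[0]))
print("Sampled softmax, all negatives:", sampled_softmax_loss(pos, negs))
```

With a single negative the two losses coincide, which makes the role of the extra negatives easy to see: each additional negative adds a positive term inside the logarithm.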

arXiv:2406.08819 [pdf, other]

doi 10.1145/3637528.3671797

AIM: Attributing, Interpreting, Mitigating Data Unfairness

Authors: Zhining Liu, Ruizhong Qiu, Zhichen Zeng, Yada Zhu, Hendrik Hamann, Hanghang Tong

Abstract: Data collected in the real world often encapsulates historical discrimination against disadvantaged groups and individuals. Existing fair machine learning (FairML) research has predominantly focused on mitigating discriminative bias in the model prediction, with far less effort dedicated towards exploring how to trace biases present in the data, despite its importance for the transparency and interpretability of FairML. To fill this gap, we investigate a novel research problem: discovering samples that reflect biases/prejudices from the training data. Grounding on the existing fairness notions, we lay out a sample bias criterion and propose practical algorithms for measuring and countering sample bias. The derived bias score provides intuitive sample-level attribution and explanation of historical bias in data. On this basis, we further design two FairML strategies via sample-bias-informed minimal data editing. They can mitigate both group and individual unfairness at the cost of minimal or zero predictive utility loss. Extensive experiments and analyses on multiple real-world datasets demonstrate the effectiveness of our methods in explaining and mitigating unfairness. Code is available at https://github.com/ZhiningLiu1998/AIM.

Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: 12 pages, 6 figures, accepted by ACM SIGKDD 2024. Webpage: https://github.com/ZhiningLiu1998/AIM

arXiv:2406.06647 [pdf, other]

How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark

Authors: Ruizhong Qiu, Weiliang Will Zeng, Hanghang Tong, James Ezick, Christopher Lott

Abstract: The emergence of large language models (LLMs) has significantly pushed the frontiers of program synthesis. Advancement of LLM-based program synthesis calls for a thorough evaluation of LLM-generated code. Most evaluation frameworks focus on the (functional) correctness of generated code; efficiency, as an important measure of code quality, has been overlooked in existing evaluations. In this work, we develop ENAMEL (EfficeNcy AutoMatic EvaLuator), a rigorous and high-standard benchmark for evaluating the capability of LLMs in generating efficient code. Firstly, we propose a new efficiency metric called eff@k, which generalizes the pass@k metric from correctness to efficiency and appropriately handles right-censored execution time. Furthermore, we derive an unbiased and variance-reduced estimator of eff@k via Rao--Blackwellization; we also provide a numerically stable implementation for the new estimator. Secondly, to set a high-standard for efficiency evaluation, we employ a human expert to design best algorithms and implementations as our reference solutions of efficiency, many of which are much more efficient than existing canonical solutions in HumanEval and HumanEval+. Moreover, to ensure a rigorous evaluation, we employ a human expert to curate strong test case generators to filter out wrong code and differentiate suboptimal algorithms. An extensive study across 30 popular LLMs using our benchmark ENAMEL shows that LLMs still fall short of generating expert-level efficient code. Using two subsets of our problem set, we demonstrate that such deficiency is because current LLMs struggle in designing advanced algorithms and are barely aware of implementation optimization. Our benchmark is publicly available at https://github.com/q-rz/enamel .

Submitted 16 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.
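
Illustrative note on the ENAMEL entry above: the abstract describes eff@k as a generalization of pass@k. For orientation, the widely used unbiased pass@k estimator (introduced alongside HumanEval) can be sketched as below. This is only the correctness metric being generalized, not eff@k, whose definition the abstract does not give; the function name and argument checks are assumptions.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n generated samples, c of them correct.

    Returns the probability that at least one of k samples, drawn without
    replacement from the n generated ones, is correct.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k incorrect samples exist,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(n=10, c=5, k=1))  # 0.5
print(pass_at_k(n=10, c=5, k=3))
```

For large n the binomial coefficients grow quickly, which is one reason implementations often use an equivalent running product instead of calling `comb` directly.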

arXiv:2405.16805 [pdf, other]

Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization

Authors: Ruizhong Qiu, Hanghang Tong

Abstract: We study nonconvex zeroth-order optimization (ZOO) in a high-dimensional space $\mathbb R^d$ for functions with approximately $s$-sparse gradients. To reduce the dependence on the dimensionality $d$ in the query complexity, high-dimensional ZOO methods seek to leverage gradient sparsity to design gradient estimators. The previous best method needs $O\big(s\log\frac ds\big)$ queries per step to achieve $O\big(\frac1T\big)$ rate of convergence w.r.t. the number T of steps. In this paper, we propose *Gradient Compressed Sensing* (GraCe), a query-efficient and accurate estimator for sparse gradients that uses only $O\big(s\log\log\frac ds\big)$ queries per step and still achieves $O\big(\frac1T\big)$ rate of convergence. To our best knowledge, we are the first to achieve a *double-logarithmic* dependence on $d$ in the query complexity under weaker assumptions. Our proposed GraCe generalizes the Indyk--Price--Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions. Furthermore, since the IPW algorithm is purely theoretical due to its impractically large constant, we improve the IPW algorithm via our *dependent random partition* technique together with our corresponding novel analysis and successfully reduce the constant by a factor of nearly 4300. Our GraCe is not only theoretically query-efficient but also achieves strong empirical performance. We benchmark our GraCe against 12 existing ZOO methods with 10000-dimensional functions and demonstrate that GraCe significantly outperforms existing methods.

Submitted 26 May, 2024; originally announced May 2024.

Comments: ICML 2024
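
Illustrative note on the GraCe entry above: to see why query complexity matters in zeroth-order optimization, consider the naive coordinate-wise forward-difference estimator, which spends d + 1 function queries per step regardless of how sparse the gradient is. The sketch below is that baseline only, not GraCe or the IPW algorithm; the helper names, the step size `h`, and the test function are assumptions.

```python
def count_queries(f):
    """Wrap f so that the number of function evaluations can be inspected."""
    calls = {"n": 0}

    def wrapped(x):
        calls["n"] += 1
        return f(x)

    return wrapped, calls


def forward_difference_gradient(f, x, h=1e-6):
    """Estimate the gradient of f at x using function values only.

    Costs len(x) + 1 queries: one at x and one per perturbed coordinate.
    """
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((f(shifted) - fx) / h)
    return grad


# A 5-dimensional function whose gradient is 2-sparse (coordinates 0 and 3).
f, calls = count_queries(lambda x: x[0] ** 2 + 3.0 * x[3])
grad = forward_difference_gradient(f, [1.0, 0.0, 0.0, 2.0, 0.0])
print("estimated gradient:", grad)
print("queries used:", calls["n"])
```

Here only two coordinates carry gradient information, yet every coordinate is probed; sparsity-aware estimators such as those discussed in the abstract aim to avoid that linear dependence on d.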

arXiv:2405.11416 [pdf, other]

Discrete-state Continuous-time Diffusion for Graph Generation

Authors: Zhe Xu, Ruizhong Qiu, Yuzhong Chen, Huiyuan Chen, Xiran Fan, Menghai Pan, Zhichen Zeng, Mahashweta Das, Hanghang Tong

Abstract: Graph is a prevalent discrete data structure, whose generation has wide applications such as drug discovery and circuit design. Diffusion generative models, as an emerging research focus, have been applied to graph generation tasks. Overall, according to the space of states and time steps, diffusion generative models can be categorized into discrete-/continuous-state discrete-/continuous-time fashions. In this paper, we formulate the graph diffusion generation in a discrete-state continuous-time setting, which has never been studied in previous graph diffusion models. The rationale of such a formulation is to preserve the discrete nature of graph-structured data and meanwhile provide flexible sampling trade-offs between sample quality and efficiency. Analysis shows that our training objective is closely related to generation quality, and our proposed generation framework enjoys ideal invariant/equivariant properties concerning the permutation of node ordering. Our proposed model shows competitive empirical performance against state-of-the-art graph generation solutions on various benchmarks and, at the same time, can flexibly trade off the generation quality and efficiency in the sampling phase.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.07446 [pdf, other]

doi 10.3847/1538-4357/ad35be

A Generic Model for Persistent Radio Source around Fast Radio Bursts

Authors: Yushan Chen, Hao Tong

Abstract: The repeated fast radio burst FRB 121102A and FRB 190520B has been reported, along with a spatially coincident, compact, persistent radio emission. In this paper, we present a parameterized one-zone model, with a basic scenario that a relativistic magnetized wind from the pulsar sweeps up the surroundings, e.g. freely expanding supernova ejecta, giving rise to a power-law distribution of electron filled between the forward shock and the termination shock. We show that via appropriate adjustment of the model parameters, we can obtain the synchrotron radio emission properties from the one-zone model bright enough to account for observation, simply and analytically fitting the observed spectra well. Through dynamical evolution of the model, we can also obtain time-varying of relevant properties. This parameterized model does not depend on concrete physical models such as central engine, instead we can constraint physical model via comparison between parameters and observation, indicating the information about the central engine and surroundings. We also discuss the synchrotron self-Compton emission in our scenario in the end, but find no clue on the counterparts at other waveband.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 12 pages, 5 figures, published by ApJ

Journal ref: ApJ, Volume 966, 2024, Issue 2, id.179, 9 pp

arXiv:2405.05945 [pdf, other]

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

Authors: Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li

Abstract: Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks sufficient implementation details. In this technical report, we introduce the Lumina-T2X family - a series of Flow-based Large Diffusion Transformers (Flag-DiT) equipped with zero-initialized attention, as a unified framework designed to transform noise into images, videos, multi-view 3D objects, and audio clips conditioned on text instructions. By tokenizing the latent spatial-temporal space and incorporating learnable placeholders such as [nextline] and [nextframe] tokens, Lumina-T2X seamlessly unifies the representations of different modalities across various spatial-temporal resolutions. This unified approach enables training within a single framework for different modalities and allows for flexible generation of multimodal data at any resolution, aspect ratio, and length during inference. Advanced techniques like RoPE, RMSNorm, and flow matching enhance the stability, flexibility, and scalability of Flag-DiT, enabling models of Lumina-T2X to scale up to 7 billion parameters and extend the context window to 128K tokens. This is particularly beneficial for creating ultra-high-definition images with our Lumina-T2I model and long 720p videos with our Lumina-T2V model. Remarkably, Lumina-T2I, powered by a 5-billion-parameter Flag-DiT, requires only 35% of the training computational costs of a 600-million-parameter naive DiT. Our further comprehensive analysis underscores Lumina-T2X's preliminary capability in resolution extrapolation, high-resolution editing, generating consistent 3D views, and synthesizing videos with seamless transitions. We expect that the open-sourcing of Lumina-T2X will further foster creativity, transparency, and diversity in the generative AI community.

Submitted 13 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: Technical Report; Code at: https://github.com/Alpha-VLLM/Lumina-T2X

arXiv:2405.05389 [pdf, other]

On foundation of generative statistics with F-entropy: a gradient-based approach

Authors: Bing Cheng, Howell Tong

Abstract: This paper explores the interplay between statistics and generative artificial intelligence. Generative statistics, an integral part of the latter, aims to construct models that can {\it generate} efficiently and meaningfully new data across the whole of the (usually high dimensional) sample space, e.g. a new photo. Within it, the gradient-based approach is a current favourite that exploits effectively, for the above purpose, the information contained in the observed sample, e.g. an old photo. However, often there are missing data in the observed sample, e.g. missing bits in the old photo. To handle this situation, we have proposed a gradient-based algorithm for generative modelling. More importantly, our paper underpins rigorously this powerful approach by introducing a new F-entropy that is related to Fisher's divergence. (The F-entropy is also of independent interest.) The underpinning has enabled the gradient-based approach to expand its scope. For example, it can now provide a tool for generative model selection. Possible future projects include discrete data and Bayesian variational inference.

Submitted 29 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: 29 pages

MSC Class: 60

arXiv:2405.04028 [pdf, other]

doi 10.1145/3626772.3657971

Masked Graph Transformer for Large-Scale Recommendation

Authors: Huiyuan Chen, Zhe Xu, Chin-Chia Michael Yeh, Vivian Lai, Yan Zheng, Minghua Xu, Hanghang Tong

Abstract: Graph Transformers have garnered significant attention for learning graph-structured data, thanks to their superb ability to capture long-range dependencies among nodes. However, the quadratic space and time complexity hinders the scalability of Graph Transformers, particularly for large-scale recommendation. Here we propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with a linear complexity. To achieve this, we treat all user/item nodes as independent tokens, enhance them with positional embeddings, and feed them into a kernelized attention module. Additionally, we incorporate learnable relative degree information to appropriately reweigh the attentions. Experimental results show the superior performance of our MGFormer, even with a single attention layer.

Submitted 7 May, 2024; originally announced May 2024.
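
Illustrative note on the MGFormer entry above: kernelized attention reaches linear complexity by replacing the softmax similarity with a product of feature maps, so key/value summaries can be computed once and reused for every query. The sketch below shows one common form of that idea with an elu(x) + 1 feature map, which is an assumption here. It is not MGFormer's module: the paper's kernel, positional embeddings, and degree-based reweighting are not reproduced, and all names are hypothetical.

```python
import math


def phi(vec):
    """Positive feature map elu(x) + 1, applied elementwise (assumed choice)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """Reference form: every query is compared with every key, O(n^2) pairs."""
    out = []
    for q in Q:
        weights = [dot(phi(q), phi(k)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Same result, but keys and values are summarized once, O(n) in tokens."""
    m, dv = len(K[0]), len(V[0])
    kv_sum = [[0.0] * dv for _ in range(m)]  # sum over j of phi(k_j) v_j^T
    k_sum = [0.0] * m                        # sum over j of phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(m):
            k_sum[a] += fk[a]
            for c in range(dv):
                kv_sum[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        denom = dot(fq, k_sum)
        out.append([sum(fq[a] * kv_sum[a][c] for a in range(m)) / denom
                    for c in range(dv)])
    return out


Q = [[0.2, -1.0], [1.5, 0.3], [-0.4, 0.9]]
K = [[0.1, 0.7], [-0.8, 0.2], [1.1, -0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(quadratic_attention(Q, K, V))
print(linear_attention(Q, K, V))
```

Both functions produce the same outputs up to floating-point error; the second one simply regroups the sums so that the cost grows linearly with the number of tokens.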

arXiv:2405.03983 [pdf, other]

MEET-U Project I: The key drivers of the preference for dynamic dark energy

Authors: Zhiqi Huang, Jianqi Liu, Jianfeng Mo, Yan Su, Junchao Wang, Yanhong Yao, Guangyao Yu, Zhengxin Zhu, Zhuoyang Li, Zhenjie Liu, Haitao Miao, Hui Tong

Abstract: Joint analysis of the baryon acoustic oscillations (BAO) measurement by the Dark Energy Spectroscopic Instrument (DESI) first data release, Type Ia supernovae (SNe) of the Dark Energy Survey Year 5 (DES5YR) release and cosmic microwave background (CMB) data favors a quintom-like dynamic dark energy model over the standard Lambda cold dark matter ($Λ$CDM) model at $3.9σ$ level (Adame et al. 2024). We demonstrate that the preference for dynamic dark energy does not rely on the detailed modeling of CMB physics and remains at $3.2σ$ level when the full CMB likelihood is replaced by a CMB acoustic-oscillation angle ($θ_\star$) prior and a baryon abundance ($Ω_bh^2$) prior. By comparing the data with over $10^4$ $Λ$CDM-based simulations, we find that both DES5YR SNe and DESI BAO contribute significantly to the preference for dynamic dark energy. The preference for dynamic dark energy is unlikely (probability $\lesssim 0.02$) due to unknown systematics in DES5YR SNe and statistical fluctuations in DESI BAO, or vice versa.

Submitted 3 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

Comments: 5 pages, 3 figures, MEET-U project

Report number: MEET-U-01 MSC Class: 83F05 ACM Class: J.2

arXiv:2405.02197 [pdf, other]

Possible Causes of False General Relativity Violations in Gravitational Wave Observations

Authors: Anuradha Gupta, K. G. Arun, Enrico Barausse, Laura Bernard, Emanuele Berti, Sajad A. Bhat, Alessandra Buonanno, Vitor Cardoso, Shun Yin Cheung, Teagan A. Clarke, Sayantani Datta, Arnab Dhani, Jose María Ezquiaga, Ish Gupta, Nir Guttman, Tanja Hinderer, Qian Hu, Justin Janquart, Nathan K. Johnson-McDaniel, Rahul Kashyap, N. V. Krishnendu, Paul D. Lasky, Andrew Lundgren, Elisa Maggio, Parthapratim Mahapatra, et al. (18 additional authors not shown)

Abstract: General relativity (GR) has proven to be a highly successful theory of gravity since its inception. The theory has thrivingly passed numerous experimental tests, predominantly in weak gravity, low relative speeds, and linear regimes, but also in the strong-field and very low-speed regimes with binary pulsars. Observable gravitational waves (GWs) originate from regions of spacetime where gravity is extremely strong, making them a unique tool for testing GR, in previously inaccessible regions of large curvature, relativistic speeds, and strong gravity. Since their first detection, GWs have been extensively used to test GR, but no deviations have been found so far. Given GR's tremendous success in explaining current astronomical observations and laboratory experiments, accepting any deviation from it requires a very high level of statistical confidence and consistency of the deviation across GW sources. In this paper, we compile a comprehensive list of potential causes that can lead to a false identification of a GR violation in standard tests of GR on data from current and future ground-based GW detectors. These causes include detector noise, signal overlaps, gaps in the data, detector calibration, source model inaccuracy, missing physics in the source and in the underlying environment model, source misidentification, and mismodeling of the astrophysical population. We also provide a rough estimate of when each of these causes will become important for tests of GR for different detector sensitivities. We argue that each of these causes should be thoroughly investigated, quantified, and ruled out before claiming a GR violation in GW observations.

Submitted 3 May, 2024; originally announced May 2024.

Comments: Review article; 1 figure; 1 table; comments welcome

arXiv:2405.01887 [pdf, other]

Ab initio calculation of hyper-neutron matter

Authors: Hui Tong, Serdar Elhatisari, Ulf-G. Meißner

Abstract: The equation of state (EoS) of neutron matter plays a decisive role in our understanding of the properties of neutron stars as well as the generation of gravitational waves in neutron star mergers. At sufficient densities, it is known that the appearance of hyperons generally softens the EoS, thus leading to a reduction in the maximum mass of neutron stars well below the observed values of about 2 solar masses. Even though repulsive three-body forces are known to solve this so-called "hyperon puzzle", so far performing \textit{ab initio} calculations with a substantial number of hyperons has remained elusive. In this work, we address this challenge by employing simulations based on Nuclear Lattice Effective Field Theory with up to 232 neutrons (pure neutron matter) and up to 116 $Λ$ hyperons (hyper-neutron matter) in a finite volume. We introduce a novel auxiliary field quantum Monte Carlo algorithm, allowing us to simulate for both pure neutron matter and hyper-neutron matter systems up to 5 times the density of nuclear matter using a single auxiliary field without any sign oscillations. Also, for the first time in {\em ab initio} calculations, we not only include $NΛ$ two-body and $NNΛ$ three-body forces, but also $ΛΛ$ and $N ΛΛ$ interactions. Consequently, we determine essential astrophysical quantities such as the mass-radius relation, the speed of sound and the tidal deformability of neutron stars. Our findings also confirm the existence of the $I$-Love-$Q$ relation, which gives access to the moment of inertia of the neutron star.

Submitted 8 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

Comments: 17 pages, 10 figures, extended discussions, many references added

arXiv:2405.00509 [pdf, other]

Polarization Perspectives on Hercules X-1: Further Constraining the Geometry

Authors: Qingchang Zhao, Hancheng Li, Lian Tao, Hua Feng, Shuangnan Zhang, Roland Walter, Mingyu Ge, Hao Tong, Long Ji, Liang Zhang, Jinlu Qu, Yue Huang, Xiang Ma, Shu Zhang, Qianqing Yin, Hongxing Yin, Ruican Ma, Shujie Zhao, Panping Li, Zixu Yang, Hexin Liu, Wei Yu, Yiming Huang, Zexi Li, Yajun Li, et al. (2 additional authors not shown)

Abstract: We conduct a comprehensive analysis of the accreting X-ray pulsar, Hercules X-1, utilizing data from IXPE and NuSTAR. IXPE performed five observations of Her X-1, consisting of three in the Main-on state and two in the Short-on state. Our time-resolved analysis uncovers the linear correlations between the flux and polarization degree as well as the pulse fraction and polarization degree. Geometry parameters are rigorously constrained by fitting the phase-resolved modulations of Cyclotron Resonance Scattering Feature and polarization angle with a simple dipole model and Rotating Vector Model respectively, yielding roughly consistent results. The changes of $χ_{\rm p}$ (the position angle of the pulsar's spin axis on the plane of the sky) between different Main-on observations suggest the possible forced precession of the neutron star crust. Furthermore, a linear association between the energy of Cyclotron Resonance Scattering Feature and polarization angle implies the prevalence of a dominant dipole magnetic field, and their phase-resolved modulations likely arise from viewing angle effects.

Submitted 1 May, 2024; originally announced May 2024.

Comments: Accepted for MNRAS

arXiv:2404.17675 [pdf, other]

Ideal noncrystals: A possible new class of ordered matter without apparent broken symmetry

Authors: Xinyu Fan, Ding Xu, Jianhua Zhang, Hao Hu, Peng Tan, Ning Xu, Hajime Tanaka, Hua Tong

Abstract: Order and disorder constitute two fundamental and opposite themes in condensed matter physics and materials science. Crystals are considered the epitome of order characterized by long-range translational order. The discovery of quasicrystals, with no periodicity but rotational symmetries forbidden for crystals, leads to a paradigm shift in solid-state physics. Moving one step forward, it is intriguing to ask whether ordered matter exists without apparent symmetry breaking. The same question may arise in the pursuit of how ordered amorphous (noncrystalline) solids can be. Here we report the finding of ideal noncrystals in two dimensions, which are disordered in the conventional sense without Bragg peaks but highly ordered according to the steric order. We find that such ideal noncrystals have vibrational modes the same as phonons following the Debye law. The elastic responses are fully affine, which is again characteristic of crystals, and the spatial fluctuations of local volume fractions approach hyperuniformity. Therefore, ideal noncrystals represent an anomalous form of matter with a mixed nature of noncrystalline structure but crystal-like properties. Since such states are found to be thermodynamically favorable, we identify them as a possible new class of ordered matter without apparent broken symmetry. Our results thus extend the scope of the ordered state of matter and may impact the understanding of entropy-driving ordering also in generic amorphous materials.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.14741 [pdf, other]

Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering

Authors: Yao Xu, Shizhu He, Jiabei Chen, Zihao Wang, Yangqiu Song, Hanghang Tong, Kang Liu, Jun Zhao

Abstract: To address the issue of insufficient knowledge and the tendency to generate hallucination in Large Language Models (LLMs), numerous studies have endeavored to integrate LLMs with Knowledge Graphs (KGs). However, all these methods are evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where the factual triples involved in each question are entirely covered by the given KG. In this situation, the LLM mainly acts as an agent to find answer entities by exploring the KG, rather than effectively integrating internal and external knowledge sources. However, in real-world scenarios, KGs are often too incomplete to cover all the knowledge required to answer questions. To simulate real-world scenarios and evaluate the ability of LLMs to integrate internal and external knowledge, in this paper, we propose leveraging LLMs for QA under Incomplete Knowledge Graph (IKGQA), where the given KG doesn't include all the factual triples involved in each question. To handle IKGQA, we propose a training-free method called Generate-on-Graph (GoG) that can generate new factual triples while exploring on KGs. Specifically, we propose a selecting-generating-answering framework, which not only treats the LLM as an agent to explore on KGs, but also treats it as a KG to generate new facts based on the explored subgraph and its inherent knowledge. Experimental results on two datasets demonstrate that our GoG can solve IKGQA to a certain extent, while almost all previous methods cannot perform well on IKGQA.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.12522 [pdf, other]

Neural Active Learning Beyond Bandits

Authors: Yikun Ban, Ishika Agarwal, Ziwei Wu, Yada Zhu, Kommy Weldemariam, Hanghang Tong, Jingrui He

Abstract: We study both stream-based and pool-based active learning with neural network approximations. A recent line of works proposed bandit-based approaches that transformed active learning into a bandit problem, achieving both theoretical and empirical success. However, the performance and computational costs of these methods may be susceptible to the number of classes, denoted as $K$, due to this transformation. Therefore, this paper seeks to answer the question: "How can we mitigate the adverse impacts of $K$ while retaining the advantages of principled exploration and provable performance guarantees in active learning?" To tackle this challenge, we propose two algorithms based on the newly designed exploitation and exploration neural networks for stream-based and pool-based active learning. Subsequently, we provide theoretical performance guarantees for both algorithms in a non-parametric setting, demonstrating a slower error-growth rate concerning $K$ for the proposed approaches. We use extensive experiments to evaluate the proposed algorithms, which consistently outperform state-of-the-art baselines.

Submitted 18 April, 2024; originally announced April 2024.

Comments: Published on ICLR 2024, 40 Pages

arXiv:2404.10596 [pdf, other]

Formation of GW230529 from Isolated Binary Evolution and Its Electromagnetic Counterparts

Authors: Jin-Ping Zhu, Rui-Chong Hu, Yacheng Kang, Bing Zhang, Hui Tong, Lijing Shao, Ying Qin

Abstract: In this {\em{Letter}}, we explore the formation of the mass-gap black hole-neutron star (mgBHNS) merger detected in gravitational wave (GW) event, i.e., GW230529, from the isolated binary evolution channel, and study potential signatures of its electromagnetic counterparts. By adopting the `delayed' supernova prescription and reasonable model realizations, our population synthesis simulation results can simultaneously match the rate densities of mgBHNS and total BHNS mergers inferred from the population analyses, along with the population distribution of the BH mass in BHNS mergers reported by the LIGO-Virgo-KAGRA Collaboration. Because GW230529 contributes significantly to the inferred mgBHNS rate densities, we suggest that GW230529 can be explained through the isolated binary evolution channel. Considering the AP4 (DD2) equation of state, the probability that GW230529 can make tidal disruption is $12.8\%$ ($63.2\%$). If GW230529 is a disrupted event, its kilonova peak apparent magnitude is predicted $\sim23-24\,{\rm{mag}}$, and hence, can be detected by the present survey projects and LSST. Since GW230529 could be an off-axis event inferred from the GW observation, its associated gamma-ray burst (GRB) might be too dim to be observed by $γ$-ray detectors, interpreting the lack of GRB observations. Our study suggests the existence of mgBHNS mergers formed through the isolated binary evolution channel due to the discovery of GW230529, indicating that BHNS mergers are still likely to be multimessenger sources that emit GWs, GRBs, and kilonovae. Although mgBHNS mergers account for $\sim50\%$ cosmological BHNS population, we find that $\gtrsim90\%$ disrupted BHNS mergers are expected to originate from mgBHNS mergers.

Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: Submitted to ApJL, 15 pages, 5 figures, 4 tables, comments are welcome!

arXiv:2404.04460 [pdf, other]

Transdimensional inference for gravitational-wave astronomy with Bilby

Authors: Hui Tong, Nir Guttman, Teagan A. Clarke, Paul D. Lasky, Eric Thrane, Ethan Payne, Rowina Nathan, Ben Farr, Maya Fishbach, Gregory Ashton, Valentina Di Marco

Abstract: It has become increasingly useful to answer questions in gravitational-wave astronomy using transdimensional models where the number of free parameters can be varied depending on the complexity required to fit the data. Given the growing interest in transdimensional inference, we introduce a new package for the Bayesian inference Library (Bilby) called tBilby. The tBilby package allows users to set up transdimensional inference calculations using the existing Bilby architecture with off-the-shelf nested samplers and/or Markov Chain Monte Carlo algorithms. Transdimensional models are particularly helpful when we seek to test theoretically uncertain predictions described by phenomenological models. For example, bursts of gravitational waves can be modelled using a superposition of N wavelets where N is itself a free parameter. Short pulses are modelled with small values of N whereas longer, more complicated signals are represented with a large number of wavelets stitched together. Other transdimensional models have found use describing instrumental noise and the population properties of gravitational-wave sources. We provide a few demonstrations of tBilby, including fitting the gravitational-wave signal GW150914 with a superposition of N sine-Gaussian wavelets. We outline our plans to further develop the tbilby code suite for a broader range of transdimensional problems.

Submitted 8 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 12 pages, 7 figures

Report number: LIGO DCC P2400105

arXiv:2404.04264 [pdf, other]

Logic Query of Thoughts: Guiding Large Language Models to Answer Complex Logic Queries with Knowledge Graphs

Authors: Lihui Liu, Zihao Wang, Ruizhong Qiu, Yikun Ban, Eunice Chan, Yangqiu Song, Jingrui He, Hanghang Tong

Abstract: Despite the superb performance in many tasks, large language models (LLMs) bear the risk of generating hallucination or even wrong answers when confronted with tasks that demand the accuracy of knowledge. The issue becomes even more noticeable when addressing logic queries that require multiple logic reasoning steps. On the other hand, knowledge graph (KG) based question answering methods are capable of accurately identifying the correct answers with the help of the knowledge graph, yet their accuracy could quickly deteriorate when the knowledge graph itself is sparse and incomplete. It remains a critical challenge to integrate knowledge graph reasoning with LLMs in a mutually beneficial way so as to mitigate both the hallucination problem of LLMs and the incompleteness issue of knowledge graphs. In this paper, we propose 'Logic-Query-of-Thoughts' (LGOT) which is the first of its kind to combine LLMs with knowledge graph based logic query reasoning. LGOT seamlessly combines knowledge graph reasoning and LLMs, effectively breaking down complex logic queries into easy to answer subquestions. Through the utilization of both knowledge graph reasoning and LLMs, it successfully derives answers for each subquestion. By aggregating these results and selecting the highest quality candidate answers for each step, LGOT achieves accurate results to complex questions. Our experimental findings demonstrate substantial performance enhancements, with up to 20% improvement over ChatGPT.

Submitted 13 April, 2024; v1 submitted 17 March, 2024; originally announced April 2024.

arXiv:2404.00225 [pdf, ps, other]

Heterogeneous Contrastive Learning for Foundation Models and Beyond

Authors: Lecheng Zheng, Baoyu Jing, Zihao Li, Hanghang Tong, Jingrui He

Abstract: In the era of big data and Artificial Intelligence, an emerging paradigm is to utilize contrastive self-supervised learning to model large-scale heterogeneous data. Many existing foundation models benefit from the generalization capability of contrastive self-supervised learning by learning compact and high-quality representations without relying on any label information. Amidst the explosive advancements in foundation models across multiple domains, including natural language processing and computer vision, a thorough survey on heterogeneous contrastive learning for the foundation model is urgently needed. In response, this survey critically evaluates the current landscape of heterogeneous contrastive learning for foundation models, highlighting the open challenges and future trends of contrastive learning. In particular, we first present how the recent advanced contrastive learning-based methods deal with view heterogeneity and how contrastive learning is applied to train and fine-tune the multi-view foundation models. Then, we move to contrastive learning methods for task heterogeneity, including pretraining tasks and downstream tasks, and show how different tasks are combined with contrastive learning loss for different purposes. Finally, we conclude this survey by discussing the open challenges and shedding light on the future directions of contrastive learning.

Submitted 29 March, 2024; originally announced April 2024.

arXiv:2403.14474 [pdf, ps, other]

doi 10.1016/j.scib.2024.05.013

Tensor-force effects on nuclear matter in relativistic ab initio theory

Authors: Sibo Wang, Hui Tong, Chencan Wang, Qiang Zhao, Peter Ring, Jie Meng

Abstract: Within the relativistic Brueckner-Hartree-Fock theory in the full Dirac space, the tensor-force effects on infinite nuclear matter are elucidated by subtracting the matrix elements of tensor forces from the realistic nucleon-nucleon interaction. The tensor-force effects for the binding energy per particle of symmetric nuclear matter (SNM) as well as the symmetry energy are attractive and are more pronounced around the empirical saturation density, while the tensor forces have little impact on the pure neutron matter. By tuning the tensor-force strength, an infinite (negative) scattering length in the spin-triplet channel is found. This locates the dilute SNM with only the $^3S_1$-$^3D_1$ channel interaction at the unitary limit. Its ground-state energy is found proportional to the energy of a free Fermi gas with a scaling factor 0.38, revealing good universal properties. This work paves the way to study the tensor-force effects in neutron stars as well as finite nuclei from realistic nucleon-nucleon interactions, highlights the role of the tensor force on the deviation of the nuclear physics to the unitary limit, and provides valuable reference for studies of the four-component unitary Fermi gas.

Submitted 3 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: 5 pages, 2 figures, discussion on four-component unitary Fermi gas is updated, accepted by Science Bulletin

arXiv:2403.01508 [pdf, other]

Soft Reasoning on Uncertain Knowledge Graphs

Authors: Weizhi Fei, Zihao Wang, Hang Yin, Yang Duan, Hanghang Tong, Yangqiu Song

Abstract: The study of machine learning-based logical query-answering enables reasoning with large-scale and incomplete knowledge graphs. This paper further advances this line of research by considering the uncertainty in the knowledge. The uncertain nature of knowledge is widely observed in the real world, but \textit{does not} align seamlessly with the first-order logic underpinning existing studies. To bridge this gap, we study the setting of soft queries on uncertain knowledge, which is motivated by the establishment of soft constraint programming. We further propose an ML-based approach with both forward inference and backward calibration to answer soft queries on large-scale, incomplete, and uncertain knowledge graphs. Theoretical discussions present that our methods share the same complexity as state-of-the-art inference algorithms for first-order queries. Empirical results justify the superior performance of our approach against previous ML-based methods with number embedding extensions.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 10 pages

arXiv:2403.00782 [pdf, other]

Ploutos: Towards interpretable stock movement prediction with financial large language model

Authors: Hanshuang Tong, Jun Li, Ning Wu, Ming Gong, Dongmei Zhang, Qi Zhang

Abstract: Recent advancements in large language models (LLMs) have opened new pathways for many domains. However, the full potential of LLMs in financial investments remains largely untapped. There are two main challenges for typical deep learning-based methods for quantitative finance. First, they struggle to fuse textual and numerical information flexibly for stock movement prediction. Second, traditional methods lack clarity and interpretability, which impedes their application in scenarios where the justification for predictions is essential. To solve the above challenges, we propose Ploutos, a novel financial LLM framework that consists of PloutosGen and PloutosGPT. PloutosGen contains multiple primary experts that can analyze different modal data, such as text and numbers, and provide quantitative strategies from different perspectives. Then PloutosGPT combines their insights and predictions and generates interpretable rationales. To generate accurate and faithful rationales, the training strategy of PloutosGPT leverages a rearview-mirror prompting mechanism to guide GPT-4 to generate rationales, and a dynamic token weighting mechanism to finetune the LLM by increasing the weight of key tokens. Extensive experiments show our framework outperforms the state-of-the-art methods on both prediction accuracy and interpretability.

Submitted 18 February, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures

arXiv:2402.18272 [pdf, other]

Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key?

Authors: Qineng Wang, Zihao Wang, Ying Su, Hanghang Tong, Yangqiu Song

Abstract: Recent progress in LLMs discussion suggests that multi-agent discussion improves the reasoning abilities of LLMs. In this work, we reevaluate this claim through systematic experiments, where we propose a novel group discussion framework to enrich the set of discussion mechanisms. Interestingly, our results show that a single-agent LLM with strong prompts can achieve almost the same performance as the best existing discussion approach on a wide range of reasoning tasks and backbone LLMs. We observe that the multi-agent discussion performs better than a single agent only when there is no demonstration in the prompt. Further study reveals the common interaction mechanisms of LLMs during the discussion.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 22 pages, 5 figures, 10 tables

arXiv:2402.16387 [pdf, other]

On the Generalization Capability of Temporal Graph Learning Algorithms: Theoretical Insights and a Simpler Method

Authors: Weilin Cong, Jian Kang, Hanghang Tong, Mehrdad Mahdavi

Abstract: Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications, especially in domains where data can be represented as a graph and evolves over time. Although TGL has recently seen notable progress in algorithmic solutions, its theoretical foundations remain largely unexplored. This paper aims at bridging this gap by investigating the generalization ability of different TGL algorithms (e.g., GNN-based, RNN-based, and memory-based methods) under the finite-wide over-parameterized regime. We establish the connection between the generalization error of TGL algorithms and "the number of layers/steps" in the GNN-/RNN-based TGL methods and "the feature-label alignment (FLA) score", where FLA can be used as a proxy for the expressive power and explains the performance of memory-based methods. Guided by our theoretical analysis, we propose Simplified-Temporal-Graph-Network, which enjoys a small generalization error, improved overall performance, and lower model complexity. Extensive experiments on real-world datasets demonstrate the effectiveness of our method. Our theoretical findings and proposed algorithm offer essential insights into TGL from a theoretical standpoint, laying the groundwork for designing practical TGL algorithms in future studies.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.16339 [pdf, ps, other]

doi 10.1103/PhysRevC.109.064603

Microscopic optical potential from the relativistic Brueckner-Hartree-Fock theory: Proton-nucleus scattering

Authors: Pianpian Qin, Sibo Wang, Hui Tong, Qiang Zhao, Chencan Wang, Z. P. Li, Peter Ring

Abstract: A relativistic microscopic optical model potential for nucleon-nucleus scattering is developed based on the \emph{ab initio} relativistic Brueckner-Hartree-Fock (RBHF) theory with the improved local density approximation, which is abbreviated as the RBOM potential. Both real and imaginary parts of the single-particle potentials in symmetric and asymmetric nuclear matter at various densities are determined uniquely in the full Dirac space. The density distributions of the target nuclei are calculated by the covariant energy density functional theory with the density functional PC-PK1. The central and spin-orbit terms of the optical potentials are quantitatively consistent with the relativistic phenomenological optical potentials. The performance of the RBOM potential is evaluated by considering proton scattering with incident energy $E\leq 200$ MeV on five target nuclei, $\prescript{208}{}{\text{Pb}}$, $\prescript{120}{}{\text{Sn}}$, $\prescript{90}{}{\text{Zr}}$, $\prescript{48}{}{\text{Ca}}$, and $\prescript{40}{}{\text{Ca}}$. Scattering observables including the elastic scattering angular distributions, analyzing powers, spin rotation functions, and reaction cross sections are analyzed. Theoretical predictions show good agreements with the experimental data and the results derived from phenomenological optical potentials. We anticipate that the RBOM potential can provide reference for other phenomenological and microscopic optical model potentials, as well as reliable descriptions for nucleon scattering on exotic nuclei in the era of rare-isotope beams.

Submitted 12 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: 36 pages, 20 figures, published as Phys. Rev. C 109, 064603 (2024), Editor's Suggestion

arXiv:2402.12962 [pdf, other]

Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency

Authors: Chunyang Meng, Haogang Tong, Tianyang Wu, Maolin Pan, Yang Yu

Abstract: Autoscaling is a technology that automatically scales the resources provided to applications without human intervention to guarantee runtime Quality of Service (QoS) while saving costs. However, user-facing cloud applications serve dynamic workloads that are often variable and contain bursts, posing challenges to autoscaling for maintaining QoS within Service-Level Objectives (SLOs). Conservative strategies risk over-provisioning, while aggressive ones may cause SLO violations, making it more challenging to design effective autoscaling. This paper introduces BAScaler, a Burst-Aware Autoscaling framework for containerized cloud services or applications under complex workloads, combining multi-level machine learning (ML) techniques to mitigate SLO violations while saving costs. BAScaler incorporates a novel prediction-based burst detection mechanism that distinguishes between predictable periodic workload spikes and actual bursts. When bursts are detected, BAScaler appropriately overestimates them and allocates resources accordingly to address the rapid growth in resource demand. On the other hand, BAScaler employs reinforcement learning to rectify potential inaccuracies in resource estimation, enabling more precise resource allocation during non-bursts. Experiments across ten real-world workloads demonstrate BAScaler's effectiveness, achieving a 57% average reduction in SLO violations and cutting resource costs by 10% compared to other prominent methods.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.01330 [pdf, other]

Video Semantic Communication with Major Object Extraction and Contextual Video Encoding

Authors: Haopeng Li, Haonan Tong, Sihua Wang, Nuocheng Yang, Zhaohui Yang, Changchuan Yin

Abstract: This paper studies an end-to-end video semantic communication system for massive communication. In the considered system, the transmitter must continuously send the video to the receiver to facilitate character reconstruction in immersive applications, such as interactive video conference. However, transmitting the original video information with substantial amounts of data poses a challenge to the limited wireless resources. To address this issue, we reduce the amount of data transmitted by making the transmitter extract and send the semantic information from the video, which refines the major object and the correlation of time and space in the video. Specifically, we first develop a video semantic communication system based on major object extraction (MOE) and contextual video encoding (CVE) to achieve efficient video transmission. Then, we design the MOE and CVE modules with convolutional neural network based motion estimation, contextual extraction and entropy coding. Simulation results show that compared to the traditional coding schemes, the proposed method can reduce the amount of transmitted data by up to 25% while increasing the peak signal-to-noise ratio (PSNR) of the reconstructed video by up to 14%.

Submitted 2 February, 2024; originally announced February 2024.

Comments: 6 pages, 9 figures, accepted by IEEE WCNC wksp 2024

arXiv:2401.16017 [pdf, other]

DMCE: Diffusion Model Channel Enhancer for Multi-User Semantic Communication Systems

Authors: Youcheng Zeng, Xinxin He, Xu Chen, Haonan Tong, Zhaohui Yang, Yijun Guo, Jianjun Hao

Abstract: To achieve continuous massive data transmission with significantly reduced data payload, the users can adopt semantic communication techniques to compress the redundant information by transmitting semantic features instead. However, current works on semantic communication mainly focus on high compression ratio, neglecting the wireless channel effects including dynamic distortion and multi-user interference, which significantly limit the fidelity of semantic communication. To address this, this paper proposes a diffusion model (DM)-based channel enhancer (DMCE) for improving the performance of multi-user semantic communication, with the DM learning the particular data distribution of channel effects on the transmitted semantic features. In the considered system model, multiple users (such as road cameras) transmit semantic features of multi-source data to a receiver by applying the joint source-channel coding (JSCC) techniques, and the receiver fuses the semantic features from multiple users to complete specific tasks. Then, we propose DMCE to enhance the channel state information (CSI) estimation for improving the restoration of the received semantic features. Finally, the fusion results at the receiver are significantly enhanced, demonstrating a robust performance even under low signal-to-noise ratio (SNR) regimes, enabling the generation of effective object segmentation images. Extensive simulation results with a traffic scenario dataset show that the proposed scheme can improve the mean Intersection over Union (mIoU) by more than 25\% at low SNR regimes, compared with the benchmark schemes.

Submitted 29 January, 2024; originally announced January 2024.

Comments: accepted by IEEE ICC 2024

arXiv:2401.14157 [pdf, other]

From ultraluminous X-ray pulsar to supermassive neutron star

Authors: H. Tong

Abstract: The formation of a $2.7\ \rm M_{\odot}$ supermassive neutron star is explored, as the possible companion of PSR J0514--4002E. Magnetars may experience super-Eddington accretion. Observationally they may manifest themselves as ultraluminous X-ray pulsars. We propose that supermassive neutron stars may be formed through ultraluminous X-ray pulsar phase, if the ultraluminous X-ray pulsar phase can last for $10^{5}$--$10^6 \ \rm yr$. The accreted material will also bury the magnetic field of the neutron star. Assuming accretion equilibrium, the final output may be a millisecond supermassive neutron star. In order for the ultraluminous X-ray pulsar phase to last long enough, a magnetic field configuration of the low magnetic field magnetar is required. The mass, magnetic field and rotational evolution of super-Eddington accreting neutron stars are rather robust against different assumptions, although many of the model details are yet to be determined.

Submitted 25 January, 2024; originally announced January 2024.

Comments: 6 pages, submitted

arXiv:2401.10153 [pdf, other]

Importance-Aware Image Segmentation-based Semantic Communication for Autonomous Driving

Authors: Jie Lv, Haonan Tong, Qiang Pan, Zhilong Zhang, Xinxin He, Tao Luo, Changchuan Yin

Abstract: This article studies the problem of image segmentation-based semantic communication in autonomous driving. In real traffic scenes, detecting the key objects (e.g., vehicles, pedestrians and obstacles) is more crucial than detecting other objects to guarantee driving safety. Therefore, we propose a vehicular image segmentation-oriented semantic communication system, termed VIS-SemCom, where image segmentation features of important objects are transmitted to reduce transmission redundancy. First, to accurately extract image semantics, we develop a semantic codec based on Swin Transformer architecture, which expands the perceptual field, thus improving the segmentation accuracy. Next, we propose a multi-scale semantic extraction scheme via assigning the number of Swin Transformer blocks for diverse resolution features, thus highlighting the important objects' accuracy. Furthermore, the importance-aware loss is invoked to emphasize the important objects, and an online hard sample mining (OHEM) strategy is proposed to handle small sample issues in the dataset. Experimental results demonstrate that the proposed VIS-SemCom can achieve a coding gain of nearly 6 dB with a 60% mean intersection over union (mIoU), reduce the transmitted data amount by up to 70% with a 60% mIoU, and improve the segmentation intersection over union (IoU) of important objects by 4%, compared to the traditional transmission scheme.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 10 pages, 8 figures

arXiv:2401.07329 [pdf, other]

Attention-based UNet enabled Lightweight Image Semantic Communication System over Internet of Things

Authors: Guoxin Ma, Haonan Tong, Nuocheng Yang, Changchuan Yin

Abstract: This paper studies the problem of the lightweight image semantic communication system that is deployed on Internet of Things (IoT) devices. In the considered system model, devices must use semantic communication techniques to support user behavior recognition in ultimate video service with high data transmission efficiency. However, it is computationally expensive for IoT devices to deploy semantic codecs due to the complex calculation processes of deep learning (DL) based codec training and inference. To make it affordable for IoT devices to deploy semantic communication systems, we propose an attention-based UNet enabled lightweight image semantic communication (LSSC) system, which achieves low computational complexity and small model size. In particular, we first let the LSSC system train the codec at the edge server to reduce the training computation load on IoT devices. Then, we introduce the convolutional block attention module (CBAM) to extract the image semantic features and decrease the number of downsampling layers, thus reducing the floating-point operations (FLOPs). Finally, we experimentally adjust the structure of the codec and find the optimal number of downsampling layers. Simulation results show that the proposed LSSC system can reduce the semantic codec FLOPs by 14%, and reduce the model size by 55%, with a sacrifice of 3% accuracy, compared to the baseline. Moreover, the proposed scheme can achieve a higher transmission accuracy than the traditional communication scheme in the low channel signal-to-noise ratio (SNR) region.

Submitted 14 January, 2024; originally announced January 2024.

Comments: 6 pages, 6 figures, accepted by IEEE WCNC 2024

arXiv:2401.07318 [pdf, ps, other]

Isospin splitting of the Dirac mass probed by the relativistic Brueckner-Hartree-Fock theory in the full Dirac space

Authors: Pianpian Qin, Qiang Zhao, Hui Tong, Chencan Wang, Sibo Wang

Abstract: The isospin splitting of the Dirac mass obtained with the relativistic Brueckner-Hartree-Fock (RBHF) theory is thoroughly investigated. From the perspective of the full Dirac space, the long-standing controversy between the momentum-independence approximation (MIA) method and the projection method on the isospin splitting of the Dirac mass in asymmetric nuclear matter (ANM) is analyzed in detail. We find that the \textit{assumption procedure} of the MIA method, which assumes that the single-particle potentials are momentum independent, is not a sufficient condition that directly leads to the wrong sign of the isospin splitting of the Dirac mass, while the \textit{extraction procedure} of the MIA method, which extracts the single-particle potentials from the single-particle potential energy, leads to the wrong sign. By approximately solving the set of equations involved in the \textit{extraction procedure}, a formal expression of the Dirac mass is obtained. The wrong isospin splitting of the Dirac mass arises mainly because the \textit{extraction procedure} forcibly assumes the momentum dependence of the single-particle potential energy to be a quadratic form where the strength is solely determined by the constant scalar potential.

Submitted 14 January, 2024; originally announced January 2024.

Comments: 13 pages, 4 figures

arXiv:2401.05984 [pdf, other]

HybridOctree_Hex: Hybrid Octree-Based Adaptive All-Hexahedral Mesh Generation with Jacobian Control

Authors: Hua Tong, Eni Halilaj, Yongjie Jessica Zhang

Abstract: We present a new software package, "HybridOctree_Hex," for adaptive all-hexahedral mesh generation based on hybrid octree and quality improvement with Jacobian control. The proposed HybridOctree_Hex begins by detecting curvatures and narrow regions of the input boundary to identify key surface features and initialize an octree structure. Subsequently, a strongly balanced octree is constructed using the balancing and pairing rules. Inspired by our earlier preliminary hybrid octree-based work, templates are designed to guarantee an all-hexahedral dual mesh generation directly from the strongly balanced octree. With these pre-defined templates, the sophisticated hybrid octree construction step is skipped to achieve an efficient implementation. After that, elements outside and around the boundary are removed to create a core mesh. The boundary points of the core mesh are connected to their corresponding closest points on the surface to fill the buffer zone and build the final mesh. Coupled with smart Laplacian smoothing, HybridOctree_Hex takes advantage of a delicate optimization-based quality improvement method considering geometric fitting, Jacobian and scaled Jacobian, to achieve a minimum scaled Jacobian that is higher than $0.5$. We empirically verify the robustness and efficiency of our method by running the HybridOctree_Hex software on dozens of complex 3D models without any manual intervention or parameter adjustment. We provide the HybridOctree_Hex source code, along with comprehensive results encompassing the input and output files and statistical data in the following repository: https://github.com/CMU-CBML/HybridOctree_Hex.

Submitted 14 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

arXiv:2312.17269 [pdf, other]

Conversational Question Answering with Reformulations over Knowledge Graph

Authors: Lihui Liu, Blaine Hill, Boxin Du, Fei Wang, Hanghang Tong

Abstract: Conversational question answering (ConvQA) over knowledge graphs (KGs) involves answering multi-turn natural language questions about information contained in a KG. State-of-the-art methods of ConvQA often struggle with inexplicit question-answer pairs. These inputs are easy for human beings to understand given a conversation history, but hard for a machine to interpret, which can degrade ConvQA performance. To address this problem, we propose a reinforcement learning (RL) based model, CornNet, which utilizes question reformulations generated by large language models (LLMs) to improve ConvQA performance. CornNet adopts a teacher-student architecture where a teacher model learns question representations using human-written reformulations, and a student model mimics the teacher model's output via reformulations generated by LLMs. The learned question representation is then used by an RL model to locate the correct answer in a KG. Extensive experimental results show that CornNet outperforms state-of-the-art ConvQA models.

Submitted 29 March, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.17264 [pdf, other]

ESGReveal: An LLM-based approach for extracting structured data from ESG reports

Authors: Yi Zou, Mengying Shi, Zhongjie Chen, Zhu Deng, ZongXiong Lei, Zihan Zeng, Shiming Yang, HongXiang Tong, Lei Xiao, Wenwen Zhou

Abstract: ESGReveal is an innovative method proposed for efficiently extracting and analyzing Environmental, Social, and Governance (ESG) data from corporate reports, catering to the critical need for reliable ESG information retrieval. This approach utilizes Large Language Models (LLM) enhanced with Retrieval Augmented Generation (RAG) techniques. The ESGReveal system includes an ESG metadata module for targeted queries, a preprocessing module for assembling databases, and an LLM agent for data extraction. Its efficacy was appraised using ESG reports from 166 companies across various sectors listed on the Hong Kong Stock Exchange in 2022, ensuring comprehensive industry and market capitalization representation. Utilizing ESGReveal unearthed significant insights into ESG reporting with GPT-4, demonstrating an accuracy of 76.9% in data extraction and 83.7% in disclosure analysis, which is an improvement over baseline models. This highlights the framework's capacity to refine ESG data analysis precision. Moreover, it revealed a demand for reinforced ESG disclosures, with environmental and social data disclosures standing at 69.5% and 57.2%, respectively, suggesting a pursuit for more corporate transparency. While current iterations of ESGReveal do not process pictorial information, a functionality intended for future enhancement, the study calls for continued research to further develop and compare the analytical capabilities of various LLMs. In summary, ESGReveal is a stride forward in ESG data processing, offering stakeholders a sophisticated tool to better evaluate and advance corporate sustainability efforts. Its evolution is promising in promoting transparency in corporate reporting and aligning with broader sustainable development aims.

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2312.16776 [pdf, ps, other]

Crystals for set-valued decomposition tableaux

Authors: Eric Marberg, Kam Hung Tong

Abstract: We describe two crystal structures on set-valued decomposition tableaux. These provide the first examples of interesting "$K$-theoretic" crystals on shifted tableaux. Our first crystal is modeled on a similar construction of Monical, Pechenik, and Scrimshaw for semistandard (unshifted) set-valued tableaux. Our second crystal is adapted from the "square root" operators introduced by Yu on the same set. Neither of our shifted crystals is normal, but we conjecture that our second construction is connected with a unique highest weight element. These results lead to partial progress on a conjectural formula of Cho--Ikeda for $K$-theoretic Schur $P$-functions. We also study a new category of "square root crystals" that includes our second construction and Yu's set-valued tableau crystals as examples. We observe that Buch's formula for the coefficients expanding products of symmetric Grothendieck functions has a simple description in terms of the tensor product for this category.

Submitted 10 May, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

Comments: 42 pages, 8 figures; v2: major revision, many corrections and new conjectures; v3: minor corrections

arXiv:2312.15409 [pdf, ps, other]

Primed decomposition tableaux and extended queer crystals

Authors: Eric Marberg, Kam Hung Tong

Abstract: Our previous work introduced a category of extended queer crystals, whose connected normal objects have unique highest weight elements and characters that are Schur $Q$-polynomials. Our initial models for such crystals were based on semistandard shifted tableaux. Here, we introduce a simpler construction using certain "primed" decomposition tableaux, which slightly generalize the decomposition tableaux used in work of Grantcharov et al. This leads to a new, shorter proof of the highest weight properties of the normal subcategory of extended queer crystals. As one application, we identify bijections between two kinds of decomposition tableaux appearing in the literature.

Submitted 7 February, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

Comments: 35 pages, 4 figures; v2: minor corrections, added exposition, new Section 3.6

arXiv:2312.12704 [pdf, other]

doi 10.1038/s42005-024-01692-9

Merging mechanical bound states in the continuum in high-aspect-ratio phononic crystal gratings

Authors: Hao Tong, Shengyan Liu, Kejie Fang

Abstract: Mechanical bound states in the continuum (BICs) present an alternative avenue for developing high-frequency, high-Q mechanical resonators, distinct from the conventional band structure engineering method. While symmetry-protected mechanical BICs have been realized in phononic crystals, the observation of accidental mechanical BICs -- whose existence is independent of mode symmetry and tunable by structural parameters -- has remained elusive. This challenge is primarily attributed to the additional radiation channel introduced by the longitudinal component of elastic waves. Here, we employ a coupled wave theory to predict and experimentally demonstrate mechanical accidental BICs within a high-aspect-ratio gallium arsenide phononic crystal grating. We observe the merging process of accidental BICs with symmetry-protected BICs, resulting in reduced acoustic radiation losses compared to isolated BICs. This finding opens up new possibilities for phonon trapping using BIC-based systems, with potential applications in sensing, transduction, and quantum measurements.

Submitted 19 December, 2023; originally announced December 2023.

Journal ref: Commun Phys 7, 197 (2024)

arXiv:2312.07859 [pdf, other]

Invariant Graph Transformer

Authors: Zhe Xu, Menghai Pan, Yuzhong Chen, Huiyuan Chen, Yuchen Yan, Mahashweta Das, Hanghang Tong

Abstract: Rationale discovery is defined as finding a subset of the input data that maximally supports the prediction of downstream tasks. In the graph machine learning context, graph rationale is defined to locate the critical subgraph in the given graph topology, which fundamentally determines the prediction results. In contrast to the rationale subgraph, the remaining subgraph is named the environment subgraph. Graph rationalization can enhance the model performance as the mapping between the graph rationale and prediction label is viewed as invariant, by assumption. To ensure the discriminative power of the extracted rationale subgraphs, a key technique named "intervention" is applied. The core idea of intervention is that given any changing environment subgraphs, the semantics from the rationale subgraph is invariant, which guarantees the correct prediction result. However, most, if not all, of the existing rationalization works on graph data develop their intervention strategies on the graph level, which is coarse-grained. In this paper, we propose well-tailored intervention strategies on graph data. Our idea is driven by the development of Transformer models, whose self-attention module provides rich interactions between input nodes. Based on the self-attention module, our proposed invariant graph Transformer (IGT) can achieve fine-grained, more specifically, node-level and virtual node-level intervention. Our comprehensive experiments involve 7 real-world datasets, and the proposed IGT shows significant performance advantages compared to 13 baseline methods.

Submitted 15 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.03403 [pdf, ps, other]

Low-momentum relativistic nucleon-nucleon potentials I: Nuclear matter

Authors: Chencan Wang, Sibo Wang, Hui Tong, Jinniu Hu, Jiangming Yao

Abstract: A series of relativistic one-boson-exchange potentials for two-nucleon system, denoted as OBEP$Λ$, is constructed with a momentum cutoff $Λ$ ranging from $\infty$ to 2 fm$^{-1}$. These potentials are developed by simultaneous fitting to nucleon-nucleon ($NN$) scattering phase shifts, low-energy scattering length, effective range, and the binding energy of the deuteron. The momentum-space matrix elements of the low-momentum OBEP$Λ$ ($Λ\leqslant 3$ fm$^{-1}$) demonstrate consistency with the universal behaviors observed in other realistic $NN$ potentials evolved by renormalization group methods. These OBEP$Λ$s are applied to calculate the equation of state of symmetric nuclear matter (SNM) within either the nonrelativistic (NR) Brueckner-Hartree-Fock (BHF) or relativistic Brueckner-Hartree-Fock (RBHF) frameworks. The results show that the saturation properties of SNM are reproduced qualitatively from the RBHF calculation, but not from the NR-BHF calculation. This study highlights the relativistic mechanism in explaining the saturation properties of nuclear matter. The remaining discrepancy in reproducing empirical saturation properties in the RBHF calculation using the OBEP$Λ$s signals the necessity of including three-nucleon correlations or genuine three-nucleon forces.

Submitted 6 December, 2023; originally announced December 2023.

arXiv:2311.02757 [pdf, other]

ELEGANT: Certified Defense on the Fairness of Graph Neural Networks

Authors: Yushun Dong, Binchi Zhang, Hanghang Tong, Jundong Li

Abstract: Graph Neural Networks (GNNs) have emerged as a prominent graph learning model in various graph-based tasks over the years. Nevertheless, due to the vulnerabilities of GNNs, it has been empirically shown that malicious attackers could easily corrupt the fairness level of their predictions by adding perturbations to the input graph data. In this paper, we take crucial steps to study a novel problem of certifiable defense on the fairness level of GNNs. Specifically, we propose a principled framework named ELEGANT and present a detailed theoretical certification analysis for the fairness of GNNs. ELEGANT takes any GNN as its backbone, and the fairness level of such a backbone theoretically cannot be corrupted under certain perturbation budgets for attackers. Notably, ELEGANT does not make any assumption about the GNN structure or parameters, and does not require re-training the GNNs to realize certification. Hence it can serve as a plug-and-play framework for any optimized GNNs ready to be deployed. We verify the satisfactory effectiveness of ELEGANT in practice through extensive experiments on real-world datasets across different backbones of GNNs, where ELEGANT is also demonstrated to be beneficial for GNN debiasing. Open-source code can be found at https://github.com/yushundong/ELEGANT.

Submitted 5 November, 2023; originally announced November 2023.

arXiv:2310.15653 [pdf, other]

Deceptive Fairness Attacks on Graphs via Meta Learning

Authors: Jian Kang, Yinglong Xia, Ross Maciejewski, Jiebo Luo, Hanghang Tong

Abstract: We study deceptive fairness attacks on graphs to answer the following question: How can we achieve poisoning attacks on a graph learning model to exacerbate the bias deceptively? We answer this question via a bi-level optimization problem and propose a meta learning-based framework named FATE. FATE is broadly applicable with respect to various fairness definitions and graph learning models, as well as arbitrary choices of manipulation operations. We further instantiate FATE to attack statistical parity and individual fairness on graph neural networks. We conduct extensive experimental evaluations on real-world datasets in the task of semi-supervised node classification. The experimental results demonstrate that FATE could amplify the bias of graph neural networks with or without fairness consideration while maintaining the utility on the downstream task. We hope this paper provides insights into the adversarial robustness of fair graph learning and can shed light on designing robust and fair graph learning in future studies.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 23 pages, 11 tables

arXiv:2310.07144 [pdf, ps, other]

Rotating vector model and radius-to-frequency mapping in the presence of multipole magnetic field

Authors: J. L. Qiu, H. Tong, H. G. Wang

Abstract: The rotating vector model and radius-to-frequency mapping in the presence of multipole magnetic field in pulsars and magnetars are considered. An axisymmetric potential field is assumed. It is found that: (1) The radiation beam in the case of multipole field is wider than the dipole case. This may account for the increasing pulse width at higher frequency of pulsars (anti-radius-to-frequency mapping). (2) The expression for the polarization position angle is unchanged. Only the inclination angle α and phase constant φ_0 will change. The angle between the rotational axis and line of sight, and the position angle constant ψ_0 will not change. When fitting the varying position angle of magnetars, these constraints should be considered. The appearance and disappearance of multipole field may account for the changing slope of position angle in the radio emitting magnetar Swift J1818.0-1607. Similar but more active process in magnetar magnetospheres may account for the diverse position angle in fast radio bursts.

Submitted 10 October, 2023; originally announced October 2023.

Comments: accepted in ApJ. significantly rewritten compared with the original version

arXiv:2310.04470 [pdf, other]

Hierarchical Multi-Marginal Optimal Transport for Network Alignment

Authors: Zhichen Zeng, Boxin Du, Si Zhang, Yinglong Xia, Zhining Liu, Hanghang Tong

Abstract: Finding node correspondence across networks, namely multi-network alignment, is an essential prerequisite for joint learning on multiple networks. Despite great success in aligning networks in pairs, the literature on multi-network alignment is sparse due to the exponentially growing solution space and lack of high-order discrepancy measures. To fill this gap, we propose a hierarchical multi-marginal optimal transport framework named HOT for multi-network alignment. To handle the large solution space, multiple networks are decomposed into smaller aligned clusters via the fused Gromov-Wasserstein (FGW) barycenter. To depict high-order relationships across multiple networks, the FGW distance is generalized to the multi-marginal setting, based on which networks can be aligned jointly. A fast proximal point method is further developed with guaranteed convergence to a local optimum. Extensive experiments and analysis show that our proposed HOT achieves significant improvements over the state-of-the-art in both effectiveness and scalability.

Submitted 10 February, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: 14 pages, 10 figures

arXiv:2310.03152 [pdf, other]

Towards out-of-distribution generalizable predictions of chemical kinetics properties

Authors: Zihao Wang, Yongqiang Chen, Yang Duan, Weijiang Li, Bo Han, James Cheng, Hanghang Tong

Abstract: Machine Learning (ML) techniques have found applications in estimating chemical kinetic properties. With the accumulated drug molecules identified through "AI4drug discovery", the next imperative lies in AI-driven design for high-throughput chemical synthesis processes, with the estimation of properties of unseen reactions with unexplored molecules. To this end, the existing ML approaches for kinetics property prediction are required to be Out-Of-Distribution (OOD) generalizable. In this paper, we categorize the OOD kinetic property prediction into three levels (structure, condition, and mechanism), revealing unique aspects of such problems. Under this framework, we create comprehensive datasets to benchmark (1) the state-of-the-art ML approaches for reaction prediction in the OOD setting and (2) the state-of-the-art graph OOD methods in kinetics property prediction problems. Our results demonstrated the challenges and opportunities in OOD kinetics property prediction. Our datasets and benchmarks can further support research in this direction.

Submitted 4 December, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023 Workshop in AI for Scientific Discovery: From Theory to Practice. 11 pages, 1 figure, and 5 tables. Data and code can be found at https://github.com/zihao-wang/ReactionOOD

arXiv:2309.13384 [pdf, other]

On the Sweet Spot of Contrastive Views for Knowledge-enhanced Recommendation

Authors: Haibo Ye, Xinjie Li, Yuan Yao, Hanghang Tong

Abstract: In recommender systems, knowledge graph (KG) can offer critical information that is lacking in the original user-item interaction graph (IG). Recent progress has explored this direction and shows that contrastive learning is a promising way to integrate both. However, we observe that existing KG-enhanced recommenders struggle in balancing between the two contrastive views of IG and KG, making them sometimes even less effective than simply applying contrastive learning on IG without using KG. In this paper, we propose a new contrastive learning framework for KG-enhanced recommendation. Specifically, to make full use of the knowledge, we construct two separate contrastive views for KG and IG, and maximize their mutual information; to ease the contrastive learning on the two views, we further fuse KG information into IG in a one-direction manner. Extensive experimental results on three real-world datasets demonstrate the effectiveness and efficiency of our method, compared to the state-of-the-art. Our code is available through the anonymous link: https://figshare.com/articles/conference_contribution/SimKGCL/22783382

Submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.08478 [pdf, other]

Current and future directions in network biology

Authors: Marinka Zitnik, Michelle M. Li, Aydin Wells, Kimberly Glass, Deisy Morselli Gysi, Arjun Krishnan, T. M. Murali, Predrag Radivojac, Sushmita Roy, Anaïs Baudot, Serdar Bozdag, Danny Z. Chen, Lenore Cowen, Kapil Devkota, Anthony Gitter, Sara Gosline, Pengfei Gu, Pietro H. Guzzi, Heng Huang, Meng Jiang, Ziynet Nesibe Kesimoglu, Mehmet Koyuturk, Jian Ma, Alexander R. Pico, Nataša Pržulj, et al. (12 additional authors not shown)

Abstract: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These challenges stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology and highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on the future directions of network biology. Additionally, we offer insights into scientific communities, educational initiatives, and the importance of fostering diversity within the field. This paper establishes a roadmap for an immediate and long-term vision for network biology.

Submitted 11 June, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 52 pages, 6 figures, 1 table

arXiv:2309.05798 [pdf, other]

Enhancing Hyperedge Prediction with Context-Aware Self-Supervised Learning

Authors: Yunyong Ko, Hanghang Tong, Sang-Wook Kim

Abstract: Hypergraphs can naturally model group-wise relations (e.g., a group of users who co-purchase an item) as hyperedges. Hyperedge prediction is to predict future or unobserved hyperedges, which is a fundamental task in many real-world applications (e.g., group recommendation). Despite the recent breakthrough of hyperedge prediction methods, the following challenges have been rarely studied: (C1) How to aggregate the nodes in each hyperedge candidate for accurate hyperedge prediction? and (C2) How to mitigate the inherent data sparsity problem in hyperedge prediction? To tackle both challenges together, in this paper, we propose a novel hyperedge prediction framework (CASH) that employs (1) context-aware node aggregation to precisely capture complex relations among nodes in each hyperedge for (C1) and (2) self-supervised contrastive learning in the context of hyperedge prediction to enhance hypergraph representations for (C2). Furthermore, as for (C2), we propose a hyperedge-aware augmentation method to fully exploit the latent semantics behind the original hypergraph and consider both node-level and group-level contrasts (i.e., dual contrasts) for better node and hyperedge representations. Extensive experiments on six real-world hypergraphs reveal that CASH consistently outperforms all competing methods in terms of the accuracy in hyperedge prediction and each of the proposed strategies is effective in improving the model accuracy of CASH. For the detailed information of CASH, we provide the code and datasets at: https://github.com/yy-ko/cash.

Submitted 11 September, 2023; originally announced September 2023.

Comments: 12 pages, 11 figures

Showing 1–50 of 282 results for author: Tong, H