
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

Authors: Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji

Abstract: Recently, diffusion models have increasingly demonstrated their capabilities in vision understanding. By leveraging prompt-based learning to construct sentences, these models have shown proficiency in classification and visual grounding tasks. However, existing approaches primarily showcase their ability to perform sentence-level localization, leaving the potential for leveraging contextual information for phrase-level understanding largely unexplored. In this paper, we utilize Panoptic Narrative Grounding (PNG) as a proxy task to investigate this capability further. PNG aims to segment object instances mentioned by multiple noun phrases within a given narrative text. Specifically, we introduce the DiffPNG framework, a straightforward yet effective approach that fully capitalizes on the diffusion's architecture for segmentation by decomposing the process into a sequence of localization, segmentation, and refinement steps. The framework initially identifies anchor points using cross-attention mechanisms and subsequently performs segmentation with self-attention to achieve zero-shot PNG. Moreover, we introduce a refinement module based on SAM to enhance the quality of the segmentation masks. Our extensive experiments on the PNG dataset demonstrate that DiffPNG achieves strong performance in the zero-shot PNG task setting, conclusively proving the diffusion model's capability for context-aware, phrase-level understanding. Source code is available at https://github.com/nini0919/DiffPNG.

Submitted 7 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.02787 [pdf]

A versatile quantum microwave photonic signal processing platform based on coincidence window selection technique

Authors: Xinghua Li, Yifan Guo, Xiao Xiang, Runai Quan, Mingtao Cao, Ruifang Dong, Tao Liu, Ming Li, Shougang Zhang

Abstract: Quantum microwave photonics (QMWP) is an innovative approach that combines energy-time entangled biphoton sources as the optical carrier with time-correlated single-photon detection for high-speed RF signal recovery. This groundbreaking method offers unique advantages such as nonlocal RF signal encoding and robust resistance to dispersion-induced frequency fading. This paper explores the versatility of processing the quantum microwave photonic signal by utilizing coincidence window selection on the biphoton coincidence distribution. The demonstration includes finely-tunable RF phase shifting, flexible multi-tap transversal filtering (with up to 15 taps), and photonically implemented RF mixing, leveraging the nonlocal RF mapping characteristic of QMWP. These accomplishments significantly enhance the capability of microwave photonic systems in processing ultra-weak signals, opening up new possibilities for various applications.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.02774 [pdf]

Quantum microwave photonic mixer with a large spurious-free dynamic range

Authors: Xinghua Li, Yifan Guo, Xiao Xiang, Runai Quan, Mingtao Cao, Ruifang Dong, Tao Liu, Ming Li, Shougang Zhang

Abstract: As one of the most fundamental functionalities of microwave photonics, microwave frequency mixing plays an essential role in modern radars and wireless communication systems. However, the commonly utilized intensity modulation in the systems often leads to inadequate spurious-free dynamic range (SFDR) for many sought-after applications. Quantum microwave photonics technique offers a promising solution for improving SFDR in terms of higher-order harmonic distortion. In this paper, we demonstrate two types of quantum microwave photonic mixers based on the configuration of the intensity modulators: cascade-type and parallel-type. Leveraging the nonlocal RF signal encoding capability, both types of quantum microwave photonic mixers not only exhibit the advantage of dual-channel output but also present significant improvement in SFDR. Specifically, the parallel-type quantum microwave photonic mixer achieves a remarkable SFDR value of 113.6 dB·Hz$^{1/2}$, which is 30 dB better than that of the cascade-type quantum microwave photonic mixer. When compared to the classical microwave photonic mixer, this enhancement reaches a notable 53.6 dB at the expense of 8 dB conversion loss. These results highlight the superiority of quantum microwave photonic mixers in the fields of microwave and millimeter-wave systems. Further applying multi-photon frequency entangled sources as optical carriers, the dual-channel microwave frequency conversion capability endowed by the quantum microwave photonic mixer can be extended to enhance the performance of multiple-paths microwave mixing which is essential for radar net systems.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01344 [pdf, other]

Distributionally Robust Performative Optimization

Authors: Zhuangzhuang Jia, Yijie Wang, Roy Dong, Grani A. Hanasusanto

Abstract: In this paper, we propose a general distributionally robust framework for performative optimization, where the selected decision can influence the probabilistic distribution of uncertain parameters. Our framework facilitates safe decision-making in scenarios with incomplete information about the underlying decision-dependent distributions, relying instead on accessible reference distributions. To tackle the challenge of decision-dependent uncertainty, we introduce an algorithm named repeated robust risk minimization. This algorithm decouples the decision variables associated with the ambiguity set from the expected loss, optimizing the latter at each iteration while keeping the former fixed to the previous decision. By leveraging the strong connection between distributionally robust optimization and regularization, we establish a linear convergence rate to a performatively stable point and provide a suboptimality performance guarantee for the proposed algorithm. Finally, we examine the performance of our proposed model through an experimental study in strategic classification.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00733 [pdf, other]

CSPBench: a benchmark and critical evaluation of Crystal Structure Prediction

Authors: Lai Wei, Sadman Sadeed Omee, Rongzhi Dong, Nihang Fu, Yuqi Song, Edirisuriya M. D. Siriwardane, Meiling Xu, Chris Wolverton, Jianjun Hu

Abstract: Crystal structure prediction (CSP) is now increasingly used in discovering novel materials with applications in diverse industries. However, despite decades of development and significant progress in this area, the field lacks a set of well-defined benchmark datasets, quantitative performance metrics, and studies that evaluate its status. We aim to fill this gap by introducing a CSP benchmark suite with 180 test structures along with our recently implemented CSP performance metric set. We benchmark a collection of 13 state-of-the-art (SOTA) CSP algorithms including template-based CSP algorithms, conventional CSP algorithms based on DFT calculations and global search such as CALYPSO, CSP algorithms based on machine learning (ML) potentials and global search, and distance matrix based CSP algorithms. Our results demonstrate that the performance of the current CSP algorithms is far from being satisfactory. Most algorithms cannot even identify the structures with the correct space groups except for the template-based algorithms when applied to test structures with similar templates. We also find that the ML potential based CSP algorithms are now able to achieve competitive performance compared to the DFT-based algorithms. These CSP algorithms' performance is strongly determined by the quality of the neural potentials as well as the global optimization algorithms. Our benchmark suite comes with a comprehensive open-source codebase and 180 well-selected benchmark crystal structures, making it convenient to evaluate the advantages and disadvantages of CSP algorithms from future studies. All the code and benchmark data are available at https://github.com/usccolumbia/cspbenchmark

Submitted 30 June, 2024; originally announced July 2024.

Comments: 26 pages

arXiv:2406.19843 [pdf, other]

First JVLA Radio Observation on PDS70

Authors: Hauyu Baobab Liu, Simon Casassus, Ruobing Dong, Kiyoaki Doi, Jun Hashimoto, Takayuki Muto

Abstract: PDS~70 is a protoplanetary system that hosts two actively accreting gas giants, namely PDS~70b and PDS~70c. The system has a $\sim$60--100 au dusty ring that has been resolved by ALMA, along with circumplanetary disks around the two gas giants. Here we report the first JVLA Q (40--48 GHz), Ka (29--37 GHz), K (18--26 GHz), and X (8--12 GHz) bands continuum observations, and the complementary ALMA Bands 3 ($\sim$98 GHz) and 4 ($\sim$145 GHz) observations towards PDS~70. The dusty ring appears azimuthally asymmetric in our ALMA images. We obtained firm detections at Ka and K bands without spatially resolving the source; we obtained a marginal detection at Q band, and no detection at X band. The spectral indices ($α$) are 5$\pm$1 at 33--44 GHz and 0.6$\pm$0.2 at 22--33 GHz. At 10--22 GHz, the conservative lower limit of $α$ is 1.7. The 33--44 GHz flux density is likely dominated by the optically thin thermal emission of grown dust with $\gtrsim$1 mm maximum grain sizes, which may be associated with the azimuthally asymmetric substructure induced by planet-disk interaction. Since PDS~70 was not detected at X band, we found it hard to explain the low spectral index at 22--33 GHz only with free-free emission. Hence, we attribute the dominant emission at 22--33 GHz to the emission of spinning nanometer-sized dust particles, while free-free emission may partly contribute to emission at this frequency range. In some protoplanetary disks, the emission of spinning nanometer-sized dust particles may resemble the 20--50 GHz excess in the spectra of millimeter-sized dust. The finding of strong continuum emission of spinning nanometer-sized particles can complicate the procedure of constraining the properties of grown dust. Future high-resolution, multi-frequency JVLA/ngVLA and SKA observations may shed light on this issue.

Submitted 28 June, 2024; originally announced June 2024.
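
Note on the spectral indices quoted in the abstract above: a spectral index $α$ is conventionally defined by assuming the flux density scales as a power law in frequency, $F_ν \propto ν^{α}$, so a two-point estimate follows from the ratio of two flux densities. The sketch below only illustrates that standard definition. The function name and the flux values are hypothetical and are not taken from the paper.

```python
import math


def spectral_index(nu1_ghz, flux1, nu2_ghz, flux2):
    """Two-point spectral index alpha, assuming flux is proportional to nu**alpha.

    Frequencies and fluxes may be in any units, as long as each pair
    uses the same unit, because only the ratios enter the formula.
    """
    return math.log(flux2 / flux1) / math.log(nu2_ghz / nu1_ghz)


# Hypothetical flux densities (arbitrary units), NOT values from the paper.
alpha = spectral_index(33.0, 1.0, 44.0, 4.2)
print(f"alpha between 33 and 44 GHz: {alpha:.2f}")
```

A flat spectrum (equal fluxes at both frequencies) gives $α = 0$. Steeper positive values indicate emission rising with frequency, which is the sense in which the abstract contrasts the 33--44 GHz and 22--33 GHz ranges.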

arXiv:2406.16855 [pdf, other]

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Authors: Yuang Peng, Yuxin Cui, Haomiao Tang, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia

Abstract: Personalized image generation holds great promise in assisting humans in everyday work and life due to its impressive function in creatively generating personalized content. However, current evaluations either are automated but misalign with humans or require human evaluations that are time-consuming and expensive. In this work, we present DreamBench++, a human-aligned benchmark automated by advanced multimodal GPT models. Specifically, we systematically design the prompts to let GPT be both human-aligned and self-aligned, empowered with task reinforcement. Further, we construct a comprehensive dataset comprising diverse images and prompts. By benchmarking 7 modern generative models, we demonstrate that DreamBench++ results in significantly more human-aligned evaluation, helping boost the community with innovative findings.

Submitted 24 June, 2024; originally announced June 2024.

Comments: Project page: https://dreambenchplus.github.io/

arXiv:2406.16439 [pdf, other]

Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments

Authors: Shilei Cao, Yan Liu, Juepeng Zheng, Weijia Li, Runmin Dong, Haohuan Fu

Abstract: For real-world applications, neural network models are commonly deployed in dynamic environments, where the distribution of the target domain undergoes temporal changes. Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to test data drawn from a continually changing target domain. Despite recent advancements in addressing CTTA, two critical issues remain: 1) The use of a fixed threshold for pseudo-labeling in existing methodologies leads to the generation of low-quality pseudo-labels, as model confidence varies across categories and domains; 2) While current solutions utilize stochastic parameter restoration to mitigate catastrophic forgetting, their capacity to preserve critical information is undermined by its intrinsic randomness. To tackle these challenges, we present CTAOD, aiming to enhance the performance of detection models in CTTA scenarios. Inspired by prior CTTA works for effective adaptation, CTAOD is founded on the mean-teacher framework, characterized by three core components. Firstly, the object-level contrastive learning module tailored for object detection extracts object-level features using the teacher's region of interest features and optimizes them through contrastive learning. Secondly, the dynamic threshold strategy updates the category-specific threshold based on predicted confidence scores to improve the quality of pseudo-labels. Lastly, we design a data-driven stochastic restoration mechanism to selectively reset inactive parameters using the gradients as weights for a random mask matrix, thereby ensuring the retention of essential knowledge. We demonstrate the effectiveness of our approach on four CTTA tasks for object detection, where CTAOD outperforms existing methods, especially achieving a 3.0 mAP improvement on the Cityscapes-to-Cityscapes-C CTTA task.

Submitted 24 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.11657 [pdf, other]

Can LLM be a Personalized Judge?

Authors: Yijiang River Dong, Tiancheng Hu, Nigel Collier

Abstract: Ensuring that large language models (LLMs) reflect diverse user values and preferences is crucial as their user bases expand globally. It is therefore encouraging to see the growing interest in LLM personalization within the research community. However, current works often rely on the LLM-as-a-Judge approach for evaluation without thoroughly examining its validity. In this paper, we investigate the reliability of LLM-as-a-Personalized-Judge, asking LLMs to judge user preferences based on personas. Our findings suggest that directly applying LLM-as-a-Personalized-Judge is less reliable than previously assumed, showing low and inconsistent agreement with human ground truth. The personas typically used are often overly simplistic, resulting in low predictive power. To address these issues, we introduce verbal uncertainty estimation into the LLM-as-a-Personalized-Judge pipeline, allowing the model to express low confidence on uncertain judgments. This adjustment leads to much higher agreement (above 80%) on high-certainty samples for binary tasks. Through human evaluation, we find that the LLM-as-a-Personalized-Judge achieves comparable performance to third-party human evaluation and even surpasses human performance on high-certainty samples. Our work indicates that certainty-enhanced LLM-as-a-Personalized-Judge offers a promising direction for developing more reliable and scalable methods for evaluating LLM personalization.

Submitted 17 June, 2024; originally announced June 2024.

Comments: Our code is available at https://github.com/dong-river/Personalized-Judge
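
Note on the evaluation idea in the abstract above: reporting agreement only on high-certainty samples amounts to filtering judgments by the judge's stated confidence and then measuring agreement with human labels on what remains. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline. The function name, the numeric certainty scale in [0, 1], the 0.8 threshold, and the sample data are all hypothetical. The paper describes verbal uncertainty, which is assumed here to have been mapped to a number.

```python
def agreement_on_confident(judgments, threshold=0.8):
    """Return (agreement, coverage) for judgments at or above a certainty threshold.

    judgments: list of (llm_choice, human_choice, certainty) tuples, where
    certainty is a hypothetical numeric score in [0, 1].
    agreement is None when no judgment passes the threshold.
    """
    kept = [(llm, human) for llm, human, certainty in judgments
            if certainty >= threshold]
    if not kept:
        return None, 0.0
    agreement = sum(llm == human for llm, human in kept) / len(kept)
    coverage = len(kept) / len(judgments)
    return agreement, coverage


# Hypothetical binary-preference data, NOT results from the paper.
sample = [("A", "A", 0.90), ("B", "A", 0.50), ("B", "B", 0.95), ("A", "B", 0.85)]
agreement, coverage = agreement_on_confident(sample)
print(f"agreement={agreement:.2f}, coverage={coverage:.2f}")
```

Reporting coverage next to agreement matters in a setup like this, since a stricter threshold can raise agreement simply by discarding more samples.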

arXiv:2406.10869 [pdf, other]

Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution

Authors: Cuixin Yang, Rongkang Dong, Jun Xiao, Cong Zhang, Kin-Man Lam, Fei Zhou, Guoping Qiu

Abstract: As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike 2D plain images that are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs, therefore, requires performing equirectangular projection (ERP) to map the ODIs onto a plane. ODI super-resolution needs to take into account geometric distortion resulting from ERP. However, without considering such geometric distortion of ERP images, previous deep-learning-based methods only utilize a limited range of pixels and may easily miss self-similar textures for reconstruction. In this paper, we introduce a novel Geometric Distortion Guided Transformer for Omnidirectional image Super-Resolution (GDGT-OSR). Specifically, a distortion modulated rectangle-window self-attention mechanism, integrated with deformable self-attention, is proposed to better perceive the distortion and thus involve more self-similar textures. Distortion modulation is achieved through a newly devised distortion guidance generator that produces guidance by exploiting the variability of distortion across latitudes. Furthermore, we propose a dynamic feature aggregation scheme to adaptively fuse the features from different self-attention modules. We present extensive experimental results on public datasets and show that the new GDGT-OSR outperforms methods in existing literature.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 13 pages, 12 figures, journal

arXiv:2406.09501 [pdf, other]

Observational characteristics of circum-planetary-mass-object disks in the era of James Webb Space Telescope

Authors: Xilei Sun, Pinghui Huang, Ruobing Dong, Shang-Fei Liu

Abstract: Recent observations have confirmed circumplanetary disks (CPDs) embedded in parental protoplanetary disks (PPDs). On the other hand, planetary-mass companions (PMCs) and planetary-mass objects (PMOs) are likely to harbor their own accretion disks. Unlike PPDs, CPDs and other disks around planet analogues are generally too compact to be spatially resolved by current instrumentation. In this study, we generate over 4,000 spectral energy distributions (SEDs) of circum-PMO-disks (CPMODs) with various host temperature and disk properties, which can be categorized into four prototypes, i.e., full, pre-transitional, transitional and evolved CPMODs. We propose a classification scheme based on their near-to-mid-infrared colors. Using those CPMOD models, we synthesize JWST (NIRCam and MIRI) photometry for F444W, F1000W and F2550W wide filters. We show F444W - F1000W and F444W - F2550W colors can be applied to distinguish different types of CPMODs, especially for those around hot hosts. Our results indicate that the ongoing and future JWST observations are promising to unveil structures and properties of CPMODs.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 18 pages, 7 figures, accepted for publication in ApJ

arXiv:2406.08480 [pdf, other]

Linear equations with monomial constraints and decision problems in abelian-by-cyclic groups

Authors: Ruiwen Dong

Abstract: We show that it is undecidable whether a system of linear equations over the Laurent polynomial ring $\mathbb{Z}[X^{\pm}]$ admits solutions where a specified subset of variables take values in the set of monomials $\{X^z \mid z \in \mathbb{Z}\}$. In particular, we construct a finitely presented $\mathbb{Z}[X^{\pm}]$-module, where it is undecidable whether a linear equation $X^{z_1} \boldsymbol{f}_1 + \cdots + X^{z_n} \boldsymbol{f}_n = \boldsymbol{f}_0$ has solutions $z_1, \ldots, z_n \in \mathbb{Z}$. This contrasts with the decidability of the case $n = 1$, which can be deduced from Noskov's Lemma. We apply this result to settle a number of problems in computational group theory. We show that it is undecidable whether a system of equations has solutions in the wreath product $\mathbb{Z} \wr \mathbb{Z}$, providing a negative answer to an open problem of Kharlampovich, López and Miasnikov (2020). We show that there exists a finitely generated abelian-by-cyclic group in which the problem of solving a single quadratic equation is undecidable. We also construct a finitely generated abelian-by-cyclic group, different to that of Mishchenko and Treier (2017), in which the Knapsack Problem is undecidable. In contrast, we show that the problem of Coset Intersection is decidable in all finitely generated abelian-by-cyclic groups.

Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: Added Theorem 3 compared to previous version. Supersedes arXiv:2309.08811 which contains Theorem 6

arXiv:2406.08079 [pdf, other]

A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder

Authors: Lixian Zhang, Yi Zhao, Runmin Dong, Jinxiao Zhang, Shuai Yuan, Shilei Cao, Mengxuan Chen, Juepeng Zheng, Weijia Li, Wei Liu, Wayne Zhang, Litong Feng, Haohuan Fu

Abstract: Vast amounts of remote sensing (RS) data provide Earth observations across multiple dimensions, encompassing critical spatial, temporal, and spectral information which is essential for addressing global-scale challenges such as land use monitoring, disaster prevention, and environmental change mitigation. Despite various pre-training methods tailored to the characteristics of RS data, a key limitation persists: the inability to effectively integrate spatial, temporal, and spectral information within a single unified model. To unlock the potential of RS data, we construct a Spatial-Temporal-Spectral Structured Dataset (STSSD) characterized by the incorporation of multiple RS sources, diverse coverage, unified locations within image sets, and heterogeneity within images. Building upon this structured dataset, we propose an Anchor-Aware Masked AutoEncoder method (A$^{2}$-MAE), leveraging intrinsic complementary information from the different kinds of images and geo-information to reconstruct the masked patches during the pre-training phase. A$^{2}$-MAE integrates an anchor-aware masking strategy and a geographic encoding module to comprehensively exploit the properties of RS images. Specifically, the proposed anchor-aware masking strategy dynamically adapts the masking process based on the meta-information of a pre-selected anchor image, thereby facilitating the training on images captured by diverse types of RS sources within one model. Furthermore, we propose a geographic encoding method to leverage accurate spatial patterns, enhancing the model generalization capabilities for downstream applications that are generally location-related. Extensive experiments demonstrate our method achieves comprehensive improvements across various downstream tasks compared with existing RS pre-training methods, including image classification, semantic segmentation, and change detection tasks.

Submitted 16 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.03001 [pdf, other]

EdgeSync: Faster Edge-model Updating via Adaptive Continuous Learning for Video Data Drift

Authors: Peng Zhao, Runchu Dong, Guiqin Wang, Cong Zhao

Abstract: Real-time video analytics systems typically place models with fewer weights on edge devices to reduce latency. The distribution of video content features may change over time for various reasons (e.g., changes in light and weather), leading to accuracy degradation of existing models. To solve this problem, recent work proposes a framework that uses a remote server to continually train and adapt the lightweight model at the edge with the help of a complex model. However, existing analytics approaches leave two challenges untouched: first, the retraining task is compute-intensive, resulting in large model update delays; second, the new model may not fit the data distribution of the current video stream well enough. To address these challenges, we present EdgeSync. EdgeSync filters samples by considering both timeliness and inference results, making training samples more relevant to the current video content and reducing the update delay. To improve the quality of training, EdgeSync also includes a training management module that efficiently adjusts the model training time and training order at runtime. Evaluated on real datasets with complex scenes, our method improves by about 3.4% compared to existing methods and by about 10% compared to traditional means.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.00891 [pdf, other]

Global High Categorical Resolution Land Cover Mapping via Weak Supervision

Authors: Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu

Abstract: Land cover information is indispensable for advancing the United Nations' sustainable development goals, and land cover mapping under a more detailed category system would significantly contribute to economic livelihood tracking and environmental degradation measurement. However, the substantial difficulty in acquiring fine-grained training data makes the implementation of this task particularly challenging. Here, we propose to combine a fully labeled source domain and a weakly labeled target domain for weakly supervised domain adaptation (WSDA). This is beneficial as the utilization of sparse and coarse weak labels can considerably alleviate the labor required for precise and detailed land cover annotation. Specifically, we introduce the Prototype-based pseudo-label Rectification and Expansion (PRE) approach, which leverages the prototypes (i.e., the class-wise feature centroids) as the bridge to connect sparse labels and global feature distributions. According to the feature distances to the prototypes, the confidence of pseudo-labels predicted in the unlabeled regions of the target domain is assessed. This confidence is then utilized to guide the dynamic expansion and rectification of pseudo-labels. Based on PRE, we carry out high categorical resolution land cover mapping for 10 cities in different regions around the world, separately using PlanetScope, Gaofen-1, and Sentinel-2 satellite images. In the study areas, we achieve cross-sensor, cross-category, and cross-continent WSDA, with the overall accuracy exceeding 80%.
The promising results indicate that PRE is capable of reducing the dependency of land cover classification on high-quality annotations, thereby improving label efficiency. We expect our work to enable global fine-grained land cover mapping, which in turn promotes Earth observation to provide more precise and thorough information for environmental monitoring.
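
The prototype idea in this abstract (class-wise feature centroids, with pseudo-label confidence judged from feature distances) can be illustrated with a minimal sketch. This is not the authors' PRE implementation: the function names, the Euclidean distance, and the softmax-over-negative-distance confidence are all assumptions chosen for illustration.

```python
import math

def class_prototypes(features, labels):
    """Average the feature vectors of each labeled class (class-wise centroids)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def pseudo_label(vec, prototypes):
    """Return (nearest class, confidence) for one unlabeled feature vector.

    Confidence is a softmax over negative Euclidean distances to the
    prototypes; this scoring rule is an assumption, not taken from the paper.
    """
    dists = {lab: math.dist(vec, proto) for lab, proto in prototypes.items()}
    weights = {lab: math.exp(-d) for lab, d in dists.items()}
    best = min(dists, key=dists.get)
    return best, weights[best] / sum(weights.values())

# Hypothetical sparse labels: three labeled pixels with 2-D features.
protos = class_prototypes([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]],
                          ["water", "water", "urban"])
label, confidence = pseudo_label([1.0, 1.0], protos)
```

A pseudo-label would then be kept or expanded only when `confidence` exceeds some threshold, which is likewise a hypothetical design choice here.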

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.20739 [pdf]

Elucidating the Role of Stacking Faults in TlGaSe$_{2}$ on its Thermoelectric Properties

Authors: Tigran Simonian, Ahin Roy, Akash Bajaj, Rui Dong, Zheng Lei, Zdeněk Sofer, Stefano Sanvito, Valeria Nicolosi

Abstract: Thermoelectric materials are of great interest for heat energy harvesting applications. One such promising material is TlGaSe$_{2}$, a p-type semiconducting ternary chalcogenide. Recent reports show it can be processed as a thin film, opening the door for large-scale commercialization. However, TlGaSe$_{2}$ is prone to stacking faults along the [001] stacking direction and their role in its thermoelectric properties has not been understood to date. Herein, TlGaSe$_{2}$ is investigated via (scanning) transmission electron microscopy and first-principles calculations. Stacking faults are found to be present throughout the material, as density functional theory calculations reveal a lack of preferential stacking order. Electron transport calculations show an enhancement of thermoelectric power factors when stacking faults are present. This implies the presence of stacking faults is key to the material's excellent thermoelectric properties along the [001] stacking direction, which can be further enhanced by doping the material to hole carrier concentrations of approx. 10$^{19}$ cm$^{-3}$.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.19055 [pdf, other]

FUSU: A Multi-temporal-source Land Use Change Segmentation Dataset for Fine-grained Urban Semantic Understanding

Authors: Shuai Yuan, Guancong Lin, Lixian Zhang, Runmin Dong, Jinxiao Zhang, Shuang Chen, Juepeng Zheng, Jie Wang, Haohuan Fu

Abstract: Fine urban change segmentation using multi-temporal remote sensing images is essential for understanding human-environment interactions in urban areas. Although there have been advances in high-quality land cover datasets that reveal the physical features of urban landscapes, the lack of fine-grained land use datasets hinders a deeper understanding of how human activities are distributed across the landscape and the impact of these activities on the environment, thus constraining proper technique development. To address this, we introduce FUSU, the first fine-grained land use change segmentation dataset for Fine-grained Urban Semantic Understanding. FUSU features the most detailed land use classification system to date, with 17 classes and 30 billion pixels of annotations. It includes bi-temporal high-resolution satellite images with 0.2-0.5 m ground sample distance and monthly optical and radar satellite time series, covering 847 km^2 across five urban areas in southern and northern China with different geographical features. The fine-grained land use pixel-wise annotations and high spatial-temporal resolution data provide a robust foundation for developing proper deep learning models to provide contextual insights on human activities and urbanization. To fully leverage FUSU, we propose a unified time-series architecture for both change detection and segmentation. We benchmark FUSU on various methods for several tasks. Dataset and code are available at: https://github.com/yuanshuai0914/FUSU.

Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.17899 [pdf]

doi 10.1002/anie.202300186

Near IR bandgap semiconductive 2D conjugated metal-organic framework with rhombic lattice and high mobility

Authors: Lukas Sporrer, Guojun Zhou, Mingchao Wang, Vasileios Balos, Sergio Revuelta, Kamil Jastrzembski, Markus Loeffler, Petko Petkov, Thomas Heine, Angieszka Kuc, Enrique Canovas, Zhehao Huang, Xinliang Feng, Renhao Dong

Abstract: Two-dimensional conjugated metal-organic frameworks (2D c-MOFs) are emerging as a unique class of 2D electronic materials. However, intrinsically semiconducting 2D c-MOFs with gaps in the Vis-NIR and high charge carrier mobility have been rare. Most of the reported semiconducting 2D c-MOFs are metallic (i.e. gapless), which limits their use in applications where larger band gaps are needed for logic devices. Herein, we design a new D2h-geometric ligand, 2,3,6,7,11,12,15,16-octahydroxyphenanthro(9,10b)triphenylene (OHPTP), and synthesize the first example of a 2D c-MOF single crystal (OHPTP-Cu) with a rhombohedral pore geometry after coordination with copper. The continuous rotation electron diffraction (cRED) analysis unveils the orthorhombic crystal structure at the atomic level with a unique AB layer stacking. The resultant Cu$_{2}$(OHPTP) is a p-type semiconductor with an indirect band gap of about 0.50 eV and exhibits high electrical conductivity of 0.10 S cm$^{-1}$ and high charge carrier mobility of 10.0 cm$^{2}$ V$^{-1}$ s$^{-1}$. Density-functional theory calculations underline the predominant role of the out-of-plane charge transport in this semiquinone-based 2D c-MOF.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 11 pages 5 figures

Journal ref: Angew. Chem. Int. Ed. 2023, 62, e202300186

arXiv:2405.06590 [pdf, other]

Decomposing weather forecasting into advection and convection with neural networks

Authors: Mengxuan Chen, Ziqi Yuan, Jinxiao Zhang, Runmin Dong, Haohuan Fu

Abstract: Operational weather forecasting models have advanced for decades on both the explicit numerical solvers and the empirical physical parameterization schemes. However, the high computational costs and uncertainties of these existing schemes call for improvements through alternative machine learning methods. Previous works use a unified model to learn the dynamics and physics of the atmospheric model. In contrast, we propose a simple yet effective machine learning model that learns the horizontal movement in the dynamical core and vertical movement in the physical parameterization separately. By replacing the advection with a graph attention network and the convection with a multi-layer perceptron, our model provides a new and efficient perspective to simulate the transition of variables in atmospheric models. We also assess the model's performance over a 5-day iterative forecast. Under the same input variables and training methods, our model outperforms existing data-driven methods with a significantly reduced number of parameters at a resolution of 5.625 deg. Overall, this work aims to contribute to the ongoing efforts that leverage machine learning techniques for improving both the accuracy and efficiency of global weather forecasting.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2404.19307 [pdf, other]

Enhancing GUI Exploration Coverage of Android Apps with Deep Link-Integrated Monkey

Authors: Han Hu, Han Wang, Ruiqi Dong, Xiao Chen, Chunyang Chen

Abstract: Mobile apps are ubiquitous in our daily lives for supporting different tasks such as reading and chatting. Despite the availability of many GUI testing tools, app testers still struggle with low testing code coverage due to tools frequently getting stuck in loops or overlooking activities with concealed entries. This results in a significant amount of testing time being spent on redundant and repetitive exploration of a few GUI pages. To address this, we utilize Android's deep links, which assist in triggering Android intents to lead users to specific pages, and introduce a deep link-enhanced exploration method. This approach, integrated into the testing tool Monkey, gives rise to Delm (Deep Link-enhanced Monkey). Delm oversees the dynamic exploration process, guiding the tool out of meaningless testing loops to unexplored GUI pages. We provide a rigorous activity context mock-up approach for triggering existing Android intents to discover more activities with hidden entrances. We conduct experiments to evaluate Delm's effectiveness on activity context mock-up, activity coverage, method coverage, and crash detection. The findings reveal that Delm can mock up more complex activity contexts and significantly outperform state-of-the-art baselines with 27.2% activity coverage, 21.13% method coverage, and 23.81% crash detection.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.18686 [pdf]

Dynamic temperature compensation for wavelength-stable entangled biphoton generation

Authors: Yuting Liu, Huibo Hong, Xiao Xiang, Runai Quan, Tao Liu, Mingtao Cao, Shougang Zhang, Ruifang Dong

Abstract: A dynamic temperature compensation method is presented to stabilize the wavelength of the entangled biphoton source, which is generated via spontaneous parametric down-conversion in a MgO:PPLN waveguide. Utilizing the dispersive Fourier transformation technique combined with a digital proportional-integral-differential algorithm, small wavelength variations can be instantly identified and then compensated with active temperature correction. The long-term wavelength stability, assessed through Allan deviation, shows nearly a hundredfold enhancement, reaching $2.00\times10^{-7}$ at the averaging time of 10000 s. It offers a simple, ready-to-use solution for precise wavelength control in quantum information processing.
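
The feedback step named in this abstract is a digital proportional-integral-differential loop. A minimal textbook sketch follows; the class name, gains, sampling interval, and wavelength values are hypothetical, and the mapping from controller output to an actual temperature correction is not specified in the abstract and is not modelled here.

```python
class DigitalPID:
    """Discrete PID controller (positional form) with a fixed sample time."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # running sum of error * dt
        self.prev_error = None   # no derivative term on the first sample

    def update(self, setpoint, measured):
        """Return the correction for one new measurement."""
        error = setpoint - measured
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Hypothetical values: target wavelength 1550.0 nm, two successive readings.
pid = DigitalPID(kp=2.0, ki=0.5, kd=0.1, dt=1.0)
first = pid.update(1550.0, 1549.0)
second = pid.update(1550.0, 1549.5)
```

In a setup like the one described, the measured value would come from the wavelength readout and the output would drive the waveguide temperature; both couplings are assumptions of this sketch.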

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.15357 [pdf]

On-liquid-gallium surface synthesis of ultra-smooth conductive metal-organic framework thin films

Authors: Jinxin Liu, Yunxu Chen, Xing Huang, Yanhan Ren, Mike Hambsch, David Bodesheim, Darius Pohl, Xiaodong Li, Marielle Deconinck, Bowen Zhang, Markus Löffler, Zhongquan Liao, Fengxiang Zhao, Arezoo Dianat, Gianaurelio Cuniberti, Yana Vaynzof, Junfeng Gao, Jingcheng Hao, Stefan C. B. Mannsfeld, Xinliang Feng, Renhao Dong

Abstract: Conductive metal-organic frameworks (MOFs) are emerging electroactive materials for (opto-)electronics. However, it remains a great challenge to achieve reliable MOF-based devices via the existing synthesis methods that are compatible with the complementary metal-oxide-semiconductor technology, as the surface roughness of MOF films or pellets synthesized thus far is rather high for efficient electrode contact. Here, we develop an on-liquid-gallium surface synthesis (OLGSS) strategy under chemical vapor deposition (CVD) conditions for the controlled growth of two-dimensional conjugated MOF (2D c-MOF) thin films with ten-fold improvement of surface flatness (surface roughness can reach as low as ~2 Å) compared with MOF films grown by the traditional methods. Supported by theoretical modeling, we unveil a layer-by-layer CVD growth mode for constructing flat surfaces, which is triggered by the high adhesion energy between gallium (Ga) and planar aromatic ligands. We further demonstrate the generality of the proposed OLGSS strategy by reproducing such a flat surface over nine different 2D c-MOF films with variable thicknesses (~2 to 208 nm) and large lateral sizes (over 1 cm$^{2}$). The resultant ultra-smooth 2D c-MOF films enable the formation of high-quality electrical contacts with gold (Au) electrodes, leading to a reduction of contact resistance by over ten orders of magnitude compared to the traditional uneven MOF films.
Furthermore, due to the efficient interfacial interaction benefiting from the high-quality contacts, the prepared van der Waals heterostructure (vdWH) of OLGSS c-MOF and MoS2 exhibits intriguing photoluminescence (PL) enhancement, PL peak shift and large work function modulation. The establishment of the reliable OLGSS method offers an opportunity to advance the development of MOF electronics and the construction of multicomponent MOF-based heterostructure materials.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.15297 [pdf, ps, other]

Multi-stream Transmission for Directional Modulation Network via Distributed Multi-UAV-aided Multi-active-IRS

Authors: Ke Yang, Rongen Dong, Wei Gao, Feng Shu, Weiping Shi, Yan Wang, Xuehui Wang, Jiangzhou Wang

Abstract: Active intelligent reflecting surface (IRS) is a revolutionary technique for the future 6G networks. The conventional far-field single-IRS-aided directional modulation (DM) networks have only one (no direct path) or two (existing direct path) degrees of freedom (DoFs). This means that only one or two streams are transmitted simultaneously from the base station to the user, which seriously limits the rate gain achieved by the IRS. How can more than two DoFs be created for DM? In this paper, a single large-scale IRS is divided into multiple small IRSs and a novel multi-IRS-aided multi-stream DM network is proposed to achieve a point-to-point multi-stream transmission by creating $K$ ($\geq3$) DoFs, where multiple small IRSs are placed distributively via multiple unmanned aerial vehicles (UAVs). The null-space projection, zero-forcing (ZF) and phase alignment are adopted to design the transmit beamforming vector, receive beamforming vector and phase shift matrix (PSM), respectively, called NSP-ZF-PA. Here, $K$ PSMs and their corresponding beamforming vectors are independently optimized. The weighted minimum mean-square error (WMMSE) algorithm is involved in alternating iteration for the optimization variables by introducing the power constraint on IRS, named WMMSE-PC, where the majorization-minimization (MM) algorithm is used to solve the total PSM. To achieve a lower computational complexity, a maximum trace method, called Max-TR-SVD, is proposed by optimizing the PSM of all IRSs.
Numerical simulation results show that the proposed NSP-ZF-PA performs much better than Max-TR-SVD in terms of rate. In particular, the rate of NSP-ZF-PA with sixteen small IRSs is about five times that of NSP-ZF-PA with all small IRSs combined as a single large IRS. Thus, a dramatic rate enhancement may be achieved by multiple distributed IRSs.

Submitted 28 April, 2024; v1 submitted 26 March, 2024; originally announced April 2024.

arXiv:2404.13032 [pdf, other]

The James Webb Interferometer: Space-based interferometric detections of PDS 70 b and c at 4.8 $μ$m

Authors: Dori Blakely, Doug Johnstone, Gabriele Cugno, Anand Sivaramakrishnan, Peter Tuthill, Ruobing Dong, Benjamin J. S. Pope, Loïc Albert, Max Charles, Rachel A. Cooper, Matthew De Furio, Louis Desdoigts, René Doyon, Logan Francis, Alexandra Z. Greenbaum, David Lafrenière, James P. Lloyd, Michael R. Meyer, Laurent Pueyo, Shrishmoy Ray, Joel Sánchez-Bermúdez, Anthony Soulain, Deepashri Thatte, Thomas Vandal

Abstract: We observed the planet-hosting system PDS 70 with the James Webb Interferometer, JWST's Aperture Masking Interferometric (AMI) mode within NIRISS. Observing with the F480M filter centered at 4.8 $μ$m, we simultaneously fit a geometric model to the outer disk and the two known planetary companions. We re-detect the protoplanets PDS 70 b and c at an SNR of 21 and 11, respectively. Our photometry of both PDS 70 b and c provides evidence for circumplanetary disk emission through fitting SED models to these new measurements and those found in the literature. We also newly detect emission within the disk gap at an SNR of $\sim$4, at a position angle of $207^{+11}_{-10}$ degrees, and an unconstrained separation within $\sim$200 mas. Follow-up observations will be needed to determine the nature of this emission. We place a 5$σ$ upper limit of $Δ$mag = 7.56 on the contrast of the candidate PDS 70 d at 4.8 $μ$m, which indicates that if the previously observed emission at shorter wavelengths is due to a planet, this putative planet has a different atmospheric composition than PDS 70 b or c. Finally, we place upper limits on emission from any additional planets in the disk gap. We find an azimuthally averaged 5$σ$ upper limit of $Δ$mag $\approx$ 7.5 at separations greater than 125 mas. These are the deepest limits to date within $\sim$250 mas at 4.8 $μ$m and the first space-based interferometric observations of this system.

Submitted 19 April, 2024; originally announced April 2024.

Comments: Submitted to ApJ

arXiv:2404.04810 [pdf, other]

AlphaCrystal-II: Distance matrix based crystal structure prediction using deep learning

Authors: Yuqi Song, Rongzhi Dong, Lai Wei, Qin Li, Jianjun Hu

Abstract: Computational prediction of stable crystal structures has a profound impact on the large-scale discovery of novel functional materials. However, predicting the crystal structure solely from a material's composition or formula is a promising yet challenging task, as traditional ab initio crystal structure prediction (CSP) methods rely on time-consuming global searches and first-principles free energy calculations. Inspired by the recent success of deep learning approaches in protein structure prediction, which utilize pairwise amino acid interactions to describe 3D structures, we present AlphaCrystal-II, a novel knowledge-based solution that exploits the abundant inter-atomic interaction patterns found in existing known crystal structures. AlphaCrystal-II predicts the atomic distance matrix of a target crystal material and employs this matrix to reconstruct its 3D crystal structure. By leveraging the wealth of inter-atomic relationships of known crystal structures, our approach demonstrates remarkable effectiveness and reliability in structure prediction through comprehensive experiments. This work highlights the potential of data-driven methods in accelerating the discovery and design of new materials with tailored properties.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 16 pages

arXiv:2404.02264 [pdf, other]

The Identity Problem in virtually solvable matrix groups over algebraic numbers

Authors: Corentin Bodart, Ruiwen Dong

Abstract: The Tits alternative states that a finitely generated matrix group either contains a nonabelian free subgroup $F_2$, or it is virtually solvable. This paper considers two decision problems in virtually solvable matrix groups: the Identity Problem (does a given finitely generated subsemigroup contain the identity matrix?), and the Group Problem (is a given finitely generated subsemigroup a group?). We show that both problems are decidable in virtually solvable matrix groups over the field of algebraic numbers $\overline{\mathbb{Q}}$. Our proof also extends the decidability result for nilpotent groups by Bodart, Ciobanu, Metcalfe and Shaffrir, and the decidability result for metabelian groups by Dong (STOC'24). Since the Identity Problem and the Group Problem are known to be undecidable in matrix groups containing $F_2 \times F_2$, our result significantly reduces the decidability gap for both decision problems.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.17460 [pdf, other]

Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

Authors: Runmin Dong, Shuai Yuan, Bin Luo, Mengxuan Chen, Jinxiao Zhang, Lixian Zhang, Weijia Li, Juepeng Zheng, Haohuan Fu

Abstract: Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer at large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resolution images, but effectively utilizing reference images within these models remains an area for further exploration. Furthermore, content fidelity is difficult to guarantee in areas without relevant reference information. To solve these issues, we propose a change-aware diffusion model named Ref-Diff for RefSR, using the land cover change priors to guide the denoising process explicitly. Specifically, we inject the priors into the denoising model to improve the utilization of reference information in unchanged areas and regulate the reconstruction of semantically relevant content in changed areas. With this powerful guidance, we decouple the semantics-guided denoising and reference texture-guided denoising processes to improve the model performance. Extensive experiments demonstrate the superior effectiveness and robustness of the proposed method compared with state-of-the-art RefSR methods in both quantitative and qualitative evaluations. The code and data are available at https://github.com/dongrunmin/RefDiff.

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR2024

arXiv:2403.15440 [pdf, other]

Linguistics from a topological viewpoint

Authors: Rui Dong

Abstract: Typological databases in linguistics are usually categorical-valued. As a result, it is difficult to have a clear visualization of the data. In this paper, we describe a workflow to analyze the topological shapes of South American languages by applying the multiple correspondence analysis technique and topological data analysis methods.

Submitted 16 March, 2024; originally announced March 2024.

Comments: 14 pages, 17 figures

MSC Class: 91F20; 62R40; 55N31; 55U10

arXiv:2403.01072 [pdf, ps, other]

Distribution-Free Guarantees for Systems with Decision-Dependent Noise

Authors: Heling Zhang, Lillian J. Ratliff, Roy Dong

Abstract: In many real-world dynamical systems, obtaining precise models of system uncertainty remains a challenge. It may be difficult to estimate noise distributions or robustness bounds, especially when the distributions/robustness bounds vary with different control inputs in unknown ways. Addressing this challenge, this paper presents a novel iterative method tailored for systems with decision-dependent noise without prior knowledge of the distributions. Our approach finds the open-loop control law that minimizes the worst-case loss, given that the noise induced by this control lies in its $(1 - p)$-confidence set for a predetermined $p$. At each iteration, we use a quantile method inspired by conformal prediction to empirically estimate the confidence set shaped by the preceding control law. These derived confidence sets offer distribution-free guarantees on the system's noise, guiding a robust control formulation that targets worst-case loss minimization. Under specific regularity conditions, our method is shown to converge to a near-optimal open-loop control. While our focus is on open-loop controls, the adaptive, data-driven nature of our approach suggests its potential applicability across diverse scenarios and extensions.
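
The abstract refers to a quantile method inspired by conformal prediction for estimating a $(1 - p)$-confidence set. As a hedged illustration only, the standard split-conformal quantile is sketched below; it is not the paper's algorithm. The function name is hypothetical, and summarizing each noise sample by a scalar score (for example its norm) is an assumption of this sketch.

```python
import math

def conformal_radius(noise_scores, p):
    """Split-conformal estimate of the radius of a (1 - p)-confidence set.

    Given n exchangeable calibration scores, the ceil((n + 1) * (1 - p))-th
    smallest score bounds a fresh score with probability at least 1 - p.
    If that rank exceeds n, no finite bound is available.
    """
    n = len(noise_scores)
    rank = math.ceil((n + 1) * (1 - p))
    if rank > n:
        return math.inf
    return sorted(noise_scores)[rank - 1]

# Hypothetical noise norms observed under the current control law.
radius = conformal_radius([3, 1, 2, 5, 4, 9, 7, 8, 6], p=0.5)
```

In an iterative scheme like the one described, such a radius would be re-estimated from fresh samples each time the control law changes, since the noise depends on the decision.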

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2403.00908 [pdf, other]

doi 10.17909/erkx-v276

JWST/NIRCam Imaging of Young Stellar Objects III: Detailed Imaging of the Nebular Environment Around the HL Tau Disk

Authors: Camryn Mullin, Ruobing Dong, Jarron Leisenring, Gabriele Cugno, Thomas Greene, Doug Johnstone, Michael R. Meyer, Kevin R. Wagner, Schuyler G. Wolff, Martha Boyer, Scott Horner, Klaus Hodapp, Don McCarthy, George Rieke, Marcia Rieke, Erick Young

Abstract: As part of the James Webb Space Telescope (JWST) Guaranteed Time Observation (GTO) program "Direct Imaging of YSOs" (program ID 1179), we use JWST NIRCam's direct imaging mode in F187N, F200W, F405N, and F410M to perform high contrast observations of the circumstellar structures surrounding the protostar HL Tau. The data reveal the known stellar envelope, outflow cavity, and streamers, but do not detect any companion candidates. We detect scattered light from an in-flowing spiral streamer previously detected in $\textrm{HCO}^+$ by ALMA, and part of the structure connected to the c-shaped outflow cavity. For detection limits in planet mass we use BEX evolutionary tracks when $M_\textrm{p}<2M_\textrm{J}$ and AMES-COND evolutionary tracks otherwise, assuming a planet age of 1 Myr (youngest available age). Inside the disk region, due to extended envelope emission, our point-source sensitivities are $\sim5$ mJy ($37~M_{\rm J}$) at 40 AU in F187N, and $\sim0.37$ mJy ($5.2~M_{\rm J}$) at 140 AU in F405N. Outside the disk region, the deepest limits we can reach are $\sim0.01$ mJy ($0.75~M_{\rm J}$) at a projected separation of $\sim525$ AU.

Submitted 1 March, 2024; originally announced March 2024.

Comments: 13 pages, 6 figures, 2 tables, accepted to AAS Astronomical Journal

arXiv:2402.17766 [pdf, other]

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

Authors: Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma

Abstract: This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring a universal 3D object understanding with 3D point clouds and languages. ShapeLLM is built upon an improved 3D encoder by extending ReCon to ReCon++ that benefits from multi-view image distillation for enhanced geometry understanding. By utilizing ReCon++ as the 3D point cloud input encoder for LLMs, ShapeLLM is trained on constructed instruction-following data and tested on our newly human-curated benchmark, 3D MM-Vet. ReCon++ and ShapeLLM achieve state-of-the-art performance in 3D geometry understanding and language-unified 3D interaction tasks, such as embodied visual grounding. Project page: https://qizekun.github.io/shapellm/

Submitted 12 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: Accepted at ECCV 2024

arXiv:2402.15659 [pdf, other]

DeepLight: Reconstructing High-Resolution Observations of Nighttime Light With Multi-Modal Remote Sensing Data

Authors: Lixian Zhang, Runmin Dong, Shuai Yuan, Jinxiao Zhang, Mengxuan Chen, Juepeng Zheng, Haohuan Fu

Abstract: Nighttime light (NTL) remote sensing observation serves as a unique proxy for quantitatively assessing progress toward meeting a series of Sustainable Development Goals (SDGs), such as poverty estimation, urban sustainable development, and carbon emission. However, existing NTL observations often suffer from pervasive degradation and inconsistency, limiting their utility for computing the indicators defined by the SDGs. In this study, we propose a novel approach to reconstruct high-resolution NTL images using multi-modal remote sensing data. To support this research endeavor, we introduce DeepLightMD, a comprehensive dataset comprising data from five heterogeneous sensors, offering fine spatial resolution and rich spectral information at a national scale. Additionally, we present DeepLightSR, a calibration-aware method for building bridges between spatially heterogeneous modality data in the multi-modality super-resolution. DeepLightSR integrates calibration-aware alignment, an auxiliary-to-main multi-modality fusion, and an auxiliary-embedded refinement to effectively address spatial heterogeneity, fuse diversely representative features, and enhance performance in $8\times$ super-resolution (SR) tasks. Extensive experiments demonstrate the superiority of DeepLightSR over 8 competing methods, as evidenced by improvements in PSNR (2.01 dB $ \sim $ 13.25 dB) and PIQE (0.49 $ \sim $ 9.32). Our findings underscore the practical significance of our proposed dataset and model in reconstructing high-resolution NTL data, supporting efficiently and quantitatively assessing the SDG progress.

Submitted 23 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: This paper has been accepted in IJCAI 2024

arXiv:2402.10052 [pdf, other]

Unmemorization in Large Language Models via Self-Distillation and Deliberate Imagination

Authors: Yijiang River Dong, Hongzhou Lin, Mikhail Belkin, Ramon Huerta, Ivan Vulić

Abstract: While displaying impressive generation capabilities across many tasks, Large Language Models (LLMs) still struggle with crucial issues of privacy violation and unwanted exposure of sensitive data. This raises an essential question: how should we prevent such undesired behavior of LLMs while maintaining their strong generation and natural language understanding (NLU) capabilities? In this work, we introduce a novel approach termed deliberate imagination in the context of LLM unlearning. Instead of trying to forget memorized data, we employ a self-distillation framework, guiding LLMs to deliberately imagine alternative scenarios. As demonstrated in a wide range of experiments, the proposed method not only effectively unlearns targeted text but also preserves the LLMs' capabilities in open-ended generation tasks as well as in NLU tasks. Our results demonstrate the usefulness of this approach across different models and sizes, and also with parameter-efficient fine-tuning, offering a novel pathway to addressing the challenges with private and sensitive data in LLM applications.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.09688 [pdf, ps, other]

A System-Level Dynamic Binary Translator using Automatically-Learned Translation Rules

Authors: Jinhu Jiang, Chaoyi Liang, Rongchao Dong, Zhaohui Yang, Zhongjun Zhou, Wenwen Wang, Pen-Chung Yew, Weihua Zhang

Abstract: System-level emulators have been used extensively for system design, debugging and evaluation. They work by providing a system-level virtual machine to support a guest operating system (OS) running on a platform with the same or different native OS that uses the same or different instruction-set architecture. For such system-level emulation, dynamic binary translation (DBT) is one of the core technologies. A recently proposed learning-based DBT approach has shown a significantly improved performance with a higher quality of translated code using automatically learned translation rules. However, it has only been applied to user-level emulation, and not yet to system-level emulation. In this paper, we explore the feasibility of applying this approach to improve system-level emulation, and use QEMU to build a prototype. ... To achieve better performance, we leverage several optimizations that include coordination overhead reduction to reduce the overhead of each coordination, and coordination elimination and code scheduling to reduce the coordination frequency. Experimental results show that it can achieve an average of 1.36X speedup over QEMU 6.1 with negligible coordination overhead in the system emulation mode using SPEC CINT2006 as application benchmarks and 1.15X on real-world applications.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 10 pages, 19 figures, to be published in International Symposium on Code Generation and Optimization (CGO) 2024

arXiv:2402.02900 [pdf, other]

Forming localized dust concentrations in a dust ring: DM Tau case study

Authors: Hauyu Baobab Liu, Takayuki Muto, Mihoko Konishi, Chia-Ying Chung, Jun Hashimoto, Kiyoaki Doi, Ruobing Dong, Tomoyuki Kudo, Yasuhiro Hasegawa, Yuka Terada, Akimasa Kataoka

Abstract: The previous, high angular resolution 225 GHz ($\sim$1.3 mm) continuum observations on the transitional disk DM Tau have resolved an outer ring at 20-120 au radii that is weakly azimuthally asymmetric. We aimed to examine dust growth and filtration in the outer ring. We performed the $\sim$0$''$.06 ($\sim$8.7 au) resolution Karl G. Jansky Very Large Array (JVLA) 40-48 GHz ($\sim$7 mm; Q band) continuum observations and the complementary observations at lower frequencies. In addition, we analyzed the archival JVLA observations that were taken since 2010. Intriguingly, the Q band image resolved the azimuthally highly asymmetric, knotty dust emission sources close to the inner edge of the outer ring. Fitting the 8-700 GHz spectral energy distribution (SED) with two dust components indicates that the maximum grain size in these knotty dust emission sources is likely $\gtrsim$300 $μ$m while it is $\lesssim$50 $μ$m in the rest of the ring. These results may be explained by trapping of inward migrating grown dust close to the ring inner edge. The exact mechanism for developing the azimuthal asymmetry has not yet been identified, which may be due to planet-disk interaction that might also be responsible for the creation of the dust cavity and pressure bump, or the fluid instabilities and vortex formation due to shear motions. Finally, we remark that the asymmetries in DM Tau are hard to diagnose from the $\gtrsim$225 GHz observations owing to a high optical depth at the ring. In other words, the apparent symmetric or asymmetric morphology of the transitional disks may be related to the optical depths of those disks at the observing frequency.

Submitted 13 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: 16 pages, 7 figures; accepted to A&A

arXiv:2401.16298 [pdf, other]

Breaking the Barrier: Selective Uncertainty-based Active Learning for Medical Image Segmentation

Authors: Siteng Ma, Haochang Wu, Aonghus Lawlor, Ruihai Dong

Abstract: Active learning (AL) has found wide applications in medical image segmentation, aiming to alleviate the annotation workload and enhance performance. Conventional uncertainty-based AL methods, such as entropy and Bayesian, often rely on an aggregate of all pixel-level metrics. However, in imbalanced settings, these methods tend to neglect the significance of target regions, e.g., lesions and tumors. Moreover, uncertainty-based selection introduces redundancy. These factors lead to unsatisfactory performance, and in many cases, these methods even underperform random sampling. To solve this problem, we introduce a novel approach called the Selective Uncertainty-based AL, avoiding the conventional practice of summing up the metrics of all pixels. Through a filtering process, our strategy prioritizes pixels within target areas and those near decision boundaries. This resolves the aforementioned disregard for target areas and redundancy. Our method showed substantial improvements across five different uncertainty-based methods and two distinct datasets, utilizing fewer labeled data to reach the supervised baseline and consistently achieving the highest overall performance. Our code is available at https://github.com/HelenMa9998/Selective\_Uncertainty\_AL.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.09841 [pdf, ps, other]

Interior Schauder estimates for Stokes systems in non-divergence form

Authors: Rong Dong, Dongsheng Li, Lihe Wang

Abstract: The global Schauder estimates for Stokes systems are established by Solonnikov [15] and [16] while the interior ones may fail generally from Serrin's counterexample (cf. [14]). Nevertheless, this paper obtains interior $C^{2,α}$ estimates for velocity and interior $C^{1,α}$ estimates for pressure in spatial direction. Furthermore, the $C^{α, \fracα2}$ estimate is attained for derivatives of curl of velocity. The estimates for velocity can be achieved pointwisely. The results are sharp and surprising since no continuity in time variable is assumed for the coefficients and the righthand side terms.

Submitted 18 January, 2024; originally announced January 2024.

Comments: arXiv admin note: text overlap with arXiv:2304.03529

arXiv:2401.09099 [pdf]

Structural Reinforcement in Mechanically Interlocked Two-Dimensional Polymers by Suppressing Interlayer Sliding

Authors: Ye Yang, André Knapp, David Bodesheim, Alexander Croy, Mike Hambsch, Chandrasekhar Naisa, Darius Pohl, Bernd Rellinghaus, Changsheng Zhao, Stefan C. B. Mannsfeld, Gianaurelio Cuniberti, Zhiyong Wang, Renhao Dong, Andreas Fery, Xinliang Feng

Abstract: Preserving the superior mechanical properties of monolayer two-dimensional (2D) materials when transitioning to bilayer and layer-stacked structures poses a great challenge, primarily arising from the weak van der Waals (vdW) forces that facilitate interlayer sliding and decoupling. Here, we discover that mechanically interlocked 2D polymers (2DPs) offer a means for structural reinforcement from monolayer to bilayer. Incorporating macrocyclic molecules with one and two cavities into 2DP backbones enables the precision synthesis of mechanically interlocked monolayer (MI-M2DP) and bilayer (MI-B2DP). Intriguingly, we have observed an exceptionally high effective Young's modulus of 222.4 GPa for MI-B2DP, surpassing those of MI-M2DP (130.1 GPa), vdW-stacked MI-M2DPs (2 MI-M2DP, 8.1 GPa) and other reported multilayer 2DPs. Modeling studies demonstrate the extraordinary effectiveness of mechanically interlocked structures in minimizing interlayer sliding (~0.1 Å) and energy penalty (320 kcal/mol) in MI-B2DP compared to 2 MI-M2DP (~1.2 Å, 550 kcal/mol), thereby suppressing mechanical relaxation and resulting in prominent structural reinforcement.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.08032 [pdf, other]

Structure-based out-of-distribution (OOD) materials property prediction: a benchmark study

Authors: Sadman Sadeed Omee, Nihang Fu, Rongzhi Dong, Ming Hu, Jianjun Hu

Abstract: In real-world material research, machine learning (ML) models are usually expected to predict and discover novel exceptional materials that deviate from the known materials. It is thus a pressing question to provide an objective evaluation of ML model performances in property prediction of out-of-distribution (OOD) materials that are different from the training set distribution. Traditional performance evaluation of materials property prediction models through random splitting of the dataset frequently results in artificially high performance assessments due to the inherent redundancy of typical material datasets. Here we present a comprehensive benchmark study of structure-based graph neural networks (GNNs) for extrapolative OOD materials property prediction. We formulate five different categories of OOD ML problems for three benchmark datasets from the MatBench study. Our extensive experiments show that current state-of-the-art GNN algorithms significantly underperform for the OOD property prediction tasks on average compared to their baselines in the MatBench study, demonstrating a crucial generalization gap in realistic material prediction tasks. We further examine the latent physical spaces of these GNN models and identify the sources of CGCNN, ALIGNN, and DeeperGATGNN's significantly more robust OOD performance than those of the current best models in the MatBench study (coGN and coNGN), and provide insights to improve their performance.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 21 pages

arXiv:2401.02834 [pdf, other]

JWST/NIRCam Imaging of Young Stellar Objects. II. Deep Constraints on Giant Planets and a Planet Candidate Outside of the Spiral Disk Around SAO 206462

Authors: Gabriele Cugno, Jarron Leisenring, Kevin R. Wagner, Camryn Mullin, Roubing Dong, Thomas Greene, Doug Johnstone, Michael R. Meyer, Schuyler G. Wolff, Charles Beichman, Martha Boyer, Scott Horner, Klaus Hodapp, Doug Kelly, Don McCarthy, Thomas Roellig, George Rieke, Marcia Rieke, John Stansberry, Erick Young

Abstract: We present JWST/NIRCam F187N, F200W, F405N and F410M direct imaging data of the disk surrounding SAO 206462. Previous images show a very structured disk, with a pair of spiral arms thought to be launched by one or more external perturbers. The spiral features are visible in three of the four filters, with the non-detection in F410M due to the large detector saturation radius. We detect with a signal-to-noise ratio of 4.4 a companion candidate (CC1) that, if on a coplanar circular orbit, would orbit SAO 206462 at a separation of $\sim300$ au, $2.25σ$ away from the predicted separation for the driver of the eastern spiral. According to the BEX models, CC1 has a mass of $M_\mathrm{CC1}=0.8\pm0.3~M_\mathrm{J}$. No other companion candidates were detected. At the location predicted by simulations of both spirals generated by a single massive companion, the NIRCam data exclude objects more massive than $\sim2.2~M_\mathrm{J}$ assuming the BEX evolutionary models. In terms of temperatures, the data are sensitive to objects with $T_{\text{eff}}\sim650-850$ K, when assuming planets emit like blackbodies ($R_\mathrm{p}$ between 1 and $3 R_\mathrm{J}$). From these results, we conclude that if the spirals are driven by gas giants, these must be either cold or embedded in circumplanetary material. In addition, the NIRCam data provide tight constraints on ongoing accretion processes. In the low extinction scenario we are sensitive to mass accretion rates of the order $\dot{M}\sim10^{-9} M_\mathrm{J}$ yr$^{-1}$. Thanks to the longer wavelengths used to search for emission lines, we reach unprecedented sensitivities to processes with $\dot{M}\sim10^{-7} M_\mathrm{J}$ yr$^{-1}$ even towards highly extincted environments ($A_\mathrm{V}\approx50$~mag).

Submitted 5 January, 2024; originally announced January 2024.

Comments: 18 pages, 8 figures, 3 tables

arXiv:2401.02830 [pdf, other]

JWST/NIRCam Imaging of Young Stellar Objects. I. Constraints on Planets Exterior to The Spiral Disk Around MWC 758

Authors: Kevin Wagner, Jarron Leisenring, Gabriele Cugno, Camryn Mullin, Ruobing Dong, Schuyler G. Wolff, Thomas Greene, Doug Johnstone, Michael R. Meyer, Charles Beichman, Martha Boyer, Scott Horner, Klaus Hodapp, Doug Kelly, Don McCarthy, Tom Roellig, George Rieke, Marcia Rieke, Michael Sitko, John Stansberry, Erick Young

Abstract: MWC 758 is a young star hosting a spiral protoplanetary disk. The spirals are likely companion-driven, and two candidate companions have previously been identified -- one at the end of the Southern spiral arm at ~0.6 arcsec, and one interior to the gap at ~0.1 arcsec. With JWST/NIRCam, we provide new images of the disk and constraints on planets exterior to ~1". We detect the two-armed spiral disk, a known background star, and a spatially resolved background galaxy, but no clear companions. The candidates that have been reported are at separations that are not probed by our data with sensitivity sufficient to detect them -- nevertheless, these observations place new limits on companions down to ~2 Jupiter masses at ~150 au and ~0.5 Jupiter masses at ~600 au. Owing to the unprecedented sensitivity of JWST and youth of the target, these are among the deepest mass-detection limits yet obtained through direct imaging observations, and provide new insights into the system's dynamical nature.

Submitted 5 January, 2024; originally announced January 2024.

Comments: Accepted for publication in AJ

arXiv:2401.02004 [pdf, other]

Shadowing in the protoplanetary disk of ZZ Tau IRS with HST

Authors: Jun Hashimoto, Ruobing Dong, Takayuki Muto, Hauyu Baobab Liu, Yuka Terada

Abstract: An inner component misaligned from an outer component in a protoplanetary disk can result in the former casting shadows on the latter. We present a new instance of shadowing on the outer disk around a very low mass star, ZZ~Tau~IRS. Through the analysis of near-infrared (NIR) archival data at $λ=1.6$~$μ$m acquired with the Wide Field Camera 3 on the Hubble Space Telescope, we identified brightness asymmetries in the top and bottom halves of the highly inclined outer disk, separated by a dark lane. The brighter sides in the top and bottom halves are on the opposite sides, which we attributed to shadows cast by a misaligned inner disk. Radiative transfer modeling of the system with a misaligned angle of 15~deg between the inner and outer disks well reproduced the observations. Additionally, we found an elevated brightness temperature of $^{12}$CO~(3-2) at $r\sim30$~au on the brighter side in NIR wavelengths in the top half disk, which can be explained by the shadowing effect too. While the origin of the misaligned inner disk remains unclear, future monitoring observations to search for temporal variations in brightness asymmetries will likely provide useful clues.

Submitted 3 January, 2024; originally announced January 2024.

Comments: 12 pages, 8 figures, accepted in AJ

arXiv:2312.15504 [pdf, ps, other]

Power Allocation and Beamforming Design for IRS-aided Secure Directional Modulation Network

Authors: Rongen Dong, Feng Shu, Fuhui Zhou, Yongpeng Wu, Jiangzhou Wang

Abstract: With the aim of boosting the security of the conventional directional modulation (DM) network, a secure DM network assisted by intelligent reflecting surface (IRS) is investigated in this paper. To maximize the secrecy rate (SR), we jointly optimize the power allocation (PA) factor, confidential message (CM) beamforming, artificial noise (AN) beamforming, and IRS reflected beamforming. To tackle the formulated problem, a maximizing SR with high-performance (Max-SR-HP) scheme is proposed, where the PA factor, CM beamforming, AN beamforming, and IRS phase shift matrix are derived by the derivative operation, generalized Rayleigh-Ritz, generalized power iteration, and semidefinite relaxation criteria, respectively. Given the high complexity of the above scheme, a maximizing SR with low-complexity (Max-SR-LC) scheme is proposed, which employs the generalized leakage and successive convex approximation algorithms to derive the variables. Simulation results show that both the proposed schemes can significantly boost the SR performance, and are better than the equal PA, no IRS and random phase shift IRS schemes.

Submitted 4 March, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

arXiv:2312.10463 [pdf, other]

RecPrompt: A Prompt Tuning Framework for News Recommendation Using Large Language Models

Authors: Dairui Liu, Boming Yang, Honghui Du, Derek Greene, Aonghus Lawlor, Ruihai Dong, Irene Li

Abstract: In the evolving field of personalized news recommendation, understanding the semantics of the underlying data is crucial. Large Language Models (LLMs) like GPT-4 have shown promising performance in understanding natural language. However, the extent of their applicability in news recommendation systems remains to be validated. This paper introduces RecPrompt, the first framework for news recommendation that leverages the capabilities of LLMs through prompt engineering. This system incorporates a prompt optimizer that applies an iterative bootstrapping process, enhancing the LLM-based recommender's ability to align news content with user preferences and interests more effectively. Moreover, this study offers insights into the effective use of LLMs in news recommendation, emphasizing both the advantages and the challenges of incorporating LLMs into recommendation systems.

Submitted 16 December, 2023; originally announced December 2023.

Comments: 8 pages, 3 figures, and 8 tables

arXiv:2312.08096 [pdf, other]

An Incentive Mechanism for Federated Learning Based on Multiple Resource Exchange

Authors: Ruonan Dong, Hui Xu, Han Zhang, GuoPeng Zhang

Abstract: Federated Learning (FL) is a distributed machine learning paradigm that addresses privacy concerns in machine learning and still guarantees high test accuracy. However, achieving the necessary accuracy by having all clients participate in FL is impractical, given the constraints of client local computing resources. In this paper, we introduce a multi-user collaborative computing framework, categorizing users into two roles: model owners (MOs) and data owners (DOs). Without resorting to monetary incentives, an MO can encourage more DOs to join in FL by allowing the DOs to offload extra local computing tasks to the MO for execution. This exchange of "data" for "computing resources" streamlines the incentives for clients to engage more effectively in FL. We formulate the interaction between MO and DOs as an optimization problem, and the objective is to effectively utilize the communication and computing resources of the MO and DOs to minimize the time to complete an FL task. The proposed problem is a mixed integer nonlinear programming (MINLP) problem with high computational complexity. We first decompose it into two distinct subproblems, namely the client selection problem and the resource allocation problem, to segregate the integer variables from the continuous variables. Then, an effective iterative algorithm is proposed to solve the problem. Simulation results demonstrate that the proposed collaborative computing framework can achieve an accuracy of more than 95\% while minimizing the overall time to complete an FL task.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2311.14599 [pdf, other]

A Uniform Analysis of Debris Disks with the Gemini Planet Imager I: An Empirical Search for Perturbations from Planetary Companions in Polarized Light Images

Authors: Katie A. Crotts, Brenda C. Matthews, Gaspard Duchêne, Thomas M. Esposito, Ruobing Dong, Justin Hom, Rebecca Oppenheimer, Malena Rice, Schuyler G. Wolff, Christine H. Chen, Clarissa R. Do Ó, Paul Kalas, Briley L. Lewis, Alycia J. Weinberger, David J. Wilner, Mark Ammons, Pauline Arriaga, Robert J. De Rosa, John H. Debes, Michael P. Fitzgerald, Eileen C. Gonzales, Dean C. Hines, Sasha Hinkley, A. Meredith Hughes, Ludmilla Kolokolova , et al. (15 additional authors not shown)

Abstract: The Gemini Planet Imager (GPI) has excelled in imaging debris disks in the near-infrared. The GPI Exoplanet Survey (GPIES) imaged twenty-four debris disks in polarized $H$-band light, while other programs observed half of these disks in polarized $J$- and/or $K1$-bands. Using these data, we present a uniform analysis of the morphology of each disk to find asymmetries suggestive of perturbations, particularly those due to planet-disk interactions. The multi-wavelength surface brightness, the disk color and geometry permit identification of any asymmetries such as warps or disk offsets from the central star. We find that nineteen of the disks in this sample exhibit asymmetries in surface brightness, disk color, disk geometry, or a combination of the three, suggesting that for this sample, perturbations, as seen in scattered light, are common. The relationship between these perturbations and potential planets in the system is discussed. We also explore correlations among stellar temperatures, ages, disk properties, and observed perturbations. We find significant trends between the vertical aspect ratio and the stellar temperature, disk radial extent, and the dust grain size distribution power-law, $q$. We also confirm a trend between the disk color and stellar effective temperature, where the disk becomes increasingly red/neutral with increasing temperature. Such results have important implications for the evolution of debris disk systems around stars of various spectral types.

Submitted 24 November, 2023; originally announced November 2023.

Comments: 46 pages, 20 figures, 6 tables, accepted for publication in ApJ

arXiv:2311.08164 [pdf, other]

Full characterization of biphotons with a generalized quantum interferometer

Authors: Baihong Li, Changhua Chen, Boxin Yuan, Xiaofei Zhang, Ruifang Dong, Shougang Zhang, Rui-Bo Jin

Abstract: Entangled photons (biphotons) in the time-frequency degree of freedom play a crucial role in both foundational physics and advanced quantum technologies. Fully characterizing them poses a key scientific challenge. Here, we propose a theoretical approach to achieving the complete tomography of biphotons by introducing a frequency shift in one arm of the combination interferometer. Our method, a generalized combination interferometer, enables the reconstruction of the full complex joint spectral amplitude associated with both frequency sum and difference in a single interferometer. In contrast, the generalized Hong-Ou-Mandel and N00N state interferometers only allow for the partial tomography of biphotons, either in frequency difference or frequency sum. This provides an alternative method for full characterization of an arbitrary two-photon state with exchange symmetry and holds potential for applications in high-dimensional quantum information processing.

Submitted 20 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: 14 pages, 3 figures

arXiv:2311.06515 [pdf]

Tunable interfacial chemisorption with atomic-level precision in a graphene WSe2 heterostructure

Authors: Mo-Han Zhang, Fei Gao, Aleksander Bach Lorentzen, Ya-Ning Ren, Ruo-Han Zhang, Xiao-Feng Zhou, Rui Dong, Shi-Wu Gao, Mads Brandbyge, Lin He

Abstract: It has long been an ultimate goal to introduce chemical doping at the atomic level to precisely tune the properties of materials. Two-dimensional materials have a natural advantage because of their highly exposed surface atoms; however, it is still a grand challenge to achieve this goal experimentally. Here, we demonstrate the ability to introduce chemical doping in graphene with atomic-level precision by controlling chemical adsorption of individual Se atoms, which are extracted from the underlying WSe2, at the interface of graphene-WSe2 heterostructures. Our scanning tunneling microscopy (STM) measurements, combined with first-principles calculations, reveal that individual Se atoms can be chemisorbed at three possible positions in graphene, which generate distinct pseudospin-mediated atomic-scale vortices in graphene. We demonstrate that the chemisorbed positions of individual Se atoms can be manipulated by the STM tip, which enables us to achieve atomic-scale control of the quantum interference of the pseudospin-mediated vortices in graphene. This result offers the promise of controlling properties of materials through chemical doping with atomic-level precision.

Submitted 11 November, 2023; originally announced November 2023.

arXiv:2311.03705 [pdf, other]

Efficient Bottom-Up Synthesis for Programs with Local Variables

Authors: Xiang Li, Xiangyu Zhou, Rui Dong, Yihong Zhang, Xinyu Wang

Abstract: We propose a new synthesis algorithm that can efficiently search programs with local variables (e.g., those introduced by lambdas). Prior bottom-up synthesis algorithms are not able to evaluate programs with free local variables, and therefore cannot effectively reduce the search space of such programs (e.g., using standard observational equivalence reduction techniques), making synthesis slow. Our algorithm can reduce the space of programs with local variables. The key idea, dubbed lifted interpretation, is to lift up the program interpretation process, from evaluating one program at a time to simultaneously evaluating all programs from a grammar. Lifted interpretation provides a mechanism to systematically enumerate all binding contexts for local variables, thereby enabling us to evaluate and reduce the space of programs with local variables. Our ideas are instantiated in the domain of web automation. The resulting tool, Arborist, can automate a significantly broader range of challenging tasks more efficiently than state-of-the-art techniques including WebRobot and Helena.

Submitted 6 November, 2023; originally announced November 2023.

Comments: Accepted to POPL 2024

arXiv:2311.03682 [pdf, ps, other]

Incentive Design for Eco-driving in Urban Transportation Networks

Authors: M. Umar B. Niazi, Jung-Hoon Cho, Munther A. Dahleh, Roy Dong, Cathy Wu

Abstract: Eco-driving emerges as a cost-effective and efficient strategy to mitigate greenhouse gas emissions in urban transportation networks. Acknowledging the persuasive influence of incentives in shaping driver behavior, this paper presents the `eco-planner,' a digital platform devised to promote eco-driving practices in urban transportation. At the outset of their trips, users provide the platform with their trip details and travel time preferences, enabling the eco-planner to formulate personalized eco-driving recommendations and corresponding incentives, while adhering to its budgetary constraints. Upon trip completion, incentives are transferred to users who comply with the recommendations and effectively reduce their emissions. By comparing our proposed incentive mechanism with a baseline scheme that offers uniform incentives to all users, we demonstrate that our approach achieves superior emission reductions and increased user compliance with a smaller budget.

Submitted 16 May, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Showing 1–50 of 424 results for author: Dong, R