
Online Time-Informed Kinodynamic Motion Planning of Nonlinear Systems

Authors: Fei Meng, Jianbang Liu, Haojie Shi, Han Ma, Hongliang Ren, Max Q.-H. Meng

Abstract: Sampling-based kinodynamic motion planners (SKMPs) are powerful in finding collision-free trajectories for high-dimensional systems under differential constraints. Time-informed set (TIS) can provide the heuristic search domain to accelerate their convergence to the time-optimal solution. However, existing TIS approximation methods suffer from the curse of dimensionality, computational burden, and limited system applicable scope, e.g., linear and polynomial nonlinear systems. To overcome these problems, we propose a method by leveraging deep learning technology, Koopman operator theory, and random set theory. Specifically, we propose a Deep Invertible Koopman operator with control U model named DIKU to predict states forward and backward over a long horizon by modifying the auxiliary network with an invertible neural network. A sampling-based approach, ASKU, performing reachability analysis for the DIKU is developed to approximate the TIS of nonlinear control systems online. Furthermore, we design an online time-informed SKMP using a direct sampling technique to draw uniform random samples in the TIS. Simulation experiment results demonstrate that our method outperforms other existing works, approximating TIS in near real-time and achieving superior planning performance in several time-optimal kinodynamic motion planning problems.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02894 [pdf, other]

Translatotron-V(ision): An End-to-End Model for In-Image Machine Translation

Authors: Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su

Abstract: In-image machine translation (IIMT) aims to translate an image containing texts in source language into an image containing translations in target language. In this regard, conventional cascaded methods suffer from issues such as error propagation, massive parameters, and difficulties in deployment and retaining visual characteristics of the input image. Thus, constructing end-to-end models has become an option, which, however, faces two main challenges: 1) the huge modeling burden, as it is required to simultaneously learn alignment across languages and preserve the visual characteristics of the input image; 2) the difficulties of directly predicting excessively lengthy pixel sequences. In this paper, we propose Translatotron-V(ision), an end-to-end IIMT model consisting of four modules. In addition to an image encoder and an image decoder, our model contains a target text decoder and an image tokenizer. Among them, the target text decoder is used to alleviate the language alignment burden, and the image tokenizer converts long sequences of pixels into shorter sequences of visual tokens, preventing the model from focusing on low-level visual features. Besides, we present a two-stage training framework for our model to assist the model in learning alignment across modalities and languages. Finally, we propose a location-aware evaluation metric called Structure-BLEU to assess the translation quality of the generated images. Experimental results demonstrate that our model achieves competitive performance compared to cascaded models with only 70.9% of parameters, and significantly outperforms the pixel-level end-to-end IIMT model.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Accepted to ACL 2024 Findings

arXiv:2407.00102 [pdf, other]

Curriculum Learning with Quality-Driven Data Selection

Authors: Biao Wu, Fang Meng, Ling Chen

Abstract: The impressive multimodal capabilities demonstrated by OpenAI's GPT-4 have generated significant interest in the development of Multimodal Large Language Models (MLLMs). Visual instruction tuning of MLLMs with machine-generated instruction-following data has been shown to enhance zero-shot capabilities across various tasks. However, there has been limited exploration into controlling the quality of the instruction data. Current methodologies for data selection in MLLMs often rely on single, unreliable scores or use downstream tasks for selection, which is time-consuming and can lead to potential overfitting on the chosen evaluation datasets. To mitigate these limitations, we propose a novel data selection methodology that utilizes image-text correlation and model perplexity to evaluate and select data of varying quality. This approach leverages the distinct distribution of these two attributes, mapping data quality into a two-dimensional space that allows for the selection of data based on their location within this distribution. By utilizing this space, we can analyze the impact of task type settings, used as prompts, on data quality. Additionally, this space can be used to construct multi-stage subsets of varying quality to facilitate curriculum learning. Our research includes comprehensive experiments conducted on various datasets. The results emphasize substantial enhancements in five commonly assessed capabilities compared to using the complete dataset. Our codes, data, and models are publicly available at: https://anonymous.4open.science/r/EHIT-31B4

Submitted 27 June, 2024; originally announced July 2024.

arXiv:2406.18769 [pdf]

Subharmonic oscillations in the Floquet circuit with the frequency-synthesis dimension

Authors: Bo Lv, Shiyun Xia, Ye Tian, Ting Liu, Hongyang Mu, Zhichao Shen, Sijie Wang, Zheng Zhu, Huibin Tao, Fanyi Meng, Jinhui Shi

Abstract: The period-doubling oscillation emerges with the coexistence between zero and π modes in Floquet topological insulator. Here, utilizing the flexibility of the circuit, we construct the Floquet circuit with frequency-synthetic dimension and find the topological-protected deeply-subharmonic oscillations with the period extensively exceeding the doubling-driven period. In the construction framework, the periodically-driven mechanism is attained by implementing the circuit-oscillator hierarchy with the stepping-variation resonances in frequency domain. The zero and π modes that arise at the Floquet band in the circuit indicate the anomalous boundary-bulk correspondence. The coexistence of zero and π modes results in a subharmonic oscillation with the extremely-low frequency on the edge of the Floquet circuit. Furthermore, we explore the Floquet band with the enhanced periodically-driven strength tailored by the component flexibility of the circuit. Our method provides a flexible scheme to study Floquet topological phases, and opens a new path for realizing the deeply subwavelength system.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 31 pages, 10 figures

arXiv:2406.17006 [pdf, other]

Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1094 additional authors not shown)

Abstract: The radiative decays $χ_{c1}(3872)\rightarrow ψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the nature of the $χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an integrated luminosity of 9 fb$^{-1}$. Using the $B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and the ratio of its partial width to that of the $χ_{c1}(3872)\rightarrow J/ψγ$ decay is measured to be $$ \frac{Γ_{χ_{c1}(3872)\rightarrow ψ(2S)γ}} {Γ_{χ_{c1}(3872)\rightarrow J/ψγ}} = 1.67 \pm 0.21 \pm 0.12 \pm 0.04 , $$ where the first uncertainty is statistical, the second systematic and the third is due to the uncertainties on the branching fractions of the $ψ(2S)$ and $J/ψ$ mesons. The measured ratio makes the interpretation of the $χ_{c1}(3872)$ state as a pure $D^0\bar{D}^{*0}+\bar{D}^0D^{*0}$ molecule questionable and strongly indicates a sizeable compact charmonium or tetraquark component within the $χ_{c1}(3872)$ state.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

Report number: LHCb-PAPER-2024-015, CERN-EP-2024-157
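
The three uncertainties quoted on the partial-width ratio in the abstract above can be folded into one number for a quick comparison. The snippet below is a minimal sketch that assumes the statistical, systematic, and branching-fraction components are independent, so that they add in quadrature; the abstract does not state how the collaboration combines them, and the function name is purely illustrative.

```python
import math


def combine_in_quadrature(*uncertainties):
    """Root-sum-square of uncertainty components (assumes they are independent)."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Central value and uncertainty components as quoted in the abstract.
ratio = 1.67
stat, syst, branching = 0.21, 0.12, 0.04

# Assumption: the three components are uncorrelated.
total = combine_in_quadrature(stat, syst, branching)

print(f"ratio = {ratio:.2f} +/- {total:.2f} (combined, assuming independence)")
```

Under that independence assumption the combined uncertainty is about 0.25, dominated by the statistical term.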

arXiv:2406.16536 [pdf, other]

C-LLM: Learn to Check Chinese Spelling Errors Character by Character

Authors: Kunting Li, Yong Hu, Liang He, Fandong Meng, Jie Zhou

Abstract: Chinese Spell Checking (CSC) aims to detect and correct spelling errors in sentences. Although Large Language Models (LLMs) exhibit robust capabilities and are widely applied in various tasks, their performance on CSC is often unsatisfactory. We find that LLMs fail to meet the Chinese character-level constraints of the CSC task, namely equal length and phonetic similarity, leading to a performance bottleneck. Further analysis reveals that this issue stems from the granularity of tokenization, as current mixed character-word tokenization struggles to satisfy these character-level constraints. To address this issue, we propose C-LLM, a Large Language Model-based Chinese Spell Checking method that learns to check errors Character by Character. Character-level tokenization enables the model to learn character-level alignment, effectively mitigating issues related to character-level constraints. Furthermore, CSC is simplified to replication-dominated and substitution-supplemented tasks. Experiments on two CSC benchmarks demonstrate that C-LLM achieves an average improvement of 10% over existing methods. Specifically, it shows a 2.1% improvement in general scenarios and a significant 12% improvement in vertical domain scenarios, establishing state-of-the-art performance. The source code can be accessed at https://github.com/ktlKTL/C-LLM.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16416 [pdf, other]

Multilingual Knowledge Editing with Language-Agnostic Factual Neurons

Authors: Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou

Abstract: Multilingual knowledge editing (MKE) aims to simultaneously revise factual knowledge across multilingual languages within large language models (LLMs). However, most existing MKE methods just adapt existing monolingual editing methods to multilingual scenarios, overlooking the deep semantic connections of the same factual knowledge between different languages, thereby limiting edit performance. To address this issue, we first investigate how LLMs represent multilingual factual knowledge and discover that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons. These neurons represent the semantic connections between multilingual knowledge and are mainly located in certain layers. Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge. Specifically, we first generate a set of paraphrases for each multilingual knowledge to be edited to precisely locate the corresponding language-agnostic factual neurons. Then we optimize the update values for modifying these located neurons to achieve simultaneous modification of the same factual knowledge in multiple languages. Experimental results on Bi-ZsRE and MzsRE benchmarks demonstrate that our method outperforms existing MKE methods and achieves remarkable edit performance, indicating the importance of considering the semantic connections among multilingual knowledge.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 12 pages, 4 figures, 7 tables

arXiv:2406.13979 [pdf, other]

Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning

Authors: Yupei Zhang, Xiaofei Wang, Fangliangzi Meng, Jin Tang, Chao Li

Abstract: Multi-modal learning plays a crucial role in cancer diagnosis and prognosis. Current deep learning based multi-modal approaches are often limited by their abilities to model the complex correlations between genomics and histology data, addressing the intrinsic complexity of tumour ecosystem where both tumour and microenvironment contribute to malignancy. We propose a biologically interpretative and robust multi-modal learning framework to efficiently integrate histology images and genomics by decomposing the feature subspace of histology images and genomics, reflecting distinct tumour and microenvironment features. To enhance cross-modal interactions, we design a knowledge-driven subspace fusion scheme, consisting of a cross-modal deformable attention module and a gene-guided consistency strategy. Additionally, in pursuit of dynamically optimizing the subspace knowledge, we further propose a novel gradient coordination learning strategy. Extensive experiments demonstrate the effectiveness of the proposed method, outperforming state-of-the-art techniques in three downstream tasks of glioma diagnosis, tumour grading, and survival analysis. Our code is available at https://github.com/helenypzhang/Subspace-Multimodal-Learning.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.12324 [pdf, other]

AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints

Authors: Yu-Zhe Shi, Haofei Hou, Zhangqian Bi, Fanxu Meng, Xiang Wei, Lecheng Ruan, Qining Wang

Abstract: Accurate representation of procedures in restricted scenarios, such as non-standardized scientific experiments, requires precise depiction of constraints. Unfortunately, Domain-specific Language (DSL), as an effective tool to express constraints structurally, often requires case-by-case hand-crafting, necessitating customized, labor-intensive efforts. To overcome this challenge, we introduce the AutoDSL framework to automate DSL-based constraint design across various domains. Utilizing domain specified experimental protocol corpora, AutoDSL optimizes syntactic constraints and abstracts semantic constraints. Quantitative and qualitative analyses of the DSLs designed by AutoDSL across five distinct domains highlight its potential as an auxiliary module for language models, aiming to improve procedural planning and execution.

Submitted 18 June, 2024; originally announced June 2024.

Comments: In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL'24)

arXiv:2406.12111 [pdf, other]

Precision measurement of the $Ξ^-_b$ baryon lifetime

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1064 additional authors not shown)

Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second systematic. This value is averaged with the corresponding value from Run 1 to obtain ${r_τ^{\rm Run\,1,2} = 1.078\pm0.012\pm0.007}$. Multiplying by the world-average value of the $Λ^0_b$ lifetime yields $τ_{Ξ^-_b}^{\rm Run~1,2} = 1.578\pm0.018\pm0.010\pm0.011$ ps, where the uncertainties are statistical, systematic, and due to the limited knowledge of the $Λ^0_b$ lifetime. This measurement improves the precision of the current world average of the $Ξ^-_b$ lifetime by about a factor of two, and is in good agreement with the most recent theoretical predictions.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-010.html (LHCb public pages)

Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139
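
The abstract above obtains the $Ξ^-_b$ lifetime by multiplying the measured ratio by the world-average $Λ^0_b$ lifetime, without quoting that input. The sketch below backs the input out from the two quoted numbers and checks that scaling the ratio uncertainties by it reproduces the first two lifetime uncertainties. It assumes a plain product with no further corrections; the variable names are illustrative and do not come from the paper.

```python
# Values quoted in the abstract (combined Run 1 and Run 2 result).
r_tau, r_stat, r_syst = 1.078, 0.012, 0.007  # lifetime ratio and its uncertainties
tau_xi_b = 1.578                             # Xi_b^- lifetime in ps

# Assumption: tau_xi_b = r_tau * tau_lambda_b, so the Lambda_b^0 input follows by division.
tau_lambda_b = tau_xi_b / r_tau

# The statistical and systematic uncertainties scale with the same factor.
tau_stat = r_stat * tau_lambda_b
tau_syst = r_syst * tau_lambda_b

print(f"implied Lambda_b lifetime: {tau_lambda_b:.3f} ps")
print(f"scaled uncertainties: stat {tau_stat:.3f} ps, syst {tau_syst:.3f} ps")
```

The scaled values round to 0.018 ps and 0.010 ps, matching the first two uncertainties in the abstract; the third (0.011 ps) comes from the $Λ^0_b$ lifetime itself and cannot be recovered from the ratio alone.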

arXiv:2406.11802 [pdf, other]

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

Authors: Fanqing Meng, Wenqi Shao, Lixin Luo, Yahong Wang, Yiran Chen, Quanfeng Lu, Yue Yang, Tianshuo Yang, Kaipeng Zhang, Yu Qiao, Ping Luo

Abstract: Text-to-image (T2I) models have made substantial progress in generating images from textual prompts. However, they frequently fail to produce images consistent with physical commonsense, a vital capability for applications in world simulation and everyday tasks. Current T2I evaluation benchmarks focus on metrics such as accuracy, bias, and safety, neglecting the evaluation of models' internal knowledge, particularly physical commonsense. To address this issue, we introduce PhyBench, a comprehensive T2I evaluation dataset comprising 700 prompts across 4 primary categories: mechanics, optics, thermodynamics, and material properties, encompassing 31 distinct physical scenarios. We assess 6 prominent T2I models, including proprietary models DALLE3 and Gemini, and demonstrate that incorporating physical principles into prompts enhances the models' ability to generate physically accurate images. Our findings reveal that: (1) even advanced models frequently err in various physical scenarios, except for optics; (2) GPT-4o, with item-specific scoring instructions, effectively evaluates the models' understanding of physical commonsense, closely aligning with human assessments; and (3) current T2I models are primarily focused on text-to-image translation, lacking profound reasoning regarding physical commonsense. We advocate for increased attention to the inherent knowledge within T2I models, beyond their utility as mere image generation tools. The code and data are available at https://github.com/OpenGVLab/PhyBench.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.08451 [pdf, other]

GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices

Authors: Quanfeng Lu, Wenqi Shao, Zitao Liu, Fanqing Meng, Boxuan Li, Botong Chen, Siyuan Huang, Kaipeng Zhang, Yu Qiao, Ping Luo

Abstract: Smartphone users often navigate across multiple applications (apps) to complete tasks such as sharing content between social media platforms. Autonomous Graphical User Interface (GUI) navigation agents can enhance user experience in communication, entertainment, and productivity by streamlining workflows and reducing manual intervention. However, prior GUI agents were often trained with datasets comprising simple tasks that can be completed within a single app, leading to poor performance in cross-app navigation. To address this problem, we introduce GUI Odyssey, a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos. Leveraging GUI Odyssey, we developed OdysseyAgent, a multimodal cross-app navigation agent by fine-tuning the Qwen-VL model with a history resampling module. Extensive experiments demonstrate OdysseyAgent's superior accuracy compared to existing models. For instance, OdysseyAgent surpasses fine-tuned Qwen-VL and zero-shot GPT-4V by 1.44% and 55.49% in-domain accuracy, and 2.29% and 48.14% out-of-domain accuracy on average. The dataset and code will be released in https://github.com/OpenGVLab/GUI-Odyssey.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 16 pages, 8 figures, a cross-app GUI navigation dataset

arXiv:2406.08434 [pdf, other]

TasTe: Teaching Large Language Models to Translate through Self-Reflection

Authors: Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang

Abstract: Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks. Techniques like instruction tuning have effectively enhanced the proficiency of LLMs in the downstream task of machine translation. However, the existing approaches fail to yield satisfactory translation outputs that match the quality of supervised neural machine translation (NMT) systems. One plausible explanation for this discrepancy is that the straightforward prompts employed in these methodologies are unable to fully exploit the acquired instruction-following capabilities. To this end, we propose the TasTe framework, which stands for translating through self-reflection. The self-reflection process includes two stages of inference. In the first stage, LLMs are instructed to generate preliminary translations and conduct self-assessments on these translations simultaneously. In the second stage, LLMs are tasked to refine these preliminary translations according to the evaluation results. The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods. Our work presents a promising approach to unleash the potential of LLMs and enhance their capabilities in MT. The codes and datasets are open-sourced at https://github.com/YutongWang1216/ReflectionLLMMT.

Submitted 12 June, 2024; originally announced June 2024.

Comments: This paper has been accepted to the ACL 2024 main conference

arXiv:2406.08054 [pdf, other]

Quantum harvester enables energy transfer without randomness transfer or dissipation

Authors: Fei Meng, Junhao Xu, Xiangjing Liu, Oscar Dahlsten

Abstract: We consider a foundational question in energy harvesting: given a partly random energy source, is it possible to extract the energy without also transferring randomness or accepting another thermodynamical cost? We answer this in the positive, describing scenarios and protocols where in principle energy is extracted from a field with randomness but without any randomness being transferred, and without energy dissipation. Such protocols fundamentally outperform existing methods of rectification which dissipate power, or feedback demon-like protocols which transfer randomness to the feedback system. The protocols exploit the possibility of the harvesting system taking several trajectories that lead to the same final state at a given time. We explain why these protocols do not violate basic physical principles. A key example involves the experimentally well-established phenomenon of Rabi oscillations between energy levels, exploiting the multitude of rotation axes in the state space that take the lower energy state to the excited state. The quantum system is deterministically excited to the highest energy level after interacting with the source for a fixed amount of time, irrespective of the random initial phase of the external potential.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 16 pages, 7 figures

arXiv:2406.06517 [pdf, other]

Genomics-guided Representation Learning for Pathologic Pan-cancer Tumor Microenvironment Subtype Prediction

Authors: Fangliangzi Meng, Hongrun Zhang, Ruodan Yan, Guohui Chuai, Chao Li, Qi Liu

Abstract: The characterization of Tumor MicroEnvironment (TME) is challenging due to its complexity and heterogeneity. Relatively consistent TME characteristics are embedded within highly specific tissue features, which renders them difficult to predict. The capability to accurately classify TME subtypes is of critical significance for clinical tumor diagnosis and precision medicine. Based on the observation that tumors with different origins share similar microenvironment patterns, we propose PathoTME, a genomics-guided Siamese representation learning framework employing Whole Slide Image (WSI) for pan-cancer TME subtypes prediction. Specifically, we utilize Siamese network to leverage genomic information as a regularization factor to assist WSI embeddings learning during the training phase. Additionally, we employ Domain Adversarial Neural Network (DANN) to mitigate the impact of tissue type variations. To eliminate domain bias, a dynamic WSI prompt is designed to further unleash the model's capabilities. Our model achieves better performance than other state-of-the-art methods across 23 cancer types on TCGA dataset. Our code is available at https://github.com/Mengflz/PathoTME.

Submitted 8 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: MICCAI2024

arXiv:2406.06104 [pdf]

Correlated electrons of the flat band in charge density wave state of 4Hb-TaSexS2-x

Authors: Yanyan Geng, Jianfeng Guo, Fanyu Meng, Manyu Wang, Shuo Mi, Li Huang, Rui Xu, Fei Pang, Kai Liu, Shancai Wang, Hong-Jun Gao, Weichang Zhou, Wei Ji, Hechang Lei, Zhihai Cheng

Abstract: Many intriguing quantum states of matter, such as unconventional superconductivity, magnetic phases and fractional quantum Hall physics, emerge from the spatially-correlated localized electrons in the flat band of solid materials. By using scanning tunneling microscopy and spectroscopy (STM/STS), we report the real-space investigation of correlated electrons in the flat band of superlattice 4Hb-TaSexS2-x. In contrast with the pristine 4Hb-TaS2, the selenium (Se) substitutions significantly affect the interfacial transfer of correlated electrons between the CDW states of 1T- and 1H-TaS2 layers, and contribute a real-space fractional electron-filling configuration with the distributed electron-filled and -void SoD clusters of 1T-layer. The site-specific STS spectra directly reveal their respective prominent spectra weight above EF and symmetric Mott-like spectra. In addition, the spatial distributions of these electron-filled SoDs in the 1T-layer of 4Hb-TaSe0.7S1.3 demonstrate different local short-range patterning, clearly indicating the complex neighboring interactions among the localized electrons in the flat band of 1T-layer. Our results not only provide an in-depth insight of correlated electrons in the flat CDW band, but also provide a simple platform to manipulate the electron-correlation-related quantum states.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 18 pages, 4 figures

arXiv:2406.05676 [pdf]

Chern insulator phase realized in dual-gate-tuned MnBi2Te4 thin films grown by molecular beam epitaxy

Authors: Yunhe Bai, Yuanzhao Li, Ruixuan Liu, Jianli Luan, Yang Chen, Wenyu Song, Peng-Fei Ji, Cui Ding, Zongwei Gao, Qinghua Zhang, Fanqi Meng, Bingbing Tong, Lin Li, Tianchen Zhu, Lin Gu, Lili Wang, Jinsong Zhang, Yayu Wang, Qi-Kun Xue, Ke He, Yang Feng, Xiao Feng

Abstract: The intrinsic magnetic order, large topological-magnetic gap and rich topological phases make MnBi2Te4 a wonderful platform to study exotic topological quantum states such as axion insulator and Chern insulator. To realize and manipulate these topological phases in a MnBi2Te4 thin film, precise manipulation of the electric field across the film is essential, which requires a dual-gate structure. In this work, we achieve dual-gate tuning of MnBi2Te4 thin films grown with molecular beam epitaxy on SrTiO3(111) substrates by applying the substrate and an AlOx layer as the gate dielectrics of bottom and top gates, respectively. Under magnetic field of 9T and temperature of 20 mK, the Hall and longitudinal resistivities of the films show inversed gate-voltage dependence, for both top- and bottom-gates, signifying the existence of the dissipationless edge state contributed by Chern insulator phase in the ferromagnetic configuration. The maximum of the Hall resistivity only reaches 0.8 h/e2, even with dual-gate tuning, probably due to the high density of bulk carriers introduced by secondary phases. In the antiferromagnetic state under zero magnetic field, the films show normal insulator behavior. The dual-gated MnBi2Te4 thin films lay the foundation for developing devices based on electrically tunable topological quantum states.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 24 pages, 4 figures

arXiv:2406.03813 [pdf, other]

Touch100k: A Large-Scale Touch-Language-Vision Dataset for Touch-Centric Multimodal Representation

Authors: Ning Cheng, Changhao Guan, Jing Gao, Weihao Wang, You Li, Fandong Meng, Jie Zhou, Bin Fang, Jinan Xu, Wenjuan Han

Abstract: Touch holds a pivotal position in enhancing the perceptual and interactive capabilities of both humans and robots. Despite its significance, current tactile research mainly focuses on visual and tactile modalities, overlooking the language domain. Inspired by this, we construct Touch100k, a paired touch-language-vision dataset at the scale of 100k, featuring tactile sensation descriptions in multiple granularities (i.e., sentence-level natural expressions with rich semantics, including contextual and dynamic relationships, and phrase-level descriptions capturing the key features of tactile sensations). Based on the dataset, we propose a pre-training method, Touch-Language-Vision Representation Learning through Curriculum Linking (TLV-Link, for short), inspired by the concept of curriculum learning. TLV-Link aims to learn a tactile representation for the GelSight sensor and capture the relationship between tactile, language, and visual modalities. We evaluate our representation's performance across two task categories (namely, material property identification and robot grasping prediction), focusing on tactile representation and zero-shot touch understanding. The experimental evaluation showcases the effectiveness of our representation. By enabling TLV-Link to achieve substantial improvements and establish a new state-of-the-art in touch-centric multimodal representation learning, Touch100k demonstrates its value as a valuable resource for research. Project page: https://cocacola-lab.github.io/Touch100k/.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03387 [pdf, other]

Measurement of the branching fraction ratios $R(D^{+})$ and $R(D^{*+})$ using muonic $τ$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1063 additional authors not shown)

Abstract: The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining $D^+$ mesons with $τ^-\toμ^-\overlineν_μν_τ$ candidates, where the $D^+$ is reconstructed via the $D^+\to K^-π^+π^+$ decay. The results are \begin{align*} R(D^{+}) &= 0.249 \pm 0.043 \pm 0.047, \\ R(D^{*+}) &= 0.402 \pm 0.081\pm 0.085, \end{align*} where the first uncertainties are statistical and the second systematic. The two measurements have a correlation coefficient of $-0.39$ and are compatible with the Standard Model.

Submitted 5 June, 2024; originally announced June 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2024-007.html (LHCb public pages)

Report number: LHCb-PAPER-2024-007, CERN-EP-2024-125

arXiv:2406.02882 [pdf, other]

Outdated Issue Aware Decoding for Reasoning Questions on Edited Knowledge

Authors: Zengkui Sun, Yijin Liu, Jiaan Wang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Abstract: Recently, Knowledge Editing has received increasing attention, since it could update the specific knowledge from outdated ones in pretrained models without re-training. However, as pointed out by recent studies, existing related methods tend to merely memorize the superficial word composition of the edited knowledge, rather than truly learning and absorbing it. Consequently, on the reasoning questions, we discover that existing methods struggle to utilize the edited knowledge to reason the new answer, and tend to retain outdated responses, which are generated by the original models utilizing original knowledge. Nevertheless, the outdated responses are unexpected for the correct answers to reasoning questions, which we named as the outdated issue. To alleviate this issue, in this paper, we propose a simple yet effective decoding strategy, i.e., outDated ISsue aware deCOding (DISCO), to enhance the performance of edited models on reasoning questions. Specifically, we capture the difference in the probability distribution between the original and edited models. Further, we amplify the difference of the token prediction in the edited model to alleviate the outdated issue, and thus enhance the model performance w.r.t the edited knowledge. Experimental results suggest that applying DISCO could enhance edited models to reason, e.g., on reasoning questions, DISCO outperforms the prior SOTA method by 12.99 F1 scores, and reduces the ratio of the outdated issue to 5.78% on the zsRE dataset.

Submitted 16 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: ACL2024 Findings, Codes are at https://github.com/Acerkoo/DISCO

arXiv:2406.02876 [pdf, other]

LCS: A Language Converter Strategy for Zero-Shot Neural Machine Translation

Authors: Zengkui Sun, Yijin Liu, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

Abstract: Multilingual neural machine translation models generally distinguish translation directions by the language tag (LT) in front of the source or target sentences. However, current LT strategies cannot indicate the desired target language as expected on zero-shot translation, i.e., the off-target issue. Our analysis reveals that the indication of the target language is sensitive to the placement of the target LT. For example, when placing the target LT on the decoder side, the indication would rapidly degrade along with decoding steps, while placing the target LT on the encoder side would lead to copying or paraphrasing the source input. To address the above issues, we propose a simple yet effective strategy named Language Converter Strategy (LCS). By introducing the target language embedding into the top encoder layers, LCS mitigates confusion in the encoder and ensures stable language indication for the decoder. Experimental results on MultiUN, TED, and OPUS-100 datasets demonstrate that LCS could significantly mitigate the off-target issue, with language accuracy up to 95.28%, 96.21%, and 85.35% meanwhile outperforming the vanilla LT strategy by 3.07, 3,3, and 7.93 BLEU scores on zero-shot translation, respectively.

Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: ACL2024 Findings, Codes are at https://github.com/Acerkoo/LCS

arXiv:2406.01441 [pdf, other]

LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

Authors: Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang

Abstract: The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation. However, the area of data collection for instruction fine-tuning in machine translation remains relatively underexplored. In this paper, we present LexMatcher, a simple yet effective method for data curation, the design of which is driven by the coverage of senses found in bilingual dictionaries. The construction process comprises data retrieval from an existing corpus and data augmentation that supplements the infrequent senses of polysemous words. Utilizing LLaMA2 as our base model, our approach outperforms the established baselines on the WMT2022 test sets and also exhibits remarkable performance in tasks related to word sense disambiguation and specialized terminology translation. These results underscore the effectiveness of LexMatcher in enhancing LLM-based machine translation. The code, data, and models are available at https://github.com/ARIES-LM/Lexmatcher-MT.git.

Submitted 2 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00235 [pdf, other]

Amplitude analysis of the radiative decay $B^0_s\to K^+K^-γ$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1061 additional authors not shown)

Abstract: A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton proton collisions recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$ MeV/$c^2$ is dominated by the $φ(1020)$ resonance that accounts for almost 70$\%$ of the decay rate. Considering the possible contributions of $f_2{(1270)}$, $f'_2{(1525)}$ and $f_2{(2010)}$ meson states, the overall tensor contribution to the amplitude is measured to be \begin{equation} {\cal F}_{\{f_2\}}=16.8\pm 0.5\mathrm{~(stat.)}\pm0.7\mathrm{~(syst.)}\%,\nonumber \end{equation} mostly dominated by the $f'_2(1525)$ state. Several statistically equivalent solutions are obtained for the detailed resonant structure depending on whether the smaller amplitudes interfere destructively or constructively with the dominant amplitude. The preferred solution that corresponds to the lowest values of the fit fractions along with constructive interference leads to the relative branching ratio measurement \begin{equation} \frac{{\cal B}(B^0_s\to f'_2γ)}{{\cal B}(B^0_s\toφγ)}= 19.4^{+0.9}_{-0.8}\mathrm{~(stat.)}{}^{+1.4}_{-0.5}\mathrm{~(syst.)}\pm0.5\mathrm{~(\cal{B})}\%\nonumber, \end{equation} where the last uncertainty is due to the ratio of measured branching fractions to the $K^+K^-$ final state. This result represents the first observation of the radiative $B^0_s\to f'_2(1525)γ$ decay, which is the second radiative transition observed in the $B^0_s$ sector.

Submitted 31 May, 2024; originally announced June 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-002.html (LHCb public pages)

Report number: LHCb-PAPER-2024-002, CERN-EP-2024-115

arXiv:2405.18922 [pdf, other]

Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective

Authors: Chenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou

Abstract: Neural Machine Translation (NMT) has made remarkable progress over the past years. However, under-translation and over-translation remain two challenging problems in state-of-the-art NMT systems. In this work, we conduct an in-depth analysis on the underlying cause of under-translation in NMT, providing an explanation from the perspective of decoding objective. To optimize the beam search objective, the model tends to overlook words it is less confident about, leading to the under-translation phenomenon. Correspondingly, the model's confidence in predicting the End Of Sentence (EOS) diminishes when under-translation occurs, serving as a mild penalty for under-translated candidates. Building upon this analysis, we propose employing the confidence of predicting EOS as a detector for under-translation, and strengthening the confidence-based penalty to penalize candidates with a high risk of under-translation. Experiments on both synthetic and real-world data show that our method can accurately detect and rectify under-translated outputs, with minor impact on other correct translations.

Submitted 29 May, 2024; originally announced May 2024.

Comments: ACL 2024 main conference

arXiv:2405.18906 [pdf, other]

Language Generation with Strictly Proper Scoring Rules

Authors: Chenze Shao, Fandong Meng, Yijin Liu, Jie Zhou

Abstract: Language generation based on maximum likelihood estimation (MLE) has become the fundamental approach for text generation. Maximum likelihood estimation is typically performed by minimizing the log-likelihood loss, also known as the logarithmic score in statistical decision theory. The logarithmic score is strictly proper in the sense that it encourages honest forecasts, where the expected score is maximized only when the model reports true probabilities. Although many strictly proper scoring rules exist, the logarithmic score is the only local scoring rule among them that depends exclusively on the probability of the observed sample, making it capable of handling the exponentially large sample space of natural text. In this work, we propose a straightforward strategy for adapting scoring rules to language generation, allowing for language modeling with any non-local scoring rules. Leveraging this strategy, we train language generation models using two classic strictly proper scoring rules, the Brier score and the Spherical score, as alternatives to the logarithmic score. Experimental results indicate that simply substituting the loss function, without adjusting other hyperparameters, can yield substantial improvements in model's generation capabilities. Moreover, these improvements can scale up to large language models (LLMs) such as LLaMA-7B and LLaMA-13B. Source code: https://github.com/shaochenze/ScoringRulesLM.

Submitted 29 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2405.17347 [pdf, other]

Comprehensive analysis of local and nonlocal amplitudes in the $B^0\rightarrow K^{*0}μ^+μ^-$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1070 additional authors not shown)

Abstract: A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0\rightarrow K^{*0}(\to K^+π^-) μ^+μ^-$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on $pp$ collision data corresponding to an integrated luminosity of 8.4 fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a model of both one-particle and two-particle nonlocal amplitudes, and utilises the complete dimuon mass spectrum without any veto regions around the narrow charmonium resonances. In this way it is possible to explicitly isolate the local and nonlocal contributions and capture the interference between them. The results show that interference with nonlocal contributions, although larger than predicted, only has a minor impact on the Wilson Coefficients determined from the fit to the data. For the local contributions, the Wilson Coefficient $C_9$, responsible for vector dimuon currents, exhibits a $2.1σ$ deviation from the Standard Model expectation. The Wilson Coefficients $C_{10}$, $C_{9}'$ and $C_{10}'$ are all in better agreement than $C_{9}$ with the Standard Model and the global significance is at the level of $1.5σ$. The model used also accounts for nonlocal contributions from $B^{0}\to K^{*0}\left[τ^+τ^-\to μ^+μ^-\right]$ rescattering, resulting in the first direct measurement of the $b sττ$ vector effective-coupling $C_{9τ}$.

Submitted 27 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-011.html (LHCb public pages)

Report number: LHCb-PAPER-2024-011, CERN-EP-2024-122

arXiv:2405.13103 [pdf, other]

Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper limit on the branching fraction is determined to be ${\cal B}( B^0_s\toφμ^\pmτ^\mp) < 1.0\times 10^{-5}$ at 90% confidence level.

Submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-006.html (LHCb public pages)

Report number: LHCb-PAPER-2024-006, CERN-EP-2024-114

arXiv:2405.12688 [pdf, other]

Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1072 additional authors not shown)

Abstract: Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and $13\,\mathrm{Te\kern -0.1em V}$. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to Λ_c^+ \overline{p} π^-$ mode, are measured, with $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$, $Ξ_{b}^- \toΛ_{c}^+ K^- K^-$ and $Ω_{b}^- \toΛ_{c}^+ K^- K^-$ decays being observed at over $5\,σ$ significance. The $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$ mode is also used to measure the $Ξ_{b}^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to Λ_{c}^+ \overline{p} K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to Λ_{c}^+ \overline{p} π^-$ mode.

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-013.html

Report number: CERN-EP-2024-116, LHCb-PAPER-2024-013

arXiv:2405.11324 [pdf, other]

Transverse polarization measurement of $Λ$ hyperons in $p$Ne collisions at $\sqrt{s_{NN}}$ = 68.4 GeV with the $\mbox{LHCb}$ detector

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1065 additional authors not shown)

Abstract: A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are $$ P_Λ = 0.029 \pm 0.019 \, (\rm{stat}) \pm 0.012 \, (\rm{syst}) \, , $$ $$ P_{\barΛ} = 0.003 \pm 0.023 \, (\rm{stat}) \pm 0.014 \,(\rm{syst}) \,. $$ Furthermore, the results are shown as a function of the Feynman~$x$~variable, transverse momentum, pseudorapidity and rapidity of the hyperons, and are compared with previous measurements.

Submitted 24 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3120 (LHCb public pages)

Report number: CERN-EP-2024-121, LHCb-PAPER-2024-009

arXiv:2405.10482 [pdf]

Possible spin-polarized Cooper pairing in high temperature FeSe superconductor

Authors: Yi Hu, Fanyu Meng, Hechang Lei, Qi-Kun Xue, Ding Zhang

Abstract: Superconductivity and long-range ferromagnetism hardly coexist in a uniform manner. The counter-example has been observed, in uranium-based superconductors for instance, with a coexisting temperature limited to about 1 K. Here, we report the coexistence of high temperature superconductivity and itinerant ferromagnetism in lithium intercalated FeSe flakes. In superconducting samples with transition temperature around 40 K, we observe the anomalous Hall effect with a hysteresis loop in transverse resistivity and a butterfly-like pattern of magneto-resistance. Intriguingly, such ferromagnetism persists down to a temperature at which the zero-field resistance fully vanishes. Furthermore, the superconductivity is enhanced under an in-plane magnetic field, suggestive of the participation of spin-polarized Cooper pairs. The surprising finding underscores a uniform coexistence of the two antagonistic phenomena on a record-high energy scale.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.07717 [pdf, other]

On the Adversarial Robustness of Learning-based Image Compression Against Rate-Distortion Attacks

Authors: Chenhao Wu, Qingbo Wu, Haoran Wei, Shuai Chen, Lei Wang, King Ngi Ngan, Fanman Meng, Hongliang Li

Abstract: Despite demonstrating superior rate-distortion (RD) performance, learning-based image compression (LIC) algorithms have been found to be vulnerable to malicious perturbations in recent studies. However, the adversarial attacks considered in existing literature remain divergent from real-world scenarios, both in terms of the attack direction and bitrate. Additionally, existing methods focus solely on empirical observations of the model vulnerability, neglecting to identify the origin of it. These limitations hinder the comprehensive investigation and in-depth understanding of the adversarial robustness of LIC algorithms. To address the aforementioned issues, this paper considers the arbitrary nature of the attack direction and the uncontrollable compression ratio faced by adversaries, and presents two practical rate-distortion attack paradigms, i.e., Specific-ratio Rate-Distortion Attack (SRDA) and Agnostic-ratio Rate-Distortion Attack (ARDA). Using the performance variations as indicators, we evaluate the adversarial robustness of eight predominant LIC algorithms against diverse attacks. Furthermore, we propose two novel analytical tools for in-depth analysis, i.e., Entropy Causal Intervention and Layer-wise Distance Magnify Ratio, and reveal that hyperprior significantly increases the bitrate and Inverse Generalized Divisive Normalization (IGDN) significantly amplifies input perturbations when under attack. Lastly, we examine the efficacy of adversarial training and introduce the use of online updating for defense. By comparing their advantages and disadvantages, we provide a reference for constructing more robust LIC algorithms against the rate-distortion attacks.

Submitted 4 July, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.06556 [pdf, other]

Search for time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1062 additional authors not shown)

Abstract: A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the $D^*(2010)^+ \rightarrow D^0 π^+$ decay. The decay $D^0 \rightarrow K^- π^+ π^0$ is used as a control channel to validate the measurement procedure. The gradient of the time-dependent $CP$ asymmetry, $ΔY$, in $D^0 \rightarrow π^+ π^- π^0$ decays is measured to be $ΔY = (-1.3 \pm 6.3 \pm 2.4) \times 10^{-4}$, where the first uncertainty is statistical and the second is systematic, which is compatible with $CP$ conservation.

Submitted 10 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/p/LHCb-PAPER-2024-003.html (LHCb public pages)

Report number: LHCb-PAPER-2024-003, CERN-EP-2024-111

arXiv:2405.00098 [pdf, other]

Amplitude analysis and branching fraction measurement of $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1057 additional authors not shown)

Abstract: The decays of the $B^{+}$ meson to the final state $D^{*-}D^{+}_{s}π^{+}$ are studied in proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb$^{-1}$. The ratio of branching fractions of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ and $B^{0}\to D^{*-}D^{+}_{s}$ decays is measured to be $0.173\pm 0.006\pm 0.010$, where the first uncertainty is statistical and the second is systematic. Using partially reconstructed $D^{*+}_{s}\to D^{+}_{s}γ$ and $D^{+}_{s}π^{0}$ decays, the ratio of branching fractions between the $B^{+}\to D^{*-}D^{*+}_{s}π^{+}$ and $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays is determined as $1.31\pm 0.07\pm 0.14$. An amplitude analysis of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decay is performed for the first time, revealing dominant contributions from known excited charm resonances decaying to the $D^{*-}π^{+}$ final state. No significant evidence of exotic contributions in the $D^{+}_{s}π^{+}$ or $D^{*-}D^{+}_{s}$ channels is found. The fit fraction of the scalar state $T_{c\bar{s} 0}^{\ast}(2900)^{++}$ observed in the $B^{+}\to D^{-}D^{+}_{s}π^{+}$ decay is determined to be less than 2.3% at a 90% confidence level.

Submitted 30 April, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-001.html (LHCb public pages)

Report number: LHCb-PAPER-2024-001, CERN-EP-2024-110

arXiv:2404.16006 [pdf, other]

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Authors: Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao

Abstract: Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to assess LVLMs across massive multimodal tasks requiring expert knowledge and deliberate visual recognition, localization, reasoning, and planning. MMT-Bench comprises $31,325$ meticulously curated multi-choice visual questions from various multimodal scenarios such as vehicle driving and embodied navigation, covering $32$ core meta-tasks and $162$ subtasks in multimodal understanding. Due to its extensive task coverage, MMT-Bench enables the evaluation of LVLMs using a task map, facilitating the discovery of in- and out-of-domain tasks. Evaluation results involving $30$ LVLMs such as the proprietary GPT-4V, GeminiProVision, and open-sourced InternVL-Chat, underscore the significant challenges posed by MMT-Bench. We anticipate that MMT-Bench will inspire the community to develop next-generation multimodal foundation models aimed at achieving general-purpose multimodal intelligence.

Submitted 24 April, 2024; originally announced April 2024.

Comments: 77 pages, 41 figures

arXiv:2404.14085 [pdf]

High efficient sunlight-driven CO2 hydrogenation to methanol over NiZn intermetallic catalysts under atmospheric pressure

Authors: Linjia Han, Fanqi Meng, Xianhua Bai, Qixuan Wu, Yanhong Luo, Jiangjian Shi, Yaguang Li, Dongmei Li, Qingbo Meng

Abstract: The synthesis of solar methanol through direct CO2 hydrogenation using solar energy is of great importance in advancing a sustainable energy economy. In this study, a non-precious NiZn intermetallic/ZnO catalyst is reported to catalyze the hydrogenation of CO2 to methanol using sunlight irradiation (1 sun). The NiZn-ZnO interface is identified as the active site to stabilize the key intermediates of HxCO*. At ambient pressure, the NiZn-ZnO catalyst demonstrates a methanol production rate of 127.5 μmol g$^{-1}$ h$^{-1}$ from solar-driven CO2 hydrogenation, with a remarkable 100% selectivity towards methanol in the total organic products. Notably, this production rate stands as the highest record for photothermic CO2 hydrogenation to methanol in continuous-flow reactors with sunlight as the only requisite energy input. This discovery not only paves the way for the development of novel catalysts for CO2 hydrogenation to methanol but also marks a significant stride towards fully solar-driven chemical energy storage.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.11899 [pdf, other]

Investigating the Molecular Design Mechanism Behind the Hydrophobicity of Biological Surface Nanostructures: Insights from Butterfly and Mosquito Systems

Authors: Fan Meng, Noriyoshi Arai

Abstract: Wettability is a fundamental physicochemical property of solid surfaces, with unique wettability patterns playing pivotal roles across diverse domains. Inspired by nature's ingenious designs, bio-inspired materials have emerged as a frontier of scientific inquiry. They showcase remarkable hydrophobic properties observed in phenomena such as mosquitoes preventing fog condensation, and lotus leaves exhibiting self-cleaning attributes. This groundbreaking research delves into the hydrophobic characteristics of biomimetic surfaces using coarse-grained molecular simulation and the free energy barrier evaluation system. By analyzing the butterfly wings and mosquito eyes model, we aim to pioneer a comprehensive framework that factors in the influence of surface parameters on the free energy barrier. Through meticulous simulation and analysis, we strive to validate and enhance the reliability of the free energy barrier assessment method, deepening our understanding of hydrophobicity across diverse biomaterials and paving the way for optimizing their properties for a myriad of applications. During our investigation, we shed light on the elusive intermediate state, a departure from the typical Cassie or Wenzel state, enriching our theoretical framework for surfaces with distinctive properties. This research is a catalyst for developing biomimetic materials with superior hydrophobic characteristics and innovative fabrication processes, transcending academic boundaries and promising significant strides in environmental conservation, medicine, and beyond, offering hope for a greener, healthier, and more sustainable future.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 15 pages, 15 figures

arXiv:2404.10458 [pdf, other]

Advancing Long-Term Multi-Energy Load Forecasting with Patchformer: A Patch and Transformer-Based Approach

Authors: Qiuyi Hong, Fanlin Meng, Felipe Maldonado

Abstract: In the context of increasing demands for long-term multi-energy load forecasting in real-world applications, this paper introduces Patchformer, a novel model that integrates patch embedding with encoder-decoder Transformer-based architectures. To address the limitation in existing Transformer-based models, which struggle with intricate temporal patterns in long-term forecasting, Patchformer employs patch embedding, which predicts multivariate time-series data by separating it into multiple univariate data and segmenting each of them into multiple patches. This method effectively enhances the model's ability to capture local and global semantic dependencies. The numerical analysis shows that the Patchformer obtains overall better prediction accuracy in both multivariate and univariate long-term forecasting on the novel Multi-Energy dataset and other benchmark datasets. In addition, the positive effect of the interdependence among energy-related products on the performance of long-term time-series forecasting across Patchformer and other compared models is discovered, and the superiority of the Patchformer against other models is also demonstrated, which presents a significant advancement in handling the interdependence and complexities of long-term multi-energy forecasting. Lastly, Patchformer is illustrated as the only model that follows the positive correlation between model performance and the length of the past sequence, which states its ability to capture long-range past local semantic information.

Submitted 16 April, 2024; originally announced April 2024.
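
The patching step described in the Patchformer abstract (split a multivariate series into univariate channels, then cut each channel into patches) can be sketched as follows. This is a minimal illustration only: the function name `to_patches` and the `patch_len` and `stride` parameters are assumptions made for this sketch and are not taken from the paper, which may segment and embed the patches differently.

```python
def to_patches(series, patch_len, stride):
    """Split a multivariate series into per-variable patches.

    series: list of time steps, each a list with one value per variable.
    Returns one list of patches per variable (channel).
    Hypothetical helper for illustration; not the authors' implementation.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    # Step 1: separate the multivariate series into univariate channels.
    channels = [list(column) for column in zip(*series)]
    # Step 2: segment each channel into (possibly overlapping) patches.
    return [
        [channel[i:i + patch_len]
         for i in range(0, len(channel) - patch_len + 1, stride)]
        for channel in channels
    ]


# Two variables observed over six time steps.
series = [[t, 10 * t] for t in range(6)]
patches = to_patches(series, patch_len=4, stride=2)
print(patches[0])  # patches of the first variable
print(patches[1])  # patches of the second variable
```

Each patch would then be embedded as one token, so the sequence length seen by the Transformer shrinks roughly by a factor of `stride`.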

arXiv:2404.09686 [pdf, other]

AntBatchInfer: Elastic Batch Inference in the Kubernetes Cluster

Authors: Siyuan Li, Youshao Xiao, Fanzhuang Meng, Lin Ju, Lei Liang, Lin Wang, Jun Zhou

Abstract: Offline batch inference is a common task in the industry for deep learning applications, but it can be challenging to ensure stability and performance when dealing with large amounts of data and complicated inference pipelines. This paper demonstrated AntBatchInfer, an elastic batch inference framework, which is specially optimized for the non-dedicated cluster. AntBatchInfer addresses these challenges by providing multi-level fault-tolerant capabilities, enabling the stable execution of versatile and long-running inference tasks. It also improves inference efficiency by pipelining, intra-node, and inter-node scaling. It further optimizes the performance in complicated multiple-model batch inference scenarios. Through extensive experiments and real-world statistics, we demonstrate the superiority of our framework in terms of stability and efficiency. In the experiment, it outperforms the baseline by at least $2\times$ and $6\times$ in the single-model or multiple-model batch inference. Also, it is widely used at Ant Group, with thousands of daily jobs from various scenarios, including DLRM, CV, and NLP, which proves its practicability in the industry.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09443 [pdf, other]

Hybrid FedGraph: An efficient hybrid federated learning algorithm using graph convolutional neural network

Authors: Jaeyeon Jang, Diego Klabjan, Veena Mendiratta, Fanfei Meng

Abstract: Federated learning is an emerging paradigm for decentralized training of machine learning models on distributed clients, without revealing the data to the central server. Most existing works have focused on horizontal or vertical data distributions, where each client possesses different samples with shared features, or each client fully shares only sample indices, respectively. However, the hybrid scheme is much less studied, even though it is much more common in the real world. Therefore, in this paper, we propose a generalized algorithm, FedGraph, that introduces a graph convolutional neural network to capture feature-sharing information while learning features from a subset of clients. We also develop a simple but effective clustering algorithm that aggregates features produced by the deep neural networks of each client while preserving data privacy.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.07549 [pdf, other]

Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective

Authors: Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Abstract: Code generation aims to understand the problem description and generate corresponding code snippets, where existing works generally decompose such complex tasks into intermediate steps by prompting strategies, such as Chain-of-Thought and its variants. While these studies have achieved some success, their effectiveness is highly dependent on the capabilities of advanced Large Language Models (LLMs) such as GPT-4, particularly in terms of API calls, which significantly limits their practical applicability. Consequently, how to enhance the code generation capabilities of small and medium-scale code LLMs without significantly increasing training costs is an appealing challenge. In this paper, we suggest that code comments are the natural logic pivot between natural language and code language and propose using comments to boost the code generation ability of code LLMs. Concretely, we propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy. Experiments are performed on HumanEval and MBPP, utilizing StarCoder and WizardCoder as backbone models, and encompassing model parameter sizes between 3B and 7B. The results indicate that MANGO significantly improves the code pass rate based on the strong baselines. Meanwhile, the robustness of the logical comment decoding strategy is notably higher than the Chain-of-thoughts prompting. The code is publicly available at https://github.com/pppa2019/Mango.

Submitted 11 April, 2024; originally announced April 2024.

Comments: The code is publicly available at https://github.com/pppa2019/Mango

arXiv:2404.07394 [pdf, other]

High-power even- and odd mode emission from linear arrays of resonant-tunneling-diode (RTD) oscillators in the 0.4- to 0.8-THz frequency range

Authors: Fanqi Meng, Zhenling Tang, Petr Ourednik, Jahnabi Hazarika, Michael Feiginov, Safumi Suzuki, Hartmut G. Roskos

Abstract: Resonant tunneling diode (RTD) oscillators possess the highest oscillation frequency among all electronic THz emitters. However, the emitted power from RTDs remains limited. Here, we propose linear RTD-oscillator arrays capable of supporting coherent emission from both odd and even coupled modes. Both modes exhibit constructive interference in the far field, enabling high power emission. Experimental demonstrations of coherent emission from 11-RTD-oscillator linear arrays are presented. The odd mode oscillates at approximately 450 GHz, emitting about 0.5 mW, while the even mode oscillates at around 750 GHz, emitting about 1 mW. Moreover, certain RTD-oscillator arrays demonstrate dual-band oscillation under different biases, allowing for controllable switching between two coupled modes. In addition, during bias sweeping in both directions, a notable hysteresis feature is observed in the switching bias for the odd and even modes. Our linear RTD-oscillator array represents a significant step forward in the realization of high-power large RTD-oscillator arrays and enables large-scale applications of RTD devices.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 7 pages, 6 figures

arXiv:2404.06954 [pdf, other]

Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy

Authors: Yijin Liu, Fandong Meng, Jie Zhou

Abstract: Recently, dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations through elaborate heuristics or additional predictors. However, in the decoding process of existing approaches, different samples are assigned different computational budgets, which cannot guarantee a stable and precise acceleration effect. Furthermore, existing approaches generally skip multiple contiguous layers at the bottom or top of the layers, leading to a drastic change in the model's layer-wise representations, and thus a consequent performance degeneration. Therefore, we propose a Unified Layer Skipping strategy, which selects the number of layers to skip computation based solely on the target speedup ratio, and then skips the corresponding number of intermediate layer computations in a balanced manner. Since the Unified Layer Skipping strategy is independent of input samples, it naturally supports popular acceleration techniques such as batch decoding and KV caching, thus demonstrating more practicality for real-world applications. Experimental results on two common tasks, i.e., machine translation and text summarization, indicate that given a target speedup ratio, the Unified Layer Skipping strategy significantly enhances both the inference performance and the actual model throughput over existing dynamic approaches.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 12 pages, codes at https://github.com/Adaxry/Unified_Layer_Skipping
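
The selection rule described in the Unified Layer Skipping abstract (derive the number of layers to skip from the target speedup ratio alone, then skip intermediate layers in a balanced way) can be sketched roughly as below. This is an illustrative sketch, not the authors' code: the function name `layers_to_keep`, the assumption that cost is proportional to the number of executed layers, and the choice to always retain the first and last layer are all assumptions made here.

```python
def layers_to_keep(num_layers, target_speedup):
    """Return sorted indices of the layers to execute.

    Hypothetical helper: assumes compute cost scales with the number of
    executed layers and that the first and last layers are always kept.
    """
    if target_speedup < 1.0:
        raise ValueError("target_speedup must be >= 1")
    if num_layers <= 2:
        return list(range(num_layers))
    # The budget depends only on the target speedup, never on the input.
    num_keep = max(2, min(num_layers, round(num_layers / target_speedup)))
    # Spread the kept layers evenly so that skipped layers are balanced
    # across the middle of the stack rather than clustered at one end.
    step = (num_layers - 1) / (num_keep - 1)
    return sorted({round(i * step) for i in range(num_keep)})


kept = layers_to_keep(num_layers=12, target_speedup=2.0)
print(kept)  # indices of the 6 layers that would run
```

Because the kept indices are fixed for a given speedup target, every sample in a batch follows the same path, which is what makes the scheme compatible with batch decoding and KV caching as the abstract notes.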

arXiv:2404.06108 [pdf, other]

Symmetry-guided gradient descent for quantum neural networks

Authors: Kaiming Bian, Shitao Zhang, Fei Meng, Wen Zhang, Oscar Dahlsten

Abstract: Many supervised learning tasks have intrinsic symmetries, such as translational and rotational symmetry in image classifications. These symmetries can be exploited to enhance performance. We formulate the symmetry constraints into a concise mathematical form. We design two ways to adopt the constraints into the cost function, thereby shaping the cost landscape in favour of parameter choices which respect the given symmetry. Unlike methods that alter the neural network circuit ansatz to impose symmetry, our method only changes the classical post-processing of gradient descent, which is simpler to implement. We call the method symmetry-guided gradient descent (SGGD). We illustrate SGGD in entanglement classification of Werner states and in a binary classification task in a 2-D feature space. In both cases, the results show that SGGD can accelerate the training, improve the generalization ability, and remove vanishing gradients, especially when the training data is biased.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.05974 [pdf]

Vacancy enhanced cation ordering enables >15% efficiency in Kesterite solar cells

Authors: Jinlin Wang, Licheng Lou, Kang Yin, Fanqi Meng, Xiao Xu, Menghan Jiao, Bowen Zhang, Jiangjian Shi, Huijue Wu, Yanhong Luo, Dongmei Li, Qingbo Meng

Abstract: Atomic disorder, a widespread problem in compound crystalline materials, is an imperative factor affecting the performance of multi-chalcogenide Cu2ZnSn(S, Se)4 (CZTSSe) photovoltaic devices, known for their low cost and environmental friendliness. Cu-Zn disorder is particularly abundant in CZTSSe due to its extraordinarily low formation energy, having induced high-concentration deep defects and severe charge loss, while its regulation remains challenging due to the contradiction between disorder-order phase transition thermodynamics and atom-interchange kinetics. Herein, through introducing more vacancies in the CZTSSe surface, we explored a vacancy-assisted strategy to reduce the atom-interchange barrier limit to facilitate the Cu-Zn ordering kinetic process. The improvement in the Cu-Zn order degree has significantly reduced the charge loss in the device and helped us realize 15.4% (certified at 14.9%) and 13.5% efficiency (certified at 13.3%) in 0.27 cm² and 1.1 cm²-area CZTSSe solar cells, respectively, thus bringing substantial advancement for emerging inorganic thin-film photovoltaics.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.02948 [pdf, other]

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

Authors: Fanxu Meng, Zhaohui Wang, Muhan Zhang

Abstract: To parameter-efficiently fine-tune (PEFT) large language models (LLMs), the low-rank adaptation (LoRA) method approximates the model changes $ΔW \in \mathbb{R}^{m \times n}$ through the product of two matrices $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{r \times n}$, where $r \ll \min(m, n)$, $A$ is initialized with Gaussian noise, and $B$ with zeros. LoRA freezes the original model $W$ and updates the "Noise & Zero" adapter, which may lead to slow convergence. To overcome this limitation, we introduce Principal Singular values and Singular vectors Adaptation (PiSSA). PiSSA shares the same architecture as LoRA, but initializes the adaptor matrices $A$ and $B$ with the principal components of the original matrix $W$, and puts the remaining components into a residual matrix $W^{res} \in \mathbb{R}^{m \times n}$ which is frozen during fine-tuning. Compared to LoRA, PiSSA updates the principal components while freezing the "residual" parts, allowing faster convergence and enhanced performance. Comparative experiments of PiSSA and LoRA across 12 different models, ranging from 184M to 70B, encompassing 5 NLG and 8 NLU tasks, reveal that PiSSA consistently outperforms LoRA under identical experimental setups. On the GSM8K benchmark, Mistral-7B fine-tuned with PiSSA achieves an accuracy of 72.86%, surpassing LoRA's 67.7% by 5.16%. Due to the same architecture, PiSSA is also compatible with quantization to further reduce the memory requirement of fine-tuning. Compared to QLoRA, QPiSSA (PiSSA with 4-bit quantization) exhibits smaller quantization errors in the initial stages. Fine-tuning LLaMA-3-70B on GSM8K, QPiSSA attains an accuracy of 86.05%, exceeding the performances of QLoRA at 81.73%. Leveraging a fast SVD technique, PiSSA can be initialized in only a few seconds, presenting a negligible cost for transitioning from LoRA to PiSSA.

Submitted 28 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.
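
The initialization described in the PiSSA abstract can be written out as a short worked decomposition. The abstract only states that $A$ and $B$ take the principal components of $W$ and that the remainder goes into $W^{res}$; the symmetric square-root split of the singular values between $A$ and $B$ shown here is an assumption of this sketch, and the slice notation is introduced only for illustration.

```latex
% Singular value decomposition of the pretrained weight, W \in R^{m x n}
W = U S V^{\top}

% Adapter initialized from the top-r singular triplets
% (assumed split: square root of the singular values on each side)
A = U_{[:,\,:r]} \, S_{[:r,\,:r]}^{1/2} \in \mathbb{R}^{m \times r},
\qquad
B = S_{[:r,\,:r]}^{1/2} \, V_{[:,\,:r]}^{\top} \in \mathbb{R}^{r \times n}

% Frozen residual built from the remaining singular triplets
W^{res} = U_{[:,\,r:]} \, S_{[r:,\,r:]} \, V_{[:,\,r:]}^{\top} \in \mathbb{R}^{m \times n}

% The model is unchanged at initialization
W = W^{res} + A B
```

Under this split the forward pass at step zero matches the original model exactly, while LoRA's "Noise & Zero" start also leaves the output unchanged but gives the adapter no information about $W$.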

arXiv:2404.01559 [pdf, ps, other]

Minimal model program for algebraically integrable foliations on klt varieties

Authors: Jihao Liu, Fanjun Meng, Lingyao Xie

Abstract: For lc algebraically integrable foliations on klt varieties, we prove the base-point-freeness theorem, the contraction theorem, and the existence of flips. The first result resolves a conjecture of Cascini and Spicer, while the latter two results strengthen a result of Cascini and Spicer by removing their assumption on the termination of flips. Moreover, we prove the existence of the minimal model program for lc algebraically integrable foliations on klt varieties and the existence of good minimal models or Mori fiber spaces for lc algebraically integrable foliations polarized with ample divisors on klt varieties. As a consequence, we show that $\mathbb{Q}$-factorial klt varieties with lc algebraically integrable Fano foliation structures are Mori dream spaces. We also show the existence of a Shokurov-type polytope for lc algebraically integrable foliations.

Submitted 8 July, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: 56 pages. New results added and expositions improved. We additionally prove the base-point-freeness theorem and the finite generation of polarized canonical rings. As a corollary, we also show that every $\mathbb{Q}$-factorial klt variety with an lc algebraically integrable Fano foliation structure is a Mori dream space

MSC Class: 14E30; 37F75

arXiv:2403.20009 [pdf, other]

On Large Language Models' Hallucination with Regard to Known Facts

Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

Abstract: Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual questions that query the same triplet knowledge but result in different answers. The difference between the model behaviors on the correct and incorrect outputs hence suggests the patterns when hallucinations happen. Second, to measure the pattern, we utilize mappings from the residual streams to vocabulary space. We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases. In hallucinated cases, the output token's information rarely demonstrates abrupt increases and consistent superiority in the later stages of the model. Leveraging the dynamic curve as a feature, we build a classifier capable of accurately detecting hallucinatory predictions with an 88% success rate. Our study sheds light on understanding the reasons for LLMs' hallucinations on their known facts, and more importantly, on accurately predicting when they are hallucinating.

Submitted 29 March, 2024; originally announced March 2024.

Comments: Accepted by NAACL 2024 Main Conference

arXiv:2403.19447 [pdf, other]

The ALMaQUEST Survey XV: The Dependence of the Molecular-to-Atomic Gas Ratios on Resolved Optical Diagnostics

Authors: Niankun Yu, Zheng Zheng, Chao-Wei Tsai, Pei Zuo, Sara L. Ellison, David V. Stark, Di Li, Jingwen Wu, Karen L. Masters, Ting Xiao, Yinghui Zheng, Zongnan Li, Kai Zhang, Hongying Chen, Shu Liu, Sihan Jiao, Fanyi Meng

Abstract: The atomic-to-molecular gas conversion is a critical step in the baryon cycle of galaxies, which sets the initial conditions for subsequent star formation and influences the multi-phase interstellar medium. We compiled a sample of 94 nearby galaxies with observations of multi-phase gas contents by utilizing public H I, CO, and optical IFU data from the MaNGA survey together with new FAST H I observations. In agreement with previous results, our sample shows that the global molecular-to-atomic gas ratio ($R_{\rm mol} \equiv$ log $M_{\rm H_2}/M_{\rm H\ I}$) is correlated with the global stellar mass surface density $μ_*$ with a Kendall's $τ$ coefficient of 0.25 and $p < 10^{-3}$, less tightly but still correlated with stellar mass and NUV$-$ r color, and not related to the specific star formation rate (sSFR). The cold gas distribution and kinematics inferred from the H I and CO global profile asymmetry and shape do not significantly rely on $R_{\rm mol}$. Thanks to the availability of kpc-scale observations of MaNGA, we decompose galaxies into H II, composite, and AGN-dominated regions by using the BPT diagrams. With increasing $R_{\rm mol}$, the fraction of H II regions within 1.5 effective radii decreases slightly; the density distribution in the spatially resolved BPT diagram also changes significantly, suggesting changes in metallicity and ionization states. Galaxies with high $R_{\rm mol}$ tend to have high oxygen abundance, both at one effective radius, with a Kendall's $τ$ coefficient of 0.37 ($p < 10^{-3}$), and in their central regions. Among all parameters investigated here, the oxygen abundance at one effective radius has the strongest relation with global $R_{\rm mol}$, but the dependence of gas conversion on gas distribution and galaxy ionization states is weak.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Accepted by SCPMA. We sincerely appreciate the constructive suggestions and comments from Lihwai Lin

arXiv:2403.17337 [pdf, other]

Destination-Constrained Linear Dynamical System Modeling in Set-Valued Frameworks

Authors: Xiaowei Yang, Haiqi Liu, Fanqin Meng, Xiaojing Shen

Abstract: Directional motion towards a specified destination is a common occurrence in physical processes and human societal activities. Utilizing this prior information can significantly improve the control and predictive performance of system models. This paper primarily focuses on reconstructing linear dynamic system models based on destination constraints in the set-valued framework. We treat destination constraints as inherent information in the state evolution process and employ convex optimization techniques to construct a coherent and robust state model. This refined model effectively captures the impact of destination constraints on the state evolution at each time step. Furthermore, we design an optimal weight matrix for the reconstructed model to ensure smoother and more natural trajectories of state evolution. We also analyze the theoretical guarantee of optimality for this weight matrix and the properties of the reconstructed model. Finally, simulation experiments verify that the reconstructed model has significant advantages over the unconstrained and unoptimized weighted models and constrains the evolution of state trajectories with different starting and ending points.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 15 pages, 11 figures

arXiv:2403.17012 [pdf]

Evolution and Efficiency in Neural Architecture Search: Bridging the Gap Between Expert Design and Automated Optimization

Authors: Fanfei Meng, Chen-Ao Wang, Lele Zhang

Abstract: The paper provides a comprehensive overview of Neural Architecture Search (NAS), emphasizing its evolution from manual design to automated, computationally-driven approaches. It covers the inception and growth of NAS, highlighting its application across various domains, including medical imaging and natural language processing. The document details the shift from expert-driven design to algorithm-driven processes, exploring initial methodologies like reinforcement learning and evolutionary algorithms. It also discusses the challenges of computational demands and the emergence of efficient NAS methodologies, such as Differentiable Architecture Search and hardware-aware NAS. The paper further elaborates on NAS's application in computer vision, NLP, and beyond, demonstrating its versatility and potential for optimizing neural network architectures across different tasks. Future directions and challenges, including computational efficiency and the integration with emerging AI domains, are addressed, showcasing NAS's dynamic nature and its continued evolution towards more sophisticated and efficient architecture search methods.

Submitted 2 April, 2024; v1 submitted 11 February, 2024; originally announced March 2024.

Comments: 7 Pages, Double Column

Journal ref: Journal of Mathematical Techniques and Computational Mathematics, 2024, Volume 3, Issue 3

Showing 1–50 of 485 results for author: Meng, F