-
UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
Authors:
Jingjing Ren,
Wenbo Li,
Haoyu Chen,
Renjing Pei,
Bin Shao,
Yong Guo,
Long Peng,
Fenglong Song,
Lei Zhu
Abstract:
Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands. We present UltraPixel, a novel architecture utilizing cascade diffusion models to generate high-quality images at multiple resolutions (e.g., 1K to 6K) within a single model, while maintaining computational efficiency. UltraPixel leverages semantics-rich representations of lower-resolution images in the later denoising stage to guide the whole generation of highly detailed high-resolution images, significantly reducing complexity. Furthermore, we introduce implicit neural representations for continuous upsampling and scale-aware normalization layers adaptable to various resolutions. Notably, both low- and high-resolution processes are performed in the most compact space, sharing the majority of parameters with less than 3% additional parameters for high-resolution outputs, largely enhancing training and inference efficiency. Our model achieves fast training with reduced data requirements, producing photo-realistic high-resolution images and demonstrating state-of-the-art performance in extensive experiments.
Submitted 4 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Towards Realistic Data Generation for Real-World Super-Resolution
Authors:
Long Peng,
Wenbo Li,
Renjing Pei,
Jingjing Ren,
Xueyang Fu,
Yang Wang,
Yang Cao,
Zheng-Jun Zha
Abstract:
Existing image super-resolution (SR) techniques often fail to generalize effectively in complex real-world settings due to the significant divergence between training data and practical scenarios. To address this challenge, previous efforts have either manually simulated intricate physical-based degradations or utilized learning-based techniques, yet these approaches remain inadequate for producing large-scale, realistic, and diverse data simultaneously. In this paper, we introduce a novel Realistic Decoupled Data Generator (RealDGen), an unsupervised learning data generation framework designed for real-world super-resolution. We meticulously develop content and degradation extraction strategies, which are integrated into a novel content-degradation decoupled diffusion model to create realistic low-resolution images from unpaired real LR and HR images. Extensive experiments demonstrate that RealDGen excels in generating large-scale, high-quality paired data that mirrors real-world degradations, significantly advancing the performance of popular SR models on various real-world benchmarks.
Submitted 11 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding
Authors:
Xing Zhang,
Jiaxi Gu,
Haoyu Zhao,
Shicong Wang,
Hang Xu,
Renjing Pei,
Songcen Xu,
Zuxuan Wu,
Yu-Gang Jiang
Abstract:
Temporal Video Grounding (TVG) aims to localize a moment from an untrimmed video given the language description. Since the annotation of TVG is labor-intensive, TVG under limited supervision has received attention in recent years. The great success of vision-language pre-training guides TVG to follow the traditional "pre-training + fine-tuning" paradigm; however, the pre-training process would suffer from a lack of temporal modeling and fine-grained alignment due to the difference in data nature between pre-training and testing. Besides, the large gap between pretext and downstream tasks makes zero-shot testing impossible for the pre-trained model. To avoid the drawbacks of the traditional paradigm, we propose AutoTVG, a new vision-language pre-training paradigm for TVG that enables the model to learn semantic alignment and boundary regression from automatically annotated untrimmed videos. To be specific, AutoTVG consists of a novel Captioned Moment Generation (CMG) module to generate captioned moments from untrimmed videos, and TVGNet with a regression head to predict localization results. Experimental results on Charades-STA and ActivityNet Captions show that, regarding zero-shot temporal video grounding, AutoTVG achieves highly competitive performance with in-distribution methods under out-of-distribution testing, and is superior to existing pre-training frameworks with much less training data.
Submitted 11 June, 2024;
originally announced June 2024.
-
TENNs-PLEIADES: Building Temporal Kernels with Orthogonal Polynomials
Authors:
Yan Ru Pei,
Olivier Coenen
Abstract:
We introduce a neural network named PLEIADES (PoLynomial Expansion In Adaptive Distributed Event-based Systems), belonging to the TENNs (Temporal Neural Networks) architecture. We focus on interfacing these networks with event-based data to perform online spatiotemporal classification and detection with low latency. By virtue of using structured temporal kernels and event-based data, we have the freedom to vary the sample rate of the data along with the discretization step-size of the network without additional finetuning. We experimented with three event-based benchmarks and obtained state-of-the-art results on all three by large margins with significantly smaller memory and compute costs. We achieved: 1) 99.59% accuracy with 192K parameters on the DVS128 hand gesture recognition dataset and 100% with a small additional output filter; 2) 99.58% test accuracy with 277K parameters on the AIS 2024 eye tracking challenge; and 3) 0.556 mAP with 576K parameters on the PROPHESEE 1 Megapixel Automotive Detection Dataset.
Submitted 31 May, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution
Authors:
Long Peng,
Yang Cao,
Renjing Pei,
Wenbo Li,
Jiaming Guo,
Xueyang Fu,
Yang Wang,
Zheng-Jun Zha
Abstract:
Real-SR endeavors to produce high-resolution images with rich details while mitigating the impact of multiple degradation factors. Although existing methods have achieved impressive results in detail recovery, they still fall short when addressing regions with complex gradient arrangements because their feature extraction relies on intensity-based linear weighting. Moreover, the stochastic artifacts introduced by degradation cues during the imaging process in real LR increase the disorder of the overall image details, further complicating the perception of intrinsic gradient arrangement. To address these challenges, we innovatively introduce kernel-wise differential operations within the convolutional kernel and develop several learnable directional gradient convolutions. These convolutions are integrated in parallel with a novel linear weighting mechanism to form an Adaptive Directional Gradient Convolution (DGConv), which adaptively weights and fuses the basic directional gradients to improve the gradient arrangement perception capability for both regular and irregular textures. Coupled with DGConv, we further devise a novel equivalent parameter fusion method for DGConv that maintains its rich representational capabilities while keeping computational costs consistent with a single Vanilla Convolution (VConv), enabling DGConv to improve the performance of existing super-resolution networks without incurring additional computational expenses. To better leverage the superiority of DGConv, we further develop an Adaptive Information Interaction Block (AIIBlock) to adeptly balance the enhancement of texture and contrast while meticulously investigating the interdependencies, culminating in the creation of a DGPNet for Real-SR through simple stacking. Comparative results with 15 SOTA methods across three public datasets underscore the effectiveness and efficiency of our proposed approach.
Submitted 11 May, 2024;
originally announced May 2024.
-
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey
Authors:
Marcos V. Conde,
Zhijun Lei,
Wen Li,
Cosmin Stejerean,
Ioannis Katsavounidis,
Radu Timofte,
Kihwan Yoon,
Ganzorig Gankhuyag,
Jiangtao Lv,
Long Sun,
Jinshan Pan,
Jiangxin Dong,
Jinhui Tang,
Zhiyuan Li,
Hao Wei,
Chenyang Ge,
Dongyang Zhang,
Tianle Liu,
Huaian Chen,
Yi Jin,
Menghan Zhou,
Yiqiang Yan,
Si Gao,
Biao Wu,
Shaoli Liu
, et al. (50 additional authors not shown)
Abstract:
This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images in under 10 ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.
Submitted 25 April, 2024;
originally announced April 2024.
-
Event-Based Eye Tracking. AIS 2024 Challenge Survey
Authors:
Zuowen Wang,
Chang Gao,
Zongwei Wu,
Marcos V. Conde,
Radu Timofte,
Shih-Chii Liu,
Qinyu Chen,
Zheng-jun Zha,
Wei Zhai,
Han Han,
Bohao Liao,
Yuliang Wu,
Zengyu Wan,
Zhong Wang,
Yang Cao,
Ganchao Tan,
Jinze Chen,
Yan Ru Pei,
Sasskia Brüers,
Sébastien Crouzet,
Douglas McLelland,
Oliver Coenen,
Baoheng Zhang,
Yizhao Gao,
Jingyuan Li
, et al. (14 additional authors not shown)
Abstract:
This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve a good trade-off between task accuracy and efficiency. During the challenge period, 38 participants registered for the Kaggle competition, and 8 teams submitted a challenge factsheet. The novel and diverse methods from the submitted factsheets are reviewed and analyzed in this survey to advance future event-based eye tracking research.
Submitted 17 April, 2024;
originally announced April 2024.
-
The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report
Authors:
Bin Ren,
Yawei Li,
Nancy Mehta,
Radu Timofte,
Hongyuan Yu,
Cheng Wan,
Yuxin Hong,
Bingnan Han,
Zhuoyuan Wu,
Yajun Zou,
Yuqing Liu,
Jizhe Li,
Keji He,
Chao Fan,
Heng Zhang,
Xiaolin Zhang,
Xuanwu Yin,
Kunlong Zuo,
Bohao Liao,
Peizhe Xia,
Long Peng,
Zhibo Du,
Xin Di,
Wangkai Li,
Yang Wang
, et al. (109 additional authors not shown)
Abstract:
This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.
Submitted 25 June, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera
Authors:
Yan Ru Pei,
Sasskia Brüers,
Sébastien Crouzet,
Douglas McLelland,
Olivier Coenen
Abstract:
Event-based data are commonly encountered in edge computing environments where efficiency and low latency are critical. To interface with such data and leverage their rich temporal features, we propose a causal spatiotemporal convolutional network. This solution targets efficient implementation on edge-appropriate hardware with limited resources in three ways: 1) it deliberately targets a simple architecture and set of operations (convolutions, ReLU activations); 2) it can be configured to perform online inference efficiently via buffering of layer outputs; 3) it can achieve more than 90% activation sparsity through regularization during training, enabling very significant efficiency gains on event-based processors. In addition, we propose a general affine augmentation strategy acting directly on the events, which alleviates the problem of dataset scarcity for event-based systems. We apply our model on the AIS 2024 event-based eye tracking challenge, reaching a score of 0.9916 p10 accuracy on the Kaggle private test set.
Submitted 12 April, 2024;
originally announced April 2024.
-
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Authors:
Runhui Huang,
Kaixin Cai,
Jianhua Han,
Xiaodan Liang,
Renjing Pei,
Guansong Lu,
Songcen Xu,
Wei Zhang,
Hang Xu
Abstract:
Despite the success of diffusion-based generative models in generating high-quality images from arbitrary text prompts, prior works directly generate entire images and cannot provide object-wise manipulation capability. In real applications such as professional graphic design and digital artistry, images are frequently created and manipulated in multiple layers to offer greater flexibility and control. Therefore, in this paper, we propose a layer-collaborative diffusion model, named LayerDiff, specifically designed for text-guided, multi-layered, composable image synthesis. The composable image consists of a background layer, a set of foreground layers, and associated mask layers for each foreground element. To enable this, LayerDiff introduces a layer-based generation paradigm incorporating multiple layer-collaborative attention modules to capture inter-layer patterns. Specifically, an inter-layer attention module is designed to encourage information exchange and learning between layers, while a text-guided intra-layer attention module incorporates layer-specific prompts to direct the specific-content generation for each layer. A layer-specific prompt-enhanced module better captures detailed textual cues from the global prompt. Additionally, a self-mask guidance sampling strategy further unleashes the model's ability to generate multi-layered images. We also present a pipeline that integrates existing perceptual and generative models to produce a large dataset of high-quality, text-prompted, multi-layered images. Extensive experiments demonstrate that our LayerDiff model can generate high-quality multi-layered images with performance comparable to conventional whole-image generation methods. Moreover, LayerDiff enables a broader range of controllable generative applications, including layer-specific image editing and style transfer.
Submitted 18 March, 2024;
originally announced March 2024.
-
CoSeR: Bridging Image and Language for Cognitive Super-Resolution
Authors:
Haoze Sun,
Wenbo Li,
Jianzhuang Liu,
Haoyu Chen,
Renjing Pei,
Xueyi Zou,
Youliang Yan,
Yujiu Yang
Abstract:
Existing super-resolution (SR) models primarily focus on restoring local texture details, often neglecting the global semantic information within the scene. This oversight can lead to the omission of crucial semantic details or the introduction of inaccurate textures during the recovery process. In our work, we introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images. We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process. To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention", consolidating all conditional information into a single module. Consequently, our method successfully restores semantically correct and photorealistic details, demonstrating state-of-the-art performance across multiple benchmarks. Code: https://github.com/VINHYU/CoSeR
Submitted 20 December, 2023; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Reproducible image-based profiling with Pycytominer
Authors:
Erik Serrano,
Srinivas Niranj Chandrasekaran,
Dave Bunten,
Kenneth I. Brewer,
Jenna Tomkinson,
Roshan Kern,
Michael Bornholdt,
Stephen Fleming,
Ruifan Pei,
John Arevalo,
Hillary Tsang,
Vincent Rubinetti,
Callum Tromans-Coia,
Tim Becker,
Erin Weisbart,
Charlotte Bunne,
Alexandr A. Kalinin,
Rebecca Senft,
Stephen J. Taylor,
Nasim Jamali,
Adeniyi Adeboye,
Hamdah Shafqat Abbasi,
Allen Goodman,
Juan C. Caicedo,
Anne E. Carpenter
, et al. (3 additional authors not shown)
Abstract:
Advances in high-throughput microscopy have enabled the rapid acquisition of large numbers of high-content microscopy images. Whether by deep learning or classical algorithms, image analysis pipelines then produce single-cell features. To process these single cells for downstream applications, we present Pycytominer, a user-friendly, open-source Python package that implements the bioinformatics steps, known as image-based profiling. We demonstrate Pycytominer's usefulness in a machine learning project to predict nuisance compounds that cause undesirable cell injuries.
Submitted 2 July, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models
Authors:
Tianyi Lu,
Xing Zhang,
Jiaxi Gu,
Hang Xu,
Renjing Pei,
Songcen Xu,
Zuxuan Wu
Abstract:
Latent Diffusion Models (LDMs) are renowned for their powerful capabilities in image and video synthesis. Yet, video editing methods suffer from insufficient pre-training data or video-by-video re-training cost. In addressing this gap, we propose FLDM (Fused Latent Diffusion Model), a training-free framework to achieve text-guided video editing by applying off-the-shelf image editing methods in video LDMs. Specifically, FLDM fuses latents from an image LDM and a video LDM during the denoising process. In this way, temporal consistency can be kept with the video LDM while the high fidelity of the image LDM can also be exploited. Meanwhile, FLDM possesses high flexibility since both the image LDM and the video LDM can be replaced, so advanced image editing methods such as InstructPix2Pix and ControlNet can be exploited. To the best of our knowledge, FLDM is the first method to adapt off-the-shelf image editing methods into video LDMs for video editing. Extensive quantitative and qualitative experiments demonstrate that FLDM can improve the textual alignment and temporal consistency of edited videos.
Submitted 25 October, 2023;
originally announced October 2023.
-
Periodic linear complexions: Co-segregation of solutes at a low-angle grain boundary in a magnesium alloy
Authors:
Risheng Pei,
Zhuocheng Xie,
Achraf Atila,
Simon Anoldi,
Lei Xiao,
Xiaoqing Liu,
Hexin Wang,
Sandra Korte-Kerzel,
Julien Guénolé,
Talal Al-Samman
Abstract:
Solute segregation at low angle grain boundaries (LAGB) in Mg alloys significantly affects GB energy and mobility, and therefore recrystallization kinetics and corresponding texture modification. In a system featuring multiple substitutional elements at high local concentration levels, solute-solute interaction needs to be considered to interpret and predict co-segregation behavior. In this work, atomic-scale experimental and modelling techniques were applied to investigate the co-segregation behavior of Ca, Zn and Al solutes at a LAGB in a Mg alloy. Three-dimensional atom probe tomography and corresponding clustering analysis revealed a strong clustering tendency of Ca solutes at the linear dislocation arrays. Atomistic simulations indicate that the co-segregation of Ca-Ca pairs in the vicinity of the dislocation core region is more energetically favorable than that of other solute pairs, as well as the segregation of individual solutes.
Submitted 17 October, 2023;
originally announced October 2023.
-
Crosslingual Transfer Learning for Low-Resource Languages Based on Multilingual Colexification Graphs
Authors:
Yihong Liu,
Haotian Ye,
Leonie Weissweiler,
Renhao Pei,
Hinrich Schütze
Abstract:
In comparative linguistics, colexification refers to the phenomenon of a lexical form conveying two or more distinct meanings. Existing work on colexification patterns relies on annotated word lists, limiting scalability and usefulness in NLP. In contrast, we identify colexification patterns of more than 2,000 concepts across 1,335 languages directly from an unannotated parallel corpus. We then propose simple and effective methods to build multilingual graphs from the colexification patterns: ColexNet and ColexNet+. ColexNet's nodes are concepts and its edges are colexifications. In ColexNet+, concept nodes are additionally linked through intermediate nodes, each representing an ngram in one of 1,334 languages. We use ColexNet+ to train $\overrightarrow{\mbox{ColexNet+}}$, high-quality multilingual embeddings that are well-suited for transfer learning. In our experiments, we first show that ColexNet achieves high recall on CLICS, a dataset of crosslingual colexifications. We then evaluate $\overrightarrow{\mbox{ColexNet+}}$ on roundtrip translation, sentence retrieval and sentence classification and show that our embeddings surpass several transfer learning baselines. This demonstrates the benefits of using colexification as a source of information in multilingual NLP.
Submitted 19 October, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages
Authors:
Chunlan Ma,
Ayyoob ImaniGooghari,
Haotian Ye,
Renhao Pei,
Ehsaneddin Asgari,
Hinrich Schütze
Abstract:
While natural language processing tools have been developed extensively for some of the world's languages, a significant portion of the world's over 7000 languages are still neglected. One reason for this is that evaluation datasets do not yet cover a wide range of languages, including low-resource and endangered ones. We aim to address this issue by creating a text classification dataset encompassing a large number of languages, many of which currently have little to no annotated data available. We leverage parallel translations of the Bible to construct such a dataset by first developing applicable topics and employing a crowdsourcing tool to collect annotated data. By annotating the English side of the data and projecting the labels onto other languages through aligned verses, we generate text classification datasets for more than 1500 languages. We extensively benchmark several existing multilingual language models using our dataset. To facilitate the advancement of research in this area, we will release our dataset and code.
Submitted 4 June, 2024; v1 submitted 15 May, 2023;
originally announced May 2023.
-
A Crosslingual Investigation of Conceptualization in 1335 Languages
Authors:
Yihong Liu,
Haotian Ye,
Leonie Weissweiler,
Philipp Wicke,
Renhao Pei,
Robert Zangenfeind,
Hinrich Schütze
Abstract:
Languages differ in how they divide up the world into concepts and words; e.g., in contrast to English, Swahili has a single concept for 'belly' and 'womb'. We investigate these differences in conceptualization across 1,335 languages by aligning concepts in a parallel corpus. To this end, we propose Conceptualizer, a method that creates a bipartite directed alignment graph between source language concepts and sets of target language strings. In a detailed linguistic analysis across all languages for one concept ('bird') and an evaluation on gold standard data for 32 Swadesh concepts, we show that Conceptualizer has good alignment accuracy. We demonstrate the potential of research on conceptualization in NLP with two experiments. (1) We define crosslingual stability of a concept as the degree to which it has 1-1 correspondences across languages, and show that concreteness predicts stability. (2) We represent each language by its conceptualization pattern for 83 concepts, and define a similarity measure on these representations. The resulting measure for the conceptual similarity of two languages is complementary to standard genealogical, typological, and surface similarity measures. For four out of six language families, we can assign languages to their correct family based on conceptual similarity with accuracy between 54% and 87%.
Submitted 26 May, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Orientation relationship of FeNiC and FeNiCSi from variant detection in EBSD data
Authors:
Mattis Seehaus,
Risheng Pei,
Sandra Korte-Kerzel,
Stefanie Sandlöbes-Haut
Abstract:
The determination of orientation relationships in dual or multi-phase materials is very important in the field of interface engineering for the design of materials with tailored properties. In this work, a code is developed for the automated and statistical analysis of the orientation relationship of electron backscatter diffraction data. On the example of Fe-Ni-(Si)-C alloys containing lenticular martensite and retained austenite, the code is applied and it is shown that the orientation relationship (OR) corresponds to the Greninger-Troiano OR and that a statistically reliable investigation of the OR between the retained austenite and the related martensite variants is feasible using the code developed in this study.
Submitted 5 April, 2023;
originally announced April 2023.
-
Strengthening of Mg-Al-Ca alloys with C15 and C36 Laves phases
Authors:
Mhammad Zubair,
Stefanie Sandlöbes-Haut,
Risheng Pei,
Maximilian A. Wollenweber,
James S. K-L. Gibson,
S. Korte-Kerzel
Abstract:
The Laves phase skeleton in cast Mg-Al-Ca alloys is known to provide considerable strengthening. Laves phases such as CaMg$_2$ (C14), Ca(Al,Mg)$_2$ (C36), and CaAl$_2$ (C15) have high melting points, high hardness at room and elevated temperatures, but unfortunately are inherently brittle. Mg-Al-Ca alloys thus have good creep properties but limited ductility. An understanding of the co-deformation behaviour of $α$-Mg and Laves phases is essential for optimising the strength-ductility balance of these alloys. Here, we study the mechanical behaviour of a Mg-4.65Al-2.82Ca alloy using micropillar compression in the $α$-Mg matrix, at $α$-Mg/C36 and $α$-Mg/C15 interfaces and in the C15 phase in combination with scanning electron microscopy (SE imaging), electron backscatter diffraction (EBSD), transmission Kikuchi diffraction (TKD), and low-kV scanning transmission electron microscopy (STEM). We show that both the C15 and C36 Laves phases provide considerable strengthening to the $α$-Mg matrix by delaying the onset of basal slip and extension twinning, while only the C36 phase appears to allow a certain extent of slip transfer/plastic co-deformation, in spite of its greater anisotropy compared with the cubic C15 phase. We therefore conclude based on these results that strengthening of the $α$-Mg matrix by the C36 Laves phase is preferable given that it combines easy skeleton formation with some co-deformation and considerable stability at common application temperatures of magnesium alloys.
Submitted 9 February, 2023;
originally announced February 2023.
-
Atomistic insights into the inhomogeneous nature of solute segregation to grain boundaries in magnesium
Authors:
Risheng Pei,
Zhuocheng Xie,
Sangbong Yi,
Sandra Korte-Kerzel,
Julien Guénolé,
Talal Al-Samman
Abstract:
In magnesium alloys with multiple substitutional elements, solute segregation at grain boundaries (GBs) has a strong impact on many important material characteristics, such as GB energy and mobility, and therefore, texture. Although it is well established that GB segregation is inhomogeneous, the variation of GB solute composition for random boundaries is still not understood. In the current study, atomic-scale experimental and simulation techniques were used to investigate the compositional inhomogeneity of six different GBs. Three-dimensional atom probe tomography results revealed that the GB solute concentration of Nd in Mg varies between 2 and 5 at.%. This variation was not only seen for different GB orientations but also within the GB plane. Correlated atomistic simulations suggest that the inhomogeneous segregation behavior observed experimentally stems from local atomic rearrangements within the GBs and introduce the notion of potential excess free volume in the context of improving the prediction of per-site segregation energies.
Submitted 16 March, 2023; v1 submitted 8 January, 2022;
originally announced January 2022.
-
Periodic Fast Multipole Method
Authors:
Ruqi Pei,
Travis Askham,
Leslie Greengard,
Shidong Jiang
Abstract:
A new scheme is presented for imposing periodic boundary conditions on unit cells with arbitrary source distributions. We restrict our attention here to the Poisson, modified Helmholtz, Stokes and modified Stokes equations. The approach extends to the oscillatory equations of mathematical physics, including the Helmholtz and Maxwell equations, but we will address these in a companion paper, since the nature of the problem is somewhat different and includes the consideration of quasiperiodic boundary conditions and resonances. Unlike lattice sum-based methods, the scheme is insensitive to the unit cell's aspect ratio and is easily coupled to adaptive fast multipole methods (FMMs). Our analysis relies on classical "plane-wave" representations of the fundamental solution, and yields an explicit low-rank representation of the field due to all image sources beyond the first layer of neighboring unit cells. When the aspect ratio of the unit cell is large, our scheme can be coupled with the nonuniform fast Fourier transform (NUFFT) to accelerate the evaluation of the induced field. Its performance is illustrated with several numerical examples.
Submitted 1 November, 2021;
originally announced November 2021.
-
A Finite-temperature Phase Transition for the Ising Spin-glass in $d\geq 2$
Authors:
Yan Ru Pei,
Massimiliano Di Ventra
Abstract:
It is believed that the $\pm J$ Ising spin-glass does not order at finite temperatures in dimension $d=2$. However, using a graphical representation and a contour argument, we prove rigorously the existence of a finite-temperature phase transition in $d\geq 2$ with $T_c \geq 0.4$. In the graphical representation, the low-temperature phase allows for the coexistence of multiple infinite clusters each with a rigidly aligned spin-overlap state. These clusters correlate negatively with each other, and are entropically stable without breaking any global symmetry. They can emerge in most graph structures and disorder measures.
Submitted 19 September, 2022; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Non-equilibrium criticality and efficient exploration of glassy landscapes with memory dynamics
Authors:
Yan Ru Pei,
Massimiliano Di Ventra
Abstract:
Spin glasses are notoriously difficult to study both analytically and numerically due to the presence of frustration and metastability. Their highly non-convex landscapes require collective updates to explore efficiently. Currently, most state-of-the-art algorithms rely on stochastic spin clusters to perform non-local updates, but such "cluster algorithms" lack general efficiency. Here, we introduce a non-equilibrium approach for simulating spin glasses based on classical dynamics with memory. By simulating various classes of 3d spin glasses (Edwards-Anderson, partially-frustrated, and fully-frustrated models), we find that memory dynamically promotes critical spin clusters during time evolution, in a self-organizing manner. This facilitates an efficient exploration of the low-temperature phases of spin glasses.
Submitted 27 March, 2021; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Efficient Solution of Boolean Satisfiability Problems with Digital MemComputing
Authors:
S. R. B. Bearden,
Y. R. Pei,
M. Di Ventra
Abstract:
Boolean satisfiability is a propositional logic problem of interest in multiple fields, e.g., physics, mathematics, and computer science. Beyond a field of research, instances of the SAT problem, as it is known, require efficient solution methods in a variety of applications. It is the decision problem of determining whether a Boolean formula has a satisfying assignment, believed to require exponentially growing time for an algorithm to solve for the worst-case instances. Yet, the efficient solution of many classes of Boolean formulae eludes even the most successful algorithms, not only for the worst-case scenarios, but also for typical-case instances. Here, we introduce a memory-assisted physical system (a digital memcomputing machine) that, when its non-linear ordinary differential equations are integrated numerically, shows evidence for polynomially-bounded scalability while solving "hard" planted-solution instances of SAT, known to require exponential time to solve in the typical case for both complete and incomplete algorithms. Furthermore, we analytically demonstrate that the physical system can efficiently solve the SAT problem in continuous time, without the need to introduce chaos or an exponentially growing energy. The efficiency of the simulations is related to the collective dynamical properties of the original physical system that persist in the numerical integration to robustly guide the solution search even in the presence of numerical errors. We anticipate our results to broaden research directions in physics-inspired computing paradigms ranging from theory to application, from simulation to hardware implementation.
Submitted 12 November, 2020;
originally announced November 2020.
-
Deformation of micrometer and mm-sized Fe2.4wt.%Si single- and bi-crystals with a high angle grain boundary at room temperature
Authors:
Martin Heller,
James S. K. -L. Gibson,
Risheng Pei,
Sandra Korte-Kerzel
Abstract:
Plasticity in body-centred cubic (BCC) metals, including dislocation interactions at grain boundaries, is much less understood than in face-centred cubic (FCC) metals. At low temperatures additional resistance to dislocation motion due to the Peierls barrier becomes important, which increases the complexity of plasticity. Iron-silicon steel is an interesting model BCC material since the evolution of the dislocation structure in specifically-oriented grains and at particular grain boundaries has far-reaching effects not only on the deformation behaviour but also on the magnetic properties, which are important in its final application as electrical steel. In this study, two different orientations of micropillars (1, 2, 4 microns in diameter) and macropillars (2500 microns) and their corresponding bi-crystals are analysed after compression experiments with respect to the effect of size on strength and dislocation structures. Using different experimental methods, such as slip trace analysis, plane tilt analysis and cross-sectional EBSD, we show that direct slip transmission occurs, and different slip systems are active in the bi-crystals compared to their single-crystal counterparts. However, in spite of direct transmission and a very high transmission factor, dislocation pile-up at the grain boundary is also observed at early stages of deformation. Moreover, an effect of size scaling with the pillar size in single crystals and the grain size in bi-crystals is found, which is consistent with investigations elsewhere in FCC metals.
Submitted 10 June, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines
Authors:
Haik Manukian,
Yan Ru Pei,
Sean R. B. Bearden,
Massimiliano Di Ventra
Abstract:
Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over traditional gradient methods. This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). Along with the proofs of stability and convergence of this method, we also demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard, MNIST. The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional and unrestricted Boltzmann machines.
Submitted 19 January, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
The Optimal Deterrence of Crime: A Focus on the Time Preference of DWI Offenders
Authors:
Yuqing Wang,
Yan Ru Pei
Abstract:
We develop a general model for finding the optimal penal strategy based on the behavioral traits of the offenders. We focus on how the discount rate (level of time discounting) affects criminal propensity on the individual level, and how the aggregation of these effects influences criminal activities on the population level. The effects are aggregated based on the distribution of the discount rate among the population. We study this distribution empirically through a survey with 207 participants, and we show that it follows a zero-inflated exponential distribution. We quantify the effectiveness of the penal strategy as its net utility for the population, and show how this quantity can be maximized. When we apply the maximization procedure to the offense of impaired driving (DWI), we discover that the effectiveness of DWI deterrence depends critically on the amount of the fine and the prison conditions.
Submitted 20 September, 2019; v1 submitted 13 September, 2019;
originally announced September 2019.
-
Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops
Authors:
Yan Ru Pei,
Haik Manukian,
Massimiliano Di Ventra
Abstract:
Many optimization problems can be cast into the maximum satisfiability (MAX-SAT) form, and many solvers have been developed for tackling such problems. To evaluate a MAX-SAT solver, it is convenient to generate hard MAX-SAT instances with known solutions. Here, we propose a method of generating weighted MAX-2-SAT instances inspired by the frustrated-loop algorithm used by the quantum annealing community. We extend the algorithm to instances of general bipartite couplings, with the associated optimization problem being the minimization of the restricted Boltzmann machine (RBM) energy over the nodal values, which is useful for effectively pre-training the RBM. The hardness of the generated instances can be tuned through a central parameter known as the frustration index. Two versions of the algorithm are presented: the random- and structured-loop algorithms. For the random-loop algorithm, we provide a thorough theoretical and empirical analysis of its mathematical properties from the perspective of frustration, and observe empirically a double phase transition in the hardness scaling driven by the frustration index. For the structured-loop algorithm, we show that it offers an improvement in hardness over the random-loop algorithm in the regime of high loop density, with the variation of hardness tunable through the concentration of frustrated weights.
Submitted 11 March, 2020; v1 submitted 13 May, 2019;
originally announced May 2019.
-
On the Universality of Memcomputing Machines
Authors:
Yan Ru Pei,
Fabio L. Traversa,
Massimiliano Di Ventra
Abstract:
Universal memcomputing machines (UMMs) [IEEE Trans. Neural Netw. Learn. Syst. 26, 2702 (2015)] represent a novel computational model in which memory (time non-locality) accomplishes both tasks of storing and processing of information. UMMs have been shown to be Turing-complete, namely they can simulate any Turing machine. In this paper, using set theory and cardinality arguments, we compare them with liquid-state machines (or "reservoir computing") and quantum machines ("quantum computing"). We show that UMMs can simulate both types of machines, hence they are both "liquid-" or "reservoir-complete" and "quantum-complete". Of course, these statements pertain only to the type of problems these machines can solve, and not to the amount of resources required for such simulations. Nonetheless, the method presented here provides a general framework in which to describe the relation between UMMs and any other type of computational model.
Submitted 10 May, 2019; v1 submitted 22 December, 2017;
originally announced December 2017.
-
Edge-colorings of $K_{m,n}$ which Forbid Multicolored Cycles
Authors:
Hung-Lin Fu,
Yuan-Hsun Lo,
Ryo-Yu Pei
Abstract:
A subgraph in an edge-colored graph is multicolored if all its edges receive distinct colors. In this paper, we study the proper edge-colorings of the complete bipartite graph $K_{m,n}$ which forbid multicolored cycles. Mainly, we (1) prove that for any integer $k\geq 2$, if $n\geq 5k-6$, then any properly $n$-edge-colored $K_{k,n}$ contains a multicolored $C_{2k}$, and (2) determine the order of the properly edge-colored complete bipartite graphs which forbid multicolored $C_6$.
Submitted 30 June, 2014;
originally announced July 2014.