A Comprehensive Survey on the Security of Smart Grid: Challenges, Mitigations, and Future Research Opportunities

Authors: Arastoo Zibaeirad, Farnoosh Koleini, Shengping Bi, Tao Hou, Tao Wang

Abstract: In this study, we conduct a comprehensive review of smart grid security, exploring system architectures, attack methodologies, defense strategies, and future research opportunities. We provide an in-depth analysis of various attack vectors, focusing on new attack surfaces introduced by advanced components in smart grids. The review particularly includes an extensive analysis of coordinated attacks that incorporate multiple attack strategies and exploit vulnerabilities across various smart grid components to increase their adverse impact, demonstrating the complexity and potential severity of these threats. Following this, we examine innovative detection and mitigation strategies, including game theory, graph theory, blockchain, and machine learning, discussing their advancements in counteracting evolving threats and associated research challenges. In particular, our review covers a thorough examination of widely used machine learning-based mitigation strategies, analyzing their applications and research challenges spanning across supervised, unsupervised, semi-supervised, ensemble, and reinforcement learning. Further, we outline future research directions and explore new techniques and concerns. We first discuss the research opportunities for existing and emerging strategies, and then explore the potential role of new techniques, such as large language models (LLMs), and the emerging threat of adversarial machine learning in the future of smart grid security.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2406.09371 [pdf, other]

LRM-Zero: Training Large Reconstruction Models with Synthesized Data

Authors: Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie Kaufman, Xin Sun, Hao Tan

Abstract: We present LRM-Zero, a Large Reconstruction Model (LRM) trained entirely on synthesized 3D data, achieving high-quality sparse-view 3D reconstruction. The core of LRM-Zero is our procedural 3D dataset, Zeroverse, which is automatically synthesized from simple primitive shapes with random texturing and augmentations (e.g., height fields, boolean differences, and wireframes). Unlike previous 3D datasets (e.g., Objaverse) which are often captured or crafted by humans to approximate real 3D data, Zeroverse completely ignores realistic global semantics but is rich in complex geometric and texture details that are locally similar to or even more intricate than real objects. We demonstrate that our LRM-Zero, trained with our fully synthesized Zeroverse, can achieve high visual quality in the reconstruction of real-world objects, competitive with models trained on Objaverse. We also analyze several critical design choices of Zeroverse that contribute to LRM-Zero's capability and training stability. Our work demonstrates that 3D reconstruction, one of the core tasks in 3D vision, can potentially be addressed without the semantics of real-world objects. The Zeroverse's procedural synthesis code and interactive visualization are available at: https://desaixie.github.io/lrm-zero/.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 23 pages, 8 figures. Our code and interactive visualization are available at: https://desaixie.github.io/lrm-zero/

arXiv:2406.07520 [pdf, other]

Neural Gaffer: Relighting Any Object via Diffusion

Authors: Haian Jin, Yuan Li, Fujun Luan, Yuanbo Xiangli, Sai Bi, Kai Zhang, Zexiang Xu, Jin Sun, Noah Snavely

Abstract: Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BRDFs, which can be inaccurate or under-expressive. In this work, we propose a novel end-to-end 2D relighting diffusion model, called Neural Gaffer, that takes a single image of any object and can synthesize an accurate, high-quality relit image under any novel environmental lighting condition, simply by conditioning an image generator on a target environment map, without an explicit scene decomposition. Our method builds on a pre-trained diffusion model, and fine-tunes it on a synthetic relighting dataset, revealing and harnessing the inherent understanding of lighting present in the diffusion model. We evaluate our model on both synthetic and in-the-wild Internet imagery and demonstrate its advantages in terms of generalization and accuracy. Moreover, by combining with other generative methods, our model enables many downstream 2D tasks, such as text-based relighting and object insertion. Our model can also operate as a strong relighting prior for 3D tasks, such as relighting a radiance field.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Project Website: https://neural-gaffer.github.io

arXiv:2405.17129 [pdf, other]

TEII: Think, Explain, Interact and Iterate with Large Language Models to Solve Cross-lingual Emotion Detection

Authors: Long Cheng, Qihao Shao, Christine Zhao, Sheng Bi, Gina-Anne Levow

Abstract: Cross-lingual emotion detection allows us to analyze global trends, public opinion, and social phenomena at scale. We participated in the Explainability of Cross-lingual Emotion Detection (EXALT) shared task, achieving an F1-score of 0.6046 on the evaluation set for the emotion detection sub-task. Our system outperformed the baseline by more than 0.16 F1-score absolute, and ranked second amongst competing systems. We conducted experiments using fine-tuning, zero-shot learning, and few-shot learning for Large Language Model (LLM)-based models as well as embedding-based BiLSTM and KNN for non-LLM-based techniques. Additionally, we introduced two novel methods: the Multi-Iteration Agentic Workflow and the Multi-Binary-Classifier Agentic Workflow. We found that LLM-based approaches provided good performance on multilingual emotion detection. Furthermore, ensembles combining all our experimented models yielded higher F1-scores than any single approach alone.

Submitted 2 July, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis (ACL 2024)

arXiv:2405.16363 [pdf, other]

LLMs for User Interest Exploration in Large-scale Recommendation Systems

Authors: Jianling Wang, Haokai Lu, Yifan Liu, He Ma, Yueqi Wang, Yang Gu, Shuzhou Zhang, Ningren Han, Shuchao Bi, Lexi Baugher, Ed Chi, Minmin Chen

Abstract: Traditional recommendation systems are subject to a strong feedback loop by learning from and reinforcing past user-item interactions, which in turn limits the discovery of novel user interests. To address this, we introduce a hybrid hierarchical framework combining Large Language Models (LLMs) and classic recommendation models for user interest exploration. The framework controls the interfacing between the LLMs and the classic recommendation models through "interest clusters", the granularity of which can be explicitly determined by algorithm designers. At the high level, it recommends the next novel interests by first representing "interest clusters" using language, and employs a fine-tuned LLM to generate novel interest descriptions that are strictly within these predefined clusters. At the low level, it grounds these generated interests to an item-level policy by restricting classic recommendation models, in this case a transformer-based sequence recommender, to return items that fall within the novel clusters generated at the high level. We showcase the efficacy of this approach on an industrial-scale commercial platform serving billions of users. Live experiments show a significant increase in both exploration of novel interests and overall user enjoyment of the platform.

Submitted 7 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.14847 [pdf, other]

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Authors: Liwen Wu, Sai Bi, Zexiang Xu, Fujun Luan, Kai Zhang, Iliyan Georgiev, Kalyan Sunkavalli, Ravi Ramamoorthi

Abstract: Novel-view synthesis of specular objects like shiny metals or glossy paints remains a significant challenge. Not only the glossy appearance but also global illumination effects, including reflections of other objects in the environment, are critical components to faithfully reproduce a scene. In this paper, we present Neural Directional Encoding (NDE), a view-dependent appearance encoding of neural radiance fields (NeRF) for rendering specular objects. NDE transfers the concept of feature-grid-based spatial encoding to the angular domain, significantly improving the ability to model high-frequency angular signals. In contrast to previous methods that use encoding functions with only angular input, we additionally cone-trace spatial features to obtain a spatially varying directional encoding, which addresses the challenging interreflection effects. Extensive experiments on both synthetic and real datasets show that a NeRF model with NDE (1) outperforms the state of the art on view synthesis of specular objects, and (2) works with small networks to allow fast (real-time) inference. The project webpage and source code are available at: https://lwwu2.github.io/nde/.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted to CVPR 2024

arXiv:2405.12523 [pdf, other]

Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models

Authors: Jiaqi Li, Qianshan Wei, Chuanyi Zhang, Guilin Qi, Miaozeng Du, Yongrui Chen, Sheng Bi

Abstract: Machine unlearning (MU) empowers individuals with the 'right to be forgotten' by removing their private or sensitive information encoded in machine learning models. However, it remains uncertain whether MU can be effectively applied to Multimodal Large Language Models (MLLMs), particularly in scenarios of forgetting the leaked visual data of concepts. To overcome the challenge, we propose an efficient method, Single Image Unlearning (SIU), to unlearn the visual recognition of a concept by fine-tuning on a single associated image for a few steps. SIU consists of two key aspects: (i) Constructing multifaceted fine-tuning data. We introduce four targets, based on which we construct fine-tuning data for the concepts to be forgotten; (ii) Joint training loss. To simultaneously forget the visual recognition of concepts and preserve the utility of MLLMs, we fine-tune MLLMs through a novel Dual Masked KL-divergence Loss combined with Cross Entropy loss. Alongside our method, we establish MMUBench, a new benchmark for MU in MLLMs, and introduce a collection of metrics for its evaluation. Experimental results on MMUBench show that SIU completely surpasses the performance of existing methods. Furthermore, we surprisingly find that SIU can avoid invasive membership inference attacks and jailbreak attacks. To the best of our knowledge, we are the first to explore MU in MLLMs. We will release the code and benchmark in the near future.

Submitted 29 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2404.19702 [pdf, other]

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

Authors: Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu

Abstract: We propose GS-LRM, a scalable large reconstruction model that can predict high-quality 3D Gaussian primitives from 2-4 posed sparse images in 0.23 seconds on a single A100 GPU. Our model features a very simple transformer-based architecture; we patchify input posed images, pass the concatenated multi-view image tokens through a sequence of transformer blocks, and decode final per-pixel Gaussian parameters directly from these tokens for differentiable rendering. In contrast to previous LRMs that can only reconstruct objects, by predicting per-pixel Gaussians, GS-LRM naturally handles scenes with large variations in scale and complexity. We show that our model can work on both object and scene captures by training it on Objaverse and RealEstate10K respectively. In both scenarios, the models outperform state-of-the-art baselines by a wide margin. We also demonstrate applications of our model in downstream 3D generation tasks. Our project webpage is available at: https://sai-bi.github.io/project/gs-lrm/.

Submitted 30 April, 2024; originally announced April 2024.

Comments: Project webpage: https://sai-bi.github.io/project/gs-lrm/

arXiv:2404.18184 [pdf]

doi 10.23977/infse.2024.050217

Application and practice of AI technology in quantitative investment

Authors: Shuochen Bi, Wenqing Bao, Jue Xiao, Jiangshan Wang, Tingting Deng

Abstract: With the continuous development of artificial intelligence technology, using machine learning technology to predict market trends may no longer be out of reach. In recent years, artificial intelligence has become a research hotspot in academic circles, and it has been widely used in image recognition, natural language processing and other fields, and also has a huge impact on the field of quantitative investment. As an investment method to obtain stable returns through data analysis, model construction and program trading, quantitative investment is widely favored by financial institutions and investors. At the same time, as an important application field of quantitative investment, the quantitative investment strategy based on artificial intelligence technology has emerged at an opportune moment. How to apply artificial intelligence to quantitative investment, so as to better achieve profit and risk control, has also become a focus and a difficulty of the research. From a global perspective, inflation in the US and the Federal Reserve are the concerns of investors, which to some extent affects the direction of global assets, including the Chinese stock market. This paper studies AI technology, quantitative investment, and the application of AI technology in quantitative investment, aiming to provide investors with auxiliary decision-making, reduce the difficulty of investment analysis, and help them to obtain higher returns.

Submitted 28 April, 2024; originally announced April 2024.

Comments: 9 pages, 2 figures

Journal ref: Information Systems and Economics (2024) Clausius Scientific Press, Canada, ISSN 2523-6407 Vol. 5 Num. 2

arXiv:2404.18183 [pdf]

doi 10.62051/IJGEM.v2n3.08

Innovative Application of Artificial Intelligence Technology in Bank Credit Risk Management

Authors: Shuochen Bi, Wenqing Bao

Abstract: With the rapid growth of technology, especially the widespread application of artificial intelligence (AI) technology, the risk management level of commercial banks is constantly reaching new heights. In the current wave of digitalization, AI has become a key driving force for the strategic transformation of financial institutions, especially the banking industry. For commercial banks, the stability and safety of asset quality are crucial, as they directly relate to the long-term stable growth of the bank. Within this, credit risk management is particularly central because it involves the flow of a large amount of funds and the accuracy of credit decisions. Therefore, establishing a scientific and effective credit risk decision-making mechanism is of great strategic significance for commercial banks. In this context, the innovative application of AI technology has brought revolutionary changes to bank credit risk management. Through deep learning and big data analysis, AI can accurately evaluate the credit status of borrowers, identify potential risks in a timely manner, and provide banks with more accurate and comprehensive credit decision support. At the same time, AI can also achieve real-time monitoring and early warning, helping banks intervene before risks occur and reduce losses.

Submitted 28 April, 2024; originally announced April 2024.

Comments: 6 pages, 1 figure, 2 tables

Journal ref: International Journal of Global Economics and Management ISSN: 3005-9690 (Print), ISSN: 3005-8090 (Online) | Volume 2, Number 3, Year 2024

arXiv:2404.13850 [pdf, other]

Reconstructing Intrinsic Stellar Noise with Stellar Atmospheric Parameters and Chromospheric Activity

Authors: Jinghua Zhang, Maosheng Xiang, Jie Yu, Jian Ge, Ji-Wei Xie, Hui Zhang, Yaguang Li, You Wu, Chun-Qian Li, Shaolan Bi, Hong-Liang Yan, Jian-Rong Shi

Abstract: Accurately characterizing intrinsic stellar photometric noise induced by stellar astrophysics, such as stellar activity, granulation, and oscillations, is of crucial importance for detecting transiting exoplanets. In this study, we investigate the relation between the intrinsic stellar photometric noise, as quantified by the Kepler rrmsCDPP measurement, and the level of stellar chromospheric activity, as indicated by the S-index of Ca II HK lines derived from the LAMOST spectra. Our results reveal a clear positive correlation between S-index and rrmsCDPP, and the correlation becomes more significant at higher activity levels and on longer timescales. We have therefore built an empirical relation between rrmsCDPP and S-index as well as Teff, logg, [Fe/H], and apparent magnitude with the XGBoost regression algorithm, using the LAMOST-Kepler common star sample as the training set. This method achieves a precision of ~20 ppm for inferring the intrinsic noise from the S-index and other stellar labels on a 6-hour integration duration. We have applied this empirical relation to the full LAMOST DR7 spectra database, and obtained the intrinsic noise predictions for 1,358,275 stars. The resultant catalog is publicly available and expected to be valuable for optimizing target selection for future exoplanet-hunting space missions, such as the Earth 2.0 mission.

Submitted 21 April, 2024; originally announced April 2024.

Comments: Accepted for publication in ApJS

arXiv:2404.12385 [pdf, other]

MeshLRM: Large Reconstruction Model for High-Quality Mesh

Authors: Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, Zexiang Xu

Abstract: We propose MeshLRM, a novel LRM-based approach that can reconstruct a high-quality mesh from merely four input images in less than one second. Different from previous large reconstruction models (LRMs) that focus on NeRF-based reconstruction, MeshLRM incorporates differentiable mesh extraction and rendering within the LRM framework. This allows for end-to-end mesh reconstruction by fine-tuning a pre-trained NeRF LRM with mesh rendering. Moreover, we improve the LRM architecture by simplifying several complex designs in previous LRMs. MeshLRM's NeRF initialization is sequentially trained with low- and high-resolution images; this new LRM training strategy enables significantly faster convergence and thereby leads to better quality with less compute. Our approach achieves state-of-the-art mesh reconstruction from sparse-view inputs and also allows for many downstream applications, including text-to-3D and single-image-to-3D generation. Project page: https://sarahweiii.github.io/meshlrm/

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.04526 [pdf, other]

DATENeRF: Depth-Aware Text-based Editing of NeRFs

Authors: Sara Rojas, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli

Abstract: Recent advancements in diffusion models have shown remarkable proficiency in editing 2D images based on text prompts. However, extending these techniques to edit scenes in Neural Radiance Fields (NeRF) is complex, as editing individual 2D frames can result in inconsistencies across multiple views. Our crucial insight is that a NeRF scene's geometry can serve as a bridge to integrate these 2D edits. Utilizing this geometry, we employ a depth-conditioned ControlNet to enhance the coherence of each 2D image modification. Moreover, we introduce an inpainting approach that leverages the depth information of NeRF scenes to distribute 2D edits across different images, ensuring robustness against errors and resampling challenges. Our results reveal that this methodology achieves more consistent, lifelike, and detailed edits than existing leading methods for text-driven NeRF scene editing.

Submitted 6 April, 2024; originally announced April 2024.

Comments: 14 pages, Conference paper, 3D Scene Editing, Neural Rendering, Diffusion Models

arXiv:2404.00940 [pdf, ps, other]

Sequential Decision-Making under Uncertainty: A Robust MDPs review

Authors: Wenfan Ou, Sheng Bi

Abstract: Fueled by both advances in robust optimization theory and applications of reinforcement learning, robust Markov Decision Processes (RMDPs) have gained increasing attention, due to their powerful capability for sequential decision-making under uncertainty. This review provides an in-depth overview of the evolution and advances in RMDP formulations, particularly in ambiguity modeling, and classifies the methods for representing uncertainty into three principal approaches: parametric, moment-based, and discrepancy-based, elaborating the trade-offs among the alternative representations. The review also delves into the rectangular assumptions, which guarantee the tractability of RMDPs yet are noted for their conservatism. The review summarizes three popular rectangular conditions and develops a new proof of the NP-hardness of non-rectangular RMDPs. Beyond the traditional RMDP scope, recent efforts that drop the conventional rectangular assumptions and new trends within the RMDP community are also reviewed. These studies foster the development of more flexible and practical modeling frameworks and enhance the adaptability and performance of RMDPs.

Submitted 3 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.
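
Illustrative note (not from the paper): the tractability that rectangular assumptions provide can be seen in a minimal sketch of robust value iteration. The sketch assumes an (s, a)-rectangular ambiguity set given as a finite list of candidate transition distributions per state-action pair; the function name, argument layout, and toy numbers are all hypothetical choices made for this example. Under that assumption, the worst case can be taken independently for each state-action pair inside the Bellman update.

```python
def robust_value_iteration(rewards, ambiguity, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Robust value iteration for a small tabular RMDP (illustrative sketch).

    rewards[s][a]   -- immediate reward for action a in state s
    ambiguity[s][a] -- finite list of candidate next-state distributions,
                       i.e. an (s, a)-rectangular ambiguity set (assumption)
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = []
        for s in range(n_states):
            best = float("-inf")
            for a in range(len(rewards[s])):
                # Nature picks the candidate distribution that minimizes
                # the expected next-state value for this (s, a) pair.
                worst = min(
                    sum(p * v for p, v in zip(dist, values))
                    for dist in ambiguity[s][a]
                )
                # The agent then maximizes the worst-case return.
                best = max(best, rewards[s][a] + gamma * worst)
            new_values.append(best)
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


if __name__ == "__main__":
    # Toy problem: state 1 is absorbing with zero reward. In state 0,
    # action 0 is safe; action 1 pays more, but nature may force a move
    # to state 1.
    rewards = [[1.0, 2.0], [0.0]]
    ambiguity = [
        [[[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]],
        [[[0.0, 1.0]]],
    ]
    print(robust_value_iteration(rewards, ambiguity, gamma=0.5))  # [2.0, 0.0]
```

Without rectangularity, the inner minimization would couple all state-action pairs, and this per-pair update would no longer be valid.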

arXiv:2403.11210 [pdf, ps, other]

Error bounds for rank-one DNN reformulation of QAP and DC exact penalty approach

Authors: Yitian Qian, Shaohua Pan, Shujun Bi, Houduo Qi

Abstract: This paper concerns the quadratic assignment problem (QAP), a class of challenging combinatorial optimization problems. We provide an equivalent rank-one doubly nonnegative (DNN) reformulation with fewer equality constraints, and derive the local error bounds for its feasible set. By leveraging these error bounds, we prove that the penalty problem induced by the difference of convexity (DC) reformulation of the rank-one constraint is a global exact penalty, and so is the penalty problem for its Burer-Monteiro (BM) factorization. As a byproduct, we verify that the penalty problem for the rank-one DNN reformulation proposed in [Jiang21] is a global exact penalty without the calmness assumption. Then, we develop a continuous relaxation approach by seeking approximate stationary points of a finite number of penalty problems for the BM factorization with an augmented Lagrangian method, whose asymptotic convergence certificate is also provided under a mild condition. Numerical comparison with Gurobi for 131 benchmark instances validates the efficiency of the proposed DC exact penalty approach.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.09632 [pdf, other]

Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image

Authors: Yiqun Mei, Yu Zeng, He Zhang, Zhixin Shu, Xuaner Zhang, Sai Bi, Jianming Zhang, HyunJoon Jung, Vishal M. Patel

Abstract: At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to reconstruct geometry and appearance from an input portrait as a set of 3D-aware features. We design a relighting module conditioned on a given lighting to process these features, and predict a relit 3D representation in the form of a tri-plane, which can render to an arbitrary viewpoint through volume rendering. Besides viewpoint and lighting control, Holo-Relighting also takes the head pose as a condition to enable head-pose-dependent lighting effects. With these novel designs, Holo-Relighting can generate complex non-Lambertian lighting effects (e.g., specular highlights and cast shadows) without using any explicit physical lighting priors. We train Holo-Relighting with data captured with a light stage, and propose two data-rendering techniques to improve the data quality for training the volumetric relighting system. Through quantitative and qualitative experiments, we demonstrate that Holo-Relighting can achieve state-of-the-art relighting quality with better photorealism, 3D consistency and controllability.

Submitted 14 March, 2024; originally announced March 2024.

Comments: CVPR 2024

arXiv:2403.07571 [pdf, other]

doi 10.1145/3589335.3651548

Proactive Recommendation with Iterative Preference Guidance

Authors: Shuxian Bi, Wenjie Wang, Hang Pan, Fuli Feng, Xiangnan He

Abstract: Recommender systems mainly tailor personalized recommendations according to user interests learned from user feedback. However, such recommender systems passively cater to user interests and even reinforce existing interests in the feedback loop, leading to problems like filter bubbles and opinion polarization. To counteract this, proactive recommendation actively steers users towards developing new interests in a target item or topic by strategically modulating recommendation sequences. Existing work for proactive recommendation faces significant hurdles: 1) overlooking the user feedback in the guidance process; 2) lacking explicit modeling of the guiding objective; and 3) insufficient flexibility for integration into existing industrial recommender systems. To address these issues, we introduce an Iterative Preference Guidance (IPG) framework. IPG performs proactive recommendation in a flexible post-processing manner by ranking items according to their IPG scores that consider both interaction probability and guiding value. These scores are explicitly estimated with iteratively updated user representation that considers the most recent user interactions. Extensive experiments validate that IPG can effectively guide user interests toward target interests with a reasonable trade-off in recommender accuracy. The code is available at https://github.com/GabyUSTC/IPG-Rec.

Submitted 12 March, 2024; originally announced March 2024.

Comments: Accepted by WWW 2024 (Short)

arXiv:2402.14035 [pdf, other]

Wisdom of Committee: Distilling from Foundation Model to Specialized Application Model

Authors: Zichang Liu, Qingyun Liu, Yuening Li, Liang Liu, Anshumali Shrivastava, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao

Abstract: Recent advancements in foundation models have yielded impressive performance across a wide range of tasks. Meanwhile, for specific applications, practitioners have been developing specialized application models. To enjoy the benefits of both kinds of models, one natural path is to transfer the knowledge in foundation models into specialized application models, which are generally more efficient for serving. Techniques from knowledge distillation may be applied here, where the application model learns to mimic the foundation model. However, specialized application models and foundation models have substantial gaps in capacity, employing distinct architectures, using different input features from different modalities, and being optimized on different distributions. These differences in model characteristics lead to significant challenges for distillation methods. In this work, we propose creating a teaching committee comprising both foundation model teachers and complementary teachers. Complementary teachers possess model characteristics akin to the student's, aiming to bridge the gap between the foundation model and specialized application models for a smoother knowledge transfer. Further, to accommodate the dissimilarity among the teachers in the committee, we introduce DiverseDistill, which allows the student to understand the expertise of each teacher and extract task knowledge. Our evaluations demonstrate that adding complementary teachers enhances student performance. Finally, DiverseDistill consistently outperforms baseline distillation methods, regardless of the teacher choices, resulting in significantly improved student performance.

Submitted 15 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.04644 [pdf, other]

LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views

Authors: Yuji Roh, Qingyun Liu, Huan Gui, Zhe Yuan, Yujin Tang, Steven Euijong Whang, Liang Liu, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao

Abstract: Fine-tuning is becoming widely used for leveraging the power of pre-trained foundation models in new downstream tasks. While there are many successes of fine-tuning on various tasks, recent studies have observed challenges in the generalization of fine-tuned models to unseen distributions (i.e., out-of-distribution; OOD). To improve OOD generalization, some previous studies identify the limitations of fine-tuning data and regulate fine-tuning to preserve the general representation learned from pre-training data. However, potential limitations in the pre-training data and models are often ignored. In this paper, we contend that overly relying on the pre-trained representation may hinder fine-tuning from learning essential representations for downstream tasks and thus hurt its OOD generalization. It can be especially catastrophic when new tasks are from different (sub)domains compared to pre-training data. To address the issues in both pre-training and fine-tuning data, we propose a novel generalizable fine-tuning method LEVI (Layer-wise Ensemble of different VIews), where the pre-trained model is adaptively ensembled layer-wise with a small task-specific model, while preserving its efficiencies. By combining two complementing models, LEVI effectively suppresses problematic features in both the fine-tuning data and pre-trained model and preserves useful features for new tasks. Broad experiments with large language and vision models show that LEVI greatly improves fine-tuning generalization via emphasizing different views from fine-tuning data and pre-trained features.

Submitted 18 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

arXiv:2402.04448 [pdf, other]

Failure Analysis in Next-Generation Critical Cellular Communication Infrastructures

Authors: Siguo Bi, Xin Yuan, Shuyan Hu, Kai Li, Wei Ni, Ekram Hossain, Xin Wang

Abstract: The advent of communication technologies marks a transformative phase in critical infrastructure construction, where the meticulous analysis of failures becomes paramount in achieving the fundamental objectives of continuity, security, and availability. This survey enriches the discourse on failures, failure analysis, and countermeasures in the context of the next-generation critical communication infrastructures. Through an exhaustive examination of existing literature, we discern and categorize prominent research orientations, namely resource depletion, security vulnerabilities, and system availability concerns. We also analyze constructive countermeasures tailored to address identified failure scenarios and their prevention. Furthermore, the survey emphasizes the imperative for standardization in addressing failures related to Artificial Intelligence (AI) within the ambit of the sixth-generation (6G) networks, accounting for the forward-looking perspective for the envisioned intelligence of 6G network architecture. By identifying new challenges and delineating future research directions, this survey can help guide stakeholders toward unexplored territories, fostering innovation and resilience in critical communication infrastructure development and failure prevention.

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.17669 [pdf, other]

Compression before Fusion: Broadcast Semantic Communication System for Heterogeneous Tasks

Authors: Mingze Gong, Shuoyao Wang, Fangwei Ye, Suzhi Bi

Abstract: Semantic communication has emerged as a new paradigm shift in 6G from conventional syntax-oriented communications. Recently, wireless broadcast technology has been introduced to support semantic communication systems toward higher communication efficiency. Nevertheless, existing broadcast semantic communication systems target a general representation within one stage and fail to balance the inference accuracy among users. In this paper, the broadcast encoding process is decomposed into compression and fusion to improve communication efficiency with adaptation to tasks and channels. Particularly, we propose multiple task-channel-aware sub-encoders (TCE) and a channel-aware feature fusion sub-encoder (CFE) towards compression and fusion, respectively. In TCEs, multiple local-channel-aware attention blocks are employed to extract and compress task-relevant information for each user. In CFE, we introduce a global-channel-aware fine-tuning block to merge these compressed task-relevant signals into a compact broadcast signal. Notably, we retrieve the bottleneck in DeepBroadcast and leverage information bottleneck theory to further optimize the parameter tuning of TCEs and CFE. We substantiate our approach through experiments on a range of heterogeneous tasks across various channels, including additive white Gaussian noise (AWGN), Rayleigh fading, and Rician fading channels. Simulation results show that the proposed DeepBroadcast outperforms the state-of-the-art methods.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.15438 [pdf, other]

Relations of rotation and chromospheric activity to stellar age for FGK dwarfs from Kepler and LAMOST

Authors: Lifei Ye, Shaolan Bi, Jinghua Zhang, Tiancheng Sun, Liu Long, Zhishuai Ge, Tanda Li, Xianfei Zhang, Xunzhou Chen, Yaguang Li, Jianzhao Zhou, Maosheng Xiang

Abstract: The empirical relations between rotation period, chromospheric activity, and age can be used to estimate stellar age. To calibrate these relations, we present a catalog, including the masses and ages of 52,321 FGK dwarfs, 47,489 chromospheric activity indices $\log R^{+}_{HK}$, and 6,077 rotation periods $P_{rot}$ and variability amplitudes $S_{ph}$, based on data from LAMOST DR7, Kepler and Gaia DR3. We find a pronounced correlation among $P_{rot}$, age, and [Fe/H] throughout the main-sequence phase for F dwarfs. However, the decrease of $\log R^{+}_{HK}$ over time is not significant except for those with [Fe/H] $<$ $-$0.1. For G dwarfs, both $P_{rot}$ and $\log R^{+}_{HK}$ are reliable age probes in the ranges $\sim$ 2-11 Gyr and $\sim$ 2-13 Gyr, respectively. K dwarfs exhibit a prominent decrease in $\log R^{+}_{HK}$ within the age range of $\sim$ 3-13 Gyr when the relation of $P_{rot}-τ$ is invalid. These relations are very important for promptly estimating the age of a vast number of stars, thus serving as a powerful tool in advancing the fields of exoplanet properties, stellar evolution, and Galactic archaeology.

Submitted 27 January, 2024; originally announced January 2024.

arXiv:2401.14640 [pdf, other]

Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs

Authors: Nan Hu, Jiaoyan Chen, Yike Wu, Guilin Qi, Sheng Bi, Tongtong Wu, Jeff Z. Pan

Abstract: Attribution in question answering aims to provide citations that support generated statements, and has attracted wide research attention. The current methods for automatically evaluating the attribution, which are often based on Large Language Models (LLMs), are still inadequate, particularly in recognizing subtle differences between attributions, and complex relationships between citations and statements. To compare these attribution evaluation methods and develop new ones, we introduce a set of fine-grained categories (i.e., supportive, insufficient, contradictory and irrelevant) for measuring the attribution, and develop a Complex Attributed Question Answering (CAQA) benchmark by leveraging knowledge graphs (KGs) for automatically generating attributions of different categories to question-answer pairs. Our analysis reveals that existing evaluators perform poorly under fine-grained attribution settings and exhibit weaknesses in complex citation-statement reasoning. Our CAQA benchmark, validated with human annotations, emerges as a promising tool for selecting and developing LLM attribution evaluators.

Submitted 25 January, 2024; originally announced January 2024.

Comments: 13 pages, 5 figures

arXiv:2401.11696 [pdf, ps, other]

doi 10.1063/5.0178075

Self-interaction corrected SCAN functional for molecules and solids in the numeric atom-center orbital framework

Authors: Sheng Bi, Christian Carbogno, Igor Ying Zhang, Matthias Scheffler

Abstract: Semilocal density-functional approximations (DFAs), including the state-of-the-art SCAN functional, are plagued by the self-interaction error (SIE). While this error is explicitly defined only for one-electron systems, it has inspired the self-interaction correction method proposed by Perdew and Zunger (PZ-SIC), which has shown promise in mitigating the many-electron SIE. However, the PZ-SIC method is known for its significant numerical instability. In this study, we introduce a novel constraint that facilitates self-consistent localization of the SIC orbitals in the spirit of Edmiston-Ruedenberg orbitals [Rev. Mod. Phys. 35, 457 (1963)]. Our practical implementation within the all-electron numeric atom-centered orbitals code FHI-aims guarantees efficient and stable convergence of the self-consistent PZ-SIC equations for both molecules and solids. We further demonstrate that our PZ-SIC approach effectively mitigates the SIE in the meta-GGA SCAN functional, significantly improving the accuracy for ionization potentials, charge-transfer energies, and band gaps for a diverse selection of molecules and solids. However, our PZ-SIC method does have its limitations. It cannot improve the already accurate SCAN results for properties such as cohesive energies, lattice constants, and bulk moduli in our test sets. This highlights the need for new-generation DFAs with more comprehensive applicability.

Submitted 22 January, 2024; originally announced January 2024.

Journal ref: J. Chem. Phys. 160, 034106 (2024)

arXiv:2401.11134 [pdf, other]

Detection of Solar-like Oscillations in Sub-giant and Red Giant Stars Using 2-minute Cadence TESS Data

Authors: Jianzhao Zhou, Shaolan Bi, Jie Yu, Yaguang Li, Xianfei Zhang, Tanda Li, Liu Long, Mengjie Li, Tiancheng Sun, Lifei Ye

Abstract: Based on all 2-minute cadence $TESS$ light curves from Sector 1 to 60, we provide a catalog of 8,651 solar-like oscillators, including frequency at maximum power ($ν_{\rm max}$, with its median precision, $σ$=5.39\%), large frequency separation ($Δν$, $σ$=6.22\%), seismically derived masses, radii, and surface gravity. In this sample, we have detected 2,173 new oscillators and added 4,373 new $Δν$ measurements. Our seismic parameters are consistent with those from $Kepler$, $K2$, and previous $TESS$ data. The median fractional residual in $ν_{\rm max}$ is $1.63\%$ with a scatter of $14.75\%$, and in $Δν$ it is $0.11\%$ with a scatter of $10.76\%$. We have detected 476 solar-like oscillators with $ν_{\rm max}$ exceeding the $Nyquist$ frequency of $Kepler$ long-cadence data during the evolutionary phases of sub-giant and the base of the red-giant branch, which provide a valuable resource for understanding angular momentum transport.

Submitted 20 January, 2024; originally announced January 2024.

arXiv:2401.04379 [pdf, ps, other]

doi 10.1063/5.0174040

Improving XYG3-type Doubly Hybrid Approximation using Self-Interaction Corrected SCAN Density and Orbitals via the PZ-SIC Framework: the xDH@SCAN(SIC) Approach

Authors: Sheng Bi, Shirong Wang, Igor Ying Zhang, Xin Xu

Abstract: XYG3-type doubly hybrid approximations (xDH) have gained widespread recognition for their accuracy in describing a diverse range of chemical and physical interactions. However, a recent study (J. Phys. Chem. 2021, 12, 800-807) has highlighted the limitation of xDH methods in calculating the dissociation of the NaCl molecule. This issue has been related to the density and orbitals used for evaluating the energy in xDH methods, which are obtained from lower-rung hybrid density functional approximations (DFAs) and display substantial density errors in the dissociation limit. In this work, we systematically investigate the influence of density on several challenging datasets and find that the xDH methods are less sensitive to the density errors compared to semi-local and hybrid DFAs. Furthermore, we demonstrate that the self-interaction corrected SCAN density offers superior accuracy compared to the self-consistent SCAN density and Hartree-Fock (HF) density, as evidenced by the charge analysis on the dissociation of heterodimers, such as NaCl and LiF. Building on these insights, we propose a 5-parameter xDH method using the SCAN density and orbitals corrected by the PZ-SIC scheme. This new xDH@SCAN(SIC) method provides a balanced and accurate description across a wide range of challenging systems.

Submitted 9 January, 2024; originally announced January 2024.

Journal ref: J. Chem. Phys. 159, 234103 (2023)

arXiv:2312.13980 [pdf, other]

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

Authors: Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman

Abstract: Multi-view diffusion models, obtained by applying Supervised Finetuning (SFT) to text-to-image diffusion models, have driven recent breakthroughs in text-to-3D research. However, due to the limited size and quality of existing 3D datasets, they still suffer from multi-view inconsistencies and Neural Radiance Field (NeRF) reconstruction artifacts. We argue that multi-view diffusion models can benefit from further Reinforcement Learning Finetuning (RLFT), which allows models to learn from the data generated by themselves and improve beyond their dataset limitations during SFT. To this end, we introduce Carve3D, an improved RLFT algorithm coupled with a novel Multi-view Reconstruction Consistency (MRC) metric, to enhance the consistency of multi-view diffusion models. To measure the MRC metric on a set of multi-view images, we compare them with their corresponding NeRF renderings at the same camera viewpoints. The resulting model, which we denote as Carve3DM, demonstrates better multi-view consistency and NeRF reconstruction quality than existing models. Our results suggest that pairing SFT with Carve3D's RLFT is essential for developing multi-view-consistent diffusion models, mirroring the standard Large Language Model (LLM) alignment pipeline. Our code, training and testing data, and video results are available at: https://desaixie.github.io/carve-3d.

Submitted 9 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 22 pages, 16 figures. Our code, training and testing data, and video results are available at: https://desaixie.github.io/carve-3d. This paper has been accepted to CVPR 2024. v2: incorporated changes from the CVPR 2024 camera-ready version

arXiv:2312.03787 [pdf, other]

Detection and Mitigation of Position Spoofing Attacks on Cooperative UAV Swarm Formations

Authors: Siguo Bi, Kai Li, Shuyan Hu, Wei Ni, Cong Wang, Xin Wang

Abstract: Detecting spoofing attacks on the positions of unmanned aerial vehicles (UAVs) within a swarm is challenging. Traditional methods relying solely on individually reported positions and pairwise distance measurements are ineffective in identifying the misbehavior of malicious UAVs. This paper presents a novel systematic structure designed to detect and mitigate spoofing attacks in UAV swarms. We formulate the problem of detecting malicious UAVs as a localization feasibility problem, leveraging the reported positions and distance measurements. To address this problem, we develop a semidefinite relaxation (SDR) approach, which reformulates the non-convex localization problem into a convex and tractable semidefinite program (SDP). Additionally, we propose two innovative algorithms that leverage the proximity of neighboring UAVs to identify malicious UAVs effectively. Simulations demonstrate the superior performance of our proposed approaches compared to existing benchmarks. Our methods exhibit robustness across various swarm networks, showcasing their effectiveness in detecting and mitigating spoofing attacks. Specifically, the detection success rate is improved by up to 65%, 55%, and 51% against distributed, collusion, and mixed attacks, respectively, compared to the benchmarks.

Submitted 6 December, 2023; originally announced December 2023.

Comments: accepted by IEEE TIFS in Dec. 2023

arXiv:2312.01388 [pdf, other]

Two long-period giant planets around two giant stars: HD 112570 and HD 154391

Authors: Guang-Yao Xiao, Huan-Yu Teng, Jianzhao Zhou, Bun'ei Sato, Yu-Juan Liu, Shaolan Bi, Takuya Takarada, Masayuki Kuzuhara, Marc Hon, Liang Wang, Masashi Omiya, Hiroki Harakawa, Fei Zhao, Gang Zhao, Eiji Kambe, Hideyuki Izumiura, Hiroyasu Ando, Kunio Noguchi, Wei Wang, Meng Zhai, Nan Song, Chengqun Yang, Tanda Li, Timothy D. Brandt, Michitoshi Yoshida, et al. (2 additional authors not shown)

Abstract: We present the discoveries of two giant planets orbiting the red giant branch (RGB) star HD 112570 and the red clump (RC) star HD 154391, based on the radial velocity (RV) measurements from Xinglong station and Okayama Astrophysical Observatory (OAO). Spectroscopic and asteroseismic analyses suggest that HD 112570 has a mass of $1.15\pm0.12\,M_{\odot}$, a radius of $9.85\pm0.23\,R_{\odot}$, a metallicity [Fe/H] of $-0.46\pm0.1$ and a ${\rm log}\,g$ of $2.47\pm0.1$. With the joint analysis of RV and Hipparcos-Gaia astrometry, we obtain a dynamical mass of $M_{\rm p}={3.42}_{-0.84}^{+1.4}\ M_{\rm Jup}$, a period of $P={2615}_{-77}^{+85}$ days and a moderate eccentricity of $e={0.20}_{-0.14}^{+0.16}$ for the Jovian planet HD 112570 b. HD 154391 has a mass of $2.07\pm0.03\,M_{\odot}$, a radius of $8.56\pm0.05\,R_{\odot}$, a metallicity [Fe/H] of $0.07\pm0.1$ and a ${\rm log}\,g$ of $2.86\pm0.1$. The super-Jupiter HD 154391 b has a mass of $M_{\rm p}={9.1}_{-1.9}^{+2.8}\ M_{\rm Jup}$, a period of $P={5163}_{-57}^{+60}$ days and an eccentricity of $e={0.20}_{-0.04}^{+0.04}$. We found that HD 154391 b has one of the longest orbital periods among those ever discovered orbiting evolved stars, which may provide a valuable case in our understanding of planetary formation at wider orbits. Moreover, while a mass gap at $4\,M_{\rm Jup}$ seems to be present in the population of giant stars, there appears to be no significant difference in the distribution of metallicity among giant planets with masses above or below this threshold. Finally, the origin of the abnormal accumulation near 2 au for planets around large evolved stars ($R_{\star}>21\,R_{\odot}$) remains unclear.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 26 pages, 18 figures, 8 tables, Accepted for publication in The Astronomical Journal

arXiv:2311.12024 [pdf, other]

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

Authors: Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

Abstract: We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU. PF-LRM is a highly scalable method utilizing the self-attention blocks to exchange information between 3D object tokens and 2D image tokens; we predict a coarse point cloud for each view, and then use a differentiable Perspective-n-Point (PnP) solver to obtain camera poses. When trained on a huge amount of multi-view posed data of ~1M objects, PF-LRM shows strong cross-dataset generalization ability, and outperforms baseline methods by a large margin in terms of pose prediction accuracy and 3D reconstruction quality on various unseen evaluation datasets. We also demonstrate our model's applicability in downstream text/image-to-3D task with fast feed-forward inference. Our project website is at: https://totoro97.github.io/pf-lrm .

Submitted 23 November, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: Project website: https://totoro97.github.io/pf-lrm ; add more experiments

arXiv:2311.09639 [pdf, other]

On the Quantification of Image Reconstruction Uncertainty without Training Data

Authors: Sirui Bi, Victor Fung, Jiaxin Zhang

Abstract: Computational imaging plays a pivotal role in determining hidden information from sparse measurements. A robust inverse solver is crucial to fully characterize the uncertainty induced by these measurements, as it allows for the estimation of the complete posterior of unrecoverable targets. This, in turn, facilitates a probabilistic interpretation of observational data for decision-making. In this study, we propose a deep variational framework that leverages a deep generative model to learn an approximate posterior distribution to effectively quantify image reconstruction uncertainty without the need for training data. We parameterize the target posterior using a flow-based model and minimize their Kullback-Leibler (KL) divergence to achieve accurate uncertainty estimation. To bolster stability, we introduce a robust flow-based model with bi-directional regularization and enhance expressivity through gradient boosting. Additionally, we incorporate a space-filling design to achieve substantial variance reduction on both latent prior space and target posterior space. We validate our method on several benchmark tasks and two real-world applications, namely fastMRI and black hole image reconstruction. Our results indicate that our method provides reliable and high-quality image reconstruction with robust uncertainty estimation.

Submitted 16 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

arXiv:2311.09217 [pdf, other]

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Authors: Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang

Abstract: We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering, achieving single-stage 3D generation in $\sim$30s on a single A100 GPU. We train DMV3D on large-scale multi-view image datasets of highly diverse objects using only image reconstruction losses, without accessing 3D assets. We demonstrate state-of-the-art results for the single-image reconstruction problem where probabilistic modeling of unseen object parts is required for generating diverse reconstructions with sharp textures. We also show high-quality text-to-3D generation results outperforming previous 3D diffusion models. Our project website is at: https://justimyhxu.github.io/projects/dmv3d/ .

Submitted 15 November, 2023; originally announced November 2023.

Comments: Project Page: https://justimyhxu.github.io/projects/dmv3d/

arXiv:2311.07812 [pdf, other]

The formation of blue large-amplitude pulsators from white-dwarf main-sequence star mergers

Authors: Xianfei Zhang, C. Simon Jeffery, Jie Su, Shaolan Bi

Abstract: Blue large-amplitude pulsators (BLAPs) are hot low-mass stars which show large-amplitude light variations likely due to radial oscillations driven by iron-group opacities. Period changes provide evidence of both secular contraction and expansion amongst the class. Various formation histories have been proposed, but none are completely satisfactory. \citet{Zhang2017} proposed that the merger of a helium core white dwarf with a low-mass main-sequence star (HeWD+MS) can lead to the formation of some classes of hot subdwarf. We have analyzed these HeWD+MS merger models in more detail. Between helium-shell ignition and full helium-core burning, the models pass through the volume of luminosity-gravity-temperature space occupied by BLAPs. Periods of expansion and contraction associated with helium-shell flashes can account for the observed rates of period change. We argue that the HeWD+MS merger model provides at least one BLAP formation channel.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 13 pages, 8 figures, accepted by ApJ

arXiv:2311.06214 [pdf, other]

Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model

Authors: Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi

Abstract: Text-to-3D with diffusion models has achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization, which suffers from slow inference, low diversity and Janus problems, or are feed-forward methods that generate low-quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. We adopt a two-stage paradigm, which first generates a sparse set of four structured and consistent views from text in one shot with a fine-tuned 2D text-to-image diffusion model, and then directly regresses the NeRF from the generated images with a novel transformer-based sparse-view reconstructor. Through extensive experiments, we demonstrate that our method can generate diverse 3D assets of high visual quality within 20 seconds, which is two orders of magnitude faster than previous optimization-based methods that can take 1 to 10 hours. Our project webpage: https://jiahao.ai/instant3d/.

Submitted 23 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: Project webpage: https://jiahao.ai/instant3d/

arXiv:2311.05815 [pdf, other]

Imprints of Sagittarius accretion event: Young O-rich stars and discontinuous chemical evolution in Milky Way disc

Authors: Tiancheng Sun, Shaolan Bi, Xunzhou Chen, Yuqin Chen, Chao Liu, Xianfei Zhang, Tanda Li, Yaguang Li, Yaqian Wu, Zhishuai Ge, Lifei Ye

Abstract: The Milky Way has undergone significant transformations in its early history, characterised by violent mergers and the accretion of satellite galaxies. Among these events, the infall of the satellite galaxy Gaia-Enceladus/Sausage is recognised as the last major merger event, fundamentally altering the evolution of the Milky Way and shaping its chemo-dynamical structure. However, recent observational evidence suggests that the Milky Way has undergone notable events of star formation in the past 4 Gyr, which are thought to be triggered by the perturbations from the Sagittarius dwarf galaxy (Sgr). Here we report chemical signatures of the Sgr accretion event in the past 4 Gyr, using the [Fe/H] and [O/Fe] ratios in the thin disc, which is reported for the first time. It reveals that the previously discovered V-shape structure of the age-[Fe/H] relation varies across different Galactic locations and has rich substructures. Interestingly, we discover a discontinuous structure at z$_{\rm max}$ $<$ 0.3 kpc, interrupted by a recent burst of star formation from 4 Gyr to 2 Gyr ago. In this episode, we find a significant rise in oxygen abundance leading to a distinct [O/Fe] gradient, contributing to the formation of young O-rich stars. Combined with the simulated star formation history and chemical abundance of Sgr, we suggest that the Sgr is an important actor in the discontinuous chemical evolution of the Milky Way disc.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 17 pages, 15 figures. Under review at Nature Communications

arXiv:2311.04400 [pdf, other]

LRM: Large Reconstruction Model for Single Image to 3D

Authors: Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan

Abstract: We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects, including both synthetic renderings from Objaverse and real captures from MVImgNet. This combination of a high-capacity model and large-scale training data empowers our model to be highly generalizable and produce high-quality 3D reconstructions from various testing inputs, including real-world in-the-wild captures and images created by generative models. Video demos and interactable 3D meshes can be found on our LRM project webpage: https://yiconghong.me/LRM.

Submitted 9 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: ICLR 2024

arXiv:2309.11206 [pdf, other]

Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering

Authors: Yike Wu, Nan Hu, Sheng Bi, Guilin Qi, Jie Ren, Anhuan Xie, Wei Song

Abstract: Despite their competitive performance on knowledge-intensive tasks, large language models (LLMs) still have limitations in memorizing all world knowledge, especially long-tail knowledge. In this paper, we study the KG-augmented language model approach for solving the knowledge graph question answering (KGQA) task that requires rich world knowledge. Existing work has shown that retrieving KG knowledge to enhance LLM prompting can significantly improve LLM performance in KGQA. However, these approaches lack a well-formed verbalization of KG knowledge, i.e., they ignore the gap between KG representations and textual representations. To this end, we propose an answer-sensitive KG-to-Text approach that can transform KG knowledge into well-textualized statements most informative for KGQA. Based on this approach, we propose a KG-to-Text enhanced LLMs framework for solving the KGQA task. Experiments on several KGQA benchmarks show that the proposed KG-to-Text augmented LLMs approach outperforms previous KG-augmented LLMs approaches regarding answer accuracy and usefulness of knowledge statements.

Submitted 21 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.11009 [pdf, other]

Controllable Dynamic Appearance for Neural 3D Portraits

Authors: ShahRukh Athar, Zhixin Shu, Zexiang Xu, Fujun Luan, Sai Bi, Kalyan Sunkavalli, Dimitris Samaras

Abstract: Recent advances in Neural Radiance Fields (NeRFs) have made it possible to reconstruct and reanimate dynamic portrait scenes with control over head-pose, facial expressions and viewing direction. However, training such models assumes photometric consistency over the deformed region e.g. the face must be evenly lit as it deforms with changing head-pose and facial expression. Such photometric consistency across frames of a video is hard to maintain, even in studio environments, thus making the created reanimatable neural portraits prone to artifacts during reanimation. In this work, we propose CoDyNeRF, a system that enables the creation of fully controllable 3D portraits in real-world capture conditions. CoDyNeRF learns to approximate illumination dependent effects via a dynamic appearance model in the canonical space that is conditioned on predicted surface normals and the facial expressions and head-pose deformations. The surface normals prediction is guided using 3DMM normals that act as a coarse prior for the normals of the human head, where direct prediction of normals is hard due to rigid and non-rigid deformations induced by head-pose and facial expression changes. Using only a smartphone-captured short video of a subject for training, we demonstrate the effectiveness of our method on free view synthesis of a portrait scene with explicit head pose and expression controls, and realistic lighting effects. The project page can be found here: http://shahrukhathar.github.io/2023/08/22/CoDyNeRF.html

Submitted 21 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

arXiv:2308.12662 [pdf, ps, other]

Capacity Analysis and Throughput Maximization of NOMA with Nonlinear Power Amplifier Distortion

Authors: Xiaojia Wang, Suzhi Bi, Xian Li, Xiaohui Lin, Zhi Quan, Ying-Jun Angela Zhang

Abstract: In future B5G/6G broadband communication systems, non-linear signal distortion caused by the impairment of the transmit power amplifier (PA) can severely degrade the communication performance, especially when uplink users share the wireless medium using non-orthogonal multiple access (NOMA) schemes. This is because the successive interference cancellation (SIC) decoding technique, used in NOMA, is incapable of eliminating the interference caused by PA distortion. Consequently, each user's decoding process suffers from the cumulative distortion noise of all uplink users. In this paper, we establish a new and tractable PA distortion signal model based on real-world measurements, where the distortion noise power is a polynomial function of PA transmit power, diverging from the oversimplified linear function commonly employed in existing studies. Applying the proposed signal model, we characterize the capacity rate region of multi-user uplink NOMA by optimizing the user transmit power. Our findings reveal a significant contraction in the capacity region of NOMA, attributable to polynomial distortion noise power. For practical engineering applications, we formulate a general weighted sum rate maximization (WSRMax) problem under individual user rate constraints. We further propose an efficient power control algorithm to attain the optimal performance. Numerical results show that the optimal power control policy under the proposed non-linear PA model achieves on average 13% higher throughput compared to the policies assuming an ideal linear PA model. Overall, our findings demonstrate the importance of accurate PA distortion modeling to the performance of NOMA and provide an efficient optimal power control method accordingly.

Submitted 24 August, 2023; originally announced August 2023.

Comments: The paper has been submitted for potential journal publications

arXiv:2308.12256 [pdf, other]

doi 10.1145/3604915.3610244

Learning from Negative User Feedback and Measuring Responsiveness for Sequential Recommenders

Authors: Yueqi Wang, Yoni Halpern, Shuo Chang, Jingchen Feng, Elaine Ya Le, Longfei Li, Xujian Liang, Min-Cheng Huang, Shane Li, Alex Beutel, Yaping Zhang, Shuchao Bi

Abstract: Sequential recommenders have been widely used in industry due to their strength in modeling user preferences. While these models excel at learning a user's positive interests, less attention has been paid to learning from negative user feedback. Negative user feedback is an important lever of user control, and comes with an expectation that recommenders should respond quickly and reduce similar recommendations to the user. However, negative feedback signals are often ignored in the training objective of sequential retrieval models, which primarily aim at predicting positive user interactions. In this work, we incorporate explicit and implicit negative user feedback into the training objective of sequential recommenders in the retrieval stage using a "not-to-recommend" loss function that optimizes for the log-likelihood of not recommending items with negative feedback. We demonstrate the effectiveness of this approach using live experiments on a large-scale industrial recommender system. Furthermore, we address a challenge in measuring recommender responsiveness to negative feedback by developing a counterfactual simulation framework to compare recommender responses between different user actions, showing improved responsiveness from the modeling change.

Submitted 23 August, 2023; originally announced August 2023.

Comments: RecSys 2023 Industry Track
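
Illustrative note: the "not-to-recommend" loss described in the abstract above can be sketched as follows. This is a minimal sketch, not the authors' implementation. It assumes the retrieval model scores candidates with a softmax over item logits, so that the loss for an item with negative feedback is -log(1 - p(item)). The function names `softmax` and `not_to_recommend_loss` and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of item logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def not_to_recommend_loss(logits, negative_item, eps=1e-12):
    """Negative log-likelihood of NOT recommending `negative_item`.

    Assumption: p(item) is the softmax probability of retrieving the item,
    so the loss is -log(1 - p(item)). `eps` guards against log(0).
    """
    p = softmax(logits)[negative_item]
    return -math.log(max(1.0 - p, eps))


# Hypothetical logits for three candidate items; item 0 received negative feedback.
loss_when_ranked_high = not_to_recommend_loss([2.0, 0.5, 0.1], negative_item=0)
loss_when_ranked_low = not_to_recommend_loss([-2.0, 0.5, 0.1], negative_item=0)
print(loss_when_ranked_high, loss_when_ranked_low)
```

Under these assumptions the loss is large while the disliked item still scores highly and shrinks toward zero as its logit drops, which is the behavior the abstract describes as reducing similar recommendations.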

arXiv:2307.06596 [pdf, other]

Investigating 16 Open Clusters in the Kepler/K2-Gaia DR3 field. I. Membership, Binary, and Rotation

Authors: Liu Long, Shanlao Bi, Jinhua Zhang, Xianfei Zhang, Liyun Zhang, Zhishuai Ge, Tanda Li, Xunzhou Chen, Yaguang Li, Lifei Ye, TianCheng Sun, Jianzhao Zhou

Abstract: Using data from the Gaia Data Release 3 (Gaia DR3) and Kepler/K2, we present a catalog of 16 open clusters with ages ranging from 4 to 4000 Myr, which provides detailed information on membership, binary systems, and rotation. We assess the memberships in 5D phase space, and estimate the basic parameters of each cluster. Among the 20,160 members, there are 4,381 stars identified as binary candidates and 49 stars as blue straggler stars. The fraction of binaries varies from cluster to cluster, ranging from 9% to 44%. We obtain the rotation periods of 5,467 members, of which 4,304 are determined in this work. To establish a benchmark for the rotation-age-color relation, we construct color-period diagrams. We find that the rotational features of binaries are similar to those of single stars, while features for binaries are more scattered in the rotation period. Moreover, the morphology of the color-period relationship is already established for Upper Scorpius at the age of 19 Myr, and some stars of varying spectral types (i.e. FG-, K-, and M-type) show different spin-down rates after the age of ~110 Myr. By incorporating the effects of stalled spin-down into our analysis, we develop an empirical rotation-age-color relation, which is valid for ages between 700 and 4000 Myr and colors corresponding to a range of 0.5 < (G_BP-G_RP)0 < 2.5 mag.

Submitted 13 July, 2023; originally announced July 2023.

Comments: 21 pages, 17 figures, accepted for publication in ApJS

arXiv:2307.06335 [pdf, other]

Neural Free-Viewpoint Relighting for Glossy Indirect Illumination

Authors: Nithin Raghavan, Yan Xiao, Kai-En Lin, Tiancheng Sun, Sai Bi, Zexiang Xu, Tzu-Mao Li, Ravi Ramamoorthi

Abstract: Precomputed Radiance Transfer (PRT) remains an attractive solution for real-time rendering of complex light transport effects such as glossy global illumination. After precomputation, we can relight the scene with new environment maps while changing viewpoint in real-time. However, practical PRT methods are usually limited to low-frequency spherical harmonic lighting. All-frequency techniques using wavelets are promising but have so far had little practical impact. The curse of dimensionality and much higher data requirements have typically limited them to relighting with fixed view or only direct lighting with triple product integrals. In this paper, we demonstrate a hybrid neural-wavelet PRT solution to high-frequency indirect illumination, including glossy reflection, for relighting with changing view. Specifically, we seek to represent the light transport function in the Haar wavelet basis. For global illumination, we learn the wavelet transport using a small multi-layer perceptron (MLP) applied to a feature field as a function of spatial location and wavelet index, with reflected direction and material parameters being other MLP inputs. We optimize/learn the feature field (compactly represented by a tensor decomposition) and MLP parameters from multiple images of the scene under different lighting and viewing conditions. We demonstrate real-time (512 x 512 at 24 FPS, 800 x 600 at 13 FPS) precomputed rendering of challenging scenes involving view-dependent reflections and even caustics.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 13 pages, 9 figures, to appear in CGF proceedings of EGSR 2023

arXiv:2307.04086 [pdf, other]

Age of FGK Dwarfs Observed with LAMOST and GALAH: Considering the Oxygen Enhancement

Authors: Tiancheng Sun, Zhishuai Ge, Xunzhou Chen, Shaolan Bi, Tanda Li, Xianfei Zhang, Yaguang Li, Yaqian Wu, Sarah A. Bird, Ferguson J. W., Jianzhao Zhou, Lifei Ye, Liu Long, Jinghua Zhang

Abstract: Varying oxygen abundance could impact the modeling-inferred ages. This work aims to estimate the ages of dwarfs considering observed oxygen abundance. To characterize 67,503 LAMOST and 4,006 GALAH FGK-type dwarf stars, we construct a grid of stellar models which take into account oxygen abundance as an independent model input. Compared with ages determined with commonly-used $α$-enhanced models, we find a difference of $\sim$9% on average when the observed oxygen abundance is considered. The age differences between the two types of models are correlated to [Fe/H] and [O/$α$], and they are relatively significant on stars with [Fe/H] $\lesssim$ -0.6 dex. Generally, varying 0.2 dex in [O/$α$] will alter the age estimates of metal-rich (-0.2 $<$ [Fe/H] $<$ 0.2) stars by $\sim$10%, and relatively metal-poor (-1 $<$ [Fe/H] $<$ -0.2) stars by $\sim$15%. Of the low-O stars with [Fe/H] $<$ 0.1 dex and [O/$α$] $\sim$ -0.2 dex, many have fractional age differences of $\geq$ 10%, and even reach up to 27%. The fractional age difference of high-O stars with [O/$α$] $\sim$ 0.4 dex reaches up to -33% to -42% at [Fe/H] $\lesssim$ -0.6 dex. We also analyze the chemical properties of these stars. We find a decreasing trend of [Fe/H] with age from 7.5-9 Gyr to 5-6.5 Gyr for the stars from the LAMOST and GALAH. The [O/Fe] of these stars increases with decreasing age from 7.5-9 Gyr to 3-4 Gyr, indicating that the younger population is more O-rich.

Submitted 8 July, 2023; originally announced July 2023.

Comments: 18 pages, 12 figures, accepted for publication in ApJS

arXiv:2307.03397 [pdf, other]

Asteroseismic Modeling of 1,153 Kepler Red Giant Branch Stars: Improved Stellar Parameters with Gravity-Mode Period Spacings and Luminosity Constraints

Authors: Yingxiang Wang, Tanda Li, Shaolan Bi, Timothy R. Bedding, Yaguang Li

Abstract: This paper reports estimated stellar parameters of 1,153 Kepler red giant branch stars determined with asteroseismic modeling. We use radial-mode oscillation frequencies, gravity-mode period spacings, Gaia luminosities, and spectroscopic data to characterize these stars. Compared with previous studies, we find that the two additional observed constraints, i.e., the gravity-mode period spacing and luminosity, significantly improve the precision of fundamental stellar parameters. The typical uncertainties are 2.9% for the mass, 11% for the age, 1.0% for the radius, 0.0039 dex for the surface gravity, and 0.5% for the helium core mass, making this the best-characterized large sample of red-giant stars available to date. With better characterizations for these red giants, we recalibrate the seismic scaling relations and study the surface term on the red-giant branch. We confirm that the surface term depends on the surface gravity and effective temperature, but there is no significant correlation with metallicity.

Submitted 7 July, 2023; originally announced July 2023.

Comments: Accepted by ApJ

arXiv:2306.05171 [pdf]

Robot Task Planning Based on Large Language Model Representing Knowledge with Directed Graph Structures

Authors: Yue Zhen, Sheng Bi, Lu Xing-tong, Pan Wei-qin, Shi Hai-peng, Chen Zi-rui, Fang Yi-shu

Abstract: Traditional robot task planning methods face challenges when dealing with highly unstructured environments and complex tasks. We propose a task planning method that combines human expertise with an LLM and have designed an LLM prompt template, Think_Net_Prompt, with stronger expressive power to represent structured professional knowledge. We further propose a method to progressively decompose tasks and generate a task tree to reduce the planning volume for each task, and we have designed a strategy to decouple robot task planning. By dividing different planning entities and separating the task from the actual machine binding process, the task planning process becomes more flexible. Research results show that our method performs well in handling specified code formats, understanding the relationship between tasks and subtasks, and extracting parameters from text descriptions. However, there are also problems such as limited complexity of task logic handling, ambiguity in the quantity of parts and the precise location of assembly. Improving the precision of task description and cognitive structure can bring certain improvements. https://github.com/NOMIzy/Think_Net_Prompt

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2306.03669 [pdf, ps, other]

Joint 3D Deployment and Resource Allocation for UAV-assisted Integrated Communication and Localization

Authors: Suzhi Bi, Jiaxing Yu, Zheyuan Yang, Xiaohui Lin, Yuan Wu

Abstract: In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted integrated communication and localization network in emergency scenarios where a single UAV is deployed as both an airborne base station (BS) and anchor node to assist ground BSs in communication and localization services. We formulate an optimization problem to maximize the sum communication rate of all users under localization accuracy constraints by jointly optimizing the 3D position of the UAV, and communication bandwidth and power allocation of the UAV and ground BSs. To address the intractable localization accuracy constraints, we introduce a new performance metric and geometrically characterize the UAV feasible deployment region in which the localization accuracy constraints are satisfied. Accordingly, we combine Gibbs sampling (GS) and block coordinate descent (BCD) techniques to tackle the non-convex joint optimization problem. Numerical results show that the proposed method attains almost identical rate performance as the meta-heuristic benchmark method while reducing the CPU time by 89.3%.

Submitted 6 June, 2023; originally announced June 2023.

Comments: The paper has been accepted for publication by IEEE Wireless Communications Letters

arXiv:2305.17134 [pdf, other]

NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support

Authors: Xinyue Wei, Fanbo Xiang, Sai Bi, Anpei Chen, Kalyan Sunkavalli, Zexiang Xu, Hao Su

Abstract: We present a method for generating high-quality watertight manifold meshes from multi-view input images. Existing volumetric rendering methods are robust in optimization but tend to generate noisy meshes with poor topology. Differentiable rasterization-based methods can generate high-quality meshes but are sensitive to initialization. Our method combines the benefits of both worlds; we take the geometry initialization obtained from neural volumetric fields, and further optimize the geometry as well as a compact neural texture representation with differentiable rasterizers. Through extensive experiments, we demonstrate that our method can generate accurate mesh reconstructions with faithful appearance that are comparable to previous volume rendering methods while being an order of magnitude faster in rendering. We also show that our generated mesh and neural texture reconstruction is compatible with existing graphics pipelines and enables downstream 3D applications such as simulation. Project page: https://sarahweiii.github.io/neumanifold/

Submitted 6 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: Project page: https://sarahweiii.github.io/neumanifold/

arXiv:2305.12986

Sparsity and Coefficient Permutation Based Two-Domain AMP for Image Block Compressed Sensing

Authors: Junhui Li, Xingsong Hou, Huake Wang, Shuhao Bi

Abstract: The learned denoising-based approximate message passing (LDAMP) algorithm has attracted great attention for image compressed sensing (CS) tasks. However, it has two issues: first, its global measurement model severely restricts its applicability to high-dimensional images, and its block-based measurement method exhibits obvious block artifacts; second, the denoiser in the LDAMP is too simple, and existing denoisers have limited ability in detail recovery. In this paper, to overcome the issues and develop a high-performance LDAMP method for image block compressed sensing (BCS), we propose a novel sparsity and coefficient permutation-based AMP (SCP-AMP) method consisting of the block-based sampling and the two-domain reconstruction modules. In the sampling module, SCP-AMP adopts a discrete cosine transform (DCT) based sparsity strategy to reduce the impact of the high-frequency coefficient on the reconstruction, followed by a coefficient permutation strategy to avoid block artifacts. In the reconstruction module, a two-domain AMP method with DCT domain noise correction and pixel domain denoising is proposed for iterative reconstruction. Regarding the denoiser, we propose a multi-level deep attention network (MDANet) to enhance the texture details by employing multi-level features and multiple attention mechanisms. Extensive experiments demonstrated that the proposed SCP-AMP method achieved better reconstruction accuracy than other state-of-the-art BCS algorithms in terms of both visual perception and objective metrics.

Submitted 17 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: The content has been substantially revised and corrected; the authors request withdrawal of this version

arXiv:2305.09138 [pdf, other]

doi 10.1093/mnras/stad1499

Characterising abundance-age relations of GALAH stars using oxygen-enhanced stellar models

Authors: Tiancheng Sun, Xunzhou Chen, Shaolan Bi, Zhishuai Ge, Maosheng Xiang, Yaqian Wu

Abstract: Main Sequence Turn-off stars (MSTO) and subgiant stars are good tracers of galactic populations. We present a study of 41,034 MSTO and subgiant stars from the GALAH survey. Using a grid of stellar models that accounts for the variation of O abundances, we determine their ages with a median age uncertainty of $\sim$9.4 per cent. Our analysis reveals that the ages of high-O stars based on O-enhanced models (OEM models) are smaller than those determined with $α$-enhanced models, resulting in a mean fractional age difference of -5.3 per cent at [O/$α$] = 0.2 and -11.0 per cent at [O/$α$] = 0.4. This age difference significantly impacts the age distribution of thick disc and halo stars, leading to a steeper downward trend in the [Fe/H]-age plane from 8 Gyr to 14 Gyr, indicating a shorter formation time-scale and a faster chemical-enhanced history for these populations. We confirm the V-shape of the normalized age-metallicity distribution $p$($τ$$\mid$[Fe/H]) of thin disc stars, which is presumably a consequence of the second gas infall. Additionally, we find that the halo stars in our sample can be divided into two sequences, a metal-rich sequence (Splash stars) and a metal-poor sequence (accreted stars), with the Splash stars predominantly older than 9 Gyr and the accreted halo stars older than 10 Gyr. Finally, we observe two distinct sequences in the relations between various chemical abundances and age for disc stars, namely a young sequence with ages $<$ $\sim$8 Gyr and an old sequence with ages $>$ $\sim$8 Gyr.

Submitted 15 May, 2023; originally announced May 2023.

Comments: 11 pages, 12 figures

arXiv:2305.07764 [pdf, other]

Long-Term Value of Exploration: Measurements, Findings and Algorithms

Authors: Yi Su, Xiangyu Wang, Elaine Ya Le, Liang Liu, Yuening Li, Haokai Lu, Benjamin Lipshitz, Sriraj Badam, Lukasz Heldt, Shuchao Bi, Ed Chi, Cristos Goodrow, Su-Lin Wu, Lexi Baugher, Minmin Chen

Abstract: Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. We here introduce new experiment designs to formally quantify the long-term value of exploration by examining its effects on content corpus, and connecting content corpus growth to the long-term user experience from real-world experiments. Having established the value of exploration, we investigate the Neural Linear Bandit algorithm as a general framework to introduce exploration into any deep learning-based ranking system. We conduct live experiments on one of the largest short-form video recommendation platforms, which serves billions of users, to validate the new experiment designs, quantify the long-term value of exploration, and verify the effectiveness of the adopted Neural Linear Bandit algorithm for exploration.

Submitted 25 February, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

Comments: 11 pages, WSDM 2024

Showing 1–50 of 226 results for author: Bi, S