-
Improving the trainability of VQE on NISQ computers for solving portfolio optimization using convex interpolation
Authors:
Shengbin Wang,
Guihui Li,
Zhaoyun Chen,
Peng Wang,
Menghan Dou,
Haiyong Zheng,
Zhimin Wang,
Yongjian Gu,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
Solving combinatorial optimization problems using variational quantum algorithms (VQAs) represents one of the most promising applications in the NISQ era. However, the limited trainability of VQAs could hinder their scalability to large problem sizes. In this paper, we improve the trainability of the variational quantum eigensolver (VQE) by utilizing convex interpolation to solve portfolio optimization. The idea is inspired by the observation that the Dicke state possesses an inherent clustering property. Consequently, a state with a larger Hamming distance from the ground state tends, in the overall distribution, to have a larger energy gap from the ground-state energy. Based on convex interpolation, the location of the ground state can be evaluated by learning the properties of a small subset of basis states in the Hilbert space. This naturally motivates the strategies of close-to-solution initialization, regular cost function landscape, and recursive ansatz equilibrium partition. The successful implementation of a $40$-qubit experiment using only $10$ superconducting qubits demonstrates the effectiveness of our proposals. Furthermore, the quantum inspiration has also spurred the development of a prototype greedy algorithm. Extensive numerical simulations indicate that the hybridization of VQE and greedy algorithms achieves a mutual complementarity, combining the advantages of both global and local optimization methods. Our proposals can be extended to improve the trainability for solving other large-scale combinatorial optimization problems that are widely used in real applications, paving the way to unleash quantum advantages of NISQ computers in the near future.
Submitted 7 July, 2024;
originally announced July 2024.
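As a rough illustration of the clustering intuition in the abstract above, the sketch below enumerates the computational basis states in the support of a Dicke state (bitstrings of fixed Hamming weight) and groups them by Hamming distance from a reference bitstring. The function names, the toy sizes (n = 4, k = 2), and the choice of reference string are assumptions made for illustration; none of this is taken from the paper's implementation.

```python
from collections import defaultdict
from itertools import combinations


def dicke_basis(n, k):
    """Return all n-bit strings with exactly k ones (support of a Dicke state)."""
    states = []
    for ones in combinations(range(n), k):
        bits = ["0"] * n
        for i in ones:
            bits[i] = "1"
        states.append("".join(bits))
    return states


def hamming(a, b):
    """Number of positions where two equal-length bitstrings differ."""
    return sum(x != y for x, y in zip(a, b))


def shells(reference, n, k):
    """Group the Dicke basis states by Hamming distance from `reference`."""
    groups = defaultdict(list)
    for state in dicke_basis(n, k):
        groups[hamming(reference, state)].append(state)
    return dict(groups)


# Hypothetical reference bitstring standing in for a ground state.
clusters = shells("1100", 4, 2)
for distance in sorted(clusters):
    print(distance, clusters[distance])
```

Because every state has the same Hamming weight, distances between them are always even, so the basis falls into a small number of shells around any reference state.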
-
A hybrid quantum-classical framework for computational fluid dynamics
Authors:
Chuang-Chao Ye,
Ning-Bo An,
Teng-Yang Ma,
Meng-Han Dou,
Wen Bai,
Zhao-Yun Chen,
Guo-Ping Guo
Abstract:
Great progress has been made in quantum computing in recent years, providing opportunities to overcome the scarcity of computational resources in many scientific computations such as computational fluid dynamics (CFD). In this work, we explore the potential of quantum computing in CFD and propose a hybrid classical-quantum CFD framework to harness the power of current quantum computers. In this framework, traditional CFD solvers are coupled with quantum linear algebra libraries in weak form to achieve collaborative computation between classical and quantum computing. The quantum linear solver provides high-precision solutions and scalable problem sizes for linear systems and is designed to be easily callable for solving linear systems, in the same way as classical linear algebra libraries, thus enabling seamless integration into existing CFD solvers. Several typical cases are run to validate the feasibility of the proposed framework and the correctness of quantum linear algorithms in CFD.
Submitted 24 June, 2024;
originally announced June 2024.
-
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models
Authors:
Renqiu Xia,
Song Mao,
Xiangchao Yan,
Hongbin Zhou,
Bo Zhang,
Haoyang Peng,
Jiahao Pi,
Daocheng Fu,
Wenjie Wu,
Hancheng Ye,
Shiyang Feng,
Bin Wang,
Chao Xu,
Conghui He,
Pinlong Cai,
Min Dou,
Botian Shi,
Sheng Zhou,
Yongwei Wang,
Bin Wang,
Junchi Yan,
Fei Wu,
Yu Qiao
Abstract:
Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extraction and understanding tasks, and their capacity to process within-document data formats such as charts and equations remains under-explored. To address these issues, we present DocGenome, a structured document benchmark constructed by annotating 500K scientific documents from 153 disciplines in the arXiv open-access community, using our custom auto-labeling pipeline. DocGenome features four key characteristics: 1) Completeness: It is the first dataset to structure data from all modalities including 13 layout attributes along with their LaTeX source codes. 2) Logicality: It provides 6 logical relationships between different entities within each scientific document. 3) Diversity: It covers various document-oriented tasks, including document classification, visual grounding, document layout detection, document transformation, open-ended single-page QA and multi-page QA. 4) Correctness: It undergoes rigorous quality control checks conducted by a specialized team. We conduct extensive experiments to demonstrate the advantages of DocGenome and objectively evaluate the performance of large models on our benchmark.
Submitted 17 June, 2024;
originally announced June 2024.
-
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Authors:
Qingyun Li,
Zhe Chen,
Weiyun Wang,
Wenhai Wang,
Shenglong Ye,
Zhenjiang Jin,
Guanzhou Chen,
Yinan He,
Zhangwei Gao,
Erfei Cui,
Jiashuo Yu,
Hao Tian,
Jiasheng Zhou,
Chao Xu,
Bin Wang,
Xingjian Wei,
Wei Li,
Wenjian Zhang,
Bo Zhang,
Pinlong Cai,
Licheng Wen,
Xiangchao Yan,
Zhenxiang Li,
Pei Chu,
Yi Wang
, et al. (15 additional authors not shown)
Abstract:
Image-text interleaved data, consisting of multiple images and texts arranged in a natural document format, aligns with the presentation paradigm of internet data and closely resembles human reading habits. Recent studies have shown that such data aids multimodal in-context learning and maintains the capabilities of large language models during multimodal fine-tuning. However, the limited scale and diversity of current image-text interleaved data restrict the development of multimodal large language models. In this paper, we introduce OmniCorpus, a 10 billion-scale image-text interleaved dataset. Using an efficient data engine, we filter and extract large-scale high-quality documents, which contain 8.6 billion images and 1,696 billion text tokens. Compared to counterparts (e.g., MMC4, OBELICS), our dataset 1) is 15 times larger in scale while maintaining good data quality; 2) features more diverse sources, including both English and non-English websites as well as video-centric websites; 3) is more flexible, easily degradable from an image-text interleaved format to a pure text corpus and image-text pairs. Through comprehensive analysis and experiments, we validate the quality, usability, and effectiveness of the proposed dataset. We hope this could provide a solid data foundation for future multimodal model research. Code and data are released at https://github.com/OpenGVLab/OmniCorpus.
Submitted 12 July, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Enabling Large-Scale and High-Precision Fluid Simulations on Near-Term Quantum Computers
Authors:
Zhao-Yun Chen,
Teng-Yang Ma,
Chuang-Chao Ye,
Liang Xu,
Ming-Yang Tan,
Xi-Ning Zhuang,
Xiao-Fan Xu,
Yun-Jie Wang,
Tai-Ping Sun,
Yong Chen,
Lei Du,
Liang-Liang Guo,
Hai-Feng Zhang,
Hao-Ran Tao,
Tian-Le Wang,
Xiao-Yan Yang,
Ze-An Zhao,
Peng Wang,
Sheng Zhang,
Chi Zhang,
Ren-Ze Zhao,
Zhi-Long Jia,
Wei-Cheng Kong,
Meng-Han Dou,
Jun-Chao Wang
, et al. (7 additional authors not shown)
Abstract:
Quantum computational fluid dynamics (QCFD) offers a promising alternative to classical computational fluid dynamics (CFD) by leveraging quantum algorithms for higher efficiency. This paper introduces a comprehensive QCFD method, including an iterative method, "Iterative-QLS", that suppresses errors in the quantum linear solver, and a subspace method to scale the solution to a larger size. We implement our method on a superconducting quantum computer, demonstrating successful simulations of steady Poiseuille flow and unsteady acoustic wave propagation. The Poiseuille flow simulation achieved a relative error of less than $0.2\%$, and the unsteady acoustic wave simulation solved a 5043-dimensional matrix. We emphasize the utilization of the quantum-classical hybrid approach in applications of near-term quantum computers. By adapting to quantum hardware constraints and offering scalable solutions for large-scale CFD problems, our method paves the way for practical applications of near-term quantum computers in computational science.
Submitted 19 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving
Authors:
Jianbiao Mei,
Yukai Ma,
Xuemeng Yang,
Licheng Wen,
Xinyu Cai,
Xin Li,
Daocheng Fu,
Bo Zhang,
Pinlong Cai,
Min Dou,
Botian Shi,
Liang He,
Yong Liu,
Yu Qiao
Abstract:
Autonomous driving has advanced significantly due to improvements in sensors, machine learning, and artificial intelligence. However, prevailing methods struggle with intricate scenarios and causal relationships, hindering adaptability and interpretability in varied environments. To address these problems, we introduce LeapAD, a novel paradigm for autonomous driving inspired by the human cognitive process. Specifically, LeapAD emulates human attention by selecting critical objects relevant to driving decisions, simplifying environmental interpretation, and mitigating decision-making complexities. Additionally, LeapAD incorporates an innovative dual-process decision-making module, which consists of an Analytic Process (System-II) for thorough analysis and reasoning, along with a Heuristic Process (System-I) for swift and empirical processing. The Analytic Process leverages its logical reasoning to accumulate linguistic driving experience, which is then transferred to the Heuristic Process by supervised fine-tuning. Through reflection mechanisms and a growing memory bank, LeapAD continuously improves itself from past mistakes in a closed-loop environment. Closed-loop testing in CARLA shows that LeapAD outperforms all methods relying solely on camera input, while requiring 1-2 orders of magnitude less labeled data. Experiments also demonstrate that as the memory bank expands, the Heuristic Process with only 1.8B parameters can inherit the knowledge from a GPT-4 powered Analytic Process and achieve continuous performance improvement. Code will be released at https://github.com/PJLab-ADG/LeapAD.
Submitted 24 May, 2024;
originally announced May 2024.
-
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Authors:
Zheng Zhu,
Xiaofeng Wang,
Wangbo Zhao,
Chen Min,
Nianchen Deng,
Min Dou,
Yuqi Wang,
Botian Shi,
Kai Wang,
Chi Zhang,
Yang You,
Zhaoxiang Zhang,
Dawei Zhao,
Liang Xiao,
Jian Zhao,
Jiwen Lu,
Guan Huang
Abstract:
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems. Recently, the emergence of the Sora model has attracted significant attention due to its remarkable simulation capabilities, which exhibit an incipient comprehension of physical laws. In this survey, we embark on a comprehensive exploration of the latest advancements in world models. Our analysis navigates through the forefront of generative methodologies in video generation, where world models stand as pivotal constructs facilitating the synthesis of highly realistic visual content. Additionally, we scrutinize the burgeoning field of autonomous-driving world models, meticulously delineating their indispensable role in reshaping transportation and urban mobility. Furthermore, we delve into the intricacies inherent in world models deployed within autonomous agents, shedding light on their profound significance in enabling intelligent interactions within dynamic environmental contexts. Finally, we examine the challenges and limitations of world models, and discuss their potential future directions. We hope this survey can serve as a foundational reference for the research community and inspire continued innovation. This survey will be regularly updated at: https://github.com/GigaAI-research/General-World-Models-Survey.
Submitted 6 May, 2024;
originally announced May 2024.
-
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Authors:
Zhe Chen,
Weiyun Wang,
Hao Tian,
Shenglong Ye,
Zhangwei Gao,
Erfei Cui,
Wenwen Tong,
Kongzhi Hu,
Jiapeng Luo,
Zheng Ma,
Ji Ma,
Jiaqi Wang,
Xiaoyi Dong,
Hang Yan,
Hewei Guo,
Conghui He,
Botian Shi,
Zhenjiang Jin,
Chao Xu,
Bin Wang,
Xingjian Wei,
Wei Li,
Wenjian Zhang,
Bo Zhang,
Pinlong Cai
, et al. (10 additional authors not shown)
Abstract:
In this report, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements: (1) Strong Vision Encoder: we explored a continuous learning strategy for the large-scale vision foundation model -- InternViT-6B, boosting its visual understanding capabilities and allowing it to be transferred and reused in different LLMs. (2) Dynamic High-Resolution: we divide images into tiles ranging from 1 to 40 of 448$\times$448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-Quality Bilingual Dataset: we carefully collected a high-quality bilingual dataset that covers common scenes and document images, and annotated it with English and Chinese question-answer pairs, significantly enhancing performance in OCR- and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks. Code has been released at https://github.com/OpenGVLab/InternVL.
Submitted 29 April, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes
Authors:
Ziqian Bai,
Feitong Tan,
Sean Fanello,
Rohit Pandey,
Mingsong Dou,
Shichen Liu,
Ping Tan,
Yinda Zhang
Abstract:
3D head avatars built with neural implicit volumetric representations have achieved unprecedented levels of photorealism. However, the computational cost of these methods remains a significant barrier to their widespread adoption, particularly in real-time applications such as virtual reality and teleconferencing. While attempts have been made to develop fast neural rendering approaches for static scenes, these methods cannot be simply employed to support realistic facial expressions, such as in the case of a dynamic facial performance. To address these challenges, we propose a novel fast 3D neural implicit head avatar model that achieves real-time rendering while maintaining fine-grained controllability and high rendering quality. Our key idea lies in the introduction of local hash table blendshapes, which are learned and attached to the vertices of an underlying face parametric model. These per-vertex hash tables are linearly merged with weights predicted via a CNN, resulting in expression-dependent embeddings. Our novel representation enables efficient density and color predictions using a lightweight MLP, which is further accelerated by a hierarchical nearest neighbor search method. Extensive experiments show that our approach runs in real time while achieving rendering quality comparable to state-of-the-art methods and decent results on challenging expressions.
Submitted 1 April, 2024;
originally announced April 2024.
-
Variational quantum eigensolver with linear depth problem-inspired ansatz for solving portfolio optimization in finance
Authors:
Shengbin Wang,
Peng Wang,
Guihui Li,
Shubin Zhao,
Dongyi Zhao,
Jing Wang,
Yuan Fang,
Menghan Dou,
Yongjian Gu,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
Great efforts have been dedicated in recent years to exploring practical applications for noisy intermediate-scale quantum (NISQ) computers, which is a fundamental and challenging problem in quantum computing. As one of the most promising methods, the variational quantum eigensolver (VQE) has been extensively studied. In this paper, VQE is applied to solve portfolio optimization problems in finance by designing two hardware-efficient Dicke state ansatze that reach a maximum two-qubit gate depth of $2n$ and use $n^2/4$ parameters, with $n$ being the number of qubits used. Both ansatze are partitioning-friendly, allowing for the proposal of a highly scalable quantum/classical hybrid distributed computing (HDC) scheme. Combining simultaneous sampling, problem-specific measurement error mitigation, and fragment reuse techniques, we successfully implement the HDC experiments on the superconducting quantum computer Wu Kong with up to 55 qubits. The simulation and experimental results illustrate that the restricted expressibility of the ansatze, induced by the small number of parameters and limited entanglement, is advantageous for solving classical optimization problems with the conditional value-at-risk (CVaR) cost function in the NISQ era and beyond. Furthermore, the HDC scheme shows great potential for achieving quantum advantage in the NISQ era. We hope that the heuristic idea presented in this paper can motivate fruitful investigations in current and future quantum computing paradigms.
Submitted 7 March, 2024;
originally announced March 2024.
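The CVaR cost function named in the abstract above is commonly estimated from measurement samples as the mean of the lowest fraction alpha of the sampled energies. The sketch below shows that standard estimator only; the function name `cvar`, the sample values, and the choice of alpha are illustrative assumptions and are not taken from the paper's code.

```python
import math


def cvar(energies, alpha):
    """Mean of the lowest ceil(alpha * N) sampled energies, for 0 < alpha <= 1."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    if not energies:
        raise ValueError("at least one sample is required")
    ordered = sorted(energies)
    # Keep at least one sample so that a very small alpha still gives a value.
    keep = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:keep]) / keep


# Hypothetical per-shot energies, one value per measured bitstring.
samples = [3.0, -1.0, 0.5, -2.0, 4.0, 1.0, 0.0, 2.0]
print(cvar(samples, 0.25))  # mean of the two lowest samples
print(cvar(samples, 1.0))   # plain sample mean
```

With alpha = 1 the estimator reduces to the ordinary expectation value, while smaller alpha focuses the optimizer on the best-performing tail of the samples.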
-
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Authors:
Renqiu Xia,
Bo Zhang,
Hancheng Ye,
Xiangchao Yan,
Qi Liu,
Hongbin Zhou,
Zijun Chen,
Min Dou,
Botian Shi,
Junchi Yan,
Yu Qiao
Abstract:
Recently, many versatile Multi-modal Large Language Models (MLLMs) have emerged. However, their capacity to query information depicted in visual charts and engage in reasoning based on the queried contents remains under-explored. In this paper, to comprehensively and rigorously benchmark the ability of off-the-shelf MLLMs in the chart domain, we construct ChartX, a multi-modal evaluation set covering 18 chart types, 7 chart tasks, 22 disciplinary topics, and high-quality chart data. In addition, we develop ChartVLM to offer a new perspective on handling multi-modal tasks that strongly depend on interpretable patterns, such as reasoning tasks in the field of charts or geometric images. We evaluate the chart-related ability of mainstream MLLMs and our ChartVLM on the proposed ChartX evaluation set. Extensive experiments demonstrate that ChartVLM surpasses both versatile and chart-related large models, achieving results comparable to GPT-4V. We believe that our study can pave the way for further exploration in creating a more comprehensive chart evaluation set and developing more interpretable multi-modal models. Both ChartX and ChartVLM are available at: https://github.com/UniModal4Reasoning/ChartVLM
Submitted 19 February, 2024;
originally announced February 2024.
-
OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving
Authors:
Guohang Yan,
Jiahao Pi,
Jianfei Guo,
Zhaotong Luo,
Min Dou,
Nianchen Deng,
Qiusheng Huang,
Daocheng Fu,
Licheng Wen,
Pinlong Cai,
Xing Gao,
Xinyu Cai,
Bo Zhang,
Xuemeng Yang,
Yeqi Bai,
Hongbin Zhou,
Botian Shi
Abstract:
With the development of deep learning and computer vision technology, autonomous driving provides new solutions to improve traffic safety and efficiency. The importance of building high-quality datasets is self-evident, especially with the rise of end-to-end autonomous driving algorithms in recent years. Data plays a core role in the algorithm closed-loop system. However, collecting real-world data is expensive, time-consuming, and unsafe. With the development of implicit rendering technology and in-depth research on using generative models to produce data at scale, we propose OASim, an open and adaptive simulator and autonomous driving data generator based on implicit neural rendering. It has the following characteristics: (1) High-quality scene reconstruction through neural implicit surface reconstruction technology. (2) Trajectory editing of the ego vehicle and participating vehicles. (3) A rich vehicle model library whose models can be freely selected and inserted into the scene. (4) A rich sensor model library from which specified sensors can be selected to generate data. (5) A highly customizable data generation system that can generate data according to user needs. We demonstrate the high quality and fidelity of the generated data through perception performance evaluation on the CARLA simulator and real-world data acquisition. Code is available at https://github.com/PJLab-ADG/OASim.
Submitted 6 February, 2024;
originally announced February 2024.
-
LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving
Authors:
Daocheng Fu,
Wenjie Lei,
Licheng Wen,
Pinlong Cai,
Song Mao,
Min Dou,
Botian Shi,
Yu Qiao
Abstract:
The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new avenues in artificial intelligence, particularly for autonomous driving by offering enhanced understanding and reasoning capabilities. This paper introduces LimSim++, an extended version of LimSim designed for the application of (M)LLMs in autonomous driving. Acknowledging the limitations of existing simulation platforms, LimSim++ addresses the need for a long-term closed-loop infrastructure supporting continuous learning and improved generalization in autonomous driving. The platform offers extended-duration, multi-scenario simulations, providing crucial information for (M)LLM-driven vehicles. Users can engage in prompt engineering, model evaluation, and framework enhancement, making LimSim++ a versatile tool for research and practice. This paper additionally introduces a baseline (M)LLM-driven framework, systematically validated through quantitative experiments across diverse scenarios. The open-source resources of LimSim++ are available at: https://pjlab-adg.github.io/limsim-plus/.
Submitted 12 April, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Underwater motions analysis and control of a coupling-tiltable unmanned aerial-aquatic quadrotor
Authors:
Dongyue Huang,
Chenggang Wang,
Minghao Dou,
Xuchen Liu,
Zixuan Liu,
Biao Wang,
Ben M. Chen
Abstract:
This paper proposes a method for analyzing a series of potential motions in a coupling-tiltable aerial-aquatic quadrotor based on its nonlinear dynamics. Some characteristics and constraints derived by this method are specified as Singular Thrust Tilt Angles (STTAs), which are used to generate motions, including planar motions. A switch-based control scheme addresses issues of control direction uncertainty inherent to the mechanical structure by incorporating a saturated Nussbaum function. A high-fidelity simulation environment incorporating a comprehensive hydrodynamic model is built based on a Hardware-In-The-Loop (HITL) setup with Gazebo and a flight control board. The experiments validate the effectiveness of the absolute and quasi planar motions, which cannot be achieved by conventional quadrotors, and demonstrate stable performance when the pitch or roll angle is activated in the auxiliary control channel.
Submitted 12 December, 2023;
originally announced December 2023.
-
Towards Knowledge-driven Autonomous Driving
Authors:
Xin Li,
Yeqi Bai,
Pinlong Cai,
Licheng Wen,
Daocheng Fu,
Bo Zhang,
Xuemeng Yang,
Xinyu Cai,
Tao Ma,
Jianfei Guo,
Xing Gao,
Min Dou,
Yikang Li,
Botian Shi,
Yong Liu,
Liang He,
Yu Qiao
Abstract:
This paper explores the emerging knowledge-driven autonomous driving technologies. Our investigation highlights the limitations of current autonomous driving systems, in particular their sensitivity to data bias, difficulty in handling long-tail scenarios, and lack of interpretability. Conversely, knowledge-driven methods with the abilities of cognition, generalization and life-long learning emerge as a promising way to overcome these challenges. This paper delves into the essence of knowledge-driven autonomous driving and examines its core components: dataset & benchmark, environment, and driver agent. By leveraging large language models, world models, neural rendering, and other advanced artificial intelligence techniques, these components collectively contribute to a more holistic, adaptive, and intelligent autonomous driving system. The paper systematically organizes and reviews previous research efforts in this area, and provides insights and guidance for future research and practical applications of autonomous driving. We will continually share the latest updates on cutting-edge developments in knowledge-driven autonomous driving along with the relevant valuable open-source resources at: https://github.com/PJLab-ADG/awesome-knowledge-driven-AD.
Submitted 27 December, 2023; v1 submitted 7 December, 2023;
originally announced December 2023.
-
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
Authors:
Licheng Wen,
Xuemeng Yang,
Daocheng Fu,
Xiaofeng Wang,
Pinlong Cai,
Xin Li,
Tao Ma,
Yingxuan Li,
Linran Xu,
Dengke Shang,
Zheng Zhu,
Shaoyan Sun,
Yeqi Bai,
Xinyu Cai,
Min Dou,
Shuanglu Hu,
Botian Shi,
Yu Qiao
Abstract:
The pursuit of autonomous driving technology hinges on the sophisticated integration of perception, decision-making, and control systems. Traditional approaches, both data-driven and rule-based, have been hindered by their inability to grasp the nuance of complex driving environments and the intentions of other road users. This has been a significant bottleneck, particularly in the development of the common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving. The advent of Visual Language Models (VLM) represents a novel frontier in realizing fully autonomous vehicle driving. This report provides an exhaustive evaluation of the latest state-of-the-art VLM, GPT-4V(ision), and its application in autonomous driving scenarios. We explore the model's abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. Our comprehensive tests span from basic scene recognition to complex causal reasoning and real-time decision-making under varying conditions. Our findings reveal that GPT-4V demonstrates superior performance in scene understanding and causal reasoning compared to existing autonomous systems. It showcases the potential to handle out-of-distribution scenarios, recognize intentions, and make informed decisions in real driving contexts. However, challenges remain, particularly in direction discernment, traffic light recognition, vision grounding, and spatial reasoning tasks. These limitations underscore the need for further research and development. The project is now available on GitHub for interested parties to access and utilize: \url{https://github.com/PJLab-ADG/GPT4V-AD-Exploration}
Submitted 28 November, 2023; v1 submitted 9 November, 2023;
originally announced November 2023.
-
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models
Authors:
Licheng Wen,
Daocheng Fu,
Xin Li,
Xinyu Cai,
Tao Ma,
Pinlong Cai,
Min Dou,
Botian Shi,
Liang He,
Yu Qiao
Abstract:
Recent advancements in autonomous driving have relied on data-driven approaches, which are widely adopted but face challenges including dataset bias, overfitting, and uninterpretability. Drawing inspiration from the knowledge-driven nature of human driving, we explore the question of how to instill similar capabilities into autonomous driving systems and summarize a paradigm that integrates an interactive environment, a driver agent, as well as a memory component to address this question. Leveraging large language models (LLMs) with emergent abilities, we propose the DiLu framework, which combines a Reasoning and a Reflection module to enable the system to perform decision-making based on common-sense knowledge and evolve continuously. Extensive experiments prove DiLu's capability to accumulate experience and demonstrate a significant advantage in generalization ability over reinforcement learning-based methods. Moreover, DiLu is able to directly acquire experiences from real-world datasets which highlights its potential to be deployed on practical autonomous driving systems. To the best of our knowledge, we are the first to leverage knowledge-driven capability in decision-making for autonomous vehicles. Through the proposed DiLu framework, LLM is strengthened to apply knowledge and to reason causally in the autonomous driving domain. Project page: https://pjlab-adg.github.io/DiLu/
Submitted 21 February, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation
Authors:
Bo Zhang,
Xinyu Cai,
Jiakang Yuan,
Donglin Yang,
Jianfei Guo,
Xiangchao Yan,
Renqiu Xia,
Botian Shi,
Min Dou,
Tao Chen,
Si Liu,
Junchi Yan,
Yu Qiao
Abstract:
Domain shifts such as sensor type changes and geographical situation variations are prevalent in Autonomous Driving (AD), which poses a challenge since an AD model relying on previous domain knowledge can hardly be deployed directly to a new domain without additional costs. In this paper, we provide a new perspective and approach for alleviating domain shifts by proposing a Reconstruction-Simulation-Perception (ReSimAD) scheme. Specifically, the implicit reconstruction process is based on knowledge from the previous old domain, aiming to convert the domain-related knowledge into domain-invariant representations, e.g., 3D scene-level meshes. Besides, the point cloud simulation process for multiple new domains is conditioned on the above reconstructed 3D meshes, from which target-domain-like simulation samples can be obtained, thus reducing the cost of collecting and annotating new-domain data for the subsequent perception process. For experiments, we consider different cross-domain situations such as Waymo-to-KITTI, Waymo-to-nuScenes, Waymo-to-ONCE, etc., to verify the zero-shot target-domain perception using ReSimAD. Results demonstrate that our method is beneficial for boosting domain generalization ability and is even promising for 3D pre-training.
Submitted 25 January, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images
Authors:
Tze Ho Elden Tse,
Franziska Mueller,
Zhengyang Shen,
Danhang Tang,
Thabo Beeler,
Mingsong Dou,
Yinda Zhang,
Sasa Petrovic,
Hyung Jin Chang,
Jonathan Taylor,
Bardia Doosti
Abstract:
We propose a novel transformer-based framework that reconstructs two high-fidelity hands from multi-view RGB images. Unlike existing hand pose estimation methods, where one typically trains a deep network to regress hand model parameters from a single RGB image, we consider a more challenging problem setting where we directly regress the absolute root poses of two hands with extended forearms at high resolution from an egocentric view. As existing datasets are either infeasible for egocentric viewpoints or lack background variations, we create a large-scale synthetic dataset with diverse scenarios and collect a real dataset from a multi-calibrated camera setup to verify our proposed multi-view image feature fusion strategy. To make the reconstruction physically plausible, we propose two strategies: (i) a coarse-to-fine spectral graph convolution decoder to smoothen the meshes during upsampling and (ii) an optimisation-based refinement stage at inference to prevent self-penetrations. Through extensive quantitative and qualitative evaluations, we show that our framework is able to produce realistic two-hand reconstructions and demonstrate the generalisation of synthetic-trained models to real data, as well as real-time AR/VR applications.
Submitted 21 August, 2023;
originally announced August 2023.
-
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models
Authors:
Daocheng Fu,
Xin Li,
Licheng Wen,
Min Dou,
Pinlong Cai,
Botian Shi,
Yu Qiao
Abstract:
In this paper, we explore the potential of using a large language model (LLM) to understand the driving environment in a human-like manner and analyze its ability to reason, interpret, and memorize when facing complex scenarios. We argue that traditional optimization-based and modular autonomous driving (AD) systems face inherent performance limitations when dealing with long-tail corner cases. To address this problem, we propose that an ideal AD system should drive like a human, accumulating experience through continuous driving and using common sense to solve problems. To achieve this goal, we identify three key abilities necessary for an AD system: reasoning, interpretation, and memorization. We demonstrate the feasibility of employing an LLM in driving scenarios by building a closed-loop system to showcase its comprehension and environment-interaction abilities. Our extensive experiments show that the LLM exhibits an impressive ability to reason and solve long-tail cases, providing valuable insights for the development of human-like autonomous driving. The related code is available at https://github.com/PJLab-ADG/DriveLikeAHuman .
Submitted 14 July, 2023;
originally announced July 2023.
-
LimSim: A Long-term Interactive Multi-scenario Traffic Simulator
Authors:
Licheng Wen,
Daocheng Fu,
Song Mao,
Pinlong Cai,
Min Dou,
Yikang Li,
Yu Qiao
Abstract:
With the growing popularity of digital twins and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing. Existing simulation systems suffer from a lack of support for different types of scenarios, and the vehicle models used in these systems are too simplistic. Thus, such systems fail to represent driving styles and multi-vehicle interactions, and struggle to handle corner cases in the dataset. In this paper, we propose LimSim, the Long-term Interactive Multi-scenario traffic Simulator, which aims to provide a long-term continuous simulation capability on urban road networks. LimSim can simulate fine-grained dynamic scenarios and focuses on the diverse interactions between multiple vehicles in the traffic flow. This paper provides a detailed introduction to the framework and features of LimSim, and demonstrates its performance through case studies and experiments. LimSim is now open source on GitHub: https://www.github.com/PJLab-ADG/LimSim .
Submitted 26 July, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Predicting RNA Secondary Structure on Universal Quantum Computer
Authors:
Ji Jiang,
Qipeng Yan,
Ye Li,
Min Lu,
Ziwei Cui,
Menghan Dou,
Qingchun Wang,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
Knowing how the secondary structure of RNA is formed is the first step toward understanding how RNA folds from its base sequence. Traditional energy-based algorithms lack precision, particularly for non-nested sequences, while learning-based algorithms face challenges in obtaining high-quality training data. Recently, quantum annealers have rapidly predicted the folding of the secondary structure, highlighting that quantum computing is a promising solution to this problem. However, gate-model algorithms for universal quantum computing are not yet available. In this paper, we present gate-based quantum algorithms, which are highly flexible and can be applied to various physical devices. By mapping all possible secondary structures to the states of a quadratic Hamiltonian, the whole folding process is described as a quadratic unconstrained binary optimization model. The model can then be solved with the quantum approximate optimization algorithm. We demonstrate the performance with both numerical simulation and experimental realization. Across our benchmark dataset, simulation results suggest that our quantum approach is comparable in accuracy to classical methods. For non-nested sequences, our quantum approach outperforms classical energy-based methods. Experimental results also indicate that our method is robust on current noisy devices. This is the first instance of universal quantum algorithms being employed to tackle RNA folding problems, and our work provides a valuable model for utilizing universal quantum computers in solving RNA folding problems.
Submitted 17 May, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
An Improved QFT-Based Quantum Comparator and Extended Modular Arithmetic Using One Ancilla Qubit
Authors:
Yewei Yuan,
Chao Wang,
Bei Wang,
Zhao-Yun Chen,
Meng-Han Dou,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
Quantum comparators and modular arithmetic are fundamental in many quantum algorithms. Current research mainly focuses on operations between two quantum states. However, various applications, such as integer factorization, optimization, option pricing, and risk analysis, commonly require one of the inputs to be classical. This requires many ancillary qubits, especially when subsequent computations are involved. In this paper, we propose a quantum-classical comparator based on the quantum Fourier transform (QFT). We then extend it to the comparison of two quantum integers and to modular arithmetic. The proposed operators require only one ancilla qubit, which is optimal for qubit resources. We analyze limitations in the current modular addition circuit and develop it to process arbitrary quantum states in the entire $n$-qubit space. The proposed algorithms reduce computing resources, which makes them valuable for Noisy Intermediate-Scale Quantum (NISQ) computers.
Submitted 15 May, 2023;
originally announced May 2023.
-
Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
Authors:
Ziqian Bai,
Feitong Tan,
Zeng Huang,
Kripasindhu Sarkar,
Danhang Tang,
Di Qiu,
Abhimitra Meka,
Ruofei Du,
Mingsong Dou,
Sergio Orts-Escolano,
Rohit Pandey,
Ping Tan,
Thabo Beeler,
Sean Fanello,
Yinda Zhang
Abstract:
We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and head poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism. To reduce over-smoothing and improve out-of-model expressions synthesis, we propose to predict local features anchored on the 3DMM geometry. These learnt features are driven by 3DMM deformation and interpolated in 3D space to yield the volumetric radiance at a designated query point. We further show that using a Convolutional Neural Network in the UV space is critical in incorporating spatial context and producing representative local features. Extensive experiments show that we are able to reconstruct high-quality avatars, with more accurate expression-dependent details, good generalization to out-of-training expressions, and quantitatively superior renderings compared to other state-of-the-art approaches.
Submitted 3 April, 2023;
originally announced April 2023.
-
Simulation of chemical reaction dynamics based on quantum computing
Authors:
Qiankun Gong,
Qingmin Man,
Ye Li,
Menghan Dou,
Qingchun Wang,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
The molecular energies of chemical systems have been successfully calculated on quantum computers; however, more attention has been paid to the dynamic process of chemical reactions in practical applications, especially in catalyst design and material synthesis. Due to the limited capabilities of noisy intermediate-scale quantum (NISQ) devices, directly simulating the reaction dynamics and determining the reaction pathway still remain a challenge. Here we develop ab initio molecular dynamics based on quantum computing to simulate reaction dynamics by extending the correlated sampling approach. We also use this approach to calculate the Hessian matrix and evaluate computational resources. We test the performance of our approach by simulating the hydrogen exchange reaction and the bimolecular nucleophilic substitution SN2 reaction. Our results suggest that it is reliable for characterizing molecular structure, properties, and reactivity, which is another important expansion of the application of quantum computing.
Submitted 27 March, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Efficient and Error-Resilient Data Access Protocols for a Limited-Sized Quantum Random Access Memory
Authors:
Zhao-Yun Chen,
Cheng Xue,
Yun-Jie Wang,
Tai-Ping Sun,
Huan-Yu Liu,
Xi-Ning Zhuang,
Meng-Han Dou,
Tian-Rui Zou,
Yuan Fang,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
Quantum Random Access Memory (QRAM) is a critical component for loading classical data into quantum computers. While constructing a practical QRAM presents several challenges, including the impracticality of an infinitely large QRAM size and a fully error-corrected implementation, it is essential to consider a practical case where the QRAM has a limited size. In this work, we focus on accessing larger data sizes without continually increasing the size of the QRAM. Firstly, we address the challenge of word length, as real-world datasets typically have larger word lengths than the single-bit data that most previous studies have focused on. We propose a novel protocol for loading data with larger word lengths $k$ without increasing the number of QRAM levels $n$. By exploiting the parallelism in the data query process, our protocol achieves a time complexity of $O(n+k)$ and improves error scaling performance compared to existing approaches. Secondly, we provide a data-loading method for general-sized data access tasks when the number of data items exceeds $2^n$, which outperforms the existing hybrid QRAM+QROM architecture. Our method contributes to the development of time- and error-optimized data access protocols for QRAM devices, reducing the qubit count and error requirements for QRAM implementation, and making it easier to construct practical QRAM devices with a limited number of physical qubits.
Submitted 21 June, 2023; v1 submitted 9 March, 2023;
originally announced March 2023.
-
Data-driven prognostics based on time-frequency analysis and symbolic recurrent neural network for fuel cells under dynamic load
Authors:
Chu Wang,
Manfeng Dou,
Zhongliang Li,
Rachid Outbib,
Dongdong Zhao,
Jian Zuo,
Yuanlin Wang,
Bin Liang,
Peng Wang
Abstract:
Data-centric prognostics is beneficial for improving the reliability and safety of proton exchange membrane fuel cells (PEMFC). For the prognostics of PEMFC operating under dynamic load, the challenges come from extracting degradation features, improving prediction accuracy, expanding the prognostics horizon, and reducing computational cost. To address these issues, this work proposes a data-driven PEMFC prognostics approach, in which the Hilbert-Huang transform is used to extract a health indicator under dynamic operating conditions and a symbolic-based gated recurrent unit model is used to enhance the accuracy of life prediction. Compared with other state-of-the-art methods, the proposed data-driven prognostics approach provides a competitive prognostics horizon with lower computational cost. The prognostics performance shows consistency and generalizability under different failure threshold settings.
Submitted 3 February, 2023;
originally announced February 2023.
-
TJ-FlyingFish: Design and Implementation of an Aerial-Aquatic Quadrotor with Tiltable Propulsion Units
Authors:
Xuchen Liu,
Minghao Dou,
Dongyue Huang,
Biao Wang,
Jinqiang Cui,
Qinyuan Ren,
Lihua Dou,
Zhi Gao,
Jie Chen,
Ben M. Chen
Abstract:
Aerial-aquatic vehicles are capable of moving in the two most dominant fluids, making them promising for a wide range of applications. We propose a prototype with special designs for propulsion and thruster configuration to cope with the vast differences in the fluid properties of water and air. For propulsion, the operating range is switched between the two media by the dual-speed propulsion unit, providing sufficient thrust while also ensuring output efficiency. For thruster configuration, thrust vectoring is realized by rotating the propulsion unit around the mount arm, thus enhancing underwater maneuverability. This paper presents a quadrotor prototype of this concept, along with its design details and practical realization.
Submitted 6 February, 2023; v1 submitted 28 January, 2023;
originally announced January 2023.
-
VQNet 2.0: A New Generation Machine Learning Framework that Unifies Classical and Quantum
Authors:
Huanyu Bian,
Zhilong Jia,
Menghan Dou,
Yuan Fang,
Lei Li,
Yiming Zhao,
Hanchao Wang,
Zhaohui Zhou,
Wei Wang,
Wenyu Zhu,
Ye Li,
Yang Yang,
Weiming Zhang,
Nenghai Yu,
Zhaoyun Chen,
Guoping Guo
Abstract:
With the rapid development of classical and quantum machine learning, a large number of machine learning frameworks have been proposed. However, existing machine learning frameworks usually focus on either classical or quantum machine learning, rather than both. Therefore, based on VQNet 1.0, we further propose VQNet 2.0, a new generation of unified classical and quantum machine learning framework that supports hybrid optimization. The core library of the framework is implemented in C++, the user level is implemented in Python, and the framework supports deployment on quantum and classical hardware. In this article, we analyze the development trend of new-generation machine learning frameworks and introduce the design principles of VQNet 2.0 in detail: unity, practicality, efficiency, and compatibility, as well as full particulars of implementation. We illustrate the functions of VQNet 2.0 through several basic applications, including classical convolutional neural networks, quantum autoencoders, hybrid classical-quantum networks, etc. After that, through extensive experiments, we demonstrate that the operation speed of VQNet 2.0 is higher than that of the comparison method. Finally, through extensive experiments, we demonstrate that VQNet 2.0 can be deployed on different hardware platforms and that its overall calculation speed is faster than that of the comparison method. It can also be mixed and optimized with quantum circuits composed of multiple quantum computing libraries.
Submitted 9 January, 2023;
originally announced January 2023.
-
QPanda: high-performance quantum computing framework for multiple application scenarios
Authors:
Menghan Dou,
Tianrui Zou,
Yuan Fang,
Jing Wang,
Dongyi Zhao,
Lei Yu,
Boying Chen,
Wenbo Guo,
Ye Li,
Zhaoyun Chen,
Guoping Guo
Abstract:
With the birth of Noisy Intermediate Scale Quantum (NISQ) devices and the verification of "quantum supremacy" in random number sampling and boson sampling, more and more fields hope to use quantum computers to solve specific problems, such as aerodynamic design, route allocation, financial option prediction, quantum chemical simulation to find new materials, and the challenge of quantum cryptography to automotive industry security. However, these fields still need to constantly explore quantum algorithms that adapt to current NISQ machines, so a quantum programming framework that can handle multiple scenarios and application needs is required. Therefore, this paper proposes QPanda, an application scenario-oriented quantum programming framework with high-performance simulation. Example uses include designing quantum chemical simulation algorithms based on it to explore new materials and building a quantum machine learning framework to serve finance. This framework implements high-performance simulation of quantum circuits, a configuration of the fusion processing backend of quantum computers and supercomputers, and compilation and optimization methods of quantum programs for NISQ machines. Finally, the experiment shows that quantum jobs can be executed with high fidelity on the quantum processor using the quantum circuit compilation and optimization interface, and that the framework has better simulation performance.
Submitted 29 December, 2022;
originally announced December 2022.
-
UniDA3D: Unified Domain Adaptive 3D Semantic Segmentation Pipeline
Authors:
Ben Fei,
Siyuan Huang,
Jiakang Yuan,
Botian Shi,
Bo Zhang,
Weidong Yang,
Min Dou,
Yikang Li
Abstract:
State-of-the-art 3D semantic segmentation models are trained on off-the-shelf public benchmarks, but they inevitably face a drop in recognition accuracy when these well-trained models are deployed to a new domain. In this paper, we introduce a Unified Domain Adaptive 3D semantic segmentation pipeline (UniDA3D) to enhance the weak generalization ability and bridge the point distribution gap between domains. Different from previous studies that only focus on a single adaptation task, UniDA3D can tackle several adaptation tasks in the 3D segmentation field by designing a unified source-and-target active sampling strategy, which selects a maximally-informative subset from both source and target domains for effective model adaptation. Besides, benefiting from the rise of multi-modal 2D-3D datasets, UniDA3D investigates the possibility of achieving a multi-modal sampling strategy by developing a cross-modality feature interaction module that can extract a representative pair of image and point features to achieve a bi-directional image-point feature interaction for safe model adaptation. Experimentally, UniDA3D is verified to be effective in many adaptation tasks, including 1) unsupervised domain adaptation, 2) unsupervised few-shot domain adaptation, and 3) active domain adaptation. The results demonstrate that, by easily coupling UniDA3D with off-the-shelf 3D segmentation baselines, the domain generalization ability of these baselines can be enhanced.
Submitted 12 March, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
Authors:
Boyan Jiang,
Xinlin Ren,
Mingsong Dou,
Xiangyang Xue,
Yanwei Fu,
Yinda Zhang
Abstract:
Recent progress in 4D implicit representation focuses on globally controlling the shape and motion with low-dimensional latent vectors, which is prone to missing surface details and accumulating tracking error. While many deep local representations have shown promising results for 3D shape modeling, their 4D counterpart does not exist yet. In this paper, we fill this gap by proposing a novel Local 4D implicit Representation for Dynamic clothed humans, named LoRD, which has the merits of both 4D human modeling and local representation, and enables high-fidelity reconstruction with detailed surface deformations, such as clothing wrinkles. In particular, our key insight is to encourage the network to learn the latent codes of a local part-level representation, capable of explaining the local geometry and temporal deformations. To perform inference at test time, we first estimate the inner body skeleton motion to track local parts at each time step, and then optimize the latent codes for each part via auto-decoding based on different types of observed data. Extensive experiments demonstrate that the proposed method has a strong capability for representing 4D humans, and outperforms state-of-the-art methods on practical applications, including 4D reconstruction from sparse points and non-rigid depth fusion, both qualitatively and quantitatively.
Submitted 17 August, 2022;
originally announced August 2022.
-
Response of the Fe K_alpha line emission to the X-ray continuum variability in the changing-look active galactic nucleus NGC 1566
Authors:
W. C. Liang,
X. W. Shu,
J. X. Wang,
Y. Tan,
W. J. Zhang,
L. M. Sun,
N. Jiang,
L. M. Dou
Abstract:
NGC 1566 is a changing-look AGN known to exhibit recurrent X-ray outbursts, each lasting for several years. The most recent X-ray outburst was observed in 2018, with a substantial increase of the 2--10 keV flux by a factor of ~24 relative to the historical minimum. We re-analyze the XMM-Newton and NuSTAR observations covering the pre-outburst, outburst and post-outburst epochs, and confirm the discovery of the broad feature in the ~5--7 keV band during the outburst that could be interpreted as a relativistic Fe K_alpha emission line. Our analysis suggests that its flux has increased in tandem with the 2--10 keV continuum, making it the second changing-look AGN in which the broad Fe K_alpha line responds to the X-ray continuum variability. This behavior strongly supports the idea that the X-rays originate in a corona above the accretion disk, and that disk reflection produces the relativistic Fe K_alpha line. In addition, we find that the narrow Fe K_alpha emission line responds to changes in the X-ray continuum on a time-scale as short as four months, allowing us to place the line-emitting region at <0.1 pc, comparable to the size of the optical BLR. Compared with the changing-look AGN NGC 2992, the Fe K_alpha variation rate (the ratio of Fe K_alpha variation to luminosity variation) in NGC 1566 appears greater, which could possibly be explained by a larger amount of gas or a higher Fe abundance responsible for producing the Fe K_alpha line for the latter. The strength of the variable broad Fe K_alpha line, as well as that of the soft X-ray excess emission, appears to be correlated with the accretion rate, which could be explained by the state transition associated with the changing-look phenomenon.
Submitted 26 January, 2022;
originally announced January 2022.
-
Discovery of late-time X-ray flare and anomalous emission line enhancement after the nuclear optical outburst in a narrow-line Seyfert 1 Galaxy
Authors:
W. J. Zhang,
X. W. Shu,
Z. F. Sheng,
L. M. Sun,
L. M. Dou,
N. Jiang,
J. G. Wang,
X. Y. Hu,
Y. B. Wang,
T. G. Wang
Abstract:
CSS J102913+404220 is a peculiar narrow-line Seyfert 1 galaxy with an energetic nuclear optical outburst. We present a detailed analysis of its multi-wavelength photometric and spectroscopic observations covering a period of a decade since the outburst. We detect mid-infrared (MIR) flares delayed by about two months relative to the optical outburst, with an extremely high peak luminosity of log(L_4.6um)>44 erg/s. The MIR peak luminosity is at least an order of magnitude higher than that of any known supernova explosion, suggesting the optical outburst might be due to a stellar tidal disruption event (TDE). We find late-time X-ray brightening by a factor of >30 with respect to what is observed about 100 days after the optical outburst peak, followed by a flux fading by a factor of ~4 within two weeks, making it one of the active galactic nuclei (AGNs) with extreme variability. Despite the dramatic X-ray variability, there are no coincident strong flux variations in the optical, UV and MIR bands. This unusual variability behavior has been seen in other highly accreting AGNs and could be attributed to absorption variability. In this scenario, the decrease in the covering factor of the absorber with accretion rate could cause the X-ray brightening, possibly induced by the TDE. Most strikingly, while the UV/optical continuum changes little with time, an evident enhancement in the flux of the H_alpha broad emission line is observed about a decade after the nuclear optical outburst, an anomalous behavior never seen in any other AGN. Such an H_alpha anomaly could be explained by the replenishment and excitation of gas clouds within the broad line region (BLR), perhaps originating from the interaction of outflowing stellar debris with the BLR. The results highlight the importance of the late-time evolution of TDEs, which could affect the accretion properties of AGNs, as suggested by recent simulations.
Submitted 26 January, 2022;
originally announced January 2022.
-
Shortcuts to Quantum Approximate Optimization Algorithm
Authors:
Yahui Chai,
Yong-Jian Han,
Yu-Chun Wu,
Ye Li,
Menghan Dou,
Guo-Ping Guo
Abstract:
The Quantum Approximate Optimization Algorithm (QAOA) is a quantum-classical hybrid algorithm intended to find the ground state of a target Hamiltonian. Theoretically, QAOA can obtain the approximate solution if the quantum circuit is deep enough. In practice, however, the performance of QAOA degrades when the quantum circuit is deep, since near-term devices are not noise-free and the errors caused by noise accumulate as the circuit depth increases. In order to reduce the depth of quantum circuits, we propose a new ansatz dubbed "Shortcuts to QAOA" (S-QAOA). S-QAOA provides shortcuts to the ground state of the target Hamiltonian by including more two-body interactions and releasing the parameter freedoms. To be specific, besides the existing ZZ interaction in the QAOA ansatz, other two-body interactions are introduced in the S-QAOA ansatz such that the approximate solutions can be obtained with smaller circuit depth. Considering the MaxCut problem and the Sherrington-Kirkpatrick (SK) model, numerical computation shows that the YY interaction has the best performance. The reason for this might be the counterdiabatic effect generated by the YY interaction. On top of this, we release the freedom of the parameters of the two-body interactions, which a priori do not have to be identical, and numerical results show that the extra cost of having more parameter freedom is worth paying, since it yields a greater improvement in success rate.
Submitted 24 April, 2022; v1 submitted 20 December, 2021;
originally announced December 2021.
-
Long-term X-ray evolution of SDSS J134244.4+053056.1: A more than 18 year-old, long-lived IMBH-TDE candidate
Authors:
J. S. He,
L. M. Dou,
Y. L. Ai,
X. W. Shu,
N. Jiang,
T. G. Wang,
F. B. Zhang,
R. F. Shen
Abstract:
SDSS J134244.4+053056 is a tidal disruption event candidate with strong temporal coronal line emitters and a long fading, mid-infrared dust echo. We present detailed analyses of X-ray emission from a Swift/XRT observation in 2009 and the most recent XMM-Newton/pn observation in 2020. The two spectra can be modeled with hard and soft components. While no significant variability is detected in the hard component above 2 keV between these two observations, the soft X-ray emission in 0.3-2 keV varies by a factor of $\sim5$. The luminosity of this soft component fades from $\sim1.8\times10^{41}$ to $\sim3.7\times10^{40}$ erg s$^{-1}$ from the observation in Swift to that of XMM-Newton, which are 8 and 19 years after the outburst occurred, respectively. The evolution of luminosity matches with the $t^{-5/3}$ decline law well; there is a soft X-ray peak luminosity of 10$^{44}$ erg s$^{-1}$ at the time of the optical flare. Furthermore, the spectra of the soft component harden slightly in the decay phase, in which the photon index $Γ$ varies from $4.8^{+1.2}_{-0.9}$ to $3.7\pm0.5$, although they are consistent with each other if we consider the uncertainties. Additionally, by comparing the BH mass estimate between the $M-σ$ correlation, the broad H$α$ emission, and the fundamental plane relation of BH accretion, we find that a value of $\sim10^{5}$Msun is favored. If so, taking its X-ray spectral variation, luminosity evolution, and further support from theory into account, we suggest that SDSS J134244.4+053056 is a long-lived tidal disruption event candidate lasting more than 18 years with an intermediate-mass black hole.
Submitted 7 June, 2021;
originally announced June 2021.
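As a rough consistency check on the figures quoted in this abstract, the sketch below compares the observed fading of the soft component with what a pure $t^{-5/3}$ decline would give between the two epochs. It assumes the decline clock starts at the outburst and uses only the rounded luminosities and epochs stated above; the function name `power_law_ratio` is illustrative and not taken from the paper.

```python
def power_law_ratio(t_early, t_late, index=-5.0 / 3.0):
    """Return L(t_early) / L(t_late) for a luminosity L proportional to t**index."""
    return (t_early / t_late) ** index


# Rounded values quoted in the abstract (erg/s), 8 and 19 years after outburst.
L_SWIFT = 1.8e41
L_XMM = 3.7e40

observed = L_SWIFT / L_XMM             # fading factor between the two epochs
predicted = power_law_ratio(8.0, 19.0)  # fading factor for a pure t^(-5/3) law

print(f"observed fading factor:  {observed:.2f}")
print(f"predicted fading factor: {predicted:.2f}")
```

With these rounded inputs the observed factor is about 4.9 and the power-law prediction about 4.2. No luminosity uncertainties are given in the abstract, so the comparison is only indicative.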
-
Origin Pilot: a Quantum Operating System for Effecient Usage of Quantum Resources
Authors:
Weicheng Kong,
Junchao Wang,
Yongjian Han,
Yuchun Wu,
Yu Zhang,
Menghan Dou,
Yuan Fang,
Guoping Guo
Abstract:
An operating system is designed to manage the hardware and software resources of a computer. With the development of quantum computing, the management of quantum resources and the cooperation between quantum systems and other computing resources (e.g. CPU, GPU and FPGA) have become a key challenge in applying quantum computing to real-world problems. In this paper we propose a quantum operating system, Origin Pilot. Origin Pilot includes modules for quantum task scheduling, quantum resource management, quantum program compilation and automatic qubit calibration. With these modules, Origin Pilot can manage the quantum computing resources and solve the multi-quantum-processor scheduling problem. It also allows the parallel execution of multiple quantum programs and calibrates the quantum resources effectively. Thus, the performance of the resources is guaranteed and the resource utilization is improved. By comparing the results with and without Origin Pilot, we evaluate the impact of the qubit mapping algorithm on a quantum circuit's fidelity. We also evaluate the effectiveness of automatic calibration and of parallel execution on multiple quantum processors. Finally, Origin Pilot can be easily customized for hybrid computing resources.
Submitted 22 May, 2021;
originally announced May 2021.
-
HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
Authors:
Feitong Tan,
Danhang Tang,
Mingsong Dou,
Kaiwen Guo,
Rohit Pandey,
Cem Keskin,
Ruofei Du,
Deqing Sun,
Sofien Bouaziz,
Sean Fanello,
Ping Tan,
Yinda Zhang
Abstract:
In this paper, we address the problem of building dense correspondences between human images under arbitrary camera viewpoints and body poses. Prior art either assumes small motion between frames or relies on local descriptors, which cannot handle large motion or visually ambiguous body parts, e.g., left vs. right hand. In contrast, we propose a deep learning framework that maps each pixel to a feature space, where the feature distances reflect the geodesic distances among pixels as if they were projected onto the surface of a 3D human scan. To this end, we introduce novel loss functions to push features apart according to their geodesic distances on the surface. Without any semantic annotation, the proposed embeddings automatically learn to differentiate visually similar parts and align different subjects into a unified feature space. Extensive experiments show that the learned embeddings can produce accurate correspondences between images, with remarkable generalization capabilities both within and across subjects.
Submitted 29 March, 2021;
originally announced March 2021.
-
Effect of block medium parameters on energy dissipation
Authors:
K. X. Wang,
N. I. Aleksandrova,
Y. S. Pan,
V. N. Oparin,
L. M. Dou,
A. I. Chanyshev
Abstract:
This paper describes energy distribution in a block medium simulated by a one-dimensional chain of masses joined by springs and dampers. Equations describing the motion of masses are solved by the methods of the theory of ordinary differential equations. The effect of the block medium parameters on energy dissipation is investigated. An approximate analytical solution is obtained that describes the total energy of a block medium at large values of time.
Submitted 28 January, 2021;
originally announced January 2021.
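The model named in this abstract, a one-dimensional chain of masses joined by springs and dampers, can be illustrated numerically. The sketch below is not the paper's analytical method: it integrates the chain with a simple semi-implicit Euler step and records the total mechanical energy over time. Every parameter value, the free-end boundary condition, the initial impulse on the first mass, and the name `simulate_chain` are assumptions chosen for illustration.

```python
def simulate_chain(n=20, m=1.0, k=1.0, c=0.05, dt=0.01, steps=5000):
    """Integrate a free chain of n masses linked by spring-damper pairs.

    Returns the total mechanical energy (kinetic + spring potential)
    recorded after every time step.
    """
    x = [0.0] * n  # displacements from equilibrium
    v = [0.0] * n  # velocities
    v[0] = 1.0     # assumed initial condition: impulse on the first mass

    energies = []
    for _ in range(steps):
        # Each link exerts a spring force plus a viscous damper force.
        f = [0.0] * n
        for i in range(n - 1):
            link = k * (x[i + 1] - x[i]) + c * (v[i + 1] - v[i])
            f[i] += link
            f[i + 1] -= link

        # Semi-implicit Euler: update velocities first, then positions.
        for i in range(n):
            v[i] += dt * f[i] / m
            x[i] += dt * v[i]

        kinetic = 0.5 * m * sum(vi * vi for vi in v)
        potential = 0.5 * k * sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1))
        energies.append(kinetic + potential)
    return energies


energies = simulate_chain()
print(f"energy after first step: {energies[0]:.4f}")
print(f"energy after last step:  {energies[-1]:.4f}")
```

Because the chain in this sketch has free ends, the total momentum is conserved and the energy cannot decay to zero; it falls toward the kinetic energy of the uniform translation of the whole chain. Fixing one end to a wall would be an equally reasonable assumption and would let the energy decay fully.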
-
Possible ~0.4 hour X-ray quasi-periodicity from an ultrasoft active galactic nucleus
Authors:
J. R. Song,
X. W. Shu,
L. M. Sun,
Y. Q. Xue,
C. Jin,
W. J. Zhang,
N. Jiang,
L. M. Dou,
T. G. Wang
Abstract:
RX J1301.9+2747 is an ultrasoft active galactic nucleus (AGN) with unusual X-ray variability that is characterized by a long quiescent state and a short-lived flare state. The X-ray flares are found to recur quasi-periodically on a timescale of 13-20 ks. Here, we report the analysis of the light curve in the quiescent state from two XMM observations spanning 18.5 years, along with the discovery of a possible quasi-periodic X-ray oscillation (QPO) with a period of ~1500s. The QPO is detected at the same frequency in the two independent observations, with a combined significance of >99.89%. The QPO is in agreement with the relation between frequency and black hole mass (M_BH) that has been reported in previous works for AGNs and Galactic black hole X-ray binaries (XRBs). The QPO frequency is stable over almost two decades, suggesting that it may correspond to the high-frequency type found in XRBs and originates, perhaps, from a certain disk resonance mode. In the 3:2 twin-frequency resonance model, our best estimate on the M_BH range implies that a maximal black hole spin can be ruled out. We find that all ultrasoft AGNs reported so far display quasi-periodicities in the X-ray emission, suggesting a possible link on the part of the extreme variability phenomenon to the ultrasoft X-ray component. This indicates that ultrasoft AGNs could be the most promising candidates in future searches for X-ray periodicities.
Submitted 23 November, 2020;
originally announced November 2020.
-
Compact Radio Emission from Nearby Galaxies with Mid-infrared Nuclear Outbursts
Authors:
B. B. Dai,
X. W. Shu,
N. Jiang,
L. M. Dou,
D. Z. Liu,
C. W. Yang,
F. B. Zhang,
T. G. Wang
Abstract:
We present 5.5 GHz observations with the VLA of a sample of nearby galaxies with energetic nuclear outbursts at mid-infrared (MIR) bands. These observations reach a uniform depth down to a median rms of ~10 uJy, representing one of the most sensitive searches for radio emission associated with nuclear transients. We detect radio emission in 12 out of 16 galaxies at a level of >5sigma, corresponding to a detection rate of 75%. Such a high detection rate is remarkably different from that of previous similar searches in stellar tidal disruption events. The radio emission is compact and not resolved for the majority of sources on scales of ~<0.5" (<0.9 kpc at z<0.1). We find that the possibility of star formation contributing to the radio emission is low, but an AGN origin remains a plausible scenario, especially for sources that show evidence of AGN activity in their optical spectra. Assuming the detections represent radio emission associated with a nuclear transient phenomenon such as a jet or outflow, we use the blast wave model, by analogy with GRB afterglows, to describe the evolution of the radio light curves. In this context, the observations are consistent with a decelerating jet with an energy of ~10^{51-52} erg viewed at 30°-60° off-axis at later times, suggesting that powerful jets may be ubiquitous among MIR-burst galaxies. Future continuous monitoring observations will be crucial to decipher the origin of the radio emission through detections of potential flux and spectral evolution. Our results highlight the importance of radio observations in constraining the nature of nuclear MIR outbursts in galaxies.
Submitted 3 June, 2020;
originally announced June 2020.
-
Deep Implicit Volume Compression
Authors:
Danhang Tang,
Saurabh Singh,
Philip A. Chou,
Christian Haene,
Mingsong Dou,
Sean Fanello,
Jonathan Taylor,
Philip Davidson,
Onur G. Guleryuz,
Yinda Zhang,
Shahram Izadi,
Andrea Tagliasacchi,
Sofien Bouaziz,
Cem Keskin
Abstract:
We describe a novel approach for compressing truncated signed distance fields (TSDF) stored in 3D voxel grids, and their corresponding textures. To compress the TSDF, our method relies on a block-based neural network architecture trained end-to-end, achieving state-of-the-art rate-distortion trade-off. To prevent topological errors, we losslessly compress the signs of the TSDF, which also upper bounds the reconstruction error by the voxel size. To compress the corresponding texture, we designed a fast block-based UV parameterization, generating coherent texture maps that can be effectively compressed using existing video compression algorithms. We demonstrate the performance of our algorithms on two 4D performance capture datasets, reducing bitrate by 66% for the same distortion, or alternatively reducing the distortion by 50% for the same bitrate, compared to the state-of-the-art.
Submitted 18 May, 2020;
originally announced May 2020.
-
Multi-wavelength observations of the BL Lac object Fermi J1544-0649: one year after its awakening
Authors:
P. H. T. Tam,
P. S. Pal,
Y. D. Cui,
N. Jiang,
Y. Sotnikova,
C. W. Yang,
L. Z. Wang,
B. T. Tang,
Y. B. Li,
J. Mao,
A. K. H. Kong,
Z. H. Zhong,
J. Ding,
T. Mufakharov,
J. F. Fan,
L. M. Dou,
R. F. Shen,
Y. L. Ai
Abstract:
We report observations of the transient source Fermi J1544-0649 from radio to gamma rays. Fermi J1544-0649 was discovered by the Fermi-LAT in May 2017. Follow-up Swift-XRT observations revealed three flaring episodes through March 2018, and the peak X-ray flux is about a factor of $10^3$ higher than the ROSAT all-sky survey (RASS) flux upper limit. Optical spectral measurements taken by the Magellan 6.5-m telescope and the Lick-Shane telescope both show a largely featureless spectrum, strengthening the BL Lac interpretation first proposed by \citet{Bruni18}. The optical and mid-infrared (MIR) emission goes to a higher state in 2018, when the flux at high energies goes down to a lower level. Our RATAN-600m measurements at 4.8 GHz and 8.2 GHz do not indicate any significant radio flux variation over the monitoring seasons in 2017 and 2018, nor do they deviate from the archival NVSS flux level. During GeV flaring times, the spectrum is very hard ($Γ_γ\sim$1.7) in the GeV band and at times also very hard ($Γ_{\rm X}\lesssim2$) in the X-rays, similar to a high-synchrotron-peak (or even an extreme) BL Lac object, making Fermi J1544-0649 a good target for ground-based Cherenkov telescopes.
Submitted 31 January, 2020;
originally announced January 2020.
-
An Improved multi-objective genetic algorithm based on orthogonal design and adaptive clustering pruning strategy
Authors:
Xinwu Yang,
Guizeng You,
Chong Zhao,
Mengfei Dou,
Xinian Guo
Abstract:
Two important characteristics of multi-objective evolutionary algorithms are distribution and convergency. As a classic multi-objective genetic algorithm, NSGA-II is widely used in multi-objective optimization. However, in NSGA-II, the random population initialization and the distance-based population maintenance strategy cannot maintain the distribution or convergency of the population well. To address these two deficiencies, this paper proposes an improved algorithm, OTNSGA-II, which performs better on distribution and convergency. The new algorithm adopts orthogonal experiment, which selects individuals by means of a new discontinuing non-dominated sorting and crowding distance, to produce the initial population. In addition, a new clustering-based pruning strategy is proposed that self-adaptively prunes individuals with similar features and poor performance in non-dominated sorting and crowding distance, or individuals that are far away from the Pareto front, according to the degree of intra-class aggregation of the clustering results. The new pruning strategy makes the population converge to the Pareto front more easily while maintaining the distribution of the population. OTNSGA-II and NSGA-II are compared on various types of test functions to verify the improvement of OTNSGA-II in terms of distribution and convergency.
Submitted 2 January, 2019;
originally announced January 2019.
-
A long decay of X-ray flux and spectral evolution in the supersoft active galactic nucleus GSN 069
Authors:
X. W. Shu,
S. S. Wang,
L. M. Dou,
N. Jiang,
J. X. Wang,
T. G. Wang
Abstract:
GSN 069 is an optically identified very low-mass AGN which shows supersoft X-ray emission. The source is known to have exhibited a huge X-ray outburst, with the flux increasing by more than a factor of ~240 compared to the quiescent state. We report its long-term evolution in X-ray flux and spectral variations over a time-scale of about a decade, using both new and archival X-ray observations from XMM and Swift. The new Swift observations detected the source at its lowest level of X-ray activity since the outburst, a factor of ~4 lower in the 0.2-2 keV flux than that obtained with the XMM observations nearly 8 years ago. Combining these with the historical X-ray measurements, we find that the X-ray flux is decreasing slowly. There appears to be spectral softening associated with the drop in X-ray flux. In addition, we find evidence for the presence of a weak, variable hard X-ray component, in addition to the dominant thermal blackbody emission reported before. The long decay of the X-ray flux and the spectral evolution, as well as the supersoft X-ray spectra, suggest that the source could be a tidal disruption event (TDE), though a highly variable AGN cannot be fully ruled out. Continued X-ray monitoring will be required to test the TDE interpretation by better determining the flux evolution in the decay phase.
Submitted 2 September, 2018;
originally announced September 2018.
-
Fluid sensitive nanoscale switching with quantum levitation controlled by $α$-Sn/$β$-Sn phase transition
Authors:
Mathias Boström,
Maofeng Dou,
Oleksandr I. Malyi,
Prachi Parashar,
Drew F. Parsons,
Iver Brevik,
Clas Persson
Abstract:
We analyse the Lifshitz pressure between silica and tin separated by a liquid mixture of bromobenzene and chlorobenzene. We show that the phase transition from semimetallic $α$-Sn to metallic $β$-Sn can switch Lifshitz forces from repulsive to attractive. This effect is caused by the difference in dielectric functions of $α$-Sn and $β$-Sn, giving both attractive and repulsive contributions to the total Lifshitz pressure at different frequency regions controlled by the composition of the intervening liquid mixture. In this way, one may be able to produce phase transition-controlled quantum levitation in liquid medium.
Submitted 5 March, 2018;
originally announced March 2018.
-
Second-order correlation function from asymmetric to symmetric transitions due to spectrally indistinguishable biexciton cascade emission
Authors:
X. F. Wu,
X. M. Dou,
K. Ding,
P. Y. Zhou,
H. Q. Ni,
Z. C. Niu,
H. J. Zhu,
D. S. Jiang,
C. L. Zhao,
B. Q. Sun
Abstract:
We report the observed photon bunching statistics of biexciton cascade emission at zero time delay in single quantum dots by second-order correlation function measurements under continuous wave excitation. It is found that the bunching phenomenon is independent of the biexciton binding energy when it varies from 0.59 meV to nearly zero. The photon bunching takes place when the exciton photon is not spectrally distinguishable from the biexciton photon, and either of them can trigger the start in a Hanbury-Brown and Twiss setup. However, if the exciton energy is spectrally distinguishable from that of the biexciton, the photon statistics become asymmetric and a cross-bunching lineshape is obtained. The theoretical calculations based on a three-level rate-equation model are consistent with the results of the second-order correlation function measurements.
Submitted 22 September, 2015;
originally announced September 2015.
-
Casimir attractive-repulsive transition in MEMS
Authors:
M. Boström,
S. Å. Ellingsen,
I. Brevik,
M. Dou,
C. Persson,
Bo E. Sernelius
Abstract:
Unwanted stiction in micro- and nanomechanical (NEMS/MEMS) systems due to dispersion (van der Waals, or Casimir) forces is a significant hurdle in the fabrication of systems with moving parts on these length scales. Introducing a suitably dielectric liquid in the interspace between bodies has previously been demonstrated to render dispersion forces repulsive, or even to switch sign as a function of separation. Making use of recently available permittivity data calculated by us we show that such a remarkable non-monotonic Casimir force, changing from attractive to repulsive as separation increases, can in fact be observed in systems where constituent materials are in standard NEMS/MEMS use requiring no special or exotic materials. No such nonmonotonic behaviour has been measured to date. We calculate the force between a silica sphere and a flat surface of either zinc oxide or hafnia, two materials which are among the most prominent for practical microelectrical and microoptical devices. Our results explicate the need for highly accurate permittivity functions of the materials involved for frequencies from optical to far-infrared frequencies. A careful analysis of the Casimir interaction is presented, and we show how the change in the sign of the interaction can be understood as a result of multiple crossings of the dielectric functions of the three media involved in a given set-up.
Submitted 26 September, 2012;
originally announced September 2012.
-
Enlarged Molecules from Excited Atoms in Nanochannels
Authors:
Mathias Boström,
Iver Brevik,
Bo E. Sernelius,
Maofeng Dou,
Clas Persson,
Barry W. Ninham
Abstract:
The resonance interaction that takes place in planar nanochannels between pairs of excited-state atoms is explored. We consider interactions in channels of silica, zinc oxide and gold. The nanosized channels induce a dramatically different interaction from that in free space. Illustrative calculations for two lithium and cesium atoms demonstrate that there is a short-range repulsion followed by a long-range attraction. The binding energy is strongest near the surfaces. The size of the enlarged molecule is biggest at the center of the cavity and increases with channel width. Since the interaction is generic, we predict that enlarged molecules are formed in porous structures, and that the molecule size depends on the size of the nanochannels.
Submitted 21 June, 2012;
originally announced June 2012.
-
Temperature dependence of electron-spin relaxation in a single InAs quantum dot at zero applied magnetic field
Authors:
X. M. Dou,
B. Q. Sun,
D. S. Jiang,
H. Q. Ni,
Z. C. Niu
Abstract:
The temperature-dependent electron spin relaxation of positively charged excitons in a single InAs quantum dot (QD) was measured by time-resolved photoluminescence spectroscopy at zero applied magnetic field. The experimental results show that the electron-spin relaxation is clearly divided into two different temperature regimes: (i) for T < 50 K, spin relaxation depends on the dynamical nuclear spin polarization (DNSP) and is approximately temperature-independent, as predicted by Merkulov et al.; (ii) for T above about 50 K, spin relaxation speeds up with increasing temperature. A model of a two-LO-phonon scattering process coupled with the hyperfine interaction is proposed to account for the accelerated electron spin relaxation at higher temperatures.
Submitted 5 January, 2012;
originally announced January 2012.