
Improving Distant 3D Object Detection Using 2D Box Supervision

Authors: Zetong Yang, Zhiding Yu, Chris Choy, Renhao Wang, Anima Anandkumar, Jose M. Alvarez

Abstract: Improving the detection of distant 3D objects is an important yet challenging task. For camera-based 3D perception, the annotation of 3D bounding boxes relies heavily on LiDAR for accurate depth information. As such, the distance of annotation is often limited due to the sparsity of LiDAR points on distant objects, which hampers the capability of existing detectors for long-range scenarios. We address this challenge by considering only 2D box supervision for distant objects, since such boxes are easy to annotate. We propose LR3D, a framework that learns to recover the missing depth of distant objects. LR3D adopts an implicit projection head to learn to generate a mapping between 2D boxes and depth using the 3D supervision on close objects. This mapping allows the depth estimation of distant objects conditioned on their 2D boxes, making long-range 3D detection with 2D supervision feasible. Experiments show that without distant 3D annotations, LR3D allows camera-based methods to detect distant objects (over 200m) with comparable accuracy to full 3D supervision. Our framework is general and could benefit a wide range of 3D detection methods.

Submitted 14 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2310.15083 [pdf]

Mass uptake during oxidation of metallic alloys: literature data collection, analysis, and FAIR sharing

Authors: Saswat Mishra, Sharmila Karumuri, Vincent Mika, Collin Scott, Chadwick Choy, Kenneth H. Sandhage, Ilias Bilionis, Michael S. Titus, Alejandro Strachan

Abstract: The area-normalized change of mass ($Δ$m/A) with time during the oxidation of metallic alloys is commonly used to assess oxidation resistance. Analyses of such data can also aid in evaluating underlying oxidation mechanisms. We performed an exhaustive literature search and digitized normalized mass change vs. time data for 407 alloys. To maximize the impact of these and future mass uptake data, we developed and published an open, online, computational workflow that fits the data to various models of oxidation kinetics, uses Bayesian statistics for model selection, and makes the raw data and model parameters available via a queryable database. The tool, Refractory Oxidation Database (https://nanohub.org/tools/refoxdb/), uses nanoHUB's Sim2Ls to make the workflow and data (including metadata) findable, accessible, interoperable, and reusable (FAIR). We find that the models selected by the original authors do not match the most likely one according to the Bayesian information criterion (BIC) in 71% of the cases. Further, in 56% of the cases, the published model was not even in the top 3 models according to the BIC. These numbers were obtained assuming an experimental noise of 2.5% of the mass gain range; a smaller noise leads to more discrepancies. The RefOxDB tool is open access and researchers can add their own raw data (those to be included in future publications, as well as negative results) for analysis and to share their work with the community. Such consistent and systematic analysis of open, community-generated data can significantly accelerate the development of machine-learning models for oxidation behavior and assist in the understanding and improvement of oxidation resistance.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2309.00583 [pdf, other]

Geometry-Informed Neural Operator for Large-Scale 3D PDEs

Authors: Zongyi Li, Nikola Borislavov Kovachki, Chris Choy, Boyi Li, Jean Kossaifi, Shourya Prakash Otta, Mohammad Amin Nabian, Maximilian Stadler, Christian Hundt, Kamyar Azizzadenesheli, Anima Anandkumar

Abstract: We propose the geometry-informed neural operator (GINO), a highly efficient approach to learning the solution operator of large-scale partial differential equations with varying geometries. GINO uses a signed distance function and point-cloud representations of the input shape and neural operators based on graph and Fourier architectures to learn the solution operator. The graph neural operator handles irregular grids and transforms them into and from regular latent grids on which the Fourier neural operator can be efficiently applied. GINO is discretization-convergent, meaning the trained model can be applied to arbitrary discretization of the continuous domain and it converges to the continuum operator as the discretization is refined. To empirically validate the performance of our method on large-scale simulation, we generate the industry-standard aerodynamics dataset of 3D vehicle geometries with Reynolds numbers as high as five million. For this large-scale 3D fluid simulation, computing surface pressure with numerical methods is expensive. We successfully trained GINO to predict the pressure on car surfaces using only five hundred data points. The cost-accuracy experiments show a $26,000 \times$ speed-up compared to optimized GPU-based computational fluid dynamics (CFD) simulators on computing the drag coefficient. When tested on new combinations of geometries and boundary conditions (inlet velocities), GINO obtains a one-fourth reduction in error rate compared to deep neural network approaches.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.04199 [pdf, ps, other]

A centennial reappraisal of Heisenberg's Quantum Mechanics with a perspective on Einstein's Quantum Riddle

Authors: Tuck C. Choy

Abstract: Heisenberg's breakthrough in his July 1925 paper that set in motion the development of Quantum Mechanics through subsequent papers by Born, Jordan, Heisenberg and also Dirac (from 1925 to 1927) is reexamined through a modern lens. In this paper, we shall discuss some new perspectives on (i) what could be the guiding intuitions for his discoveries and (ii) the origin of the Born-Jordan-Heisenberg canonical quantization rule. From this vantage point we may get an insight into Einstein's Quantum Riddle (Lande1974,Sommerfeld1918,Born1926) and a possible glimpse of what might come next after the last 100 years of Heisenberg's quantum mechanics.

Submitted 8 August, 2023; originally announced August 2023.

Comments: (This is the preprint of a paper dedicated to the celebration of 100 years of quantum mechanics, on the anniversary of Heisenberg's founding paper on the subject in July 1925, to be published in a celebratory volume in July 2025 by World Scientific Publications, Singapore)

arXiv:2306.13852 [pdf]

Gd-Based Solvated Shells for Defect Passivation of CsPbBr$_3$ Nanoplatelets Enabling Efficient Color-Saturated Blue Electroluminescence

Authors: Haoran Wang, Jingyu Qian, Jiayun Sun, Tong Su, Shiming Lei, Xiaoyu Zhang, Wallace C. H. Choy, Xiao Wei Sun, Kai Wang, Weiwei Zhao

Abstract: Reduced-dimensional CsPbBr$_3$ nanoplatelets (NPLs) are promising candidates for color-saturated blue emitters, yet their electroluminescence performance is hampered by non-radiative recombination, which is associated with bromine vacancies. Here, we show that a post-synthetic treatment of CsPbBr$_3$ NPLs with GdBr$_3$-dimethylformamide (DMF) can effectively eliminate defects while preserving the color. According to a combined experimental and theoretical study, Gd$^{3+}$ ions are less reactive with NPLs as a result of compact interaction between them and DMF, and this stable Gd$^{3+}$-DMF solvation structure makes Br$^-$ ions more available and allows them to move more freely. Consequently, defects are rapidly passivated and photoluminescence quantum yield increases dramatically (from 35 to ~100%), while the surface ligand density and emission color remain unchanged. The result is a remarkable electroluminescence efficiency of 2.4% (at 464 nm), one of the highest in pure blue perovskite NPL light-emitting diodes. It is noteworthy that the conductive NPL film shows a high photoluminescence quantum yield of 80%, demonstrating NPLs' significant electroluminescence potential with further device structure design.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2305.13220 [pdf, other]

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

Authors: Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar

Abstract: Indoor scene reconstruction from monocular images has long been sought after by augmented reality and robotics developers. Recent advances in neural field representations and monocular priors have led to remarkable results in scene-level surface reconstructions. The reliance on Multilayer Perceptrons (MLP), however, significantly limits speed in training and rendering. In this work, we propose to directly use signed distance function (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without MLPs. Our globally sparse and locally dense data structure exploits surfaces' spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data such as color and semantic labels. To apply this representation to monocular scene reconstruction, we develop a scale calibration algorithm for fast geometric initialization from monocular depth priors. We apply differentiable volume rendering from this initialization to refine details with fast convergence. We also introduce efficient high-dimensional Continuous Random Fields (CRFs) to further exploit the semantic-geometry consistency between scene objects. Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.

Submitted 22 May, 2023; originally announced May 2023.

Comments: CVPR 2023

arXiv:2302.12251 [pdf, other]

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Authors: Yiming Li, Zhiding Yu, Christopher Choy, Chaowei Xiao, Jose M. Alvarez, Sanja Fidler, Chen Feng, Anima Anandkumar

Abstract: Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This appealing ability is vital for recognition and understanding. To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images. Our framework adopts a two-stage design where we start from a sparse set of visible and occupied voxel queries from depth estimation, followed by a densification stage that generates dense 3D voxels from the sparse ones. A key idea of this design is that the visual features on 2D images correspond only to the visible scene structures rather than the occluded or empty spaces. Therefore, starting with the featurization and prediction of the visible structures is more reliable. Once we obtain the set of sparse queries, we apply a masked autoencoder design to propagate the information to all the voxels by self-attention. Experiments on SemanticKITTI show that VoxFormer outperforms the state of the art with a relative improvement of 20.0% in geometry and 18.1% in semantics and reduces GPU memory during training to less than 16GB. Our code is available on https://github.com/NVlabs/VoxFormer.

Submitted 25 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: CVPR 2023 Highlight (10% of accepted papers, 2.5% of submissions)

arXiv:2208.11537 [pdf, other]

PeRFception: Perception using Radiance Fields

Authors: Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

Abstract: The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner. This new representation can effectively convey the information of hundreds of high-resolution images in one compact format and allows photorealistic synthesis of novel views. In this work, using the variant of NeRF called Plenoxels, we create the first large-scale implicit representation datasets for perception tasks, called PeRFception, which consists of two parts that incorporate both object-centric and scene-centric scans for classification and segmentation. It shows a significant memory compression rate (96.4%) from the original dataset, while containing both 2D and 3D information in a unified form. We construct classification and segmentation models that directly take this implicit format as input and also propose a novel augmentation technique to avoid overfitting on backgrounds of images. The code and data are publicly available at https://postech-cvlab.github.io/PeRFception.

Submitted 24 August, 2022; originally announced August 2022.

Comments: Project Page: https://postech-cvlab.github.io/PeRFception/

arXiv:2206.08077 [pdf, other]

Neural Scene Representation for Locomotion on Structured Terrain

Authors: David Hoeller, Nikita Rudin, Christopher Choy, Animashree Anandkumar, Marco Hutter

Abstract: We propose a learning-based method to reconstruct the local terrain for locomotion with a mobile robot traversing urban environments. Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the algorithm estimates the topography in the robot's vicinity. The raw measurements from these cameras are noisy and only provide partial and occluded observations that in many cases do not show the terrain the robot stands on. Therefore, we propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement. The model consists of a 4D fully convolutional network on point clouds that learns the geometric priors to complete the scene from the context and an auto-regressive feedback to leverage spatio-temporal consistency and use evidence from the past. The network can be solely trained with synthetic data, and due to extensive augmentation, it is robust in the real world, as shown in the validation on a quadrupedal robot, ANYmal, traversing challenging settings. We run the pipeline on the robot's onboard low-power computer using an efficient sparse tensor implementation and show that the proposed method outperforms classical map representations.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2203.06856 [pdf, other]

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Authors: Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

Abstract: Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable shapes in realistic scenes and a benchmark containing over 17,000 action trajectories with six types of plush toys and 78 variants. Our model achieves the best performance in geometry, correspondence, and dynamics predictions over existing approaches. The ACID dynamics models are successfully applied to goal-conditioned deformable manipulation tasks, resulting in a 30% increase in task success rate over the strongest baseline. Furthermore, we apply the simulation-trained ACID model directly to real-world objects and show success in manipulating them into target configurations. For more results and information, please visit https://b0ku1.github.io/acid/.

Submitted 5 August, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: RSS 2022 Best Student Paper Award Finalist. Please check out more details at https://b0ku1.github.io/acid/

Journal ref: Robotics: Science and Systems (RSS), 2022

arXiv:2112.01316 [pdf, other]

Putting 3D Spatially Sparse Networks on a Diet

Authors: Junha Lee, Christopher Choy, Jaesik Park

Abstract: 3D neural networks have become prevalent for many 3D vision tasks including object detection, segmentation, registration, and various perception tasks for 3D inputs. However, due to the sparsity and irregularity of 3D data, custom 3D operators or network designs have been the primary focus of research, while the size of networks or efficacy of parameters has been overlooked. In this work, we perform the first comprehensive study on the weight sparsity of spatially sparse 3D convolutional networks and propose a compact weight-sparse and spatially sparse 3D convnet (WS^3-Convnet) for semantic and instance segmentation on the real-world indoor and outdoor datasets. We employ various network pruning strategies to find compact networks and show our WS^3-Convnet achieves minimal loss in performance (2.15% drop) with orders-of-magnitude smaller number of parameters (99% compression rate) and computational cost (95% reduction). Finally, we systematically analyze the compression patterns of WS^3-Convnet and show interesting emerging sparsity patterns common in our compressed networks to further speed up inference (45% faster). Keywords: Efficient network architecture, Network pruning, 3D scene segmentation, Spatially sparse convolution

Submitted 8 April, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

arXiv:2108.13826 [pdf, other]

Self-Calibrating Neural Radiance Fields

Authors: Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

Abstract: In this work, we propose a camera self-calibration algorithm for generic cameras with arbitrary non-linear distortions. We jointly learn the geometry of the scene and the accurate camera parameters without any calibration objects. Our camera model consists of a pinhole model, a fourth order radial distortion, and a generic noise model that can learn arbitrary non-linear camera distortions. While traditional self-calibration algorithms mostly rely on geometric constraints, we additionally incorporate photometric consistency. This requires learning the geometry of the scene, and we use Neural Radiance Fields (NeRF). We also propose a new geometric loss function, viz., projected ray distance loss, to incorporate geometric consistency for complex non-linear camera models. We validate our approach on standard real image datasets and demonstrate that our model can learn the camera intrinsics and extrinsics (pose) from scratch without COLMAP initialization. Also, we show that learning accurate camera models in a differentiable manner allows us to improve PSNR over baselines. Our module is an easy-to-use plugin that can be applied to NeRF variants to improve performance. The code and data are currently available at https://github.com/POSTECH-CVLab/SCNeRF.

Submitted 2 September, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

Comments: Accepted in ICCV21, Project Page: https://postech-cvlab.github.io/SCNeRF/

arXiv:2105.06464 [pdf, other]

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Authors: Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar

Abstract: We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense contrastive learning. We show a symbiotic relationship where the two tasks mutually benefit from each other. Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods, and is competitive with supervised methods. We also obtain state-of-the-art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.

Submitted 5 June, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

Comments: Tech Report

arXiv:2007.12025 [pdf, ps, other]

doi 10.13140/RG.2.2.16190.31041

On Dirac Quantisation rules and the trace anomaly

Authors: Tuck C Choy

Abstract: In this article I shall clarify various aspects of the Dirac quantisation rules of 1930 \cite{Dirac}, namely (i) the choice of antisymmetric Poisson brackets, (ii) the first quantisation Rule 1, (iii) the second quantisation Rule 2, and their relations to the trace anomaly. In fact in 1925 Dirac already had a preliminary formulation of these rules \cite{Dirac3}. Using them, he had independently rediscovered the Born-Jordan quantisation rule \cite{BornJordan1925} and called it the quantum condition. This is the best known and undoubtedly most significant of the canonical quantisation rules of quantum mechanics. We shall discuss several violations of the Poisson-Lie algebra (assumed by Dirac), starting from antisymmetry, which is the first criterion for defining a Lie algebra. Similar violations also occur for Leibniz's rule and the Jacobi identity, the latter we shall also prove for all our quantum Poisson brackets. That none of these violations jeopardised Dirac's ingenious original derivation \cite{Dirac} of his first quantisation Rule 1 is quite remarkable. This is because the violations are all of higher orders in $\hbar$. We shall further show that (ii) does not automatically lead to a trace anomaly for certain bounded integrable operators. Several issues that are both pedagogical and foundational arising from this study show that quantum mechanics is still not a finished product. I shall briefly mention some attempts and options to complete its development.

Submitted 17 January, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: 8 pages no figures

arXiv:2006.12356 [pdf, other]

Generative Sparse Detection Networks for 3D Single-shot Object Detection

Authors: JunYoung Gwak, Christopher Choy, Silvio Savarese

Abstract: 3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality. Yet, the sparse nature of the 3D data poses unique challenges to this task. Most notably, the observable surface of the 3D point clouds is disjoint from the center of the instance to ground the bounding box prediction on. To this end, we propose Generative Sparse Detection Network (GSDN), a fully-convolutional single-shot sparse detection network that efficiently generates the support for object proposals. The key component of our model is a generative sparse tensor decoder, which uses a series of transposed convolutions and pruning layers to expand the support of sparse tensors while discarding unlikely object centers to maintain minimal runtime and memory footprint. GSDN can process unprecedentedly large-scale inputs with a single fully-convolutional feed-forward pass, thus does not require the heuristic post-processing stage that stitches results from sliding windows as other previous methods have. We validate our approach on three 3D indoor datasets including the large-scale 3D indoor reconstruction dataset where our method outperforms the state-of-the-art methods by a relative improvement of 7.14% while being 3.78 times faster than the best prior work.

Submitted 22 June, 2020; originally announced June 2020.

arXiv:2005.08144 [pdf, other]

High-dimensional Convolutional Networks for Geometric Pattern Recognition

Authors: Christopher Choy, Junha Lee, Rene Ranftl, Jaesik Park, Vladlen Koltun

Abstract: Many problems in science and engineering can be formulated in terms of geometric patterns in high-dimensional spaces. We present high-dimensional convolutional networks (ConvNets) for pattern recognition problems that arise in the context of geometric registration. We first study the effectiveness of convolutional networks in detecting linear subspaces in high-dimensional spaces with up to 32 dimensions: much higher dimensionality than prior applications of ConvNets. We then apply high-dimensional ConvNets to 3D registration under rigid motions and image correspondence estimation. Experiments indicate that our high-dimensional ConvNets outperform prior approaches that relied on deep networks based on global pooling operators.

Submitted 16 May, 2020; originally announced May 2020.

Comments: Accepted for CVPR 2020 oral presentation

arXiv:2004.11540 [pdf, other]

Deep Global Registration

Authors: Christopher Choy, Wei Dong, Vladlen Koltun

Abstract: We present Deep Global Registration, a differentiable framework for pairwise registration of real-world 3D scans. Deep global registration is based on three modules: a 6-dimensional convolutional network for correspondence confidence prediction, a differentiable Weighted Procrustes algorithm for closed-form pose estimation, and a robust gradient-based SE(3) optimizer for pose refinement. Experiments demonstrate that our approach outperforms state-of-the-art methods, both learning-based and classical, on real-world data.

Submitted 8 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: Accepted for CVPR'20 oral presentation

arXiv:2003.12622 [pdf, other]

SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans

Authors: Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner

Abstract: We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene, explicitly modeling inter-relationships between objects-to-objects and objects-to-layout. Since object arrangement and scene layout are intrinsically coupled, we show that treating the problem jointly significantly helps to produce globally-consistent representations of a scene. Object CAD models are aligned to the scene by establishing dense correspondences between geometry, and we introduce a hierarchical layout prediction approach to estimate layout planes from corners and edges of the scene. To this end, we propose a message-passing graph neural network to model the inter-relationships between objects and layout, guiding generation of a globally consistent object alignment in a scene. By considering the global scene layout, we achieve significantly improved CAD alignments compared to state-of-the-art methods, improving from 41.83% to 58.41% alignment accuracy on SUNCG and from 50.05% to 61.24% on ScanNet, respectively. The resulting CAD-based representations make our method well-suited for applications in content creation such as augmented or virtual reality.

Submitted 27 March, 2020; originally announced March 2020.

Comments: Video here https://youtu.be/F0DpggYByh0

arXiv:1904.08755 [pdf, other]

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

Authors: Christopher Choy, JunYoung Gwak, Silvio Savarese

Abstract: In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos using high-dimensional convolutions. For this, we adopt sparse tensors and propose the generalized sparse convolution that encompasses all discrete convolutions. To implement the generalized sparse convolution, we create an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks. We create 4D spatio-temporal convolutional neural networks using the library and validate them on various 3D semantic segmentation benchmarks and proposed 4D datasets for 3D-video perception. To overcome challenges in the 4D space, we propose the hybrid kernel, a special case of the generalized sparse convolution, and the trilateral-stationary conditional random field that enforces spatio-temporal consistency in the 7D space-time-chroma space. Experimentally, we show that convolutional neural networks with only generalized 3D sparse convolutions can outperform 2D or 2D-3D hybrid methods by a large margin. Also, we show that on 3D-videos, 4D spatio-temporal convolutional neural networks are robust to noise, outperform 3D convolutional neural networks and are faster than the 3D counterpart in some cases.

Submitted 13 June, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

Comments: CVPR'19

arXiv:1901.01141 [pdf]

BOTDA Fiber Sensor System Based on FPGA Accelerated Support Vector Regression

Authors: Huan Wu, Hongda Wang, Chiu-Sing Choy, Chester Shu, Chao Lu

Abstract: Brillouin optical time domain analyzer (BOTDA) fiber sensors have shown strong capability in static long haul distributed temperature/strain sensing. However, in applications such as structural health monitoring and leakage detection, real-time measurement is quite necessary. The measurement time of temperature/strain in a BOTDA system includes data acquisition time and post-processing time. In this work, we propose to use hardware accelerated support vector regression (SVR) for the post-processing of the collected BOTDA data. Ideal Lorentzian curves under different temperatures with different linewidths are used to train the SVR model to determine the linear SVR decision function. The performance of SVR is evaluated under different signal-to-noise ratios (SNRs) experimentally. After the model coefficients are determined, algorithm-specific hardware accelerators based on field programmable gate arrays (FPGAs) are used to realize SVR decision function. During the implementation, hardware optimization techniques based on loop dependence analysis and batch processing are proposed to reduce the execution latency. Our FPGA implementations can achieve up to 42x speedup compared with software implementation on an i7-5960x computer. The post-processing time for 96,100 BGSs along 38.44-km FUT is only 0.46 seconds with FPGA board ZCU104, making the post-processing time no longer a limiting factor for dynamic sensing. Moreover, the energy efficiency of our FPGA implementation can reach up to 226.1x higher than software implementation based on CPU.

Submitted 28 December, 2018; originally announced January 2019.

Comments: 8 pages
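
The two ingredients named in the abstract above, ideal Lorentzian curves and a linear SVR decision function, can be sketched as follows. This is only an illustration under stated assumptions: the function names `lorentzian_bgs` and `linear_svr_predict`, the frequency values, and the weights are hypothetical and not taken from the paper, and no training step is shown.

```python
def lorentzian_bgs(freqs, center, linewidth, peak=1.0):
    """Ideal Lorentzian gain spectrum sampled at `freqs`.

    `linewidth` is the full width at half maximum, in the same
    units as `freqs` and `center` (assumed here to be MHz).
    """
    half_width = linewidth / 2.0
    return [peak / (1.0 + ((f - center) / half_width) ** 2) for f in freqs]


def linear_svr_predict(weights, bias, features):
    """Evaluate a linear decision function f(x) = w . x + b.

    The weights and bias would come from a trained model;
    here they are hypothetical placeholders.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


# Illustrative values only: a spectrum centered at 100 with linewidth 30.
spectrum = lorentzian_bgs([100.0, 115.0], center=100.0, linewidth=30.0)
prediction = linear_svr_predict([0.5, -1.0], 2.0, [4.0, 1.0])
```

Because the decision function is a single dot product plus a bias, it maps naturally onto parallel multiply-accumulate hardware, which is consistent with the FPGA acceleration the abstract describes.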

arXiv:1803.08495 [pdf, other]

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings

Authors: Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese

Abstract: We present a method for generating colored 3D shapes from natural language. To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes. Our model combines and extends learning by association and metric learning approaches to learn implicit cross-modal connections, and produces a joint representation that captures the many-to-many relations between language and physical properties of 3D shapes such as color and shape. To evaluate our approach, we collect a large dataset of natural language descriptions for physical 3D objects in the ShapeNet dataset. With this learned joint embedding we demonstrate text-to-shape retrieval that outperforms baseline approaches. Using our embeddings with a novel conditional Wasserstein GAN framework, we generate colored 3D shapes from text. Our method is the first to connect natural language text with realistic 3D objects exhibiting rich variations in color, texture, and shape detail. See video at https://youtu.be/zraPvRdl13Q

Submitted 22 March, 2018; originally announced March 2018.

arXiv:1801.02941 [pdf]

doi 10.1002/aenm.201701586

Quantifying Efficiency Loss of Perovskite Solar Cells by a Modified Detailed Balance Model

Authors: Wei E. I. Sha, Hong Zhang, Zi Shuai Wang, Hugh L. Zhu, Xingang Ren, Francis Lin, Alex K. -Y. Jen, Wallace C. H. Choy

Abstract: A modified detailed balance model is built to understand and quantify efficiency loss of perovskite solar cells. The modified model captures the light-absorption dependent short-circuit current, contact and transport-layer modified carrier transport, as well as recombination and photon-recycling influenced open-circuit voltage. Our theoretical and experimental results show that for experimentally optimized perovskite solar cells with the power conversion efficiency of 19%, optical loss of 25%, non-radiative recombination loss of 35%, and ohmic loss of 35% are the three dominant loss factors for approaching the 31% efficiency limit of perovskite solar cells. We also find that the optical loss will climb up to 40% for a thin-active-layer design. Moreover, a misconfigured transport layer will introduce above 15% of energy loss. Finally, the perovskite-interface induced surface recombination, ohmic loss, and current leakage should be further reduced to upgrade device efficiency and eliminate hysteresis effect. The work contributes to fundamental understanding of device physics of perovskite solar cells. The developed model offers a systematic design and analysis tool to photovoltaic science and technology.

Submitted 9 January, 2018; originally announced January 2018.

Comments: 21 pages, 9 figures, 3 tables

Journal ref: Advanced Energy Materials, 2018

arXiv:1710.07563 [pdf, other]

SEGCloud: Semantic Segmentation of 3D Point Clouds

Authors: Lyne P. Tchapmi, Christopher B. Choy, Iro Armeni, JunYoung Gwak, Silvio Savarese

Abstract: 3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works leverage the capabilities of Neural Networks (NNs), but are limited to coarse voxel predictions and do not explicitly enforce global consistency. We present SEGCloud, an end-to-end framework to obtain 3D point-level segmentation that combines the advantages of NNs, trilinear interpolation (TI) and fully connected Conditional Random Fields (FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are transferred back to the raw 3D points via trilinear interpolation. Then the FC-CRF enforces global consistency and provides fine-grained semantics on the points. We implement the latter as a differentiable Recurrent NN to allow joint optimization. We evaluate the framework on two indoor and two outdoor 3D datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance comparable or superior to the state-of-the-art on all datasets.

Submitted 20 October, 2017; originally announced October 2017.

Comments: Accepted as a spotlight at the International Conference of 3D Vision (3DV 2017)
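
The trilinear interpolation step mentioned in the SEGCloud abstract, which transfers coarse voxel predictions back to raw 3D points, can be illustrated with a minimal sketch. The grid layout (`grid[x][y][z]` holding one scalar score per integer grid position, unit spacing) and the function name `trilinear_interpolate` are assumptions made for illustration, not details taken from the paper.

```python
import math


def trilinear_interpolate(grid, point):
    """Interpolate a scalar at `point` from the 8 surrounding grid values.

    `grid[x][y][z]` stores values at integer coordinates (assumed layout).
    Points are expected to lie inside the grid; the base index is clamped
    so that the cell (base, base + 1) always exists.
    """
    x, y, z = point
    x0 = min(max(int(math.floor(x)), 0), len(grid) - 2)
    y0 = min(max(int(math.floor(y)), 0), len(grid[0]) - 2)
    z0 = min(max(int(math.floor(z)), 0), len(grid[0][0]) - 2)
    tx, ty, tz = x - x0, y - y0, z - z0

    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Weight of each corner is the product of per-axis linear weights.
                weight = (
                    (tx if dx else 1.0 - tx)
                    * (ty if dy else 1.0 - ty)
                    * (tz if dz else 1.0 - tz)
                )
                value += weight * grid[x0 + dx][y0 + dy][z0 + dz]
    return value


# A 2x2x2 grid sampled from the linear function f(x, y, z) = x + 2y + 3z.
grid = [[[x + 2 * y + 3 * z for z in range(2)] for y in range(2)] for x in range(2)]
score = trilinear_interpolate(grid, (0.5, 0.25, 0.75))
```

In a segmentation setting the same weights would be applied to a vector of per-class scores at each grid position rather than to a single scalar.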

arXiv:1708.04672 [pdf, other]

DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image

Authors: Andrey Kurenkov, Jingwei Ji, Animesh Garg, Viraj Mehta, JunYoung Gwak, Christopher Choy, Silvio Savarese

Abstract: 3D reconstruction from a single image is a key problem in multiple applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem through generative models which predict 3D reconstructions as voxels or point clouds. However, these methods can be computationally expensive and miss fine details. We introduce a new differentiable layer for 3D data deformation and use it in DeformNet to learn a model for 3D reconstruction-through-deformation. DeformNet takes an image input, searches the nearest shape template from a database, and deforms the template to match the query image. We evaluate our approach on the ShapeNet dataset and show that - (a) the Free-Form Deformation layer is a powerful new building block for Deep Learning models that manipulate 3D data (b) DeformNet uses this FFD layer combined with shape retrieval for smooth and detail-preserving 3D reconstruction of qualitatively plausible point clouds with respect to a single query image (c) compared to other state-of-the-art 3D reconstruction methods, DeformNet quantitatively matches or outperforms their benchmarks by significant margins. For more information, visit: https://deformnet-site.github.io/DeformNet-website/ .

Submitted 10 August, 2017; originally announced August 2017.

Comments: 11 pages, 9 figures, NIPS

arXiv:1705.10904 [pdf, other]

Weakly supervised 3D Reconstruction with Adversarial Constraint

Authors: JunYoung Gwak, Christopher B. Choy, Animesh Garg, Manmohan Chandraker, Silvio Savarese

Abstract: Supervised 3D reconstruction has witnessed a significant progress through the use of deep neural networks. However, this increase in performance requires large scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative for expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enables perspective projection and backpropagation. Additionally, since the 3D reconstruction from masks is an ill posed problem, we propose to constrain the 3D reconstruction to the manifold of unlabeled realistic 3D shapes that match mask observations. We demonstrate that learning a log-barrier solution to this constrained optimization problem resembles the GAN objective, enabling the use of existing tools for training GANs. We evaluate and analyze the manifold constrained reconstruction on various datasets for single and multi-view reconstruction of both synthetic and real images.

Submitted 4 October, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

arXiv:1704.04394 [pdf, other]

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

Authors: Namhoon Lee, Wongun Choi, Paul Vernaza, Christopher B. Choy, Philip H. S. Torr, Manmohan Chandraker

Abstract: We introduce a Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes. DESIRE effectively predicts future locations of objects in multiple scenes by 1) accounting for the multi-modal nature of the future prediction (i.e., given the same context, future may vary), 2) foreseeing the potential future outcomes and making a strategic prediction based on that, and 3) reasoning not only from the past motion history, but also from the scene context as well as the interactions among the agents. DESIRE achieves these in a single end-to-end trainable neural network model, while being computationally efficient. The model first obtains a diverse set of hypothetical future prediction samples employing a conditional variational autoencoder, which are ranked and refined by the following RNN scoring-regression module. Samples are scored by accounting for accumulated future rewards, which enables better long-term strategic decisions similar to IOC frameworks. An RNN scene context fusion module jointly captures past motion histories, the semantic scene context and interactions among multiple agents. A feedback mechanism iterates over the ranking and refinement to further boost the prediction accuracy. We evaluate our model on two publicly available datasets: KITTI and Stanford Drone Dataset. Our experiments show that the proposed model significantly improves the prediction accuracy compared to other baseline methods.

Submitted 14 April, 2017; originally announced April 2017.

Comments: Accepted at CVPR 2017

arXiv:1703.07576 [pdf]

doi 10.1021/acsphotonics.6b01043

Exploring the Way to Approach the Efficiency Limit of Perovskite Solar Cells by Drift-Diffusion Model

Authors: Xingang Ren, Zishuai Wang, Wei E. I. Sha, Wallace C. H. Choy

Abstract: Drift-diffusion model is an indispensable modeling tool to understand the carrier dynamics (transport, recombination, and collection) and simulate practical-efficiency of solar cells (SCs) through taking into account various carrier recombination losses existing in multilayered device structures. Exploring the way to predict and approach the SC efficiency limit by using the drift-diffusion model will enable us to gain more physical insights and design guidelines for emerging photovoltaics, particularly perovskite solar cells. Our work finds out that two procedures are the prerequisites for predicting and approaching the SC efficiency limit. Firstly, the intrinsic radiative recombination needs to be corrected after adopting optical designs which will significantly affect the open-circuit voltage at its Shockley-Queisser limit. Through considering a detailed balance between emission and absorption of semiconductor materials at the thermal equilibrium, and the Boltzmann statistics at the non-equilibrium, we offer a different approach to derive the accurate expression of intrinsic radiative recombination with the optical corrections for semiconductor materials. The new expression captures light trapping of the absorbed photons and angular restriction of the emitted photons simultaneously, which are ignored in the traditional Roosbroeck-Shockley expression. Secondly, the contact characteristics of the electrodes need to be carefully engineered to eliminate the charge accumulation and surface recombination at the electrodes. The selective contact or blocking layer incorporated nonselective contact that inhibits the surface recombination at the electrode is another important prerequisite. With the two procedures, the accurate prediction of efficiency limit and precise evaluation of efficiency degradation for perovskite solar cells are attainable by the drift-diffusion model.

Submitted 19 April, 2017; v1 submitted 22 March, 2017; originally announced March 2017.

Comments: 32 pages, 11 figures

Journal ref: ACS Photonics, 2017, 4(4), 934-942

arXiv:1701.02426 [pdf, other]

Scene Graph Generation by Iterative Message Passing

Authors: Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei

Abstract: Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image. We propose a novel end-to-end model that generates such structured scene representation from an input image. The model solves the scene graph inference problem using standard RNNs and learns to iteratively improve its predictions via message passing. Our joint inference model can take advantage of contextual cues to make better predictions on objects and their relationships. The experiments show that our model significantly outperforms previous methods for generating scene graphs using Visual Genome dataset and inferring support relations with NYU Depth v2 dataset.

Submitted 12 April, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

Comments: CVPR 2017

arXiv:1612.01083 [pdf, ps, other]

doi 10.1063/1.4970958

Exciton delocalization incorporated drift-diffusion model for bulk-heterojunction organic solar cells

Authors: Zi Shuai Wang, Wei E. I. Sha, Wallace C. H. Choy

Abstract: Modeling the charge-generation process is highly important to understand device physics and optimize power conversion efficiency of bulk-heterojunction (BHJ) organic solar cells (OSCs). Free carriers are generated by both ultrafast exciton delocalization and slow exciton diffusion and dissociation at the heterojunction interface. In this work, we developed a systematic numerical simulation to describe the charge-generation process by a modified drift-diffusion model. The transport, recombination, and collection of free carriers are incorporated to fully capture the device response. The theoretical results match well with the state-of-the-art high-performance organic solar cells. It is demonstrated that the increase of exciton delocalization ratio reduces the energy loss in the exciton diffusion-dissociation process, and thus, significantly improves the device efficiency especially for the short-circuit current. By changing the exciton delocalization ratio, OSC performances are comprehensively investigated under the conditions of short-circuit and open-circuit. Particularly, bulk recombination dependent fill factor saturation is unveiled and understood. As a fundamental electrical analysis of the delocalization mechanism, our work is important to understand and optimize the high-performance OSCs.

Submitted 4 December, 2016; originally announced December 2016.

Comments: 10 pages (7 pages for main paper and 3 pages for supporting information), 8 figures (7 figures in main paper, 1 figure in supporting information)

Journal ref: Journal of Applied Physics, 120, 213101 (2016)

arXiv:1608.08208 [pdf, ps, other]

doi 10.1109/TAP.2016.2600758

Polarization Control by Using Anisotropic 3D Chiral Structures

Authors: Menglin L. N. Chen, Li Jun Jiang, Wei E. I. Sha, Wallace C. H. Choy, Tatsuo Itoh

Abstract: Due to the mirror symmetry breaking, chiral structures show fantastic electromagnetic (EM) properties involving negative refraction, giant optical activity, and asymmetric transmission. Aligned electric and magnetic dipoles excited in chiral structures contribute to extraordinary properties. However, the chiral structures that exhibit n-fold rotational symmetry show limited tuning capability. In this paper, we proposed a compact, light, and highly tunable anisotropic chiral structure to overcome this limitation and realize a linear-to-circular polarization conversion. The anisotropy is due to simultaneous excitations of two different pairs of aligned electric and magnetic dipoles. The 3D omega-like structure, etched on two sides of one PCB board and connected by metallic vias, achieves 60% of linear-to-circular conversion (transmission) efficiency at the operating frequency of 9.2 GHz. The desired 90-degree phase shift between the two orthogonal linear polarization components is not only from the finite-thickness dielectric substrate but also from the anisotropic chiral response slightly off the resonance. The work enables elegant and practical polarization control of EM waves.

Submitted 29 August, 2016; originally announced August 2016.

Comments: 7 pages, 10 figures

Journal ref: IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2016
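
The role of the 90-degree phase shift between the two orthogonal linear components, as described in the abstract above, can be checked with a small Jones-vector sketch. This is textbook polarization algebra rather than the paper's model: the function name `circular_power_fractions` is hypothetical, and the two circular basis states are labelled neutrally because the left/right naming depends on sign conventions.

```python
import cmath
import math


def circular_power_fractions(ex, ey):
    """Fraction of power in each circular basis state of a Jones vector.

    `ex` and `ey` are complex amplitudes of the two orthogonal linear
    components. Returns (plus_fraction, minus_fraction), which sum to 1.
    """
    c_plus = (ex - 1j * ey) / math.sqrt(2)
    c_minus = (ex + 1j * ey) / math.sqrt(2)
    total = abs(ex) ** 2 + abs(ey) ** 2
    return abs(c_plus) ** 2 / total, abs(c_minus) ** 2 / total


# Equal amplitudes with a 90-degree phase shift: purely circular.
quarter_wave_phase = cmath.exp(1j * math.pi / 2)
circular = circular_power_fractions(1.0, quarter_wave_phase)

# Equal amplitudes with no phase shift: linear, split evenly between states.
linear = circular_power_fractions(1.0, 1.0)
```

With equal amplitudes, any phase shift other than 90 degrees leaves power in both circular states, which is why the abstract singles out that value.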

arXiv:1606.03558 [pdf, other]

Universal Correspondence Network

Authors: Christopher B. Choy, JunYoung Gwak, Silvio Savarese, Manmohan Chandraker

Abstract: We present a deep learning framework for accurate visual correspondences and demonstrate its effectiveness for both geometric and semantic matching, spanning across rigid motions to intra-class shape or appearance variations. In contrast to previous CNN-based approaches that optimize a surrogate patch similarity objective, we use deep metric learning to directly learn a feature space that preserves either geometric or semantic similarity. Our fully convolutional architecture, along with a novel correspondence contrastive loss allows faster training by effective reuse of computations, accurate gradient computation through the use of thousands of examples per image pair and faster testing with $O(n)$ feed forward passes for $n$ keypoints, instead of $O(n^2)$ for typical patch similarity methods. We propose a convolutional spatial transformer to mimic patch normalization in traditional features like SIFT, which is shown to dramatically boost accuracy for semantic correspondences across intra-class shape variations. Extensive experiments on KITTI, PASCAL, and CUB-2011 datasets demonstrate the significant advantages of our features over prior works that use either hand-constructed or learned features.

Submitted 31 October, 2016; v1 submitted 11 June, 2016; originally announced June 2016.

Comments: To appear at NIPS 2016 as full oral presentation
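
The abstract above names a correspondence contrastive loss but does not give its formula. As a rough illustration of the general idea, the sketch below uses the standard pairwise contrastive loss from metric learning: matching feature pairs are pulled together, and non-matching pairs are pushed apart until they exceed a margin. The function name `contrastive_loss`, the margin value, and the 0.5 scaling are assumptions, not the paper's exact definition.

```python
import math


def contrastive_loss(feat_a, feat_b, is_match, margin=1.0):
    """Standard contrastive loss for one pair of feature vectors.

    Matching pairs are penalized by their squared distance; non-matching
    pairs are penalized only while closer than `margin`.
    """
    distance = math.dist(feat_a, feat_b)
    if is_match:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def mean_contrastive_loss(pairs, margin=1.0):
    """Average the loss over (feat_a, feat_b, is_match) tuples."""
    losses = [contrastive_loss(a, b, m, margin) for a, b, m in pairs]
    return sum(losses) / len(losses)


# Hypothetical 2D features for three sampled point pairs.
pairs = [
    ((0.0, 0.0), (3.0, 4.0), True),    # match that is far apart: penalized
    ((0.0, 0.0), (3.0, 4.0), False),   # non-match beyond the margin: no penalty
    ((0.0, 0.0), (0.5, 0.0), False),   # non-match inside the margin: penalized
]
batch_loss = mean_contrastive_loss(pairs)
```

In a fully convolutional setting the features for every sampled point come from one forward pass per image, which fits the abstract's point about reusing computation across thousands of pairs.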

arXiv:1605.03268 [pdf, ps, other]

doi 10.1103/PhysRevA.94.053825

Strongly Enhanced and Directionally Tunable Second-Harmonic Radiation by a Plasmonic Particle-in-Cavity Nanoantenna

Authors: Xiaoyan Y. Z. Xiong, Li Jun Jiang, Wei E. I. Sha, Yat Hei Lo, Ming Fang, Weng Cho Chew, Wallace C. H. Choy

Abstract: Second-harmonic (SH) generation is tremendously important for nonlinear sensing, microscopy and communication system. One of the great challenges of current designs is to enhance the SH signal and simultaneously tune its radiation direction with a high directivity. In contrast to the linear plasmonic scattering dominated by a bulk dipolar mode, a complex surface-induced multipolar source at the doubled frequency sets a fundamental limit to control the SH radiation from metallic nanostructures. In this work, we harness plasmonic hybridization mechanism together with a special selection rule governing the SH radiation to achieve the high-intensity and tunable-direction emission by a metallic particle-in-cavity nanoantenna (PIC-NA). The nanoantenna is modelled with a first-principle, self-consistent boundary element method, which considers the depletion of pump waves. The giant SH enhancement arises from a hybridized gap plasmon resonance between the small particle and the large cavity that functions as a concentrator and reflector. Centrosymmetry breaking of the PIC-NA not only modifies the gap plasmon mode boosting the SH signal, but also redirects the SH wave with a unidirectional emission. The PIC-NA has a significantly larger SH conversion efficiency compared to existing literature. The main beam of the radiation pattern can be steered over a wide angle by tuning the particle's position.

Submitted 14 November, 2016; v1 submitted 10 May, 2016; originally announced May 2016.

Comments: 8 pages, 11 figures

Journal ref: Phys. Rev. A 94, 053825, 2016

arXiv:1604.00449 [pdf, other]

3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction

Authors: Christopher B. Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, Silvio Savarese

Abstract: Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2). The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data. Our network takes in one or more images of an object instance from arbitrary viewpoints and outputs a reconstruction of the object in the form of a 3D occupancy grid. Unlike most of the previous works, our network does not require any image annotations or object class labels for training or testing. Our extensive experimental analysis shows that our reconstruction framework i) outperforms the state-of-the-art methods for single view reconstruction, and ii) enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).

Submitted 1 April, 2016; originally announced April 2016.

Comments: Appendix can be found at http://cvgl.stanford.edu/papers/choy_16_appendix.pdf

arXiv:1507.00086 [pdf, ps, other]

doi 10.1364/OL.37.002112

Unidirectional and Wavelength Selective Photonic Sphere-Array Nanoantennas

Authors: Yang G. Liu, Wallace C. H. Choy, Wei E. I. Sha, Weng Cho Chew

Abstract: We design a photonic sphere-array nanoantenna (NA) exhibiting both strong directionality and wavelength selectivity. Although the geometric configuration of the photonic NA resembles a plasmonic Yagi-Uda NA, it has different working principles, and most importantly, reduces the inherent metallic loss from plasmonic elements. For any selected optical wavelength, a sharp Fano-resonance by the reflector is tunable to overlap spectrally with a wider dipole resonance by the sphere-chain director leading to the high directionality. The work provides design principles for directional and selective photonic NAs, which is particularly useful for photon detection and spontaneous emission manipulation.

Submitted 30 June, 2015; originally announced July 2015.

Comments: 3 pages, 4 figures

Journal ref: Optics Letters, 37(11): 2112-2114, 2012

arXiv:1507.00084 [pdf, ps, other]

doi 10.1364/OL.39.000158

Observing Abnormally Large Group Velocity at the Plasmonic Band Edge via a Universal Eigenvalue Analysis

Authors: Wei E. I. Sha, Ling Ling Meng, Wallace C. H. Choy, Weng Cho Chew

Abstract: We developed a novel universal eigenvalue analysis for 2D arbitrary nanostructures comprising dispersive and lossy materials. The complex dispersion relation (or complex Bloch band structure) of a metallic grating is rigorously calculated by the proposed algorithm with the finite-difference implementation. The abnormally large group velocity is observed at a plasmonic band edge with a large attenuation constant. Interestingly, we found the abnormal group velocity is caused by the leaky (radiation) loss not by metallic absorption (Ohmic) loss. The periodically modulated surface of the grating significantly modifies the original dispersion relation of the semi-infinite dielectric-metal structure and induces the extraordinarily large group velocity, which is different from the near-zero group velocity at photonic band edge. The work is fundamentally important to the design of plasmonic nanostructures.

Submitted 30 June, 2015; originally announced July 2015.

Comments: 4 pages, 6 figures

Journal ref: Optics Letters, 39(1): 158-161, 2014

arXiv:1506.09003 [pdf]

doi 10.1063/1.4922150

The Efficiency Limit of CH3NH3PbI3 Perovskite Solar Cells

Authors: Wei E. I. Sha, Xingang Ren, Luzhou Chen, Wallace C. H. Choy

Abstract: With the consideration of photon recycling effect, the efficiency limit of methylammonium lead iodide (CH3NH3PbI3) perovskite solar cells is predicted by a detailed balance model. To obtain convincing predictions, both AM 1.5 spectrum of Sun and experimentally measured complex refractive index of perovskite material are employed in the detailed balance model. The roles of light trapping and angular restriction in improving the maximal output power of thin-film perovskite solar cells are also clarified. The efficiency limit of perovskite cells (without the angular restriction) is about 31%, which approaches to Shockley-Queisser limit (33%) achievable by gallium arsenide (GaAs) cells. Moreover, the Shockley-Queisser limit could be reached with a 200 nm-thick perovskite solar cell, through integrating a wavelength-dependent angular-restriction design with a textured light-trapping structure. Additionally, the influence of the trap-assisted nonradiative recombination on the device efficiency is investigated. The work is fundamentally important to high-performance perovskite photovoltaics.

Submitted 30 June, 2015; originally announced June 2015.

Comments: 14 pages, 6 figures

Journal ref: Appl. Phys. Lett. 106, 221104 (2015)

arXiv:1302.3284 [pdf, ps, other]

On the macroscopic verifications of Klein's theorem and the proof of $E_0=mc^2$

Authors: T. C. Choy

Abstract: Alternative verifications of Klein's theorem and the proof of $E_0=mc^2$, for a relativistic macroscopic body are presented, using models with boundary conditions of varying complexity, together with some refinements for the case containing electromagnetic radiation for the simplest model. The robustness of these models to the final result of $E_0=mc^2$, attests to the minor role played by the Poincaré type stresses introduced in some of these models for mechanical stability. Finally we caution the reader that while internal consistency of the $E_0=mc^2$ relation for a macroscopic body in special relativity is proved, it does not in any way furnish a proof of the relation for a single point particle, for this would imply that one is able to prove the postulates of special relativity from the premises of the theory itself.

Submitted 19 September, 2013; v1 submitted 13 February, 2013; originally announced February 2013.

Comments: 7 pages, no figures

arXiv:1108.0167 [pdf, ps, other]

On the $c$ equivalence principle and its relation to the weak equivalence principle of general relativity

Authors: T. C. Choy

Abstract: We clarify the status of the $c$ equivalence principle ($c_u=c$) recently proposed by Heras et al \cite{JoseAJP2010,JoseEJP2010} and show that its proposal leads to an extension of the current framework of classical relativistic electrodynamics (CRE). This is because in the MLT (mass, length and time) system of units, CRE theory can contain only one fundamental constant of nature and special relativity dictates that this must be $c$, the standard speed of light in vacuum, a point not sufficiently emphasized in most textbooks with the exception of a few such as Panofsky and Phillips \cite{PanofskyPhillips}. The $c$ equivalence principle Heras \cite{JoseAJP2010,JoseEJP2010} can be shown to be linked to the second postulate of special relativity which extends the constancy of the unique velocity of light to all of physics (especially to mechanics) other than electromagnetism. An interesting corollary is that both the weak equivalence principle of general relativity and the $c$ equivalence principle are in fact one and the same, which we demonstrate within the context of Newtonian gravity.

Submitted 3 August, 2011; v1 submitted 31 July, 2011; originally announced August 2011.

Comments: 9 pages, no figures. Some typos corrected

arXiv:physics/0305062 [pdf, ps, other]

doi 10.1119/1.1643371

Capacitors can radiate - some consequences of the two-capacitor problem with radiation

Authors: T. C. Choy

Abstract: We fill a gap in the arguments of Boykin et al [American Journal of Physics, Vol 70 No. 4, pp 415-420 (2002)] by not invoking an electric current loop (i.e. magnetic dipole model) to account for the radiation energy loss, since an obvious corollary of their results is that the capacitors should radiate directly even if the connecting wires are shrunk to zero length. That this is so is shown here by a direct derivation of capacitor radiation using an oscillating electric dipole radiator model for the capacitors as well as the alternative less widely known magnetic 'charge' current loop representation for an electric dipole [see for example "Electromagnetic Waves" by S.A.Schlekunoff, van Nostrand (1948)]. Implications for Electromagnetic Compliance (EMC) issues as well as novel antenna designs further motivate the purpose of this paper.

Submitted 14 May, 2003; originally announced May 2003.

Comments: 5 Pages with No figures

Journal ref: Am. J. Phys. Vol 72 no 5, page 663 (2004)

arXiv:cond-mat/0305286 [pdf, ps, other]

doi 10.1088/0953-8984/17/10/007

Field emission theory beyond WKB - the full image problem

Authors: T. C. Choy, A. H. Harker, A. M. Stoneham

Abstract: The classic theory of electron field emission from a cold metal surface due to Fowler and Nordheim (FN) is re-examined and found to violate the validity criteria for the WKB approximation, for electric fields greater than about 1V per micron. In this study we shall examine the complete solution without invoking the WKB approximation in order to assess the reliability of the FN theory as widely used for the interpretation of experimental data. Particular problems occur when the barrier height (and therefore indirectly also the width) is significantly reduced by the image or some external effects. Further refinement of the theory will be discussed by considering the effects of screening, which can be one mechanism for the barrier height reduction, in addition to the widely known negative affinity of diamond like carbon systems. A comparison with experimental data from carbon field emitters shows that the enhanced current found in this paper may provide an explanation for strong field emission observed recently in carbon based samples.

Submitted 13 May, 2003; originally announced May 2003.

Comments: 18 pages and 6 figures

Journal ref: J. Phys. Cond. Matt. Vol 17 pg 1505 (2005)

arXiv:cond-mat/0011092 [pdf, ps, other]

Image charges revisited: a closed form solution

Authors: T. C. Choy

Abstract: We demonstrate that the corrections to the classical Kelvin image theory due to finite electron screening length $λ$, recently discussed by Roulet and Saint Jean, Am. J. Phys. 68(4) 319, is amenable to an exact closed form solution in terms of an integral involving Bessel functions. An improper choice of boundary conditions is rectified as well, enabling also a complete solution for all potentials - both inside and outside the metal surface.

Submitted 6 November, 2000; originally announced November 2000.

Comments: 17 pages, 1 Figure

arXiv:quant-ph/9911096 [pdf, ps, other]

doi 10.1103/PhysRevA.62.012506

The Van der Waals interaction of the hydrogen molecule - an exact local energy density functional

Authors: T. C. Choy

Abstract: We verify that the van der Waals interaction and hence all dispersion interactions for the hydrogen molecule given by: W"= -{A/R^6}-{B/R^8}-{C/R^10}- ..., in which R is the internuclear separation, are exactly soluble. The constants A=6.4990267..., B=124.3990835 ... and C=1135.2140398... (in Hartree units) first obtained approximately by Pauling and Beach (PB) [1] using a linear variational method, can be shown to be obtainable to any desired accuracy via our exact solution. In addition we shall show that a local energy density functional can be obtained, whose variational solution rederives the exact solution for this problem. This demonstrates explicitly that a static local density functional theory exists for this system. We conclude with remarks about generalising the method to other hydrogenic systems and also to helium.

Submitted 25 November, 1999; v1 submitted 22 November, 1999; originally announced November 1999.

Comments: 11 pages, 13 figures and 28 references

Journal ref: Physical Review A vol 62 12506 (2000)

arXiv:quant-ph/9907027

The meaning of 'counterfactual' statements and non-locality in quantum mechanics

Authors: T. C. Choy, Debra Ziegeler

Abstract: Recent discussions by Mermin [1] and Stapp [2] in this journal on non-locality and counterfactuality are shown to contain linguistic problems that require verification. As such they can at most provide us with two subjective choices for the meaning of 'counterfactual statements' in quantum mechanics. We shall show that the word 'counterfactual' is in fact inappropriate here and should be replaced by the word 'hypothetical'. Mermin's choice imposes a strictly contextual meaning based upon an interpretation of counterfactuality which he used to refute, without proof as we shall see, Stapp's logical proof [3] of non-locality in quantum theory. In linguistic theory both authors' choices of meaning: counterfactual versus hypothetical are equally acceptable and therefore some of the issues they discussed lie outside the domain of physics. The issues they discussed are further confused by the fact that in his reply Stapp [2] seems to have adopted Mermin's counterfactual interpretation against his own original [3] hypothetical interpretation. In the rest of this paper we shall adopt the hypothetical sense of Stapp's original statements but we modify his crucial statement LOC2 appropriately, then following his argumentations, we shall show that there is no conflict between relativity and quantum mechanics. We suggest that this should be the natural (pragmatic) choice of meaning in defining the predictions of events in the Hardy experiment.

Submitted 8 July, 1999; originally announced July 1999.

Comments: 10 pages, No figures, submitted to American Journal of Physics Apr 1999

arXiv:cond-mat/9511010 [pdf, ps, other]

doi 10.1063/1.472625

An improved perturbation approach to the 2D Edwards polymer -- corrections to scaling

Authors: S. R. Shannon, T. C. Choy, R. J. Fleming

Abstract: We present the results of a new perturbation calculation in polymer statistics which starts from a ground state that already correctly predicts the long chain length behaviour of the mean square end--to--end distance $\langle R_N^2 \rangle\ $, namely the solution to the 2~dimensional~(2D) Edwards model. The $\langle R_N^2 \rangle$ thus calculated is shown to be convergent in $N$, the number of steps in the chain, in contrast to previous methods which start from the free random walk solution. This allows us to calculate a new value for the leading correction--to--scaling exponent~$Δ$. Writing $\langle R_N^2 \rangle = AN^{2ν}(1+BN^{-Δ} + CN^{-1}+...)$, where $ν= 3/4$ in 2D, our result shows that $Δ= 1/2$. This value is also supported by an analysis of 2D self--avoiding walks on the {\em continuum}.

Submitted 2 November, 1995; originally announced November 1995.

Comments: 17 Pages of Revtex. No figures. Submitted to J. Phys. A

arXiv:cond-mat/9510162 [pdf, ps, other]

doi 10.1103/PhysRevB.53.2175

Corrections to scaling in 2--dimensional polymer statistics

Authors: S. R. Shannon, T. C. Choy, R. J. Fleming

Abstract: Writing $<R^2_N > = AN^{2ν}(1+BN^{-Δ_1}+CN^{-1}+ ...)$ for the mean square end--to--end length $<R^2_N>$ of a self--avoiding polymer chain of $N$ links, we have calculated $Δ_1$ for the two--dimensional {\em continuum} case from a new {\em finite} perturbation method based on the ground state of Edwards self consistent solution which predicts the (exact) $ν=3/4$ exponent. This calculation yields $Δ_1=1/2$. A finite size scaling analysis of data generated for the continuum using a biased sampling Monte Carlo algorithm supports this value, as does a re--analysis of exact data for two--dimensional lattices.

Submitted 30 October, 1995; originally announced October 1995.

Comments: 10 pages of RevTex, 5 Postscript figures. Accepted for publication in Phys. Rev. B. Brief Reports. Also submitted to J. Phys. A

Showing 1–45 of 45 results for author: Choy, C