Showing 1–43 of 43 results for author: Tu, R

  1. arXiv:2407.07111  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Diffusion Model-Based Video Editing: A Survey

    Authors: Wenhao Sun, Rong-Cheng Tu, Jingyi Liao, Dacheng Tao

    Abstract: The rapid development of diffusion models (DMs) has significantly advanced image and video applications, making "what you want is what you see" a reality. Among these, video editing has gained substantial attention and seen a swift rise in research activity, necessitating a comprehensive and systematic review of the existing literature. This paper reviews diffusion model-based video editing techni…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: 23 pages, 12 figures, a project related to this paper can be found at https://github.com/wenhao728/awesome-diffusion-v2v

  2. arXiv:2406.14555  [pdf, other]

    cs.CV

    A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

    Authors: Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao

    Abstract: Image editing aims to edit the given synthetic or real image to meet the specific requirements from users. It is widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC). Recent significant advancement in this field is based on the development of text-to-image (T2I) diffusion models, which generate images according to text prompts. Th…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Project Page: https://github.com/xinchengshuai/Awesome-Image-Editing

  3. arXiv:2406.08311  [pdf, other]

    cs.LG cs.AI

    Causality for Tabular Data Synthesis: A High-Order Structure Causal Benchmark Framework

    Authors: Ruibo Tu, Zineb Senane, Lele Cao, Cheng Zhang, Hedvig Kjellström, Gustav Eje Henter

    Abstract: Tabular synthesis models remain ineffective at capturing complex dependencies, and the quality of synthetic data is still insufficient for comprehensive downstream tasks, such as prediction under distribution shifts, automated decision-making, and cross-table understanding. A major challenge is the lack of prior knowledge about underlying structures and high-order relationships in tabular data. We…

    Submitted 5 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  4. Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask

    Authors: Zineb Senane, Lele Cao, Valentin Leonhard Buchner, Yusuke Tashiro, Lei You, Pawel Herman, Mats Nordahl, Ruibo Tu, Vilhelm von Ehrenheim

    Abstract: Time Series Representation Learning (TSRL) focuses on generating informative representations for various Time Series (TS) modeling tasks. Traditional Self-Supervised Learning (SSL) methods in TSRL fall into four main categories: reconstructive, adversarial, contrastive, and predictive, each with a common challenge of sensitivity to noise and intricate data nuances. Recently, diffusion-based method…

    Submitted 17 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: Published as a full paper by KDD 2024 Research Track (12 pages as main paper and 11 pages as appendix). Source code available at https://github.com/llcresearch/TSDE

    ACM Class: G.3; I.6.5; I.2.4

  5. arXiv:2404.12512  [pdf, other]

    cs.CR cs.LG

    Proteus: Preserving Model Confidentiality during Graph Optimizations

    Authors: Yubo Gao, Maryam Haghifam, Christina Giannoula, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar

    Abstract: Deep learning (DL) models have revolutionized numerous domains, yet optimizing them for computational efficiency remains a challenging endeavor. Development of new DL models typically involves two parties: the model developers and performance optimizers. The collaboration between the parties often necessitates the model developers exposing the model architecture and computational graph to the opti…

    Submitted 18 April, 2024; originally announced April 2024.

  6. arXiv:2312.06583  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    3D Hand Pose Estimation in Egocentric Images in the Wild

    Authors: Aditya Prakash, Ruisen Tu, Matthew Chang, Saurabh Gupta

    Abstract: We present WildHands, a method for 3D hand pose estimation in egocentric images in the wild. This is challenging due to (a) lack of 3D hand pose annotations for images in the wild, and (b) a form of perspective distortion-induced shape ambiguity that arises in the analysis of crops around hands. For the former, we use auxiliary supervision on in-the-wild data in the form of segmentation masks & gr…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Project page: https://ap229997.github.io/projects/hands/

  7. arXiv:2310.05181  [pdf, other]

    eess.AS cs.GR cs.HC cs.LG cs.SD

    Unified speech and gesture synthesis using flow matching

    Authors: Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter

    Abstract: As text-to-speech technologies achieve remarkable naturalness in read-aloud tasks, there is growing interest in multimodal synthesis of verbal and non-verbal communicative behaviour, such as spontaneous speech and associated body gestures. This paper presents a novel, unified architecture for jointly synthesising speech acoustics and skeleton-based 3D gesture motion from text, trained using optima…

    Submitted 9 January, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: 5 pages, 1 figure. Final version, accepted to IEEE ICASSP 2024

    MSC Class: 68T07 (Primary); 68T42 (Secondary) ACM Class: I.2.7; I.2.6; H.5

  8. arXiv:2309.03199  [pdf, other]

    eess.AS cs.HC cs.LG cs.SD

    Matcha-TTS: A fast TTS architecture with conditional flow matching

    Authors: Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter

    Abstract: We introduce Matcha-TTS, a new encoder-decoder architecture for speedy TTS acoustic modelling, trained using optimal-transport conditional flow matching (OT-CFM). This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching. Careful design choices additionally ensure each synthesis step is fast to run. The method is probabilistic…

    Submitted 9 January, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures. Final version, accepted to IEEE ICASSP 2024

    MSC Class: 68T07 ACM Class: I.2.7; I.2.6; H.5.5

  9. arXiv:2307.15034  [pdf, other]

    cs.LG math.NA

    Guaranteed Approximation Bounds for Mixed-Precision Neural Operators

    Authors: Renbo Tu, Colin White, Jean Kossaifi, Boris Bonev, Nikola Kovachki, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Neural operators, such as Fourier Neural Operators (FNO), form a principled approach for learning solution operators for PDEs and other mappings between function spaces. However, many real-world problems require high-resolution training data, and the training time and limited GPU memory pose big barriers. One solution is to train neural operators in mixed precision to reduce the memory requirement…

    Submitted 5 May, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: ICLR 2024

  10. arXiv:2306.07096  [pdf, other]

    cs.CV

    Global and Local Semantic Completion Learning for Vision-Language Pre-training

    Authors: Rong-Cheng Tu, Yatai Ji, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu

    Abstract: Cross-modal alignment plays a crucial role in vision-language pre-training (VLP) models, enabling them to capture meaningful associations across different modalities. For this purpose, numerous masked modeling tasks have been proposed for VLP to further promote cross-modal interactions. The core idea of previous masked modeling tasks is to focus on reconstructing the masked tokens based on visible…

    Submitted 5 December, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2211.13437

  11. Range Anxiety Among Battery Electric Vehicle Users: Both Distance and Waiting Time Matter

    Authors: Jiyao Wang, Chunxi Huang, Dengbo He, Ran Tu

    Abstract: Range anxiety is a major concern of battery electric vehicles (BEVs) users or potential users. Previous work has explored the influential factors of distance-related range anxiety. However, time-related range anxiety has rarely been explored. The time cost when charging or waiting to charge the BEVs can negatively impact BEV users' experience. As a preliminary attempt, this survey study investigat…

    Submitted 24 January, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted by Human Factors and Ergonomics Society International Annual Meeting 2023

  12. arXiv:2304.04681  [pdf, other]

    cs.CV cs.LG

    Controllable Motion Synthesis and Reconstruction with Autoregressive Diffusion Models

    Authors: Wenjie Yin, Ruibo Tu, Hang Yin, Danica Kragic, Hedvig Kjellström, Mårten Björkman

    Abstract: Data-driven and controllable human motion synthesis and prediction are active research areas with various applications in interactive media and social robotics. Challenges remain in these fields for generating diverse motions given past observations and dealing with imperfect poses. This paper introduces MoDiff, an autoregressive probabilistic diffusion model over motion sequences conditioned on c…

    Submitted 3 April, 2023; originally announced April 2023.

  13. arXiv:2301.13819  [pdf, other]

    cs.CL cs.LG

    Causal-Discovery Performance of ChatGPT in the context of Neuropathic Pain Diagnosis

    Authors: Ruibo Tu, Chao Ma, Cheng Zhang

    Abstract: ChatGPT has demonstrated exceptional proficiency in natural language conversation, e.g., it can answer a wide range of questions while no previous large language models can. Thus, we would like to push its limit and explore its ability to answer causal discovery questions by using a medical benchmark (Tu et al. 2019) in causal discovery.

    Submitted 6 February, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

  14. arXiv:2301.10076  [pdf]

    cs.CY

    Influential Factors of Users' Trust in the Range Estimation Systems of Battery Electric Vehicles -- A Survey Study in China

    Authors: Jiyao Wang, Chunxi Huang, Ran Tu, Dengbo He

    Abstract: Although the rapid development of battery technology has greatly increased the range of battery electric vehicle (BEV), the range anxiety is still a major concern of BEV users or potential users. Previous work has proposed a framework explaining the influential factors of range anxiety and users' trust toward the range estimation system (RES) of BEV has been identified as a leading factor of range…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: Accepted and reported at Transportation Research Board Annual Meeting 2022

    Report number: TRBAM-23-01746

  15. arXiv:2301.07966  [pdf, ps, other]

    cs.LG math.OC

    Getting Away with More Network Pruning: From Sparsity to Geometry and Linear Regions

    Authors: Junyang Cai, Khai-Nguyen Nguyen, Nishant Shrestha, Aidan Good, Ruisen Tu, Xin Yu, Shandian Zhe, Thiago Serra

    Abstract: One surprising trait of neural networks is the extent to which their connections can be pruned with little to no effect on accuracy. But when we cross a critical level of parameter sparsity, pruning any further leads to a sudden drop in accuracy. This drop plausibly reflects a loss in model complexity, which we aim to avoid. In this work, we explore how sparsity also affects the geometry of the li…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: (Under review)

  16. arXiv:2212.10013  [pdf, other]

    cs.AI cs.CL

    DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

    Authors: Forrest Sheng Bao, Ruixuan Tu, Ge Luo, Yinfei Yang, Hebi Li, Minghui Qiu, Youbiao He, Cen Chen

    Abstract: Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system s…

    Submitted 26 November, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted into Findings of EMNLP 2023

  17. arXiv:2212.03125   

    cs.CV

    Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations

    Authors: Minghao Chen, Renbo Tu, Chenxi Huang, Yuqi Lin, Boxi Wu, Deng Cai

    Abstract: Previous work on action representation learning focused on global representations for short video clips. In contrast, many practical applications, such as video alignment, strongly demand learning the intensive representation of long videos. In this paper, we introduce a new framework of contrastive action representation learning (CARL) to learn frame-wise action representation in a self-supervise…

    Submitted 1 March, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: author conflicts

  18. arXiv:2211.13437  [pdf, other]

    cs.CV cs.CL cs.MM

    Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

    Authors: Yatai Ji, Rongcheng Tu, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu

    Abstract: Cross-modal alignment is essential for vision-language pre-training (VLP) models to learn the correct corresponding information across different modalities. For this purpose, inspired by the success of masked language modeling (MLM) tasks in the NLP pre-training area, numerous masked modeling tasks have been proposed for VLP to further promote cross-modal interactions. The core idea of previous ma…

    Submitted 26 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: CVPR 2023 accept

  19. arXiv:2210.03324  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoML for Climate Change: A Call to Action

    Authors: Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White

    Abstract: The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications. The climate change AI (CCAI) community works on a diverse, challenging set of problems which often involve physics-constrained ML or heterogeneous spatiotemporal data. It would be desirable to use automated machine learning (AutoML)…

    Submitted 7 October, 2022; originally announced October 2022.

  20. arXiv:2210.03230  [pdf, other]

    cs.LG cs.AI stat.ML

    NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies

    Authors: Arjun Krishnakumar, Colin White, Arber Zela, Renbo Tu, Mahmoud Safari, Frank Hutter

    Abstract: Zero-cost proxies (ZC proxies) are a recent architecture performance prediction technique aiming to significantly speed up algorithms for neural architecture search (NAS). Recent work has shown that these techniques show great promise, but certain aspects, such as evaluating and exploiting their complementary strengths, are under-studied. In this work, we create NAS-Bench-Suite: we evaluate 13 ZC…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: NeurIPS Datasets and Benchmarks Track 2022

  21. arXiv:2209.11475  [pdf, other]

    cs.CV cs.IR

    Unsupervised Hashing with Semantic Concept Mining

    Authors: Rong-Cheng Tu, Xian-Ling Mao, Kevin Qinghong Lin, Chengfei Cai, Weize Qin, Hongfa Wang, Wei Wei, Heyan Huang

    Abstract: Recently, to improve the unsupervised image retrieval performance, plenty of unsupervised hashing methods have been proposed by designing a semantic similarity matrix, which is based on the similarities between image features extracted by a pre-trained CNN model. However, most of these methods tend to ignore high-level abstract semantic concepts contained in images. Intuitively, concepts play an i…

    Submitted 23 September, 2022; originally announced September 2022.

  22. arXiv:2207.01622  [pdf, other]

    cs.CV

    Egocentric Video-Language Pretraining @ Ego4D Challenge 2022

    Authors: Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rongcheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou

    Abstract: In this report, we propose a video-language pretraining (VLP) based solution \cite{kevin2022egovlp} for four Ego4D challenge tasks, including Natural Language Query (NLQ), Moment Query (MQ), Object State Change Classification (OSCC), and PNR Localization (PNR). Especially, we exploit the recently released Ego4D dataset \cite{grauman2021ego4d} to pioneer Egocentric VLP from pretraining dataset, pre…

    Submitted 3 August, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Preprint. 4 pages, 2 figures, 5 tables. Code: https://github.com/showlab/EgoVLP. The Ego4D challenge technical report of EgoVLP arXiv:2206.01670. See EPIC challenge technical report arXiv:2207.01334 for overlap

  23. arXiv:2207.01334  [pdf, other]

    cs.CV

    Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022

    Authors: Kevin Qinghong Lin, Alex Jinpeng Wang, Rui Yan, Eric Zhongcong Xu, Rongcheng Tu, Yanru Zhu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Wei Liu, Mike Zheng Shou

    Abstract: In this report, we propose a video-language pretraining (VLP) based solution \cite{kevin2022egovlp} for the EPIC-KITCHENS-100 Multi-Instance Retrieval (MIR) challenge. Especially, we exploit the recently released Ego4D dataset \cite{grauman2021ego4d} to pioneer Egocentric VLP from pretraining dataset, pretraining objective, and development set. Based on the above three designs, we develop a pretra…

    Submitted 3 August, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: To appear in CVPRW22. 5 pages, 2 figures, 2 tables. Code: https://github.com/showlab/EgoVLP. The EPIC challenge technical report of EgoVLP arXiv:2206.01670. See Ego4D challenge technical report arXiv:2207.01622

  24. arXiv:2206.01670  [pdf, other]

    cs.CV cs.AI

    Egocentric Video-Language Pretraining

    Authors: Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rongcheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou

    Abstract: Video-Language Pretraining (VLP), which aims to learn transferable representation to advance a wide range of video-text downstream tasks, has recently received increasing attention. Best performing works rely on large-scale, 3rd-person video-text datasets, such as HowTo100M. In this work, we exploit the recently released Ego4D dataset to pioneer Egocentric VLP along three directions. (i) We create…

    Submitted 12 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS 2022. Double champions at Ego4D and EPIC-Kitchens, CVPR 2022 challenges. 23 pages, 13 figures, 12 tables. Code: https://github.com/showlab/EgoVLP

  25. arXiv:2203.15787  [pdf]

    cs.RO math.OC

    Effective and Acceptable Eco-Driving Guidance for Human-Driving Vehicles: A Review

    Authors: Ran Tu, Junshi Xu

    Abstract: Ecodriving guidance includes courses or suggestions for human drivers to improve driving behaviour, reducing energy use and emissions. This paper presents a systematic review of existing eco-driving guidance studies and identifies challenges to tackle in the future. A standard agreement on the guidance design has not been reached, leading to difficulties in designing and implementing eco-driving g…

    Submitted 27 March, 2022; originally announced March 2022.

  26. arXiv:2201.09366  [pdf, other]

    cs.LG stat.ME

    Optimal transport for causal discovery

    Authors: Ruibo Tu, Kun Zhang, Hedvig Kjellström, Cheng Zhang

    Abstract: To determine causal relationships between two variables, approaches based on Functional Causal Models (FCMs) have been proposed by properly restricting model classes; however, the performance is sensitive to the model assumptions, which makes it difficult to use. In this paper, we provide a novel dynamical-system view of FCMs and propose a new framework for identifying causal direction in the biva…

    Submitted 29 March, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

  27. arXiv:2110.06257  [pdf, other]

    cs.LG stat.ML

    Causal Discovery from Conditionally Stationary Time Series

    Authors: Carles Balsells-Rodas, Ruibo Tu, Hedvig Kjellstrom, Yingzhen Li

    Abstract: Causal discovery, i.e., inferring underlying causal relationships from observational data, has been shown to be highly challenging for AI systems. In time series modeling context, traditional causal discovery methods mainly consider constrained scenarios with fully observed variables and/or data from stationary time-series. We develop a causal discovery approach to handle a wide class of non-stati…

    Submitted 23 February, 2024; v1 submitted 12 October, 2021; originally announced October 2021.

  28. arXiv:2110.05668  [pdf, other]

    cs.CV cs.LG

    NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks

    Authors: Renbo Tu, Nicholas Roberts, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar

    Abstract: Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g. image classification on CIFAR or ImageNet. This makes the performance of NAS approaches in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to…

    Submitted 19 January, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track

  29. arXiv:2106.04502  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

    Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan, Virginia Smith, Ameet Talwalkar

    Abstract: Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investig…

    Submitted 4 November, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  30. arXiv:2104.08157  [pdf, other]

    cs.LG stat.ME

    Capturing patterns of variation unique to a specific dataset

    Authors: Robin Tu, Alexander H. Foss, Sihai D. Zhao

    Abstract: Capturing patterns of variation present in a dataset is important in exploratory data analysis and unsupervised learning. Contrastive dimension reduction methods, such as contrastive principal component analysis (cPCA), find patterns unique to a target dataset of interest by contrasting with a carefully chosen background dataset representing unwanted or uninteresting variation. However, such metho…

    Submitted 16 April, 2021; originally announced April 2021.

  31. arXiv:2103.11349  [pdf, other]

    cs.LG stat.ML

    Neighbor Embedding Variational Autoencoder

    Authors: Renfei Tu, Yang Liu, Yongzeng Xue, Cheng Wang, Maozu Guo

    Abstract: Being one of the most popular generative frameworks, variational autoencoders (VAE) are known to suffer from a phenomenon termed posterior collapse, i.e. the latent variational distributions collapse to the prior, especially when a strong decoder network is used. In this work, we analyze the latent representation of collapsed VAEs, and propose a novel model, neighbor embedding VAE (NE-VAE), which ex…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: Paper under review for ICML2021

  32. arXiv:2011.03451  [pdf, other]

    cs.CV cs.IR cs.MM

    Deep Cross-modal Hashing via Margin-dynamic-softmax Loss

    Authors: Rong-Cheng Tu, Xian-Ling Mao, Rongxin Tu, Binbin Bian, Wei Wei, Heyan Huang

    Abstract: Due to their high retrieval efficiency and low storage cost for cross-modal search task, cross-modal hashing methods have attracted considerable attention. For the supervised cross-modal hashing methods, how to make the learned hash codes preserve semantic information sufficiently contained in the label of datapoints is the key to further enhance the retrieval performance. Hence, almost all superv…

    Submitted 18 May, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

  33. arXiv:2011.02620  [pdf, other]

    cs.MM cs.LG

    A multi-level approach with visual information for encrypted H.265/HEVC videos

    Authors: Wenying Wen, Rongxin Tu, Yushu Zhang, Yuming Fang, Yong Yang

    Abstract: High-efficiency video coding (HEVC) encryption has been proposed to encrypt syntax elements for the purpose of video encryption. To achieve high video security, to the best of our knowledge, almost all of the existing HEVC encryption algorithms mainly encrypt the whole video, such that the user without permissions cannot obtain any viewable information. However, these encryption algorithms cannot…

    Submitted 4 November, 2020; originally announced November 2020.

  34. arXiv:2010.11300  [pdf, ps, other]

    cs.LG cs.CY

    How Do Fair Decisions Fare in Long-term Qualification?

    Authors: Xueru Zhang, Ruibo Tu, Yang Liu, Mingyan Liu, Hedvig Kjellström, Kun Zhang, Cheng Zhang

    Abstract: Although many fairness criteria have been proposed for decision making, their long-term impact on the well-being of a population remains unclear. In this work, we study the dynamics of population qualification and algorithmic decisions under a partially observed Markov decision problem setting. By characterizing the equilibrium of such dynamics, we analyze the long-term impact of static fairness c…

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted to the 34th Conference on Neural Information Processing Systems (NeurIPS)

  35. Multi-Objective Eco-Routing for Dynamic Control of Connected & Automated Vehicles

    Authors: Shadi Djavadian, Ran Tu, Bilal Farooq, Marianne Hatzopoulou

    Abstract: The advent of intelligent vehicles that can communicate with infrastructure as well as automate the movement provides a range of new options to address key urban traffic issues such as congestion and pollution, without the need for centralized traffic control. Furthermore, the advances in the information, communication, and sensing technologies have provided access to real-time traffic and emissio…

    Submitted 8 October, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Journal ref: Transportation Research Part D: Transport and Environment. 87C: 1-16 (2020)

  36. arXiv:2004.08286  [pdf, other]

    eess.SP cs.LG stat.ML

    Greenhouse Gas Emission Prediction on Road Network using Deep Sequence Learning

    Authors: Lama Alfaseeh, Ran Tu, Bilal Farooq, Marianne Hatzopoulou

    Abstract: Mitigating the substantial undesirable impact of transportation systems on the environment is paramount. Thus, predicting Greenhouse Gas (GHG) emissions is one of the profound topics, especially with the emergence of intelligent transportation systems (ITS). We develop a deep learning framework to predict link-level GHG emission rate (ER) (in CO2eq gram/second) based on the most representative pre…

    Submitted 4 December, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

  37. arXiv:1907.12490  [pdf, other]

    cs.IR cs.MM

    Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning

    Authors: Rong-Cheng Tu, Xian-Ling Mao, Bing Ma, Yong Hu, Tan Yan, Wei Wei, Heyan Huang

    Abstract: Due to their high retrieval efficiency and low storage cost, cross-modal hashing methods have attracted considerable attention. Generally, compared with shallow cross-modal hashing methods, deep cross-modal hashing methods can achieve a more satisfactory performance by integrating feature learning and hash code optimization into the same framework. However, most existing deep cross-modal hashing meth…

    Submitted 29 July, 2019; originally announced July 2019.

  38. arXiv:1906.01732  [pdf, other]

    cs.LG stat.ML

    Neuropathic Pain Diagnosis Simulator for Causal Discovery Algorithm Evaluation

    Authors: Ruibo Tu, Kun Zhang, Bo Christer Bertilson, Hedvig Kjellström, Cheng Zhang

    Abstract: Discovery of causal relations from observational data is essential for many disciplines of science and real-world applications. However, unlike other machine learning algorithms, whose development has been greatly fostered by a large amount of available benchmark datasets, causal discovery algorithms are notoriously difficult to be systematically evaluated because few datasets with known ground-tr…

    Submitted 28 October, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted by NeurIPS 2019, 6 figures, 10 tables

  39. arXiv:1811.09822  [pdf, other]

    cs.CV

    Object Detection based Deep Unsupervised Hashing

    Authors: Rong-Cheng Tu, Xian-Ling Mao, Bo-Si Feng, Bing-Bing Bian, Yu-shu Ying

    Abstract: Recently, similarity-preserving hashing methods have been extensively studied for large-scale image retrieval. Compared with unsupervised hashing, supervised hashing methods for labeled data usually perform better by utilizing semantic label information. Intuitively, for unlabeled data, it will improve the performance of unsupervised hashing methods if we can first mine some supervised se…

    Submitted 24 November, 2018; originally announced November 2018.

  40. arXiv:1810.03435  [pdf, ps, other]

    q-bio.QM cs.LG stat.ML

    Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation

    Authors: Charles Hamesse, Ruibo Tu, Paul Ackermann, Hedvig Kjellström, Cheng Zhang

    Abstract: Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries. Rehabilitation after such a musculoskeletal injury remains a prolonged process with a very variable outcome. Accurately predicting rehabilitation outcome is crucial for treatment decision support. However, it is challenging to train an automatic method for predicting the ATR rehabilitation outcome from treatment data, due to…

    Submitted 13 August, 2019; v1 submitted 8 September, 2018; originally announced October 2018.

  41. arXiv:1807.04010  [pdf, ps, other]

    cs.LG stat.ML

    Causal Discovery in the Presence of Missing Data

    Authors: Ruibo Tu, Kun Zhang, Paul Ackermann, Bo Christer Bertilson, Clark Glymour, Hedvig Kjellström, Cheng Zhang

    Abstract: Missing data are ubiquitous in many domains including healthcare. When these data entries are not missing completely at random, the (conditional) independence relations in the observed data may be different from those in the complete data generated by the underlying causal process. Consequently, simply applying existing causal discovery methods to the observed data may lead to wrong conclusions. I…

    Submitted 12 July, 2020; v1 submitted 11 July, 2018; originally announced July 2018.

  42. arXiv:1804.11031  [pdf, other]

    cs.CV

    Towards Deeper Generative Architectures for GANs using Dense connections

    Authors: Samarth Tripathi, Renbo Tu

    Abstract: In this paper, we present the result of adopting skip connections and dense layers, previously used in image classification tasks, in the Fisher GAN implementation. We have experimented with different numbers of layers and inserting these connections in different sections of the network. Our findings suggest that networks implemented with the connections produce better images than the baseline, a…

    Submitted 12 November, 2018; v1 submitted 29 April, 2018; originally announced April 2018.

  43. arXiv:0801.4571  [pdf, ps, other]

    cs.IT

    Is SP BP?

    Authors: Ronghui Tu, Yongyi Mao, Jiying Zhao

    Abstract: The Survey Propagation (SP) algorithm for solving $k$-SAT problems has been shown recently as an instance of the Belief Propagation (BP) algorithm. In this paper, we show that for general constraint-satisfaction problems, SP may not be reducible from BP. We also establish the conditions under which such a reduction is possible. Along our development, we present a unification of the existing SP a…

    Submitted 29 January, 2008; originally announced January 2008.

    Comments: 77 page double-spaced single-column submitted version to IEEE Transactions on Information Theory