subscribe to arXiv mailings

Relating CNN-Transformer Fusion Network for Change Detection

Authors: Yuhao Gao, Gensheng Pei, Mengmeng Sheng, Zeren Sun, Tao Chen, Yazhou Yao

Abstract: While deep learning, particularly convolutional neural networks (CNNs), has revolutionized remote sensing (RS) change detection (CD), existing approaches often miss crucial features due to neglecting global context and incomplete change learning. Additionally, transformer networks struggle with low-level details. RCTNet addresses these limitations by introducing \textbf{(1)} an early fusion backbo… ▽ More While deep learning, particularly convolutional neural networks (CNNs), has revolutionized remote sensing (RS) change detection (CD), existing approaches often miss crucial features due to neglecting global context and incomplete change learning. Additionally, transformer networks struggle with low-level details. RCTNet addresses these limitations by introducing \textbf{(1)} an early fusion backbone to exploit both spatial and temporal features early on, \textbf{(2)} a Cross-Stage Aggregation (CSA) module for enhanced temporal representation, \textbf{(3)} a Multi-Scale Feature Fusion (MSF) module for enriched feature extraction in the decoder, and \textbf{(4)} an Efficient Self-deciphering Attention (ESA) module utilizing transformers to capture global information and fine-grained details for accurate change detection. Extensive experiments demonstrate RCTNet's clear superiority over traditional RS image CD methods, showing significant improvement and an optimal balance between accuracy and computational cost. △ Less

Submitted 3 July, 2024; originally announced July 2024.

Comments: accepted by IEEE Conference on Multimedia Expo

arXiv:2407.02768 [pdf, other]

Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation

Authors: Tao Chen, XiRuo Jiang, Gensheng Pei, Zeren Sun, Yucheng Wang, Yazhou Yao

Abstract: Though adversarial erasing has prevailed in weakly supervised semantic segmentation to help activate integral object regions, existing approaches still suffer from the dilemma of under-activation and over-expansion due to the difficulty in determining when to stop erasing. In this paper, we propose a \textbf{K}nowledge \textbf{T}ransfer with \textbf{S}imulated Inter-Image \textbf{E}rasing (KTSE) a… ▽ More Though adversarial erasing has prevailed in weakly supervised semantic segmentation to help activate integral object regions, existing approaches still suffer from the dilemma of under-activation and over-expansion due to the difficulty in determining when to stop erasing. In this paper, we propose a \textbf{K}nowledge \textbf{T}ransfer with \textbf{S}imulated Inter-Image \textbf{E}rasing (KTSE) approach for weakly supervised semantic segmentation to alleviate the above problem. In contrast to existing erasing-based methods that remove the discriminative part for more object discovery, we propose a simulated inter-image erasing scenario to weaken the original activation by introducing extra object information. Then, object knowledge is transferred from the anchor image to the consequent less activated localization map to strengthen network localization ability. Considering the adopted bidirectional alignment will also weaken the anchor image activation if appropriate constraints are missing, we propose a self-supervised regularization module to maintain the reliable activation in discriminative regions and improve the inter-class object boundary recognition for complex images with multiple categories of objects. In addition, we resort to intra-image erasing and propose a multi-granularity alignment module to gently enlarge the object activation to boost the object knowledge transfer. Extensive experiments and ablation studies on PASCAL VOC 2012 and COCO datasets demonstrate the superiority of our proposed approach. Source codes and models are available at https://github.com/NUST-Machine-Intelligence-Laboratory/KTSE. △ Less

Submitted 2 July, 2024; originally announced July 2024.

Comments: accepted by the European Conference on Computer Vision (ECCV), 2024

arXiv:2405.11742 [pdf, other]

Universal Organizer of SAM for Unsupervised Semantic Segmentation

Authors: Tingting Li, Gensheng Pei, Xinhao Cai, Huafeng Liu, Qiong Wang, Yazhou Yao

Abstract: Unsupervised semantic segmentation (USS) aims to achieve high-quality segmentation without manual pixel-level annotations. Existing USS models provide coarse category classification for regions, but the results often have blurry and imprecise edges. Recently, a robust framework called the segment anything model (SAM) has been proven to deliver precise boundary object masks. Therefore, this paper p… ▽ More Unsupervised semantic segmentation (USS) aims to achieve high-quality segmentation without manual pixel-level annotations. Existing USS models provide coarse category classification for regions, but the results often have blurry and imprecise edges. Recently, a robust framework called the segment anything model (SAM) has been proven to deliver precise boundary object masks. Therefore, this paper proposes a universal organizer based on SAM, termed as UO-SAM, to enhance the mask quality of USS models. Specifically, using only the original image and the masks generated by the USS model, we extract visual features to obtain positional prompts for target objects. Then, we activate a local region optimizer that performs segmentation using SAM on a per-object basis. Finally, we employ a global region optimizer to incorporate global image information and refine the masks to obtain the final fine-grained masks. Compared to existing methods, our UO-SAM achieves state-of-the-art performance. △ Less

Submitted 19 May, 2024; originally announced May 2024.

Comments: accepted by IEEE International Conference on Multimedia & Expo

arXiv:2405.05990 [pdf, other]

Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models

Authors: Yang Bai, Ge Pei, Jindong Gu, Yong Yang, Xingjun Ma

Abstract: Large language models (LLMs) have achieved remarkable performance on a wide range of tasks. However, recent studies have shown that LLMs can memorize training data and simple repeated tokens can trick the model to leak the data. In this paper, we take a step further and show that certain special characters or their combinations with English letters are stronger memory triggers, leading to more sev… ▽ More Large language models (LLMs) have achieved remarkable performance on a wide range of tasks. However, recent studies have shown that LLMs can memorize training data and simple repeated tokens can trick the model to leak the data. In this paper, we take a step further and show that certain special characters or their combinations with English letters are stronger memory triggers, leading to more severe data leakage. The intuition is that, since LLMs are trained with massive data that contains a substantial amount of special characters (e.g. structural symbols {, } of JSON files, and @, # in emails and online posts), the model may memorize the co-occurrence between these special characters and the raw texts. This motivates us to propose a simple but effective Special Characters Attack (SCA) to induce training data leakage. Our experiments verify the high effectiveness of SCA against state-of-the-art LLMs: they can leak diverse training data, such as code corpus, web pages, and personally identifiable information, and sometimes generate non-stop outputs as a byproduct. We further show that the composition of the training data corpus can be revealed by inspecting the leaked data -- one crucial piece of information for pre-training high-performance LLMs. Our work can help understand the sensitivity of LLMs to special characters and identify potential areas for improvement. △ Less

Submitted 20 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.19311 [pdf, other]

A Light-weight Transformer-based Self-supervised Matching Network for Heterogeneous Images

Authors: Wang Zhang, Tingting Li, Yuntian Zhang, Gensheng Pei, Xiruo Jiang, Yazhou Yao

Abstract: Matching visible and near-infrared (NIR) images remains a significant challenge in remote sensing image fusion. The nonlinear radiometric differences between heterogeneous remote sensing images make the image matching task even more difficult. Deep learning has gained substantial attention in computer vision tasks in recent years. However, many methods rely on supervised learning and necessitate l… ▽ More Matching visible and near-infrared (NIR) images remains a significant challenge in remote sensing image fusion. The nonlinear radiometric differences between heterogeneous remote sensing images make the image matching task even more difficult. Deep learning has gained substantial attention in computer vision tasks in recent years. However, many methods rely on supervised learning and necessitate large amounts of annotated data. Nevertheless, annotated data is frequently limited in the field of remote sensing image matching. To address this challenge, this paper proposes a novel keypoint descriptor approach that obtains robust feature descriptors via a self-supervised matching network. A light-weight transformer network, termed as LTFormer, is designed to generate deep-level feature descriptors. Furthermore, we implement an innovative triplet loss function, LT Loss, to enhance the matching performance further. Our approach outperforms conventional hand-crafted local feature descriptors and proves equally competitive compared to state-of-the-art deep learning-based methods, even amidst the shortage of annotated data. △ Less

Submitted 30 April, 2024; originally announced April 2024.

Comments: accepted by Information Fusion

arXiv:2404.13505 [pdf, other]

Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation

Authors: Gensheng Pei, Yazhou Yao, Jianbo Jiao, Wenguan Wang, Liqiang Nie, Jinhui Tang

Abstract: Conventional video object segmentation (VOS) methods usually necessitate a substantial volume of pixel-level annotated video data for fully supervised learning. In this paper, we present HVC, a \textbf{h}ybrid static-dynamic \textbf{v}isual \textbf{c}orrespondence framework for self-supervised VOS. HVC extracts pseudo-dynamic signals from static images, enabling an efficient and scalable VOS model… ▽ More Conventional video object segmentation (VOS) methods usually necessitate a substantial volume of pixel-level annotated video data for fully supervised learning. In this paper, we present HVC, a \textbf{h}ybrid static-dynamic \textbf{v}isual \textbf{c}orrespondence framework for self-supervised VOS. HVC extracts pseudo-dynamic signals from static images, enabling an efficient and scalable VOS model. Our approach utilizes a minimalist fully-convolutional architecture to capture static-dynamic visual correspondence in image-cropped views. To achieve this objective, we present a unified self-supervised approach to learn visual representations of static-dynamic feature similarity. Firstly, we establish static correspondence by utilizing a priori coordinate information between cropped views to guide the formation of consistent static feature representations. Subsequently, we devise a concise convolutional layer to capture the forward / backward pseudo-dynamic signals between two views, serving as cues for dynamic representations. Finally, we propose a hybrid visual correspondence loss to learn joint static and dynamic consistency representations. Our approach, without bells and whistles, necessitates only one training session using static image data, significantly reducing memory consumption ($\sim$16GB) and training time ($\sim$\textbf{2h}). Moreover, HVC achieves state-of-the-art performance in several self-supervised VOS benchmarks and additional video label propagation tasks. △ Less

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2403.17881 [pdf, other]

Deepfake Generation and Detection: A Benchmark and Survey

Authors: Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, Dacheng Tao

Abstract: Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, digital human creation, to name a few. With the advancements in deep learning, techniques primarily represented by Variational Autoencoders and Generative Adversarial Networks have achieved… ▽ More Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, digital human creation, to name a few. With the advancements in deep learning, techniques primarily represented by Variational Autoencoders and Generative Adversarial Networks have achieved impressive generation results. More recently, the emergence of diffusion models with powerful generation capabilities has sparked a renewed wave of research. In addition to deepfake generation, corresponding detection technologies continuously evolve to regulate the potential misuse of deepfakes, such as for privacy invasion and phishing attacks. This survey comprehensively reviews the latest developments in deepfake generation and detection, summarizing and analyzing current state-of-the-arts in this rapidly evolving field. We first unify task definitions, comprehensively introduce datasets and metrics, and discuss developing technologies. Then, we discuss the development of several related sub-fields and focus on researching four representative deepfake fields: face swapping, face reenactment, talking face generation, and facial attribute editing, as well as forgery detection. Subsequently, we comprehensively benchmark representative methods on popular datasets for each field, fully evaluating the latest and influential published works. Finally, we analyze challenges and future research directions of the discussed fields. △ Less

Submitted 16 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: We closely follow the latest developments in https://github.com/flyingby/Awesome-Deepfake-Generation-and-Detection

arXiv:2402.19082 [pdf, other]

VideoMAC: Video Masked Autoencoders Meet ConvNets

Authors: Gensheng Pei, Tao Chen, Xiruo Jiang, Huafeng Liu, Zeren Sun, Yazhou Yao

Abstract: Recently, the advancement of self-supervised learning techniques, like masked autoencoders (MAE), has greatly influenced visual representation learning for images and videos. Nevertheless, it is worth noting that the predominant approaches in existing masked image / video modeling rely excessively on resource-intensive vision transformers (ViTs) as the feature encoder. In this paper, we propose a… ▽ More Recently, the advancement of self-supervised learning techniques, like masked autoencoders (MAE), has greatly influenced visual representation learning for images and videos. Nevertheless, it is worth noting that the predominant approaches in existing masked image / video modeling rely excessively on resource-intensive vision transformers (ViTs) as the feature encoder. In this paper, we propose a new approach termed as \textbf{VideoMAC}, which combines video masked autoencoders with resource-friendly ConvNets. Specifically, VideoMAC employs symmetric masking on randomly sampled pairs of video frames. To prevent the issue of mask pattern dissipation, we utilize ConvNets which are implemented with sparse convolutional operators as encoders. Simultaneously, we present a simple yet effective masked video modeling (MVM) approach, a dual encoder architecture comprising an online encoder and an exponential moving average target encoder, aimed to facilitate inter-frame reconstruction consistency in videos. Additionally, we demonstrate that VideoMAC, empowering classical (ResNet) / modern (ConvNeXt) convolutional encoders to harness the benefits of MVM, outperforms ViT-based approaches on downstream tasks, including video object segmentation (+\textbf{5.2\%} / \textbf{6.4\%} $\mathcal{J}\&\mathcal{F}$), body part propagation (+\textbf{6.3\%} / \textbf{3.1\%} mIoU), and human pose tracking (+\textbf{10.2\%} / \textbf{11.1\%} PCK@0.1). △ Less

Submitted 29 February, 2024; originally announced February 2024.

Comments: accepted by IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

arXiv:2312.09525 [pdf, other]

Hierarchical Graph Pattern Understanding for Zero-Shot VOS

Authors: Gensheng Pei, Fumin Shen, Yazhou Yao, Tao Chen, Xian-Sheng Hua, Heng-Tao Shen

Abstract: The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow coul… ▽ More The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow could be effectively supplemented by modeling in a structural form. This paper proposes a new hierarchical graph neural network (GNN) architecture, dubbed hierarchical graph pattern understanding (HGPU), for zero-shot video object segmentation (ZS-VOS). Inspired by the strong ability of GNNs in capturing structural relations, HGPU innovatively leverages motion cues (\ie, optical flow) to enhance the high-order representations from the neighbors of target frames. Specifically, a hierarchical graph pattern encoder with message aggregation is introduced to acquire different levels of motion and appearance features in a sequential manner. Furthermore, a decoder is designed for hierarchically parsing and understanding the transformed multi-modal contexts to achieve more accurate and robust results. HGPU achieves state-of-the-art performance on four publicly available benchmarks (DAVIS-16, YouTube-Objects, Long-Videos and DAVIS-17). Code and pre-trained model can be found at \url{https://github.com/NUST-Machine-Intelligence-Laboratory/HGPU}. △ Less

Submitted 14 December, 2023; originally announced December 2023.

Comments: accepted by IEEE Transactions on Image Processing

Journal ref: IEEE Transactions on Image Processing 2023

arXiv:2304.03910 [pdf, other]

doi 10.1109/TIP.2023.3267244

Co-attention Propagation Network for Zero-Shot Video Object Segmentation

Authors: Gensheng Pei, Yazhou Yao, Fumin Shen, Dan Huang, Xingguo Huang, Heng-Tao Shen

Abstract: Zero-shot video object segmentation (ZS-VOS) aims to segment foreground objects in a video sequence without prior knowledge of these objects. However, existing ZS-VOS methods often struggle to distinguish between foreground and background or to keep track of the foreground in complex scenarios. The common practice of introducing motion information, such as optical flow, can lead to overreliance on… ▽ More Zero-shot video object segmentation (ZS-VOS) aims to segment foreground objects in a video sequence without prior knowledge of these objects. However, existing ZS-VOS methods often struggle to distinguish between foreground and background or to keep track of the foreground in complex scenarios. The common practice of introducing motion information, such as optical flow, can lead to overreliance on optical flow estimation. To address these challenges, we propose an encoder-decoder-based hierarchical co-attention propagation network (HCPN) capable of tracking and segmenting objects. Specifically, our model is built upon multiple collaborative evolutions of the parallel co-attention module (PCM) and the cross co-attention module (CCM). PCM captures common foreground regions among adjacent appearance and motion features, while CCM further exploits and fuses cross-modal motion features returned by PCM. Our method is progressively trained to achieve hierarchical spatio-temporal feature propagation across the entire video. Experimental results demonstrate that our HCPN outperforms all previous methods on public benchmarks, showcasing its effectiveness for ZS-VOS. △ Less

Submitted 8 April, 2023; originally announced April 2023.

Comments: accepted by IEEE Transactions on Image Processing

arXiv:2207.08485 [pdf, other]

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

Authors: Gensheng Pei, Fumin Shen, Yazhou Yao, Guo-Sen Xie, Zhenmin Tang, Jinhui Tang

Abstract: Optical flow is an easily conceived and precious cue for advancing unsupervised video object segmentation (UVOS). Most of the previous methods directly extract and fuse the motion and appearance features for segmenting target objects in the UVOS setting. However, optical flow is intrinsically an instantaneous velocity of all pixels among consecutive frames, thus making the motion features not alig… ▽ More Optical flow is an easily conceived and precious cue for advancing unsupervised video object segmentation (UVOS). Most of the previous methods directly extract and fuse the motion and appearance features for segmenting target objects in the UVOS setting. However, optical flow is intrinsically an instantaneous velocity of all pixels among consecutive frames, thus making the motion features not aligned well with the primary objects among the corresponding frames. To solve the above challenge, we propose a concise, practical, and efficient architecture for appearance and motion feature alignment, dubbed hierarchical feature alignment network (HFAN). Specifically, the key merits in HFAN are the sequential Feature AlignMent (FAM) module and the Feature AdaptaTion (FAT) module, which are leveraged for processing the appearance and motion features hierarchically. FAM is capable of aligning both appearance and motion features with the primary object semantic representations, respectively. Further, FAT is explicitly designed for the adaptive fusion of appearance and motion features to achieve a desirable trade-off between cross-modal features. Extensive experiments demonstrate the effectiveness of the proposed HFAN, which reaches a new state-of-the-art performance on DAVIS-16, achieving 88.7 $\mathcal{J}\&\mathcal{F}$ Mean, i.e., a relative improvement of 3.5% over the best published result. △ Less

Submitted 19 July, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: Accepted by ECCV-2022

arXiv:2201.00628 [pdf]

An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Authors: Shujie Wang, Gongshu Wang, Guangying Pei

Abstract: As the second most common neurodegenerative disease, Parkinson's disease has caused serious problems worldwide. However, the cause and mechanism of PD are not clear, and no systematic early diagnosis and treatment of PD have been established. Many patients with PD have not been diagnosed or misdiagnosed. In this paper, we proposed an EEG-based approach to diagnosing Parkinson's disease. It mapped… ▽ More As the second most common neurodegenerative disease, Parkinson's disease has caused serious problems worldwide. However, the cause and mechanism of PD are not clear, and no systematic early diagnosis and treatment of PD have been established. Many patients with PD have not been diagnosed or misdiagnosed. In this paper, we proposed an EEG-based approach to diagnosing Parkinson's disease. It mapped the frequency band energy of electroencephalogram(EEG) signals to 2-dimensional images using the interpolation method and identified classification using capsule network(CapsNet) and achieved 89.34% classification accuracy for short-term EEG sections. A comparison of separate classification accuracy across different EEG bands revealed the highest accuracy in the gamma bands, suggesting that we need to pay more attention to the changes in gamma band changes in the early stages of PD. △ Less

Submitted 11 January, 2022; v1 submitted 27 December, 2021; originally announced January 2022.

Comments: 6 pages,2 image, 3 tables

ACM Class: I.2.8

arXiv:2111.02040 [pdf, other]

Three-dimensional Cooperative Localization of Commercial-Off-The-Shelf Sensors

Authors: Yulong Wang, Shenghong Li, Wei Ni, David Abbott, Mark Johnson, Guangyu Pei, Mark Hedley

Abstract: Many location-based services use Received Signal Strength (RSS) measurements due to their universal availability. In this paper, we study the association of a large number of low-cost Internet-of-Things (IoT) sensors and their possible installation locations, which can enable various sensing and automation-related applications. We propose an efficient approach to solve the corresponding permutatio… ▽ More Many location-based services use Received Signal Strength (RSS) measurements due to their universal availability. In this paper, we study the association of a large number of low-cost Internet-of-Things (IoT) sensors and their possible installation locations, which can enable various sensing and automation-related applications. We propose an efficient approach to solve the corresponding permutation combinatorial optimization problem, which integrates continuous space cooperative localization and permutation space likelihood ascent search. A convex relaxation-based optimization is designed to estimate the coarse locations of blindfolded devices in continuous 3D spaces, which are then projected to the feasible permutation space. An efficient Cramér-Rao Lower Bound based likelihood ascent search algorithm is proposed to refine the solution. Extensive experiments were conducted to evaluate the performance of the proposed approach, which show that the proposed approach significantly outperforms state-of-the-art combinatorial optimization algorithms and achieves close-to-100% accuracy with affordable execution time. △ Less

Submitted 3 November, 2021; originally announced November 2021.

Comments: 10 pages, 12 figures

ACM Class: I.m; I.6.5

arXiv:1208.0811 [pdf, ps, other]

Efficient Algorithms for Maximum Link Scheduling in Distributed Computing Models with SINR Constraints

Authors: Guanhong Pei, Anil Kumar S. Vullikanti

Abstract: A fundamental problem in wireless networks is the maximum link scheduling problem: given a set $L$ of links, compute the largest possible subset $L'\subseteq L$ of links that can be scheduled simultaneously without interference. This problem is particularly challenging in the physical interference model based on SINR constraints (referred to as the SINR model), which has gained a lot of interest i… ▽ More A fundamental problem in wireless networks is the maximum link scheduling problem: given a set $L$ of links, compute the largest possible subset $L'\subseteq L$ of links that can be scheduled simultaneously without interference. This problem is particularly challenging in the physical interference model based on SINR constraints (referred to as the SINR model), which has gained a lot of interest in recent years. Constant factor approximation algorithms have been developed for this problem, but low complexity distributed algorithms that give the same approximation guarantee in the SINR model are not known. Distributed algorithms are especially challenging in this model, because of its non-locality. In this paper, we develop a set of fast distributed algorithms in the SINR model, providing constant approximation for the maximum link scheduling problem under uniform power assignment. We find that different aspects of available technology, such as full/half-duplex communication, and non-adaptive/adaptive power control, have a significant impact on the performance of the algorithm; these issues have not been explored in the context of distributed algorithms in the SINR model before. Our algorithms' running time is $O(g(L) \log^c m)$, where $c=1,2,3$ for different problem instances, and $g(L)$ is the "link diversity" determined by the logarithmic scale of a communication link length. Since $g(L)$ is small and remains in a constant range in most cases, our algorithms serve as the first set of "sublinear" time distributed solution. The algorithms are randomized and crucially use physical carrier sensing in distributed communication steps. △ Less

Submitted 16 November, 2012; v1 submitted 3 August, 2012; originally announced August 2012.

arXiv:1206.1113 [pdf, ps, other]

A Fast Distributed Approximation Algorithm for Minimum Spanning Trees in the SINR Model

Authors: Maleq Khan, V. S. Anil Kumar, Gopal Pandurangan, Guanhong Pei

Abstract: A fundamental problem in wireless networks is the \emph{minimum spanning tree} (MST) problem: given a set $V$ of wireless nodes, compute a spanning tree $T$, so that the total cost of $T$ is minimized. In recent years, there has been a lot of interest in the physical interference model based on SINR constraints. Distributed algorithms are especially challenging in the SINR model, because of the no… ▽ More A fundamental problem in wireless networks is the \emph{minimum spanning tree} (MST) problem: given a set $V$ of wireless nodes, compute a spanning tree $T$, so that the total cost of $T$ is minimized. In recent years, there has been a lot of interest in the physical interference model based on SINR constraints. Distributed algorithms are especially challenging in the SINR model, because of the non-locality of the model. In this paper, we develop a fast distributed approximation algorithm for MST construction in an SINR based distributed computing model. For an $n$-node network, our algorithm's running time is $O(D\log{n}+μ\log{n})$ and produces a spanning tree whose cost is within $O(\log n)$ times the optimal (MST cost), where $D$ denotes the diameter of the disk graph obtained by using the maximum possible transmission range, and $μ=\log{\frac{d_{max}}{d_{min}}}$ denotes the "distance diversity" w.r.t. the largest and smallest distances between two nodes. (When $\frac{d_{max}}{d_{min}}$ is $n$-polynomial, $μ= O(\log n)$.) Our algorithm's running time is essentially optimal (upto a logarithmic factor), since computing {\em any} spanning tree takes $Ω(D)$ time; thus our algorithm produces a low cost spanning tree in time only a logarithmic factor more than the time to compute a spanning tree. The distributed scheduling complexity of the spanning tree resulted from our algorithm is $O(μ\log n)$. Our algorithmic design techniques can be useful in designing efficient distributed algorithms for related "global" problems in wireless networks in the SINR model. △ Less

Submitted 5 June, 2012; originally announced June 2012.

ACM Class: C.2.4; F.2.2

Showing 1–15 of 15 results for author: Pei, G