Shailesh Kumar

Hyderabad, Telangana, India

33K followers 500+ connections

Jio

The University of Texas at Austin

About

We are living in very interesting times - the era of digital connectivity, AI, Internet…

Volunteer Experience

Trainer

Heartfulness.org

Social Services

Heartfulness is a Meditation technique that can be practiced with or without "Yogic Transmission". See www.heartfulness.org for more details.
Editorial Panel for Fifth Elephant Conference

HasGeek

Apr 2015 - Present 9 years 4 months

Science and Technology

Helped review submissions for Fifth Elephant Conference - 2014, 2015, 2016

Publications

Class Vectors : Embedding representation of Document Classes

Arxiv.org Aug 2015
Distributed representations of words and paragraphs as semantic embeddings in high dimensional data are used across a number of Natural Language Understanding tasks such as retrieval, translation, and classification. In this work, we propose "Class Vectors" - a framework for learning a vector per class in the same embedding space as the word and paragraph embeddings. Similarity between these class vectors and word vectors are used as features to classify a document to a class. In experiment on several sentiment analysis tasks such as Yelp reviews and Amazon electronic product reviews, class vectors have shown better or comparable results in classification while learning very meaningful class embeddings.

Compacting Large and Loose Communities

Asian Conference on Pattern Recognition August 11, 2013
Detecting compact overlapping communities in large networks is an important pattern recognition problem with applications in many domains. Most community detection algorithms trade off between community sizes, their compactness and the scalability of finding communities.
Clique Percolation Method (CPM) and Local Fitness Maximization (LFM) are two prominent and commonly used overlapping community detection methods that scale with large networks. However, a significant number of communities found by them are large, noisy, and loose. In this paper, we propose a general algorithm that takes such large and loose communities generated by any method and refines them into compact communities in a systematic fashion. We define a new measure of community-ness based on eigenvector centrality, identify loose communities using this measure and propose an algorithm for partitioning such loose communities into compact communities. We refine the communities found by CPM and LFM using our method and show their effectiveness compared to the original communities in a recommendation engine task.

Image Annotation in Presence of Noisy Labels

Pattern Recognition and Machine Intelligence August 1, 2013
Labels associated with social images are a valuable source of information for tasks of image annotation, understanding and retrieval. These labels are often found to be noisy, mainly due to the collaborative tagging activities of users. Existing methods on annotation have been developed and verified on noise-free labels of images. In this paper, we propose a novel and generic framework that exploits the collective knowledge embedded in noisy label co-occurrence pairs to derive robust annotations. We compare our method with a well-known image annotation algorithm and show its superiority in terms of annotation accuracy on benchmark Corel5K and ESP datasets in presence of noisy labels.

Learning Multiple Non-Linear Sub-Spaces using K-RBMs

2013 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, USA June 27, 2013
Understanding the nature of data is the key to building good representations. In domains such as natural images, the data comes from very complex distributions which are hard to capture. Feature learning intends to discover or best approximate these underlying distributions and use their knowledge to weed out irrelevant information, preserving most of the relevant information. Feature learning can thus be seen as a form of dimensionality reduction. In this paper, we describe a feature learning scheme for natural images. We hypothesize that image patches do not all come from the same distribution; they lie in multiple nonlinear subspaces. We propose a framework that uses K-Restricted Boltzmann Machines (K-RBMs) to learn multiple non-linear subspaces in the raw image space. Projections of the image patches into these subspaces give us features, which we use to build image representations. Our algorithm solves the coupled problem of finding the right non-linear subspaces in the input space and associating image patches with those subspaces in an iterative EM-like algorithm to minimize the overall reconstruction error. Extensive empirical results over several popular image classification datasets show that representations based on our framework outperform the traditional feature representations such as the SIFT based Bag-of-Words (BoW) and convolutional deep belief networks.

Logical Itemset Mining

2012 IEEE 12th International Conference on Data Mining Workshops, Brussels, Belgium December 10, 2012
Frequent Itemset Mining (FISM) attempts to find large and frequent itemsets in bag-of-items data such as retail market baskets. Such data has two properties that are not naturally addressed by FISM: (i) a market basket might contain items from more than one customer intent (mixture property) and (ii) only a subset of items related to a customer intent are present in most market baskets (projection property). We propose a simple and robust framework called LOGICAL ITEMSET MINING (LISM) that treats each market basket as a mixture-of, projections-of, latent customer intents. LISM attempts to discover logical itemsets from such bag-of-items data. Each logical itemset can be interpreted as a latent customer intent in retail or semantic concept in text tagsets. While the mixture and projection properties are easy to appreciate in retail domain, they are present in almost all types of bag-of-items data. Through experiments on two large datasets, we demonstrate the quality, novelty, and actionability of logical itemsets discovered by the simple, scalable, and aggressively noise-robust LISM framework. We conclude that while FISM discovers a large number of noisy, observed, and frequent itemsets, LISM discovers a small number of high quality, latent logical itemsets.

Learning Hierarchical Bag of Words using Naive Bayes Clustering

2012 - 11th Asian Conference on Computer Vision, Daejeon, Korea. November 5, 2012
Image analysis tasks such as classification, clustering, detection, and retrieval are only as good as the feature representation of the images they use. Much research in computer vision is focused on finding better or semantically richer image representations. Bag of visual Words (BoW) is a representation that has emerged as an effective one for a variety of computer vision tasks. BoW methods traditionally use low level features. We have devised a strategy to use these low level features to create "higher level" features by making use of the spatial context in images. In this paper, we propose a novel hierarchical feature learning framework that uses a Naive Bayes Clustering algorithm to convert a 2-D symbolic image at one level to a 2-D symbolic image at the next level with richer features. On two popular datasets, Pascal VOC 2007 and Caltech 101, we empirically show that classification accuracy obtained from the hierarchical features computed using our approach is significantly higher than the traditional SIFT based BoW representation of images even though our image representations are more compact.

Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis

Pattern Analysis and Applications, spl. Issue on Fusion of Multiple Classifiers Jun 2002
Many classification problems involve high dimensional inputs and a large number of classes. Multiclassifier fusion approaches to such difficult problems typically centre around smart feature extraction, input resampling methods, or input space partitioning to exploit modular learning. In this paper, we investigate how partitioning of the output space (i.e. the set of class labels) can be exploited in a multiclassifier fusion framework to simplify such problems and to yield better solutions. Specifically, we introduce a hierarchical technique to recursively decompose a C-class problem into C-1 two-(meta) class problems. A generalised modular learning framework is used to partition a set of classes into two disjoint groups called meta-classes. The coupled problems of finding a good partition and of searching for a linear feature extractor that best discriminates the resulting two meta-classes are solved simultaneously at each stage of the recursive algorithm. This results in a binary tree whose leaf nodes represent the original C classes. The proposed hierarchical multiclassifier framework is particularly effective for difficult classification problems involving a moderately large number of classes. The proposed method is illustrated on a problem related to classification of landcover using hyperspectral data: a 12-class AVIRIS subset with 180 bands. For this problem, the classification accuracies obtained were superior to most other techniques developed for hyperspectral classification. Moreover, the class hierarchies that were automatically discovered conformed very well with human domain experts’ opinions, which demonstrates the potential of using such a modular learning approach for discovering domain knowledge automatically from data.

Error based criterion for on-line wavelet data compression

Journal of Process Control Dec 2001
Wavelet based data compression methods have demonstrated superior performance over the conventional interpolative methods. However, the wavelet based methods need thresholding on the wavelet domain coefficients. Since wavelet coefficients are not commonly intuitive to engineers, significant a priori knowledge of either the wavelet coefficients or process thresholds is required. So unless thresholds are pre-specified, this requirement makes wavelets unsuitable for on-line implementations. Furthermore, as the relation between the wavelet domain coefficients and the measures of the quality of compression [root mean square error (RMSE) and local point error (LPE)] is not straightforward, it is difficult to achieve good control over the quality of compression by specifying thresholds on the wavelet coefficients. In this paper, an error based criterion is proposed for online wavelet data compression. It uses semantically straightforward measures of the quality of the result to be obtained to adaptively calculate the thresholds. Given a bound on time domain error limits like the RMSE and LPE, this technique adaptively computes the threshold values in wavelet domain. Experiments show that the resulting algorithm gives superior compression as compared to other wavelet based methods. Most importantly, it can be used on-line and provides an effective way of controlling LPE and RMSE. Finally, this method can easily be extended to other on-line wavelet applications such as data rectification and de-noising.

Best-bases feature extraction algorithms for classification of hyperspectral data

IEEE Transactions on Geoscience and Remote Sensing. Jul 2001
Due to advances in sensor technology, it is now possible to acquire hyperspectral data simultaneously in hundreds of bands. Algorithms that both reduce the dimensionality of the data sets and handle highly correlated bands are required to exploit the information in these data sets effectively. The authors propose a set of best-bases feature extraction algorithms that are simple, fast, and highly effective for classification of hyperspectral data. These techniques intelligently combine subsets of adjacent bands into a smaller number of features. Both top-down and bottom-up algorithms are proposed. The top-down algorithm recursively partitions the bands into two (not necessarily equal) sets of bands and then replaces each final set of bands by its mean value. The bottom-up algorithm builds an agglomerative tree by merging highly correlated adjacent bands and projecting them onto their Fisher direction, yielding high discrimination among classes. Both these algorithms are used in a pairwise classifier framework where the original C-class problem is divided into a set of (C choose 2) two-class problems. The new algorithms (1) find variable length bases localized in wavelength, (2) favor grouping highly correlated adjacent bands that, when merged either by taking their mean or Fisher linear projection, yield maximum discrimination, and (3) seek orthogonal bases for each of the (C choose 2) two-class problems into which a C-class problem can be decomposed. Experiments on an AVIRIS data set for a 12-class problem show significant improvements in classification accuracies while using a much smaller number of features.

A Bayesian Pairwise Classifier for Character Recognition

Cognitive and Neural Models for Word Recognition and Document Processing, 2000
We develop a Bayesian Pairwise Classifier framework that is suitable for pattern recognition problems involving a moderately large number of classes, and apply it to two character recognition datasets. A C-class pattern recognition problem (e.g., C = 26 for recognition of the English alphabet) is divided into a set of (C choose 2) two-class problems. For each pair of classes, a Bayesian classifier based on a mixture of Gaussians (MOG) is used to model the probability density functions conditioned on a single feature. A forward feature selection algorithm is then used to grow the feature space, and an efficient technique is developed to obtain a MOG in the larger feature space from the MOGs in the smaller spaces. Apart from improvements in classification accuracy, the proposed architecture also provides valuable domain knowledge such as identifying what features are most important in separating a pair of characters, relative distance between any two characters, etc.
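
As a rough illustration of the pairwise decomposition described in this abstract, the sketch below enumerates the two-class problems for the 26-letter example. The helper name `pairwise_problems` is hypothetical and is not taken from the paper; the sketch only counts and lists class pairs and does not implement the Bayesian classifiers themselves.

```python
from itertools import combinations
from math import comb
import string


def pairwise_problems(classes):
    """Return every unordered pair of distinct classes (hypothetical helper)."""
    return list(combinations(classes, 2))


letters = list(string.ascii_uppercase)  # C = 26 classes, as in the abstract's example
pairs = pairwise_problems(letters)

# The number of two-class problems is C choose 2 = C * (C - 1) / 2.
print(len(pairs), comb(len(letters), 2))
```

For C = 26 this gives 325 class pairs, each of which would get its own two-class classifier in a pairwise framework.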

A Hierarchical Multiclassifier System for Hyperspectral Data Analysis

Lecture Notes in Computer Science, Vol. 1857 2000
Many real world classification problems involve high dimensional inputs and a large number of classes. Feature extraction and modular learning approaches can be used to simplify such problems. In this paper, we introduce a hierarchical multiclassifier paradigm in which a C-class problem is recursively decomposed into C-1 two-class problems. A generalized modular learning framework is used to partition a set of classes into two disjoint groups called meta-classes. The coupled problems of finding a good partition and of searching for a linear feature extractor that best discriminates the resulting two meta-classes are solved simultaneously at each stage of the recursive algorithm. This results in a binary tree whose leaf nodes represent the original C classes. The proposed hierarchical multiclassifier architecture was used to classify 12 types of landcover from 183-dimensional hyperspectral data. The classification accuracy was significantly improved by 4 to 10% relative to other feature extraction and modular learning approaches. Moreover, the class hierarchy that was automatically discovered conformed very well with a human domain expert's opinion, which demonstrates the potential of such a modular learning approach for discovering domain knowledge automatically from data.

Multiresolution feature extraction for pairwise classification

SPIE Conf. on Applications of Artificial Neural Networks in Image Processing V 2000

On-line data compression and error analysis using wavelet technology

AIChE Journal Jan 2000
Wavelet representation of a signal is efficient for process data compression. An on-line compression algorithm based on Haar wavelets is proposed here. As a new data point arrives, the algorithm computes all the approximation coefficients and updates the multiresolution tree before it prepares to receive the next data point. An efficient bookkeeping and indexing scheme improves compression ratio more significantly than batch-mode wavelet compression. Reconstruction algorithms and historian format for this bookkeeping are developed. Various analytical results on the bounds on compression ratio and sum of the square error that can be achieved using this algorithm are derived. Experimental evaluation over two sets of plant data shows that wavelet compression is superior to conventional interpolative methods (such as boxcar, backward slope, and SLIM3) in terms of quality of compression measured both in time and frequency domain and that the proposed on-line wavelet compression algorithm performs better than the batch-mode wavelet compression algorithm due to the efficient indexing and bookkeeping scheme. The on-line algorithm combines the high quality of compression of wavelet-based methods and on-line implementation of interpolative compression algorithms at the same time.
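
The per-point update of the multiresolution tree described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's algorithm: the class name `OnlineHaar` is hypothetical, the averaging form of the Haar transform (approximation = mean of a pair, detail = half the difference) is an assumption, and the thresholding, bookkeeping, and historian format from the paper are omitted.

```python
class OnlineHaar:
    """Incremental Haar decomposition (hypothetical sketch, averaging form)."""

    def __init__(self):
        self.pending = {}  # level -> approximation waiting for its sibling
        self.details = {}  # level -> detail coefficients produced so far

    def push(self, x):
        """Consume one data point and update every level it completes."""
        level, value = 0, float(x)
        # Each time a pair is completed at a level, emit a detail coefficient
        # and carry the pair's average up to the next level.
        while level in self.pending:
            left = self.pending.pop(level)
            self.details.setdefault(level, []).append((left - value) / 2.0)
            value = (left + value) / 2.0
            level += 1
        self.pending[level] = value


haar = OnlineHaar()
for point in [1, 3, 5, 7]:
    haar.push(point)
print(haar.details, haar.pending)
```

Each arriving point touches at most one pending value per level, so the tree is fully updated before the next point is read, which is the property the abstract relies on for on-line use.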

Fusion of airborne polarimetric and interferometric SAR for classification of coastal environments

IEEE Transactions on Geoscience and Remote Sensing May 1999
AIRSAR and TOPSAR data were acquired over the wetlands of Bolivar Peninsula along the Gulf coast of Texas for mapping land cover types and topographic features such as beach ridges, dunes, and relict storm features. Classification of land cover over this wetlands and uplands environment is difficult because of the similarity of spectral signatures of the vegetation types. In addition, because the distribution of vegetation communities in coastal marshes is strongly related to salinity, which in turn is largely dictated by frequency and duration of inundation, surface topography is critical to determination of the vegetation characteristics at any location. The potential advantages of multisensor classification, including, in particular, topographic information from a TOPSAR DEM, are investigated. An approach which employs a class dependent feature selection procedure in conjunction with pairwise Bayesian classifiers is proposed and applied to the polarimetric and interferometric SAR data.

A versatile framework for labelling imagery with a large number of classes

International Joint Conference on Neural Networks 1999
Conventional methods for feature selection use some kind of separability criteria or classification accuracy for computing the relevance of a feature subset to the classification task. In two-class problems, this approach may be suitable, but for problems such as character recognition with 26 classes, these feature selection algorithms are often faced with complex tradeoffs among efficacy of features for separating different subsets of classes. We propose a class-pair based feature selection algorithm which, in conjunction with a mixture modeling technique, provides significantly superior results for differentiating a large number of classes, even when the class priors vary considerably. This technique is applied to multisensor NASA/JPL remote sensing AIRSAR data for characterizing 11 types of land cover. The proposed polychotomous approach not only gives improved test accuracy, but also reduces the number of features used. Important domain information can be derived from the features selected for different class pairs and the distance measure between these class pairs.

Confidence Based Dual Reinforcement Q-Routing: An Adaptive On-Line Routing Algorithm

16th International Joint Conference on Artificial Intelligence (IJCAI-99) 1999
This paper describes and evaluates the Confidence-based Dual Reinforcement Q-Routing algorithm (CDRQ-Routing) for adaptive packet routing in communication networks. CDRQ-Routing is based on an application of the Q-learning framework to network routing, as first proposed by Littman and Boyan (1993). The main contribution of CDRQ-routing is an increased quantity and an improved quality of exploration. Compared to Q-Routing, the state-of-the-art adaptive Bellman-Ford Routing algorithm, and the non-adaptive shortest path method, CDRQ-Routing learns superior policies significantly faster. Moreover, the overhead due to exploration is shown to be insignificant compared to the improvements achieved, which makes CDRQ-Routing a practical method for real communication networks.

GAMLS: A Generalized framework for Associative Modular Learning Systems

Applications and Science of Computational Intelligence II 1999

Confidence Based Q-Routing: An On-Line Adaptive Network Routing Algorithm

Smart Engineering Systems: Neural Networks, Fuzzy Logic, Data Mining, and Evolutionary Programming, 8 1998
This paper describes and evaluates how confidence values can be used to improve the quality of exploration in Q-Routing for adaptive packet routing in communication networks. In Q-Routing each node in the network has a routing decision maker that adapts, on-line, to learn routing policies that can sustain high network loads and have low average packet delivery time. These decision makers maintain their view of the network in terms of Q values which are updated as the routing takes place. In Confidence based Q-Routing (CQ-Routing), the improved implementation of Q-Routing with confidence values, each Q value is attached with a confidence (C value) which is a measure of how closely the corresponding Q value represents the current state of the network. While the learning rate in Q-Routing is a constant, the learning rate in CQ-Routing is computed as a function of confidence values of the old and estimated Q values for each update. If either the old Q value has a low confidence or the estimated Q value has a high confidence, the learning rate is high. The quality of exploration is improved in CQ-Routing as a result of this variable learning rate. Experiments over several network topologies have shown that at low and medium loads, CQ-Routing learns the adequate routing policies significantly faster than Q-Routing, and at high loads, it learns routing policies that are significantly better than those learned by Q-Routing in terms of average packet delivery time. CQ-Routing is able to sustain higher network loads than Q-Routing, non-adaptive shortest-path routing and adaptive Bellman-Ford Routing. Finally, CQ-Routing was found to adapt significantly faster than Q-Routing to changes in network topology.
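
The variable learning rate described in this abstract can be illustrated with a small sketch. The abstract states only that the rate is high when the old Q value has low confidence or the estimated Q value has high confidence; the specific function `max(c_est, 1 - c_old)` used below is an assumption chosen to have that property, and the function names are hypothetical. Confidence values are assumed to lie in [0, 1].

```python
def learning_rate(c_old, c_est):
    """Assumed rate function: high if c_old is low or c_est is high."""
    return max(c_est, 1.0 - c_old)


def cq_update(q_old, q_est, c_old, c_est):
    """Move the old Q value toward the estimate by a confidence-dependent step."""
    eta = learning_rate(c_old, c_est)
    return q_old + eta * (q_est - q_old)


# Low confidence in the old value: the update mostly trusts the new estimate.
print(cq_update(q_old=10.0, q_est=20.0, c_old=0.1, c_est=0.2))
# High confidence in the old value and low confidence in the estimate: small step.
print(cq_update(q_old=10.0, q_est=20.0, c_old=0.9, c_est=0.1))
```

With a constant learning rate, as in plain Q-Routing, both calls would move the Q value by the same amount; here the step size depends on how much each value is trusted.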

Dual Reinforcement Q-Routing: An On-Line Adaptive Routing Algorithm

Smart Engineering Systems: Neural Networks, Fuzzy Logic, Data Mining, and Evolutionary Programming 7 1997
On-Line Adaptation Of A Signal Predistorter Through Dual Reinforcement Learning

13th International Conference on Machine Learning 1996
Several researchers have demonstrated how neural networks can be trained to compensate for nonlinear signal distortion in, for example, digital satellite communications systems. These networks, however, require that both the original signal and its distorted version be known. Therefore, they have to be trained off-line, and they cannot adapt to changing channel characteristics. In this paper, a novel dual reinforcement learning approach is proposed that can adapt on-line while the system is performing. Assuming that the channel characteristics are the same in both directions, two predistorters, one at each end of the communication channel, co-adapt using the output of the other predistorter to determine their own reinforcement. Using the common Volterra Series model to simulate the channel, the system is shown to successfully learn to compensate for distortions up to 30%, which is significantly higher than what might be expected in an actual channel.


Patents

Method and System for Maximizing Ride-Sharing Bookings

Issued July 13, 2021 US 11062237
A method and a system for maximizing share-ride bookings in a geographical area in a ride-sharing system are provided. Historical share-ride demands for the geographical area are estimated. A time period is segmented into time intervals such that each time interval has an equal count of the estimated historical share-ride demands. A conversion rate and a gross merchandise value (GMV) per unit of distance are determined for a first time interval at a check point of a second time interval. Error signals are generated at the check point based on deviations in the conversion rate and the GMV per unit of distance with respect to a defined conversion rate and a defined GMV per unit of distance, respectively. A share-ride fare in the second time interval is controlled based on the error signals to maximize the share-ride bookings during the second time interval.
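The segmentation step in this abstract (time intervals that each hold an equal count of estimated historical demands) can be sketched in Python as follows. This is only an illustration under stated assumptions: demands are represented as a list of timestamps, the number of intervals divides the demand count evenly, and the function name `equal_count_intervals` is hypothetical. The fare-control step is not shown because the abstract does not specify the control law.

```python
from typing import List, Tuple


def equal_count_intervals(timestamps: List[float], k: int) -> List[Tuple[float, float]]:
    """Split demand timestamps into k intervals with the same number of demands.

    Returns (first, last) timestamp of each interval. Assumes len(timestamps)
    is a positive multiple of k; this simplification is ours, not the patent's.
    """
    ts = sorted(timestamps)
    n = len(ts)
    if k <= 0 or n == 0 or n % k != 0:
        raise ValueError("need a positive k that evenly divides the demand count")
    size = n // k
    return [(ts[i * size], ts[(i + 1) * size - 1]) for i in range(k)]


if __name__ == "__main__":
    # Six demands, three intervals: busy periods get short intervals,
    # quiet periods get long ones.
    print(equal_count_intervals([1, 2, 3, 10, 11, 30], 3))
```

Equal-count intervals are narrow when demand is dense and wide when it is sparse, so each check point sees a comparable amount of data.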

Method and System for Location Clustering for Transportation Services

Issued November 20, 2020 US 10846314

A method and a system for location clustering for a transportation service are provided. A plurality of locations are clustered into a plurality of clusters, each having one or more locations of the plurality of locations. A graph is generated by connecting the plurality of clusters. A first cluster of the plurality of clusters is connected to one or more second clusters of the plurality of clusters that satisfy one or more threshold parameters. The graph is segmented into a plurality of fully-connected maximal sub-graphs based on one or more connections between the plurality of clusters. One or more fully-connected maximal sub-graphs of the plurality of fully-connected maximal sub-graphs have a set of common clusters. The plurality of fully-connected maximal sub-graphs are used for performing one or more transportation service operations of the transportation service.
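A fully-connected maximal sub-graph is a maximal clique, and two maximal cliques may share clusters, which matches the "set of common clusters" in this abstract. The Python sketch below enumerates maximal cliques with the basic Bron-Kerbosch algorithm. The choice of algorithm is an assumption (the abstract names none), and the adjacency map `adj` and function name `maximal_cliques` are hypothetical. The graph is assumed to be undirected, with edges already filtered by the threshold parameters.

```python
from typing import Dict, FrozenSet, List, Set


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Enumerate maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    cliques: List[FrozenSet[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates that extend it, x: already explored.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


if __name__ == "__main__":
    # Four location clusters; A and D are not connected to each other.
    adj = {
        "A": {"B", "C"},
        "B": {"A", "C", "D"},
        "C": {"A", "B", "D"},
        "D": {"B", "C"},
    }
    for clique in maximal_cliques(adj):
        print(sorted(clique))
```

In this example the two maximal cliques are {A, B, C} and {B, C, D}, which overlap in clusters B and C.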

Method for Text Classification and Feature Selection Using Class Vectors and the System Thereof

Filed December 13, 2018 US 20180357531A1
A method for text classification and feature selection using class vectors, comprising the steps of receiving a text/training corpus including a plurality of training features representing a plurality of objects from a plurality of classes; learning a vector representation for each of the classes along with word vectors in the same embedding space; training the class vectors and word vectors jointly using a skip-gram approach; performing class-vector-based scoring for a particular feature; and performing feature selection based on class vectors.
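Because class vectors and word vectors share one embedding space, a feature can be scored against a class by comparing the two vectors. The Python sketch below uses cosine similarity for that score and keeps the top-scoring words as selected features. Cosine similarity, the toy vectors, and the function names `cosine` and `top_features` are all assumptions made for illustration; the abstract does not specify the scoring function. Vectors are assumed to be nonzero.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two nonzero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_features(class_vec: Sequence[float],
                 word_vecs: Dict[str, Sequence[float]],
                 k: int) -> List[str]:
    """Rank words by similarity to the class vector and keep the best k."""
    ranked = sorted(word_vecs,
                    key=lambda w: cosine(class_vec, word_vecs[w]),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings, invented for this example only.
    positive_class = [1.0, 0.0]
    words = {"great": [0.9, 0.1], "awful": [-1.0, 0.2], "the": [0.1, 1.0]}
    print(top_features(positive_class, words, 2))
```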

Overlapping Community Detection in Weighted Graphs

Issued August 16, 2016 US 9418142

The disclosure includes a system and method for detecting communities in a weighted graph. The community detection module includes a tagset data aggregator, a counts statistics engine, a weighted graph generator, a coherence engine, a community detector, and a tag recommendation engine. The tagset data aggregator receives tagset data. The counts statistics engine determines counts statistics for the tagset data. The weighted graph generator generates and denoises a weighted tag co-occurrence graph based on the counts statistics. The coherence engine determines an importance score for all tags and a coherence score for all tagsets in the tagset data. The community detector determines maximally coherent communities in the weighted tag co-occurrence graph. The tag recommendation engine recommends tags in real time using the maximally coherent communities.

Adding document filters to an existing cluster hierarchy

Issued February 23, 2016 US 9268844
Contextual weighting of words in a word grouping

Issued December 1, 2015 US 9201876
Phrase identification in a sequence of words

Issued November 18, 2014 US 8892422

Keyword Suggestion for Efficient Legal E-Discovery

Issued November 4, 2014 US 8880492
Custodian Suggestion for Efficient Legal E-Discovery

Issued September 9, 2014 US 8832126
Product Space Browser

Issued July 1, 2014 US 8768743
Identifying and Redacting Privileged Information

Issued June 10, 2014 US 8752204
Query Suggestion for Efficient Legal E-Discovery

Issued November 12, 2013 US 8583669

Assortment planning based on demand transfers between products

Issued April 9, 2013 US 8417559
Co-occurrence Consistency Analysis method and apparatus for finding predictive variable groups

Issued January 15, 2013 US 8355896
Incremental factorization-based smoothing of sparse multi-dimensional risk tables

Issued March 6, 2012 US 8131615
Method and apparatus for recommendation engine using pair-wise co-occurrence consistency

Issued September 21, 2010 US 7801843
Purchase Sequence Browser

Issued June 14, 2011 US 7962368
Method and apparatus for initiating a transaction based on a bundle-lattice space of feasible product bundles

Issued March 23, 2010 US 7685021
Method and apparatus for retail data mining using pair-wise co-occurrence consistency

Issued March 2, 2010 US 7672865
Recursive on-line wavelet data compression technique for use in data storage and communications

Issued April 10, 2001 US 6215907
Signature Detection in E-Mails

Filed March 30, 2011 US 20120254166

Courses

Artificial Intelligence
Computer Vision
Data Mining
Digital Signal Processing
Image Processing
Machine Learning
Mathematical Logic
Neural Networks
Optimization Algorithms
Parallel Algorithms
Parallel Computer Architecture
Pattern Recognition
Real Time Systems
Robotics

Honors & Awards

50 Most Influential AI Leaders in India

Analytics India Magazine

2021

https://analyticsindiamag.com/50-most-influential-ai-leaders-in-india-2021/
10 Most Influential Analytics Leaders in India

Analytics India Magazine

Apr 2020

https://analyticsindiamag.com/10-most-influential-analytics-leaders-in-india-2020/
Analytics 50 - Top 50 Analytics Leaders in India - 2018

Analytics India Magazine

May 2018

https://www.themachinecon.com/awards/
Top 10 Data Scientists in India - 2015

Analytics India Magazine

Dec 2015

http://analyticsindiamag.com/top-10-data-scientists-in-india-2015/
Yahoo! Team Super Star Award

Yahoo! Inc.

Oct 2009

For the work done at Yahoo! on improving Yahoo! Image Search by more than 10%.
Fair Isaac Innovation Award

Fair Isaac

Jan 2008

For the work done on Retail Data Mining based on Co-occurrence Analytics
President's Gold Medal

IIT Varanasi

Jul 1995

Ranked first in the 1995 graduating class in Computer Science and Engineering.

Recommendations received

25 people have recommended Shailesh


