Dr. Abhay Alok

Hyderabad, Telangana, India

2K followers 500+ connections

Electronic Arts (EA)

Indian Institute of Technology, Patna

About

Working on core critical problems such as enhancing user adoption metrics, customer lifetime…

Activity

Starting with a service-based company, my brother Kapil Ahuja spent four years learning and working hard before finally getting placed at…

Liked by Dr. Abhay Alok
Accelerating LLMs by 2x with Graph-structured Speculative Decoding. Researchers have found a way to make speculative decoding up to 2x faster by…

Liked by Dr. Abhay Alok
A nice tutorial about Semantic Search in the NLP course by Lewis Tunstall from Hugging Face. This course helps me to teach students in Egypt about…

Liked by Dr. Abhay Alok

Experience & Education

Electronic Arts (EA)

******

********* **** ******* *******
****** ******

**** ******* *******
****** ********* ** **********, *****

**.* ******** ******* ***********, ******* ********, **********, ************** ************ ***

2011 - 2016
****** ********* ** *********** **********

****** ** ********** (*.****.) ***** ******** ***********

2008 - 2010

Licenses & Certifications

AMCAT Certified Data Processing Specialist

Aspiring Minds

Issued Sep 2014

Credential ID 154561-211

Publications

Multi-objective semi-supervised clustering of tissue samples for cancer diagnosis

Springer/Soft Computing August 7, 2015

In the domain of bioinformatics, the clustering of gene expression profiles of different tissue samples over different experimental conditions has gained importance with the invention of microarray-based technology. This study also has some impact on cancer diagnosis. The proper classification of cancer tissue samples generated using microarray technology helps in detecting cancers in an automated way. In the current paper we have developed a semi-supervised clustering technique for proper partitioning of these gene expression data sets. Semi-supervised clustering is a combination of unsupervised and supervised classification techniques. It uses some amount of supervised information and a large collection of unsupervised data. Here a multi-objective semi-supervised clustering technique is developed for solving the cancer tissue classification problem. Different combinations of objective functions are used. As supervised information, we assume that class labels of 10% of the data are available. The proposed technique is evaluated on three open-source benchmark cancer data sets (brain tumor, adult malignancy, and small round blue cell tumors). Two classification quality measures, viz., Adjusted Rand Index and Classification Accuracy, are used to measure the goodness of the obtained partitionings. The results are compared with several state-of-the-art clustering techniques. Moreover, significant gene markers have been identified and demonstrated visually from the clustering solutions obtained.

Use of Semi-supervised Clustering and Feature Selection Techniques for Gene-Expression Data

IEEE/IEEE Journal of Biomedical and Health Informatics July 20, 2015

Studying the patterns hidden in gene expression data helps to understand the functionality of genes. In general, clustering techniques are widely used for the identification of natural partitionings from gene expression data. In order to put constraints on dimensionality, feature selection is the key issue because not all features are important from a clustering point of view. Moreover, a limited amount of supervised information can help to fine-tune the obtained clustering solution. In this paper the problem of simultaneous feature selection and semi-supervised clustering is formulated as a multi-objective optimization task. A modern simulated annealing based multiobjective optimization technique, namely AMOSA, is utilized as the background optimization methodology. Here features and cluster centers are represented in the form of a string, and the assignment of points to different clusters is done using a point symmetry based distance. Six optimization criteria based on several internal and external cluster validity indices are utilized. In order to generate the supervised information, a popular clustering technique, Fuzzy C-means, is utilized. The appropriate subset of features, the proper number of clusters, and the proper partitioning are determined using the search capability of AMOSA. The effectiveness of the proposed semi-supervised clustering technique, Semi-FeaClustMOO, is demonstrated on five publicly available benchmark gene expression data sets. Comparison results with the existing techniques for gene expression data clustering reveal the superiority of the proposed technique. Statistical and biological significance tests have also been carried out.

Multi-objective semi-supervised clustering for automatic pixel classification from remote sensing imagery

Springer/ Soft Computing May 16, 2015

Classifying the pixels of satellite images into homogeneous regions is a very challenging task, as different regions have different types of land cover. Some land covers contain more regions, while some contain relatively smaller regions (e.g., bridges, roads). In satellite image segmentation, no prior information is available about the number of clusters. In this paper, we have solved this problem using the concepts of semi-supervised clustering, which utilizes the properties of both unsupervised and supervised classification. Three cluster validity indices are utilized, which are simultaneously optimized using AMOSA, a modern multiobjective optimization technique based on the concepts of simulated annealing. The first two cluster validity indices, the symmetry distance based Sym-index and the Euclidean distance based I-index, are based on unsupervised properties. The last one, the Minkowski index, is a cluster validity index based on supervised information. To obtain the supervised information, the fuzzy C-means clustering technique is used initially. Thereafter, based on the highest membership values of the data points to their respective clusters, 10% of the data points, with their class labels, are chosen at random. The effectiveness of the proposed semi-supervised clustering technique is demonstrated on three satellite image data sets of different cities of India. Results are also compared with existing clustering techniques.
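The label-selection step described in this abstract can be illustrated with a minimal sketch. It assumes a fuzzy membership matrix is already available (one row per data point, one column per cluster) and reads the abstract as: harden each point to its highest-membership cluster, then draw 10% of the points at random. The function name `pick_labeled_subset` and its parameters are hypothetical and are not taken from the paper.

```python
import random

def pick_labeled_subset(membership, fraction=0.10, seed=0):
    """Return {point_index: label} for a random fraction of the points.

    membership: list of rows, each row holding one point's fuzzy
    membership value for every cluster (assumed input, e.g. from fuzzy C-means).
    """
    # Harden the fuzzy partition: each point gets its highest-membership cluster.
    hard_labels = [max(range(len(row)), key=row.__getitem__) for row in membership]
    n_points = len(membership)
    # Keep at least one labeled point even for very small data sets.
    n_labeled = max(1, round(fraction * n_points))
    # Seeded sampling without replacement, so the choice is reproducible.
    chosen = sorted(random.Random(seed).sample(range(n_points), n_labeled))
    return {i: hard_labels[i] for i in chosen}
```

Calling `pick_labeled_subset(u)` on a 20-row membership matrix `u` would return two index-to-label pairs; the paper's actual sampling rule may differ in detail.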

A new semi-supervised clustering technique using multi-objective optimization

Applied Intelligence/ Springer May 5, 2015

Semi-supervised clustering techniques have been proposed in the literature to overcome the problems associated with unsupervised and supervised classification. They consider a small amount of labeled data and the whole data distribution during the process of clustering a data set. In this paper, a new approach towards semi-supervised clustering is implemented using a multiobjective optimization (MOO) framework. Four objective functions are optimized using the search capability of a multiobjective simulated annealing based technique, AMOSA. These objective functions are based on some unsupervised and supervised information. The first three objective functions represent, respectively, the goodness of the partitioning in terms of Euclidean distance, the total symmetry present in the clusters, and the cluster connectedness. For the last objective function, we have considered different external cluster validity indices, including the adjusted Rand index, the Rand index, a newly developed min-max distance based MMI index, the NMMI index, and the Minkowski Score. Results show that the proposed semi-supervised clustering technique can effectively detect the appropriate number of clusters as well as the appropriate partitioning from data sets having either well-separated clusters of any shape or symmetrical clusters with or without overlaps. Twenty-four artificial and five real-life data sets have been used in the evaluation. We develop five different versions of the Semi-GenClustMOO clustering technique by varying the external cluster validity indices. The obtained partitioning results are compared with another recently developed multiobjective semi-supervised clustering technique, Mock-Semi. At the end of the paper the effectiveness of the proposed Semi-GenClustMOO clustering technique is shown in segmenting a remote sensing satellite image of a part of the city of Kolkata.
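Optimizing several objective functions at once, as this abstract describes, generally means comparing candidate solutions by Pareto dominance rather than by a single score. The sketch below illustrates only that general idea; it is not AMOSA itself, and it assumes every objective is to be minimized. The names `dominates` and `non_dominated` are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere.

    Assumption: all objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Keep only the objective vectors that no other vector dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]
```

For example, among the vectors `(1, 4)`, `(2, 2)`, `(3, 3)` and `(4, 1)`, only `(3, 3)` is dropped, because `(2, 2)` is better on both objectives.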

Semi-supervised clustering for gene-expression data in multiobjective optimization framework

Springer/ International Journal of Machine Learning and Cybernetics February 15, 2015

Studying the patterns hidden in gene expression data helps to understand the functionality of genes. But due to the large volume of genes and the complexity of biological networks, it is difficult to study the resulting mass of data, which often consists of millions of measurements. In order to reveal natural structures and to identify interesting patterns in a given gene expression data set, clustering techniques are applied. Semi-supervised classification is a new direction of machine learning. It requires a large amount of unlabeled data and a small amount of labeled data. Semi-supervised classification in general performs better than unsupervised classification. But to the best of our knowledge there is no prior work on solving the gene expression data clustering problem using semi-supervised classification techniques. In the current paper we have made an attempt to solve the gene expression data clustering problem using a multiobjective optimization based semi-supervised classification technique, with the aim of attaining good quality partitions by using a small amount of labeled data. In order to generate the labeled data, the Fuzzy C-means clustering technique is applied initially. In order to automatically determine the partitioning, multiple cluster centers corresponding to a cluster are encoded in the form of a string. In order to compute the quality of the obtained partitioning, values of five objective functions are computed. The effectiveness of the proposed semi-supervised clustering technique is demonstrated on five publicly available benchmark gene expression data sets. Comparison results with the existing techniques for gene expression data clustering show that the proposed method is the most effective one. Statistical and biological significance tests have also been carried out.

A min-max distance based external cluster validity index: MMI

IEEE December 4, 2012

Evaluating a given clustering result is a very difficult problem in the real world. Cluster validity indices are developed for this purpose. There are two different types of cluster validity indices available: external and internal. External cluster validity indices utilize some supervised information, and internal cluster validity indices utilize the intrinsic structure of the data. In this paper a new external cluster validity index, MMI, has been implemented based on the Max-Min distance among data points and prior information based on the structure of the data. A new probabilistic approach has been implemented to find the correct correspondence between the true and obtained clusterings. The genetic K-means algorithm (GAK-means) and single linkage have been used as the underlying clustering techniques. Results of the proposed index for identifying the appropriate number of clusters are shown for five artificial and two real-life data sets. GAK-means and single linkage clustering techniques are used as the underlying partitioning techniques, with the number of clusters varied over a range. The MMI index is then used to determine the appropriate number of clusters. The performance of MMI is compared with existing external cluster validity indices, the adjusted Rand index (ARI) and the Rand index (RI). It works well for two-class and multi-class data sets.
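The two baseline indices named in this abstract, the Rand index and the adjusted Rand index, can be computed by pair counting. The sketch below shows the standard definitions only; it does not implement MMI, whose formula is not given here. It assumes two label sequences of equal length with at least two points, and the function names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import comb

def rand_index(true_labels, pred_labels):
    """Fraction of point pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (pred_labels[i] == pred_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)

def adjusted_rand_index(true_labels, pred_labels):
    """Rand index corrected for chance, computed from the contingency table."""
    n = len(true_labels)
    # Pairs placed together in both partitions, in the true one, in the predicted one.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions put every point in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

Both functions compare partitions rather than label names, so a clustering that merely renames the true classes still scores 1.0.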

Semi-supervised clustering using multiobjective optimization

IEEE December 4, 2012

Feature selection and semi-supervised clustering using multiobjective optimization

SpringerPlus

In this paper we have coupled the feature selection problem with semi-supervised clustering. Semi-supervised clustering utilizes the information of unsupervised and supervised learning in order to overcome the problems related to them. But in general not all the features present in a data set are important for clustering. Thus appropriate selection of features from the set of all features is highly relevant from a clustering point of view. In this paper we have solved the problem of automatic feature selection and semi-supervised clustering using multiobjective optimization. A recently created simulated annealing based multiobjective optimization technique, archived multiobjective simulated annealing (AMOSA), is used as the underlying optimization technique. Here features and cluster centers are encoded in the form of a string. We assume that, for each data set, class label information is known for 10% of the data points. Two internal cluster validity indices reflecting different data properties, an external cluster validity index measuring the similarity between the obtained partitioning and the true labelling for the 10% of data points, and a measure counting the number of features present in a particular string are optimized using the search capability of AMOSA. AMOSA is utilized to detect the appropriate subset of features, the appropriate number of clusters, and the appropriate partitioning for any given data set. The effectiveness of the proposed semi-supervised feature selection technique as compared to the existing techniques is shown for seven real-life data sets of varying complexities.

Honors & Awards

Technical program committee of International Conference

Jun 2015

Reviewer for international conferences including ICACCI 2015, IEEE SPICES 2015, ICCME 2015, VisioNet 2015, Confluence 2013, and CIMTA 2013.
Organising member of workshop on Optimization Technique for Language technology

IIT Bombay, Coling Conference

Dec 2012

Held under the aegis of COLING 2012, the 24th International Conference on Computational Linguistics.
Organizing member of Indo-Australia workshop on optimization Technique for Human Language Technology

India-Australia

Dec 2012

The aim of the workshop is to bring together the communities who are working in the areas of: evolutionary computation, optimization techniques, machine learning, language technology/Natural Language Processing, information retrieval, text mining. The workshop will be a starting platform to explore the possibilities of interdisciplinary research works that will focus on developing optimization based methods on the above fields within the context of human language technology. Almost all the research and development activities in human language technology rely on the high level of performance to satisfy the users' intended needs, and have to deal with many objectives and parameters. For example, in Information Retrieval, it is often necessary to optimize the recall and precision parameters. In automatic summarisation, it is desired to optimize different objective functions like similarity to user query, ROUGE metric, important sentence score, and difference in length between the scored sentence and the desired sentence and many others. Other examples of optimization in NLP include parsing, machine translation, and computational models of language acquisition.
MHRD Scholarship

Govt of India

Jul 2011

Received a four-year MHRD scholarship (2011-2015) for completion of the Doctor of Philosophy degree.
MHRD Scholarship

Govt Of India

Jul 2008

Received a two-year MHRD scholarship (2008-2010) for completion of the Master of Technology degree.
Reviewer of SCI Journal

IEEE/Springer/PLOS ONE

Reviewer for IEEE/ACM Transactions on Computational Biology, Intelligent Service Robotics (JIST), Springer, PLOS ONE, IJMLC, and Environment and Earth Sciences.

Languages

English

Professional working proficiency
Hindi

Full professional proficiency
Sanskrit

Limited working proficiency
Bhojpuri

Professional working proficiency
Bengali

Elementary proficiency
Maithili

Elementary proficiency

Organizations

IEEE

Student Member

Feb 2014 - Present

More activity by Dr. Abhay

IIT Bombay has signed an MoU with the Centre for Railway Information System (CRIS), an IT wing of Indian Railways. The collaboration aims to solve…

Liked by Dr. Abhay Alok
Must Read 😍

Liked by Dr. Abhay Alok
🚀 GPT-4o mini... OpenAI's most cost-efficient small model yet! GPT-4o mini, with its improved performance and drastically reduced costs, is…

Liked by Dr. Abhay Alok
🌟 Exciting Collaboration Announcement! 🌟 We are thrilled to announce the successful signing of a Memorandum of Understanding (MoU) between Indian…

Liked by Dr. Abhay Alok
Learn how to improve customer experiences and safeguard data with AI and a unified cloud infrastructure.

Liked by Dr. Abhay Alok
Dive deep into advanced chunking strategies that transform large text documents into coherent, searchable units. Join me for an exciting workshop on…

Liked by Dr. Abhay Alok
During a recent talk a person asked: "It seems there is a glass ceiling that avoids to get into Data Science if you don’t have a PhD. Is it true?"…

Liked by Dr. Abhay Alok
Nvidia & Mistral! have just launched Mistral NeMo 12B, an exceptional model licensed under Apache 2.0. Here’s what makes it standout: - It…

Liked by Dr. Abhay Alok
I think 12B is the new 7B for on device LLMs. Mistral x NVIDIA collab dropped the Mistral-NeMo 12B open weights LLM with 128K context length. This is…

Liked by Dr. Abhay Alok
Introducing Jude Bellingham, our EA SPORTS #FC25 Standard Edition Cover Star. “I played this game with my brother all the time growing up, and I’ve…

Liked by Dr. Abhay Alok
AI + on-device = ❤️ There is a new small language model family called SmolLM with 135M, 360M, and 1.7B parameters. It's trained also on a new…

Liked by Dr. Abhay Alok
The next chapter of the world’s game ⚽. Reveal trailer out now! 👇 #FC25 #WeAreEA #EASPORTS

Liked by Dr. Abhay Alok

Recommendations received

Dr. Vijay Bhaskar Semwal
