Our team will present our latest graph learning work, "VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections," at ICLR 2024 in Vienna, Austria. The paper introduces a novel approach for mini-batch based graph transformers, enabling them to efficiently process billion-scale industry graph datasets. This advancement opens new possibilities for research into large graph foundation models and has potential applications in fields like social network analysis and graph-based recommendation systems. We eagerly anticipate engaging discussions at ICLR 2024. Additionally, we invite you to visit us at the AI at Meta booth to explore more exciting AI advancements from Meta. @Dongqi Fu, Zhigang Hua, Yan Xie, Jin Fang, Si Zhang, Kaan Sancak, Hao Wu, Andrey Malevich, Jingrui He
Bo Long’s Post
-
AI Research Director @J.P. Morgan | Inventor (50+ patents) | Scientist (100+ papers - 30 H-index) | Engineer (15+ AI systems) | Speaker
Great step towards #trustworthy #LLMs, which are crucial for LLM adoption at scale, especially when critical decisions need to be made (#Finance domains…). Plenty of direct applications, e.g., attributing LLM inputs for #hallucinations detection, #FactChecker!! Even broader: think of LLM outputs now directly attributing LLM inputs. Great work, Sanjay Kariyappa and team. LLM inputs and outputs nicely connected with #XAI. #TrustworthyAI
Excited to announce that our paper, "Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions," has been accepted to #ICML2024! In this work, we introduce a novel explainable AI (XAI) framework called "Progressive Inference". Our approach leverages intermediate predictions from decoder-only Transformer models to generate high-quality SHAP-like input attributions for sequence classification tasks. Our method significantly outperforms prior XAI techniques and is a step towards improving the trustworthiness of large language models. For more details, check out our arXiv preprint: https://lnkd.in/gC3z3t2J Joint work with Freddy Lecue, Saumitra Mishra, PhD, Chris Pond, Daniele Magazzeni and Manuela Veloso. #AI #MachineLearning #XAI #ICML2024
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
arxiv.org
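To make the idea concrete, here is a minimal, illustrative sketch (not the paper's actual estimator): a decoder-only classifier emits a prediction at every prefix of the input, so differencing consecutive prefix predictions yields a per-token attribution. The `toy_predict_proba` scorer below is a made-up stand-in for a real model.

```python
import numpy as np

def prefix_attributions(tokens, predict_proba):
    """Attribute a sequence-level prediction to individual tokens by
    differencing the model's outputs over growing prefixes. A simplified
    sketch of the 'intermediate predictions' idea; the paper's SHAP-like
    estimator is more involved."""
    # probs[k] = model confidence given the first k tokens
    probs = [predict_proba(tokens[:k]) for k in range(len(tokens) + 1)]
    # attribution of token i = change in prediction when token i is appended
    return [probs[i + 1] - probs[i] for i in range(len(tokens))]

# Hypothetical toy scorer standing in for a decoder-only classifier:
# confidence grows with each occurrence of the word "good".
def toy_predict_proba(prefix):
    return min(1.0, 0.5 + 0.25 * prefix.count("good"))

attrs = prefix_attributions(["the", "food", "was", "good"], toy_predict_proba)
```

By construction the attributions telescope: they sum exactly to the difference between the full-sequence prediction and the empty-prefix baseline, which is the completeness property SHAP-style methods aim for.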
-
Great library for tensor decomposition on large data in Python!
Our tensor decomposition based machine learning toolbox, Tensor Extraction of Latent Features (T-ELF), has been released (https://lnkd.in/gQJNEdEE)! T-ELF is one of the machine learning software packages developed as part of the R&D 100 winning SmartTensors AI project at Los Alamos National Laboratory (LANL). T-ELF presents an array of customizable software solutions crafted for the analysis of large datasets. Acting as a comprehensive toolbox, T-ELF includes tools for data pre-processing, extraction of latent features, and structuring results to facilitate informed decision-making. Leveraging high-performance computing and cutting-edge GPU architectures, our toolbox is optimized for analyzing large datasets from a diverse set of problems. At the core of T-ELF's capabilities lie non-negative matrix and tensor factorization solutions for discovering multi-faceted hidden details in data, featuring automated model determination that facilitates the estimation of the number of latent factors (rank). This pivotal functionality ensures precise data modeling and the extraction of concealed patterns. Additionally, our software suite incorporates cutting-edge modules for both pre-processing and post-processing of data, tailored for diverse tasks including text mining and Natural Language Processing, along with robust tools for matrix and tensor analysis and construction. T-ELF's adaptability spans a multitude of disciplines, positioning it as an AI and data analytics solution. Its applications extend across fields such as Large-scale Text Mining, High Performance Computing, Computer Security, Applied Mathematics, Dynamic Networks and Ranking, Biology, Material Science, Medicine, Chemistry, Data Compression, Climate Studies, Relational Databases, Data Privacy, Economy, and Agriculture. T-ELF: https://lnkd.in/gQJNEdEE SmartTensors AI: https://lnkd.in/g_PDDVER #machinelearning #gpu #hpc #highperformancecomputing #dataanalysis #artificialintelligence
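For readers new to the core operation, here is a minimal non-negative matrix factorization in plain NumPy using Lee-Seung multiplicative updates. This illustrates the latent-feature extraction idea only; it is not T-ELF's API, algorithms, or automated rank selection.

```python
import numpy as np

def nmf(X, rank, iters=200, seed=0):
    """Minimal multiplicative-update NMF: X (n x m, non-negative) is
    approximated by W @ H with W (n x rank) >= 0 and H (rank x m) >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + 1e-3
    H = rng.random((rank, m)) + 1e-3
    for _ in range(iters):
        # Multiplicative updates preserve non-negativity by construction.
        H *= (W.T @ X) / (W.T @ W @ H + 1e-9)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Tiny "document x term" matrix with two obvious latent topics.
X = np.array([[5, 4, 0, 0],
              [4, 5, 1, 0],
              [0, 0, 4, 5],
              [0, 1, 5, 4]], float)
W, H = nmf(X, rank=2)

# Each row of H is an interpretable latent feature; W holds the loadings.
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

With rank 2, the two rows of H recover the two topic blocks and the relative reconstruction error stays small, which is the sense in which the factorization "discovers hidden details" in the data.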
-
🚀 What is Support Vector Classification (SVC): Unveiling the Power of Support Vectors for Classification 📊

🔍 Support Vector Classification (SVC) is a robust and versatile machine learning algorithm used for classification tasks. It's based on the concept of Support Vector Machines (SVM) and is widely employed in fields such as finance, healthcare, marketing, and more.

🎯 Objective: The primary goal of SVC is to classify data points into different classes by finding the optimal hyperplane that maximizes the margin between classes while minimizing misclassifications.

📈 Margin Maximization: SVC focuses on maximizing the margin, which is the distance between the hyperplane and the nearest data points (support vectors) of each class. This leads to better generalization and robustness against noise.

📊 Kernel Trick: SVC can efficiently handle non-linear decision boundaries by using kernel functions, such as linear, polynomial, Gaussian (RBF), or sigmoid kernels. This allows SVC to learn complex patterns and achieve high accuracy.

🔑 Key Features: SVC is effective in high-dimensional spaces, robust to overfitting, and capable of handling both linearly separable and non-linearly separable data.

📚 Training: SVC learns the optimal hyperplane by maximizing the margin and minimizing the classification error. Various optimization algorithms, such as quadratic programming or gradient-based solvers, can be used for training.

🔍 Evaluation: Performance metrics such as accuracy, precision, recall, F1-score, and area under the ROC curve (AUC-ROC) are commonly used to evaluate SVC models. These metrics provide insights into the model's ability to correctly classify instances.

💡 Applications: SVC finds applications in diverse domains, including image classification, text categorization, sentiment analysis, medical diagnosis, and spam detection.
🚀 Conclusion: Support Vector Classification (SVC) is a powerful algorithm for classification tasks, offering robustness, flexibility, and high accuracy. Understanding its principles and applications can empower data scientists and analysts to tackle complex real-world classification problems effectively. #machinelearning #deeplearning #ai #math #mathematic #statistics #artificalintelligence #classification #regression
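As a concrete illustration of the points above, here is a minimal SVC workflow with scikit-learn on a synthetic two-class dataset. The dataset, kernel, and hyperparameters are arbitrary choices for the sketch, not recommendations.

```python
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Two well-separated Gaussian clusters: an easy, nearly separable toy problem.
X, y = make_blobs(n_samples=100, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)

# RBF-kernel SVC; C trades margin width against training misclassifications.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

train_acc = clf.score(X, y)                  # accuracy on the training set
n_support = clf.support_vectors_.shape[0]    # points that define the margin
```

Note that only a subset of the training points end up as support vectors: the decision boundary depends solely on the points nearest the margin, which is what makes the method robust.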
-
Over the past few years, I’ve worked with a variety of startups on advancing their computer vision capabilities. With most, their team made great progress. But with a few of them, I wish I’d been able to help them more. Some of the challenges they faced included:
- A limited amount of labeled data
- Weakly labeled data (labels applied to large images without detailed annotations)
- Multispectral or multiplex images that are unsuited to typical transfer learning approaches
We ended up relying on handcrafted features and classical machine learning because deep learning models overfit no matter what we tried. With 20/20 hindsight and recent advances in self-supervised learning, I now see a potential solution: a 𝐝𝐨𝐦𝐚𝐢𝐧-𝐬𝐩𝐞𝐜𝐢𝐟𝐢𝐜 𝐟𝐨𝐮𝐧𝐝𝐚𝐭𝐢𝐨𝐧 𝐦𝐨𝐝𝐞𝐥. In many cases, the dataset may not be large enough to train a traditional foundation model with 100s of millions of parameters (or more). But a smaller model trained with a self-supervised objective (like reconstructing masked-out regions) could work great. Such a model could form the foundation from which many other task-specific models are fine-tuned, which also makes it quicker to experiment with new tasks. In January, I'll be launching a new assessment for computer vision teams to help get a clear perspective on whether they can benefit from a foundation model. Through this assessment, I will answer the following questions:
- Would you benefit from a foundation model trained on your proprietary images?
- Do you have the capability to train one yourselves?
- What are some of the factors you’d need to consider when training one?
𝐋𝐞𝐚𝐫𝐧 𝐦𝐨𝐫𝐞 𝐚𝐧𝐝 𝐚𝐩𝐩𝐥𝐲 𝐧𝐨𝐰 𝐭𝐨 𝐛𝐞 𝐟𝐢𝐫𝐬𝐭 𝐢𝐧 𝐥𝐢𝐧𝐞 𝐟𝐨𝐫 𝐉𝐚𝐧𝐮𝐚𝐫𝐲: https://lnkd.in/g3MkE4Y4
#Pathology #RemoteSensing #EarthObservation #MedicalImaging #MachineLearning #DeepLearning #ComputerVision
Custom Vision Model Assessment
pixelscientia.com
-
PhD at Kings | Artificial Intelligence Consultant | Computer Scientist | Machine Learning | Artificial General Intelligence
It is a heartwarming moment to share that our paper, "ChartEye: A Deep Learning Framework for Chart Information Extraction", has been accepted at the International Conference on Digital Image Computing: Techniques and Applications (DICTA) 2023, a CORE-ranked conference that will be held in Port Macquarie, NSW, Australia. This project has been one of the major research projects at the Center of Excellence - Artificial Intelligence. It has been wonderful teamwork with my co-authors Muhammad Khizer Ali, Momina Moetesum, and Imran Siddiqi. It would not have been possible without the excellent supervision of Dr. Momina and Dr. Imran. Dr. Momina has been a great supervisor and guide throughout, starting from the initiation of this project. Abstract: The widespread use of charts and infographics as a means of data visualization in various domains has inspired recent research in automated chart understanding. However, information extraction from chart images is a complex, multi-task process due to style variations; as a consequence, it is challenging to design an end-to-end system. In this study, we propose a deep learning-based framework that provides a solution for key steps in the chart information extraction pipeline. The proposed framework utilizes hierarchical vision transformers for the tasks of chart-type and text-role classification, and YOLOv7 for text detection. The detected text is then enhanced using Super-Resolution Generative Adversarial Networks to improve the recognition output of the OCR. Experimental results on a benchmark dataset show that our proposed framework achieves excellent performance at every stage, with F1-scores of 0.97 for chart-type classification, 0.91 for text-role classification, and a mean Average Precision of 0.95 for text detection. #research #deeplearning #computervision #documentai #dicta
-
Enhanced Stochastic Optimization Algorithms with Power for Large-Scale Machine Learning

"Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning." Zhuang Yang; 24(241):1−29, 2023.

Abstract: Stochastic optimization, particularly stochastic gradient descent (SGD), has become the most commonly used method for solving machine learning problems. In order to enhance the performance of the traditional SGD algorithm, which has a slow convergence rate and poor generalization, several strategies have been developed, such as control variates, adaptive learning rates, and momentum techniques. Most of these strategies focus on controlling the updating direction (e.g., gradient descent or gradient ascent) or manipulating the learning rate. In this study, we propose a novel type of improved powered stochastic gradient descent algorithms that use the Powerball function to determine the updating direction. We also address the issue of the learning rate in powered stochastic optimization (PSO) by introducing an adaptive mechanism based on a Barzilai-Borwein (BB)-like scheme, not only for the proposed algorithm but also for classical PSO algorithms. The theoretical properties of these algorithms for non-convex optimization problems are analyzed. Empirical tests using various benchmark datasets demonstrate the efficiency and robustness of our proposed algorithms. https://lnkd.in/dHrsPDMt
Enhanced Stochastic Optimization Algorithms with Power for Large-Scale Machine Learning
https://instadatahelpainews.com
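For intuition about the method in the abstract: the Powerball function replaces each gradient coordinate g with sign(g)·|g|^γ for some γ in (0, 1). Below is a minimal, deterministic sketch on a toy least-squares problem (illustrative only; the paper's algorithms use stochastic gradients and a Barzilai-Borwein-like adaptive step size, and all numbers here are arbitrary).

```python
import numpy as np

def powerball(g, gamma):
    """Powerball nonlinearity: elementwise sign(g) * |g|**gamma."""
    return np.sign(g) * np.abs(g) ** gamma

# Toy least-squares problem: recover x_true from b = A @ x_true.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
x_true = rng.standard_normal(5)
b = A @ x_true

x = np.zeros(5)
lr, gamma = 0.05, 0.6
for _ in range(1000):
    grad = A.T @ (A @ x - b) / len(b)   # full gradient of 0.5*||Ax-b||^2 / n
    x -= lr * powerball(grad, gamma)    # powered update direction

err = np.linalg.norm(x - x_true)
```

With γ < 1 the nonlinearity damps large gradient coordinates and amplifies small ones (|g|^γ > |g| when |g| < 1), which is the mechanism behind the faster early-stage convergence the Powerball literature reports.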
-
Why does fine-tuning work so efficiently with pre-trained universal machine learning interatomic potentials (uMLIPs)? Artificial intelligence (AI) is increasingly changing the paradigm of scientific discovery, and one major contribution comes from uMLIPs, which provide foundation models that bridge quantum-mechanical accuracy and large-scale simulations. We discovered a systematic potential energy surface (PES) softening effect in current uMLIPs through a series of out-of-distribution (OOD) atomic modeling benchmarks, characterized by systematic energy and force underpredictions. We show that a simple linear correction with a single data point is able to remove the PES softening issue and substantially reduce uMLIP prediction errors. Our findings suggest that a considerable fraction of the errors in pre-trained uMLIPs are highly systematic and can therefore be efficiently corrected, rationalizing the data-efficient fine-tuning error reduction commonly observed. We highlight the advantage of atomic modeling with foundation AI models. Check out our preprint: https://lnkd.in/gPTfx_4r
Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning
arxiv.org
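The one-point linear correction is easy to picture: if a potential underpredicts forces by a roughly constant factor, a single ground-truth calculation suffices to estimate that factor and rescale all predictions. The sketch below uses made-up synthetic numbers, not the paper's model, correction form, or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "softening": predicted forces are a scaled-down, slightly
# noisy version of the reference (e.g. DFT) forces for 100 atoms.
true_forces = rng.standard_normal((100, 3))
softening = 0.8                                  # systematic underprediction factor
pred_forces = softening * true_forces + 0.01 * rng.standard_normal((100, 3))

# One labeled data point (atom 0 here) fixes the scale by least squares.
f_ref, f_pred = true_forces[0], pred_forces[0]
alpha = (f_pred @ f_ref) / (f_pred @ f_pred)     # one-point rescaling factor

corrected = alpha * pred_forces
err_before = np.abs(pred_forces - true_forces).mean()
err_after = np.abs(corrected - true_forces).mean()
```

Because the error is systematic rather than random, the single-point estimate of alpha transfers to every other atom, which is why `err_after` drops far below `err_before` in this toy setup.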
-
Matrix Extension for Pathological Radar Clutter Machine Learning

Abstract: This paper deals with radar clutter statistical learning based on spatial Doppler fluctuation. In articles [1]–[4], data is clustered cell by cell. In this article, we generalize the previous model to extract information not only from each cell independently, but also from the cells' spatial correlation. We first introduce the radar data, then the model and efficient tools to estimate the model parameters. The model parameters are shown to be Hermitian Positive Definite Block-Toeplitz matrices. Next, we endow the manifold of Hermitian Positive Definite Block-Toeplitz matrices with a Riemannian metric coming from information geometry. Finally, we adapt a supervised classification algorithm (k-Nearest Neighbors) and an unsupervised classification algorithm (Agglomerative Hierarchical Clustering) to this Riemannian manifold.

Index Terms: Radar clutter, multidimensional signals, spatio-temporal correlation, machine learning, information geometry, Riemannian manifold, Block-Toeplitz matrices, Siegel disk.
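The abstract's key ingredient, a k-NN classifier driven by a Riemannian distance between positive definite matrices, can be sketched on plain HPD matrices. Note the paper works on the more structured Block-Toeplitz manifold with an information-geometry metric; the sketch below substitutes the standard affine-invariant metric as an illustration.

```python
import numpy as np
from scipy.linalg import eigvalsh

def airm_distance(A, B):
    """Affine-invariant Riemannian distance between Hermitian positive
    definite matrices, ||logm(A^{-1/2} B A^{-1/2})||_F, computed from the
    generalized eigenvalues of the pair (B, A)."""
    w = eigvalsh(B, A)                   # eigenvalues of A^{-1} B, all > 0
    return np.sqrt(np.sum(np.log(w) ** 2))

def knn_predict(train_mats, train_labels, query, k=3):
    """k-Nearest Neighbors with the Riemannian distance above."""
    dists = [airm_distance(query, M) for M in train_mats]
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Tiny sanity demo on scalar multiples of the identity.
I2 = np.eye(2)
d_same = airm_distance(I2, I2)           # 0: identical matrices
d_e = airm_distance(I2, np.e * I2)       # sqrt(2): both log-eigenvalues equal 1

train = [np.eye(2), 1.1 * np.eye(2), 10 * np.eye(2), 9 * np.eye(2)]
labels = [0, 0, 1, 1]
pred = knn_predict(train, labels, 1.05 * np.eye(2), k=3)
```

The metric is invariant under congruence transformations (A, B) → (C A Cᴴ, C B Cᴴ), which is why it is natural for covariance-matrix data such as radar clutter.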