Vassily Carantino

Paris, Île-de-France, France
6K followers, 500+ connections

About

I have cofounded CarbonFarm, a climatech start-up leveraging satellite and AI to provide…

Experience and education

  • CarbonFarm

Volunteer experience

  • Voluntary Sailing Instructor

    Les Glénans

    - present, 13 years 2 months

    Education

Projects

  • Predicting Red Hat Business Value using different Machine Learning Algorithms

    Red Hat is the world's leading provider of open source, enterprise IT solutions. Over time, Red Hat was able to gather a great deal of information about the behavior of individuals who interact with them.

    Goal of the project (from Kaggle Competition):
    • use this behavioral data to predict which individuals they should approach
    • create a classification algorithm that accurately identifies which customers have the most potential business value for Red Hat based on their characteristics and activities.

    In this project, we compared several machine learning algorithms on the Red Hat Business dataset, which is fairly large (almost two million × 56). We compared SVM (linear and nonlinear), Random Forest, LightGBM, XGBoost, a Neural Network, and a Multinomial log-linear Model. XGBoost and LightGBM achieved the highest accuracy, around 98%.

    This project was part of the Applied Data Science course taught by Prof. Ying Liu at Columbia University (Statistics department).
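The comparison described above boils down to fitting several classifiers on the same split and ranking them by accuracy. The sketch below illustrates that loop only. It is not the project's code: the data is synthetic, and `majority_class` and `nearest_centroid` are simple stand-ins for the SVM, Random Forest, LightGBM and XGBoost models named in the text. All function names are hypothetical.

```python
import random

def majority_class(train_X, train_y):
    """Stand-in baseline: always predict the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return lambda x: most_common

def nearest_centroid(train_X, train_y):
    """Stand-in model: predict the label whose mean feature vector is closest."""
    sums, counts = {}, {}
    for x, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )
    return predict

def holdout_accuracy(fit, X, y, test_fraction=0.25, seed=0):
    """Shuffle, split once, fit on the training part, score on the held-out part."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    model = fit([X[i] for i in train], [y[i] for i in train])
    hits = sum(model(X[i]) == y[i] for i in test)
    return hits / len(test)

# Synthetic two-class data (an assumption for illustration, not the Kaggle data).
rng = random.Random(42)
X = [[rng.gauss(c * 3.0, 1.0), rng.gauss(c * 3.0, 1.0)]
     for c in (0, 1) for _ in range(200)]
y = [c for c in (0, 1) for _ in range(200)]

candidates = {"majority_class": majority_class, "nearest_centroid": nearest_centroid}
results = {name: holdout_accuracy(fit, X, y) for name, fit in candidates.items()}
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

Using the same seed for every candidate keeps the split identical across models, so the accuracies are directly comparable.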

  • A Comparison of Best Practices in Collaborative Filtering

    In this project, we wanted to compare different options to perform collaborative filtering on two distinct databases: the Microsoft Web dataset, and a classic Movie Recommendation dataset.

    We compared two classes of algorithms:
    • Memory-based algorithms (based on neighbors), with different combinations of options:
    > Similarity Weight: Pearson Correlation, Entropy, Mean-Square-Difference or SimRank
    > Significance Weighting
    > Selecting Neighbors: Weight Threshold, Best-n-estimator, Combined
    > Rating Normalization: Deviation from Mean

    • Model-based algorithms (for which we chose the best meta-parameters using Expectation Maximization).

    This project was part of the Applied Data Science course taught by Prof. Ying Liu at Columbia University (Statistics department).
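One combination of the memory-based options listed above can be sketched as follows: Pearson correlation as the similarity weight, a weight threshold to select neighbors, and deviation-from-mean rating normalization. This is a minimal illustration on a made-up ratings dictionary, not the project's implementation; the function names, the user and item labels, and the choice to compute each user's mean over all of their ratings are assumptions.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation computed over the items both users rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(ratings, user, item, threshold=0.0):
    """User's mean rating plus the weighted average deviation of the neighbors."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = pearson(ratings[user], other_ratings)
        if abs(weight) <= threshold:  # weight-threshold neighbor selection
            continue
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        numerator += weight * (other_ratings[item] - other_mean)
        denominator += abs(weight)
    if denominator == 0:
        return user_mean
    return user_mean + numerator / denominator

# Hypothetical toy ratings (user -> item -> rating on a 1-5 scale).
ratings = {
    "ann": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "cat": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
prediction = predict(ratings, "ann", "m4")
print(round(prediction, 3))
```

Note that the prediction is not clipped to the rating scale here; a real system would usually clamp it to the valid range.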

  • Image Classification: Dogs, Fried Chicken or Blueberry Muffins?

    In this project, we created a classification engine for images of Dogs, images of Fried Chicken and images of Blueberry Muffins.

    • We set our baseline model using SIFT features and a gradient boosting machine (GBM) classifier. Besides the SIFT features, we also used HOG, RGB and HSV for feature selection.

    • In terms of classifiers, we considered SVM (linear and non-linear), Random Forest, XGBoost and a Neural Network. After model evaluation and comparison, the final advanced model we selected uses the RGB feature and an XGBoost classifier. It increased accuracy by 12.0% and took only 11.7% of the baseline model's running time.

    Results: Our baseline model outperforms Random Forest and SVM (linear and non-linear) under all feature selection methods. Only XGBoost achieves higher accuracy than GBM with HOG, RGB and HSV features.

    This project was part of the Applied Data Science course taught by Prof. Ying Liu at Columbia University (Statistics department).
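The text does not say how the RGB feature was computed. A common choice, assumed here purely for illustration, is a normalized per-channel color histogram, which turns an image of any size into a fixed-length vector a classifier can consume. The function name, the bin count, and the sample pixels are hypothetical.

```python
def rgb_histogram(pixels, bins=4):
    """Return a normalized per-channel histogram with 3 * bins values.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    The layout is [R bins..., G bins..., B bins...].
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    width = 256 // bins  # number of intensity values covered by each bin
    counts = [0] * (3 * bins)
    for r, g, b in pixels:
        for channel, value in enumerate((r, g, b)):
            # min() keeps the top intensities in the last bin when
            # 256 is not an exact multiple of `bins`.
            counts[channel * bins + min(value // width, bins - 1)] += 1
    return [count / len(pixels) for count in counts]

# Four made-up pixels standing in for an image.
sample_pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (30, 200, 40)]
features = rgb_histogram(sample_pixels, bins=4)
print(features)
```

Each channel's bins sum to 1, so images of different resolutions produce comparable vectors.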

  • Open Data App using RShiny - Where do you want to live in New York City?

    In this second project, we developed an R-Shiny application designed to help New Yorkers find a place to live. The app allows users to explore data and quickly restrict their search to neighborhoods that would suit them, by answering their most important questions:

    • on the price component: range of prices in the neighborhood, trends, for different types of apartment (number of bedrooms, size)
    • on the neighborhood characteristics: demographics, age, families, and so forth
    • on transportation: metro and bus stations
    • on education: schools
    • on health: hospitals and health facilities
    • on entertainment: theatres, galleries, and so forth

    Data Sources used:
    • Crime data comes from https://www.data.gov/
    • Rental Prices data comes from https://www.zillow.com/
    • Sales Prices data comes from https://www.zillow.com/
    • School data comes from https://opendata.cityofnewyork.us/
    • Hospitals data comes from https://opendata.cityofnewyork.us/
    • Art Galleries data comes from https://opendata.cityofnewyork.us/
    • Theatres data comes from https://opendata.cityofnewyork.us/
    • All the information on the neighborhoods (demographics, real estate information, earnings data) was collected using a web crawler on this website: https://www.unitedstateszipcodes.org/

    We used the Google Maps API to get the precise location corresponding to each address.

    This project was part of the Applied Data Science course taught by Prof. Ying Liu at Columbia University (Statistics department).

  • NLP project: Understanding the evolution of Americans' concerns leveraging recent Presidential Speeches

    In this work, we leverage presidential inaugural speeches to analyze the evolution of the concerns of the American people using NLP techniques. To make the analysis more interesting, we focused on recent elections only: from Bill Clinton in 1993 to Donald Trump in 2017, which gives approximately 25 years of recent American history.

    We first analyzed the chronological evolution of the word clouds, after correcting for the presidents' verbal tics and style. We then conducted a sentiment analysis of these recent speeches and analyzed the emotions they convey.

    This project was part of the Applied Data Science course taught by Prof. Ying Liu at Columbia University (Statistics department).
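The text does not describe how word frequencies were corrected before building the word clouds. One standard way to down-weight words that every speaker uses, swapped in here as an assumption, is TF-IDF: a word that appears in every speech gets a score of zero, so only distinctive vocabulary stands out. The function names and the three short texts below are invented placeholders, not quotes from any speech.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def tf_idf(speeches):
    """Map each speech name to a {word: tf-idf score} dictionary."""
    counts = {name: Counter(tokenize(text)) for name, text in speeches.items()}
    n_docs = len(counts)
    # Iterating a Counter yields each distinct word once, so this counts
    # the number of speeches in which each word appears.
    doc_freq = Counter(word for c in counts.values() for word in c)
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            word: (k / total) * math.log(n_docs / doc_freq[word])
            for word, k in c.items()
        }
    return scores

# Invented placeholder texts, not real speech excerpts.
speeches = {
    "speech_a": "we will build jobs and we will build trust",
    "speech_b": "we will protect jobs and borders",
    "speech_c": "we will renew hope and we will renew trust",
}
scores = tf_idf(speeches)
top_a = max(scores["speech_a"], key=scores["speech_a"].get)
top_c = max(scores["speech_c"], key=scores["speech_c"].get)
print(top_a, top_c)
```

The resulting scores could replace raw counts as word-cloud weights, which removes filler shared by all speakers without a hand-written stop list.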

  • Consulting Project: Forecasting powdered milk demand in foreign countries

    Study and analysis of global powdered milk demand for a major French dairy cooperative: we used econometrics and advanced analytics techniques to forecast international demand from OECD, FAO, UNICEF and private databases.

  • Android App Development - Locolize Project

    Conceived, developed and implemented an Android app that reinvents social interaction using geolocation. The Locolize project lets users chat, meet and share events with their friends based on their location.

    Technology used & skills acquired:
    • Client / Server architecture using Python / MySQL (server) & Java / SQLite (client)
    • Socket programming using TCP/IP
    • Integration of Google Play services and Google Maps API V2
    • Use of Eclipse IDE and Android SDK
    • Performed testing on a physical device and the Android emulator.

    This app was developed as part of the Software Architecture course taught by Prof. X. Clerc at ENPC. It was awarded the Best Project prize but was not released on the Android App Store for financial reasons.
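The client/server exchange over TCP mentioned in the list above can be sketched in a few lines of Python, the language used on the server side. This is a generic illustration, not the Locolize protocol: the message format, the acknowledgement, the function names, and the coordinates are all hypothetical, and the client is written in Python here for brevity whereas the project's client was a Java Android app.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, read one short message, and acknowledge it."""
    conn, _addr = server_sock.accept()
    with conn:
        # A single recv() is enough for this tiny message; a real protocol
        # would need framing (length prefix or delimiter) for larger payloads.
        message = conn.recv(1024).decode("utf-8")
        conn.sendall(("ACK " + message).encode("utf-8"))

def send_message(host, port, text):
    """Open a TCP connection, send one message, and return the reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(text.encode("utf-8"))
        return client.recv(1024).decode("utf-8")

# Bind to the loopback interface; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()
reply = send_message("127.0.0.1", port, "LOCATION 48.8566 2.3522")
worker.join()
server.close()
print(reply)
```

Running the server in a background thread keeps the example self-contained; in a deployed setup the server would run as its own process and loop over `accept()`.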


Languages

  • French

    Native or bilingual proficiency

  • English

    Native or bilingual proficiency

  • Spanish

    Professional working proficiency

  • German

    Limited working proficiency
