Guruprasad Padmanabhan’s Post


Associate Director - Data Strategy & Data Science | AIIMS & IIMC

For all life-science folks curious about AI, here is a high-level view of the mechanics of Machine Learning (a field within AI). There are two key "components" of machine learning:

1. Data - This is usually many records, each comprising input variables or "features", and the dependent variable or "response". For example, if we are interested in automated diagnosis of pulmonary tuberculosis from chest X-rays, then the data required would be thousands of X-rays, each with a diagnosis of "yes" or "no" for tuberculosis.

2. Model - This is a mathematical object that is used to create a mapping or function between the features (e.g. the X-ray) and the response (e.g. the TB diagnosis).

So how is this model generated? The process is known as training the model. In simple terms, you can think about it like this:

a. The available data is split into 2 unequal parts; the larger is known as the Training data and the smaller as the Test data. (I am simplifying a few things here.)

b. The training data is fed to the model, and the model starts with a mapping that is initially wrong. An error is calculated between what the model spat out and the ground truth. This error is then reduced iteratively (commonly through a family of optimization techniques known as gradient descent). When the error is sufficiently low, the model may be ready for use. But before it can be used, it needs to be tested.

c. A few trained models (whose errors are sufficiently low) are now run on the test data and the error is calculated again. If a model continues to have a low error, this tested model can then be deployed. This step is important because the model is forced to perform on data that it has not seen before, which makes it a good way to validate the model.

d. Trained models that have also performed well on the test data are deployed and made available for use. In the example of automated diagnosis of TB using X-rays, the model may be deployed on the cloud or in remote locations where radiologists are not available.
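For readers who like to see this in code, here is a minimal sketch of steps a to c. It is only an illustration under stated assumptions: the records are synthetic (two made-up numeric features instead of X-ray images), the model is a simple logistic regression chosen for brevity, and all function names are hypothetical. It trains a single model rather than several.

```python
import math
import random

def make_synthetic_records(n, seed=0):
    """Hypothetical stand-in for real data: each record is (features, response)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)  # response: 1 = "yes", 0 = "no"
        # Two made-up numeric features whose averages shift with the label.
        features = [rng.gauss(2.0 * label, 1.0), rng.gauss(-1.5 * label, 1.0)]
        records.append((features, label))
    return records

def predict_proba(weights, bias, features):
    """The model: maps features to a probability of "yes" (logistic regression)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def mean_log_loss(weights, bias, records):
    """The error between the model's output and the ground truth."""
    eps = 1e-12
    total = 0.0
    for features, label in records:
        p = min(max(predict_proba(weights, bias, features), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(records)

def train(records, learning_rate=0.1, steps=500):
    """Step b: reduce the training error iteratively with gradient descent."""
    weights, bias = [0.0, 0.0], 0.0  # the initial mapping is uninformed
    n = len(records)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, label in records:
            err = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += err * x
            grad_b += err
        # Move each parameter a small step against its average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Step a: split the data into a larger training part and a smaller test part.
records = make_synthetic_records(1000)
split = int(0.8 * len(records))
train_data, test_data = records[:split], records[split:]

initial_error = mean_log_loss([0.0, 0.0], 0.0, train_data)
weights, bias = train(train_data)

# Step c: measure the error on data the model has not seen during training.
train_error = mean_log_loss(weights, bias, train_data)
test_error = mean_log_loss(weights, bias, test_data)
print(f"initial training error: {initial_error:.3f}")
print(f"final training error:   {train_error:.3f}")
print(f"test error:             {test_error:.3f}")
```

Comparing the last two printed numbers is the check described in step c: a model is only a candidate for deployment if its error stays low on the test data too.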
Something to ponder - Do you think that models that achieve very low error on the training data (high performance) would also perform well in the real world? Let me know in the comments. #machinelearning #healthtech #aiforhealth #datascience

Valentin M.

Digital innovator, AI enthusiast, Life Sciences Change enabler, Entrepreneur, MBA

2mo

Great explanation 👌

Chintan Shah

International Business development

2mo

Very informative…thanks a lot Guruprasad Padmanabhan
