Hi, I'm Ratan Sai Rohith.
A passionate computer scientist with an entrepreneurial mindset who loves to build products that solve complex problems.
About
I am a Data Science graduate student.
I am a curious data scientist with professional experience in machine learning, large language models, computer vision, databases, and product development.
Experience
- Built a software services and product company from just an idea into a business within a year, with a small team developing a solution for small-business promotion.
- Headed the Data Science department; designed and executed 10+ deep learning tasks.
- Developed and implemented the business plan and strategy for the company with other founding members.
- Conducted an online survey and identified key customer segments and customer requirements.
- Helped the start-up secure incubation at STPI Pune, NASSCOM, and T-Hub.
- Helped the company identify its key innovation and published a patent (number 202041040669 A).
- Built a facial keypoint detection algorithm using a customized convolutional neural network architecture in PyTorch. Quantized the trained float32 model to int8 to run on an edge device.
- Developed a distraction classifier that uses the yaw, roll, and pitch axes to detect whether the driver was paying attention to the road.
- Designed and executed a customized Faster R-CNN object detection algorithm with ResNet-18, ResNet-34, and ResNet-50 backbones to detect and classify 9 types of road damage.
- Deployed a fast, lightweight anchor-free object detection algorithm to detect objects at the edge at 30 FPS.
- Built an efficient face detection and face recognition pipeline that uses face encodings and an edge SVM classifier to identify the driver. When it encounters a new driver, it automatically retrains and deploys the new model within 3 seconds. Achieved an accuracy of 99% on an edge device.
- Designed a driver performance matrix that applies machine learning algorithms to data from the OBD and the device to rank drivers in the fleet by performance.
- Assisted Dr. Padma Ganasala on a research project.
- Built a pipeline to automatically detect text labels on AST disk images.
- Applied deep learning and OCR techniques to automatically extract AST disks from the plate and find the labels on them.
- Achieved a mean average precision of 0.9 at IoU > 0.5 for disk detection, and identified label text independent of its orientation using convolutional recurrent neural networks with a self-attention mechanism.
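
For readers unfamiliar with the IoU threshold quoted in the last bullet, here is a minimal sketch of how intersection over union is usually computed for two axis-aligned boxes. The box layout `(x_min, y_min, x_max, y_max)` and the names `iou` and `is_match` are assumptions made for illustration, not code from the project.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a hit when its IoU with the ground truth exceeds the threshold."""
    return iou(pred_box, gt_box) > threshold
```

Mean average precision is then computed over the detections that pass this check, ranked by confidence.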
Projects

An Android-based skin cancer detection app
- Tools: Python, Java, Android Studio, TensorFlow, Google Colab, NumPy, SciPy, Pandas, OpenCV
- The dataset was very small and had few samples for some classes, which caused a class imbalance problem. To mitigate it, used data augmentation strategies to upsample the minority classes.
- Applied transfer learning and trained a MobileNet model in Keras.
- Converted the trained Keras model into a TFLite model.
- Wrote an Android app that takes continuous video frames from the phone camera and classifies them using the converted TFLite model.
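
The upsampling step mentioned above can be sketched roughly as follows. This is an illustrative outline only: the function name `oversample` and the `augment` callback are hypothetical, and the identity function stands in for the real image augmentations, which are not described here.

```python
import random
from collections import defaultdict


def oversample(samples, labels, augment=lambda x: x, seed=0):
    """Grow every class to the size of the largest one by adding augmented copies."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = list(samples), list(labels)
    for label, items in by_class.items():
        # Add as many augmented copies as the class is short of the target.
        for _ in range(target - len(items)):
            out_samples.append(augment(rng.choice(items)))
            out_labels.append(label)
    return out_samples, out_labels
```

In practice `augment` would be a random flip, rotation, or color jitter, so that the added samples are not exact duplicates.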

A Flask-based web application to detect pneumonia from images.
- Tools: Python, Flask, TensorFlow, jQuery, HTML, Google Colab, NumPy, SciPy, Pandas, OpenCV
- Used the VGG-16 architecture as the backbone and modified the last classification layer to output probabilities for 2 classes.
- Trained the model on Google Colab GPUs for 50 epochs and reached an accuracy of 98%.
- Built a REST API using Flask, with a jQuery and HTML web page, that takes a chest X-ray image as input and outputs whether the X-ray shows pneumonia.

A deep learning, OCR-based solution to classify Bengali handwritten graphemes.
- Tools: PyTorch, Albumentations, Google Colab, Python, NumPy, SciPy, Pandas, OpenCV
- Created convolutional neural networks such as ResNets, EfficientNets, DenseNets, GhostNets, Wide ResNets, and ResNeXts with customized heads to classify the grapheme root, vowel diacritics, and consonant diacritics.
- Evaluated trained models with hierarchical macro-averaged recall; used iterative stratification to split the data into 5 folds.
- Examined and implemented advanced augmentation techniques: Cutout, CutMix, Mixup, AugMix, GridMask.
- Attained a hierarchical macro-averaged recall of 0.985 with an ensemble of EfficientNets, ResNets, and DenseNets.
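
As a rough illustration of the metric named above, the sketch below computes macro-averaged recall per component and combines the three components with a weighted average. The 2:1:1 weighting (grapheme root counted twice) is an assumption based on how this metric is commonly defined for the task, and the function names are hypothetical.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in set(y_true):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def hierarchical_macro_recall(root, vowel, consonant, weights=(2, 1, 1)):
    """Weighted average of macro recall; each argument is a (y_true, y_pred) pair.

    The default weights are an assumption, not taken from the project.
    """
    scores = [macro_recall(*root), macro_recall(*vowel), macro_recall(*consonant)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

Macro averaging gives rare classes the same influence as common ones, which matters when some graphemes appear only a handful of times.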

Using advanced text analytics techniques to get insights from tweets regarding Australian Elections.
- Tools: Python, NumPy, Regex, Pandas, Matplotlib, GeoPandas, Plotly, TextBlob, NLTK, scikit-learn, pyLDAvis
- Identified the most common professions of the people who tweet most about the Australian elections.
- Used TextBlob to identify the sentiment of the text.
- Derived n-grams from the tweets.
- Identified the countries that tweeted most about the Australian elections.
- Identified the sentiment (positive, negative, neutral) toward the Australian elections by country.
- Built a topic model based on LDA.

Using advanced text analytics techniques to get insights from research papers to fight COVID-19.
- Tools: Python, Pandas, NumPy, Seaborn, Matplotlib, Plotly, spaCy, scikit-learn, Beautiful Soup, Gensim, pyLDAvis
- Used unsupervised machine learning on text data to determine cluster words for the given research papers. Performed topic modelling using Latent Dirichlet Allocation.
- Ran readability tests on the papers to gauge their complexity.
- Summarized papers of 5000+ words into 250-word summaries that represent the content of each paper.

Developing a deep neural network that looks at the labeled sentiment for a given tweet and figures out which word or phrase best supports it.
- Tools: Python, PyTorch, Regex, Hugging Face Transformers, scikit-learn, NumPy, Pandas
- Built BERT, RoBERTa-base, RoBERTa-large, ELECTRA, and DistilBERT models.
- Developed a 1-D convolutional neural network head that takes encodings from the transformer models, since 1-D CNNs capture spatial representations well.
- The data contained a lot of text noise, and manual and logic-based cleaning did not work well. So, developed a two-level RoBERTa model in which the outputs of a first-level model trained on the noisy data were combined with the training data and used to train a second-level model that captures the noise. This increased the Jaccard score from 0.70 to 0.71.
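
For context on the Jaccard score quoted above, here is a minimal sketch of a word-level Jaccard similarity between a predicted phrase and the labeled phrase. Treating the score as word-level, lowercasing, and splitting on whitespace are assumptions for illustration; the project's exact evaluation code is not shown here.

```python
def jaccard(predicted, target):
    """Word-level Jaccard similarity between two strings."""
    a = set(predicted.lower().split())
    b = set(target.lower().split())
    if not a and not b:
        # Two empty strings are treated as a perfect match.
        return 1.0
    overlap = a & b
    return len(overlap) / (len(a) + len(b) - len(overlap))
```

For example, `jaccard("so happy today", "happy today")` shares two of three distinct words, giving 2/3.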
Achievements
Won a Gold Medal at the IIA International Innovation Fair 2017, out of 600 companies/projects from more than 50 countries.
Skills
Languages and Databases
Libraries
Frameworks
Hardware
Other
Education
Boston, MA
Degree: Master's in Data Science (Computer Science)
CGPA: 3.90/4.0
Relevant Coursework:
- Introduction to Data Management and Processing
- Supervised Machine Learning and Learning Theory
- Unsupervised Machine Learning and Data Mining
- Natural Language Processing
- Database Management Systems
- Algorithms
GVP College of Engineering (Autonomous)
Visakhapatnam, India
Degree: B.Tech in Electronics and Communication Engineering
CGPA: 8.05/10
Relevant Coursework:
- Digital Communications
- VLSI Design
- Data Structures and Algorithms
- Calculus, Algebra and Statistics
- Digital Logic Design and Electronic Circuit Analysis
- Optical Fiber Communications