Praneet Pabolu

Senior Applied Scientist

Oracle Cloud Infrastructure | Bangalore

My Portfolio Highlights

My New Track

Data Scientist

My New Course

Introduction to Python

Insights sculptor, shaping raw information into actionable intelligence.

My Work

Take a look at my latest work.

course

Introduction to Functions in Python

course

Introduction to Python

course

Intermediate Python

DataCamp Course Completion

Take a look at all the courses I’ve completed on DataCamp.

My Work Experience

Where I've interned and worked during my career.

Oracle | May 2022 - Present

Senior Applied Scientist

Primarily working on Generative AI in NLP and Computer Vision.

End-to-end NLP projects I’ve worked on:
1. Knowledge Graph Guided and Entity Controlled Doctor-Patient Conversation Generation: The objective is to generate doctor-patient interactions in a natural and coherent form while preserving the entity-level annotations.
2. Entity Focused Natural Language Generation (EFNLG): Training a model to detect more than 70 annotation types, covering not only generic entities but also personal and healthcare (PII/PHI) entities, requires hundreds of thousands of annotations. So I designed and developed a pipeline using a custom generative model architecture that generates contextually meaningful, near-real-world text while maintaining annotations.
3. Multi-Lingual NLG in 100+ Languages: Extended the EFNLG approach with many architectural advancements, including automated data preparation, knowledge distillation, keyword extraction, and multi-lingual summarization, among others, which led to novel multi-lingual data generation in any desired language. This approach is currently used to generate semantically meaningful data for NER, relationship extraction, and aspect-based sentiment analysis.

Ground-up Computer Vision projects I’ve worked on:
1. Developed a novel generic semantic segmentation model by employing Vision Transformers in a semi-supervised, shape-agnostic fashion. Designed this approach before the release of Meta’s Segment Anything Model (SAM).
2. Architected a framework for generating synthetic invoices while preserving positional encodings and bounding box information. It combines multiple processes built around diffusion models, GANs, U-Nets, and heuristic rules to validate the bounds. It is currently used to generate training data for extracting information from passports, PAN cards, social identity cards, and driving licenses.

- 7 patents filed in total
Oracle | Nov 2021 - Present

Senior Machine Learning Engineer

- Built a forecasting model that introduces a novel approach to generating forecasts. It uses all available primary, meta, and additional data to compute numerous inputs along with time-varying feature crosses, which feed a novel endogenous feature engineering approach that has the best MAPE, SMAPE, and RMSE values.
- Designed the whole MLOps architecture from scratch to benchmark multiple model runs, capture efficiency, and facilitate active learning, using tools including but not limited to MLflow, Airflow, Kubeflow, and Docker.
- Won 1st place in the OCI AI Hackathon among 120 teams globally (project: Forecasting Air Quality using Object Detection in Traffic Videos).
- Filed 2 patents.
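For readers unfamiliar with the three error metrics named above, here is a minimal sketch of their standard textbook definitions in Python. The function names and sample numbers are illustrative assumptions and are not taken from the forecasting model itself; SMAPE in particular has several variants, and this one uses the mean of the absolute values as the denominator.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric MAPE, in percent. Undefined when an actual/forecast pair is both zero."""
    return 100.0 * sum(
        abs(a - f) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast)
    ) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

# Hypothetical sample values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(f"MAPE:  {mape(actual, forecast):.2f}%")
print(f"SMAPE: {smape(actual, forecast):.2f}%")
print(f"RMSE:  {rmse(actual, forecast):.2f}")
```

Lower is better for all three. MAPE and SMAPE are scale-free percentages, while RMSE penalizes large errors more heavily.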

Cargill | Apr 2020 - Nov 2021

Machine Learning Engineer

Built an image denoising pipeline which:
- Is capable of accurately removing 120,000 different types of noise.
- Uses a divide-and-conquer approach, researched and developed with an attention-based CycleGAN, to accept images of any size, with a mean custom metric determined using SSIM, PSNR, and FFT.
- Achieved 84.7% noise removal in images.

Developed an NLP platform with the following features:
- Generating a bot model from the chatbot pipeline module, which fetches, preprocesses, and feature-engineers the data. Once the hyperparameters are fine-tuned, it produces a model with a test report and, after verification, deploys it on AWS SageMaker.
- Extracting clauses from documents by applying semantic search using DistilBERT, fine-tuned on the domain-specific taxonomy, with custom Bi-LSTM layers added at the top and bottom of the BERT model to extend the limit of 512 positional embeddings to 8192.
- Generating synthetic data for NER to increase the number of training examples and add variance while keeping unwanted bias in check during preparation. Used NMT (Neural Machine Translation) and GAN concepts for this.

Tech Stack: Python, TensorFlow, PyTorch, Keras, OpenCV, SpaCy, RASA, FastAPI, Postgres
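As a small illustration of PSNR, one of the image-quality measures named in the denoising work above, here is a minimal sketch of the standard definition in Python. It is not the custom metric used in the pipeline. The function name, the flat-list image representation, and the sample pixel values are all assumptions made for the example.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images given as flat pixel lists."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 8-bit pixel values, for illustration only.
clean = [0, 64, 128, 255]
noisy = [5, 59, 133, 250]

print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

A higher PSNR means the test image is closer to the reference, so a denoiser that works well should raise the PSNR of its output relative to the noisy input.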

Cogoport | Mar 2019 - Apr 2020

Data Scientist (Deep Learning and Data Processing Engine)

- Developed freight rate forecasting, shipping line prediction, demand forecasting, and document verification models from scratch using neural network architectures (LSTMs, GANs, etc.), and deployed them using TensorFlow Serving APIs for easy load balancing and versioning, accessed through Django APIs.
- Designed and developed the whole distributed deep learning pipeline for stream processing using Debezium, Kafka, TensorFlow, Flink, and Druid, which is used to produce personalized predictions and cancellation probabilities.
- Used this pipeline to handle millions of data items in just a few seconds, and worked with business analysts to generate prediction dashboards.

Tech Stack: Python, TensorFlow, Keras, OpenCV, Django, Java, Celery, RabbitMQ, MongoDB, Postgres

Airtel Payments Bank | Jun 2018 - Mar 2019

Software Development Engineer

- Developed a fake-name classification model that classifies human names entered by users as fake or real, with inputs pre-processed using NLP techniques such as feature crosses, word embeddings, and word vector bindings.
- Created a custom YOLO architecture for automatic payment of tower bills from documents.
- Designed, built, tested, and deployed the Automatic Offers feature on the Airtel app using Spring Boot.
- Implemented the backend for utility recharges (water, electricity, and gas), impacting millions of customers daily, using Spring Batch.

Tech Stack: Python, Java, Flask, TensorFlow, Spring Boot, Aerospike, SQL

RedCarpetUp.com | Dec 2017 - Feb 2018

Machine Learning Engineer Intern

- Analyzed credit user information to detect unusual patterns and predicted the credit score for each user accordingly.
- Implemented a feature store for the new features generated daily, which automated the preparation of requirement-based training data for the model instead of supplying the data manually.

KATE Technologies Pvt Ltd | May 2017 - Jul 2017

Software Developer Intern

Worked on REST API requests for updating the delivery status in an Android app named Cart in Hour, which aimed to deliver groceries to customers within 2 hours of the order being placed.

My Education

Take a look at my formal education

Bachelor of Technology (BTech) in Mathematics and Computer Science

ITM University, Gwalior | 2018

About Me

Praneet Pabolu

I'm an Applied Research Scientist with deep expertise in Machine Learning, Deep Learning, Data Science, and Large Scale Distributed Data Systems. I have a proven track record of research-oriented development and have filed multiple patents in AI.
