Pablo Spring

Certified

Data Scientist

H1 Insights | New York

My Portfolio Highlights

My New Course

Supervised Learning with scikit-learn

Insights artist, painting vivid pictures of knowledge with data as the brush.

My Certifications

These are the industry credentials that I’ve earned.

Data Analyst Associate

AI Fundamentals

Data Literacy

DataCamp Course Completion

Take a look at all the courses I’ve completed on DataCamp.

My Work Experience

Where I've interned and worked during my career.

H1 | Oct 2023 - Present

Data Scientist

Led Data Migration and Optimization of QA Reporting Pipelines
- Spearheaded the strategic migration of QA reporting pipelines from Databricks to AWS EMR Studio, enhancing efficiency and scalability.
- Managed the technical shift from Pandas to PySpark, optimizing performance and achieving a 50% increase in data throughput.
- Developed centralized Python and PySpark libraries, leading to a 30% improvement in computational efficiency and a 20% reduction in data anomalies.
- Facilitated cross-team collaboration, contributing to a 30% reduction in pipeline costs.

Advanced NLP and AI Model Development for QA Reporting
- Enhanced QA reporting processes using the OpenAI API and spaCy with Python and PySpark, modernizing medical publication analysis.
- Implemented document classification models and prompt engineering techniques, increasing report processing speed by 45%.
- Streamlined report QA checks, reducing manual review time by 25% and setting new standards in AI and data analysis within the organization.

Advanced Probabilistic Entity Deduplication in Large-Scale Organizational Database Using PySpark
- Directed a large-scale database deduplication project, applying the Fellegi-Sunter model and the Expectation-Maximization algorithm for enhanced entity matching.
- Successfully managed over 1 million records, reducing duplication errors by 40% and increasing operational efficiency by 30%.
- Led knowledge transfer initiatives, ensuring a smooth transition to advanced data processing technologies and methodologies.
H1 | Jan 2023 - Sep 2023

Senior Data Analyst

Data Quality Enhancement in Medical Data Collection
- Key contributor to enhancing data quality assurance for a large-scale medical data collection project, handling over 4000 data sources.
- Developed a Python-based validation framework for automated error detection, improving data accuracy and efficiency.
- Managed the entire quality assurance cycle, from data extraction using SQL to error reporting via the Google Sheets API.
- Generated statistical reports and dashboards in Tableau, reducing manual data review workload by 30% and increasing data accuracy by 25%.

Enhanced Data Analytics Dashboard Initiative
- Led the overhaul of 15 Tableau dashboards, redeveloping data extraction and publishing processes for diverse stakeholders.
- Redesigned the data workflow using Databricks-to-Tableau connections, focusing on KPIs and data quality metrics.
- Achieved a 40% reduction in data processing time and a 30% increase in dashboard usage, bridging technical execution and business intelligence.

Data Quality Assurance and Error Resolution Project Leadership
- Directed a comprehensive data quality project, integrating advanced analytics in Python, Databricks, PySpark, and Tableau.
- Coordinated with analysts to produce 65 detailed reports, streamlining the project within a Databricks pipeline.
- Enhanced the company's analytical capabilities, leading to a promotion to a data science position.

Data Analytics Mentoring and Teaching
- Initiated a Project-Based Learning program, advancing the skills of junior data analysts and researchers.
- Led hands-on training in Python, Pandas, and PySpark, applying real-world data analytics projects.
- Designed and conducted workshops and seminars, fostering a culture of continuous improvement and innovation.

H1 | Feb 2021 - Dec 2022

Research Data Analyst

Data Parsing and Integration
- Spearheaded Python-based data scraping initiatives, targeting medical practitioner data across Latin American websites, significantly enhancing research methodologies.
- Implemented Scrapy and Pandas for custom spider development, overcoming challenges in diverse website architectures and data frameworks.
- Led rigorous QA processes, ensuring data integrity and fidelity aligned with original web content.
- Pioneered data alignment strategies with existing databases, resulting in substantial manpower savings and operational enhancements.

PubMed Data Extraction and Author Profiling
- Built Python-based tools for automated data extraction from PubMed, handling over 34.7 million citations and streamlining workflows.
- Developed custom algorithms for author profiling, utilizing Python libraries and k-means clustering, achieving an 87.5% reduction in processing time.
- Spearheaded a project recognized for enhancing data accuracy and research methodologies, leading to a promotion to Senior Data Analyst.

Language Proficiency in Data Management
- Utilized Spanish and Portuguese fluency to bridge gaps in data collection and integration for Latin American markets.
- Devised solutions for complex entity matching and accurate biographical data extraction, enhancing data quality in an English-centric product framework.

Data Modeling and Integration with U.S. Medical Boards
- Collaborated on researching and deciphering U.S. medical board databases, contributing to the development of parsing rules and compliant data integration.
- Improved data processing efficiency by 25% and increased data accuracy, aiding in data-driven decision-making.

Automated Entity Matching Pipeline Development
- Led the design and implementation of an automated entity matching pipeline, reducing manual workload by 50% and streamlining data management processes.
- Applied Python, Pandas, TF-IDF, and DBSCAN for enhanced data deduplication and alignment.

Universidad de los Andes (CL) | Mar 2016 - Feb 2021

Lecturer

Curriculum Development and Delivery
- Efficiently developed and delivered Applied Physics lectures, demonstrating strong capabilities in complex information synthesis and effective communication.

Pontificia Universidad Católica de Chile | Jan 2010 - Dec 2018

Research Analyst

Spectroscopic Analysis and Material Characterization: Led the implementation of advanced spectroscopic techniques, including Raman and Infrared (IR) spectroscopy, to characterize various materials. Focused on analyzing the physical properties of materials like ferroelectric and delafossite thin films, employing Python and specialized libraries for data processing and interpretation.

Surface Science and Thin Film Analysis: Conducted in-depth studies using Auger Electron Spectroscopy (AES) for surface composition analysis of materials. Specialized in understanding the electronic and adsorptive properties of epitaxial thin films and nanostructured materials.

Laser Surface Texturing: Involved in analyzing the effects of laser surface texturing on the microstructural properties of materials. Utilized data analytics to understand the impact of texturing processes on material durability and performance.

Data Management in Experimental Physics: Oversaw the collection, processing, and analysis of large datasets from spectroscopic experiments. Applied data cleaning, normalization, and transformation techniques to ensure the accuracy and usability of data in material analysis.

Material Failure Analysis: Engaged in the analysis of material failures, particularly in industrial applications. Used data-driven approaches to identify root causes and improve material design and application.

Collaborative Research Projects: Actively participated in interdisciplinary research projects, employing data analytics to contribute to material science innovations. Focused on optimizing experimental procedures and data analysis methodologies to enhance material characterization and understanding.

My Education

Take a look at my formal education.

Master's degree, Materials Science, Pontificia Universidad Católica de Chile | 2018
Engineer's degree, Industrial Engineering and Operations Research, Universidad Mayor | 2018
Bachelor's degree, Education, Pontificia Universidad Católica de Chile | 2013
Bachelor's degree, Engineering Physics/Applied Physics, Pontificia Universidad Católica de Chile | 2009
