Mark Pedigo

Instructor
Certified

Data Scientist

1904labs | Hopkins, MN

Technologies

My Portfolio Highlights

My New Track

Python Fundamentals

My New Course

Introduction to Python

Mindful analyst, decoding patterns to unravel the big picture.

My Work

Take a look at my latest work.

DataLab

Course Notes: Introduction to PySpark

course

Introduction to R

course

Introduction to Python

Authored Curriculum

Take a look at the content that I created on DataCamp.

My Most Recent Course

Case Study: Building Software in Python

3 hours | 10 Videos | 29 Exercises | 272 Learners

My Certifications

These are the industry credentials that I’ve earned.

AI Fundamentals

Data Literacy

DataCamp Course Completion

Take a look at all the courses I’ve completed on DataCamp.

My Work Experience

Where I've interned and worked during my career.

DataCamp | May 2024 - Present

Instructor / Writer

Write articles and create classes for DataCamp.

UnitedHealth Group | May 2024 - Jan 2025

Principal Data Scientist

• Clinic Optimization and Rationalization. Spearheaded the evaluation and selection of optimization solutions for clinical real estate. Organized software demonstrations and consulted with real estate firms, ultimately partnering with IBM to develop a proof-of-concept (POC) optimization model. Collaborated on requirements gathering and facilitated weekly sprint meetings.
• Procurement ML models. Supervised a direct report in the design and implementation of ML models to predict turn-around time and approval authorities for procurement requests.
• A/B Testing. Designed an A/B test to measure the sentinel effect of travel procedures.
• Team Development and Education. Delivered a series of presentations to the corporate services team on high-level machine learning approaches and their potential applications in procurement and real estate.

1904labs | Jul 2022 - Present

Senior Data Scientist

Client: Kingdom Capital/PercayAI (now Bordo AI). Project: Missouri Medicaid Services.

We worked with PercayAI to develop methods of identifying potential fraud by analyzing claims data. The product was meant to help fraud investigators and analysts by surfacing potential flags that they could then investigate. As a research data scientist, my duties were as follows.
• Researched methods that are used to defraud the healthcare system, and developed tools to discover potential fraud schemes within the data.
• One of these detection methods is the so-called “impossible day,” in which the recorded claims data describes events that could not feasibly have occurred in the time span indicated. For example, a patient might be recorded as simultaneously being in the hospital and in a geographically distant outpatient setting.
• To this end, I created algorithms to determine “inpatient windows” that indicate when a patient is in the hospital. Outpatient and medical claims can then be compared against these windows to determine whether a patient has been claimed to be in multiple places at once.
• Created data for a knowledge graph for a demo product.

Client: Ring/Amazon. Project: AI-Enabled Customer Service (AIECS).

Ring receives more than 8 million calls per year. To shorten the average call time, and thus reduce expenses, they wished to make customer service calls more efficient by using chatbots to authenticate the user, answer "where is my order" questions, and dispatch the call to the appropriate channel. As a data scientist, my duties were as follows.
• Helped design and build a dashboard to visualize KPIs and performance metrics using AWS QuickSight.
• Worked with DX/Conversation Designers to optimize the performance of an AWS Lex bot. Wrote code to connect to the bot and gather performance metrics (recall, precision) per intent using a "tuning" set of utterances.
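The per-intent precision and recall mentioned for the Lex bot tuning work can be illustrated with a minimal sketch. This is not the project's code: the function name, the intent names, and the tuning set below are hypothetical, and the sketch assumes the bot's predicted intent for each utterance has already been collected, so no AWS calls are shown.

```python
from collections import Counter


def per_intent_metrics(examples):
    """Compute precision and recall per intent.

    `examples` is an iterable of (expected_intent, predicted_intent)
    pairs, one per utterance in the tuning set.
    """
    tp = Counter()  # predicted intent matched the expected intent
    fp = Counter()  # intent was predicted, but another was expected
    fn = Counter()  # intent was expected, but another was predicted
    for expected, predicted in examples:
        if expected == predicted:
            tp[expected] += 1
        else:
            fp[predicted] += 1
            fn[expected] += 1

    metrics = {}
    for intent in set(tp) | set(fp) | set(fn):
        predicted_total = tp[intent] + fp[intent]
        expected_total = tp[intent] + fn[intent]
        metrics[intent] = {
            # None marks a metric that is undefined (zero denominator).
            "precision": tp[intent] / predicted_total if predicted_total else None,
            "recall": tp[intent] / expected_total if expected_total else None,
        }
    return metrics


# Hypothetical tuning set; the intent names are invented for illustration.
tuning_set = [
    ("OrderStatus", "OrderStatus"),
    ("OrderStatus", "Authenticate"),
    ("Authenticate", "Authenticate"),
    ("Authenticate", "Authenticate"),
]
results = per_intent_metrics(tuning_set)
```

Reporting the metrics per intent, rather than as one overall score, shows which intents the bot confuses with one another, which is the kind of information a conversation designer can act on.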

MedeAnalytics | Mar 2021 - Jul 2022

Senior Data Scientist

MedeAnalytics delivers actionable insights that improve financial, operational, and clinical outcomes for payers and providers across the UK and US. Their platform includes advanced analytics technologies such as ML, guided analysis, and predictive analytics. They provide a platform-as-a-service offering that enables clients to build their own healthcare applications.

As a senior data scientist at MedeAnalytics, I was a senior-level member of a team that used statistical and ML techniques to solve problems in provider and payer healthcare. We developed sophisticated data science and analytics products to drive data-driven decisions within the organization and for its clients. I also mentored a junior-level data scientist, drawing on my many years of university teaching, through presentations, instruction, and hands-on techniques (e.g., pair programming) to grow a skill set in data science and Python programming. The team ensured high-quality models through in-depth peer reviews, SonarQube, and unit testing.

Project: Mississippi Department of Medicaid (MSDOM)
• For this project, the team was tasked with identifying key influencers and predictors of infant mortality, maternal mortality, and NICU stays.
• We designed and wrote SQL code to transform and aggregate messy real-world healthcare claims data into a form consumable by our ML models.
• We interfaced with the clients. I presented the results to the client and incorporated feedback and further requests into the product.

Project: Statistical Toolbox
• The statistical toolbox is a microservice that ingests a JSON request from the client (containing the data, the statistics required, and so on), computes those statistics for the data points (and rollups), and returns a JSON-formatted result.
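The request/response flow of the Statistical Toolbox can be sketched roughly as follows. The JSON field names (`data`, `statistics`, `results`), the supported statistics, and the `handle_request` function are assumptions made for illustration, not the actual service contract, and rollups are left out.

```python
import json
import statistics

# Hypothetical mapping from a statistic's name to its implementation.
STATISTICS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    "max": max,
}


def handle_request(request_json):
    """Parse a JSON request, compute the requested statistics, return JSON."""
    request = json.loads(request_json)
    data = request["data"]
    results = {}
    for name in request["statistics"]:
        if name not in STATISTICS:
            # Report unknown statistics instead of failing the whole request.
            results[name] = {"error": "unsupported statistic"}
        else:
            results[name] = STATISTICS[name](data)
    return json.dumps({"results": results})


# Example request with made-up data points.
response = handle_request(
    '{"data": [1, 2, 3, 10], "statistics": ["mean", "median"]}'
)
```

In a deployed microservice a function like this would sit behind an HTTP endpoint; the sketch covers only the parse, compute, and serialize steps.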

Washington University in St. Louis | Jan 2012 - May 2022

Adjunct Professor

I taught as an adjunct professor at Washington University in St. Louis, University College, one or two evenings per week, depending on the semester.

Introduction to Data Science
• Python data science ecosystem: NumPy, SciPy, pandas, matplotlib, Seaborn, scikit-learn
• How to use NumPy effectively to write performant code using arrays, vectorized functions (ufuncs), and broadcasting
• How to use pandas to import data into data frames, how to clean that data, methods for imputing missing values, and how to join and filter data frames with SQL-like manipulations
• How to make effective visualizations using matplotlib (in general) or Seaborn
• Machine learning fundamentals: training and test sets, cross-validation, performance metrics (e.g., accuracy/confusion matrix), supervised learning (regression and classification), unsupervised learning (clustering, anomaly detection)

Introduction to Programming in Python
• “Textbook topics”: Flow of control, strings, functions, files, exceptions, data structures (lists, tuples, dictionaries, sets), OOP
• Supplemental project topics (depending on the semester’s interest and demand): pandas basics, graphing fundamentals (matplotlib), web scraping, maps using Folium

Other courses I have taught: Differential Equations, Calculus II-IV, Introduction to Programming in R, Introduction to Statistics, Foundations of Mathematics

Centene Corporation | May 2020 - Mar 2021

Data Scientist Engineer II

I created web visualizations using the SAFe Agile methodology, CI/CD processes, and various web visualization techniques. I also built backends using Node.js and the databases PostgreSQL and MongoDB.

Project: Special Investigative Unit Provider Pre-Pay Interface
The Special Investigative Unit Provider Pre-Pay Interface project is designed to help investigators pinpoint fraud, waste, and abuse in the healthcare system and monitor providers that have been identified as displaying abnormal patterns of behavior.
• Developed visualizations using JavaScript, Node.js, React, JSX, Material UI, and MDBootstrap.

Project: Business Planning Application
The Business Planning Application consists of two main modules: the data management module, which provides a dashboard view of employee, organizational, and financial statistics; and the working forecast, a projection of the costs a product or department will incur in the future, which helps leadership determine whether a particular product or department is on track to remain within budget for a specified time period or whether action should be taken to resolve any variance.
• Wrote code to calculate variances from the actual costs, budget, and fixed costs, based on internal and external rosters. The user could create hypothetical situations by modifying the existing rosters.
• Developed web visualizations and callbacks using Python and Plotly Dash.

Project: preCENteR
The preCENTeR project displayed web visualizations of Agile metrics for Centene teams.
• Developed web visualizations of Agile charts (e.g., burn down chart, lag chart) using R and R Shiny.

Centene Corporation | Jul 2019 - May 2020

Data Scientist / Data Visualization

(Through Neteffects) Built R Shiny visualizations of Agile metrics.

Allscripts | Sep 2016 - Jun 2019

Data Scientist

EPSi, based in Chesterfield, MO, develops integrated financial analytics support, budgeting, and planning solution software for the healthcare industry.
• Data Scientist/Statistical Modeler for the Real Cost Forecasting product.
• Developed statistical models for budget forecasting following the CRISP-DM process.
• Developed POCs in R and Python.
• Presented results (proofs of concept for initial approaches, error rates for incremental improvements) to executive-level management.
• Worked with subject matter experts in health care (operational and clinical) to improve existing models.
• Designed and developed production-level code in coordination with the development team using the Python data analysis/statistical ecosystem (pandas, NumPy, matplotlib, Seaborn).
• Refactored the production code with an eye towards efficiency using NumPy.
• Developed forecasting code for the AWS cloud using the SageMaker ML platform and Amazon’s DeepAR recurrent neural net forecasting algorithm.
• Designed and developed a web-based demo using Python, Flask, JavaScript, HTML5, CSS3, and HighCharts for the 2017 Allscripts Client Experience (ACE) convention in Chicago.
• Wrangled data into a tidy form usable by the forecasting code.
• Helped identify requirements and estimate timelines for the project plan.
• Documented algorithm details and uploaded them to Confluence.
• Developed sample test cases for QA.
• Interfaced with the management, development, and QA teams.
• Healthcare finance variance visualization prototypes.
• Prototyped dashboards to reveal cost variances and drill down to identify potential causes.
• Through drill downs, showed the top-down flow from facilities to departments to procedures to providers.
• Worked with a diverse team of technology specialists and health care subject matter experts.
• Developed visualizations in Tableau using information extracted from tables in a PostgreSQL database system.

Booz Allen Hamilton | Mar 2014 - Sep 2016

Analyst/Developer, Strategic Innovation Group

Based in the DC area, Booz Allen Hamilton provides consulting, analysis, and engineering services to public and private sector clients.

Client: NASA
• Project Cost Estimating Capability (PCEC). Developed an Excel add-in (in VBA) with a robust and transparent collection of NASA cost-estimating relationships (CERs), work breakdown structures, and risk analysis via Monte Carlo simulation to facilitate the creation of cost estimates for robotic/crewed spacecraft and launch vehicles.
• Correlation Task. This project investigated correlations between cost and schedule for historical NASA missions. Acted as a statistics/mathematics subject matter expert (SME), wrote a literature review of pertinent material, and interacted weekly with clients. The correlation team presented the results at NASA headquarters in Washington, D.C., and wrote a paper presented at the 2015 International Cost Estimating and Analysis Association (ICEAA) Conference.
• Joint Confidence Level (JCL) Assessments. The JCL Assessments project is a training environment for non-cost estimators who need a basic understanding of cost estimating (e.g., Project Managers). Developed a project management tool that incorporated Monte Carlo simulation and Gantt charts.

Client: Scott Air Force Base
• A6 CIO Technical Recommendations. As a deliverable for this project, co-authored a document that provided recommendations for the Air Mobility Command Communications Directorate (AMC A6) to follow to successfully incorporate data analytics and visualization techniques.
• Capabilities Based Assessment (CBA). A CBA provides recommended solutions to identified capability gaps that meet an established need. On this project, served as a data analysis/technology SME.

My Education

Take a look at my formal education.

Artificial Intelligence Professional Program, Artificial Intelligence | Stanford University | 2021
(No degree), Data Science | Harvard Extension School | 2016
Doctor of Philosophy - PhD, Mathematics | Saint Louis University | 2014
BS, Computer Science | University of Missouri-Rolla | 1992

About Me

Mark Pedigo

An accomplished PhD data scientist with expertise in predictive modeling, statistical analysis, and data visualization, Mark has extensive experience in healthcare, government, and academia, including roles at UnitedHealth Group, Ring, and NASA. Known for strong problem-solving and mentorship skills, as well as exceptional communication abilities, he excels at fostering collaboration and building productive team environments.
