Data Science

10 Guidelines for A/B Testing

November 2021

Your Presenter(s)

Emily Robinson

Emily works as a Data Analyst at Etsy with their search team to design, implement, and analyze experiments on the ranking algorithm, UI changes, and new features. In summer 2016, she completed Metis’s three-month, full-time Data Science Bootcamp, where she did several data science projects, ranging from using random forests to predict successful projects on DonorsChoose.org to building an application in R Shiny that helps data science freelancers find their best-fit jobs. Before Metis, she graduated in June from INSEAD with a Master’s degree in Management (specialization in Organizational Behavior). She also earned her bachelor’s degree from Rice University in Decision Sciences, an interdisciplinary major she designed that focused on understanding how people behave and make decisions.

Summary

A/B testing is an essential strategy for evaluating the effects of changes on web platforms. By comparing two versions of a webpage or feature, companies can make informed decisions about what to launch. The presenter, a data scientist with a social sciences background, stressed that careful experiment design and analysis are what make the results dependable. A/B testing is used extensively by tech leaders like Amazon and Facebook and is available even to smaller companies through platforms like Optimizely. The webinar outlined key principles for successful A/B testing, including choosing a single key metric, running power calculations, not stopping tests early, and understanding the risks of multiple hypothesis testing. The presenter also pointed out common mistakes, such as sample ratio mismatch and overcomplicating the analysis with advanced statistical techniques. The session wrapped up with practical advice on involving data scientists throughout the experimental process and balancing thorough testing against practical implementation.

Key Takeaways:

  • Identify one key metric per experiment to simplify decision-making.
  • Perform power calculations to determine the duration and feasibility of tests.
  • Avoid stopping tests early, which inflates the false positive rate.
  • Be wary of multiple hypothesis testing, which can inflate false positive rates.
  • Include data scientists throughout the experimental process to ensure effective analysis.
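The takeaway about stopping tests early can be illustrated with a small simulation. The sketch below is not from the webinar: it runs A/A experiments (both groups share the same true conversion rate, so any "significant" result is a false positive) and compares checking a two-proportion z-test at every interim look against checking only once at the end. The function names, the number of looks, and the traffic figures are all hypothetical.

```python
import math
import random


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): no evidence of a difference
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_peeking(n_experiments=400, looks=10, users_per_look=100,
                     rate=0.10, alpha=0.05, seed=0):
    """Return (false positive rate when peeking, false positive rate at the end)."""
    rng = random.Random(seed)
    flagged_while_peeking = 0
    flagged_at_end = 0
    for _ in range(n_experiments):
        conv_a = conv_b = n = 0
        ever_significant = False
        p = 1.0
        for _ in range(looks):
            # Both groups draw from the same rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            p = z_test_p_value(conv_a, n, conv_b, n)
            if p < alpha:
                ever_significant = True  # a peeker would have stopped here
        flagged_while_peeking += ever_significant
        flagged_at_end += p < alpha
    return flagged_while_peeking / n_experiments, flagged_at_end / n_experiments


if __name__ == "__main__":
    peeking_rate, final_rate = simulate_peeking()
    print(f"false positives when stopping at first significant look: {peeking_rate:.1%}")
    print(f"false positives when testing once at the end: {final_rate:.1%}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the end-of-test rate, and in simulations like this it is typically several times higher.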

Deep Dives

Understanding A/B Testing

A/B testing compares two versions of a webpage or feature to determine which one performs better. Users are randomly assigned to one of the versions, and their behavior is measured to estimate the impact of the change. The technique is common at large tech companies and is also accessible to smaller businesses through platforms like Optimizely. Its primary advantage is that random assignment isolates the effect of a single change, giving a clear view of how user behavior shifts. By contrast, launching a change without a proper test invites misinterpretation, because external factors may have influenced the outcome. As the presenter pointed out, "generating numbers is easy; generating numbers you should trust is hard."
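Random assignment is worth verifying before trusting any result. One common check targets sample ratio mismatch, the mistake named in the summary: if the experiment was meant to split traffic 50/50 but the observed group sizes differ by more than chance would explain, something in the assignment or logging is likely broken. The sketch below is an illustration rather than the presenter's code. It uses a chi-square goodness-of-fit test with one degree of freedom, and the function name and example counts are hypothetical.

```python
import math


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """P-value of a chi-square test that group sizes match the planned split.

    With two groups the statistic has one degree of freedom, so the
    survival function reduces to erfc(sqrt(chi2 / 2)).
    """
    total = n_control + n_treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical counts: a planned 50/50 split that came out 5,000 vs 5,300.
    p = srm_p_value(5000, 5300)
    print(f"SRM check p-value: {p:.4f}")
    if p < 0.01:  # the threshold is a judgment call, not a fixed standard
        print("Group sizes look suspicious; investigate before reading results.")
```

A very small p-value here says nothing about which version is better. It only signals that the split itself deviates from the plan, which usually means the metric comparison should not be trusted yet.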

Importance of a Single Key Metric

When conducting A/B tests, it is vital to define a single key metric that serves as the primary measure of success. This simplifies decision-making by focusing on the most important outcome. Additional metrics can be monitored as guardrails, but the key metric should guide the overall evaluation. For instance, at DataCamp, an experiment may target registrations or course starts as its key metric while subscriptions remain a guardrail metric, to ensure the change does not hurt overall revenue. As the presenter noted, "having one key metric per experiment simplifies decision-making and ensures alignment with business goals."

Conducting Power Calculations

Power calculations determine whether an A/B test is feasible and how long it needs to run. They estimate the number of participants required to detect an effect of a given size, which prevents launching tests that can never collect enough data. The presenter emphasized that power is about avoiding false negatives, so that real differences are not overlooked. Tools like Booking.com's power calculator can help in planning a test by estimating the required sample size or the duration needed to observe a meaningful change. This step ensures that resources are allocated efficiently and that tests are designed to produce actionable results.
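As a rough illustration of what such a calculation involves, the sketch below uses the standard normal-approximation formula for comparing two conversion rates. It is not the formula behind Booking.com's calculator, which may differ; the function names, the default significance level of 0.05, and the default power of 0.8 are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided two-proportion test."""
    if baseline_rate == expected_rate:
        raise ValueError("The two rates must differ to compute a sample size.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls false positives
    z_power = NormalDist().inv_cdf(power)          # controls false negatives
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def days_needed(n_per_group, daily_users, groups=2):
    """Days required if `daily_users` enter the experiment each day."""
    return math.ceil(n_per_group * groups / daily_users)


if __name__ == "__main__":
    # Hypothetical scenario: 10% baseline conversion, hoping to detect 12%.
    n = sample_size_per_group(0.10, 0.12)
    print(f"users per group: {n}")
    print(f"days at 1,000 users per day: {days_needed(n, 1000)}")
```

The required sample size grows with the inverse square of the effect: halving the difference to be detected roughly quadruples the number of users, which is why small expected lifts can make a test infeasible on low-traffic pages.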

Challenges of Multiple Hypothesis Testing

Testing many hypotheses at once, for example by examining numerous segments or metrics in the same experiment, increases the likelihood of false positives and can lead to misleading conclusions if not managed properly. The presenter advised against exploring every possible user segment for differences and recommended pre-specifying hypotheses to maintain test integrity. When multiple hypothesis testing is unavoidable, statistical corrections can adjust for the multiple comparisons, but they make each individual test more conservative. As the presenter explained, "checking all possible segments will almost surely find some segment where something appears to change, but it didn't really."
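The arithmetic behind this warning is short. If each of m independent tests has a false positive rate of alpha, the chance of at least one false positive is 1 - (1 - alpha)^m. The sketch below computes that quantity and applies the Holm-Bonferroni procedure as one example of a correction. The webinar summary does not name a specific method, so Holm is an assumption chosen for illustration, and the p-values are made up.

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni: return one reject/keep flag per input p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value faces alpha/m, the next alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are kept
    return reject


if __name__ == "__main__":
    print(f"20 segments at alpha=0.05: {family_wise_error_rate(20):.0%} "
          "chance of at least one false positive")
    # Hypothetical p-values for four pre-specified segments.
    segment_p_values = [0.001, 0.04, 0.03, 0.20]
    print(holm_reject(segment_p_values))
```

In this made-up example, three of the four segments would look significant at 0.05 if tested naively, but only the first survives the correction. That is the conservatism the text refers to, and it is why pre-specifying a small number of hypotheses is usually preferable to correcting after the fact.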

