Course

Introduction to Bioconductor in R

Skill level: Intermediate

Updated December 2022

Learn bioinformatics analysis with essential Bioconductor packages, using virus, fungus, human, and plant data!

R, Probability & Statistics

4 hours

14 videos

54 exercises

4,050 XP

18,471

Statement of Accomplishment

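Course Description

From medicine to biotech, biological research is increasingly centered on sequence analysis. Researchers now generate vast amounts of data, from targeted regions to whole genomes, and must analyze it to answer biological questions. To get you started, this course introduces the Bioconductor project. Bioconductor builds and provides the infrastructure for sharing software tools (packages), workflows, and datasets for the analysis and comprehension of genomic data. It is an open software resource that anyone can access and that is developed collaboratively by its community. By the end of the course, you will be able to use essential Bioconductor packages and will understand the infrastructure and some of its built-in datasets. Working with real data from a range of species, you will get hands-on experience with BSgenome, Biostrings, IRanges, GenomicRanges, TxDb, ShortRead, and Rqc!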

Prerequisites

Introduction to R

Introduction to the Tidyverse

1

What is Bioconductor?

In this chapter, you will get hands-on with Bioconductor, the specialized repository for bioinformatics software developed and maintained by the R community. You will learn how to install and use Bioconductor packages. You will also be introduced to S4 objects and functions, because most Bioconductor packages are built on the S4 object system. Finally, you will use a real genomic dataset from a fungus to explore the BSgenome package.
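As a rough illustration of the S4 ideas mentioned above, the sketch below defines a small class in base R. The class `ToyGenome`, its slots, and the generic `genomeSize` are hypothetical names invented for this example; they are not part of Bioconductor. The commented-out lines show the usual way to install a Bioconductor package with BiocManager.

```r
# Typical installation route (commented out so the sketch runs offline):
# install.packages("BiocManager")
# BiocManager::install("BSgenome")

library(methods)

# Hypothetical S4 class for illustration only; not a Bioconductor class
setClass(
  "ToyGenome",
  slots = c(organism = "character", chrom_lengths = "numeric")
)

# A generic function and a method that dispatches on the class
setGeneric("genomeSize", function(x) standardGeneric("genomeSize"))
setMethod("genomeSize", "ToyGenome", function(x) sum(x@chrom_lengths))

# Made-up chromosome lengths
toy <- new(
  "ToyGenome",
  organism = "toy yeast",
  chrom_lengths = c(chrI = 230, chrII = 813)
)

isVirtualClass("ToyGenome")  # FALSE: the class can be instantiated
slotNames(toy)               # names of the slots defined above
genomeSize(toy)              # 1043
```

Real Bioconductor classes follow the same pattern: a formal class definition with typed slots, plus accessor functions and methods that dispatch on the class.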

Introduction to the Bioconductor Project

Bioconductor version

BiocManager to install packages

The role of S4 in Bioconductor

S4 class definition

Interaction with classes

Introducing biology of genomic datasets

Discovering the yeast genome

Partitioning the yeast genome

Available genomes

2

Biostrings and When to Use Them?

The Biostrings package provides memory-efficient string containers, along with matching algorithms and other utilities for fast manipulation of large biological sequences or sets of sequences. How efficient can you become by using the right containers for your sequences? You will learn about alphabets and sequence manipulation using the tiny genome of a virus.
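To give a feel for these containers, here is a minimal sketch that assumes the Biostrings package is installed. The 12-base sequence is made up for the example and is not taken from the Zika virus genome used in the course.

```r
library(Biostrings)

# Made-up sequence for illustration only
dna <- DNAString("ATGGCCTTAAGC")

# Count each base (A, C, G, T) plus anything else
alphabetFrequency(dna, baseOnly = TRUE)

# Common manipulations
reverseComplement(dna)   # GCTTAAGGCCAT
RNAString(dna)           # same sequence with T replaced by U
translate(dna)           # MALS

# Search for a pattern; the result records where each match starts and ends
hits <- matchPattern("TTAA", dna)
start(hits)              # 7
```

`DNAString` holds a single sequence; a `DNAStringSet` holds a collection of them and supports the same functions.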

Introduction to Biostrings

Exploring the Zika virus sequence

Biostrings containers

Manipulating Biostrings

Sequence handling

From a set to a single sequence

Subsetting a set

Common sequence manipulation functions

Why are we interested in patterns?

Searching for a pattern

Finding Palindromes

Finding a conserved region within six frames

Looking for a match

3

IRanges and GenomicRanges

The IRanges and GenomicRanges packages also provide containers, in this case for storing and manipulating genomic intervals and variables defined along a genome. Many other Bioconductor packages build on the infrastructure they provide. You will learn how to use these containers and their associated metadata to manipulate your sequences. The dataset you will be looking at is a gene of interest in the human genome.
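The following sketch shows how such interval containers might be built and queried, assuming the GenomicRanges package is installed. The coordinates and gene identifiers are invented for illustration and do not describe the real ABCD1 locus.

```r
library(GenomicRanges)  # also attaches IRanges

# Plain integer intervals (illustrative coordinates only)
ir <- IRanges(start = c(10, 50, 120), end = c(40, 90, 150))
width(ir)  # 31 41 31

# Genomic intervals: ranges plus chromosome, strand, and metadata columns
gr <- GRanges(
  seqnames = "chrX",
  ranges = ir,
  strand = c("+", "+", "-"),
  gene_id = c("geneA", "geneA", "geneB")  # hypothetical metadata
)

# A query window; its strand defaults to "*", which matches either strand
query <- GRanges("chrX", IRanges(start = 35, end = 60))

overlapsAny(gr, query)       # TRUE TRUE FALSE
subsetByOverlaps(gr, query)  # the two ranges that intersect the window
mcols(gr)$gene_id            # access the metadata column
```

Accessors such as `seqnames()`, `ranges()`, `strand()`, and `mcols()` retrieve each component of a `GRanges` object.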

IRanges and Genomic Structures

Constructing IRanges

Interacting with IRanges

Gene of interest

From tabular data to Genomic Ranges

GenomicRanges accessors

ABCD1 mutation

Human genome chromosome X

Manipulating collections of GRanges

A sequence window

Is it there?

More about ABCD1

How many transcripts?

From GRangesList object into a GRanges object

4

Introducing ShortRead

ShortRead is a package for reading, manipulating, and assessing FASTA and FASTQ files. You can subset, trim, and filter the sequences of interest, and even generate a quality report. As a bonus, the final exercises introduce Rqc, a tool for parallel quality assessment. Best of all, you will work with plant genome sequences!
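A minimal sketch of reading and filtering a FASTQ file follows, assuming the ShortRead package is installed. The two reads are made up and written to a temporary file so the example is self-contained; they are not the plant data used in the course.

```r
library(ShortRead)

# Write a two-read toy FASTQ file (made-up reads, for illustration only)
fq_path <- tempfile(fileext = ".fastq")
writeLines(c(
  "@read1", "ACGTACGTAC", "+", "IIIIIIIIII",
  "@read2", "ACGTNNGTAC", "+", "IIII##IIII"
), fq_path)

reads <- readFastq(fq_path)
length(reads)    # 2
sread(reads)     # the sequences
quality(reads)   # the encoded base qualities

# Keep only reads without ambiguous bases (N)
no_n <- nFilter(threshold = 0)
clean <- reads[no_n(reads)]
length(clean)    # 1
```

Filters created this way return a logical vector, so they can be combined or applied while streaming through larger files.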

Sequence files

Reading in files

Exploring a fastq file

Extract a sample from a fastq file

Sequence quality

Exploring sequence quality

Base quality plot

Try your own nucleotide frequency plot

Match and filter

Filtering reads on the go!

Removing duplicates

More filtering!

Multiple assessment

Plotting cycle average quality
