Lecx5.ppt


CAP 5510: Introduction to Bioinformatics
Giri Narasimhan
ECS 254; Phone: x3748
giri@cis.fiu.edu
www.cis.fiu.edu/~giri/teach/bioinfs07.html
3/11/08 CAP5510 1

Reading
The following slides come from a series of talks by Rafael Irizarry of Johns Hopkins. Much of the material can be found in detail in the following papers, available from [http://www.biostat.jhsph.edu/~ririzarr/papers/]:
- Irizarry, RA, Hobbs, B, Collin, F, Beazer-Barclay, YD, Antonellis, KJ, Scherf, U, Speed, TP (2003). Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Biostatistics 4(2):249-264.
- Bolstad, BM, Irizarry, RA, Astrand, M, Speed, TP (2003). A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2):185-193.

Inference Process

Affymetrix GeneChip Design

Workflow: Analyzing Affy data

Affy Files
- DAT file: image file, about 10 million pixels, 30-50 MB
- CEL file: cell intensity file with probe-level PM and MM values
- CDF file: chip description file describing which probes go in which probe sets and the location of probe-pair sets (genes, gene fragments, ESTs)

Image analysis & Background Correction
- Each probe cell: 10 x 10 pixels
- Gridding estimates the location of probe cell centers
- Signal is computed by:
  - Ignoring the outer 36 pixels, leaving an 8 x 8 pixel area
  - Taking the 75th percentile of the signal from the 8 x 8 pixel area
- Background signal is computed as the average of the lowest 2% of probe cell values, which is then subtracted from the individual signals
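The signal and background steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Affymetrix implementation: the function names, the linear-interpolation percentile, and the toy pixel values are all made up for the example.

```python
def percentile(values, p):
    """p-th percentile (0-100), linear interpolation between sorted values (assumed rule)."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no values")
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def cell_signal(cell):
    """cell: 10 x 10 list of pixel intensities for one probe cell."""
    # Drop the 36 border pixels, keeping the inner 8 x 8 area.
    inner = [px for row in cell[1:-1] for px in row[1:-1]]
    return percentile(inner, 75)


def background_corrected(signals, fraction=0.02):
    """Subtract the mean of the lowest `fraction` of cell signals from every signal."""
    xs = sorted(signals)
    k = max(1, int(len(xs) * fraction))
    background = sum(xs[:k]) / k
    return [s - background for s in signals]


# Toy probe cell: bright border pixels (1000) around inner values 1..64.
cell = [[1000.0] * 10 for _ in range(10)]
for r in range(1, 9):
    for c in range(1, 9):
        cell[r][c] = float((r - 1) * 8 + c)

signal = cell_signal(cell)  # border pixels do not influence the result
corrected = background_corrected([float(v) for v in range(100, 200)])
```

Note that the border pixels, however bright, never enter the percentile, which is the point of discarding them.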

Analyzing Affy data
- MAS 4.0
  - Works with PM-MM
  - Negative values result very often
  - Very noisy for low expressed genes
  - Averages without log-transformation
- dChip [Li & Wong, PNAS 98(1):31-36]
  - Accounts for probe effect
  - Uses non-linear normalization
  - Multi-chip analysis reveals outliers
- MAS 5.0
  - Improves on problems with MAS 4.0

Why use log-transforms?
[Figure: two panels plotting SD against average intensity]
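A small numeric illustration of the idea behind the SD versus average intensity plots, assuming noise that scales with intensity. The replicate values below are invented for the example and are not data from the slides.

```python
from math import log2
from statistics import stdev

# Made-up replicate intensities with the same relative spread (about 10%).
low = [90.0, 100.0, 110.0]          # weakly expressed gene
high = [9000.0, 10000.0, 11000.0]   # strongly expressed gene

# On the raw scale the SD grows with the average intensity.
raw_sd = (stdev(low), stdev(high))

# On the log2 scale the two SDs are (numerically) the same.
log_sd = (stdev([log2(v) for v in low]), stdev([log2(v) for v in high]))

print(raw_sd, log_sd)
```

On the raw scale the second gene looks 100 times noisier; after the log transform the spread no longer depends on the intensity level.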

Problem with using (transformed) PM-MM

Bimodality for large expression values

MAS 5.0
- MAS 5.0 is Affymetrix software for microarray data analysis.
- An ad hoc background procedure is used.
- For summarization, they use: $\mathrm{Signal} = \mathrm{TukeyBiweight}\{\log(PM_j - MM_j^*)\}$
- Tukey biweight: $B(x) = (1 - (x/c)^2)^2$ if $|x| < c$, and $B(x) = 0$ otherwise
- An ad hoc scale normalization is used (see also the PhD thesis by Astrand)
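A minimal sketch of a one-step Tukey biweight estimate, using the weight function B(x) from this slide. Several details are assumptions not stated on the slide: distances are measured from the median in units of the MAD, and the tuning constant c = 5 and the small eps guard are illustrative defaults. The function name is hypothetical.

```python
from statistics import median


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate (c and eps are assumed defaults)."""
    m = median(values)
    mad = median(abs(v - m) for v in values)  # median absolute deviation
    weights = []
    for v in values:
        u = (v - m) / (c * mad + eps)         # scaled distance from the median
        # B(u) = (1 - u^2)^2 inside the cutoff, 0 outside.
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Toy log-scale probe values with one gross outlier.
probe_values = [1.0, 1.1, 0.9, 1.0, 8.0]
robust = tukey_biweight(probe_values)
plain_mean = sum(probe_values) / len(probe_values)
```

The outlier receives weight zero, so the robust estimate stays near 1.0 while the plain mean is pulled up to 2.4.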

2 replicate arrays
- Expression values from corresponding probes are highly correlated
- Expression values are not correlated when probes are randomly partitioned

We have to deal with variations!

MvA Plots
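For reference, an MvA plot shows M (the log ratio of two intensities) against A (their average log intensity) for each probe. A minimal sketch of the two coordinates, assuming log base 2 and strictly positive intensities; the function name and sample values are illustrative.

```python
from math import log2


def mva(x, y):
    """Return (M, A) lists for two arrays of positive intensities, probe by probe."""
    m = [log2(a) - log2(b) for a, b in zip(x, y)]        # log ratio
    a = [(log2(p) + log2(q)) / 2 for p, q in zip(x, y)]  # mean log intensity
    return m, a


m_values, a_values = mva([8.0, 4.0], [2.0, 4.0])
```

For replicate arrays the M values should scatter around zero at every value of A.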

Spike-in Experiment
- Replicate RNA samples were hybridized to various arrays
- Some probe sets were spiked in at different concentrations across the different arrays
- The goal was to see whether these spiked probe sets stood out as differentially expressed

Analyzing Spike-in data with MAS 5.0

Robust Multi-array Average (RMA)
- Background correction: performed separately for each array
  - Find E{Sig | Sig + Bgd = PM}
  - Bgd is assumed normal and Sig exponential
- Normalization: uses quantile normalization to achieve identical empirical distributions of intensities on all arrays
- Summarization: performed separately for each probe set by fitting a probe-level additive model
  - Uses the median polish algorithm to robustly estimate expression on a specific chip
- Also see GCRMA [Wu, Irizarry et al., 2004] and the PhD thesis by Astrand
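The summarization step can be sketched with a plain median polish on one probe set. This is an illustrative sketch, not the RMA code: it assumes the input is a table of background-corrected, normalized, log-scale intensities with probes as rows and arrays as columns, and it reports the overall effect plus each column effect as the expression value per array. The function name, iteration limit, and tolerance are made up for the example.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-6):
    """Fit table[i][j] = overall + row_eff[i] + col_eff[j] + resid[i][j] by median sweeps."""
    rows, cols = len(table), len(table[0])
    resid = [list(r) for r in table]
    overall, row_eff, col_eff = 0.0, [0.0] * rows, [0.0] * cols
    for _ in range(max_iter):
        change = 0.0
        for i in range(rows):                      # sweep out row (probe) medians
            med = median(resid[i])
            row_eff[i] += med
            resid[i] = [v - med for v in resid[i]]
            change += abs(med)
        med = median(col_eff)                      # re-center column effects
        overall += med
        col_eff = [c - med for c in col_eff]
        for j in range(cols):                      # sweep out column (array) medians
            med = median(resid[i][j] for i in range(rows))
            col_eff[j] += med
            for i in range(rows):
                resid[i][j] -= med
            change += abs(med)
        med = median(row_eff)                      # re-center row effects
        overall += med
        row_eff = [r - med for r in row_eff]
        if change < tol:
            break
    return overall, row_eff, col_eff, resid


# Toy probe set: 3 probes x 3 arrays, exactly additive (probe effect + array level).
probe_effects = [0.0, 1.0, -1.0]
array_levels = [10.0, 11.0, 12.0]
table = [[p + a for a in array_levels] for p in probe_effects]

overall, row_eff, col_eff, resid = median_polish(table)
expression = [overall + c for c in col_eff]  # one summary value per array
```

Because medians are used instead of means, a single aberrant probe on one array affects the fitted array effects far less than it would in a least-squares fit.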

Analyzing Spike-in data with RMA

MvA and q-q plots: MAS 4.0 and MAS 5.0

MvA and q-q plots: MBEI and RMA

Before and after quantile normalization
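Quantile normalization forces every array to share the same empirical distribution. A minimal sketch, assuming the reference distribution is the mean across arrays at each rank and that ties are broken by position; the function name and the toy values are illustrative.

```python
def quantile_normalize(arrays):
    """arrays: list of equal-length lists, one per array. Returns normalized copies."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]                  # replace value by reference quantile
        normalized.append(out)
    return normalized


before = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
after = quantile_normalize(before)
```

After normalization each array contains exactly the same set of values, only in a different probe order, so box plots or densities of the arrays coincide.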

Bioconductor
- Bioconductor is an open source and open development software project for the analysis of biomedical and genomic data.
- World-wide project started in 2001
- R and the R package system are used to design and distribute software
- A commercial version of the Bioconductor software is called ArrayAnalyzer

R: A Statistical Programming Language
- Try the tutorial at: [http://www.cyclismo.org/tutorial/r/]
- Also at: [http://www.math.ilstu.edu/dhkim/rstuff/rtutor.html]

Installing a package from Bioconductor
Let's consider LIMMA (Linear Models for Microarray Data), a software package for the analysis of gene expression microarray data, especially the use of linear models for analyzing designed experiments and the assessment of differential expression. The package includes pre-processing capabilities for two-color spotted arrays. The differential expression methods apply to all array platforms and treat Affymetrix, single-channel, and two-channel experiments in a unified way.
Here is an installation script:
> source("http://www.bioconductor.org/biocLite.R")
> biocLite("limma")
> biocLite("statmod")
To install some other package (say, affy), type:
> biocLite("affy")

Analyzing E. coli Lrp Data (Affymetrix)
- Follow the instructions in Section 8.3 of the LIMMA User's Guide (http://pbil.univ-lyon1.fr/library/limma/doc/usersguide.html)
- The data for the experiment are not taken from the address given in Sec. 8.3, but from: http://cybert.microarray.ics.uci.edu/tutorial/affy%20data/