
Statistics & Data Analysis 4
Zhu Huaiqiu @ Peking University
A picture says more than a thousand words!

4.1 Descriptive statistics, or statistical description
Presenting data in a clear and understandable way. What does descriptive statistics do? How do you present your arguments or viewpoints?
[Figure: open-air preaching in James Street, Covent Garden, London. The preacher uses an unusual style: his main points are written on laminated plastic cards, which he sticks on a board.]

What does descriptive statistics do?
It describes the basic features of the data gathered from an experimental study in various ways; together with simple graphical analysis, it forms the basis of the quantitative analysis of data; it provides simple summaries of the sample and the measures; and it leads on to inferential statistics if there are enough data to draw a conclusion.
It is necessary to be familiar with the primary methods of describing data in order to understand phenomena and make intelligent decisions:
(1) Graphical displays of the data, in which graphs summarize the data or facilitate comparisons.
(2) Tabular description, in which tables of numbers summarize the data.
(3) Summary statistics (single numbers), which summarize the data.

(1) Collect data
(2) Classify data
(3) Summarize data
(4) Present data
(5) Proceed to inferential statistics if there are enough data to draw a conclusion

4.2 Summary statistics
1.

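2.
3. Order statistics. Let $X_1, X_2, \dots, X_n$ be a sample from the population $X$, with observed values $x_1, x_2, \dots, x_n$. Sorting the observations in ascending order gives $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$; the corresponding random variables $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ are the order statistics of the sample.

Let $X$ have distribution function $F(x)$ and density $f(x)$, with order statistics $X_{(1)}, X_{(2)}, \dots, X_{(n)}$.
(1) The joint density of $(X_{(1)}, X_{(2)}, \dots, X_{(n)})$ is
$$n!\, f(x_1) f(x_2) \cdots f(x_n), \qquad x_1 < x_2 < \dots < x_n.$$
(2) The density of $X_{(k)}$, $k = 1, 2, \dots, n$, is
$$\frac{n!}{(k-1)!\,(n-k)!}\, f(x)\, [F(x)]^{k-1}\, [1 - F(x)]^{n-k}.$$
(3) The joint density of $(X_{(k)}, X_{(l)})$, $k < l$, is
$$\frac{n!}{(k-1)!\,(l-k-1)!\,(n-l)!}\, f(x_k) f(x_l)\, [F(x_k)]^{k-1}\, [F(x_l) - F(x_k)]^{l-k-1}\, [1 - F(x_l)]^{n-l}, \qquad x_k < x_l.$$

4. Range. For a sample $X_1, X_2, \dots, X_n$ from $X$ with order statistics $X_{(1)}, X_{(2)}, \dots, X_{(n)}$,
$$R_n = \max\{X_1, X_2, \dots, X_n\} - \min\{X_1, X_2, \dots, X_n\} = X_{(n)} - X_{(1)}.$$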
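As a sanity check on the density of the $k$-th order statistic, here is a worked special case. It assumes a Uniform(0, 1) population, which is an illustrative choice and not part of the lecture, so that $f(x) = 1$ and $F(x) = x$ on $[0, 1]$.

```latex
\begin{align*}
f_{X_{(k)}}(x) &= \frac{n!}{(k-1)!\,(n-k)!}\, x^{k-1} (1-x)^{n-k}, \qquad 0 \le x \le 1, \\
% this is the Beta(k, n-k+1) density, hence
E\bigl[X_{(k)}\bigr] &= \frac{k}{n+1}. \\
% example with n = 3, k = 2 (the sample median of three draws):
f_{X_{(2)}}(x) &= 6x(1-x), \qquad \int_0^1 6x(1-x)\,dx = 6\left(\tfrac12 - \tfrac13\right) = 1 .
\end{align*}
```

The density integrates to 1 and its mean is $2/4 = 1/2$, as one would expect for the middle of three uniform draws.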

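For the range $R_n$, the mean $E(R_n)$ and the variance $D(R_n)$ are expressed in terms of the constants $c_n$ and $d_n$ and the population standard deviation $\sigma$.

5. Median. The median is the value below which 50% of the data fall. Let $X_1, X_2, \dots, X_n$ be a sample from $X$ with observed values $x_1, x_2, \dots, x_n$, sorted as $x^*_1 \le x^*_2 \le \dots \le x^*_n$. If $n = 2k+1$, the median is $x^*_{k+1}$; if $n = 2k$, it is $(x^*_k + x^*_{k+1})/2$.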

6. Mode. The value that occurs most frequently in a data set or a probability distribution. For a sample $X_1, X_2, \dots, X_n$ from $X$ with observed values $x_1, x_2, \dots, x_n$, the mode is denoted $x_M$.

Example 4.1 Plant growth (number of leaves per plant): 6, 4, 5, 4, 8, 3
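The three location measures can be computed for the plant-growth data of Example 4.1 with Python's standard library. This is a minimal sketch; the variable names are illustrative and not from the lecture.

```python
from statistics import mean, median, multimode

leaves = [6, 4, 5, 4, 8, 3]  # leaves per plant, Example 4.1

summary = {
    "mean": mean(leaves),        # arithmetic mean
    "median": median(leaves),    # n = 6 is even: average of the two middle values
    "mode": multimode(leaves),   # list of all most frequent values
}
print(summary)
```

`multimode` returns a list because a data set can have several modes; here only one value repeats.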

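7. Sample mean, sample variance and coefficient of variation. For a sample $X_1, X_2, \dots, X_n$ from $X$:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad C_v = \frac{S}{\bar{X}},$$
where $C_v$ is the coefficient of variation.

Example 4.2 Two data sets are compared: the first (20) has $\bar{X} = 25.8$, $s = 3.2$; the second (15) has $\bar{X} = 70850.0$, $s = 4264.0$.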
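The coefficient of variation lets the two data sets of Example 4.2 be compared even though their scales differ by three orders of magnitude. A small sketch, with a helper function name that is an assumption made for illustration:

```python
def coefficient_of_variation(mean, sd):
    """Return C_v = s / mean, a dimensionless measure of relative spread."""
    return sd / mean

cv_a = coefficient_of_variation(25.8, 3.2)        # first data set of Example 4.2
cv_b = coefficient_of_variation(70850.0, 4264.0)  # second data set of Example 4.2
print(f"{cv_a:.1%}  {cv_b:.1%}")
```

Although the second data set has by far the larger standard deviation, its relative variability is roughly half that of the first.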

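8. Chebyshev's theorem. Let the data $x_1, x_2, \dots, x_N$ have mean $\mu$ and standard deviation $\sigma > 0$. For any $k > 1$, at least a proportion
$$1 - \frac{1}{k^2}$$
of the data lie in the interval $[\mu - k\sigma,\ \mu + k\sigma]$:
k = 2: 75.0%
k = 3: 88.9%
k = 4: 93.8%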
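The bound can be tabulated and compared with the observed coverage of a data set. The sketch below uses a small hypothetical sample (not from the lecture) and the population standard deviation, for which the theorem holds exactly.

```python
from statistics import fmean, pstdev

def chebyshev_bound(k):
    """Lower bound 1 - 1/k^2 on the fraction of data within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def fraction_within(values, k):
    """Observed fraction of values inside [mu - k*sigma, mu + k*sigma]."""
    mu, sigma = fmean(values), pstdev(values)  # population sigma (divides by N)
    return sum(abs(v - mu) <= k * sigma for v in values) / len(values)

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]  # hypothetical, strongly skewed sample
for k in (2, 3, 4):
    print(k, f"{chebyshev_bound(k):.1%}", fraction_within(data, k))
```

The bound is deliberately conservative: it makes no assumption about the shape of the distribution, so real data usually exceed it.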

4.3 Exploratory analysis
1. 2. 3. 4. Outliers

4.3.1 Histogram
Histogram: from the Greek histos, "anything set upright", and gramma, "drawing, record, writing". A histogram is a graphical display of tabulated frequencies, shown as bars; it shows what proportion of cases fall into each of several categories.
1. Occurrence histogram
1 2 / 5 1 6~15 2 5

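Number of classes $K$:
Moore (1986): $K \approx C\, n^{2/5}$, with $C = 1 \sim 3$.
Sturges (1926): $K = 1 + 3.322 \lg n$.
Let class $k$ ($k = 1, 2, \dots, K$) have width $\Delta_k$ and contain $n_k$ observations, giving counts $n_1, n_2, \dots, n_K$. The bar height of the occurrence histogram is $n_k / \Delta_k$.
2. Frequency histogram
Frequency: $f_k = n_k / n$, $k = 1, 2, \dots, K$, with $\sum_{k=1}^{K} f_k = 1$.
Frequency density: $p_k = f_k / \Delta_k = n_k / (n \Delta_k)$, $k = 1, 2, \dots, K$, with $\sum_{k=1}^{K} p_k \Delta_k = 1$.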
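Both rules for the number of classes are one-line formulas. The sketch below evaluates them for n = 100; the function names and the default C = 2 are illustrative assumptions.

```python
import math

def sturges_classes(n):
    """Sturges' rule: K = 1 + 3.322 * lg(n), with lg the base-10 logarithm."""
    return 1 + 3.322 * math.log10(n)

def moore_classes(n, c=2.0):
    """Moore's rule: K ~ C * n**(2/5), with C chosen between 1 and 3."""
    return c * n ** (2 / 5)

n = 100
print(round(sturges_classes(n), 2))
print([round(moore_classes(n, c), 1) for c in (1, 2, 3)])
```

Either result is rounded to a whole number of classes in practice; the rules give a starting point rather than a fixed answer.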

Example 4.3 Consider data collected by the U.S. Census Bureau on time to travel to work (2000 census). The census found that there were 124 million people who work outside of their homes. Reported travel times tend to be rounded (note the peak in the class starting at 30 minutes); this rounding is a common phenomenon when collecting data from people.

Data by absolute numbers

Interval (min)  Width (min)  Quantity (thousands)  Quantity/width
 0               5            4180                  836
 5               5           13687                 2737
10               5           18618                 3723
15               5           19634                 3926
20               5           17981                 3596
25               5            7190                 1438
30               5           16369                 3273
35               5            3212                  642
40               5            4122                  824
45              15            9200                  613
60              30            6461                  215
90              60            3435                   57

[Figure: histogram of travel time, US 2000 census. Area under the curve equals the total number of cases. This diagram uses Q/width from the table.]

Data by proportion

Interval (min)  Width (min)  Quantity (Q, thousands)  Q/total/width
 0               5            4180                     0.0067
 5               5           13687                     0.0220
10               5           18618                     0.0300
15               5           19634                     0.0316
20               5           17981                     0.0289
25               5            7190                     0.0115
30               5           16369                     0.0263
35               5            3212                     0.0051
40               5            4122                     0.0066
45              15            9200                     0.0049
60              30            6461                     0.0017
90              60            3435                     0.0004

[Figure: histogram of travel time, US 2000 census. Area under the curve equals 1. This diagram uses Q/total/width from the table.]
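With unequal class widths, bar heights must be densities, otherwise the wide classes look inflated. The following sketch recomputes the Q/total/width column of Example 4.3 from the counts and checks that the bar areas sum to 1; the variable names are illustrative.

```python
# (lower bound in minutes, width in minutes, count in thousands), Example 4.3
classes = [
    (0, 5, 4180), (5, 5, 13687), (10, 5, 18618), (15, 5, 19634),
    (20, 5, 17981), (25, 5, 7190), (30, 5, 16369), (35, 5, 3212),
    (40, 5, 4122), (45, 15, 9200), (60, 30, 6461), (90, 60, 3435),
]

total = sum(count for _, _, count in classes)

# frequency density p_k = n_k / (n * width_k): the bar height of each class
density = [count / total / width for _, width, count in classes]

# bar area = height * width; the areas of all bars should add up to 1
area = sum(p * width for p, (_, width, _) in zip(density, classes))

for (start, width, _), p in zip(classes, density):
    print(f"{start:>3}-{start + width:<3} {p:.4f}")
print("total (thousands):", total, " area:", round(area, 6))
```

The total of the counts is about 124 million, in line with the figure quoted in the example.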


Bar chart
3. Pareto chart (named after V. Pareto). A type of chart that contains both bars and a line graph: individual values are represented in descending order by the bars, and the cumulative total is represented by the line.

[Figure: a Pareto chart showing the relative frequency of reasons for arriving late at work.]

4.3.2 Stem-leaf plot
A stem-leaf plot displays each observation $x_i$ of the data set $\{x_i\}$.

1. Split each observation $x_i$ into two parts, a stem and a leaf.
2. List the stems in a column.
3. Record the leaf of each observation beside its stem.
4. Stem, leaf.

Example 4.4 100 observations (psi):
342 342 346 344 343 339 336 342 347 340
340 350 340 336 341 339 346 338 342 346
340 346 346 345 344 350 348 342 340 356
339 348 338 342 347 347 344 343 339 341
348 341 340 340 342 337 344 340 344 346
342 344 345 338 341 348 345 339 343 345
346 344 344 344 343 345 345 350 353 345
352 350 345 343 347 343 350 343 350 344
343 348 342 344 345 349 332 343 340 346
342 335 349 343 344 347 341 346 341 362

Cumulative Freq.  Stem | Leaf
   1              33   | 2
   2              33   | 5
   5              33   | 6 6 7
  13              33   | 8 8 8 9 9 9 9 9
  28              34   | 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
  48              34   | 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
 (21)             34   | 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5
  31              34   | 6 6 6 6 6 6 6 6 6 7 7 7 7 7
  17              34   | 8 8 8 8 8 9 9
  10              35   | 0 0 0 0 0 0
   4              35   | 2 3
   2              35   |
   2              35   | 6
   1              35   |
   1              36   |
   1              36   | 2

The 1/4 and 3/4 marks on the display indicate the quartiles.
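A display of this kind can be generated directly from the raw values. The sketch below groups the psi data of Example 4.4 by stem (tens) and by pairs of leaf digits (0-1, 2-3, ...), which is the five-rows-per-stem layout used above; the function name and the `leaf_group` parameter are assumptions for illustration, and rows with no observations are simply not printed.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_group=2):
    """Map (stem, leaf group) -> sorted leaves; each stem is split into rows of leaf_group digits."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)               # 344 -> stem 34, leaf 4
        rows[(stem, leaf // leaf_group)].append(leaf)
    return dict(rows)

psi = [
    342, 342, 346, 344, 343, 339, 336, 342, 347, 340,
    340, 350, 340, 336, 341, 339, 346, 338, 342, 346,
    340, 346, 346, 345, 344, 350, 348, 342, 340, 356,
    339, 348, 338, 342, 347, 347, 344, 343, 339, 341,
    348, 341, 340, 340, 342, 337, 344, 340, 344, 346,
    342, 344, 345, 338, 341, 348, 345, 339, 343, 345,
    346, 344, 344, 344, 343, 345, 345, 350, 353, 345,
    352, 350, 345, 343, 347, 343, 350, 343, 350, 344,
    343, 348, 342, 344, 345, 349, 332, 343, 340, 346,
    342, 335, 349, 343, 344, 347, 341, 346, 341, 362,
]

rows = stem_and_leaf(psi)
for (stem, _), leaves in sorted(rows.items()):
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

Because the leaves keep the last digit of every observation, the original data can be read back from the plot, which a histogram does not allow.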

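4.3.3 Boxplot
The boxplot (box-whisker plot), due to John Tukey, is built from the 1/4 and 3/4 quantiles, $Q_1$ and $Q_3$, and the interquartile range $\mathrm{IQR} = Q_3 - Q_1$. Observations lying between the inner fences $Q_1 - 1.5\,\mathrm{IQR}$, $Q_3 + 1.5\,\mathrm{IQR}$ and the outer fences $Q_1 - 3\,\mathrm{IQR}$, $Q_3 + 3\,\mathrm{IQR}$ are mild outliers; observations beyond the outer fences are extreme outliers.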
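The fence rule translates into a short classifier. This is a sketch on a small hypothetical sample (not from the lecture); note that `statistics.quantiles` implements one of several quartile conventions, so other software may give slightly different fences.

```python
from statistics import quantiles

def tukey_outliers(values):
    """Classify values as mild or extreme outliers with the 1.5*IQR and 3*IQR fences."""
    q1, _, q3 = quantiles(values, n=4)           # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)     # inner fences
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)     # outer fences
    mild = [v for v in values
            if outer[0] <= v < inner[0] or inner[1] < v <= outer[1]]
    extreme = [v for v in values if v < outer[0] or v > outer[1]]
    return {"q1": q1, "q3": q3, "iqr": iqr, "mild": mild, "extreme": extreme}

sample = [2, 3, 4, 5, 5, 6, 6, 7, 7, 8, 9, 15, 30]  # hypothetical data
result = tukey_outliers(sample)
print(result)
```

Here the value 15 falls between the inner and outer upper fences and 30 lies beyond the outer fence.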

[Figure: boxplot annotated with the interquartile range (IQR), mild outliers and extreme outliers.]

John Wilder Tukey (1915-2000): "Let the data speak for themselves!"
Example 4.5

Example 4.6 Bayonne, 1969.10-1972.10 (36, 30). Chambers, 1983:
The boxplots ... show many properties of the data rather strikingly. There is a general reduction in sulphur dioxide concentration through time due to the gradual conversion to low sulphur fuels in the region. The decline is most dramatic for the highest quantiles. Also, there are higher concentrations during the winter months due to the use of heating oil. In addition, the boxplots show that the distributions are skewed toward high values and that the spread of the distributions ... is larger when the general level of concentration is higher.

Example 4.7 The total number of tornadoes recorded in 2000 by State of the USA (including the District of Columbia).

Table 1. The total number of recorded tornadoes in 2000, arranged alphabetically by State of USA and including the District of Columbia. (Source: National Oceanic and Atmospheric Administration's National Climatic Data Center)

State                 No. of Tornadoes   State            No. of Tornadoes
Alabama                44                Montana           10
Alaska                  0                Nebraska          60
Arizona                 0                Nevada             2
Arkansas               37                New Hampshire      0
California              9                New Jersey         0
Colorado               60                New Mexico         5
Connecticut             1                New York           5
Delaware                0                North Carolina    23
District of Columbia    0                North Dakota      28
Florida                77                Ohio              25
Georgia                28                Oklahoma          44
Hawaii                  0                Oregon             3
Idaho                  13                Pennsylvania       5
Illinois               55                Rhode Island       1
Indiana                13                South Carolina    20
Iowa                   45                South Dakota      18
Kansas                 59                Tennessee         27
Kentucky               23                Texas            147
Louisiana              43                Utah               3
Maine                   2                Vermont            0
Maryland                8                Virginia          11
Massachusetts           1                Washington         3
Michigan                4                West Virginia      4
Minnesota              32                Wisconsin         18
Mississippi            27                Wyoming            5
Missouri               28

Graphical Summaries: Dispersion Graphs (dot plots)
A dispersion graph places individual data values along a number line, thereby representing the position of each data value in relation to all the other data values. Researchers use dispersion graphs to identify patterns in data such as concentrations, locations of data 'gaps', or atypical data (i.e. observations that do not fit the general character of the data).
We can see that most of the data are concentrated at the lower end of the graph, indicating that most States had 20 or fewer tornadoes, and only a few States had more than 50 tornadoes. Texas had 147 tornadoes during 2000 and this data value is positioned on the far right-hand side of the graph. There are also several gaps in the data where there are no values; these are shown by the green rectangles on the graph. Since the data are concentrated toward the lower end of the dispersion graph, we can say that the number of tornadoes for Texas (147) is atypical of this particular data set. Atypical data values are also referred to as outliers.

Graphical Summaries: Histograms
We see that the first class contains all the States that experienced between zero and nineteen tornadoes during 2000. Notice that each class has the same width along the x-axis. The decision to set the width of each class at twenty (0-19, 20-39, and so on) is arbitrary. A different width could easily be used and would likely change the overall appearance of the histogram. As with dispersion graphs, histograms can show gaps where no data values exist. In the Figure, there are three empty classes: 80-99, 100-119, and 120-139. When histogram data cluster to one side or the other, the shape of the histogram is described as skewed. In the Figure, the tornado data are clustered on the lower or left-hand side, which is known as positive skew. Due to the single outlier, the data are said to tail to the positive side.

Numerical Summaries: Measures of Central Tendency
Mean
If we examine the mean (21.1) in relation to all data values, we can see that the mean lies toward the lower end of the dispersion graph, which makes sense because this is where the majority of the data values are concentrated.

Numerical Summaries: Measures of Central Tendency
Median
We can see the tornado data median value of 11 on the dispersion graph. Note that for this data set, the median is positioned closer to the lower end of the data values than the mean. This shows that the median is not influenced by outliers as was the mean, but by the number of data values. When a data set has outliers, reporting the median as the central tendency of the data often gives a better 'typical' data value than the mean.

Rank  State                 No. of Tornadoes   Rank  State            No. of Tornadoes
 1    Alaska                  0                27    Idaho             13
 2    Arizona                 0                28    Indiana           13
 3    District of Columbia    0                29    South Dakota      18
 4    Delaware                0                30    Wisconsin         18
 5    Hawaii                  0                31    South Carolina    20
 6    New Hampshire           0                32    Kentucky          23
 7    New Jersey              0                33    North Carolina    23
 8    Vermont                 0                34    Ohio              25
 9    Connecticut             1                35    Mississippi       27
10    Massachusetts           1                36    Tennessee         27
11    Rhode Island            1                37    Georgia           28
12    Maine                   2                38    Missouri          28
13    Nevada                  2                39    North Dakota      28
14    Oregon                  3                40    Minnesota         32
15    Utah                    3                41    Arkansas          37
16    Washington              3                42    Louisiana         43
17    Michigan                4                43    Alabama           44
18    West Virginia           4                44    Oklahoma          44
19    New Mexico              5                45    Iowa              45
20    New York                5                46    Illinois          55
21    Pennsylvania            5                47    Kansas            59
22    Wyoming                 5                48    Colorado          60
23    Maryland                8                49    Nebraska          60
24    California              9                50    Florida           77
25    Montana                10                51    Texas            147
26    Virginia               11

Numerical Summaries: Measures of Central Tendency
Mode
The mode is the data value that occurs the most frequently in a data set. Although not used as often as the mean and the median, by identifying the most commonly occurring data value the mode may suggest the central tendency of the data. For the tornado data, the mode is 0. There are eight States that did not experience any tornadoes in 2000. However, it would be misleading to suggest that the central tendency of this data set is 0, since it is obvious from the data values that the value of 0 is not 'central' to the range of values.

Numerical Summaries: Measures of Dispersions
Variance

State                 No. of Tornadoes   Deviation Scores   Squared Deviation Scores
Alaska                  0                 -21.1               445.21
Arizona                 0                 -21.1               445.21
District of Columbia    0                 -21.1               445.21
Delaware                0                 -21.1               445.21
Hawaii                  0                 -21.1               445.21
New Hampshire           0                 -21.1               445.21
New Jersey              0                 -21.1               445.21
Vermont                 0                 -21.1               445.21
Connecticut             1                 -20.1               404.01
Massachusetts           1                 -20.1               404.01
Rhode Island            1                 -20.1               404.01
Maine                   2                 -19.1               364.81
Nevada                  2                 -19.1               364.81
Oregon                  3                 -18.1               327.61
Utah                    3                 -18.1               327.61
Washington              3                 -18.1               327.61
Michigan                4                 -17.1               292.41
West Virginia           4                 -17.1               292.41
New Mexico              5                 -16.1               259.21
New York                5                 -16.1               259.21
Pennsylvania            5                 -16.1               259.21
Wyoming                 5                 -16.1               259.21
Maryland                8                 -13.1               171.61
California              9                 -12.1               146.41
Montana                10                 -11.1               123.21
Virginia               11                 -10.1               102.01
Idaho                  13                  -8.1                65.61
Indiana                13                  -8.1                65.61
South Dakota           18                  -3.1                 9.61
Wisconsin              18                  -3.1                 9.61
South Carolina         20                  -1.1                 1.21
Kentucky               23                   1.9                 3.61
North Carolina         23                   1.9                 3.61
Ohio                   25                   3.9                15.21
Mississippi            27                   5.9                34.81
Tennessee              27                   5.9                34.81
Georgia                28                   6.9                47.61
Missouri               28                   6.9                47.61
North Dakota           28                   6.9                47.61
Minnesota              32                  10.9               118.81
Arkansas               37                  15.9               252.81
Louisiana              43                  21.9               479.61
Alabama                44                  22.9               524.41
Oklahoma               44                  22.9               524.41
Iowa                   45                  23.9               571.21
Illinois               55                  33.9              1149.21
Kansas                 59                  37.9              1436.41
Colorado               60                  38.9              1513.21
Nebraska               60                  38.9              1513.21
Florida                77                  55.9              3124.81
Texas                 147                 125.9             15850.81
                                          Sum = 0.0          Sum = 36096.5

Numerical Summaries: Measures of Dispersions
Standard Deviation
Using the standard deviation of 26.6 for the tornado data, we can create bounds around the mean that describe data positions that are 1, 2, or 3 standard deviations away. For example, if we add one standard deviation to and subtract one standard deviation from the mean, we arrive at 47.7 and -5.5, respectively. From the Figure, we can see that most of the data fall within 1 standard deviation of the mean, which suggests that the data are concentrated about the mean. Notice that as the number of standard deviations increases, fewer data values are found. In fact, only six data values are found beyond 1 standard deviation from the mean. It is interesting to note that one data value is beyond 3 standard deviations. When interpreting any standard deviation value it is important to keep in mind that the greater the value of the standard deviation, the more spread out or dispersed a data set is likely to be.

Numerical Summaries: Measures of Dispersions
Interquartile Range
A quartile can be thought of as one of the cut points created by dividing an ordered data set into four equally-sized groups. The first quartile (25th percentile) has 25% of the data falling below it and the third quartile (75th percentile) has 75% of the data falling below it. The interquartile range describes the middle one-half (or 50%) of an ordered data set, so it represents the range between the first quartile and the third quartile. The Figure illustrates the bounds of the interquartile range for the tornado data.
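The dispersion figures quoted for the tornado data can be reproduced from the 51 counts. The sketch below is illustrative only: the variable names are assumptions, and the quartiles come from `statistics.quantiles`, which uses one of several quartile conventions, so the source figure may mark slightly different bounds. The quoted value of 26.6 corresponds to dividing the sum of squared deviations by n = 51; dividing by n - 1 gives about 26.9.

```python
from statistics import fmean, pstdev, stdev, quantiles

# tornado counts for the 50 States and the District of Columbia, 2000, in rank order
tornadoes = [
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 5,
    8, 9, 10, 11, 13, 13, 18, 18, 20, 23, 23, 25, 27, 27, 28, 28, 28,
    32, 37, 43, 44, 44, 45, 55, 59, 60, 60, 77, 147,
]

mean_ = fmean(tornadoes)
sd_pop = pstdev(tornadoes)      # divides by n
sd_sample = stdev(tornadoes)    # divides by n - 1

def beyond(k):
    """Number of values more than k population standard deviations from the mean."""
    return sum(abs(x - mean_) > k * sd_pop for x in tornadoes)

q1, q2, q3 = quantiles(tornadoes, n=4)  # quartiles under Python's default convention
iqr = q3 - q1

print(round(mean_, 1), round(sd_pop, 1), round(sd_sample, 1))
print([beyond(k) for k in (1, 2, 3)])
print(q1, q2, q3, iqr)
```

The counts beyond 1 and 3 standard deviations agree with the six values and the single value mentioned in the text.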

Numerical Summaries: Measures of Dispersions
Interquartile Range
Using the box-and-whisker plot, you can see the position of the central tendency with respect to the interquartile range. In our case, the median is positioned toward the lower end of the data, which suggests that the data are positively skewed. You can also see the length of the interquartile range compared to the entire data set, and identify atypical data values and the degree to which those values are atypical. The numbers on top of the circle and asterisk indicate the rank of the value, and allow you to locate the specific data value.

4.3.4

Scatter plot
A 3D scatter plot allows for the visualization of multivariate data of up to four dimensions. The scatter plot takes multiple scalar variables and uses them for different axes in phase space. The different variables are combined to form coordinates in the phase space, and they are displayed using glyphs and colored using another scalar variable.

Each observation $X = (x_1, x_2, x_3, x_4, x_5, \dots)$ is represented by the curve
$$f_X(t) = \frac{x_1}{\sqrt{2}} + x_2 \sin t + x_3 \cos t + x_4 \sin 2t + x_5 \cos 2t + \cdots$$

Statistics for Data Analysis
Descriptive statistics: classification and arrangement; summarizing; presentation (figure, graph, table, ...)
Data + working hypothesis
Inferential statistics: sampling; parameter estimation; hypothesis test
Conclusions