
Statistics: Foreign-Language Text Translation (统计学外文翻译.docx)

  • Uploaded by: 天****
  • Document number: 9810916
  • Upload date: 2025-04-09
  • Format: DOCX
  • Pages: 13
  • Size: 83.31 KB

    Keywords:
    statistics, foreign-language text, translation
    Description:
Original foreign-language text. Title: Fundamentals_of_Statistics

Measures of Central Tendency and Location: mean, median, mode, percentiles, quartiles and deciles.

x     sorted x
53    53
55    53
70    53
58    55
64    57
57    57
53    58
69    64
57    68
68    69
53    70

The Measures of Central Tendency are Mean, Median and Mode

Mean → x-bar, $\bar{x}$: for a given variable, it is the sum of the values divided by the number of values, $\bar{x} = \sum x_i / n$. In this case, we have n = 11, so we add all of the values together and divide by 11: $\sum x_i = 657$, $\bar{x} = 657/11 \approx 59.73$.

Median → the value in the distribution of a variable's responses such that one half of the values are above it and one half are below it. To find the median, we first put the data in ascending order (smallest to largest). If n is odd, the median is simply the middle observation; if n is even, it is the average of the two middle observations. In this case, n is odd, so the median is the middle observation of our sorted values (the 6th value): 57.

Mode → the value that occurs most frequently. If two different values occur most frequently, the data are said to be bi-modal. If there are more than two modes, the distribution is said to be multi-modal. In this case, the value that occurs most often is 53, so the mode is 53.

The measures of location are Percentile, Quartile and Decile

Percentile → the pth percentile is a value such that at least p percent of the observations are less than or equal to this value and at least (100 - p) percent of the observations are greater than or equal to this value. To calculate percentiles, we use an index i:

$i = (p/100)\,n$ for $p = 1, 2, 3, \ldots, 99$

If i is a whole number (an integer), the percentile is the average of the values at positions $(p/100)n$ and $1 + (p/100)n$ in the sorted data. If i is not a whole number, we ALWAYS round up: the position is the next whole number (integer) greater than the computed index.
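The definitions above (mean, median, mode, and the percentile index rule) can be sketched in Python as follows. This is a minimal illustration, not the handout's own code: the function names are made up for this example, and `percentile` assumes p is an integer from 1 to 99 so that the whole-number test can be done exactly.

```python
import math
from collections import Counter

# The 11 observations from the table above.
data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]


def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle observation (odd n) or average of the two middle ones (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """All values tied for the highest frequency (one, two, or more modes)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def percentile(values, p):
    """Percentile by the index rule i = (p/100) * n, for integer p in 1..99.

    Whole-number i: average the values at positions i and i + 1.
    Otherwise: round i up and take the value at that position.
    """
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer from 1 to 99")
    ordered = sorted(values)
    n = len(ordered)
    if (p * n) % 100 == 0:            # i is a whole number
        i = (p * n) // 100
        return (ordered[i - 1] + ordered[i]) / 2
    i = math.ceil(p * n / 100)        # always round up
    return ordered[i - 1]             # positions are 1-based in the text


print(round(mean(data), 2), median(data), modes(data))
print(percentile(data, 50), percentile(data, 80))
```

Positions in the text count from 1, while Python lists count from 0, which is why the code subtracts 1 when indexing.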
For example: i(p50) = (50/100)(11) = 5.5, which rounds up to 6.

So, we count from the lowest value of the sorted data to the index number (6). Since the calculated i was not a whole number, we had to round up to find the value where at least 50% of the values are equal to or lower than this value and at least 50% are equal to or higher than this value. In this case, the value of the 50th percentile is the 6th value: 57.

Does this look familiar? → The 50th percentile is the same thing as the median.

What does it tell us? In this distribution, AT LEAST 50% of the observations are LESS THAN OR EQUAL TO 57 AND AT LEAST 50% of the observations are GREATER THAN OR EQUAL TO 57.

i(p80) = (80/100)(11) = 8.8, which rounds up to 9. The 9th value is 68. Again, since the index number is not a whole number, we round up and count from the lowest value of the sorted data to the index number (9). In this case, the value of the 80th percentile is 68.

Since this dataset has 11 observations, we won't have any instances where our calculated index number is a whole number. However, if we just remove our value of 70 and create a new distribution, we will be able to see an example:

53 53 53 55 57 57 58 64 68 69

i(p30) = (30/100)(10) = 3. This is a whole number, so we must take the 3rd and 4th values and average them to find the 30th percentile: (53 + 55)/2 = 54. So, the value of the 30th percentile is 54.

Return to our original data distribution.

Quartiles are special cases of percentiles: Q1 = P25, Q2 = P50, Q3 = P75. These three values divide the distribution into 4 equal quarters.

i(Q1) = (25/100)(11) = 2.75, which rounds up to 3, so Q1 is the 3rd value: 53
i(Q2) = (50/100)(11) = 5.5, which rounds up to 6, so Q2 is the 6th value: 57
i(Q3) = (75/100)(11) = 8.25, which rounds up to 9, so Q3 is the 9th value: 68

Measures of Dispersion or Variability: range, interquartile range (IQR), variance, standard deviation and coefficient of variation.
Range: this tells us how wide the span is from the maximum value to the minimum value. (Max - Min) = Range. In this instance, the maximum is 70 and the minimum is 53, so the range is 70 - 53 = 17.

Interquartile Range (IQR): this tells us how wide the span is in the middle 50% of the data. (Q3 - Q1) = IQR. In this case, Q3 is the 9th sorted value (68) and Q1 is the 3rd (53), so IQR = 68 - 53 = 15. We will use IQR in later processes, so we will want to keep this value.

Sample variance

x      (x - xbar)   (x - xbar)^2
53     -6.73        45.29
53     -6.73        45.29
53     -6.73        45.29
55     -4.73        22.37
57     -2.73         7.45
57     -2.73         7.45
58     -1.73         2.99
64      4.27        18.23
68      8.27        68.39
69      9.27        85.93
70     10.27       105.47
Sum:
657    -0.03       454.18

xbar = 657/11 ≈ 59.73; s^2 = 454.18/10 ≈ 45.42

We use the formula:

$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$

The sum of squared deviations for these data is 454.18, so the sample variance is 454.18/10 ≈ 45.42. For our purposes here, the computation of variance is just a step towards the computation of the standard deviation.

Sample standard deviation (s) is the positive square root of the variance, so the formula for sample standard deviation is:

$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

For these data, $s = \sqrt{45.42} \approx 6.74$.

Population Variance ($\sigma^2$) → uses the same form in the numerator (deviations from the population mean $\mu$), but N instead of n - 1 in the denominator: $\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$. Since we rarely have information about the entire population, we almost always use the formula for sample variance, $s^2$.

Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$. Since we rarely have information from the entire population, we use the formula for sample standard deviation, s.

Coefficient of Variation: tells us what percent the sample standard deviation is of the sample mean, $CV = (s / \bar{x}) \times 100\%$. This number is "relative" and is only of use in comparing the distributions of two or more variables. Suppose I have two samples, and I want to know which sample has more variability. If both samples have the same mean, the one with the higher standard deviation will have the greater variability. However, if they have different means, I need to calculate the coefficient of variation to determine which one has the most variability.
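The dispersion formulas above can be checked numerically with a short Python sketch. The function names are illustrative assumptions, not part of the handout, and the code works from unrounded deviations, so intermediate figures may differ slightly from a table built on deviations rounded to two decimals.

```python
import math

# The 11 observations used throughout the text.
data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]


def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def sample_variance(values):
    """s^2: sum of squared deviations divided by n - 1."""
    return sum_of_squares(values) / (len(values) - 1)


def population_variance(values):
    """sigma^2: same numerator, divided by N (treats values as the whole population)."""
    return sum_of_squares(values) / len(values)


def sample_std(values):
    """s: positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    m = sum(values) / len(values)
    return sample_std(values) / m * 100


print(round(sum_of_squares(data), 2))            # 454.18
print(round(sample_variance(data), 2))           # 45.42
print(round(sample_std(data), 2))                # 6.74
print(round(coefficient_of_variation(data), 2))  # 11.28
```

Dividing by n - 1 rather than N is the only difference between the sample and population versions, which is why both functions share the same numerator helper.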
For example, compare a sample with xbar = 458, s = 112 against a sample with xbar = 687, s = 192. The coefficients of variation are 112/458 ≈ 24.5% and 192/687 ≈ 27.9%, so the second sample has the greater relative variability.

Standardized Data and Detecting Outliers

Z-score: $z = \dfrac{x - \bar{x}}{s}$

The z-score tells us how many standard deviations a value is from the mean. On the Normal Curve, the mean is at the highest point and the curve tails off symmetrically in both directions. The sign of the z-score tells us which direction the value is from the mean on the Normal Curve: negative values are to the left, and positive values are to the right.

Standardizing Scores: on the Standard Normal Curve, the mean is zero and the standard deviation is 1. The distribution is bell-shaped and symmetrical. The area under the curve is 1, and the tails of the curve extend out infinitely; they never actually touch the horizontal axis. The highest point on the curve is at the mean.

Return to our data: let's calculate the z-scores for each of the values…

Empirical Rule → used when the distribution is assumed or known to be approximately normal.
→ Approximately 68% of the values will fall within 1 sd of the mean
→ Approximately 95% of the values will fall within 2 sd of the mean
→ Approximately 99.7% of the values will fall within 3 sd of the mean

Chebyshev's Theorem → doesn't require that the data have a normal distribution. It says that at least $(1 - 1/z^2)$ of the values will fall within z standard deviations of the mean:

$1 - 1/1^2 = 0$, $1 - 1/2^2 = 0.75$, $1 - 1/3^2 \approx 0.88889$, $1 - 1/4^2 = 0.9375$, $1 - 1/5^2 = 0.96$

→ We can't make any assumptions about the percent of values that are within 1 sd of the mean
But…
→ At least 75% of the values will fall within 2 sd of the mean
→ At least 88.9% of the values will fall within 3 sd of the mean

We use Chebyshev's Theorem to estimate the variation in a distribution when
→ n < 30, or
→ the shape of the distribution is unknown, or
→ the distribution is assumed to be non-normal.

Outliers: suspect or extreme values of data that must be identified and scrutinized.
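As a rough illustration of standardizing and of Chebyshev's bound, the sketch below computes z-scores for the 11 observations used earlier in the text and compares the observed share of values within k standard deviations against the guaranteed minimum. The helper names are assumptions made for this example.

```python
import math

# The 11 observations used throughout the text.
data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]


def z_scores(values):
    """z = (x - mean) / s for each value, using the sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    return [(x - m) / s for x in values]


def chebyshev_lower_bound(z):
    """Minimum fraction of values within z standard deviations of the mean."""
    if z <= 0:
        raise ValueError("z must be positive")
    return max(0.0, 1 - 1 / z ** 2)


def fraction_within(values, k):
    """Observed fraction of values whose |z| is at most k."""
    zs = z_scores(values)
    return sum(1 for z in zs if abs(z) <= k) / len(zs)


for x, z in zip(data, z_scores(data)):
    print(x, round(z, 2))

for k in (1, 2, 3):
    print(k, chebyshev_lower_bound(k), round(fraction_within(data, k), 3))
```

For this small sample every value lies within 2 standard deviations of the mean, which comfortably satisfies the 75% minimum that Chebyshev's Theorem guarantees.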
If outliers are instances of incorrectly entered data, they should be corrected. If the value was entered correctly and it is a valid number, it should remain in the dataset as part of the initial analysis.

When we use the z-score method for identifying outliers, we treat any value whose z-score has an absolute value greater than 3.0 (that is, less than -3.0 or greater than +3.0) as an outlier. Before we proceed with data analysis, we need to examine all outliers for accuracy. If we determine that the value is valid, we often run two sets of analysis: one with the outlier, and one without.

Another way to identify outliers is related to the IQR: the five number summary (minimum, Q1, Q2, Q3 and maximum). These values feed into upper and lower limits, and we graph them in a box plot.

Five Number Summary
Minimum   53
Q1        53
Q2        57
Q3        68
Maximum   70

→ Use the box plot. The advantage of the box plot is that it is not influenced by outliers or extreme values as z-scores are.

Box Plots: whiskers show the range of data within the inner fences.

Lower outer fence: 3(IQR) below Q1
Lower inner fence: 1.5(IQR) below Q1
Box: Q1, Median, Q3 (the box spans the IQR)
Upper inner fence: 1.5(IQR) above Q3
Upper outer fence: 3(IQR) above Q3

Any values between the inner and outer fences are "unusual," and any values out beyond the outer fences are "outliers."

Advantage of using the box plot method as well as the z-score method: the box plot method is not influenced by extreme values in the same way that the mean and the standard deviation are. It is said to be a more conservative method of evaluating outliers.
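The two outlier checks described above can be sketched as follows. This is an illustrative implementation under stated assumptions: quartiles are computed with the round-up index rule given earlier in the text (which yields Q3 = 68, the 9th sorted value), the function names are made up for this example, and the extra values 95 and 120 mentioned in the closing note are hypothetical additions, not part of the dataset.

```python
import math

# The 11 observations used throughout the text.
data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]


def percentile(values, p):
    """Percentile by the text's index rule, for integer p in 1..99."""
    ordered = sorted(values)
    n = len(ordered)
    if (p * n) % 100 == 0:                      # whole-number index: average two values
        i = (p * n) // 100
        return (ordered[i - 1] + ordered[i]) / 2
    return ordered[math.ceil(p * n / 100) - 1]  # otherwise round the index up


def five_number_summary(values):
    """(minimum, Q1, Q2, Q3, maximum)."""
    return (min(values), percentile(values, 25), percentile(values, 50),
            percentile(values, 75), max(values))


def fences(values):
    """Inner fences at 1.5 * IQR and outer fences at 3 * IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return {
        "lower_outer": q1 - 3 * iqr,
        "lower_inner": q1 - 1.5 * iqr,
        "upper_inner": q3 + 1.5 * iqr,
        "upper_outer": q3 + 3 * iqr,
    }


def classify(values):
    """Label each distinct value as 'typical', 'unusual', or 'outlier'."""
    f = fences(values)
    labels = {}
    for x in values:
        if x < f["lower_outer"] or x > f["upper_outer"]:
            labels[x] = "outlier"
        elif x < f["lower_inner"] or x > f["upper_inner"]:
            labels[x] = "unusual"
        else:
            labels[x] = "typical"
    return labels


def z_score_outliers(values, threshold=3.0):
    """Values whose |z| exceeds the threshold, using the sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    return [x for x in values if abs((x - m) / s) > threshold]


print(five_number_summary(data))
print(fences(data))
print(classify(data))
print(z_score_outliers(data))
```

Neither method flags any of the 11 observations. Appending a hypothetical value such as 95 would place it between the upper inner and outer fences ("unusual"), while a hypothetical 120 would fall beyond the outer fence ("outlier").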
Translated text (Chinese translation of the original; the preview breaks off partway through the variance table)

The translation follows the original passage by passage: the data table, the measures of central tendency (mean, median, mode), the measures of location (percentiles and quartiles, with the same worked examples), the range, and the interquartile range. It also preserves a passage introducing sample variance that does not appear in the original extract:

Sample variance: this tells us about the sum of the squared deviations from the mean. A large variance indicates a large degree of deviation, and a small variance indicates a small degree of deviation. We square the values so that we don't end up with zero. Let's look at how this is done.

The table of x, (x - xbar) and (x - xbar)^2 then repeats, and the preview ends in the middle of its last row (x = 70).
    About this document
    Title: 统计学外文翻译.docx (Statistics: Foreign-Language Text Translation)
    Link: https://www.zixin.com.cn/doc/9810916.html