Formula for the median in RStudio
In this tutorial, we’ll learn how to find the median in R. The median is a measure of central tendency. In simpler terms, it is the ‘middle’ value of the data once the values are put in order.
To find it, sort the values and pick the one in the middle. If the number of values is even, there are two middle values, and the median is their average (mean).
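The sort-and-pick procedure can be sketched by hand in R. This is only an illustration of the definition, not how R's built-in function is implemented, and the helper name manual_median is hypothetical.

```r
# Hypothetical helper: compute the median by sorting and picking the middle.
manual_median <- function(values) {
  sorted <- sort(values)   # order the values ascending (sort() drops NAs)
  n <- length(sorted)
  if (n %% 2 == 1) {
    # odd count: a single middle value
    sorted[(n + 1) / 2]
  } else {
    # even count: mean of the two middle values
    mean(sorted[c(n / 2, n / 2 + 1)])
  }
}

manual_median(c(7, 1, 5))      # 5
manual_median(c(7, 1, 5, 3))   # 4
```

In practice the built-in median() does this for you; the sketch just makes the two cases (odd and even counts) explicit.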
Median – Merits and Demerits
Merits:
- The median is easy to calculate. In simple cases, you can find it just by inspecting the sorted values.
- The median is useful for open-ended data distributions, because it depends on the position of a value rather than its magnitude.
- One of the major advantages of the median is that it is largely unaffected by outliers in the data.
Outliers: Outliers are extreme values that differ markedly from the rest of the values in the data.
Ex: The retirement age values are – (52,53,54,54,55,56,57,58,79)
Here, 79 is an extreme value that differs from the rest of the data. It pulls the mean upward noticeably, whereas the mode (54) is unchanged and the median (55) barely moves, because the median depends on position rather than value.
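To see the effect in numbers, the retirement-age example can be run with and without the value 79. The variable names are chosen for this sketch only.

```r
# Retirement ages from the example above; 79 is the outlier.
ages <- c(52, 53, 54, 54, 55, 56, 57, 58, 79)
ages_no_outlier <- ages[ages != 79]

mean(ages)               # about 57.56
mean(ages_no_outlier)    # 54.875
median(ages)             # 55
median(ages_no_outlier)  # 54.5
```

Dropping the outlier moves the mean by roughly 2.7 years but the median by only 0.5.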
Demerits:
- The median does not use the magnitude of every value, so it can discard information that the full data contains.
- The median is not well suited to further algebraic treatment. For example, the medians of two groups cannot, in general, be combined to obtain the median of the merged data.
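One way to read the last demerit: group means can be combined into an overall mean, but group medians cannot. The small vectors below are made up purely for illustration.

```r
# Two made-up groups, for illustration only.
g1 <- c(1, 2, 9)
g2 <- c(3, 4, 5)
combined <- c(g1, g2)

# Means combine: the size-weighted mean of group means equals the overall mean.
weighted.mean(c(mean(g1), mean(g2)), w = c(length(g1), length(g2)))  # 4
mean(combined)                                                       # 4

# Medians do not: averaging the group medians gives a different answer.
mean(c(median(g1), median(g2)))  # 3
median(combined)                 # 3.5
```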
Finding the median of the given values
In this section, we will create a vector of values and find the median of those values.
# create a numeric vector
x <- c(45, 76, 56, 87, 65, 45, 34, 56, 78, 98, 87, 65, 34, 48, 76)
# display the values
print(x)
# [1] 45 76 56 87 65 45 34 56 78 98 87 65 34 48 76
# calculate the median of the values in the vector 'x'
median(x)
Output: 65
You may wonder how 65 can be the middle value. The median() function first sorts the values and then returns the central one. Here there are 15 values, so the median is the 8th value in sorted order, which is 65.
Note: If the number of values is even, there are two central values, and their average is taken as the median.
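A short sketch of the even-count case with the built-in median(), using made-up values. It also shows the na.rm argument, which the tutorial does not cover but which matters as soon as the data contains missing values.

```r
# Even number of values: the two central values are 5 and 7.
evens <- c(9, 3, 7, 5)
median(evens)                  # 6

# A missing value makes the result NA unless na.rm = TRUE is set.
with_na <- c(9, 3, NA, 7, 5)
median(with_na)                # NA
median(with_na, na.rm = TRUE)  # 6
```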
Finding the median of the ‘Electricity consumption data of the countries’
In this section, we import a CSV file containing ‘Electricity/energy consumption’ data for four countries (India, Romania, USA, and Jamaica) in the year 2019.
Execute the code below to find the median of the ‘Voltage’ column for these countries in 2019.
Note: View or download the ‘Energy consumption’ dataset here
# read the data from the file
df <- read.csv("energydata.csv")
# display the values
df
# calculate the median of the 'Voltage' column
median(df$Voltage)
Output: 220
Note: For this data set, the median is 220, i.e. the central value of the voltage data is 220 volts.
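If you do not have energydata.csv at hand, the same read-then-summarize workflow can be tried on a stand-in file. The rows below are made-up placeholders, not the tutorial's dataset, and the column names Country and Voltage are assumptions about its layout. The sketch also checks that the column was read as numeric and handles a missing value.

```r
# Made-up placeholder rows, written to a temporary CSV so the example runs
# without the tutorial's energydata.csv. These are NOT the real dataset values.
sample_df <- data.frame(
  Country = c("A", "B", "C", "D", "E"),
  Voltage = c(230, 110, NA, 220, 120)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_df, csv_path, row.names = FALSE)

df <- read.csv(csv_path)
str(df)  # check that Voltage was read as a numeric column
stopifnot("Voltage" %in% names(df), is.numeric(df$Voltage))

median(df$Voltage)                # NA, because one value is missing
median(df$Voltage, na.rm = TRUE)  # median of the non-missing values
```

Checking the column type first is worthwhile because a stray text entry in a CSV column makes read.csv() import the whole column as character, and median() then fails.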
Visualizing the median of the data with the help of the box plot
In R, you can create a box plot to see where the median lies within the distribution of the data, as shown in the plot below.
boxplot: Box plots are used in R to understand the distribution of data. R offers the function boxplot() to create them. The thick line inside the box represents the median.
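A minimal sketch of a box plot call, reusing the vector x from the earlier section rather than the CSV data so that it runs on its own. The colour, label, and title are arbitrary choices.

```r
# Values reused from the vector example earlier in the tutorial.
x <- c(45, 76, 56, 87, 65, 45, 34, 56, 78, 98, 87, 65, 34, 48, 76)

# Draw the box plot; the thick line inside the box marks the median.
bp <- boxplot(x, col = "orange", ylab = "value", main = "Distribution of x")

# boxplot() invisibly returns its statistics; row 3 of 'stats' is the median.
bp$stats[3, 1]
median(x)
```

The same call should work for the tutorial's data by passing df$Voltage instead of x.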
Using a histogram to understand the median of the ‘Voltage’
In this section, we are going to plot the voltage distribution with the help of a histogram in RStudio.
Execute the code below to plot the histogram, which shows the voltage distribution together with the median of the voltage.
# read the data from the file
df <- read.csv("energydata.csv")
# display the values
df
# calculate the median of the 'Voltage' column
median(df$Voltage)
# plot the histogram
hist(df$Voltage, col = 'orange', xlab = 'voltage', ylab = 'frequency', main = 'Voltage distribution')
# add a vertical line at the median (lwd takes a number, not a string)
abline(v = median(df$Voltage), col = 'black', lwd = 3)
# add the legend
legend(x = 'topright', legend = 'median', col = 'black', lwd = 3)
In the above plot, the black line marks the median. A histogram can also be used to show the mean and a density curve alongside the median.
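As a sketch of that last point, the mean, the median, and a density curve can be overlaid on one histogram. The example reuses the vector x from the earlier section so that it runs without the CSV file; colours and line styles are arbitrary.

```r
# Values reused from the vector example earlier in the tutorial.
x <- c(45, 76, 56, 87, 65, 45, 34, 56, 78, 98, 87, 65, 34, 48, 76)

# freq = FALSE puts the y-axis on the density scale so the curve fits.
h <- hist(x, freq = FALSE, col = "orange", xlab = "value",
          main = "Distribution of x")
lines(density(x), lwd = 2)                            # kernel density estimate
abline(v = median(x), col = "black", lwd = 3)         # median
abline(v = mean(x), col = "blue", lwd = 3, lty = 2)   # mean
legend("topright", legend = c("median", "mean"),
       col = c("black", "blue"), lwd = 3, lty = c(1, 2))
```

When the mean and median lines sit far apart, that is a hint that the data is skewed or contains outliers.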
Conclusion
With the help of the median() function, we can understand the central tendency of the data. In simple cases, the median is easy to find just by inspecting the sorted values.
R offers useful visualization functions for exploring patterns in data. As shown above, you can examine the median using histograms and box plots.
That’s all for now. Connect with us for more R tutorials. Don’t hesitate to comment below if you have any queries. Happy learning!