python matplotlib.pyplot.hist()函数 (绘制直方图)

hist from pyplot.py

# Autogenerated by boilerplate.py.  Do not edit as changes will be lost.
@docstring.copy(Axes.hist)
def hist(
        x, bins=None, range=None, density=None, weights=None,
        cumulative=False, bottom=None, histtype='bar', align='mid',
        orientation='vertical', rwidth=None, log=False, color=None,
        label=None, stacked=False, normed=None, *, data=None,
        **kwargs):
    return gca().hist(
        x, bins=bins, range=range, density=density, weights=weights,
        cumulative=cumulative, bottom=bottom, histtype=histtype,
        align=align, orientation=orientation, rwidth=rwidth, log=log,
        color=color, label=label, stacked=stacked, normed=normed,
        **({"data": data} if data is not None else {}), **kwargs)

hist from pyplot.py

#### Data analysis

    @_preprocess_data(replace_names=["x", 'weights'], label_namer="x")
    def hist(self, x, bins=None, range=None, density=None, weights=None,
             cumulative=False, bottom=None, histtype='bar', align='mid',
             orientation='vertical', rwidth=None, log=False,
             color=None, label=None, stacked=False, normed=None,
             **kwargs):
        """
        Plot a histogram. 绘制直方图。

        Compute and draw the histogram of *x*.  The return value is a tuple
        (*n*, *bins*, *patches*) or ([*n0*, *n1*, ...], *bins*, [*patches0*,
        *patches1*,...]) if the input contains multiple data.  See the
        documentation of the *weights* parameter to draw a histogram of
        already-binned data.

		计算并绘制* x *的直方图。 
		返回值是元组(* n *,* bins *,* patches *)或([* n0 *,* n1 *,...],* bins *,
		[* patches0 *,* patches1 * .. 。])
		(如果输入包含多个数据)。 请参阅* weights *参数的文档以绘制已合并数据的直方图。

        Multiple data can be provided via *x* as a list of datasets
        of potentially different length ([*x0*, *x1*, ...]), or as
        a 2-D ndarray in which each column is a dataset.  Note that
        the ndarray form is transposed relative to the list form.

		可以通过* x *提供多个数据,这些数据可以作为长度可能不同的数据集的列表([* x0 *,* x1 *,...]),
		也可以作为二维数组,其中每列都是一个数据集。 请注意,ndarray形式是相对于列表形式进行转置的。

        Masked arrays are not supported at present. 目前不支持掩码数组。

        Parameters
        ----------
        x : (n,) array or sequence of (n,) arrays
            Input values, this takes either a single array or a sequence of
            arrays which are not required to be of the same length.
			(n,)个数组或(n,)个数组序列输入值,它可以是单个数组或不需要相同长度的数组序列。

        bins : int or sequence or str, optional
            If an integer is given, ``bins + 1`` bin edges are calculated and
            returned, consistent with `numpy.histogram`.
            int或sequence或str,可选如果给出整数,则将计算bins + 1bin边并返回,
            与`numpy.histogram`一致。

            If `bins` is a sequence, gives bin edges, including left edge of
            first bin and right edge of last bin.  In this case, `bins` is
            returned unmodified.
            如果`bins'是一个序列,则给出bin边缘,包括第一个bin的左边缘和最后一个bin的右边缘。 
            在这种情况下,`bins`会原样返回。

            All but the last (righthand-most) bin is half-open.  In other
            words, if `bins` is::
            除了最后一个(最右边)的bin外,其他所有bin都是半开的。 换句话说,如果bins是:

                [1, 2, 3, 4]

            then the first bin is ``[1, 2)`` (including 1, but excluding 2) and
            the second ``[2, 3)``.  The last bin, however, is ``[3, 4]``, which
            *includes* 4.
            那么第一个bin是``[1,2)''(包括1,但不包括2)和第二个``[2,3)''。 
            但是,最后一个bin是“ [3,4]”,*包括* 4。

            Unequally spaced bins are supported if *bins* is a sequence.
            如果* bins *是序列,则支持不等间距的bin。

            With Numpy 1.11 or newer, you can alternatively provide a string
            describing a binning strategy, such as 'auto', 'sturges', 'fd',
            'doane', 'scott', 'rice' or 'sqrt', see
            `numpy.histogram`.

            The default is taken from :rc:`hist.bins`.

        range : tuple or None, optional
            The lower and upper range of the bins. Lower and upper outliers
            are ignored. If not provided, *range* is ``(x.min(), x.max())``.
            Range has no effect if *bins* is a sequence.

            If *bins* is a sequence or *range* is specified, autoscaling
            is based on the specified bin range instead of the
            range of x.

            Default is ``None``

        density : bool, optional
            If ``True``, the first element of the return tuple will
            be the counts normalized to form a probability density, i.e.,
            the area (or integral) under the histogram will sum to 1.
            This is achieved by dividing the count by the number of
            observations times the bin width and not dividing by the total
            number of observations. If *stacked* is also ``True``, the sum of
            the histograms is normalized to 1.

            Default is ``None`` for both *normed* and *density*. If either is
            set, then that value will be used. If neither are set, then the
            args will be treated as ``False``.

            If both *density* and *normed* are set an error is raised.

        weights : (n, ) array_like or None, optional
            An array of weights, of the same shape as *x*.  Each value in *x*
            only contributes its associated weight towards the bin count
            (instead of 1).  If *normed* or *density* is ``True``,
            the weights are normalized, so that the integral of the density
            over the range remains 1.

            Default is ``None``.

            This parameter can be used to draw a histogram of data that has
            already been binned, e.g. using `np.histogram` (by treating each
            bin as a single point with a weight equal to its count) ::

                counts, bins = np.histogram(data)
                plt.hist(bins[:-1], bins, weights=counts)

            (or you may alternatively use `~.bar()`).

        cumulative : bool, optional
            If ``True``, then a histogram is computed where each bin gives the
            counts in that bin plus all bins for smaller values. The last bin
            gives the total number of datapoints. If *normed* or *density*
            is also ``True`` then the histogram is normalized such that the
            last bin equals 1. If *cumulative* evaluates to less than 0
            (e.g., -1), the direction of accumulation is reversed.
            In this case, if *normed* and/or *density* is also ``True``, then
            the histogram is normalized such that the first bin equals 1.

            Default is ``False``

        bottom : array_like, scalar, or None
            Location of the bottom baseline of each bin.  If a scalar,
            the base line for each bin is shifted by the same amount.
            If an array, each bin is shifted independently and the length
            of bottom must match the number of bins.  If None, defaults to 0.

            Default is ``None``

        histtype : {'bar', 'barstacked', 'step',  'stepfilled'}, optional
            The type of histogram to draw.

            - 'bar' is a traditional bar-type histogram.  If multiple data
              are given the bars are arranged side by side.

            - 'barstacked' is a bar-type histogram where multiple
              data are stacked on top of each other.

            - 'step' generates a lineplot that is by default
              unfilled.

            - 'stepfilled' generates a lineplot that is by default
              filled.

            Default is 'bar'

        align : {'left', 'mid', 'right'}, optional
            Controls how the histogram is plotted.

                - 'left': bars are centered on the left bin edges.

                - 'mid': bars are centered between the bin edges.

                - 'right': bars are centered on the right bin edges.

            Default is 'mid'

        orientation : {'horizontal', 'vertical'}, optional
            If 'horizontal', `~matplotlib.pyplot.barh` will be used for
            bar-type histograms and the *bottom* kwarg will be the left edges.

        rwidth : scalar or None, optional
            The relative width of the bars as a fraction of the bin width.  If
            ``None``, automatically compute the width.

            Ignored if *histtype* is 'step' or 'stepfilled'.

            Default is ``None``

        log : bool, optional
            If ``True``, the histogram axis will be set to a log scale. If
            *log* is ``True`` and *x* is a 1D array, empty bins will be
            filtered out and only the non-empty ``(n, bins, patches)``
            will be returned.

            Default is ``False``

        color : color or array_like of colors or None, optional
            Color spec or sequence of color specs, one per dataset.  Default
            (``None``) uses the standard line color sequence.

            Default is ``None``

        label : str or None, optional
            String, or sequence of strings to match multiple datasets.  Bar
            charts yield multiple patches per dataset, but only the first gets
            the label, so that the legend command will work as expected.

            default is ``None``

        stacked : bool, optional
            If ``True``, multiple data are stacked on top of each other If
            ``False`` multiple data are arranged side by side if histtype is
            'bar' or on top of each other if histtype is 'step'

            Default is ``False``

        normed : bool, optional
            Deprecated; use the density keyword argument instead.

        Returns
        -------
        n : array or list of arrays
            The values of the histogram bins. See *density* and *weights* for a
            description of the possible semantics.  If input *x* is an array,
            then this is an array of length *nbins*. If input is a sequence of
            arrays ``[data1, data2,..]``, then this is a list of arrays with
            the values of the histograms for each of the arrays in the same
            order.  The dtype of the array *n* (or of its element arrays) will
            always be float even if no weighting or normalization is used.

        bins : array
            The edges of the bins. Length nbins + 1 (nbins left edges and right
            edge of last bin).  Always a single array even when multiple data
            sets are passed in.

        patches : list or list of lists
            Silent list of individual patches used to create the histogram
            or list of such list if multiple input datasets.

        Other Parameters
        ----------------
        **kwargs : `~matplotlib.patches.Patch` properties

        See also
        --------
        hist2d : 2D histograms

        """
        # Avoid shadowing the builtin.
        bin_range = range
        from builtins import range

        if np.isscalar(x):
            x = [x]

        if bins is None:
            bins = rcParams['hist.bins']

        # Validate string inputs here so we don't have to clutter
        # subsequent code.
        if histtype not in ['bar', 'barstacked', 'step', 'stepfilled']:
            raise ValueError("histtype %s is not recognized" % histtype)

        if align not in ['left', 'mid', 'right']:
            raise ValueError("align kwarg %s is not recognized" % align)

        if orientation not in ['horizontal', 'vertical']:
            raise ValueError(
                "orientation kwarg %s is not recognized" % orientation)

        if histtype == 'barstacked' and not stacked:
            stacked = True

        if density is not None and normed is not None:
            raise ValueError("kwargs 'density' and 'normed' cannot be used "
                             "simultaneously. "
                             "Please only use 'density', since 'normed'"
                             "is deprecated.")
        if normed is not None:
            cbook.warn_deprecated("2.1", name="'normed'", obj_type="kwarg",
                                  alternative="'density'", removal="3.1")

        # basic input validation
        input_empty = np.size(x) == 0
        # Massage 'x' for processing.
        x = cbook._reshape_2D(x, 'x')
        nx = len(x)  # number of datasets

        # Process unit information
        # Unit conversion is done individually on each dataset
        self._process_unit_info(xdata=x[0], kwargs=kwargs)
        x = [self.convert_xunits(xi) for xi in x]

        if bin_range is not None:
            bin_range = self.convert_xunits(bin_range)

        # We need to do to 'weights' what was done to 'x'
        if weights is not None:
            w = cbook._reshape_2D(weights, 'weights')
        else:
            w = [None] * nx

        if len(w) != nx:
            raise ValueError('weights should have the same shape as x')

        for xi, wi in zip(x, w):
            if wi is not None and len(wi) != len(xi):
                raise ValueError(
                    'weights should have the same shape as x')

        if color is None:
            color = [self._get_lines.get_next_color() for i in range(nx)]
        else:
            color = mcolors.to_rgba_array(color)
            if len(color) != nx:
                error_message = (
                    "color kwarg must have one color per data set. %d data "
                    "sets and %d colors were provided" % (nx, len(color)))
                raise ValueError(error_message)

        hist_kwargs = dict()

        # if the bin_range is not given, compute without nan numpy
        # does not do this for us when guessing the range (but will
        # happily ignore nans when computing the histogram).
        if bin_range is None:
            xmin = np.inf
            xmax = -np.inf
            for xi in x:
                if len(xi):
                    # python's min/max ignore nan,
                    # np.minnan returns nan for all nan input
                    xmin = min(xmin, np.nanmin(xi))
                    xmax = max(xmax, np.nanmax(xi))
            # make sure we have seen at least one non-nan and finite
            # value before we reset the bin range
            if not np.isnan([xmin, xmax]).any() and not (xmin > xmax):
                bin_range = (xmin, xmax)

        # If bins are not specified either explicitly or via range,
        # we need to figure out the range required for all datasets,
        # and supply that to np.histogram.
        if not input_empty and len(x) > 1:
            if weights is not None:
                _w = np.concatenate(w)
            else:
                _w = None

            bins = histogram_bin_edges(np.concatenate(x),
                                       bins, bin_range, _w)
        else:
            hist_kwargs['range'] = bin_range

        density = bool(density) or bool(normed)
        if density and not stacked:
            hist_kwargs['density'] = density

        # List to store all the top coordinates of the histograms
        tops = []
        mlast = None
        # Loop through datasets
        for i in range(nx):
            # this will automatically overwrite bins,
            # so that each histogram uses the same bins
            m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
            m = m.astype(float)  # causes problems later if it's an int
            if mlast is None:
                mlast = np.zeros(len(bins)-1, m.dtype)
            if stacked:
                m += mlast
                mlast[:] = m
            tops.append(m)

        # If a stacked density plot, normalize so the area of all the stacked
        # histograms together is 1
        if stacked and density:
            db = np.diff(bins)
            for m in tops:
                m[:] = (m / db) / tops[-1].sum()
        if cumulative:
            slc = slice(None)
            if isinstance(cumulative, Number) and cumulative < 0:
                slc = slice(None, None, -1)

            if density:
                tops = [(m * np.diff(bins))[slc].cumsum()[slc] for m in tops]
            else:
                tops = [m[slc].cumsum()[slc] for m in tops]

        patches = []

        # Save autoscale state for later restoration; turn autoscaling
        # off so we can do it all a single time at the end, instead
        # of having it done by bar or fill and then having to be redone.
        _saved_autoscalex = self.get_autoscalex_on()
        _saved_autoscaley = self.get_autoscaley_on()
        self.set_autoscalex_on(False)
        self.set_autoscaley_on(False)

        if histtype.startswith('bar'):

            totwidth = np.diff(bins)

            if rwidth is not None:
                dr = np.clip(rwidth, 0, 1)
            elif (len(tops) > 1 and
                  ((not stacked) or rcParams['_internal.classic_mode'])):
                dr = 0.8
            else:
                dr = 1.0

            if histtype == 'bar' and not stacked:
                width = dr * totwidth / nx
                dw = width
                boffset = -0.5 * dr * totwidth * (1 - 1 / nx)
            elif histtype == 'barstacked' or stacked:
                width = dr * totwidth
                boffset, dw = 0.0, 0.0

            if align == 'mid':
                boffset += 0.5 * totwidth
            elif align == 'right':
                boffset += totwidth

            if orientation == 'horizontal':
                _barfunc = self.barh
                bottom_kwarg = 'left'
            else:  # orientation == 'vertical'
                _barfunc = self.bar
                bottom_kwarg = 'bottom'

            for m, c in zip(tops, color):
                if bottom is None:
                    bottom = np.zeros(len(m))
                if stacked:
                    height = m - bottom
                else:
                    height = m
                patch = _barfunc(bins[:-1]+boffset, height, width,
                                 align='center', log=log,
                                 color=c, **{bottom_kwarg: bottom})
                patches.append(patch)
                if stacked:
                    bottom[:] = m
                boffset += dw

        elif histtype.startswith('step'):
            # these define the perimeter of the polygon
            x = np.zeros(4 * len(bins) - 3)
            y = np.zeros(4 * len(bins) - 3)

            x[0:2*len(bins)-1:2], x[1:2*len(bins)-1:2] = bins, bins[:-1]
            x[2*len(bins)-1:] = x[1:2*len(bins)-1][::-1]

            if bottom is None:
                bottom = np.zeros(len(bins) - 1)

            y[1:2*len(bins)-1:2], y[2:2*len(bins):2] = bottom, bottom
            y[2*len(bins)-1:] = y[1:2*len(bins)-1][::-1]

            if log:
                if orientation == 'horizontal':
                    self.set_xscale('log', nonposx='clip')
                    logbase = self.xaxis._scale.base
                else:  # orientation == 'vertical'
                    self.set_yscale('log', nonposy='clip')
                    logbase = self.yaxis._scale.base

                # Setting a minimum of 0 results in problems for log plots
                if np.min(bottom) > 0:
                    minimum = np.min(bottom)
                elif density or weights is not None:
                    # For data that is normed to form a probability density,
                    # set to minimum data value / logbase
                    # (gives 1 full tick-label unit for the lowest filled bin)
                    ndata = np.array(tops)
                    minimum = (np.min(ndata[ndata > 0])) / logbase
                else:
                    # For non-normed (density = False) data,
                    # set the min to 1 / log base,
                    # again so that there is 1 full tick-label unit
                    # for the lowest bin
                    minimum = 1.0 / logbase

                y[0], y[-1] = minimum, minimum
            else:
                minimum = 0

            if align == 'left':
                x -= 0.5*(bins[1]-bins[0])
            elif align == 'right':
                x += 0.5*(bins[1]-bins[0])

            # If fill kwarg is set, it will be passed to the patch collection,
            # overriding this
            fill = (histtype == 'stepfilled')

            xvals, yvals = [], []
            for m in tops:
                if stacked:
                    # starting point for drawing polygon
                    y[0] = y[1]
                    # top of the previous polygon becomes the bottom
                    y[2*len(bins)-1:] = y[1:2*len(bins)-1][::-1]
                # set the top of this polygon
                y[1:2*len(bins)-1:2], y[2:2*len(bins):2] = (m + bottom,
                                                            m + bottom)
                if log:
                    y[y < minimum] = minimum
                if orientation == 'horizontal':
                    xvals.append(y.copy())
                    yvals.append(x.copy())
                else:
                    xvals.append(x.copy())
                    yvals.append(y.copy())

            # stepfill is closed, step is not
            split = -1 if fill else 2 * len(bins)
            # add patches in reverse order so that when stacking,
            # items lower in the stack are plotted on top of
            # items higher in the stack
            for x, y, c in reversed(list(zip(xvals, yvals, color))):
                patches.append(self.fill(
                    x[:split], y[:split],
                    closed=True if fill else None,
                    facecolor=c,
                    edgecolor=None if fill else c,
                    fill=fill if fill else None))
            for patch_list in patches:
                for patch in patch_list:
                    if orientation == 'vertical':
                        patch.sticky_edges.y.append(minimum)
                    elif orientation == 'horizontal':
                        patch.sticky_edges.x.append(minimum)

            # we return patches, so put it back in the expected order
            patches.reverse()

        self.set_autoscalex_on(_saved_autoscalex)
        self.set_autoscaley_on(_saved_autoscaley)
        self.autoscale_view()

        if label is None:
            labels = [None]
        elif isinstance(label, str):
            labels = [label]
        elif not np.iterable(label):
            labels = [str(label)]
        else:
            labels = [str(lab) for lab in label]

        for patch, lbl in itertools.zip_longest(patches, labels):
            if patch:
                p = patch[0]
                p.update(kwargs)
                if lbl is not None:
                    p.set_label(lbl)

                for p in patch[1:]:
                    p.update(kwargs)
                    p.set_label('_nolegend_')

        if nx == 1:
            return tops[0], bins, cbook.silent_list('Patch', patches[0])
        else:
            return tops, bins, cbook.silent_list('Lists of Patches', patches)

示例:

参考文章:opencv python matplotlib.pyplot.hist() 如何绘制灰度直方图,如何根据灰度直方图确定最优二值化值

参考文章:matplotlib可视化篇hist()–直方图

### 回答1: Matplotlib.pyplot.hist函数用于绘制直方图,它可以接受一维数组作为输入,然后将数据分成若干个区间,统计每个区间内数据的数量,并将结果绘制直方图。该函数有多个参数可以调整直方图的样式和布局,比如bins参数可以指定区间的数量,range参数可以指定数据的范围等。 ### 回答2: MatplotlibPython的一个绘图库,而pyplotMatplotlib的一个子模块。Matplotlib.pyplot.hist函数用于绘制直方图直方图是一种统计图表,用来表示数据的分布情况。直方图将数据划分为若干个等宽的区间,通过统计落入每个区间的数据个数来展示数据的分布情况。 使用Matplotlib.pyplot.hist函数绘制直方图时,需要输入一个数据集作为参数。函数将根据数据集中的值来统计落入每个区间的数据个数,并根据统计结果绘制直方图。 该函数的常见参数包括: - bins:用于指定划分区间的个数或区间的边界值。默认为10,意味着将数据划分为10个等宽的区间。 - range:用于指定数据的范围,超出该范围的数据将被忽略。 - density:用于控制是否将直方图的纵轴单位设置为概率密度。默认为False,即纵轴单位为数据个数。 - color:用于指定直方图的颜色。 - alpha:用于指定直方图的透明度。 - label:用于给直方图添加标签。 除了常见参数外,还可以通过调整其他参数来进一步调整直方图的样式,如设置边界宽度、设置直方图的类型等。 总之,使用Matplotlib.pyplot.hist函数可以方便地绘制直方图,帮助我们更直观地了解数据的分布情况。 ### 回答3: Matplotlib.pyplot.hist函数用于绘制直方图直方图是一种用来展示数据分布情况的统计图表。该函数可以将数据划分为一系列的间隔,并统计落在每个间隔内的数据个数,然后以柱形图的形式显示出来。 使用该函数,首先需要传入要绘制直方图的数据,可以是一个数组或者一列数据。函数会自动根据数据的最小值和最大值确定需要划分的间隔数。也可以通过设置bins参数来指定间隔数,或者通过设置range参数来指定数据的范围。 在绘制直方图时,可以设置一些可选参数来调整图表的样式。例如,可以通过设置color参数来改变柱形的颜色;通过设置alpha参数来调整柱形的透明度;通过设置edgecolor参数来改变柱形之间的边界颜色。 此外,还可以通过设置normed参数为True来将每个柱形的高度转化为概率密度。通过设置cumulative参数为True,可以绘制累积直方图绘制直方图后,还可以添加一些额外的元素,例如标题、横轴和纵轴标签、图例等。 总的来说,Matplotlib.pyplot.hist函数是一个简单而强大的绘制直方图的工具,可以方便地展示数据的分布情况,进而提供数据分析的依据。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Dontla

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值