n8_Visualizing Multivariate_sns_3D plot_matplotlib.dates_mpl_finance_aapl stock_EMA_RSI_Bollinger

     When we have big data that contains many variables, the plot types in Chapter 7, Visualizing Online Data (https://blog.csdn.net/Linli522362242/article/details/93617948), may no longer be an effective way to visualize it. We may try to cram as many variables as possible into a single plot, but the overcrowded or cluttered details would quickly exceed the limits of human visual perception.

 

     In this chapter, we aim to introduce multivariate data visualization techniques; they enable us to better understand the distribution of data and the relationships between variables. Here is the outline of this chapter:

  • Getting End-of-Day (EOD) stock data from Quandl
  • Two-dimensional faceted plots:
    • Factor plot in Seaborn
    • Faceted grid in Seaborn
    • Pair plot in Seaborn
  • Other two-dimensional multivariate plots:
    • Heatmap in Seaborn
    • Candlestick plot in matplotlib.finance:
    • Visualizing various stock market indicators
    • Building a comprehensive stock chart
  • Three-dimensional plots:
    • Scatter plot
    • Bar chart
    • Caveats of using Matplotlib 3D

     First, we will discuss faceted plots, which are a divide-and-conquer approach to visualizing multivariate data. The gestalt of this approach is to slice the input data into different facets such that only a handful of attributes are represented in each visualization panel. This reduces visual clutter by allowing inspection of variables in reduced subsets. Sometimes, finding a suitable way to represent multivariate data in a 2D graph is difficult. Therefore, we are going to introduce 3D plotting functions in Matplotlib as well.

     The data used in this chapter was collected from Quandl's End-of-Day (EOD) stock database. Let's get the data from Quandl first.

Getting End-of-Day (EOD) stock data from Quandl

     Since we are going to discuss stock data extensively, note that we do not guarantee the accuracy, completeness, or validity of the content presented; nor are we responsible for any errors or omissions that may have occurred. The data, visualizations, and analyses are provided on an “as is” basis for educational purposes only, without any representations, warranties, or conditions of any kind. Therefore, the publisher and the authors do not accept liability for your use of the content. It should be noted that past stock performance may not predict future performance. Readers should also be aware of the risks involved in stock investments and should not take any investment decisions based on the content in this chapter. In addition, readers are advised to conduct their own independent research into individual stocks before making an investment decision.
######################## Why?

pip install yfinance

import yfinance as yf

# Same date range as the Quandl request later in this section
aapl_df = yf.download( 'AAPL', start="2017-01-01", end="2017-06-30" )
aapl_df.head()

Comparing this output with the Quandl result indicates that the data we get later using the Quandl JSON API is not real stock data.

########################

     We are going to adapt the Quandl JSON API code in Chapter 7, Visualizing Online Data, to get EOD stock data from Quandl. The historical stock data from January 1, 2017 to June 30, 2017 for six stock codes will be obtained: Apple Inc. (EOD/AAPL), The Procter & Gamble Company (EOD/PG), Johnson & Johnson (EOD/JNJ), Exxon Mobil Corporation (EOD/XOM), International Business Machines Corporation (EOD/IBM), and Microsoft Corporation (EOD/MSFT). Note that the listing below requests the same six tickers from the WIKI database (WIKI/AAPL and so on) rather than from EOD. Again, we will use the default urllib and json modules to handle Quandl API calls, followed by converting the data into a Pandas DataFrame:

from urllib.request import urlopen
import json
import pandas as pd
def get_quandl_dataset( api_key, code, start_date, end_date ):
    """Obtain and parse a quandl dataset in Pandas DataFrame format
        Quandl returns dataset in JSON format, where data is stored as a
        list of lists in response['dataset']['data'], and column headers
        stored in response['dataset']['column_names'].
        Args:
            api_key: Quandl API key
            code: Quandl dataset code, e.g. "WIKI/AAPL"
            start_date: first date to retrieve, formatted as YYYY-MM-DD
            end_date: last date to retrieve, formatted as YYYY-MM-DD
        Returns:
            df: Pandas DataFrame of a Quandl dataset
    """
    # https://docs.data.nasdaq.com/docs/in-depth-usage
    # https://data.nasdaq.com/api/v3/datasets/{database_code}/{dataset_code}.json?api_key=sKqHwnHr8rNWK-3s5imS
    # https://docs.data.nasdaq.com/docs/parameters-2
    # example
    # https://data.nasdaq.com/api/v3/datasets/wiki/AAPL.json?api_key=sKqHwnHr8rNWK-3s5imS
    #                                                        &start_date=2017-01-01
    #                                                        &end_date=2017-06-30
    base_url = "https://data.nasdaq.com/api/v3/datasets/"
    url_suffix = ".json?api_key="
    date = "&start_date={}&end_date={}".format( start_date, end_date )
    
    # Fetch the JSON response
    u = urlopen(base_url + code + url_suffix + api_key + date)
    # https://data.nasdaq.com/api/v3/datasets/WIKI/AAPL.json?api_key=sKqHwnHr8rNWK-3s5imS&start_date=2017-01-01&end_date=2017-06-30
    response = json.loads( u.read().decode('utf-8') )
    # Format the response as Pandas Dataframe
    df = pd.DataFrame( response['dataset']['data'],
                       columns = response['dataset']['column_names']
                     )
    return df

# Input your own API key here
api_key = "sKqHwnHr8rNWK-3s5imS" #"gwguNnzq_4xR18V7ChED"

# Quandl code for six US companies
# {database_code}/{dataset_code}
codes = ["WIKI/AAPL", # Apple Inc
         "WIKI/PG",   # The Procter & Gamble Company
         "WIKI/JNJ",  # Johnson & Johnson
         "WIKI/XOM",  # Exxon Mobil Corporation
         "WIKI/IBM",  # International Business Machines Corporation
         "WIKI/MSFT"  # Microsoft Corporation
        ]
start_date = "2017-01-01"
end_date = "2017-06-30"

dfs = []
# Get the DataFrame that contains the WIKI data for each company
for code in codes:
    df = get_quandl_dataset( api_key, code, start_date, end_date )
    df["Company"] = code[5:] # WIKI/AAPL ==> AAPL
    dfs.append( df ) # dfs[appl, pg, ...]
    
# Concatenate all dataframes into a single one
stock_df = pd.concat( dfs, axis=0 )

# Sort by ascending order of Company then Date
stock_df = stock_df.sort_values( ["Company", "Date"] )
stock_df.head()
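
To make the parsing step concrete without calling the API, here is a minimal sketch that feeds a hand-written response of the same shape (a dict with column names under `dataset.column_names` and rows under `dataset.data`) through the same DataFrame construction. The helper name `response_to_dataframe`, the reduced set of columns, and the opening prices are illustrative assumptions, not real Quandl output.

```python
import json
import pandas as pd

# Hypothetical, hand-written payload that mimics the structure described
# in the docstring of get_quandl_dataset(); the values are for illustration.
sample_json = """
{"dataset": {"column_names": ["Date", "Open", "Close"],
             "data": [["2017-01-04", 115.85, 116.02],
                      ["2017-01-03", 115.80, 116.15]]}}
"""

def response_to_dataframe(response, company):
    """Build a DataFrame from a parsed Quandl-style response (sketch)."""
    df = pd.DataFrame(response["dataset"]["data"],
                      columns=response["dataset"]["column_names"])
    df["Company"] = company
    return df

response = json.loads(sample_json)
sample_df = response_to_dataframe(response, "AAPL")

# Same ordering step as in the main listing: Company first, then Date
sample_df = sample_df.sort_values(["Company", "Date"])
print(sample_df)
```

Sorting on the string-formatted Date column works here because the YYYY-MM-DD format sorts chronologically when compared as text.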

     The dataframe contains Opening, High, Low, and Closing (OHLC) prices for each stock. Extra information is also available; for example, the dividend column reflects the cash dividend value on that day. The split column shows the ratio of new shares to old shares if a split occurred on that day. The adjusted prices account for price fluctuations due to distributions or corporate actions by assuming that all these actions were reinvested into the current stock. For more information about these columns, consult the documentation pages on Quandl. 

Adjusted Closing Price Definition

  • The adjusted closing price amends a stock's closing price to reflect that stock's value after accounting for any corporate actions.
  • The closing price is the raw price, which is just the cash value of the last transacted price before the market closes.
  • The adjusted closing price factors in corporate actions, such as stock splits, dividends, and rights offerings. Adjustments allow investors to obtain an accurate record of the stock's performance. Investors should understand how corporate actions are accounted for in a stock's adjusted closing price. It is especially useful when examining historical returns because it gives analysts an accurate representation of the firm's equity value.
  • The adjusted closing price can obscure the impact of key nominal prices and stock splits on prices in the short term.
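
As a rough illustration of the split part of this adjustment only (dividends and rights offerings are ignored), the sketch below rescales hypothetical raw closes around a 2-for-1 split. The column names `Close`, `Split`, and `Adj_Close` and all prices are assumptions made for this example; this is not the exact procedure a data vendor uses.

```python
import pandas as pd

# Hypothetical raw closes around a 2-for-1 split that takes effect on
# the third day (the price roughly halves on that day).
raw = pd.DataFrame({
    "Close": [100.0, 102.0, 51.5, 52.0],
    "Split": [1.0, 1.0, 2.0, 1.0],   # ratio of new shares to old shares
})

# For each day, multiply together all split ratios that occur AFTER it.
# Reversing, taking the cumulative product, and reversing back gives the
# product of the ratios from that day onward; shifting by one row drops
# the day's own ratio.
future_splits = raw["Split"][::-1].cumprod()[::-1].shift(-1, fill_value=1.0)

# Dividing by that factor makes pre-split prices comparable with
# post-split prices.
raw["Adj_Close"] = raw["Close"] / future_splits
print(raw)
```

After the adjustment the series moves from 50.0 to 51.0 to 51.5 to 52.0, so the artificial drop caused by the split disappears while the day-to-day changes are preserved.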

Grouping the companies by industry

     As you may have noticed, three of the companies (AAPL, IBM, and MSFT) are tech companies, while the remaining three companies are not. Stock analysts often group companies by industry to gain deeper insights. Let's try to label the companies by industry:

# Classify companies by industry
tech_companies = set( ['AAPL', 'IBM', 'MSFT'] )
stock_df['Industry'] = [ "Tech" if c in tech_companies else "Others"
                                for c in stock_df["Company"]
                       ]
stock_df.head()

Converting the date to a supported format

     The Date column in stock_df is recorded as a series of Python strings. Although Seaborn can use string-formatted dates in some functions, Matplotlib cannot. To make the dates malleable for data processing and visualization, we need to convert the values to the floating-point numbers supported by Matplotlib:

from matplotlib.dates import date2num

# Convert Date column from string to Python datetime object,
# then to float number that is supported by Matplotlib.
stock_df['Datetime'] = date2num( pd.to_datetime( stock_df["Date"], # pd.Series: Name: Date, Length: 750, dtype: object
                                                 format="%Y-%m-%d"
                                               ).tolist()
                                # [Timestamp('2017-01-03 00:00:00'),...,Timestamp('2017-06-30 00:00:00')]
                               )# Convert datetime objects to Matplotlib dates.
stock_df.head()
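
A small sketch of what this conversion produces, using two dates only: Matplotlib date numbers count days, so consecutive calendar days differ by exactly 1.0, and `num2date` converts a number back to a datetime. The absolute values are not printed here because they depend on the epoch used by the installed Matplotlib version.

```python
from datetime import date

import pandas as pd
from matplotlib.dates import date2num, num2date

# Two string-formatted dates, parsed the same way as in the listing above
dates = pd.to_datetime(pd.Series(["2017-01-03", "2017-01-04"]),
                       format="%Y-%m-%d")
nums = date2num(dates.tolist())

# Matplotlib dates are floats measured in days
print(nums[1] - nums[0])

# num2date reverses the conversion (the result is timezone-aware, UTC by default)
print(num2date(nums[0]).date())
```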

Getting the percentage change of the closing price 

     Next, we want to calculate the change of the closing price with regard to the previous day's close. The pct_change() function in Pandas makes this task very easy: 

https://blog.csdn.net/Linli522362242/article/details/102314389 : Trading Strategy- simple moving averages: SMA1 = 42 or SMA2 = 252

import numpy as np

# Calculate percentage change versus the previous close
stock_df["Close_change"] = stock_df["Close"].pct_change()

# Since the DataFrame contains multiple companies' stock data,
# the first record of each company in "Close_change" should be set to
# NaN to avoid referencing the closing price of a different company.
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0.)
stock_df.loc[ stock_df["Date"]=="2017-01-03",
              "Close_change"
            ] = np.nan
stock_df.head()

(116.02-116.15)/116.15
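
An alternative that avoids hard-coding the first trading date is to compute the change within each company. The sketch below uses a small stand-in DataFrame (`toy_df`, with illustrative prices) rather than the real `stock_df`; `groupby("Company")` restarts `pct_change()` for every company, so the first row of each group is NaN automatically.

```python
import pandas as pd

# Stand-in for stock_df with two companies and two trading days each
toy_df = pd.DataFrame({
    "Company": ["AAPL", "AAPL", "IBM", "IBM"],
    "Date": ["2017-01-03", "2017-01-04", "2017-01-03", "2017-01-04"],
    "Close": [116.15, 116.02, 167.19, 169.26],
})
toy_df = toy_df.sort_values(["Company", "Date"])

# pct_change() is applied per company, so no value crosses a company boundary
toy_df["Close_change"] = toy_df.groupby("Company")["Close"].pct_change()
print(toy_df)
```

The second AAPL row reproduces the hand calculation above, (116.02-116.15)/116.15.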

 

Two-dimensional faceted plots

     We are going to introduce three major ways to create faceted plots: seaborn.factorplot() (renamed seaborn.catplot() in newer Seaborn releases), seaborn.FacetGrid(), and seaborn.pairplot(). These functions work similarly to seaborn.lmplot() in the way facets are defined. You might have seen some faceted plots in the previous chapter (https://blog.csdn.net/Linli522362242/article/details/93617948), when we talked about seaborn.lmplot(). In fact, the seaborn.lmplot() function combines seaborn.regplot() with seaborn.FacetGrid(), and the definitions of data subsets can be adjusted by the hue, col, and row parameters.

Factor plot in Seaborn

     With the help of seaborn.factorplot(), we can draw categorical point plots, box plots, violin plots, bar plots, or strip plots onto a seaborn.FacetGrid() by tuning the kind parameter. The default plot type for factorplot is a point plot. Unlike other plotting functions in Seaborn, which support a wide variety of input data formats, factorplot supports only pandas DataFrames as input, while variable/column names can be supplied as strings to x, y, hue, col, or row. Because factorplot has been renamed to catplot, whose default kind is a strip plot, the listing below calls seaborn.catplot() with kind='point' explicitly:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set( style="ticks" )

# Plot EOD stock closing price vs Date for each company.
# Color of plot elements is determined by company name (hue="Company"),
# plot panels are also arranged in columns accordingly (col="Company").
# The col_wrap parameter determines the number of panels per row
# g = sns.factorplot( x="Date", y="Close",
#                     hue="Company", # (optional), This parameter take column name for colour encoding
#                     col="Company",
#                     data = stock_df, col_wrap=3
#                   )

# The `factorplot` function has been renamed to `catplot`.
g = sns.catplot( x="Date", y="Close",
                 hue="Company", # (optional), This parameter take column name for colour encoding
                 col="Company",
                 data = stock_df, col_wrap=3,
                 kind='point',
               )
plt.show()

################################### it will take a long time to run

###################################

There are several issues in the preceding plot. (kind='point')

     First, the aspect ratio (width divided by height) is slightly suboptimal for a time series chart. A wider plot would allow us to observe minute changes during the time period. We are going to adjust that using the aspect parameter (here height=3 inches for each facet and aspect=1.5).

     Second, the lines and dots are too thick, thereby masking some details in the plot. We can reduce the size of these visual elements by tweaking the scale parameter.

     Lastly, the ticks are too close to each other, and the tick labels are overlapping. After plotting, sns.catplot() (like the older sns.factorplot()) returns a FacetGrid, which is denoted as g in the code. We can further tweak the aesthetics of the plot, such as tick positions and labels, by calling the relevant functions of the FacetGrid object:

# Increase the aspect ratio and size of each panel
g = sns.catplot( x="Date", y="Close",
                 hue="Company", col="Company", #each company name as each facet's title
                 data=stock_df,
                 # “Wrap” the column variable at this width, 
                 # so that the column facets span multiple rows.
                 col_wrap=3,
                 height=3, # Height (in inches) of each facet.
                 aspect=1.5, # length divided by height
                 # https://seaborn.pydata.org/generated/seaborn.pointplot.html#seaborn.pointplot
                 kind="point", # pointplot()
                 scale=0.5, # Scale factor for the plot elements. since kind='point': pointplot()
               )

# Thinning of ticks (select 1 in 10)
locs, labels = plt.xticks()
g.set( xticks=locs[0::10], 
       xticklabels=labels[0::10]
     )

# Rotate the tick labels to prevent overlap
g.set_xticklabels( rotation=30 )

# Reduce the white space between plots
g.fig.subplots_adjust( wspace=0.1, hspace=0.2 )

plt.show()
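
The tick-thinning idea can also be tried in isolation with plain Matplotlib. This is a minimal sketch on made-up data, not the Seaborn code above: string x values are placed at integer positions 0 to n-1, so keeping every 10th position keeps every 10th date. The Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for portability
import matplotlib.pyplot as plt

# 25 string-formatted dates and made-up closing prices
dates = ["2017-01-{:02d}".format(d) for d in range(1, 26)]
closes = [100 + 0.5 * i for i in range(25)]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(dates, closes, marker="o", markersize=3, linewidth=1)

# Thinning of ticks (select 1 in 10), then rotate the labels
positions = list(range(0, len(dates), 10))
ax.set_xticks(positions)
ax.set_xticklabels([dates[i] for i in positions], rotation=30)
fig.tight_layout()
```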

 ############################

# Increase the aspect ratio and size of each panel
g = sns.catplot( x="Date", y="Close",
                 hue="Company", col="Company",
                 data=stock_df,
                 # “Wrap” the column variable at this width, 
                 # so that the column facets span multiple rows.
                 col_wrap=3,
                 #height=3, # Height (in inches) of each facet.
                 aspect=1.5, # length divided by height
                 # https://seaborn.pydata.org/generated/seaborn.pointplot.html#seaborn.pointplot
                 kind="point", # pointplot()
                 scale=0.5, # Scale factor for the plot elements. since kind='point': pointplot()
               )

# Increase the aspect ratio and size of each panel
g = sns.catplot( x="Date", y="Close",
                 hue="Company", col="Company",
                 data=stock_df,
                 # “Wrap” the column variable at this width, 
                 # so that the column facets span multiple rows.
                 col_wrap=3,
                 #height=3, # Height (in inches) of each facet.
                 aspect=1.5, # length divided by height
                 # https://seaborn.pydata.org/generated/seaborn.pointplot.html#seaborn.pointplot
                 kind="point", # pointplot()
                 #scale=0.5, # Scale factor for the plot elements. since kind='point': pointplot()
               )

# Thinning of ticks (select 1 in 10)
locs, labels = plt.xticks()
g.set( xticks=locs[0::10], 
       xticklabels=labels[0::10]
     )

# Rotate the tick labels to prevent overlap
g.set_xticklabels( rotation=30 )

# Reduce the white space between plots
g.fig.subplots_adjust( wspace=0.1, hspace=0.2 )

plt.show()

############################

# Create faceted plot separated by industry
g = sns.catplot( x="Date", y="Close",
                 hue="Company", # one color per company
                 col="Industry", # one facet per industry; the industry name becomes the facet title
                 data=stock_df,
                 height=4, # Height (in inches) of each facet.
                 aspect=1.5, # length divided by height
                 # https://seaborn.pydata.org/generated/seaborn.pointplot.html#seaborn.pointplot
                 kind="point", # pointplot()
                 scale=0.5, # Scale factor for the plot elements. since kind='point': pointplot()
               )

# Thinning of ticks (select 1 in 10)
locs, labels = plt.xticks()
g.set( xticks=locs[0::10], 
       xticklabels=labels[0::10]
     )

# Rotate the tick labels to prevent overlap
g.set_xticklabels( rotation=30 )

# Reduce the white space between plots
g.fig.subplots_adjust( wspace=0.1, hspace=0.2 )

plt.show()

Faceted grid in Seaborn

Up until now, we have already mentioned FacetGrid a few times, but what exactly is it? 

     FacetGrid is an engine for subsetting data and drawing plot panels, where the subsets are determined by assigning variables to the row, col, and hue parameters. While we can use wrapper functions such as lmplot and factorplot to scaffold plots on FacetGrid easily, it is more flexible to build a FacetGrid from scratch. To do that, we first supply a pandas DataFrame to the FacetGrid object and specify the way to lay out the grid via the col, row, and hue parameters. Then we can assign a Seaborn or Matplotlib plotting function to each panel by calling the map() function of the FacetGrid object:

g = sns.FacetGrid( stock_df, col="Company", hue="Company", 
                   height=3, aspect=2, col_wrap=2,
                 )
# Map seaborn.histplot (the replacement for the deprecated
# seaborn.distplot) to the panels to show a histogram of closing prices.
# histplot plots raw counts by default, so the y-axis is labeled "Count".
# g.map( sns.distplot, "Close", kde=True, bins=8 )
g.map( sns.histplot, "Close", kde=True, kde_kws=dict(cut=3), bins=8 )

# Label the axes
g.set_axis_labels( "Closing price (US Dollars)",
                   "Count"
                 )
# plt.yticks( plt.gca().get_yticks(), plt.gca().get_yticks() * 100)
plt.show()

     A related plot type is a density plot, which is formed by computing an estimate of a continuous probability distribution that might have generated the observed data. The usual procedure is to approximate this distribution as a mixture of “kernels”—that is, simpler distributions like the normal distribution. Thus, density plots are also known as kernel density estimate (KDE) plots : 
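
The "mixture of kernels" idea can be written out in a few lines of NumPy. This is a minimal sketch with a fixed, hand-picked bandwidth and made-up closing prices; Seaborn selects the bandwidth automatically, so its curve will generally differ from this one. The function name `gaussian_kde_sketch` is an assumption for illustration.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average of normal kernels centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    # z[i, j] = standardized distance from grid point i to sample j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    # Averaging the kernels keeps the total area equal to 1
    return kernels.mean(axis=1)

closes = np.array([115.8, 116.0, 116.2, 117.9, 119.0, 119.1])  # made-up prices
grid = np.linspace(105, 130, 501)
density = gaussian_kde_sketch(closes, grid, bandwidth=1.0)

# Riemann-sum check that the estimate integrates to roughly 1
area = np.sum(density) * (grid[1] - grid[0])
print(round(area, 3))
```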

 #####################################

FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).

g = sns.FacetGrid( stock_df, col="Company", hue="Company", 
                   height=3, aspect=2, col_wrap=2,
                 )
# Map the seaborn.distplot function to the panels,
# which shows a histogram of closing prices.
g.map( sns.distplot, "Close", kde=True, bins=8 )
# g.map( sns.histplot, "Close", kde=True, kde_kws=dict(cut=3), bins=8 )

# Label the axes
g.set_axis_labels( "Closing price (US Dollars)",
                   "Density"
                 )
# plt.yticks( plt.gca().get_yticks(), plt.gca().get_yticks() * 100)
plt.show()

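The y-axis scales of the two graphs are different: distplot with kde=True normalizes the histogram to a density, whereas histplot plots raw counts by default.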

#####################################
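
The relationship between a count axis and a density axis can be checked numerically with NumPy alone. In this sketch (made-up prices, four bins chosen arbitrarily), each density value equals the bin count divided by the number of samples times the bin width, so the density bars integrate to 1 while the count bars sum to the number of samples.

```python
import numpy as np

closes = np.array([115.8, 116.0, 116.2, 117.9, 119.0, 119.1, 120.5, 121.0])

# Raw counts per bin, and the same histogram normalized to a density
counts, edges = np.histogram(closes, bins=4)
density, _ = np.histogram(closes, bins=4, density=True)
widths = np.diff(edges)

# density = count / (n * bin width)
manual_density = counts / (counts.sum() * widths)
print(counts, density)
```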

 We can also supply keyword arguments to the plotting functions:

# [ stock_df['Company']=='AAPL']
g = sns.FacetGrid( stock_df, col="Company", hue="Company", 
                   height=3, aspect=2.2, col_wrap=2
                 )

# We can supply extra kwargs to the plotting function.
# Let's turn off KDE line (kde=False), and 
# plot raw frequency of bins only (norm_hist=False).
# By setting rug=True, tick marks that denotes the density of data points
#                                will be shown in the bottom.
g.map( sns.distplot, "Close", 
       kde=False,
       norm_hist=False, # If True, the histogram height shows a density rather than a count.
                        # This is implied if a KDE or fitted density is plotted.
       rug=True
     )
g.set_axis_labels( "Closing price (US Dollar)",
                   "Count" # raw bin counts, since norm_hist=False
                 )
plt.show()

 ########################################

g = sns.FacetGrid( stock_df[ stock_df['Company']=='AAPL'], col="Company", hue="Company", 
                   height=8, aspect=2.2, col_wrap=2
                 )

# We can supply extra kwargs to the plotting function.
# Let's turn off KDE line (kde=False), and 
# plot raw frequency of bins only (norm_hist=False).
# By setting rug=True, tick marks that denotes the density of data points
#                                will be shown in the bottom.
g.map( sns.distplot, "Close", 
       kde=False,
       norm_hist=False, # If True, the histogram height shows a density rather than a count.
                        # This is implied if a KDE or fitted density is plotted.
       rug=True
     )
g.set_axis_labels( "Closing price (US Dollar)",
                   "Count" # raw bin counts, since norm_hist=False
                 )
plt.show()

g = sns.FacetGrid( stock_df[ stock_df['Company']=='AAPL'], col="Company", hue="Company", 
                   height=8, aspect=2.2, col_wrap=2
                 )

# histplot has no norm_hist or rug arguments. It plots the raw
# frequency of each bin by default (stat="count"), and the rug of
# tick marks at the bottom is drawn separately with sns.rugplot.
g.map( sns.histplot, "Close", 
       kde=False,
       bins=8,
     )
g.map( sns.rugplot, "Close", 
     )

g.set_axis_labels( "Closing price (US Dollar)",
                   "Count"
                 )
plt.show()

 This confirms that the y-axis scales of the two graphs are different, which is expected.

########################################

     FacetGrid is not limited to the use of Seaborn plotting functions; let's try to map the good old matplotlib.pyplot.plot() function to FacetGrid:

from matplotlib.dates import DateFormatter

# from matplotlib.dates import date2num

# # Convert Date column from string to Python datetime object,
# # then to float number that is supported by Matplotlib.
# stock_df['Datetime'] = date2num( pd.to_datetime( stock_df["Date"], # pd.Series: Name: Date, Length: 750, dtype: object
#                                                  format="%Y-%m-%d"
#                                                ).tolist()
#                                 # [Timestamp('2017-01-03 00:00:00'),...,Timestamp('2017-06-30 00:00:00')]
#                                )# Convert datetime objects to Matplotlib dates. e.g. 736332.0, 736333.0

g = sns.FacetGrid( stock_df, hue="Company", col="Industry",
                   height=4, aspect=1.5, 
                   col_wrap=2
                 )

# plt.plot doesn't support string-formatted Date,
# so we need to use the Datetime column 
# that we prepared earlier instead.
g.map( plt.plot, "Datetime", "Close", 
       marker="o", markersize=3, 
       linewidth=1
     )

# We can access individual axes through g.axes[column]
# or g.axes[row,column] if multiple rows are present.
# Let's adjust the tick formatter and rotate the tick labels
# in each axes.
for col in range(2):
    g.axes[col].xaxis.set_major_formatter( DateFormatter('%Y-%m-%d') )
    plt.setp( g.axes[col].get_xticklabels(), rotation=30 )
    
# g.set_axis_labels( "", "Closing price (US Dollar)" )
plt.show()

Pair plot in Seaborn(sns.reset_defaults())

     A pair plot is a special type of FacetGrid. Pairwise relationships between all variables in the input DataFrame will be visualized as scatter plots. In addition, a series of histograms will be displayed along the main diagonal (from top left to bottom right) to show the distribution of the variable in that column:

# Show a pairplot of three selected variables (vars=["Open", "Volume", "Close"])
sns.reset_defaults()
g = sns.pairplot( stock_df, hue="Company",
                  vars=["Open", "Volume", "Close"],
                  diag_kind="hist",
                  diag_kws=dict( multiple="stack", bins=10, linewidth=0 )
                )
plt.show()

########################### 

# Show a pairplot of three selected variables (vars=["Open", "Volume", "Close"])
g = sns.pairplot( stock_df, hue="Company",
                  vars=["Open", "Volume", "Close"],
                  diag_kind="hist",
                  diag_kws=dict( multiple="stack", bins=10,)#  linewidth=0 
                )
plt.show()

VS PairGrid 

sns.reset_defaults()
g = sns.PairGrid( stock_df, hue="Company", 
                  vars=["Open", "Volume", "Close"]
                )
g.map_diag(plt.hist, 
           histtype='bar',
           stacked=True,
          )
g.map_offdiag(plt.scatter)
g.add_legend(title="", adjust_subtitles=True)
plt.show()

###########################

     We can tweak many aspects of the plot. In the next example, we will increase the aspect ratio, change the plot type in the diagonal line to KDE plot, and adjust the aesthetics of the plots using keyword arguments:

# Adjust the aesthetics of the plot
g = sns.pairplot( stock_df, hue="Company",
                  vars = ["Open", "Volume", "Close"],
                  aspect=1.5,
                  diag_kind="kde",
                  diag_kws = dict( fill=True ), # "fill" replaces the deprecated "shade"
                  plot_kws = dict( s=15, marker="+"),
                )
plt.show()

     Similar to other plots based on FacetGrid, we can define the variables to be displayed in each panel. We can also manually define the comparisons that matter to us instead of an all-versus-all comparison by setting the x_vars and y_vars parameters. You may also use seaborn.PairGrid() directly if you require even higher flexibility for defining comparison groups:

# Manually define the comparisons that we are interested in.
sns.set( font_scale=1.2 )
g = sns.pairplot( stock_df[1:], hue="Company", aspect=1.5,
                  x_vars=["Open", "Volume"],
                  y_vars=["Close", "Close_change"],
                  diag_kind=None, #####very important
                )
plt.show()

Other two-dimensional multivariate plots

     FacetGrid, factor plot, and pair plot may take up a lot of space when we need to visualize more variables or samples. There are two special plot types that come in handy if you want to maximize space efficiency: heatmaps and candlestick plots.

Heatmap in Seaborn

     A heatmap is an extremely compact way to display a large amount of data. In the finance world, color-coded blocks can give investors a quick glance at which stocks are up or down. In the scientific world, heatmaps allow researchers to visualize the expression level of thousands of genes.

     The seaborn.heatmap() function expects a 2D list, 2D NumPy array, or pandas DataFrame as input. If a list or array is supplied, we can supply column and row labels via xticklabels and yticklabels respectively. On the other hand, if a DataFrame is supplied, the column labels and index values will be used to label the columns and rows respectively.

     To get started, we will plot an overview of the performance of the six stocks using a heatmap. We define stock performance as the change of closing price when compared to the previous close. This piece of information was already calculated earlier in this chapter (that is, the Close_change column). Unfortunately, we can't supply the whole DataFrame to seaborn.heatmap() directly, since it expects company names as columns, date as index, and the change in closing price as values.

     If you are familiar with Microsoft Excel, you might have experience in using pivot tables, a powerful technique to summarize the levels or values of a particular variable. pandas includes such functionality. The following code excerpt makes use of the pandas.DataFrame.pivot() function to make a pivot table:

stock_df[-5:]

stock_change = stock_df.pivot( index="Date", columns="Company",
                               values="Close_change"
                             )
stock_change = stock_change.loc["2017-06-01":"2017-06-30"]
stock_change.head()
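
To see what the pivot does on a scale small enough to read at a glance, here is a sketch with a four-row stand-in for stock_df. The names `long_df` and `wide_df` and the percentage changes are made up for this example.

```python
import pandas as pd

# Long format: one row per (Date, Company) pair, as in stock_df
long_df = pd.DataFrame({
    "Date": ["2017-06-01", "2017-06-01", "2017-06-02", "2017-06-02"],
    "Company": ["AAPL", "IBM", "AAPL", "IBM"],
    "Close_change": [0.003, -0.001, 0.015, 0.002],
})

# Wide format: dates as the index, one column per company
wide_df = long_df.pivot(index="Date", columns="Company",
                        values="Close_change")
print(wide_df)
```

Each (Date, Company) pair must be unique for pivot() to work; duplicated pairs raise an error because there would be more than one value for a single cell.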


With the pivot table ready, we can proceed to plot our first heatmap: 

Supported values for cmap are 'Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', 'BuGn', 'BuGn_r', 'BuPu', 'BuPu_r', 'CMRmap', 'CMRmap_r', 'Dark2', 'Dark2_r', 'GnBu', 'GnBu_r', 'Greens', 'Greens_r', 'Greys', 'Greys_r', 'OrRd', 'OrRd_r', 'Oranges', 'Oranges_r', 'PRGn', 'PRGn_r', 'Paired', 'Paired_r', 'Pastel1', 'Pastel1_r', 'Pastel2', 'Pastel2_r', 'PiYG', 'PiYG_r', 'PuBu', 'PuBuGn', 'PuBuGn_r', 'PuBu_r', 'PuOr', 'PuOr_r', 'PuRd', 'PuRd_r', 'Purples', 'Purples_r', 'RdBu', 'RdBu_r', 'RdGy', 'RdGy_r', 'RdPu', 'RdPu_r', 'RdYlBu', 'RdYlBu_r', 'RdYlGn', 'RdYlGn_r', 'Reds', 'Reds_r', 'Set1', 'Set1_r', 'Set2', 'Set2_r', 'Set3', 'Set3_r', 'Spectral', 'Spectral_r', 'Wistia', 'Wistia_r', 'YlGn', 'YlGnBu', 'YlGnBu_r', 'YlGn_r', 'YlOrBr', 'YlOrBr_r', 'YlOrRd', 'YlOrRd_r', 'afmhot', 'afmhot_r', 'autumn', 'autumn_r', 'binary', 'binary_r', 'bone', 'bone_r', 'brg', 'brg_r', 'bwr', 'bwr_r', 'cividis', 'cividis_r', 'cool', 'cool_r', 'coolwarm', 'coolwarm_r', 'copper', 'copper_r', 'crest', 'crest_r', 'cubehelix', 'cubehelix_r', 'flag', 'flag_r', 'flare', 'flare_r', 'gist_earth', 'gist_earth_r', 'gist_gray', 'gist_gray_r', 'gist_heat', 'gist_heat_r', 'gist_ncar', 'gist_ncar_r', 'gist_rainbow', 'gist_rainbow_r', 'gist_stern', 'gist_stern_r', 'gist_yarg', 'gist_yarg_r', 'gnuplot', 'gnuplot2', 'gnuplot2_r', 'gnuplot_r', 'gray', 'gray_r', 'hot', 'hot_r', 'hsv', 'hsv_r', 'icefire', 'icefire_r', 'inferno', 'inferno_r', 'jet', 'jet_r', 'magma', 'magma_r', 'mako', 'mako_r', 'nipy_spectral', 'nipy_spectral_r', 'ocean', 'ocean_r', 'pink', 'pink_r', 'plasma', 'plasma_r', 'prism', 'prism_r', 'rainbow', 'rainbow_r', 'rocket', 'rocket_r', 'seismic', 'seismic_r', 'spring', 'spring_r', 'summer', 'summer_r', 'tab10', 'tab10_r', 'tab20', 'tab20_r', 'tab20b', 'tab20b_r', 'tab20c', 'tab20c_r', 'terrain', 'terrain_r', 'twilight', 'twilight_r', 'twilight_shifted', 'twilight_shifted_r', 'viridis', 'viridis_r', 'vlag', 'vlag_r', 'winter', 'winter_r'

ax = sns.heatmap( stock_change, 
                  cmap=  'RdBu_r'# "RdYlBu_r"# "BuPu"# "YlGnBu"
                )

plt.show()

     The default heatmap implementation is not really compact enough. Of course, we can resize the figure via plt.figure(figsize=(width, height)); we can also toggle the square parameter to create square-shaped blocks. To ease visual recognition, we can add a thin border around the blocks.

     By US stock market convention, green denotes a rise and red denotes a fall in prices. Hence we can adjust the cmap parameter to change the color map. However, neither Matplotlib nor Seaborn includes a plain red-green color map (the closest built-in, RdYlGn, passes through yellow), so we need to create our own.

     At the end of Chapter 7, Visualizing Online Data, we briefly introduced functions for creating custom color maps. Here we will use seaborn.diverging_palette() to create the red-green color map, which requires us to specify the hues, saturation, and lightness (husl) for the negative and positive extents of the color map. You may also call seaborn.choose_diverging_palette() to launch an interactive widget in Jupyter Notebook to help select the colors; its source code in Seaborn is reproduced here for reference:

#################### 

def choose_diverging_palette(as_cmap=False):
    """Launch an interactive widget to choose a diverging color palette.

    This corresponds with the :func:`diverging_palette` function. This kind
    of palette is good for data that range between interesting low values
    and interesting high values with a meaningful midpoint. (For example,
    change scores relative to some baseline value).

    Requires IPython 2+ and must be used in the notebook.

    Parameters
    ----------
    as_cmap : bool
        If True, the return value is a matplotlib colormap rather than a
        list of discrete colors.

    Returns
    -------
    pal or cmap : list of colors or matplotlib colormap
        Object that can be passed to plotting functions.

    See Also
    --------
    diverging_palette : Create a diverging color palette or colormap.
    choose_colorbrewer_palette : Interactively choose palettes from the
                                 colorbrewer set, including diverging palettes.

    """
    pal = []
    if as_cmap:
        cmap = _init_mutable_colormap()

    @interact
    def choose_diverging_palette(
        h_neg=IntSlider(min=0,
                        max=359,
                        value=220),
        h_pos=IntSlider(min=0,
                        max=359,
                        value=10),
        s=IntSlider(min=0, max=99, value=74),
        l=IntSlider(min=0, max=99, value=50),  # noqa: E741
        sep=IntSlider(min=1, max=50, value=10),
        n=(2, 16),
        center=["light", "dark"]
    ):
        if as_cmap:
            colors = diverging_palette(h_neg, h_pos, s, l, sep, 256, center)
            _update_lut(cmap, colors)
            _show_cmap(cmap)
        else:
            pal[:] = diverging_palette(h_neg, h_pos, s, l, sep, n, center)
            palplot(pal)

    if as_cmap:
        return cmap
    return pal

 VS google colab:

seaborn.husl_palette — seaborn 0.11.2 documentation

%matplotlib notebook
sns.palplot(sns.husl_palette(10, h=.5))

#################### 

# Create a new red-green color map
# using the husl color system.
# h_neg and h_pos determine the hues at the two extents of the color map.
# s determines the color saturation.
# l determines the lightness.
# sep determines the width of the center region.
# In addition, we need to set as_cmap=True because the cmap parameter of
# sns.heatmap expects a Matplotlib colormap object.

rdgn = sns.diverging_palette( h_neg=10, h_pos=140, 
                              s=80, 
                              l=50, 
                              sep=10, as_cmap=True )

# Change to square blocks (square=True),
# add a thin border (linewidths=.5), and
# change the color map to follow the US stock market convention (cmap=rdgn).
ax = sns.heatmap( stock_change, cmap=rdgn,
                  linewidths=.5, square=True
                )

# Prevent x axes label from being cropped
plt.tight_layout()

plt.show()

     It could be hard to discern small differences in values when color is the only discriminative factor. Adding text annotations to each color block may help readers understand the magnitude of the difference:

fig = plt.figure( figsize=(8,8) )

# Set annot=True to overlay the values.
# We can also assign python format string to fmt.
# For example ".2%" refers to percentage values with two decimal points.
ax = sns.heatmap( stock_change, cmap = rdgn,
                  annot = True, fmt=".2%",
                  linewidths=.5, # Width of the lines that will divide each cell.
                  cbar = False,  # Whether to draw a colorbar.
                )
plt.show()
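
As a quick check of what fmt=".2%" does to each annotated value, the same format specification can be tried with Python's built-in format(). The sample numbers below are made up for illustration and are not taken from stock_change.

```python
# Illustrative daily changes (made-up numbers, not taken from stock_change)
sample_changes = [0.0123, -0.005, 0.0]

# ".2%" multiplies by 100, keeps two decimals, and appends a percent sign
labels = [format(value, ".2%") for value in sample_changes]
print(labels)  # ['1.23%', '-0.50%', '0.00%']
```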

Candlestick plot in matplotlib.finance

     As you have seen in the first part of this chapter, our dataset contains the opening and closing prices as well as the highest and lowest price per trading day. None of the plots we have described thus far are able to describe the trend of all these variables in a single plot.

     In the financial world, the candlestick plot is almost the default choice for describing price movements of stocks, currencies, and commodities over a time period. Each candlestick consists of the body, describing the opening and closing prices, and extended wicks[wɪk]灯芯,蜡烛心 illustrating the highest and lowest prices of a particular trading day. If the closing price is higher than the opening price, the candlestick is often colored black. Conversely, the candlestick is colored red if the closing price is lower. The trader can then infer the opening and closing prices based on the combination of color and the boundary of the candlestick body.
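
To make the anatomy of a single candlestick concrete, here is a minimal sketch that splits one open/high/low/close record into body, wicks, and color. The helper describe_candle and the sample prices are hypothetical, and treating an unchanged close as an "up" day is an assumption of this sketch.

```python
def describe_candle(open_price, high, low, close):
    """Split one OHLC record into the parts drawn on a candlestick."""
    body_bottom = min(open_price, close)
    body_top = max(open_price, close)
    return {
        # Black for a rise, red for a fall, as in the convention described above
        # (assumption: close == open is treated as a rise)
        "color": "black" if close >= open_price else "red",
        "body": (body_bottom, body_top),      # opening and closing prices
        "upper_wick": (body_top, high),       # from the body up to the day's high
        "lower_wick": (low, body_bottom),     # from the day's low up to the body
    }

# Made-up prices for one trading day
candle = describe_candle(open_price=153.4, high=155.8, low=152.9, close=155.4)
print(candle["color"], candle["body"])
```

Because the color tells the reader which end of the body is the open and which is the close, the four prices can be read back from the drawing alone.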

     In the following example, we are going to prepare a candlestick chart of Apple Inc. for the last 50 trading days in our DataFrame. We will also apply a tick formatter to label the ticks as dates:

     Starting from Matplotlib 2.0, matplotlib.finance is deprecated, and the candlestick functions now live in the separate mpl_finance package (https://github.com/matplotlib/mpl-finance), which is what the code below imports. Note that mpl_finance has itself been superseded by mplfinance, which still ships the same functions under mplfinance.original_flavor.

pip install mpl_finance

import matplotlib.pyplot as plt
from matplotlib.dates import date2num, WeekdayLocator, DayLocator, DateFormatter, MONDAY
import mpl_finance

# Extract stock data for AAPL.
# candlestick_ohlc expects the columns Date (as a floating point number),
# Open, High, Low, Close, in that order.
# So we need to select the useful columns first using DataFrame.loc[].
# Extra columns can exist, but they are ignored.
# Next we keep only the last 50 trading days to keep the plot simple.

candlestick_data = stock_df[ stock_df['Company']=="AAPL"].loc[:, ["Datetime", 
                                                                  "Open",
                                                                  "High",
                                                                  "Low", 
                                                                  "Close", 
                                                                  "Volume"
                                                                 ]
                                                             ].iloc[-50:]
# Create a new Matplotlib figure
fig, ax = plt.subplots(figsize=(10,6) )

# Prepare a candlestick plot
mpl_finance.candlestick_ohlc( ax, candlestick_data.values, width=0.6 )
ax.xaxis.set_major_locator( WeekdayLocator(byweekday=MONDAY,interval=1 ) ) # major ticks on the mondays
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d')) ######
# import matplotlib.dates as mdates
# ax.xaxis.set_major_formatter( mdates.DateFormatter('%y-%m-%d') )
ax.xaxis.set_minor_locator( DayLocator() ) # minor ticks on the days
ax.xaxis_date() # treat the x data as dates
# rotate all ticks
plt.setp( ax.get_xticklabels(), rotation=45, horizontalalignment='right' )

ax.set_ylabel( 'Price (US $)' ) # Set y-axis label
plt.show()

Visualizing various stock market indicators 

     The candlestick plot in the current form is a bit bland. Traders usually overlay stock indicators such as Average True Range (ATR), Bollinger band布林带, Commodity Channel Index (CCI)商品通道指数, Exponential Moving Average (EMA)指数移动平均线, Moving Average Convergence Divergence (MACD)移动平均线收敛散度, Relative Strength Index (RSI)相对强弱指数, and various other stats for technical analysis.

     Stockstats (https://github.com/jealous/stockstats) is a great package for calculating these indicators/stats and many more. It wraps around pandas DataFrames and generates the stats on the fly when they are accessed. To use stockstats, we simply install it via PyPI: 

pip install stockstats

     Next, we can convert a pandas DataFrame to a stockstats DataFrame via stockstats.StockDataFrame.retype(). A plethora[ˈpleθərə]过多 of stock indicators can then be accessed by following the pattern StockDataFrame["variable_timeWindow_indicator"]. For example, StockDataFrame['open_2_sma'] would give us the 2-day simple moving average on the opening price. Shortcuts may be available for some indicators, so please consult the official documentation for more information: 

from stockstats import StockDataFrame
# Convert to StockDataFrame
# Need to pass a copy of candlestick_data to StockDataFrame.retype
# Otherwise the original candlestick_data will be modified
stockstats = StockDataFrame.retype( candlestick_data.copy() )

# 5-day exponential moving average on closing price
ema_5 = stockstats[ "close_5_ema" ]
# 20-day exponential moving average on closing price
ema_20 = stockstats[ "close_20_ema" ]
# 50-day exponential moving average on closing price
ema_50 = stockstats[ "close_50_ema" ]

# Upper Bollinger band
boll_ub = stockstats["boll_ub"]
# Lower Bollinger band
boll_lb = stockstats["boll_lb"]

# 7-day Relative Strength Index
rsi_7 = stockstats["rsi_7"]
# 14-day Relative Strength Index
rsi_14 = stockstats["rsi_14"]
stockstats.head()

 

stockstats.tail()
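
To make the variable_timeWindow_indicator naming pattern concrete, here is a rough pandas-only sketch of what a name such as open_2_sma stands for. It does not call stockstats: parse_indicator_name and simple_indicator are hypothetical helpers, only the sma and ema cases are handled, and the rolling/ewm settings are assumptions modeled on the pandas calls used later in this section.

```python
import pandas as pd

def parse_indicator_name(name):
    """Split 'open_2_sma' into ('open', 2, 'sma')."""
    column, window, indicator = name.rsplit("_", 2)
    return column, int(window), indicator

def simple_indicator(df, name):
    """Compute an SMA or EMA described by a column_window_indicator name."""
    column, window, indicator = parse_indicator_name(name)
    if indicator == "sma":
        # Assumption: short windows at the start are allowed (min_periods=1)
        return df[column].rolling(window, min_periods=1).mean()
    if indicator == "ema":
        return df[column].ewm(span=window, adjust=True).mean()
    raise ValueError("unsupported indicator: " + indicator)

# Made-up opening prices
prices = pd.DataFrame({"open": [10.0, 12.0, 11.0, 13.0]})
print(simple_indicator(prices, "open_2_sma").tolist())  # [10.0, 11.0, 11.5, 12.0]
```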

With the stock indicators ready, we can overlay them on the same candlestick chart:

import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import date2num, WeekdayLocator, DayLocator, DateFormatter, MONDAY
from mpl_finance import candlestick_ohlc

# Create a new Matplotlib figure
fig, ax = plt.subplots( figsize=(10,6) )

# candlestick_data = stock_df[ stock_df['Company']=="AAPL"].loc[:, ["Datetime", 
#                                                                   "Open",
#                                                                   "High",
#                                                                   "Low", 
#                                                                   "Close", 
#                                                                   "Volume"
#                                                                  ]
#                                                              ].iloc[-50:]

# Prepare a candlestick plot
candlestick_ohlc( ax, candlestick_data.values, width=0.6 )

# Plot stock indicators in the same plot
ax.plot( candlestick_data["Datetime"], ema_5, lw=1, label="EMA (5)" )
ax.plot( candlestick_data["Datetime"], ema_20, lw=1, label="EMA (20)" )
ax.plot( candlestick_data["Datetime"], ema_50, lw=1, label="EMA (50)" )

ax.plot( candlestick_data["Datetime"], boll_ub, lw=2, linestyle="--", label="Bollinger upper" )
ax.plot( candlestick_data["Datetime"], boll_lb, lw=2, linestyle="--", label="Bollinger lower" )

ax.xaxis.set_minor_locator( DayLocator() ) # minor ticks on the days
ax.xaxis.set_major_locator( WeekdayLocator(MONDAY) ) # major ticks on the mondays
ax.xaxis.set_major_formatter( DateFormatter('%Y-%m-%d') )
ax.xaxis_date() # treat the x data as dates

# rotate all ticks to 45
plt.setp( ax.get_xticklabels(), rotation=45, horizontalalignment='right' )

ax.set_ylabel( "Price (US $)" ) # Set y-axis label

# Limit the x-axis range from 2017-4-23 to 2017-7-1
datemin = datetime.date( 2017, 4, 23 )
datemax = datetime.date( 2017, 7, 1 )
ax.set_xlim( datemin, datemax )

plt.legend(loc='upper left')
plt.tight_layout()# Prevent x axes label from being cropped
plt.show()

VVVVVVVVVVVVVVVVVVVVVVVV 

# aapl_df['Close'] = round(aapl_df['Close'], 2)
aapl_df.head()

Convert the index to a column of the pandas DataFrame:

aapl_df.reset_index(inplace=True)
aapl_df.head()

from matplotlib.dates import date2num

# Convert Date column from string to Python datetime object,
# then to float number that is supported by Matplotlib.
aapl_df['Datetime'] = date2num( pd.to_datetime( aapl_df["Date"], # pd.Series: Name: Date, Length: 750, dtype: object
                                                 format="%Y-%m-%d"
                                               ).tolist()
                                # [Timestamp('2017-01-03 00:00:00'),...,Timestamp('2017-06-30 00:00:00')]
                               )# Convert datetime objects to Matplotlib dates.
aapl_df.head()

aapl_df.iloc[-50:][:5]

aapl_df.iloc[-50:][-5:]

  

aapl_candlestick_data = aapl_df.loc[:, ["Datetime", 
                                        "Open",
                                        "High",
                                        "Low", 
                                        "Close", 
                                        "Volume"
                                       ]
                                   ].iloc[-50:]
aapl_candlestick_data.head()

# Convert to StockDataFrame
# Need to pass a copy of aapl_candlestick_data to StockDataFrame.retype
# Otherwise the original aapl_candlestick_data will be modified
aapl_stockstats = StockDataFrame.retype( aapl_candlestick_data.copy() )

# 5-day exponential moving average on closing price
aapl_ema_5 = aapl_stockstats[ "close_5_ema" ]
# 20-day exponential moving average on closing price
aapl_ema_20 = aapl_stockstats[ "close_20_ema" ]
# 50-day exponential moving average on closing price
aapl_ema_50 = aapl_stockstats[ "close_50_ema" ]

# Upper Bollinger band
aapl_boll_ub = aapl_stockstats["boll_ub"]
# Lower Bollinger band
aapl_boll_lb = aapl_stockstats["boll_lb"]

# 7-day Relative Strength Index
aapl_rsi_7 = aapl_stockstats["rsi_7"]
# 14-day Relative Strength Index
aapl_rsi_14 = aapl_stockstats["rsi_14"]

aapl_stockstats.head()

import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import date2num, WeekdayLocator, DayLocator, DateFormatter, MONDAY
from mpl_finance import candlestick_ohlc

# Create a new Matplotlib figure
fig, ax = plt.subplots( figsize=(7,8) )

# Prepare a candlestick plot
candlestick_ohlc( ax, aapl_candlestick_data.values, width=0.6 )

# Plot stock indicators in the same plot
ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_5, lw=5, c='blue', label="EMA (5)" )
ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_20, lw=1, label="EMA (20)" )
ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_50, lw=1, label="EMA (50)" )

ax.plot( aapl_candlestick_data["Datetime"], aapl_boll_ub, lw=2, linestyle="--", label="Bollinger upper" )
ax.plot( aapl_candlestick_data["Datetime"], aapl_boll_lb, lw=2, linestyle="--", label="Bollinger lower" )

ax.xaxis.set_minor_locator( DayLocator() ) # minor ticks on the days
ax.xaxis.set_major_locator( WeekdayLocator(MONDAY) ) # major ticks on the mondays
ax.xaxis.set_major_formatter( DateFormatter('%Y-%m-%d') )
ax.xaxis_date() # treat the x data as dates

# rotate all ticks to 45
plt.setp( ax.get_xticklabels(), rotation=45, horizontalalignment='right' )

ax.set_ylabel( "Price (US $)" ) # Set y-axis label

# Limit the x-axis range from 2017-4-23 to 2017-7-1
datemin = datetime.date( 2017, 4, 23 )
datemax = datetime.date( 2017, 7, 1 )
ax.set_xlim( datemin, datemax )

plt.legend(loc='upper left')
plt.tight_layout()# Prevent x axes label from being cropped
plt.show()

 

VS yahoo finance.com 

     Why is there a difference? Yahoo computes the EMA from the full price history, whereas the data used here only starts in April (the last 50 trading days). The first EMA values therefore differ, but the later the date, the closer the values get to Yahoo's, so the later parts of the curves are essentially the same.
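
The convergence described above can be checked on synthetic data. The sketch below compares an EMA computed over a long history with one restarted on the last 50 points only; the seeded random walk stands in for real prices and is purely an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (a seeded random walk, not real AAPL data)
rng = np.random.default_rng(0)
close = pd.Series(150 + rng.normal(0, 1, 300).cumsum())

# EMA from the full history vs. EMA restarted on the last 50 points only
ema_full = close.ewm(span=5, adjust=True).mean().iloc[-50:]
ema_short = close.iloc[-50:].ewm(span=5, adjust=True).mean()

# Absolute gap between the two curves on the shared 50 days
gap = (ema_full - ema_short).abs()
print("gap on the first day:", gap.iloc[0])
print("gap on the last day: ", gap.iloc[-1])
```

Longer spans forget old data more slowly, so an EMA (50) computed on only 50 days can still differ visibly from a full-history value at the end of the window.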

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

VVVVVVVVVVVVVVVVVVVVVVVV  

stockstats2 = candlestick_data.copy()
stockstats2.head()

Exponential Weighted Moving Average(EWMA)

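The EMA for a series $Y$ may be calculated recursively:

$$S_t = \begin{cases} Y_1, & t = 1 \\ \alpha Y_t + (1-\alpha) S_{t-1}, & t > 1 \end{cases}$$

or, equivalently for $t > 1$,

$$S_t = S_{t-1} + \alpha \left( Y_t - S_{t-1} \right)$$

 Where:

  • The coefficient $\alpha$ represents the degree of weighting decrease, a constant smoothing factor between 0 and 1. A higher $\alpha$ discounts older observations faster.
  • $Y_t$ is the value at a time period $t$.
  • $S_t$ is the value of the EMA at any time period $t$.

 Expanding out $S_{t-1}$ each time results in the following power series (for a sufficiently long history), showing how the weighting factor on each datum $p_1$, $p_2$, etc., decreases exponentially:

$$S_t = \alpha \left[ p_1 + (1-\alpha) p_2 + (1-\alpha)^2 p_3 + (1-\alpha)^3 p_4 + \cdots \right]$$

 where $p_1 = Y_t$ is the most recent value, $p_2 = Y_{t-1}$ the one before it, and so on.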
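
As a small sanity check of the recursion, the sketch below implements it by hand and compares it with pandas. The helper ema_recursive and the sample prices are hypothetical; the span-to-alpha conversion alpha = 2 / (span + 1) is the one pandas uses for its span argument.

```python
import pandas as pd

def ema_recursive(values, alpha):
    """First value starts the series; afterwards S_t = alpha*Y_t + (1-alpha)*S_{t-1}."""
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(value)  # initialise with the first observation
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

prices = [10.0, 11.0, 12.0, 11.0, 13.0]  # made-up prices
span = 5
alpha = 2 / (span + 1)

by_hand = ema_recursive(prices, alpha)
recursive = pd.Series(prices).ewm(span=span, adjust=False).mean()
adjusted = pd.Series(prices).ewm(span=span, adjust=True).mean()

print(by_hand)             # matches the adjust=False result
print(recursive.tolist())
print(adjusted.tolist())   # differs at the start of the series
```

With adjust=False pandas follows the recursion directly, while adjust=True (the setting in the stockstats source quoted below) divides the weighted sum by the total weight seen so far, so the two variants disagree on the first values and converge later.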

 https://github.com/jealous/stockstats/blob/master/stockstats.py

    def _get_ema(cls, df, column, windows):
        """ get exponential moving average
        :param df: data
        :param column: column to calculate
        :param windows: collection of window of exponential moving average
        :return: None
        """
        window = cls.get_only_one_positive_int(windows)
        column_name = '{}_{}_ema'.format(column, window)
        if len(df[column]) > 0:
            df[column_name] = df[column].ewm(
                ignore_na=False, span=window,
                min_periods=0, adjust=True).mean()
        else:
            df[column_name] = []

 https://en.wikipedia.org/wiki/Moving_average

stockstats2 = candlestick_data.copy()
stockstats2.head()

close_ema_5 = 5
close_ema_20 = 20
close_ema_50 = 50
# Series.ewm() provides the exponentially weighted (EW) window functions
stockstats2['close_5_ema'] = stockstats2['Close'].ewm( ignore_na=False, 
                                                       span=close_ema_5, ###
                                                       min_periods=0,
                                                       adjust=True
                                                     ).mean()
stockstats2['close_20_ema'] = stockstats2['Close'].ewm( ignore_na=False, 
                                                       span=close_ema_20,###
                                                       min_periods=0,
                                                       adjust=True
                                                     ).mean()
stockstats2['close_50_ema'] = stockstats2['Close'].ewm( ignore_na=False, 
                                                       span=close_ema_50,###
                                                       min_periods=0,
                                                       adjust=True
                                                     ).mean()
stockstats2.head()

Bollinger Bands

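The share price fluctuates within the range between the upper and lower limits. The width of this band changes with the size of the price swings: when the swings grow, the band widens; when the swings are small and the price consolidates, the band narrows.

Bollinger Bands use the band to indicate safe high and low price levels.

When volatility falls and the band narrows, a sharp price move may occur at any time.

When a high or low pierces the edge of the band and immediately returns inside it, a pullback tends to follow.

Once the price starts moving away from one band, it tends to travel toward the other band, which is quite helpful for finding price targets.

The rule of application is as follows: suppose a stock's price has fluctuated very little for some time, which shows up as a band that has stayed narrow for a long period, and then on some trading day the closing price breaks through the resistance line of the Bollinger Bands on heavy volume while the bands clearly turn from contracting to expanding. The investor should then buy decisively (this is clearly visible on that day's candlestick chart). The reasoning is that the stock is turning from weak to strong and the short-term upward momentum will not last for just one day, so new highs are expected in the short run and one can enter with confidence.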

     Two input parameters chosen independently by the user govern how a given chart summarizes the known historical price data, allowing the user to vary the response of the chart to the magnitude and frequency of price changes, similar to parametric equations in signal processing or control systems. Bollinger Bands consist of

  • an N-period moving average (MA),
  • an upper band at K times an N-period standard deviation above the moving average (MA + Kσ), and
  • a lower band at K times an N-period standard deviation below the moving average (MA − Kσ).

     The chart thus expresses arbitrary choices or assumptions of the user, and is not strictly about the price data alone. 

     Typical values for N and K are 20 days (BOLL_PERIOD = 20) and 2 (BOLL_STD_TIMES = 2), respectively. The default choice for the average is a simple moving average, but other types of averages can be employed as needed. Exponential moving averages are a common second choice. Usually the same period is used for both the middle band and the calculation of standard deviation.

https://github.com/jealous/stockstats/blob/master/stockstats.py

    def _get_boll(cls, df):
        """ Get Bollinger bands.
        boll_ub means the upper band of the Bollinger bands
        boll_lb means the lower band of the Bollinger bands
        boll_ub = MA + Kσ
        boll_lb = MA − Kσ
        M = BOLL_PERIOD
        K = BOLL_STD_TIMES
        :param df: data
        :return: None
        """
        moving_avg = df['close_{}_sma'.format(cls.BOLL_PERIOD)]
        moving_std = df['close_{}_mstd'.format(cls.BOLL_PERIOD)]
        df['boll'] = moving_avg
        moving_avg = list(map(np.float64, moving_avg))
        moving_std = list(map(np.float64, moving_std))
        # noinspection PyTypeChecker
        df['boll_ub'] = np.add(moving_avg,
                               np.multiply(cls.BOLL_STD_TIMES, moving_std))
        # noinspection PyTypeChecker
        df['boll_lb'] = np.subtract(moving_avg,
                                    np.multiply(cls.BOLL_STD_TIMES,
                                                moving_std))

N_boll_period = 20
K = 2
# Reproduce the 20-day SMA and moving standard deviation with plain pandas
# https://blog.csdn.net/Linli522362242/article/details/102314389
# min_periods : int, default None
#     Minimum number of observations in window required to have a value (otherwise result is NA).
#     For a window that is specified by an offset, min_periods will default to 1.
#     Otherwise, min_periods will default to the size of the window.
s_moving_avg_20 = stockstats2['Close'].rolling( N_boll_period, min_periods=1 ).mean()
s_moving_std_20 = stockstats2['Close'].rolling( N_boll_period, min_periods=1 ).std()

stockstats2['boll'] = s_moving_avg_20

# boll_ub = MA + Kσ
stockstats2['boll_ub'] = np.add( s_moving_avg_20,
                                 np.multiply( K, s_moving_std_20 )
                               )
# boll_lb = MA − Kσ
stockstats2['boll_lb'] = np.subtract( s_moving_avg_20,
                                      np.multiply( K, s_moving_std_20 )
                                    )
stockstats2.head()

import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import date2num, WeekdayLocator, DayLocator, DateFormatter, MONDAY
from mpl_finance import candlestick_ohlc

# Create a new Matplotlib figure
fig, ax = plt.subplots( figsize=(7,8) )

# Prepare a candlestick plot
candlestick_ohlc( ax, aapl_candlestick_data.values, width=0.6 )

# Plot stock indicators in the same plot
# ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_5, lw=5, c='blue', label="EMA (5)" )
# ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_20, lw=1, label="EMA (20)" )
# ax.plot( aapl_candlestick_data["Datetime"], aapl_ema_50, lw=1, label="EMA (50)" )

ax.plot( aapl_candlestick_data["Datetime"], aapl_boll_ub, lw=2, linestyle="--", label="Bollinger upper" )
ax.plot( aapl_candlestick_data["Datetime"], aapl_stockstats[ "boll" ], lw=2, linestyle="-", label="Bollinger mean(middle)" )
ax.plot( aapl_candlestick_data["Datetime"], aapl_boll_lb, lw=2, linestyle="--", label="Bollinger lower" )

ax.xaxis.set_minor_locator( DayLocator() ) # minor ticks on the days
ax.xaxis.set_major_locator( WeekdayLocator(MONDAY) ) # major ticks on the mondays
ax.xaxis.set_major_formatter( DateFormatter('%Y-%m-%d') )
ax.xaxis_date() # treat the x data as dates

# rotate all ticks to 45
plt.setp( ax.get_xticklabels(), rotation=45, horizontalalignment='right' )

ax.set_ylabel( "Price (US $)" ) # Set y-axis label

# Limit the x-axis range from 2017-4-23 to 2017-7-1
datemin = datetime.date( 2017, 4, 23 )
datemax = datetime.date( 2017, 7, 1 )
ax.set_xlim( datemin, datemax )

plt.legend(loc='upper left')
plt.tight_layout()# Prevent x axes label from being cropped
plt.show()

 VS yahoo finance 

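Tips for applying the BOLL indicator

1) When the price moves in the region between the middle and upper Bollinger bands, as long as it does not break below the middle band, the market is bullish: only consider buying on dips and do not consider shorting.

2) When the price is between the middle and lower bands, as long as it does not break the middle band (that is, rise above it; the middle band is the 20-day simple moving average, SMA_20), the market is bearish: the strategy is to sell on rallies and not consider buying.

3) When the price runs along the upper band, the market is in a one-sided uptrend. Hold on to long positions and stay patient as long as the price does not leave the upper-band region.

4) When the price runs along the lower band, the market is in a one-sided downtrend, usually a fast decline. Hold short positions patiently as long as the price does not leave the lower-band region.

5) When the price moves around the middle band, the market is consolidating and oscillating. For trend traders this is the kind of market in which it is easiest to lose money, so it should be avoided; staying out of the market and watching is best.

6) The squeeze state of the Bollinger channel: the price oscillates around the middle band while the upper and lower bands gradually narrow. This is a sign that a major move is coming; stay out of the market, watch, and wait for the opportunity.

7) The sudden expansion after a squeeze: this signals that an explosive move is arriving. Afterwards the market is likely to trend in one direction, so one can actively build positions and follow the trend.

8) After the channel squeezes and before a major move arrives, false breakouts often occur. These are traps set by large players; stay alert, and handle them by adjusting position size.

9) The Bollinger channel should primarily be read on the weekly time frame. In a one-sided market, when open positions already carry large profits, the daily Bollinger channel rules can be used to exit and guard against a large pullback.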
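
The squeeze in tips 6 and 7 can be made measurable with the relative band width, (upper − lower) / middle. The sketch below is only one possible reading: bollinger_bandwidth and is_squeeze are hypothetical helpers, the 60-day lookback is an arbitrary assumption, and the sine-wave prices are synthetic.

```python
import numpy as np
import pandas as pd

def bollinger_bandwidth(close, period=20, k=2):
    """Relative band width: (upper - lower) / middle."""
    middle = close.rolling(period, min_periods=1).mean()
    std = close.rolling(period, min_periods=1).std()
    upper = middle + k * std
    lower = middle - k * std
    return (upper - lower) / middle

def is_squeeze(bandwidth, lookback=60):
    """Flag days whose band width is the narrowest of the last `lookback` days."""
    rolling_min = bandwidth.rolling(lookback, min_periods=lookback).min()
    return bandwidth <= rolling_min

# Synthetic prices: 80 volatile days followed by 40 quiet days
days = np.arange(120)
amplitude = np.where(days < 80, 5.0, 0.5)
close = pd.Series(100 + amplitude * np.sin(days))

bandwidth = bollinger_bandwidth(close)
squeeze = is_squeeze(bandwidth)
print(squeeze[squeeze].index.tolist())  # day numbers flagged as a squeeze
```

A flag here only says that the bands are unusually tight; it does not say in which direction the price will break out.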

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Building a comprehensive stock chart

     In the following elaborate example, we are going to apply the many techniques that we have covered thus far to create a more comprehensive stock chart. In addition to the preceding plot, we will add a line chart to display the Relative Strength Index (RSI) and a bar chart to show trade volume. A special market event (https://markets.businessinsider.com/news/stocks/apple-stock-price-falling-new-iphone-speed-2017-6) is going to be annotated on the chart as well: 

     If you look closely at the charts, you might notice some missing dates. These days are usually non-trading days or public holidays that were not present in our DataFrame.
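
One way to list such gaps is to compare the dates in the data with a plain weekday calendar. The sketch below uses a hypothetical two-week calendar instead of stock_df: it contains every weekday from 2017-05-22 to 2017-06-02 except Memorial Day (2017-05-29).

```python
import pandas as pd

# Hypothetical trading calendar: all weekdays in the range except Memorial Day
weekdays = pd.bdate_range("2017-05-22", "2017-06-02")
trading_days = weekdays[weekdays != pd.Timestamp("2017-05-29")]

# Weekdays that are absent from the data are the non-trading days
missing = weekdays.difference(trading_days)
print(missing.strftime("%Y-%m-%d").tolist())  # ['2017-05-29']
```

Weekends never appear in either index, so only holidays and other market closures show up in the result.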

stock_df[ (stock_df['Company']=="AAPL") & (stock_df['Date']=="2017-06-09") ]

import datetime
import matplotlib.pyplot as plt
from matplotlib.dates import date2num, WeekdayLocator, DayLocator, DateFormatter, MONDAY
from mpl_finance import candlestick_ohlc
from matplotlib.ticker import FuncFormatter

# FuncFormatter to convert tick values to Millions
def millions( x, pos ):
    return '%dM' % (x/1e6)

# Create 3 subplots spread across three rows, with a shared x-axis.
# The height ratio is specified via gridspec_kw
fig, axarr = plt.subplots( nrows=3, ncols=1, sharex=True, figsize=(8,8),
                           gridspec_kw = {'height_ratios':[3,1,1]}
                         )

# Prepare a candlestick plot in the first axes
candlestick_ohlc( axarr[0], candlestick_data.values, width=0.6 )

############### Overlay stock indicators in the first axes ###############
axarr[0].plot( candlestick_data['Datetime'], ema_5, lw=1, label="EMA (5)" )
axarr[0].plot( candlestick_data['Datetime'], ema_20, lw=1, label="EMA (20)" )
axarr[0].plot( candlestick_data['Datetime'], ema_50, lw=1, label="EMA (50)" )

axarr[0].plot( candlestick_data['Datetime'], boll_ub, lw=2, linestyle='--', label='Bollinger upper' )
axarr[0].plot( candlestick_data['Datetime'], boll_lb, lw=2, linestyle='--', label='Bollinger lower' )
axarr[0].fill_between( candlestick_data['Datetime'], boll_ub, boll_lb, alpha=0.1 )

##################### Display RSI in the second axes #####################
axarr[1].axhline( y=30, lw=2, color='0.7') # Line for oversold threshold
axarr[1].axhline( y=50, lw=2, linestyle='--', color='0.8' ) # Neutral RSI
axarr[1].axhline( y=70, lw=2, color='0.7') # Line for overbought threshold

axarr[1].plot( candlestick_data['Datetime'], rsi_7, lw=2, label='RSI (7)' )
axarr[1].plot( candlestick_data['Datetime'], rsi_14, lw=2, label='RSI (14)' )

################## Display trade volume in the third axes ##################
axarr[2].bar( candlestick_data['Datetime'], candlestick_data['Volume'] )

# Mark the market reaction to the Bloomberg news
# https://www.bloomberg.com/news/articles/2017-06-09/apple-s-new-iphones-said-to-miss-out-on-higher-speed-data-links
# http://markets.businessinsider.com/news/stocks/apple-stock-price-falling-new-iphone-speed-2017-6-1002082799
axarr[0].annotate( "Bloomberg News",            # 'High' = 155 at 2017-06-09
                   xy=( datetime.date(2017,6,9), 155 ), # The xy parameter specifies the arrow's destination
                   xytext=( 25,10 ), # The position (x, y) to place the text at
                   textcoords='offset points', # Offset (in points) from the xy value
                   fontsize=12,
                   arrowprops=dict( arrowstyle='simple', facecolor ='green', edgecolor='None' )
                 )

# Label the axes
axarr[0].set_ylabel( "Price (US $)" )
axarr[1].set_ylabel( 'RSI' )
axarr[2].set_ylabel( 'Volume (US $)' )

axarr[2].xaxis.set_minor_locator( DayLocator() ) # minor ticks on the days
axarr[2].xaxis.set_major_locator( WeekdayLocator(MONDAY) ) # major ticks on the mondays
axarr[2].xaxis.set_major_formatter( DateFormatter('%Y-%m-%d') ) ####
axarr[2].xaxis_date() # treat the x data as date ######
axarr[2].yaxis.set_major_formatter( FuncFormatter(millions) ) # Change the y-axis ticks to millions

plt.setp( axarr[2].get_xticklabels(), rotation=45, horizontalalignment='right' ) # Rotate x-tick labels by 45 degrees

# Limit the x-axis range from 2017-4-23 to 2017-7-1
datemin = datetime.date( 2017, 4, 23 )
datemax = datetime.date( 2017, 7, 1 )
axarr[2].set_xlim( datemin, datemax )

# Show figure legend
axarr[0].legend()
axarr[1].legend()

# Show figure title
axarr[0].set_title( "AAPL (Apple Inc.) NASDAQ", loc='left' )

# Reduce unnecessary white space
plt.tight_layout()
plt.show()

     Readings below 30 generally indicate that the stock is oversold, while readings above 70 indicate that it is overbought. Traders will often place this RSI chart below the price chart for the security, so they can compare its recent momentum against its market price.

     Some traders will consider it a “buy signal” if a security’s RSI reading moves below 30, based on the idea that the security has been oversold and is therefore poised for a rebound. However, the reliability of this signal will depend in part on the overall context. If the security is caught in a significant downtrend, then it might continue trading at an oversold level for quite some time. Traders in that situation might delay buying until they see other confirmatory signals.

IF PREVIOUS RSI > 30 AND CURRENT RSI < 30 ==> BUY SIGNAL
IF PREVIOUS RSI < 70 AND CURRENT RSI > 70 ==> SELL SIGNAL
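
The two crossing rules above translate directly into pandas. This is a minimal sketch: rsi_signals is a hypothetical helper, the RSI values are made up, and the 30/70 thresholds are simply the ones quoted in the rules.

```python
import pandas as pd

def rsi_signals(rsi, oversold=30, overbought=70):
    """Label the days on which RSI crosses below 30 or above 70."""
    previous = rsi.shift(1)  # yesterday's RSI; NaN on the first day
    signals = pd.Series("", index=rsi.index)
    signals.loc[(previous > oversold) & (rsi < oversold)] = "buy"
    signals.loc[(previous < overbought) & (rsi > overbought)] = "sell"
    return signals

# Made-up RSI readings
rsi = pd.Series([45.0, 28.0, 25.0, 40.0, 72.0, 75.0, 60.0])
print(rsi_signals(rsi).tolist())
# ['', 'buy', '', '', 'sell', '', '']
```

Only the day of the crossing is labeled, so a stock that stays below 30 for a week produces a single signal rather than one per day.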

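RSI ranges between 0 and 100.

Long-only stock markets such as China's domestic market: RSI values generally fall between 20 and 80.

80-100: very strong, sell

50-80: strong, buy

20-50: weak, wait and see

0-20: very weak, buy

Two-way markets such as domestic futures, international London gold, and foreign exchange: RSI values generally fall between 30 and 70.

70-100: overbought zone, go short

30-70: wait-and-see zone, enter with caution

0-30: oversold zone, go long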

 VVVVVVVVVVVVVVVVVVVVVVVVVVVVV

     The relative strength index (RSI) is a technical indicator used in the analysis of financial markets. It is intended to chart the current and historical strength or weakness of a stock or market based on the closing prices of a recent trading period. The indicator should not be confused with relative strength.

     The RSI is classified as a momentum oscillator, measuring the velocity and magnitude of price movements. Momentum is the rate of the rise or fall in price. The RSI computes momentum as the ratio of higher closes to lower closes: stocks which have had more or stronger positive changes have a higher RSI than stocks which have had more or stronger negative changes.

     The RSI is most typically used on a 14-day timeframe, measured on a scale from 0 to 100, with high and low levels marked at 70 and 30, respectively. Shorter or longer timeframes are used for alternately shorter or longer outlooks. More extreme high and low levels (80 and 20, or 90 and 10) occur less frequently but indicate stronger momentum.

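     For each trading period an upward change U or downward change D is calculated. Up periods are characterized by the close being higher than the previous close:

$$U = \text{close}_{\text{now}} - \text{close}_{\text{previous}}, \qquad D = 0$$

     Conversely, a down period is characterized by the close being lower than the previous period's close (note that D is nonetheless a positive number):

$$U = 0, \qquad D = \text{close}_{\text{previous}} - \text{close}_{\text{now}}$$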

# close_-1_d — this is the price difference between time t and t-1
stockstats2['close_-1_s'] = stockstats2['Close'].shift(1)
d = stockstats2['close_-1_d'] = stockstats2['Close']-stockstats2['close_-1_s']

stockstats2['closepm'] = ( d+d.abs() )/2  # if d>0: (d+d)/2 = d;   if d<0: (d+(-d))/2 = 0
stockstats2['closenm'] = ( -d+d.abs() )/2 # if d>0: (-d+d)/2 = 0;  if d<0: ((-d)+(-d))/2 = -d

     The averages of U and D are computed with an n-period smoothed moving average (SMMA), that is, an exponential moving average with $\alpha = 1/n$. The ratio of these averages is the relative strength or relative strength factor:

$$RS = \frac{\text{SMMA}(U, n)}{\text{SMMA}(D, n)}$$

     If the average of D values is zero, then according to the equation, the RS value will approach infinity, so that the resulting RSI, as computed below, will approach 100.

     The relative strength factor is then converted to a relative strength index between 0 and 100:

$$RSI = 100 - \frac{100}{1 + RS}$$

     The smoothed moving averages should be appropriately initialized with a simple moving average using the first n values in the price series.

https://github.com/jealous/stockstats/blob/master/stockstats.py
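
The edge case mentioned above (an average loss of zero) is easy to verify numerically. In this sketch rsi_from_averages is a hypothetical helper and the average gains and losses are made-up numbers rather than values computed from AAPL prices.

```python
import pandas as pd

def rsi_from_averages(avg_gain, avg_loss):
    """RSI = 100 - 100 / (1 + RS), with RS = avg_gain / avg_loss."""
    rs = avg_gain / avg_loss  # pandas yields inf when dividing a positive value by zero
    return 100 - 100 / (1.0 + rs)

avg_gain = pd.Series([1.0, 3.0, 2.0])
avg_loss = pd.Series([1.0, 1.0, 0.0])  # last entry: no losses in the window
print(rsi_from_averages(avg_gain, avg_loss).tolist())  # [50.0, 75.0, 100.0]
```

If both averages are zero (a completely flat price), the division gives NaN instead, so such rows need separate handling.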

    def _get_smma(cls, df, column, windows):
        """ get smoothed moving average.
        :param df: data
        :param windows: range
        :return: result series
        """
        window = cls.get_only_one_positive_int(windows)
        column_name = '{}_{}_smma'.format(column, window)
        smma = df[column].ewm(
            ignore_na=False, alpha=1.0 / window,
            min_periods=0, adjust=True).mean()
        df[column_name] = smma
        return smma
   
    def _get_rsi(cls, df, n_days):
        """ Calculate the RSI (Relative Strength Index) within N days
        calculated based on the formula at:
        https://en.wikipedia.org/wiki/Relative_strength_index
        :param df: data
        :param n_days: N days
        :return: None
        """
        n_days = int(n_days)
        d = df['close_-1_d']

        df['closepm'] = (d + d.abs()) / 2
        df['closenm'] = (-d + d.abs()) / 2
        closepm_smma_column = 'closepm_{}_smma'.format(n_days)
        closenm_smma_column = 'closenm_{}_smma'.format(n_days)
        p_ema = df[closepm_smma_column]
        n_ema = df[closenm_smma_column]

        rs_column_name = 'rs_{}'.format(n_days)
        rsi_column_name = 'rsi_{}'.format(n_days)
        df[rs_column_name] = rs = p_ema / n_ema
        df[rsi_column_name] = 100 - 100 / (1.0 + rs)

        columns_to_remove = ['closepm',
                             'closenm',
                             closepm_smma_column,
                             closenm_smma_column]
        cls._drop_columns(df, columns_to_remove)
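
# Reproduce the same calculation step by step for 7-day and 14-day windows
n_days_7=7
n_days_14=14
# close_-1_s: the previous period's close (Close shifted down by one row)
stockstats2['close_-1_s'] = stockstats2['Close'].shift(1)
# close_-1_d: the price difference between time t and t-1
d = stockstats2['close_-1_d'] = stockstats2['Close']-stockstats2['close_-1_s']

stockstats2['closepm'] = ( d+d.abs() )/2  # U: if d>0, (d+d)/2 = d; if d<0, (d+(-d))/2 = 0
stockstats2['closenm'] = ( -d+d.abs() )/2 # D: if d>0, (-d+d)/2 = 0; if d<0, ((-d)+(-d))/2 = -d

# com = n_days - 1 corresponds to alpha = 1/n_days, the smoothing factor used in _get_smma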
p_ema_7 = stockstats2['closepm'].ewm( com = n_days_7 - 1,
                                      adjust=True,
                                    ).mean()
n_ema_7 = stockstats2['closenm'].ewm( com = n_days_7 - 1,
                                      adjust=True,
                                    ).mean()
p_ema_14 = stockstats2['closepm'].ewm( com = n_days_14 - 1,
                                       adjust=True,
                                     ).mean()
n_ema_14 = stockstats2['closenm'].ewm( com = n_days_14 - 1,
                                       adjust=True,
                                     ).mean()

rs_column_name = 'rs_{}'.format(n_days_7)
rsi_column_name = 'rsi_{}'.format(n_days_7)
stockstats2[rs_column_name] = rs = p_ema_7 / n_ema_7
stockstats2[rsi_column_name] = 100 - 100 / (1.0 + rs)

rs_column_name = 'rs_{}'.format(n_days_14)
rsi_column_name = 'rsi_{}'.format(n_days_14)
stockstats2[rs_column_name] = rs = p_ema_14 / n_ema_14
stockstats2[rsi_column_name] = 100 - 100 / (1.0 + rs)


stockstats2=stockstats2.drop(['closepm','closenm'], axis=1)
stockstats2.head()

stockstats2.tail()
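
     The stockstats source passes alpha=1.0/window to ewm(), whereas the step-by-step code passes com=n_days-1. In pandas the two are related by alpha = 1 / (1 + com), so both calls describe the same smoothing. A minimal sketch on random data (the series x is an assumption, used only to compare the two parameterizations):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = pd.Series(rng.random(50))  # hypothetical data, not stock prices

n = 14
by_com = x.ewm(com=n - 1, adjust=True).mean()        # centre-of-mass form
by_alpha = x.ewm(alpha=1.0 / n, adjust=True).mean()  # smoothing-factor form

print(np.allclose(by_com, by_alpha))
```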

     The RSI was designed to indicate whether a security is overbought or oversold in relation to recent price levels.

^^^^^^^^^^^^^^^^^^^^^^^^^^

Three-dimensional (3D) plots

     By transitioning to three-dimensional space, you may enjoy greater creative freedom when creating visualizations. The extra dimension can also accommodate more information in a single plot. However, some may argue that 3D is nothing more than a visual gimmick [ˈɡɪmɪk] (a trick) when projected onto a 2D surface (such as paper), as it can obfuscate [ˈɑːbfʌskeɪt] (obscure, confuse) the interpretation of data points.

     In Matplotlib version 2, despite significant developments in the 3D API, annoying bugs or glitches [ɡlɪtʃɪz] (malfunctions) still exist. We will discuss some workarounds toward the end of this chapter. More powerful Python 3D visualization packages do exist (such as MayaVi2, Plotly, and VisPy), but Matplotlib's 3D plotting functions are a good choice if you want to use the same package for both 2D and 3D plots, or if you would like to maintain the aesthetics of its 2D plots.

     For the most part, 3D plots in Matplotlib have similar structures to 2D plots. As such, we will not go through every 3D plot type in this section. We will put our focus on 3D scatter plots and bar charts.

3D scatter plot

     In Chapter 6, Hello Plotting World! (https://blog.csdn.net/Linli522362242/article/details/121045744), we have already explored scatter plots in two dimensions. In this section, let's try to create a 3D scatter plot. Before doing that, we need some data points in three dimensions (x, y, z):

import pandas as pd

source = "https://raw.githubusercontent.com/PointCloudLibrary/data/master/tutorials/ism_train_cat.pcd"
cat_df = pd.read_csv( source, skiprows=11,
                      delimiter=" ",
                      names=["x", "y", "z"],
                      encoding="latin_1"
                    ) # https://docs.python.org/3/library/codecs.html#standard-encodings
cat_df.head()
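
     The skiprows=11 argument skips the header of the ASCII PCD file. As a sketch of how the header length could be detected instead of hard-coded, the following parses a tiny hand-written sample (the sample text, its three points, and the names pcd_text, header_len, and sample_df are assumptions modeled on a v0.7-style ASCII header, not the cat file itself):

```python
import io
import pandas as pd

# A tiny hand-written ASCII PCD sample (hypothetical data)
pcd_text = """# .PCD v.7 - Point Cloud Data file format
VERSION .7
FIELDS x y z
SIZE 4 4 4
TYPE F F F
COUNT 1 1 1
WIDTH 3
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 3
DATA ascii
0.1 0.2 0.3
1.0 2.0 3.0
-1.5 0.0 2.5
"""

lines = pcd_text.splitlines()
# The point records start right after the line that begins with "DATA"
header_len = next(i for i, line in enumerate(lines)
                  if line.startswith("DATA")) + 1

sample_df = pd.read_csv(io.StringIO(pcd_text), skiprows=header_len,
                        delimiter=" ", names=["x", "y", "z"])
print(header_len)
print(sample_df)
```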

 
     To declare a 3D plot, we first need to import the Axes3D object from the mplot3d extension in mpl_toolkits, which is responsible for rendering 3D plots in a 2D plane. After that, we need to specify projection='3d' when we create subplots:

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter( cat_df.x, cat_df.y, cat_df.z )

plt.show()

 
     Behold, the mighty sCATter plot in 3D. Cats are currently taking over the internet. According to the New York Times, cats are "the essential building block of the Internet" (https://www.nytimes.com/2014/07/23/upshot/what-the-internet-can-see-from-your-cat-pictures.html). Undoubtedly, they deserve a place in this chapter as well.

     Unlike the 2D version of scatter(), the 3D version requires X, Y, and Z coordinates. The parameters supported by 2D scatter(), however, can be applied to 3D scatter() as well:

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Change the size, shape and color of markers
ax.scatter( cat_df.x, cat_df.y, cat_df.z, 
            s=4, # The marker size
            c="g", # colors or color
            marker='o'
          )
plt.show()
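
     As a further illustration that the familiar 2D scatter() parameters carry over, the sketch below colors each point by its z value through c and cmap and adds a colorbar. The random points are an assumption standing in for real data, and the Agg backend is selected only so that the snippet also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x, y, z = rng.normal(size=(3, 200))  # hypothetical points

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
sc = ax.scatter(x, y, z,
                s=10,            # marker size
                c=z,             # one color value per point
                cmap="viridis",  # colormap used to map c to colors
                marker="^",
                alpha=0.7)
fig.colorbar(sc, ax=ax, label="z")
fig.canvas.draw()  # render; call plt.show() instead in an interactive session
```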

 

     To change the viewing angle and elevation [ˌelɪˈveɪʃn] (angle of elevation) of the 3D plot, we can use view_init(). The azim parameter specifies the azimuth [ˈæzɪməθ] (bearing angle) in the X-Y plane, while elev specifies the elevation angle; both are given in degrees. When the azimuth angle is 0, the X-Y plane appears to the north of the viewer, whereas an azimuth angle of 180 shows the south side of the X-Y plane:

fig = plt.figure(figsize=(10,6))

ax1 = fig.add_subplot(221, projection="3d")
ax1.scatter( cat_df.x, cat_df.y, cat_df.z,
             s=4, c="g", marker='o' )

# elev sets the elevation angle above the x-y plane;
# azim sets the azimuth angle within the x-y plane
ax1.view_init( azim=180, elev=10 )
ax1.set_title('azim=180, elev=10')


ax2 = fig.add_subplot(222, projection="3d")

ax2.scatter( cat_df.x, cat_df.y, cat_df.z,
             s=4, c="g", marker='o' )

# elev sets the elevation angle above the x-y plane;
# azim sets the azimuth angle within the x-y plane
ax2.view_init( azim=0, elev=0 )
ax2.set_title('azim=0, elev=0')

ax3 = fig.add_subplot(223, projection="3d")

ax3.scatter( cat_df.x, cat_df.y, cat_df.z,
             s=4, c="b", marker='x' )

# elev sets the elevation angle above the x-y plane;
# azim sets the azimuth angle within the x-y plane
ax3.view_init( azim=10, elev=10 )
ax3.set_title('azim=10, elev=10')


ax4 = fig.add_subplot(224, projection="3d")

ax4.scatter( cat_df.x, cat_df.y, cat_df.z,
             s=4, c="g", marker='o' )

# elev sets the elevation angle above the x-y plane;
# azim sets the azimuth angle within the x-y plane
ax4.view_init( azim=-170, elev=-10 )
ax4.set_title('azim=-170, elev=-10', fontsize=8)
plt.subplots_adjust(top=1.5)

plt.tight_layout()

plt.show()
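
     The four panels above repeat the same steps, so the view settings can also be driven from a list. The following sketch does that on random points (an assumption in place of the cat data), labels the axes to make the orientation easier to read, and reads the angles back from the azim and elev attributes of each 3D axes. The Agg backend is used only so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.random((3, 100))  # hypothetical points

views = [(0, 0), (90, 0), (180, 10), (-60, 30)]  # (azim, elev) in degrees

fig = plt.figure(figsize=(8, 8))
axes = []
for i, (azim, elev) in enumerate(views, start=1):
    ax = fig.add_subplot(2, 2, i, projection="3d")
    ax.scatter(x, y, z, s=4)
    ax.view_init(elev=elev, azim=azim)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_zlabel("z")
    # ax.azim and ax.elev hold the angles that view_init() just set
    ax.set_title("azim={}, elev={}".format(ax.azim, ax.elev))
    axes.append(ax)

fig.canvas.draw()  # render; call plt.show() instead in an interactive session
```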

3D bar chart

     We introduced candlestick plots for showing Open-High-Low-Close (OHLC) financial data. In addition, a 3D bar chart can be employed to show OHLC across time. The next figure shows a typical example of plotting a 5-day OHLC bar chart:

import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D

# Take every 5th row of the AAPL OHLC data, starting from the second row (index 1)
ohlc_5d = stock_df[ stock_df["Company"]=="AAPL" ].iloc[1::5, :]

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

# Create one color-coded bar chart for Open, High, Low and Close prices.
for color, col, z in zip( ['red', 'blue', 'green', 'yellow'],
                          ['Open', 'High', 'Low', 'Close'],
                          [30, 20, 10, 0]
                        ):
    xs = np.arange( ohlc_5d.shape[0] )
    ys = ohlc_5d[col]
    # Assign color to the bars
    colors = [color]*len(xs)
    ax.bar( xs, ys, zs=z, zdir='y', color=colors, alpha=0.8, width=5 )

plt.show()

 The method for setting ticks and labels is similar to other Matplotlib plotting functions:

fig = plt.figure( figsize=(10,10) )
ax = fig.add_subplot( 111, projection='3d' )

# Create one color-coded bar chart for Open, High, Low and Close prices.
for color, col, z in zip( ['red', 'blue', 'green', 'yellow'],
                          ['Open', 'High', 'Low', 'Close'],
                          [30, 20, 10, 0]
                        ):
    xs = np.arange( ohlc_5d.shape[0] )
    ys = ohlc_5d[col]
    
    # Assign color to the bars
    colors = [color]*len(xs)
    ax.bar( xs, ys, zs=z, zdir='y', color=colors, alpha=0.8 )

# Manually assign the ticks and tick labels
ax.set_xticks( np.arange(ohlc_5d.shape[0]) )
ax.set_xticklabels( ohlc_5d["Date"], rotation=20, 
                    verticalalignment= "baseline",
                    horizontalalignment="right",
                    fontsize="8",
                  )
# https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.tick_params.html
ax.tick_params(axis='x', which='major', pad=15, # Distance in points between tick and label.
               ) # move the tick labels

ax.set_yticks( [30, 20, 10, 0] )
ax.set_yticklabels( ["Open", "High", "Low", "Close"] )

# Set the z-axis label
ax.set_zlabel( "Price (US $)" )

# Rotate viewport
ax.view_init( azim=-42, elev=31 )
plt.tight_layout()
plt.show()

Caveats of Matplotlib 3D

     Due to the lack of a true 3D graphical rendering backend (such as OpenGL) and of a proper algorithm for detecting intersections between 3D objects, the 3D plotting capabilities of Matplotlib are not great, but they are adequate for typical applications. In the official Matplotlib FAQ (https://matplotlib.org/2.2.2/mpl_toolkits/mplot3d/faq.html), the author notes that 3D plots may not look right at certain angles. Besides, we also reported that mplot3d fails to clip bar charts if zlim is set (https://github.com/matplotlib/matplotlib/issues/8902; see also https://github.com/matplotlib/matplotlib/issues/209). Without improvements in the 3D rendering backend, these issues are hard to fix.

     To better illustrate the latter issue, let's try to add ax.set_zlim3d(bottom=110, top=150) right above plt.tight_layout() in the previous 3D bar chart:

fig = plt.figure( figsize=(10,10) )
ax = fig.add_subplot( 111, projection='3d' )

# Create one color-coded bar chart for Open, High, Low and Close prices.
for color, col, z in zip( ['red', 'blue', 'green', 'yellow'],
                          ['Open', 'High', 'Low', 'Close'],
                          [30, 20, 10, 0]
                        ):
    xs = np.arange( ohlc_5d.shape[0] )
    ys = ohlc_5d[col]
    
    # Assign color to the bars
    colors = [color]*len(xs)
    ax.bar( xs, ys, zs=z, zdir='y', color=colors, alpha=0.8 )

# Manually assign the ticks and tick labels
ax.set_xticks( np.arange(ohlc_5d.shape[0]) )
ax.set_xticklabels( ohlc_5d["Date"], rotation=20, 
                    verticalalignment= "baseline",
                    horizontalalignment="right",
                    fontsize="8",
                  )
# https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.tick_params.html
ax.tick_params(axis='x', which='major', pad=15, # Distance in points between tick and label.
               ) # move the tick labels

ax.set_yticks( [30, 20, 10, 0] )
ax.set_yticklabels( ["Open", "High", "Low", "Close"] )

# Set the z-axis label
ax.set_zlabel( "Price (US $)" )

# Rotate viewport
ax.view_init( azim=-42, elev=31 )
ax.set_zlim3d( bottom=110, top=150 )
plt.tight_layout()
plt.show()

     Clearly, something is going wrong, as the bars overshoot the lower boundary of the axes. We will try to address the latter issue through the following workaround:

from matplotlib.ticker import FuncFormatter

# Tick formatter that adds 110 back to the tick labels
def major_formatter( x, pos ):
    return "{}".format( x+110 )

fig = plt.figure( figsize=(10, 10) )
ax = fig.add_subplot(111, projection='3d')

# Create one color-coded bar chart for Open, High, Low and Close prices.
for color, col, z in zip( ['red', 'blue', 'green', 'yellow'],
                          ['Open', 'High', 'Low', 'Close'],
                          [30, 20, 10, 0]
                        ):
    xs = np.arange( ohlc_5d.shape[0] )
    ys = ohlc_5d[col]
    
    # Assign color to the bars
    colors = [color]*len(xs)
    # Subtract 110 from the y-values so the bars start at the intended lower limit
    ax.bar( xs, ys-110,                                      ###############
            zs=z, zdir='y', color=colors, alpha=0.8 )
    
# Manually assign the ticks and tick labels
ax.set_xticks( np.arange(ohlc_5d.shape[0]) )
ax.set_xticklabels( ohlc_5d["Date"], rotation=20, 
                    verticalalignment= "baseline",
                    horizontalalignment="right",
                    fontsize="8",
                  )
# https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.tick_params.html
ax.tick_params(axis='x', which='major', pad=15, # Distance in points between tick and label.
               ) # move the tick labels

ax.set_yticks( [30, 20, 10, 0] )
ax.set_yticklabels( ["Open", "High", "Low", "Close"] )
ax.zaxis.set_major_formatter( FuncFormatter(major_formatter) )###############

# Set the z-axis label
ax.set_zlabel( "Price (US $)" )

# Rotate viewport
ax.view_init( azim=-42, elev=31 )

plt.tight_layout()
plt.show()

     Basically, we subtracted 110 from the y values so that the bars start at the intended lower limit, and then used a tick formatter (major_formatter) to shift the tick labels back to the original prices. For 3D scatter plots, we can simply remove the data points that exceed the boundary of set_zlim3d() in order to generate a proper figure. However, these workarounds may not work for every 3D plot type.
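
     The scatter-plot variant of the workaround can be sketched as follows: filter the points to the desired z range first, then plot only the remaining ones. The random points and the names points, visible, z_min, and z_max are assumptions for illustration, and the Agg backend is used only so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
points = pd.DataFrame(rng.uniform(0, 200, size=(500, 3)),
                      columns=["x", "y", "z"])  # hypothetical points

z_min, z_max = 110, 150
# Keep only the points whose z value lies inside the intended limits
visible = points[points["z"].between(z_min, z_max)]

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(visible.x, visible.y, visible.z, s=4)
ax.set_zlim3d(bottom=z_min, top=z_max)
fig.canvas.draw()  # render; call plt.show() instead in an interactive session
```

     Because the out-of-range points are never handed to scatter(), nothing is left for mplot3d to draw outside the axes box.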

Summary

     You have successfully learned the techniques for visualizing multivariate data in 2D and 3D forms. Although most examples in this chapter revolved around the topic of stock trading, the data processing and visualization methods can be applied readily to other fields as well. In particular, the divide-and-conquer approach used to visualize multivariate data in facets is extremely useful in the scientific field.

     We didn't go into too much detail of the 3D plotting capability of Matplotlib, as it is yet to be polished. For simple 3D plots, Matplotlib already suffices. The learning curve can be reduced if we use the same package for both 2D and 3D plots. You are advised to take a look at MayaVi2, Plotly, and VisPy if you require more powerful 3D plotting functions.
