mtb5_Nested Calculations_lookup(hide data)filter_address(direction)_partition(scope)_Rank_Size_Index

I would like to draw your attention to the work of three individuals who helped to make this chapter possible. Joshua Milligan provided the idea of directional and non-directional as a taxonomy[tækˈsɑːnəmi]分类学  for considering how Tableau table calculation functions are applied. Of the dozens of blogs, forum posts, conference sessions, articles, and white papers reviewed for this chapter, Jonathan Drummy's blog post ( http://drawingwithnumbers.artisart.org/at-the-level-unlocking-the-mystery-part-1-ordinal-calcs/ ), At the Level – Unlocking the Mystery Part 1: Ordinal Calcs, was the clearest and most insightful for understanding the various nuances[ˈnuːɑːns]细微差别  of partitioning and addressing. Lastly, Joe Mako's unsurpassed understanding of the inner workings of Tableau and his willingness to share that insight through the Tableau forums was very helpful.

In this chapter, we will discuss the following topics:

  • • A definition and two questions
  • • Introducing functions
  • • Directional and non-directional table calculations
  • • Application of functions

     Let's go into more detail and start this chapter off with a general definition of table calculations and two resulting questions, which we will answer by the end of this chapter.

A definition and two questions

     As discussed in https://blog.csdn.net/Linli522362242/article/details/124207205, Getting Up to Speed – A Review of the Basics, calculated fields can be categorized as either row-level, aggregate-level, or table-level. For row - and aggregate-level calculations, the underlying data source engine does most (if not all) of the computational work and Tableau merely visualizes the results. For table calculations, Tableau also relies on the underlying data source engine and the
available RAM on your machine to execute computational tasks; however, after that work is completed and a dataset is returned, Tableau performs additional processing before rendering the results.

Let us look at the definition of table calculations, as follows:

     A table calculation is a function performed on a cached dataset that has been generated as a result of a query from Tableau to the data source.

     Let's consider a couple of points regarding the dataset in cache mentioned in the preceding definition: 

  • This cache is not simply the returned results of a query. Tableau may adjust the returned results. We will consider this in Chapter 6, All About Data – Data Densification, Cubes, and Big Data; Tableau may expand the cache through data densification but usually, it is user-driven.
  • It's important to consider how the cache is structured. Basically, the dataset in cache is the table used in your worksheet and, like all tables, is made up of rows and columns. This is particularly important for table calculations since a table calculation may be computed as it moves along the cache. Such a table calculation is directional (for an example, see Figure 5.1)
  • Alternatively, a table calculation may be computed based on the entire cache with no directional consideration. These table calculations are non-directional. Directional and non-directional table calculations will be explored more fully in the Directional and non-directional table calculations section.

     Note that in the Tableau documentation, the dataset in cache is typically referred to as a partition缓存中的数据集通常称为分区. This chapter will often use both terms side by side for clarity.

     The structure of this chapter was created with the intent of providing a simple schema for understanding table calculations. This is communicated through two questions:

  • • What is the function?
  • • How is the function applied?

     These two questions are inexorably[ɪnˈeksərəbli] connected密不可分的. You cannot reliably apply something until you know what it is. And you cannot get useful results from something until you correctly apply it. The Introducing functions section explores each unique table calculation function, and how each can be considered directional or non-directional. The Application of functions section explores how table calculations are applied to the view via partitioning and addressing dimensions.

Introducing functions

     As discussed earlier in this chapter, it may be helpful to enhance your understanding of table calculations with the following two questions: what is the function and how is the function applied? We will begin by grouping each table calculation function to be directional or non-directional.

     Tableau offers a wide range of table calculations, but if we narrow our consideration to unique groups of table calculation functions, we will discover that there are only 11.

The following table shows those 11 functions organized into two categories:

     As mentioned in A definition and two questions, non-directional table calculation functions operate on the entire cache and thus are not computed based on movement through the cache. For example,

  • the SIZE function doesn't change based on the value of a previous row in the cache and is therefore considered non-directional.

         Returns the number of rows in the partition. For example, the view below shows quarterly sales. Within the Date partition, there are 7 rows so the Size() of the Date partition is 7.
  • On the other hand, RUNNING_SUM does change based on previous rows in the cache and is therefore considered directional.

         Returns the running sum of the given expression, from the first row in the partition to the current row. (table down)

     In the following section, we'll see directional and non-directional table calculation functions in action.

Directional and non-directional table calculations 

Let us have a closer look at directional and non-directional table calculations: 

  • 2. Navigate to the Directional/Non-Directional worksheet
  • 3. Create the calculated fields, as shown in the following table:
    • LookupNotice the -1 included in this calculation. This instructs Tableau to retrieve the value from the previous row.
      LOOKUP( SUM([Sales]), -1 )

      LOOKUP(expression, [offset]) :
           Returns the value of the expression in a target row, specified as a relative offset from the current row. Use FIRST() + n and LAST() - n as part of your offset definition for a target relative to the first/last rows in the partition. If offset is omitted, the row to compare to can be set on the field menu. This function returns NULL if the target row cannot be determined.

           The view below shows quarterly sales. When LOOKUP (SUM(Sales), 2) is computed within the Date partition, each row shows the sales value from 2 quarters into the future.
    • Size : The SIZE function returns the number of rows in the partition. Therefore, as can be seen in the following screenshot, the size equals the total number of rows
      SIZE()
    • Window Sum : The Window sum functions (WINDOW_SUM(expression, [start, end])
      can operate either directionally or non-directionally.
           Since this example does not include the [start, end] option, it operates non-directionally
      WINDOW_SUM( SUM([Sales]) )

      WINDOW_SUM(expression, [start, end])
           Returns the sum of the expression within the window. The window is defined by means of offsets from the current row. Use FIRST()+n and LAST()-n for offsets from the first or last row in the partition. If the start and end are omitted, the entire partition is used.
           For example, the view below shows quarterly sales. A window sum computed within the Date partition returns the summation of sales across all quarters.

           Example : 
                WINDOW_SUM(SUM([Profit]), FIRST()+1, 0) computes the sum of SUM(Profit) from the second row to the current row.
    • Window Sum w/ Start&End
      WINDOW_SUM( SUM([Sales]), 0, 1 )
      computes the sum of SUM(Sales) from the current row to the next row.
    • Running Sum :
      RUNNING_SUM( SUM([Sales]) )

      RUNNING_SUM(expression) : Returns the running sum of the given expression, from the first row in the partition to the current row. (Accumulation operation) 
  •  4. Place Category and Ship Mode on the Rows shelf.

  •  5. Double-click on Sales, Lookup, Size, Window Sum, Window Sum w/Start&End, and Running Sum to populate the view as shown in the following screenshot:

  • 6. The table in Figure 5.1 shows you that the Size function and the Window Sum function are the only two non-directional functions. All others present a number depending on the direction. For example, Lookup moves down and prints the value prior to the actual field. You can change the direction by clicking on the triangle of any table calculation field and selecting Edit Table Calculation.

     In answer to the question what is the function, we have considered a taxonomy composed of two categories: directional and non-directional. Now, we will move on to considering the table calculation functions individually. Regretfully, space does not allow us to explore all table calculations; however, to gain a working understanding, it should suffice to consider all but one of the unique groups of options. The four table calculations that begin with Script_ will be covered in Chapter 15, Programming Tool Integration. 

     Although some effort has been made to make the following exercises useful for real-world scenarios, each is focused on demonstrating table calculation functionality while considering how each interacts directionally or non-directionally with the dataset in cache (also known as the partition).

Exploring each unique table calculation function

The following exercises will show us each table calculation individually, based on an example.

Lookup(hide data) and Total

     The objectives of the following worksheet are to display those customers who made purchases in the last 2 days of 2013, and the associated absolute as well as relative sales of the month of December(Purchases per customer in the last 2 days as a percentage of total sales in the last month of 2013).

  • 1. In the workbook associated with this chapter, navigate to the Lookup/Total worksheet.
  • 2. Drag Customer Name and Order Date to the Rows shelf. Set Order Date to Month/Day/Year(date value truncated) discrete by right-clicking and selecting Day as well as Discrete.
  • 3. Place Order Date on the Filters shelf, choose to view only December 2013, and select Month/Year as the date format.

    The filter on Order Date ensures that the dataset returned to Tableau only includes data from the month of December 2013.

    Objectives: only show those Customers' Name who made purchases in the December 2013
  • 4. Create the following calculated fields:
    • Objectives: Customers who made purchases in the last 2 days of 2013

      Max(): Returns current customer's last purchase date

      Lookup Intervening Time
      :will return an integer that reflects the difference between the date returned by the LOOKUP function and 12/31/2013 (12/31/2013 - date returned). Note that the LOOKUP function has an offset of 0(=current row). This results in each row returning the date associated with that row. This differs from directional and non-directional table calculations, which include a LOOKUP function with an offset of -1, which caused each row in the view to return data associated with the previous row.
      DATEDIFF( 'day', LOOKUP( Max([Order Date]), 0),
                       MAKEDATE(2013, 12,31)
              )
      MAKEDATE(year, month, day) : 
           Returns a date value constructed from the specified year, month, and date.
           MAKEDATE(2013, 12, 31) = #December 31, 2013#

      6. Place Lookup Intervening Time on the Filters shelf and choose a range from 0 to 1.( 0<= 12/31/2013 - date returned <=)
    • % Total Sales :
      SUM([Sales])/TOTAL( SUM([Sales]) )
        the TOTAL( SUM([Sales]) ) calculated field returns the total sales for the entire dataset. Dividing SUM([Sales]) by this total returns the percentage of the total.
  • 5. Right-click on % Total Sales and select Default Properties | Number format... to set the number format to percentage with two decimal places.

  • 7. Double-click on Sales and % Total Sales to place both fields on the view.
  • 8. Format as desired:Purchases per customer in the last 2 days as a percentage of total sales in the last month of 2013 (First filter ( TOTAL( SUM([Sales]) ) )==>Second filter (Lookup SUM([Sales]) in the last 2 days) 

    Think of Lookup Intervening Time as not filtering but hiding all but the last two days in December. This hiding ensures that the data necessary to calculate % Total Sales is in the dataset in cache/partition.

     At first glance, you might think that you could simplify this workbook by removing Lookup Intervening Time from the Filters shelf and adjusting the filter on [Order Date] to display only the last two days of December. However, if you do this, % Total Sales will add up to 100% across all rows in the view(Last 2 days of purchases per customer as a percentage of last 2 days of total sales in the last month of 2013), which would not satisfy the workbook's objectives(Purchases per customer in the last 2 days as a percentage of total sales in the last month of 2013).​​​​​​​

Previous Value

     The objectives of the following worksheet are to return the aggregate value of sales for each year and set next year's sales goal. Note that two options have been provided for determining next year's sales goal in order to demonstrate how PREVIOUS_VALUE differs from LOOKUP . Also note that PREVIOUS_VALUE behaves directionally

Let us have a look at the steps:

  • 1. In the workbook associated with this chapter, navigate to the Previous Value worksheet.
  • 2. Create the following calculated fields:
    • Next Year Goal Prv_Val :
      PREVIOUS_VALUE( SUM([Sales]) ) * 1.05
      PREVIOUS_VALUE(SUM([Sales])) *1.05 is applied in this worksheet, which retrieves the results from each previous row and adds 5%. In other words, the goal is a steady-state growth rate of 5% per year over all years.

      PREVIOUS_VALUE(expression) : Returns the value of this calculation in the previous row. Returns the given expression if the current row is the first row of the partition.

      PREVIOUS_VALUE(expression) * 1.05 ==> 484,247*1.05=508,460 ==>

      PREVIOUS_VALUE(expression) * 1.05 ==> 508.460*1.05=533,883 ==>
    • Next Year Goal Lkup
      LOOKUP( SUM([Sales]), 0 ) * 1.05
      LOOKUP(SUM([Sales]),0) * 1.05 is also applied in
      this worksheet; this calculation adds 5% to the current year's sales. In other words, the goal for next year is for sales that are 5% greater than this year. Previous years are not considered.
  • 3. Place Order Date on the Rows shelf.
  • 4. Double-click Sales, Next Year Goal Prv_Val and Next Year Goal Lkup to place each on the view.
  • 5. Format as desired, or as seen in the following screenshot (by dragging the piles[paɪlz]桩 in Measure Values, the order will change accordingly in the view as seen in the following screenshot): 

     To better understand this exercise, consider the values associated with 2014 in the preceding screenshot.

  • Next Year Goal Prv_Val is calculated via 2013 Next Year Goal Prv_Val; that is, $560,577 * 1.05.
  • On the other hand, Next Year Goal Lkup is calculated via the 2014 sales; that is, $733,947 * 1.05.
  • It is usually based on this year's sales to formulate next year's sales goals, of course, other related factors must also be considered. 

         To formulate more precisely, all previous data (including this year's data) and various factors should be combined to make a model to predict the next year, so as to formulate goals for the next year

Running

     The objective of the following worksheet is to display the running minimum profit, running average profit, and running maximum profit compared with SUM(Profit) for each month in the dataset.

The following example demonstrates how the Running functions behave directionally:

  • 1. In the workbook associated with this chapter, navigate to the Running worksheet.
  • 2. Create the following calculated fields:
    • Running Min
      RUNNING_MIN( SUM([Profit]) )

      compares the current SUM(Profit) with the least SUM(Profit) recorded to that point in time. If the current SUM(Profit) is less than the least SUM(Profit) recorded to date, the current SUM(Profit) replaces the least SUM(Profit).
    • Running Max
      RUNNING_MAX( SUM([Profit]) )
    • Running Avg
      RUNNING_AVG( SUM([Profit]) )
  • 3. Place Order Date on the Columns shelf and set it to Month/Year continuous. Place Measure Values and Profit on the Rows shelf. 
  • 4. Remove all instances of measures from the Measure Values shelf except for Running Min, Running Max, Running Avg and Profit.
  • 5. Move SUM(Profit) from the Measure Values shelf to the Rows shelf. Right-click on SUM(Profit) and select Dual Axis.
  • 6. Format as desired, or as seen in the following screenshot:

Window 

     The objective of the following worksheet is to display a directional instance of a WINDOW function and a non-directional instance.

Please follow these steps:

  • 1. In the workbook associated with this chapter, navigate to the Window worksheet.
  • 2. Create the following calculated fields:
    • Win Avg Directional
      WINDOW_AVG( SUM([Profit]), -2, 0 )
      has a start point of -2 and an end point of 0, which signifies that Tableau will create a point based on the average of the SUM([Profit]) calculated on the current month and the previous two months. Changing the 0 to 2 would cause the average of each point to be calculated on the previous 2 months, the current month, and the next 2 months. Double-click on the instance of on the Rows shelf to view the underlying code. (This is the instance of Profit that was changed into a table calculation by right-clicking and selecting Quick Table Calculation | Moving Average.) Note that the calculation is identical to the code created for Win Avg Directional. However, if you right-click on both pills and select Edit Table Calculation, you will notice that the resulting dialog boxes differ. The dialog box associated with Profit() has more options, including the ability to change the previous and next values.
      Changing the previous and next values for Win Avg Directional requires adjusting the calculated field (Start and End fields in the code).
    • Win Avg Non-Directional
      WINDOW_AVG( SUM([Profit]) )
      is associated with the horizontal line across the view. Note that it is not dependent on direction. Instead, it is a single value generated by the average of all aggregated Profit values in the dataset in cache/partition and will therefore be the same no matter the order of the data points.
    • 3. Place Order Date on the Columns shelf and set it to Month/Year continuous. Place Measure Values on the Rows shelf.
    • 4. Remove all instances of measures from the Measure Values shelf except Win Avg Directional, Win Avg Non-Directional, and Profit.
    • 5. From the Data pane, drag another instance of Profit to the Rows shelf. Right-click on the instance of Profit on the Rows shelf and select Quick Table Calculation | Moving Average. Right-click on the instance of Profit on the Rows shelf and select Dual Axis.
    • 6. Right-click on the axis labeled Moving Average of Profit and select Synchronize Axis.
    • 7. Format as desired, or as seen in the following screenshot:

First and Last

     The objective of the following worksheet is to display the first and last instance of the best-selling product(Top) in the Superstore dataset. Notice how the following example demonstrates that the FIRST and LAST functions behave directionally.

Take the following steps:

  • 1. In the workbook associated with this chapter, navigate to the First/Last worksheet

  • 2. Create the following calculated fields:

    • First : FIRST() starts at 0 and counts down to the last row of the dataset in cache. In the back screenshot, note that the first instance of Canon imageCLASS occurs on 05/24/13. The fact that FIRST() ranges from 0 to -4 communicates that there are 5 instances of Canon imageCLASS in the dataset in cache or within the window cache.
      FIRST()

           Returns the number of rows from the current row to the first row in the partition. For example, the view below shows quarterly sales. When FIRST() is computed within the Date partition, the offset of the first row from the second row is -1.​​​​​​​

      When the current row index is 3, FIRST() = -2.

    • Last: LAST() starts at the last row of the dataset in cache and counts down to 0. In the back screenshot, note that the last instance of Canon imageCLASS occurs on 11/18/14. The fact that LAST() ranges from 4 to 0 communicates that there are 5 instances of Canon imageCLASS in the dataset in cache.
      LAST()
            Returns the number of rows from the current row to the last row in the partition. For example, the table below shows quarterly sales. When LAST() is computed within the Date partition, the offset of the last row from the second row is 5.
       

      When the current row index is 3 of 7, LAST() = 4.

    • First or Last First or Last: when placed on the Filters shelf and set to True(use Compute Using | Table (down)), hides all instances of matching rows except the first and last.

      FIRST() = 0 OR  LAST() = 0

  • 3. Place Product Name on the Filters shelf, select the Top tab, and choose Top 1 by Sum of Sales, as shown in the following screenshot: 
  • 4. Place First or Last on the Filters shelf and select True in the resulting dialog box          
  • 5. Drag Product Name, Order Date(Month/Day/Year and discrete), and Row ID (add all members)to the Rows shelf

         The Row ID field is included in the view to make sure that the very first and last instances of Canon imageCLASS display. Otherwise, if there are multiple instances of Canon imageCLASS on the first or last date, sales numbers will reflect multiple values. It's important to set Compute Using to Table (down) for each table calculation in the view. Compute Using is the same as Addressing, which will be discussed in detail in the Application of functions section.
  • 7. Double-click on Sales, First, and Last to place each on the view
  • 8. Right-click on First, Last, and First or Last and select Compute Using | Table (down)the first and last instance of the best-selling product(Top) in the Superstore dataset : 
         
    The Row ID field is included in the view to make sure that the very first and last instances of Canon imageCLASS display
         First or Last in the filter use Compute Using | Table (down)

Index

     The objective of the following worksheet is to list those states in the USA with over 50 postal codes represented in the underlying dataset.

     Notice how the following example demonstrates that the INDEX function behaves directionally and can be seen as a count of rows: 

  • 1. In the workbook associated with this chapter, navigate to the Index worksheet
  • 2. Set the Marks card to Circle.
  • 3. Place State on the Rows shelf and Postal Code on the Detail shelf.
  • 4. Create a calculated field named Index with the code: INDEX() counts from 1 to n. As such, it behaves directionally. In this case, as a result of the partitioning(scope: Within State) and addressing(direction:Postal Code) settings, Index is counting postal codes. (Partitioning and addressing will be discussed in detail in the Application of functions section.) Setting the Index filter to display only values of 50 or more ensures that only those states with 50 or more postal codes in the partition/dataset in cache display.
    INDEX()
    Drag Index to the Filters shelf and select the Range of values filter. Choose to view only values that are 50 or greater by moving the slider to a minimum value of 50.since the Index default Compute Using State and the the index of State not exceed 49.

    Drag the Index to the Label shelf
  • 5. Right-click on Index on the filter shelf and Label shelf and select Edit Table Calculation. Select Specific Dimensions and check Postal Code(it is addressed : determine the direction of the calculation):
    the remaining dimension is State, which define the scope of data it is performed on, are called partitioning field.

    right click Index on Label then select remove
  • 6. Select Analysis menu | Stack Marks | Off, then review:list those states in the USA with over 50 postal codes represented in the underlying dataset

Rank

     The objective of the following worksheet is to display the top three selling items in each region. This example will demonstrate how RANK interacts directionally with the dataset in cache:

  • 1. In the workbook associated with this chapter, navigate to the Rank worksheet.
  • 2. Place Product Name on the Filters shelf, select the Top tab, and choose Top 3 by Sum of Sales:
  • 3. Place Region and Product Name on the Rows shelf and Sales on the Columns shelf
  • 4. Note that only two items display for each region:you will have noticed that after placing Region and Product Name on the Rows shelf and
    filtering to show only the top three product names, the resulting visualization only showed two products in each region. This is because the filter on Product Name showed the top three products overall, but it so happened that two out of the three were always present in each region. But we actually wanted to show the top three per region rather than overall. To fix this issue, we will employ the Rank table calculation function.
  • 5. Create a calculated field named Rank with the code
    RANK( SUM([Sales]) )
  • 6. Drag Rank(conver to Discrete) between Region and Product Name on the Rows shelf.
    here Rank : default Compute Using | Table(down) or current rank along the all SUM( Sales)
  • 7. Right-click on Rank and select Compute Using | Product Name.Sort(Rank) the top three products by sales in each region
  • 8. Remove Product Name from the Filters shelf.Sort(Rank) the products by sales in each region
         Note that Rank is counting the product names within each region. This demonstrates that the Rank table calculation operates directionally on the dataset in cache/partition
  • 9. Press Ctrl (or press Option on Mac), and right-click Rank on the Rows shelf then select Show Filter to append the Rank to the Filters shelf(alternately, Pressing the Ctrl key while dragging a pill(here is Rank) from one shelf(Row) to another(Filters) will create a copy of that pill.). Next, edit the Rank on the Filters shelf In the resulting dialog box, select 1, 2, and 3.
  • 10. Pressing the Ctrl key while dragging Rank) from Filters shelf to Color Shelf to create a copy of the Rankthe top three selling items in each region

      Let's understand how this works. Instead of writing the Rank(Sum(Sales)) function, the same code will be automatically generated by right-clicking on an instance of Sales on any shelf and selecting Quick Table Calculation | Rank. Note that Rank is counting the product names within each region. This demonstrates that the Rank table calculation operates directionally on
the dataset in cache/partition. 

Size

     The objective of the following worksheet is to display all states with five or fewer cities in the Superstore dataset. This example will demonstrate how SIZE utilizes the entire partition/dataset in cache and is thus non-directional. We will also use the FIRST table calculation function, which is directional, in order to clean up the view. 

Please follow along with the steps:

  • 1. In the workbook associated with this chapter, navigate to the Size worksheet.
  • 2. Set Analysis | Stack Marks to Off.
  • 3. Create the following calculated fields:
    • Size
      SIZE()
       Size() generates a single number, for example, 3 for Delaware, based on the partition/dataset in cache. That number can change depending on the partitioning and addressing settings, but does not change based on movement across the partition. As such, it behaves non-directionally.
    • City Count
      IF FIRST() = 0
          THEN [Size]
      ELSE
          NULL
      END
      The IF FIRST() = 0 THEN LOOKUP([Size],0) ELSE NULL END field is not strictly necessary. You could, instead, simply place Size on the Text/Label shelf. However, if you do so, you will note that the numbers in the view will look bold. This occurs because the numbers are actually repeated and then layered on top of each other. Utilizing FIRST() = 0 causes only one set of numbers to display.
  • 4. Drag State to the Rows shelf, City to the Detail shelf, City Count to the Text/Label shelfThe text in the table shows the number of occurrences of each city in each state in the dataset (if First() =0, then [Size]; otherwise NULL)

  • 5. Drag Size to the Filters shelf. and select Compute Using | City. In the resulting dialog box, select an At most value of 5: The text in the table shows the number of occurrences of each city in each state in the dataset (if First() =0, then [Size]; otherwise NULL), Besides, we used the filter Size to limit the number of cities in the state to a maximum of 5
  • 7. On the Marks card, right-click on City Count and select Edit Table Calculation. Now you will see the Nested Calculations, since the field [Size] used in the code of City Count is itself calculated values. you need to tell Tableau…and here is the trick…how [Size] is to be calculated inside the nested calculation.

         Under Nested Calculations, select City Count, then Compute Using select State(it is addressed, then the remaining dimension City is partitioned, the table calculation is recalculated for each subpartition).==> City Count ~ State 
    (you can right-click on City Count and Select Compute Using | Table (down) or State
    OR use tableau default
    ) excuting
    We still didn't tell the 
    Tableau how [Size] is to be calculated inside the nested calculation, the tableau will default [Size] to be calculated along each instance of City direction within each partition(City)

    City Count ~ State is addressed, City is partitioned
    within each partition(
    City),  
    City Count [Size] ~ each City instance would be treat as addressing direction

    (2 2 2 since there are 3 consecutive City instances that are not null

    and
    first city instance are null)


         The text in the table shows the number of occurrences of each City in each state in the dataset (if First() =0, then [Size]; otherwise NULL)
         Besides, we used the filter Size to limit the number of cities in the state to a maximum of 5
  • 8. On the Marks card, right-click on City Count and select Edit Table Calculation. Now you will see the Nested Calculations, since the field [Size] used in the code of City Count is itself calculated values. you need to tell Tableau…and here is the trick…how [Size] is to be calculated inside the nested calculation.

         On the Marks card, right-click on City Count and select Edit Table Calculation. Under Nested Calculations, select City Count and Compute Using | Specific Dimensions and check City(it is addressed, then the remaining dimension State is partitioned, the table calculation is recalculated for each partition). ==> City CountCity (Utilizing FIRST() = 0 causes only one set of numbers to display)

    City Count ~ City is addressed, State is partitioned

    We still didn't tell the Tableau how [Size] is to be calculated inside the nested calculation,


    The text in the view based current result will show:
    City Count [Size] ~ ?
         if  [Size] != NULL, then [Size]==the number of occurrences of First City in each state 
         else NULL replaced with following:
              if return City_pointer_position since tableau used City_pointer_position to find [Size] is not NULL, [Size]==the number of occurrences of First City in each state
              else return current row City_pointer_position- - ).
    Besides, we used the filter Size to limit the number of cities in the state to a maximum of 5 

    After finishing pevious Edit Table Calculation: City Count ~ City is addressed, State is partitioned
    Then right-click on City Count and Select Compute Using | City:
    excuting address City for the Table Calculation, the remaining of State is partioned

    the Nested calculated field City Count [Size] will use current addressing value City Count [Size]~(City), State is partitioned

    (Utilizing FIRST() = 0 causes only one set of numbers to display)
    The text in the table shows the number of cities in each state in the dataset (if First() =0, then [Size]), otherwise NULL,
    Besides, we used the filter Size to limit the number of cities in the state to a maximum of 5
  • 9. On the Marks card, right-click on City Count and select Edit Table Calculation. Now you will see the Nested Calculations, since the field [Size] used in the code of City Count is itself calculated values. you need to tell Tableau…and here is the trick…how [Size] is to be calculated inside the nested calculation.

         On the Marks card, right-click on City Count and select Edit Table Calculation. Now use the dropdown under Nested Calculations again and select [Size]. Then, select Compute Using | Specific Dimensions and check City(it is addressed, then the remaining dimension State is partitioned, the table calculation is recalculated for each partition).City Count [Size]~City==>the number of cities in each state

    Besides, we used the filter Size to limit the number of cities in the state to a maximum of 5)

    City Count will use the State as partition from current View, and City is addressed,
    tableau fill each city with the value of [Size] based current view

    3 3 3 since there existing 3 cities with not null 


    Then right-click on City Count and Select Compute Using | CityCity Count ~City(Utilizing FIRST() = 0 causes only one set of numbers to display.)
    excuting address City for the Table Calculation, the remaining of State is partioned
    all states with five or fewer cities in the Superstore dataset

     Perhaps the most difficult thing to grasp about this exercise is the use of partitioning(scope) and addressing(direction). We will discuss partitioning and addressing in the Application of functions section. Note that the preceding exercise(step 9) had an option for Nested Calculations, which is because the Size calculated field was referenced within the City Count calculated field.

     It is recommended to specify Compute Using | ? when doing Edit Table Calculation for the calculated field, Then right-click the calculated field and specify same Compute Using | ? again.

     Now that we have presented the different table calculations, we will see how they can be manipulated in the scope and direction of calculation. 

Application of functions 

      So far, we have covered the first of our two major questions: What is the function? Now we will proceed to the next question: How is the function applied? 

     Let's try to understand that question via the following three options, which are all applications of the INDEX function: 

     The INDEX function is used in each of these three screenshots; however, it is applied differently in each. The first and second screenshots both display 1, 2, and 3, but differ directionally. The third screenshot ranges from 1 to 9. So, how is INDEX being applied in each case?

     Answering this question can be confusing because Tableau uses different terminology. Within Tableau itself, the way a table calculation is applied may be referred to as running along, moving along, compute using, or partitioning and addressing.For our purposes, we will utilize the terms partitioning and addressing,which we will define here according to the Tableau documentation(https://help.tableau.com/current/pro/desktop/en-us/calculations_tablecalculations.htm ):

     The dimensions that define how to group the calculation, that is, define the scope of data it is performed on, are called partitioning fields. The table calculation is performed separately within each partition. The remaining dimensions, upon which the table calculation is performed, are called addressing fields, and determine the direction of the calculation.

     This basically means that the partition defines which fields are being used, for example, Segment. If you have 3 segments divided over 3 categories, you could look

  • at each combination separately,
  • or at all categories within one segment,
  • or at all segments within one category.
  • The way you look at it is what Tableau calls addressing.

     If a table calculation is utilized in the view, you can right-click on it and select Edit Table Calculation. Upon doing so, you will see a dialog box that will allow you to choose specific dimensions.

  • If a dimension is checked, it is addressed.
  • If it is unchecked, it is partitioned

See an example in the following figure: 

     Tableau provides many out-of-the-box partitioning and addressing settings, including Table (down), Table (across), Pane (down), and Cell. We will reference these options occasionally but will not give a detailed review. This leads us to our first partitioning and addressing guideline

     Don't use the out-of-the-box partitioning and addressing settings provided by Tableau, including Table (across) and Pane (down). Force yourself to click Specific Dimensions and manually define the partitioning and addressing so that you clearly understand how every table calculation is applied.

     There are a couple of caveats[ˈkæviɑs]警告,限制性条款,注意事项 to the preceding guideline:

  • • There's an exception, which is Cell. It is not possible to address individual cells in a view using partitioning and addressing. Instead, it is necessary to use
    • ​​​​​​​ Compute Using | Cell
    • or, within the Table Calculation dialog box, to select Cell.
    • Surprisingly, addressing a table calculation along each cell can be useful. An example is provided in the workbook associated with this chapter, on the worksheet Percent of Total. There it is used to show 100% for each cell divided into two categories.
  • • If you set partitioning and addressing for a given table calculation and then add dimensions to the view, usually Tableau will not automatically adjust the partitioning and addressing settings; they are locked down. However, when using Table (down), Pane (across), and the like, Tableau will make automatic adjustments as dimensions are added to the view.

     This leads us to our next guideline:

     Place all needed dimensions on the desired shelves before setting partitioning and addressing for table calculations.

     Following these guidelines will help ensure that you are always clear about how your table calculations are being applied. 

Building a playground 

     Let's set up a simple playground environment to quickly and efficiently explore partitioning and addressing

  • 1. In the workbook associated with this chapter, navigate to the Playground worksheet.
  • 2. Place Category on the Rows shelf and the Index calculation on the Label shelf.
  • 3. The Index calculation is simply Index() . Click on the drop-down menu associated with Index and select Edit Table Calculation. In the resulting dialog box, click Specific Dimensions, then check Category(it is addressed, direction)

     You can see in Figure 5.17 that we created a worksheet displaying the 3 categories Furniture, Office Supplies, and Technology. By adding the Index field to the Text shelf we have the minimum requirement to use a table calculation, which is one dimension, one table calculation. Now we can change the compute used in the table calculation and can spot[spɑːt]看见,注意到 what changes in the visualization. Later on, we will explore more complex examples; however, always start easy and increase the level of complexity slowly to fully understand what is happening. 

     We could explore different functions on various shelves and the different options that that affords. For instance, placing a date field on the Pages shelf will cause a table calculation that uses the TOTAL function to display an option to compute the total across all pages. Regretfully, exploring every possible nuance is simply not possible in one chapter but if you are interested in this topic, feel free to check out the Tableau help page for more content and examples: https://help.tableau.com/current/pro/desktop/en-us/functions_functions_tablecalculation.htm

Partitioning and addressing with one dimension 

     Let's use our playground to start exploring partitioning and addressing with the simplest possible example: 

     In this simple example, addressing Category causes each member of the Category dimension to be counted. This demonstrates that addressing a dimension determines the direction of the calculation. In our example, we have the Table Calculation Index, but Index depends on partitioning and addressing as we learned before.

  • Category serves as a partition in the Rows shelf. We split the data into three: Furniture, Office Supplies, Technology.
  • Now we also say address Category for the Table Calculation. We did that by selecting Compute Using | Category excuting(Figure 5.18). Now the Table Calculation will be applied to the categories and hence counts 1, 2, 3 categories. 

Now, let's consider partitioning and addressing with not one, but two dimensions. 

Partitioning and addressing with two dimensions 

     Two additional options are made available when partitioning and addressing two or more dimensions: At the level and Restarting every. You can compare this scenario to the task of counting all countries per continent. In this scenario, At the level will be countries, since you are not counting streets or trees but countries. Restarting will be continents. After you are done counting countries for one continent, you start at 1 again for the next continent:

     Both At the level and Restarting every allow the author to choose dimensions from a drop-down menu. At the level allows the author to choose what level to increment at, and as the name suggests, Restarting every allows the author to choose which dimensions to restart on. The examples here will provide context for your understanding. 

     Note that At the level has one additional choice: Deepest. In this case, setting At the level to Deepest is the same as selecting Ship Mode. This leads us to our next guideline: 

     It is not necessary to choose the bottom dimension in the At the level drop-down menu. It is always identical to Deepest. 

     To recreate the iterations listed here, you will need to make some changes to the playground environment. In addition to Category on the Rows shelf and Index on the Label shelf, also place Ship Mode on the Rows shelf. We will not cover iterations that include one or more dimensions on the Columns shelf since the behavior of these possibilities is much the same. 

     As you consider and/or reproduce the following options, note that the Addressing order is important. For examples 3–5, Category is first on the addressing list. For options 6–8, Ship Mode is first on the list.

In the 1st example, configure the partitioning and addressing settings thus: 

  • Partitioning: Category(the table calculation is recalculated for new partition)
  • Addressing: Ship Mode (Compute Using | Ship Mode)
  • • At the level: -
  • • Restarting every: - 

This will produce the following visualization: Here, we count the Ship Mode per Category.

In the 2nd example, configure the partitioning and addressing settings thus:

  • Partitioning: Ship Mode(the table calculation is recalculated for new partition)
  • Addressing: Category (Compute Using | Category)
  • • At the level: -
  • • Restarting every: -

This will produce the following visualization: Here, we count the Category per Ship Mode.

 In the 3rd example, configure the partitioning and addressing settings thus: 

  • Partitioning: -
  • Addressing: Category, Ship Mode
  • At the level: Deepest (or Ship Mode)
    At the level
    allows the author to choose what level to increment at)
  • • Restarting every: None 

This will produce the following visualization: we count the Category and Ship Mode combination.

In the 4th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • Addressing: Category, Ship Mode
  • At the level: Deepest(or Ship Mode)
    At the level allows the author to choose what level to increment at)
  • Restarting every: Category 

     This will produce the following visualization: we count the combination Category and Ship Mode and restart counting at every new Category 

 In the 5th example, configure the partitioning and addressing settings thus:

  • • Partitioning: -
  • Addressing: Category, Ship Mode
  • At the level: Category
    (
    At the level allows the author to choose what level to increment at)
  • • Restarting every: None

     This will produce the following visualization: We count the combination Category and Ship Mode that appears at the Category level. Since the Category level is higher in the hierarchy than Ship Mode, we end up counting only one value per Category.

In the 6th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • Addressing: Ship Mode, Category(Drag and move Ship Mode)
  • At the level: Deepest (Category)
    (At the level allows the author to choose what level to increment at)
  • • Restarting every: None

     This will produce the following visualization: We count the combination Ship Mode and Category at the Deepest (Category) level; hence we count First Class and all the Category combinations with it before Same Day and its combinations with Category.      Example 6 may seem confusing at first. Why has the odd奇怪的 numbering sequence occurred? Because the order in which the dimensions are addressed(The addressing order is Ship Mode, Category) differs from the order of dimensions on the Rows shelf(The order on the Rows shelf is Category, Ship Mode). . Simply reversing the position of Category and Ship Mode on the Rows shelf and noting the change in the number sequence should help dispel[dɪˈspel]驱散,消除 any confusion:

     Is there any practical use for example 6? Yes. From time to time, it may be necessary to address dimensions in a different order than they are listed on a shelf. But this is not usually the case. This leads us to our next guideline:
     When addressing multiple dimensions for a table calculation, the order of addressing寻址顺序 will usually reflect the order of dimensions on the Rows and/or Columns shelves.

  • • Partitioning: -
  • Addressing: Ship Mode, Category(Drag and move Ship Mode)
  • At the level: Ship Mode
    (At the level allows the author to choose what level to increment at)
  • • Restarting every: None
    Since the Ship Mode level is higher in the hierarchy than Category, we end up counting only one value per Ship Mode

In the 7th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • • Addressing: Ship Mode, Category
  • • At the level: Deepest(Category)
    (
    At the level allows the author to choose what level to increment at)
  • • Restarting every: Ship Mode 

     This will produce the following visualization: We count the Ship Mode and Category combinations at the deepest level(Category) but we restart at every Ship Mode, therefore we count First Class and Furniture, First Class and Office Supplies, then First Class and Technology. Then we move on to Same Day and its combinations, restarting at 1 again.

In the 8th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • • Addressing: Ship Mode, Category
  • • At the level: Ship Mode
    (At the level allows the author to choose what level to increment at)
  • • Restarting every: None

     This will produce the following visualization: We count the Ship Mode and Category combinations at the Ship Mode level.

      Partitioning and addressing have an impact on how the table is calculated, the result of which is populated in the structure of the view

     Now, let's consider some of the possibilities presented here in more detail. Some of the options are identical. In fact, out of the nine options, only four are unique. Let's consider examples 1, 4, and 8, each of which have identical end results. Does this mean that each is truly identical? Options 1 and 4 are identical. Option 8, however, is slightly different. To understand this, note the description within the table calculation dialog box for option 8: 

     The description in example 8 reads Results are computed along Ship Mode, Category (Figure 5.29). The text in the description box for example 1 is identical to that for example 4: Results are computed along Ship Mode for each Category, which can be translated as partitioning Category and addressing Ship Mode. This translation is identical to the actual partitioning/addressing setup accomplished in example 1. Therefore, examples 1 and 4 are identical. But does the slight difference to example 8 mean there are practical differences? No! 

Partitioning and addressing with three dimensions

     Let's add another dimension to our playground and re-order things slightly. Place

  • Category and Region on the Rows shelf
  • and Ship Mode on the Columns shelf.
  • Index should remain on the Label shelf.
  • Also add two filters.
    ​​​​​​​
    • Filter Region to East, South, and West.
       
    • Filter Ship Mode to First Class, Second Class, and Standard Class

     When partitioning and addressing three dimensions, the number of possible iterations jumps to 57; however, only 14 of these are unique. Here is a listing of those unique possibilities. 

In the 1st example, configure the partitioning and addressing settings thus: 

  • • Partitioning: Category, Region
    (the table calculation is recalculated for new partition)
  • • Addressing: Ship Mode(Compute Using | Ship Mode)
  • • At the level: -
  • • Restarting every: - 

Here we count the different Ship Modes per Category and Region combinations

In the 2nd example, configure the partitioning and addressing settings thus:

  • • Partitioning: Category, Ship Mode
  • • Addressing: Region
  • • At the level: -
  • • Restarting every: -

This will produce the following visualization: Here we count the different Regions per Category and Ship Mode combinations.

==>

In the 3rd example, configure the partitioning and addressing settings thus:

  • • Partitioning: Region, Ship Mode
  • • Addressing: Category
  • • At the level: -
  • • Restarting every: -

Here we count the different Categories per Region and Ship Mode combinations

==>In the 4th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: Category
  • • Addressing: Region, Ship Mode
  • • At the level: Deepest(Ship Mode)
  • • Restarting every: None 

Here we count the different Categories per Region and Ship Mode combinations at the deepest level(Ship Mode)

==>In the 5th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: Category
  • • Addressing: Ship Mode, Region
  • • At the level: Deepest(Region)
  • • Restarting every: None 

Here we count the different Categories per Ship Mode and Region combinations at the deepest level. 

==>

In the 6th example, configure the partitioning and addressing settings thus:

  • • Partitioning: Region
  • • Addressing: Category, Ship Mode
  • • At the level: Deepest(Ship Mode)
  • • Restarting every: None 

Here we count the different Regions per Category and Ship Mode combinations at the deepest level.
==>

In the 7th example, configure the partitioning and addressing settings thus:

  • • Partitioning: Ship Mode
  • • Addressing: Category, Region
  • • At the level: Deepest(Region)
  • • Restarting every: None

Here we count the different Ship Modes per Category and Region combinations at the deepest level. 

 ==>

In the 8th example, configure the partitioning and addressing settings thus:

  • • Partitioning: Ship Mode
  • • Addressing: Region, Category
  • • At the level: Deepest(Category)
  • • Restarting every: None

Here we count the different Ship Modes per Region and Category combinations at the deepest level. 

==>In the 9th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • • Addressing: Category, Ship Mode, Region
  • • At the level: Deepest(Region)
  • • Restarting every: None 

Here we count the Category, Ship Mode, and Region combinations at the deepest level. 

==>

In the 10th example, configure the partitioning and addressing settings thus:

  • • Partitioning: -
  • • Addressing: Ship Mode, Category, Region
  • • At the level: Deepest(Region)
  • • Restarting every: None 

Here we count the Ship Mode, Category, and Region combinations at the deepest level.==>

In the 11th example, configure the partitioning and addressing settings thus:

  • • Partitioning: -
  • • Addressing: Ship Mode, Category, Region
  • • At the level: Category
  • • Restarting every: None 

Here we count the Ship Mode, Category, and Region combinations at the Category level.Since the Category level is higher in the hierarchy than Region, we end up counting only one value per Category for each type of Ship Mode.
==>

In the 12th example, configure the partitioning and addressing settings thus:

  • • Partitioning: -
  • • Addressing: Ship Mode, Region, Category
  • • At the level: Deepest(Category)
  • • Restarting every: None 

Here we count the Ship Mode, Region, and Category combinations at the deepest level.==>

In the 13th example, configure the partitioning and addressing settings thus: 

  • • Partitioning: -
  • • Addressing: Ship Mode, Region, Category
  • • At the level: Deepest(Category)
  • • Restarting every: Ship Mode 

Here we count the Ship Mode, Region, and Category combinations at the deepest level,but we restart at every Ship Mode. 

==>

In the 14th example, configure the partitioning and addressing settings thus:

  • • Partitioning: -
  • • Addressing: Region, Ship Mode, Category
  • • At the level: Deepest(Category)
  • • Restarting every: None 

Here we count the Region, Ship Mode, and Category combinations at the deepest level,

 ==>

     We will not address the various instances of these possibilities. Instead, the reader is encouraged to recreate these 14 possibilities in Tableau in order to solidify their understanding of partitioning and addressing. Even better, consider recreating all 57 possible iterations and working to understand how Tableau is producing each end result. The process may be tedious, but the resulting understanding is invaluable, allowing the user to understand the things that Tableau is doing in the background when computing. 

Create a calculated field: 

Profitable?

([Profit])>0

Edit table calculation for % Total Sales in Rows Shelf and Label Shelf 

 

Summary

In this chapter, we explored the inner workings of table calculations. We began by considering two questions:

  • • What is the function?
  • • How is the function applied?

     As we explored these two questions, we surveyed each unique group of table calculation functions with the exception of the Script_ functions, which will be covered in more detail in Chapter 15, Programming Tool Integration. We also learned how to apply these functions to a view through partitioning and addressing, where partitioning can be seen as the scope and addressing as the direction of the calculation.

     We have seen examples where we counted, for the 1st example, the scope(partitioning) as Category and Region, and the order of counting was Ship Mode(addressing). Therefore, we would start with the first Category and Region and set 1 for the Ship Mode. We would then continue with the next Region in that same Category
and count 2 for the next Ship Mode and so on. The order of how dimensions are being addressed is important and yields different results.

     Using the knowledge we've picked up in this chapter of addressing and partitioning, in the next chapter, we will explore data densification, cubes, and big data.

  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值