- Nov 29, 2010
Understanding Date-Based Axis Versus Category-Based Axis in Trend Charts
Excel offers two types of horizontal axes in a trend chart. Having the proper setting can ensure that your message is accurate.
If the spacing of events along the time axis is uniform, it does not matter whether you choose a date-based axis or a text-based axis because the results will be the same. When this occurs, it is fine to allow Excel to choose the type of axis automatically.
However, if the spacing of events along the time axis is haphazard, you definitely want to make sure that Excel uses a date-based axis.
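To see why the axis type matters, here is a minimal Python sketch of how the two axis types could place points across the width of a chart. The function names and the three sample dates are made up for illustration, and this is not how Excel is implemented internally; it only mirrors the behavior described above.

```python
from datetime import date

def category_positions(dates):
    """Text (category) axis: every point gets an equal share of the width."""
    n = len(dates)
    return [i / (n - 1) for i in range(n)]

def date_positions(dates):
    """Date axis: each point is placed in proportion to elapsed days."""
    first, last = dates[0], dates[-1]
    span = (last - first).days
    return [(d - first).days / span for d in dates]

# Hypothetical, unevenly spaced observations (sorted, at least two points).
observations = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 11)]

print(category_positions(observations))  # [0.0, 0.5, 1.0]
print(date_positions(observations))      # [0.0, 0.1, 1.0]
```

On the text-based axis, the one-day gap and the nine-day gap look identical. On the date-based axis, the second point sits where it belongs, one-tenth of the way across.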
Usually, if your data contains dates, Excel defaults to a date-based axis. However, you should always check to make sure Excel is using the correct type of axis. A number of potential problems force Excel to choose a text-based axis instead of a date-based axis. For example, Excel will choose a text-based axis when dates are stored as text in a spreadsheet and when dates are represented by numeric years. The list following Figure 3.7 summarizes other potential problems.
Figure 3.7 You can explicitly choose an axis type rather than letting Excel choose the default.
To explicitly choose an axis type, follow these steps:
- Right-click the horizontal axis and select Format Axis.
- In the Format Axis dialog box that appears, select the Axis Options category.
- As appropriate, choose either Text Axis or Date Axis from the Axis Type section (see Figure 3.7).
A number of complications that require special handling can occur with date fields. The following are some of the problems you might encounter:
- Dates stored as text—If dates are stored as text dates instead of real dates, a date-based axis will never work. You have to use date functions to convert the text dates to real dates.
- Dates represented by numeric years—Trend charts can have category values of 2008, 2009, 2010, and so on. Excel does not naturally recognize these as dates, but you can trick it into doing so. Read "Plotting Data by Numeric Year" near Figure 3.15 in this chapter.
- Dates before 1900—If your company is old enough to chart historical trends before January 1, 1900, you will have a problem. In Excel's world, there are no dates before 1900. For a workaround, read "Using Dates Before 1900" around Figure 3.16.
- Dates that are really time—It is not difficult to imagine charts in which the horizontal axis contains periodic times throughout a day. For example, you might use a chart like this to show the number of people entering a bank. For such a chart, you need a time-based axis, but Excel will group all of the times from a single day into a single point. See "Using a Workaround to Display a Time-Scale Axis" near Figure 3.19 for the rather complex steps needed to plot data by periods smaller than a day.
Each of these problem situations is discussed in the following sections.
Converting Text Dates to Dates
If your cells contain text that looks like dates, the date-based axis will not work. The data in Figure 3.8 came from a legacy computer system. The dates were imported as text instead of as real dates.
Figure 3.8 These dates are really text, as indicated by the apostrophe before the date in the formula bar.
This is a frustrating problem because text dates look exactly like real dates. You may not notice that they are text dates until you see that changing the axis to a date-based axis has no effect on the axis spacing.
If you select a cell that looks like a date cell, look in the formula bar to see whether there is an apostrophe before the date. If so, you know you have text dates (refer to Figure 3.8). This is Excel's arcane code to indicate that a date or number should be stored as text instead of a number.
Understanding How Excel Stores Dates and Time
On a Windows PC, Excel stores dates as serial numbers that count days starting with January 1, 1900, as day 1. For a date such as 2/17/2011, Excel actually stores the value 40,591, but it formats the date to show you a value such as 02/17/2011.
On a Mac running Mac OS, Excel stores the dates as the number of days since January 1, 1904. The original designers of the Mac OS were trying to squeeze the OS into 64K of ROM. Because every byte mattered, it seemed unnecessary to add a couple lines of code to handle the fact that 1900 is not a leap year. Excel for the Mac adopted the 1904 convention. On a Mac, 2/17/2011 is stored as 39,129.
Figure 3.9 Many groups on the ribbon have this tiny More icon in the lower-right corner. Clicking this icon leads to the legacy dialog box.
Excel for Windows, which needed to be compatible with Lotus 1-2-3, adopted the 1900 convention. As demonstrated in the next case study, the 1900 convention incorrectly made 1900 a leap year.
Excel provides a complete complement of functions to deal with dates including functions that convert data from text to dates and back. Excel stores times as decimal fractions of days. For example, you can enter noon today as =TODAY()+0.5 and 9 a.m. as =TODAY()+0.375. Again, the number format handles converting the decimals to the appropriate display.
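The serial numbers quoted above can be checked with a short Python sketch. The epoch constants and function names are my own choices for illustration. The 1900 epoch is set to December 30, 1899, which is a common trick to absorb the nonexistent February 29, 1900, so the result matches Excel only for dates after February 28, 1900.

```python
from datetime import date

# Assumed epochs (day 0 of each system). Using 1899-12-30 compensates for
# the phantom leap day that the 1900 system counts in February 1900.
EPOCH_1900 = date(1899, 12, 30)
EPOCH_1904 = date(1904, 1, 1)

def serial_1900(d):
    """Serial number in the Windows (1900) date system."""
    return (d - EPOCH_1900).days

def serial_1904(d):
    """Serial number in the Mac (1904) date system."""
    return (d - EPOCH_1904).days

def time_fraction(hour, minute=0):
    """Time of day as a decimal fraction of a 24-hour day."""
    return (hour * 60 + minute) / 1440

sample = date(2011, 2, 17)
print(serial_1900(sample))   # 40591
print(serial_1904(sample))   # 39129
print(time_fraction(12))     # 0.5   (noon)
print(time_fraction(9))      # 0.375 (9 a.m.)
```

The two systems always differ by 1,462 days for the same calendar date.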
Converting Text Dates to Real Dates
The DATEVALUE function converts text that looks like a date into the equivalent serial number. You can then use the Format Cells dialog to display the number as a date.
The text version of a date can take a number of different formats. For example, your international date settings might call for a month/day/year arrangement of the dates. Figure 3.10 shows a number of valid text formats that can be converted with the DATEVALUE function.
Figure 3.10 The DATEVALUE function can handle any of the date formats in Column J.
Figure 3.11 shows a column of text dates. Follow these steps to convert the text dates to real dates:
- Insert a blank Column B by selecting cell B1. Select Home, Insert, Insert Sheet Columns. Alternatively, you can use the Excel 2003 shortcut Alt+I+C.
- In cell B2, enter the formula =DATEVALUE(A2). Excel displays a number in the 40,000 range in cell B2. You are halfway to the result (see Figure 3.11). You still have to format the result as a date.
- Double-click the fill handle in the lower-right corner of cell B2. Excel copies the formula from cell B2 down to your range of dates.
Figure 3.11 The result of the DATEVALUE function is a serial number.
- Select Column B. On the Home tab, select the drop-down at the top of the Number group and choose either Short Date or Long Date. Excel displays the numbers in Column B as dates (see Figure 3.12). Alternatively, you can press Ctrl+1 and select any date format from the Number tab.
Figure 3.12 Choose a date format from the Number drop-down on the Home tab.
- To convert the live formulas in Column B to be static values, while the range of dates in Column B is selected, press Ctrl+C to copy. Press Ctrl+V to paste. Press Ctrl to open the Paste Options dialog. Press V to paste as values.
- Delete the original column A.
After converting the text dates to real dates, insert a line chart with markers. Excel automatically formats the chart with a date-based axis. In Figure 3.13, the top chart reflects cells that contain text dates. The bottom chart uses cells in which the text dates have been converted to numeric dates.
Figure 3.13 When your original data contains real dates, Excel automatically chooses a more accurate date-based axis. The bottom chart reflects a date-based axis.
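If you clean the data outside Excel before charting it, the same text-to-date conversion can be sketched in Python. The list of accepted formats below is an assumption for illustration and is not the set that DATEVALUE recognizes; month names are parsed using the default English locale.

```python
from datetime import date, datetime

# Assumed formats for illustration only; extend the list to match your data.
ASSUMED_FORMATS = ["%m/%d/%Y", "%d-%b-%Y", "%B %d, %Y"]

def is_text_date(cell_value):
    """A value imported as text arrives as a string, not a date object."""
    return isinstance(cell_value, str)

def text_to_date(text):
    """Try each assumed format in turn and return the first match."""
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized text date: {text!r}")

print(text_to_date("2/17/2011"))          # 2011-02-17
print(text_to_date("17-Feb-2011"))        # 2011-02-17
print(text_to_date("February 17, 2011"))  # 2011-02-17
```

Raising an error for unrecognized text is deliberate: a silently skipped date would leave a gap in the chart that is hard to spot.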
Converting Bizarre Text Dates to Real Dates
When you rely on others for source data, you are likely to encounter dates in all sorts of bizarre formats. For example, while gathering data for this book, I found a dataset where each date was listed as a range of dates. Each date was in the format 2/4-6/11. I had to check with the author of the data to find out if they meant February 4th through 6th of 2011 or if they meant February 4th through June 11th. They meant the former.
Used in combination, the functions listed below can be useful when you are converting strange text dates to real dates:
- =DATE(2011,12,31)—Returns the serial number for December 31, 2011.
- =LEFT(A1,2)—Returns the two leftmost characters from cell A1.
- =RIGHT(A1,2)—Returns the two rightmost characters from cell A1.
- =MID(A1,3,2)—Returns the third and fourth characters from cell A1. You read the function as "return the middle characters from A1, starting at character position 3, for a length of 2."
- =FIND("/",A1)—Finds the position number of the first slash within A1.
Follow these steps to convert the text date ranges shown in Figure 3.14 to real dates:
Figure 3.14 A mix of LEFT, RIGHT, MID, and FIND functions parses this text for use in the DATE function.
- Because the year is always the two rightmost characters in column A, enter the formula =RIGHT(A2,2) in cell B2.
- Because the month is the leftmost one or two characters in column A, ask Excel to find the first slash and then return the characters to the left of the slash. Enter =FIND("/",A2) to find the position of the slash; for a single-digit month, this returns 2. Use =LEFT(A2,FIND("/",A2)-1) to get the proper month number.
- For the day, choose to extract either the first or the last date of the range. To extract the first date, ask for the middle characters, starting one position after the slash. The logic to figure out whether you need one or two characters is a bit more complicated. Find the position of the dash, subtract the position of the slash, and then subtract 1. Therefore, use this formula in cell D2:
=MID(A2,FIND("/",A2)+1,FIND("-",A2)-FIND("/",A2)-1)
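- Use the DATE function as follows in cell E2 to produce an actual date. This assumes that the month formula is in cell C2 and that every two-digit year belongs to the 2000s:
=DATE(2000+B2,C2,D2)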
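The same parsing logic can be sketched in Python for a date range such as 2/4-6/11. The function name and the default century are assumptions for illustration; the slicing mirrors the LEFT, MID, RIGHT, and FIND steps described above.

```python
from datetime import date

def first_date_of_range(text, century=2000):
    """Return the first date of a range written as m/d-d/yy."""
    slash = text.find("/")            # like FIND("/",A2), but 0-based
    dash = text.find("-")             # like FIND("-",A2), but 0-based
    month = int(text[:slash])         # LEFT: everything before the slash
    day = int(text[slash + 1:dash])   # MID: between the slash and the dash
    year = century + int(text[-2:])   # RIGHT: the last two characters
    return date(year, month, day)

print(first_date_of_range("2/4-6/11"))     # 2011-02-04
print(first_date_of_range("12/28-30/11"))  # 2011-12-28
```

Because the month and day are located relative to the slash and the dash, the sketch handles both one-digit and two-digit values.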
Plotting Data by Numeric Year
If you are plotting data where the only identifier is a numeric year, Excel does not automatically recognize this field as a date field.
For example, in Figure 3.15 data is plotted once a decade for the past 50 years and then yearly for the past decade. Column A contains four-digit years such as 1960, 1970, and so on. The default chart shown in the top of the figure does not create a date-based axis. You know this to be true because the distance from 1960 to 1970 is the same as the distance from 2000 to 2001.
Figure 3.15 Excel does not recognize years as dates.
Listed here are two solutions to this problem:
- Convert the years in column A to dates by using =DATE(A2,12,31). Format the resulting value with a yyyy custom number format. Excel displays 2005 but actually stores the serial number for December 31, 2005.
- Convert the horizontal axis to a date-based axis. Excel thinks your chart is plotting daily dates from May 13, 1905, through July 2, 1905. Because no date format has been applied to the cells, they show up as the serial numbers 1960 through 2010. Excel displays the chart properly, even though the settings show that the base units are days.
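A short Python sketch shows why a bare year lands in 1905 when it is read as a date serial number. The epoch constant is the usual stand-in for the Windows 1900 date system, valid for serial numbers above 60, and the function names are illustrative.

```python
from datetime import date, timedelta

# Assumed epoch that reproduces Excel's 1900 date system for serials > 60.
EPOCH_1900 = date(1899, 12, 30)

def serial_to_date(serial):
    """Interpret a plain number the way a date axis would."""
    return EPOCH_1900 + timedelta(days=serial)

def year_to_year_end(year):
    """The first solution: turn a numeric year into a real date."""
    return date(year, 12, 31)

print(serial_to_date(1960))    # 1905-05-13
print(serial_to_date(2010))    # 1905-07-02
print(year_to_year_end(2005))  # 2005-12-31
```

A year difference of 10 becomes a gap of 10 days, so the relative spacing along the axis is still correct.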
Using Dates Before 1900
In Excel 2010, dates from January 1, 1900 through December 31, 9999 are recognized as valid dates. However, if your company was founded more than a demisesquicentennial before Microsoft was founded, you will potentially have company history going back before 1900.
Figure 3.16 shows a dataset stretching from 1787 through 1959. The accompanying chart would lead the reader to believe that the number of states in the United States grew at a constant rate. This inaccurate statement would cause Mr. Kessel, my eighth-grade geography teacher, to give me an F for this book.
Figure 3.16 Dates from before 1900 are not valid Excel dates. A date-based axis is not possible in this case.
As mentioned previously, formatting the chart to have a date-based axis will not work because Excel does not recognize dates before 1900 as valid dates. Possible workarounds are discussed in the next two subsections.
Using Date-Based Axis with Dates Before 1900 Spanning Less Than 100 Years
In Figure 3.17, the dates in Column A are text dates from the 1800s. Excel cannot automatically deal with dates from the 1800s, but it can deal with dates from the 1900s.
Figure 3.17 Transforming the 1800s dates to 1900s dates and clever formatting allows Excel to plot this data with a date axis.
One solution is to transform the dates to dates in the valid range of dates that Excel can recognize. You can use a date format with two years and a good title on the chart to explain that the dates are from the 1800s. However, keep in mind that this solution fails when you are trying to display more than 100 years of data points.
To create the chart in Figure 3.17, follow these steps:
- Insert a blank Column B to hold the transformed dates.
- Enter the formula =DATE(100+RIGHT(A4,4),LEFT(A4,2),MID(A4,4,2)) in cell B4. This formula converts the 1836 date to a 1936 date.
- Select cell B4. Press Ctrl+1 to open the Format Cells dialog. Select the date format 3/14/01 from the Date category on the Number tab. This formats the 1936 date as 6/15/36. Later, you will add a title to indicate that the dates in this column are from the 1800s.
- Double-click the fill handle in cell B4 to copy the formula down to all cells.
- Select the range B3:C17.
- From the Insert tab, select Charts, Line, 2-D Line, Line.
- From the Layout tab, select Legend, No Legend.
- Right-click the vertical axis along the left side of the chart and select Format Axis from the context menu.
- In the Format Axis dialog that appears, on the Axis Options page, select the Fixed option button next to Minimum and enter a fixed value of 20.
- Without closing the Format Axis dialog, click the dates in the horizontal axis in the chart. Excel automatically switches to formatting the horizontal axis, and the settings in the Format Axis dialog redraw to show the settings for the horizontal axis. In the Axis Type section, select Date Axis. Click Close to close the dialog box.
- From the Layout tab, select Chart Title, Centered Overlay Title.
- Click the State Count title. Type Westward Expansion, press Enter, type During 1845-1875 Added 13, press Enter, and type New States to the Union. Click outside the title to exit Text Edit mode.
- Click the title once. You should have a solid selection rectangle around the title. On the Home tab, click the Decrease Font Size button. Click the Left Align button.
- Carefully click the border of the title. Drag it so the title appears in the top-left corner of the chart.
- Select the dates in B4:B17. Press Ctrl+1 to access the Format Cells dialog. On the Number tab, click the Custom category. Type the custom number format 'yy. This changes the values shown along the horizontal axis from m/d/yy format to show a two-digit year preceded by an apostrophe.
The result is the chart shown in Figure 3.17. The reader may believe that the chart is showing dates in the 1800s, but Excel is actually showing dates in the 1900s.
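The date transformation and the 'yy label can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes text dates in mm/dd/yyyy form, which is what the LEFT, MID, and RIGHT positions in the formula imply.

```python
from datetime import date

def shift_century(text_date, years=100):
    """Move an mm/dd/yyyy text date forward so it lands in a valid range."""
    month, day, year = (int(part) for part in text_date.split("/"))
    return date(year + years, month, day)

shifted = shift_century("06/15/1836")
print(shifted)                  # 1936-06-15
print(shifted.strftime("'%y"))  # '36
```

The label hides the century, which is why the chart title has to tell the reader that the dates are from the 1800s.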
Using Date-Based Axis with Dates Before 1900 Spanning More Than 100 Years
Microsoft Excel 2010 doesn't do well with large datasets that span 100+ years. Although I managed to create a date-based axis covering 630 years with 10 data points, a dataset covering 102 years and 40 points cannot display a date-based axis.
However, as Figure 3.18 shows, it is possible to create this chart. To do so, you must transform the date axis into a scale that shows months, hide the axis, and then add your own axis using text boxes. These steps are not for the faint of heart.
Figure 3.18 This chart appears to show a date-based axis that spans 200+ years.
First, you need to transform the dates from the 1800s to the 1900s. Next, you will transform the dates spanning 172 years into a range where each month in real time is represented by a single day. This results in a time span of 6 years. You then need to use care to completely hide the labels along the horizontal axis and replace them with text boxes showing the centuries. Lastly, you add a new data series to draw vertical lines at the change of each century.
To create the chart in Figure 3.18, follow these steps:
- Insert new Columns B and C.
- In cell B4, enter the formula =DATE(113+RIGHT(A4,4),LEFT(A4,2),MID(A4,4,2)). This transforms the dates from 1787 to a valid Excel date in 1900. Format this cell with a short date format.
- In cell C4, type the formula =(YEAR(B4)-1899)*12+MONTH(B4) to calculate a number of months. Format this cell as a short date. This formula compresses 172 years into 172 x 12 = 2,064 days, where each day represents 1 month of real time.
- Select cells B4:C4 and double-click the fill handle to copy the formula down to your range of data. The dates in Column B span 1900 to 2072. The dates in Column C span 1900 to 1907. Although the relative position of the data points is correct, you have to hide the axis labels that Excel draws in for the horizontal axis. Therefore, it would be helpful to draw in vertical lines to show where the axis switches from the 1700s to the 1800s. Then draw another line to show where the axis switches from the 1800s to the 1900s.
- Insert a new Column E to hold the data for the second series. This series contains just two nonzero points: one at 1800 and one at 1900. Enter the heading Divide Line in cell E3.
- Look through the dates in Column A. Insert a new row before the first date in the 1800s. In this new row, enter 01/01/1800 in Column A. Copy the formulas in Columns B and C. In Column D, copy the point from the row above. In Column E, enter the value 50. This draws a single vertical bar from the horizontal axis up to a height of 50.
- Repeat the previous step to add a new data point for January 1, 1900, and January 1, 2000.
- Select C4:E55.
- From the Insert tab, select Charts, Line, Line.
- On the Layout tab, select Legend, None.
- Right-click the numbers along the vertical axis and then select Format Axis. Change the Maximum option button to Fixed and enter the value 50. This changes the vertical axis to show from 0 to 50.
- On the Layout tab, use the Current Selection drop-down to select Series "Divide Line". Note that there are now only two data points selected in the chart.
- On the Design tab, select Change Chart Type. Select the first icon in the column section, which is a clustered column chart. This draws narrow columns (actually lines) at 1800 and 1900 on the chart. Note that the chart type change affects only the second series because you selected the Divide Line series in the previous step.
- Click the labels along the horizontal axis. These labels show wrong dates such as 1/23/02. On the Home tab, from the Font Color drop-down select a white font. This causes the axis labels to disappear.
- On the Insert tab, click the Text Box icon. On the chart, draw a text box from the 1800 line to the 1900 line, just below the horizontal axis. The mouse pointer changes into a crosshairs as you draw. Make sure the vertical line in the crosshairs corresponds to the vertical dividing lines. After you create the text box, a flashing cursor appears inside the text box.
- Type 1800s. Click the edge of the text box to change it from a dashed line to a solid line.
- While the text box is selected, select Center Align from the Home tab. Select Vertical Center Align. Select Increase Font Size from the Home tab.
- While the text box is still selected, select Format, Shape Outline, Black on the Layout tab in order to outline the text box.
- Click the text box and start to drag to the right. After you start to drag, hold down the Shift key to constrain the movement to the right. Hold down the Ctrl key to make an identical copy of the text box. When the left edge of the new text box is aligned with the vertical line at 1900, release the mouse button.
- Click in the text box and change the text from 1800s to 1900s.
- On the Layout tab, select Chart Title, Centered Overlay Title. When the title Chart Title appears, it is selected.
- Click inside the Chart Title text area to enter Text Entry mode. Overwrite the default text in the title by typing Growth of USA, press Enter, type by # of States, press Enter, and type 1787-1999.
- Click the border of the chart title to exit Text Entry mode.
- Drag the chart title to a new location in the lower-right corner of the chart.
The result is a chart that appears to be a line chart spanning 217 years. The line is scaled appropriately using a date-based axis.
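The month-to-day compression used in Columns B and C can be sketched in Python. The function name is hypothetical, the sample dates are for illustration, and the sketch assumes mm/dd/yyyy text dates. It computes the month count directly from the year and month, so it never has to build a shifted date.

```python
def months_as_days(text_date, year_shift=113):
    """Mirror =(YEAR(B4)-1899)*12+MONTH(B4) after shifting the year by 113."""
    month, _day, year = (int(part) for part in text_date.split("/"))
    shifted_year = year + year_shift   # 1787 becomes 1900
    return (shifted_year - 1899) * 12 + month

print(months_as_days("12/07/1787"))  # 24
print(months_as_days("01/01/1800"))  # 169
print(months_as_days("01/01/1900"))  # 1369
print(172 * 12)                      # 2064
```

Each result is a small serial number, so Excel treats it as a day early in the 1900s, and consecutive months sit exactly one day apart on the axis.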
Using a Workaround to Display a Time-Scale Axis
The developers who create Microsoft Excel are careful in the Format Axis dialog box to call the option a date axis. However, the technical writers who write Excel Help refer to a time-scale axis. The developers get a point here for accuracy because Excel absolutely cannot natively handle an axis that is based on time.
A worksheet in the download files is used to analyze queuing times. In Column A, it logs the time that customers entered a busy bank. Times range from when the bank opened at 10 a.m. until the bank closed at 4 p.m.
After you enter planned staffing levels in Column C, the model calculates when the customer will move from the queue to an open teller window and when he or she will leave the window based on an average of three minutes per transaction.
Data in Columns I:M record the number of people in the bank every time someone enters or leaves. This data is definitely not spaced equally. Only a few customers arrive in the 10:00 hour, while many customers enter the bank during the lunch hour.
The top chart in Figure 3.19 plots the number of customers on a text-based axis. Because each customer arrival or departure merits a new point, the one hour from noon until 1 p.m. takes up 41 percent of the horizontal width of the chart. In reality, this 1-hour period is one-sixth of the day and merits only about 17 percent of the chart. This sounds like a perfect use for a time-series axis, right? Read on for the answer.
Figure 3.19 Excel cannot show a time-series axis that contains times.
The bottom chart is an identical chart where the axis is converted to show the data on a date-based axis. This is a complete disaster. In a date-based axis, all time information is discarded. The entire set of 300 points is plotted in a single vertical line.
The solution to this problem involves converting the hours to a different time scale (similar to the 1800s date example in the preceding section). For example, perhaps each hour could be represented by a single year. Using numbers from a 24-hour clock, the 10:00 hour could be represented by 2010 and the 3:00 hour could be represented by 2015.
In this example, you manipulate the labels along the horizontal axis using a clever custom number format. A few new settings on the Format Axis dialog ensure that an axis label appears every hour.
Follow these steps to create a chart that appears to have a time-based axis:
- In cell L2, enter the following formula to translate the time to a date:
=ROUND(DATE(HOUR(I2)+2000,1,1)+MINUTE(I2)/60*365,0)
Because each hour will represent a single year, the year argument of the DATE function is HOUR(I2)+2000. This returns values from 2010 through 2015. The other arguments in the DATE function are 1 and 1 to return January 1 of the year. Outside the DATE function, the minute of the time cell is scaled up to a number of days within the year, using MINUTE(I2)/60*365. The entire formula is rounded to the nearest integer because Excel would normally ignore any time values.
- Select cell L2. Double-click the fill handle to copy this formula down to all the data points. The result of this formula ranges from January 1, 2010, which represents the customer who walked in at 10 a.m., to 12/25/2015, which represents the customer who walked in at 3:57 p.m.
- Select cells L1:M303.
- From the Insert tab, select Charts, Line, Line with Markers.
- On the Layout tab, select Legend, None. (After studying Software Quality Metrics (SQM) data for Excel 2007, surely Microsoft realizes that 500 million people instantly turn off the legend in every chart that has a single data series.)
- Right-click the labels along the horizontal axis and select Format Axis to display the Format Axis dialog box, where you make the following selections:
- In the Axis Type section, select Date Axis.
- For Major Unit, select Fixed, 1 Years.
- For Minor Unit, select Fixed, 1 Days.
- For Base Unit, select Fixed, Days.
- Click Close to close the Format Axis dialog.
- Return to the transformed dates in Column L. Select L2:L303.
- Press Ctrl+1 to display the Format Cells dialog. On the Number tab, select the Custom category. A custom number format of yy would display 10 for 2010 and 15 for 2015. Instead, use a custom number format of yy":00". This causes Excel to display 10:00 for 2010 and 15:00 for 2015, which is fairly sneaky, eh?
As you see in Figure 3.20, the chart now allocates one-sixth of the horizontal axis to each hour. This is an improvement in accuracy over either of the charts in Figure 3.19. The additional chart in Figure 3.20 uses a similar methodology to show the wait time for each customer who enters the bank. If my bank offered 12-minute wait times, I would be finding a new bank.
Figure 3.20 These charts show the number of customers in the bank and their expected wait times.
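The hour-to-year trick can be sketched in Python. The function names are hypothetical and the sample times are for illustration. The sketch follows the ROUND(DATE(...)) formula with the 365 multiplier, and it rounds halves up because that is how Excel's ROUND treats positive numbers.

```python
import math
from datetime import date, time, timedelta

def time_to_fake_date(t):
    """Map a time of day to a date: the hour becomes a year (10 -> 2010)
    and the minutes become a day offset within that year."""
    year_start = date(2000 + t.hour, 1, 1)
    offset = math.floor(t.minute / 60 * 365 + 0.5)  # round half up
    return year_start + timedelta(days=offset)

def axis_label(d):
    """Mimic the custom number format yy":00"."""
    return d.strftime("%y") + ":00"

print(time_to_fake_date(time(10, 0)))              # 2010-01-01
print(time_to_fake_date(time(12, 30)))             # 2012-07-02
print(axis_label(time_to_fake_date(time(15, 0))))  # 15:00
```

A customer who arrives at 12:30 lands halfway through the fake year 2012, so the point is plotted halfway between the 12:00 and 13:00 labels.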