Friday, 15 May 2015

Designing Data Cubes - Fact or Dimension Quiz Answers

Have you taken the Facts or Dimensions quiz?  Here are some explanations for the answers.

Note that the quiz website randomizes the question order, so your quiz will be in a different order than the list below.

Quiz Answers

  • Customer - Dimension
    This is a data element that we would want to use to slice 'n' dice our data, such as grouping sales dollars by customer. Therefore it must be a dimension.
  • Invoice - Fact
    An invoice represents a sales transaction, and transactional data elements are facts.
  • Region
    This is a common dimension that permits geographical slicing of the data.
  • Fiscal Year
    Time is the most common dimension in cubes. Whether it's calendar or fiscal time periods, time will always be a dimension.
  • Company
    If a corporation owns multiple companies, then there would be a need to slice the data by company. This must be a dimension.
  • Cost Center
    Cost centers are similar to departments but typically at a lower level of detail, such as a function or project. Since departments are modeled as a dimension, then cost centers must be a dimension as well.
  • Asset Purchases
    As discussed in the blog article, purchases are transactions and are therefore facts.
  • Basket Ball Sales
    Sales are always transactions, regardless of the item being sold.
    (And yes, data cubes are flexible enough to store data on round objects!)
  • Currency Conversion Rate
    Currency rates change rapidly, so that means it must be a fact.
  • Product Inventory Counts
    Inventory counts are numeric and would need to be aggregated, so they need to be modeled as a fact.
  • Financial Account
    Financial accounts are categories such as Office Supplies Expense or Interest Earned. The finances always need to be aggregated by accounts for financial reporting (i.e. slice the data by account), so accounts will always be a dimension.
  • Insurance Policy Type
    Type implies a category of some kind. Even though we don't know what policy types might be included here (e.g. auto or home, personal or business, etc.) we know that there will always be fewer policy types than policies themselves. Since type would be used to slice the policy transactions in some way, it must be a dimension.
  • Patents
    This is an unusual element to include in a data cube, but it's possible. Patents are very long-lasting, being in effect for multiple decades. If royalties are being paid to us by other companies who are using our patent, then those royalty transactions could be grouped by patent so we could see which ones are contributing the most royalty revenues. Patents have to be a dimension.
  • Contracts
    This is a difficult one to model because businesses can apply its meaning broadly enough to make it behave as a dimension or a fact.
    The most common application of a contract would be as a dimension, where contract is a legal container for many invoices to be charged against it. Think of an office building construction project. The entire project may last for years, and a contract will exist for all the truckloads of cement required and another contract for all the electrical installation work, etc. Many invoices will be generated over the years under each contract. Clearly contract is a dimension.
    However, now think of a self-employed handyman who does bathroom renovations for home owners. Each contract lasts 1-4 weeks and only one invoice is ever billed per contract. In that case, contract is essentially equivalent to an invoice and it behaves as a transaction. You would never want to group data by contract because there is only ever one transaction per contract. There is no data to aggregate. Even though these contracts are still functioning as legal agreements, in this situation contract must be treated as a fact.
    This is a great example of how a single business term can mean different things in different organizations, and why it is so important to understand your organization's specific requirements before assuming how the data model and cubes should look.
  • Employee Satisfaction Survey Responses
    Survey responses are usually numerous (at least you hope for that -- not many people fill them out!) and they typically contain responses that can be tallied numerically. Numeric data like this must be modeled as a fact. The tricky part with this data element is that it can be long-lasting, in that the survey may only be taken once per year. Even though it is not a frequent transaction, it is still a transaction. A similar example would be asset depreciation which also may only be calculated once per year. Infrequent transactions are still transactions and are therefore must be modeled as facts.

Click here to return to the original article.

No comments:

Post a Comment