Important: The GCConnex decommission will not affect GCCollab or GCWiki. Thank you and happy collaborating!

Changes

Jump to navigation Jump to search
no edit summary
Line 10: Line 10:  
In situations such as this model inputs may be best defined as probability distributions rather than an individual number. When more than one probability distribution is used in repeated simulations to calculate an objective outcome this is referred to as a Monte-Carlo experiment. This works by repeatedly and randomly selecting outcomes from the distributions of the various inputs and applying them to the model. Each time this is done, a possible outcome of the model is created. The more times this process is repeated the more observations from the total set of possible outcomes are calculated, until eventually enough data is collected to piece together a probability-distribution of the model output.
 
In situations such as this model inputs may be best defined as probability distributions rather than an individual number. When more than one probability distribution is used in repeated simulations to calculate an objective outcome this is referred to as a Monte-Carlo experiment. This works by repeatedly and randomly selecting outcomes from the distributions of the various inputs and applying them to the model. Each time this is done, a possible outcome of the model is created. The more times this process is repeated the more observations from the total set of possible outcomes are calculated, until eventually enough data is collected to piece together a probability-distribution of the model output.
   −
[[File:Monte.jpg|thumb|1041x1041px|Monte-Carlo Simulation|alt=|none]]
+
[[File:Monte.jpg|Monte-Carlo Simulation|alt=|none|frame]]
      Line 73: Line 73:  
The defined distribution is essentially a discrete distribution where any number of sets of outcomes and their probabilities can be created – as long as the probabilities of all outcomes sum to 1.  This can be useful in situations where there is a variable or outcome associated with a worse case scenario (pessimistic), a most likely scenario (realistic), and a best-case scenario (optimistic) as can be seen in Figure 6.  
 
The defined distribution is essentially a discrete distribution where any number of sets of outcomes and their probabilities can be created – as long as the probabilities of all outcomes sum to 1.  This can be useful in situations where there is a variable or outcome associated with a worse case scenario (pessimistic), a most likely scenario (realistic), and a best-case scenario (optimistic) as can be seen in Figure 6.  
   −
[[File:Fig 6.png|Figure 6 Discrete Defined Distribution|alt=|none|thumb]]
+
[[File:Fig 6.png|Figure 6 Discrete Defined Distribution|alt=|none|thumb|605x605px]]
      Line 116: Line 116:  
==== Entering a Distribution by Form ====
 
==== Entering a Distribution by Form ====
   −
 
+
As an alternative to manually entering a distribution directly into the spreadsheet you can add a distribution by form.  The form can be opened by pressing the “Manage Assumptions” button as in Figure 7. This will open the form shown in Figure 8.
As an alternative to manually entering a distribution directly into the spreadsheet you can add a distribution by form.  The form can be opened by pressing the “Manage Assumptions” button as in Figure 7 (below the picture of a mouse). This will open the form shown in Figure 8.
   
[[File:Fig 8.png|none|thumb|624x624px|Figure 8: Adding a Distribution by Form]]
 
[[File:Fig 8.png|none|thumb|624x624px|Figure 8: Adding a Distribution by Form]]
    
The information to define a distribution can be entered in this form and added to the spreadsheet. In addition, if an entry is added to the “Reference Name” field this name will be assigned to the value cell automatically.  Also, the titles for the parameters of the distributions will change dynamically depending on what distribution type is selected to make it more clear what data is required.
 
The information to define a distribution can be entered in this form and added to the spreadsheet. In addition, if an entry is added to the “Reference Name” field this name will be assigned to the value cell automatically.  Also, the titles for the parameters of the distributions will change dynamically depending on what distribution type is selected to make it more clear what data is required.
 +
       
You can also use this form to edit previously entered distribution. Do this by clicking on the distribution in the list. The fields in the form will now be dynamically linked to the values in the spreadsheet.
 
You can also use this form to edit previously entered distribution. Do this by clicking on the distribution in the list. The fields in the form will now be dynamically linked to the values in the spreadsheet.
   −
 
+
To assist the user in making a defined distribution, which can have many parameters, RRAT has an additional custom form that can be used to help (See Figure 9: Creating a Defined Distribution). This form can be used by entering an outcome/probability combination and then clicking ‘Add Outcome’ repeatedly until all discrete outcomes are added. Once the probability of all outcomes totals 1, the ‘Add Distribution’ button can be clicked and the distribution will be added to the form in the required format (outcome,probability;…). If you are on the last outcome, you can automatically assign the remaining probability of an event happening to that outcome by pressing the ''''Last Outcome'''<nowiki/>' button. For instance if you had already added 2 outcomes each with a probability of 20%, the remaining probability (100%-20%*2 = 40%) could be assigned to it by using this button. 
To assist the user in making a defined distribution, which can have many parameters, RRAT has an additional custom form that can be used to help (See Figure 9: Creating a Defined Distribution). This form can be used by entering an outcome/probability combination and then clicking ‘Add Outcome’ repeatedly until all discrete outcomes are added. Once the probability of all outcomes totals 1, the ‘Add Distribution’ button can be clicked and the distribution will be added to the form in the required format (outcome,probability;…).
   
[[File:Fig 9.png|none|frame|Figure 9: Creating a Defined Distribution]]
 
[[File:Fig 9.png|none|frame|Figure 9: Creating a Defined Distribution]]
    
== 1.2    Stage 2: Define Formulas for Dependent Outcomes ==
 
== 1.2    Stage 2: Define Formulas for Dependent Outcomes ==
 
The next stage in setting up a Monte-Carlo simulation is to set up the equations that will be calculated based on the results pulled from the probability distributions defined in the previous stage.  To demonstrate this, imagine we have defined 4 probability distributions as shown in Figure 10.
 
The next stage in setting up a Monte-Carlo simulation is to set up the equations that will be calculated based on the results pulled from the probability distributions defined in the previous stage.  To demonstrate this, imagine we have defined 4 probability distributions as shown in Figure 10.
 +
[[File:Fig 10.png|none|thumb|1101x1101px|Figure 10: Demo Distributions]]
      −
Figure 10: Demo Distributions
+
Here we have defined 4 distributions and one constant that will be used to calculate benefits from an imaginary regulatory proposal that reduces the risk of premature mortality. The distributions are for the population growth rate, the expected reduction in risk of premature mortality annually for the population because of the regulation, the value of a statistical life, the discount rate, and some related program cost.  In this case the discount rate will not actually vary, but this example shows how it is useful to list all the major assumptions in one place. If you refer to the one value here in all of the equations that reference it, then it becomes much easier to change that assumption at a later date, without major alteration to the model and multiple equations that may reference an assumption such as the discount rate.
      −
Here we have defined 4 distributions and one constant that will be used to calculate benefits from an imaginary regulatory proposal that reduces the risk of premature mortality. The distributions are for the population growth rate, the expected reduction in risk of premature mortality annually for the population because of the regulation, the value of a statistical life, the discount rate, and some related program costIn this case the discount rate will not actually vary, but you can see how it is useful to list all the major assumptions in one place. If you refer to the one value here in all of the equations that reference it, then it becomes much easier to change that assumption at a later date.  
+
The cells '''D9:D13,''' where the values pulled from the probability distributions defined for each simulation will be placed have been named '''''POPGrow'', ''RedMort'', ''VSL''''', '''''Discount, and ProgCost''''' respectivelyWhen using the custom form to create distributions, RRAT puts references to these names in row '''''H'''''.   This makes it easier to use these parameter names in the rest of the spreadsheet where calculations will be made to estimate the benefits of the imaginary regulation used in this guide.  This can be seen in Figure 11 in the model where formulas have been calculated using these named cells.
   −
 
+
[[File:Fig 11.png|none|frame|Figure 11: Demo Model]]
The cells '''D9:D13,''' where the values pulled from the probability distributions defined for each simulation will be placed have been named '''''POPGrow'', ''RedMort'', ''VSL''''', '''''Discount, and ProgCost''''' respectively.  When using the custom form to create distributions, RRAT puts references to these names in row '''''H'''''.   This makes it easier to use these parameter names in the rest of the spreadsheet where calculations will be made to estimate the benefits of the regulation.  This can be seen in Figure 11 in the model where formulas have been calculated using these named cells.
  −
 
  −
 
  −
 
  −
 
  −
Figure 11: Demo Model
         
In cell B3 an initial population for Canada of 34 million has been entered.  The rest of the cells are calculated as follows:
 
In cell B3 an initial population for Canada of 34 million has been entered.  The rest of the cells are calculated as follows:
 
+
* the population in 2011 ('''Cell B4):''' “'''''=B3*(1+POPGrow)'''''”
* the     population in 2011 ('''Cell B4):''' “'''''=B3*(1+POPGrow)'''''”
+
* the reduced number of premature fatalities('''C4)''': “'''''=B4*RedMort'''''”
* the     reduced number of premature fatalities('''C4)''': “'''''=B4*RedMort'''''”
+
* the monetized value of the risk reduction ('''D4)''': “'''''=C4*VSL”'''''
* the     monetized value of the risk reduction ('''D4)''': “'''''=C4*VSL”'''''
+
* the discounted value for that benefit ('''Cell E4):''' “'''''=D4/(1+Discount)^(A4-$A$3)"''''' The equations from row 4 are then copied and pasted into the rest of the rows in the table.  At the very bottom right-hand side of the table the present value of the benefit is calculated as the sum of all the cells above it.
* the     discounted value for that benefit ('''Cell     E4):''' “'''''=D4/(1+Discount)^(A4-$A$3)'''''  
  −
 
  −
The equations from row 4 are then copied and pasted into the rest of the rows in the table.  At the very bottom right-hand side of the table the present value of the benefit is calculated as the sum of all the cells above it.
         
Although this is a relatively simple model the combined uncertainty of the underlying parameters makes expressing what we would expect to result a challenge.  We could just give a central estimate based on the most likely or central outcomes of each of the probability distributions defined and report a benefit of $171.86 million, but how likely is that result?  Given what we know about the program cost, how confident could we be that the benefits would exceed these costs?  These are the types of questions running a Monte-Carlo analysis can help us answer.  But before we begin to discuss that we first need to finalize our experiment by selecting what to report in the “'''Model Outputs'''” worksheet.
 
Although this is a relatively simple model the combined uncertainty of the underlying parameters makes expressing what we would expect to result a challenge.  We could just give a central estimate based on the most likely or central outcomes of each of the probability distributions defined and report a benefit of $171.86 million, but how likely is that result?  Given what we know about the program cost, how confident could we be that the benefits would exceed these costs?  These are the types of questions running a Monte-Carlo analysis can help us answer.  But before we begin to discuss that we first need to finalize our experiment by selecting what to report in the “'''Model Outputs'''” worksheet.
 +
== Stage 3: Reporting Outcomes ==
       +
Every time a simulation is run within a Monte-Carlo experiment a new set of model inputs are created from the distributions you create, and a new set of model outputs are calculated.  As the number of simulations needed to get a robust sample of outputs to define a probability distribution will likely run in the thousands the total amount of data such a process generates may be overwhelming. However, not all this data will need to be recorded and it is up to the analyst to decide which sets of inputs and outputs to report.  
   −
 
+
[[File:Fig 12.png|none|frame|Recording Outputs]]
== 1.3    Stage 3: Reporting Outcomes ==
  −
 
  −
 
  −
Every time a simulation is run within a Monte-Carlo experiment a new set of model inputs are created and a new set of model outputs are calculated.  As the number of simulations needed to get a robust sample of outputs to define a probability distribution will likely run in the thousands the total amount of data such a process generates may be overwhelming. However, not all this data will need to be recorded and it is up to the analyst to decide which sets of inputs and outputs to report.  
  −
 
  −
 
  −
Figure 12: Recording Outputs
         
This is done by using the second worksheet titled “'''Outputs'''” (Figure 10).  In this worksheet all calculations from each simulation can be recorded for later analysis. This is done by:
 
This is done by using the second worksheet titled “'''Outputs'''” (Figure 10).  In this worksheet all calculations from each simulation can be recorded for later analysis. This is done by:
 
+
# Entering a description for the output starting in cell '''C5. ''' The first blank cell in row 5 marks the end of reporting information and no data after will be recorded.
# Entering     a description for the output starting in cell '''C5. '''  The first blank cell in row 5 marks the     end of reporting information and no data after will be recorded.
+
# Entering a formula or a reference to the cell that you want to record from each simulation ('''Row 6''').
# Entering     a formula or a reference to the cell that you want to record from each     simulation ('''Row 6''').
+
# (Optional) Entering a customized format for the numbers to be recorded ('''Row 7'''). This formatting information is entered in the same way as you would for customized number formats within Excel (see the Excel help file for more information).
# (Optional)     Entering a customized format for the numbers to be recorded ('''Row 7'''). This formatting     information is entered in the same way as you would for customized number     formats within Excel (see the Excel help file for more information).
         
Continuing with our previous example in Figure 10, here we are reporting on the net present value. This is calculated as the sum of the discounted benefits (Cell '''E14''' from sheet1 which was created in stage 2) minus the program costs.  For this example, program costs have been defined as an additional probability distribution (normal) with a mean of $100 million and a standard deviation of $3 million.  In addition to the net present-value we have requested the program to record information for each simulation of the expected statistical lives saved and the reduced risk of premature mortality.
 
Continuing with our previous example in Figure 10, here we are reporting on the net present value. This is calculated as the sum of the discounted benefits (Cell '''E14''' from sheet1 which was created in stage 2) minus the program costs.  For this example, program costs have been defined as an additional probability distribution (normal) with a mean of $100 million and a standard deviation of $3 million.  In addition to the net present-value we have requested the program to record information for each simulation of the expected statistical lives saved and the reduced risk of premature mortality.
   −
== 1.4    Stage 4: Running the Simulation ==
+
== Stage 4: Running the Simulation ==
      Line 198: Line 183:     
== 2.2   Ranges and Probabilities ==
 
== 2.2   Ranges and Probabilities ==
 +
       
As mentioned, one of the main purposes of Monte-Carlo analysis is to allow for the creation of a probability distribution for model outputs.  This allows the analyst to make probabilistic statements about outcomes.  This almost always involves not just one outcome but a range of outcomes because the probability of any one outcome when dealing with continuous distributions is close to zero.  Thus, it is much more useful to discuss the probability of an outcome occurring between two points (a range) than a single event.    
 
As mentioned, one of the main purposes of Monte-Carlo analysis is to allow for the creation of a probability distribution for model outputs.  This allows the analyst to make probabilistic statements about outcomes.  This almost always involves not just one outcome but a range of outcomes because the probability of any one outcome when dealing with continuous distributions is close to zero.  Thus, it is much more useful to discuss the probability of an outcome occurring between two points (a range) than a single event.    
      
Once a Monte-Carlo simulation has been run we can then ask one of two questions regarding the output distributions from the simulations:
 
Once a Monte-Carlo simulation has been run we can then ask one of two questions regarding the output distributions from the simulations:
Line 207: Line 192:  
# What is the central    range associated with a given probability?
 
# What is the central    range associated with a given probability?
 
# What is the    probability associated with a given range?
 
# What is the    probability associated with a given range?
 +
       
To answer these questions two functions are available (see Figure 14).   
 
To answer these questions two functions are available (see Figure 14).   
 +
       
Figure 14: Range & Probability Functions
 
Figure 14: Range & Probability Functions
 +
      Line 220: Line 208:     
The second function calculates the probability of an outcome occurring within a given range.  In Figure 15 a range has been entered for the Net Present Value with the lower limit being 0 and the upper limit being the maximum value in the data set. Pressing the '''Calculate Probability of a Range''' button then causes the function to count each value in the data set that is greater than zero (and less than the max).  The resulting 95.18% can be interpreted as the probability that the proposed regulation will result in a positive Net Present Value.  
 
The second function calculates the probability of an outcome occurring within a given range.  In Figure 15 a range has been entered for the Net Present Value with the lower limit being 0 and the upper limit being the maximum value in the data set. Pressing the '''Calculate Probability of a Range''' button then causes the function to count each value in the data set that is greater than zero (and less than the max).  The resulting 95.18% can be interpreted as the probability that the proposed regulation will result in a positive Net Present Value.  
      
Figure 15: Probability of a Range
 
Figure 15: Probability of a Range
    
== 2.3   Creating Histograms ==
 
== 2.3   Creating Histograms ==
 +
       
Within Excel’s statistical analysis tools (must be activated as an add-in) there is a function that will automatically create a histogram from data sets such as those produced while running a Monte-Carlo simulation using RRAT.   The Excel tool will automatically count the number of results that occur between ranges. The ranges are called “Bins” and can by pre-defined by the user or will be automatically calculated by Excel. The tool can also be optionally told to produce a chart and calculate cumulative percentages.
 
Within Excel’s statistical analysis tools (must be activated as an add-in) there is a function that will automatically create a histogram from data sets such as those produced while running a Monte-Carlo simulation using RRAT.   The Excel tool will automatically count the number of results that occur between ranges. The ranges are called “Bins” and can by pre-defined by the user or will be automatically calculated by Excel. The tool can also be optionally told to produce a chart and calculate cumulative percentages.
 +
      Line 236: Line 225:  
== 2.4   Exporting Results/ RRAT Options ==
 
== 2.4   Exporting Results/ RRAT Options ==
 
As you may wish to do additional analysis (including those discussed in the next section), RRAT has an option to export the outputs produced by the simulation to a separate spreadsheet. This option can be changed by pressing the ‘Options’ button available at the top of the ‘Model Inputs’ tab. Export data to a separate worksheet will make your main spreadsheet cleaner and preserve simulation results, since each simulation ran in RRAT clears previous results. If you select this option, RRAT can also automatically produce histograms for each distribution output (note: histograms for multiple output distributions are staked one on top of the other in the output sheet, make only the last one visible, until you move them).
 
As you may wish to do additional analysis (including those discussed in the next section), RRAT has an option to export the outputs produced by the simulation to a separate spreadsheet. This option can be changed by pressing the ‘Options’ button available at the top of the ‘Model Inputs’ tab. Export data to a separate worksheet will make your main spreadsheet cleaner and preserve simulation results, since each simulation ran in RRAT clears previous results. If you select this option, RRAT can also automatically produce histograms for each distribution output (note: histograms for multiple output distributions are staked one on top of the other in the output sheet, make only the last one visible, until you move them).
 +
       
The last option available is to turn on or off screen updating. Screen updating will show the spreadsheet, as it will display values as the change each time a simulation is run, which can be fun to watch, but also slows down the overall time it takes for a complete set of simulations to be completed. Setting this option off, while hide any updates as the simulation runs, speeding up the run time. Other options are likely to be added here in the future as RRAT development continues.
 
The last option available is to turn on or off screen updating. Screen updating will show the spreadsheet, as it will display values as the change each time a simulation is run, which can be fun to watch, but also slows down the overall time it takes for a complete set of simulations to be completed. Setting this option off, while hide any updates as the simulation runs, speeding up the run time. Other options are likely to be added here in the future as RRAT development continues.
 +
       
Figure 17: Setting Options
 
Figure 17: Setting Options
      
The last option available within RRAT is associated with the ‘CLEAR ENTRIES BUTTON’. As expected, pressing this button will clear all user entries and data created from a simulation from the spreadsheet.
 
The last option available within RRAT is associated with the ‘CLEAR ENTRIES BUTTON’. As expected, pressing this button will clear all user entries and data created from a simulation from the spreadsheet.
Line 248: Line 238:  
== 2.5   Sensitivity Analysis - Exploring Relations ==
 
== 2.5   Sensitivity Analysis - Exploring Relations ==
 
One of the main purposes of sensitivity analysis is to determine how variability in one assumption impacts the objective function. In terms of regulatory CBA the objective function is usually NPV.  With a Monte-Carlo simulation a huge data set is produced that can be used to perform such sensitivity analysis.  A good way to do this is by producing a scatter chart of the objective function versus the variable or assumption of concern.   
 
One of the main purposes of sensitivity analysis is to determine how variability in one assumption impacts the objective function. In terms of regulatory CBA the objective function is usually NPV.  With a Monte-Carlo simulation a huge data set is produced that can be used to perform such sensitivity analysis.  A good way to do this is by producing a scatter chart of the objective function versus the variable or assumption of concern.   
 +
      Line 253: Line 244:     
Figure 18: The Effect of Risk Reduction Uncertainty on NPV
 
Figure 18: The Effect of Risk Reduction Uncertainty on NPV
 +
       
Finally, of interest is that as the risk reduction increases it also appears that the range of possible outcomes caused by uncertainty in the other assumptions (population growth rate, VSL, and the program cost), while remaining positive, increases.
 
Finally, of interest is that as the risk reduction increases it also appears that the range of possible outcomes caused by uncertainty in the other assumptions (population growth rate, VSL, and the program cost), while remaining positive, increases.
 +
      Line 261: Line 254:     
Conclusion & Next Steps
 
Conclusion & Next Steps
 +
       
I hope that this instruction manual has given enough information for analysts familiar with cost benefit analysis and excel to be able to use the RRAT.  While the tool itself is relatively simplistic as compared to other commercially available statistical analysis packages I believe this manual has demonstrated that combining its functions with the tools that are already available within Excel can provide some very powerful results and analysis that can strengthen CBA on a regulatory proposal.
 
I hope that this instruction manual has given enough information for analysts familiar with cost benefit analysis and excel to be able to use the RRAT.  While the tool itself is relatively simplistic as compared to other commercially available statistical analysis packages I believe this manual has demonstrated that combining its functions with the tools that are already available within Excel can provide some very powerful results and analysis that can strengthen CBA on a regulatory proposal.
 +
       
Despite this there will obviously be areas for improvement in this tool as it is still very much under-development and thus the author is looking for feed-back in this area.  It is also anticipated that this product will not stand alone and that it will be used in co-ordination with other CORE products designed to help Departments produce quality CBA on regulatory proposals.  These other tools include the Cost Benefit Identification Session (CBIS) tool, the CBA template, and the official TBS CBA Policy and Guidelines.  
 
Despite this there will obviously be areas for improvement in this tool as it is still very much under-development and thus the author is looking for feed-back in this area.  It is also anticipated that this product will not stand alone and that it will be used in co-ordination with other CORE products designed to help Departments produce quality CBA on regulatory proposals.  These other tools include the Cost Benefit Identification Session (CBIS) tool, the CBA template, and the official TBS CBA Policy and Guidelines.  
 +
       
For more information on this and other tools and services offered by the Centre of Regulatory Expertise please contact:
 
For more information on this and other tools and services offered by the Centre of Regulatory Expertise please contact:
 +
     

Navigation menu

GCwiki