**Using Inferential Statistics in Line Operations Safety Audits to Generate Scientifically Based Conclusions and Recommendations**

Preven Naidoo, Ph.D

The Aviation Consulting Group

Generally, we tend to report LOSA findings in a descriptive manner. Descriptive statistics describe the basic features of the data collected from a cohort of LOSA observations and provide simple summaries of the sample and the measures. Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data. The final LOSA report for an organization always includes a good collection of descriptive statistics. For instance, we provide a thorough description of the sample, beginning with a high-level look at its demographic characteristics. The sample may be described in terms of aircraft type, frequency of pilot flying (that is, Captain or First Officer), block time, weather type (IMC or VMC), crew approach decisions (precision, non-precision, visual, using automation, hand flown, backed by an ILS), and flight delays; and, importantly, by a detailed description of the operating crew members (age, flight time, experience in advanced aircraft, et cetera).

An important part of the results comes from counting the actual number of threats, errors, and undesired aircraft states (UASs) recorded in each observation and combining this information to provide an overall understanding of the phenomena. These data give us a good idea of where the airline stands on a global scale. They offer a useful basis for comparison and a “how-goes-it” look at the organization itself. Together with a final “drill-down”, they give airline managers a deeper understanding of their organization and put them in a position to make important, necessary changes or to reinforce positive crew behavior.
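The kind of descriptive summary described above can be sketched in a few lines of Python. This is only an illustration: the observation records, field names, and helper functions (`describe`, `describe_by`) are hypothetical and are not taken from any actual LOSA dataset or reporting tool.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical LOSA observations; every value below is invented for illustration.
observations = [
    {"aircraft": "A320", "pilot_flying": "Captain", "threats": 4, "errors": 2, "uas": 0},
    {"aircraft": "A320", "pilot_flying": "First Officer", "threats": 3, "errors": 3, "uas": 1},
    {"aircraft": "B737", "pilot_flying": "Captain", "threats": 5, "errors": 1, "uas": 0},
    {"aircraft": "B737", "pilot_flying": "First Officer", "threats": 2, "errors": 4, "uas": 1},
    {"aircraft": "A320", "pilot_flying": "Captain", "threats": 6, "errors": 2, "uas": 0},
    {"aircraft": "B737", "pilot_flying": "First Officer", "threats": 4, "errors": 0, "uas": 0},
]


def describe(values):
    """Return basic descriptive statistics for a list of counts."""
    return {
        "n": len(values),
        "total": sum(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
    }


def describe_by(records, group_key, measure):
    """Group records by one field and describe one measure within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[measure])
    return {group: describe(values) for group, values in groups.items()}


# Overall error counts, then the same measure broken down by aircraft type.
overall_errors = describe([obs["errors"] for obs in observations])
errors_by_aircraft = describe_by(observations, "aircraft", "errors")
```

The same `describe_by` call could be repeated for threats or UASs, or grouped by pilot flying, to build up the "drill-down" view.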

There is no doubt that a description of both the sample and the results plays a pivotal role in understanding what the data are telling us, and can greatly assist in any post-hoc analysis. However, we may need to go one step further in our examination of the dataset and begin some form of hypothesis testing. This calls for a more robust statistical method: we need to bring out the “big guns”, in the guise of inferential statistical methods. Doing so ensures that we draw scientifically based conclusions from our data rather than making judgments based on personal opinion. With inferential statistics, we try to reach conclusions that extend beyond the immediate data alone. For instance, we use inferential statistics to infer from the sample data what the phenomena in the population might be. We also use them to judge the probability that an observed difference between groups (such as aircraft types, crew experience levels, or types of approaches flown) is a dependable one rather than one that might have happened by chance in the LOSA dataset. In other words, we use inferential methods (such as testing differences in central tendency with Student’s *t*-test, regression analysis, and correlation, to name a few) to make inferences from our data to more general conditions in the population. Such inferences are necessarily error prone. We cannot say with 100% confidence that the characteristics of the sample accurately reflect the characteristics of the larger population (or sampling frame). Hence, only qualified inferences can be made, within a degree of certainty that is usually expressed as a confidence level (e.g., 90% or 95%).

Next, we look at an example of the calculations we can make to arrive at a statistically sound conclusion when the data are in a somewhat grey area. In this example, taken from an actual LOSA dataset from a project we participated in at a large international carrier, the descriptive results suggested at first glance that Captains and Co-pilots committed different numbers of errors. We wanted to know whether this difference was statistically significant, and could therefore be inferred to the population at large, or whether it was simply the result of random error in the data collection process.

We began by generating the table above using a statistical software program. From the data, at face value, it would appear that First Officers in this sample committed more errors than Captains. However, the hypothesis test (using Student’s *t*-test for small dependent samples) produced a high P-value, which meant that we could not reject the null hypothesis of no difference. In other words, the data did not show a statistically significant difference between the errors committed by Captains and those committed by First Officers.

Inferential statistics therefore gave us a tool to conclude that the data offered no statistically significant evidence of a difference, so Captains and First Officers could be treated as on par in the number of errors they committed on the flight deck. Had there been a statistically significant difference, the organization would have needed to look into First Officer experience levels at recruitment (which may have needed adjustment) or into its transition-training program.
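The logic of the dependent-samples test described above can be sketched as follows. The error counts are invented for illustration and are not the carrier’s data; the names `paired_t_statistic` and `T_CRITICAL_DF9` are our own. To keep the sketch free of third-party libraries, it compares the *t* statistic with a tabulated critical value instead of computing an exact P-value, which a statistical package would normally report.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical error counts per observed flight (same ten flights for both
# crew members); invented for illustration only.
captain_errors = [2, 1, 3, 0, 2, 1, 4, 2, 1, 2]
first_officer_errors = [3, 1, 2, 1, 2, 2, 3, 3, 1, 3]

# Two-sided 5% critical value of Student's t for 9 degrees of freedom,
# from a standard t table. A different sample size needs a different value.
T_CRITICAL_DF9 = 2.262


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a paired (dependent-samples) t-test."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have equal length")
    differences = [b - a for a, b in zip(sample_a, sample_b)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


t_stat, df = paired_t_statistic(captain_errors, first_officer_errors)

# Null hypothesis: the mean difference in errors is zero.
# We reject it only if |t| exceeds the critical value.
reject_null = abs(t_stat) > T_CRITICAL_DF9
```

With these made-up numbers the First Officers’ mean is higher (2.1 against 1.8 errors per flight), yet `reject_null` is `False`: the observed gap is small relative to its sampling variability, which mirrors the situation described in the example.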

Let’s look at another example. The LOSA instrument we use to collect line operations data provides us with information about the general climate (flight deck gradient) as noted by the observer. We used inferential statistics to determine what effect this climate had on the number of errors counted for a specific group (that is, Captains and First Officers), or in other words, on consequential threats (threats that lead a crew member to make an error). The next figure is an example of the regression analysis that can be generated from these data.

We began with a scatter plot of these data and used a statistical software program to generate a regression equation. The figure shows that, at the 95% confidence level, the number of consequential threats fell as the flight-deck climate improved (on a scale of 1 = poor to 5 = excellent). The correlation was r = -0.3429, which is statistically significant (P < 0.05).
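A minimal sketch of this kind of analysis is shown below. The climate ratings and threat counts are invented, so the resulting coefficients will not match the r = -0.3429 reported above. The function names and the constant `T_CRITICAL_DF10` are our own. As a stand-in for the P-value a statistical package would report, significance is checked by converting r to a *t* statistic, t = r·√(n − 2) / √(1 − r²), and comparing it with a tabulated critical value.

```python
from math import sqrt
from statistics import mean

# Hypothetical observer ratings of flight-deck climate (1 = poor, 5 = excellent)
# and counts of consequential threats on the same flights; invented data.
climate = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
consequential_threats = [5, 4, 5, 3, 4, 2, 3, 2, 3, 1, 2, 1]

# Two-sided 5% critical value of Student's t for n - 2 = 10 degrees of freedom,
# from a standard t table. A different sample size needs a different value.
T_CRITICAL_DF10 = 2.228


def linear_fit(x, y):
    """Return (slope, intercept, r) for an ordinary least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / sqrt(s_xx * s_yy)  # Pearson correlation coefficient
    return slope, intercept, r


def correlation_t_statistic(r, n):
    """Convert a Pearson r from n paired observations into a t statistic."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


slope, intercept, r = linear_fit(climate, consequential_threats)
t_stat = correlation_t_statistic(r, len(climate))

# Null hypothesis: no linear association between climate and threats.
significant = abs(t_stat) > T_CRITICAL_DF10
```

A negative `slope` and `r` correspond to the downward-sloping regression line in a scatter plot: better climate ratings go with fewer consequential threats. Squaring r gives the share of variance explained, which is worth reporting alongside significance because a modest correlation can be significant and still leave most of the variation unexplained.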

Further to this, we may interpret this graphic in terms of the negative correlation between flight deck climate and mismanaged threats. With the help of this inferential result, we can conclude that by setting a team-based climate the Captain is able to create a conduit that enhances safety by reducing errors. It is well documented in general CRM principles that the Captain of the aircraft is more likely to influence the flight deck climate with his or her First Officer than vice versa. This dovetails very well with one of the skill sets required of a commander under the first of the Threat and Error Management (TEM) countermeasure principles, namely “Crew Climate” and “Team Building”.

Organizations can use these results to build training programs with this TEM countermeasure in mind. Such training forms part of the Command training during a First Officer’s initial upgrade and is, of course, re-emphasized at all subsequent conversions. Further analysis in this example also showed that as the flight deck climate became poorer, the likelihood of the First Officer committing an error increased significantly. This gave us substantial evidence to recommend a review of the company’s CRM program, in terms of providing First Officers with the tools to become more assertive.

This short exposé gives only a glimpse into the power of evidence-based data (LOSA observations) and of what can be produced with the more advanced statistical methods at our disposal. The examples above are only a small collection of the techniques available. Further inferential methods include multivariate analysis, which involves the observation and analysis of more than one statistical outcome variable at a time. Examples that come to mind are MANOVA and Exploratory Factor Analysis. When there are many variables in a LOSA dataset, it is often helpful to reduce them to a smaller set of factors. Factor analysis is an interdependence technique, in which there is no dependent variable; rather, we look for the underlying structure of the data matrix.