Accident information comparisons


Safety and accident related information is available at the websites of airlines, but you will find more of it if you use Google. There were also a couple of interesting peculiarities in the data, where it seems possible that the airline was hiding information. See also Part 1; this post is part of our “Is aviation safety a shameful thing?” project.

In this second part I will compare the number of safety/accident related links that were found by the airline’s own search to the number found by Google when it was limited to the website in question. Both link counts are also analyzed against information about the airlines and their home countries. The intention is to find out how open airlines are with this information. Absolute numbers show how much information is available. Relative numbers show how well it can be found with the search provided by the airline, and might give a hint about how desirable it is to the airline to show that information. Comparison with other data might reveal factors that are common to airlines with a high or low number of links.

I searched through 46 airlines. Figure 1 shows the raw link counts. The x-axis shows how many links were found and the y-axis shows the number of airlines that had that count. The large blue and orange bars at x=0 show that for many airlines the homepage was a poor choice for finding safety or accident info.

On the other hand, Google is able to find information (yellow and green bars) on both subjects, and in some cases quite a lot of it. It should be noted that I only counted to 10; if there were more links, I ignored them. This document of the raw data shows in more detail which links were accepted into this data set.

Links to anything that the passengers would find out during the trip, such as pre-flight safety announcements, were rejected. Links to insurance terms and conditions were also rejected. The reason is that I am interested in what “extra” information is available at the website.

Figure 1. Number of links found for both searches and words.
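The tally behind a histogram like Figure 1 can be sketched in a few lines. This is only an illustration: the counts in `example_counts` are made up, and the names `tally` and `MAX_LINKS` are my own. The one thing taken from the text is the cap at 10 links.

```python
from collections import Counter

MAX_LINKS = 10  # counting stopped at 10 links per search, as described above


def tally(link_counts):
    """Map each (capped) link count to the number of airlines that had it."""
    return Counter(min(n, MAX_LINKS) for n in link_counts)


# Hypothetical counts for one search/word combination, one entry per airline.
example_counts = [0, 0, 0, 2, 3, 3, 12]
histogram = tally(example_counts)

# One row per x-axis value: link count, then number of airlines with that count.
for links in range(MAX_LINKS + 1):
    print(links, histogram[links])
```

Each of the four bar colours in the figure would be one such tally, one per combination of search (homepage or Google) and word (safety or accident).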

I’d like to be a little cautious when drawing conclusions from this data, mainly due to the low number of airlines, but also due to the data gathering process: I did it alone, without much help. In my experience this leads to a less rigorous result than a group effort. But one thing seems to be pretty certain: Google is better at finding this information than the search functions on the airline web sites.

This is true even if the 12 airlines that didn’t have a search are removed from the zero column. For the whole set, when the links found for one airline by one search are summed, Google finds more links in 39 cases, while the homepage search returns more results in only two cases (Qantas 5 vs. 4 and Czech Airlines 7 vs. 5).
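The per-airline comparison can be sketched as follows, assuming that each total is the sum of the safety and accident links found by one search. Only three airlines are shown, with totals quoted in this post (the US Airways pair is 3 + 1 vs. 1 + 7). The function name `compare_searches` and the data layout are assumptions for illustration.

```python
def compare_searches(totals):
    """Group airlines by which search returned more links.

    `totals` maps airline name -> (homepage_total, google_total).
    """
    result = {"google": [], "homepage": [], "tie": []}
    for airline, (homepage, google) in totals.items():
        if google > homepage:
            result["google"].append(airline)
        elif homepage > google:
            result["homepage"].append(airline)
        else:
            result["tie"].append(airline)
    return result


# Totals quoted in this post, as (homepage search, Google site search).
totals = {
    "Qantas": (5, 4),
    "Czech Airlines": (7, 5),
    "US Airways": (3 + 1, 1 + 7),
}

result = compare_searches(totals)
for winner, airlines in result.items():
    print(winner, len(airlines), airlines)
```

Run over all 46 airlines, the lengths of the three lists would give the case counts reported above.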

At least one of the airlines uses Google to power its search (US Airways). This offers an interesting comparison: the US Airways homepage search found 3 safety and 1 accident related link, while the general Google search found 1 safety and 7 accident related links.

While I was not logged in to my Google account, it is possible that Google had picked up on the fact that the same computer had been intensely searching for accident info for several days and used this knowledge to show what was most interesting to me.

A more sinister explanation is that the results of the search provided at the homepage have been filtered so that they do not include what I was looking for. Searching the US Airways site with the site’s own search for “1549” gives (18 March 2012) one result about a general chronology of the airline and reports that some results have been omitted. If one includes those, four more links to the same chronology appear. It is still possible that this is the result of some more general decision not to include parts of the web site in the site search, but I’d say there is a good possibility that this is intentional.

In the case of Kenya Airways, the Google search gave two links to the accident of KQ 507, but when I followed those links they gave a 404 (i.e. page not found). This could be due to several reasons and need not be intentional. The accident was mentioned in an annual report.

Table 1. Mean and median number of links found by Google for different subpopulations.


Table 1 shows the mean and median number of links found by Google for different subpopulations. “Whole set” includes all the airlines, “Google and Homepage” includes only those cases where both searches were available, and “Google only” includes only the cases where there was no homepage search.
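A minimal sketch of how the three subpopulations and their summary statistics can be computed. The records below are entirely hypothetical placeholders, not the real data, and the tuple layout and the helper `summarize` are assumptions for illustration.

```python
from statistics import mean, median

# Hypothetical records:
# (airline, has_homepage_search, google_safety_links, google_accident_links)
records = [
    ("Airline A", True, 4, 3),
    ("Airline B", True, 1, 2),
    ("Airline C", False, 3, 0),
    ("Airline D", False, 2, 1),
    ("Airline E", True, 0, 1),
]


def summarize(rows, column):
    """Return (mean, median) of one link-count column over the given rows."""
    values = [row[column] for row in rows]
    return mean(values), median(values)


# The three subpopulations described above.
populations = {
    "Whole set": records,
    "Google and Homepage": [r for r in records if r[1]],
    "Google only": [r for r in records if not r[1]],
}

summary = {
    name: {"Safety": summarize(rows, 2), "Accident": summarize(rows, 3)}
    for name, rows in populations.items()
}

for name, stats in summary.items():
    print(name, stats)
```

Reporting the median next to the mean is useful here because the counts are small and capped at 10, so a few airlines with many links can pull the mean up.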

In all cases there are more links related to safety than to accidents, but the difference is not massive. Results for the word “Safety” show no definite differences between the populations. For “Accident”, airlines with their own search show more info. This difference could be explained if the airlines with no homepage search had had fewer accidents, but for only 3 of the 12 did I fail to find a fatal accident in the history of the airline. Four of the 12 airlines without a homepage search function are low cost airlines, which might have less expansive websites and therefore less information. This result is similar to what Jakke saw in his analysis of airline homepages.

I compared the link counts against a data set with info on

  • number of employees
  • number of yearly passengers
  • revenue
  • year the airline was founded
  • GDP (PPP) per capita of the airline’s home country
  • Global Integrity Report overall score of home country
  • Corruption Perceptions Index
  • IATA membership
  • date of latest accident

It was difficult to find all the data for all the airlines, so there are some gaps. The data is also unreferenced and comes from various sources. Some plots with very short descriptions are available here. There is a modest correlation between the date of the latest non-fatal accident and the total number of links found, which just might be significant. There is also a modest correlation between the Global Integrity Report overall score and the total number of links. But the plots show that, in addition to the set being quite small, there might be other data related difficulties that make this type of analysis less trustworthy.
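For readers who want to try this kind of comparison, here is a minimal sketch of a Pearson correlation that skips airlines with missing values. The post does not say which correlation measure was used, so Pearson is an assumption. The numbers are hypothetical, and `pearson_with_gaps` is a name made up for this example.

```python
from math import sqrt


def pearson_with_gaps(xs, ys):
    """Pearson correlation over the pairs where both values are present.

    Missing values are given as None. Returns (r, number_of_pairs_used).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("too few complete pairs for a meaningful correlation")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("one of the variables is constant")
    return cov / sqrt(var_x * var_y), n


# Hypothetical values, one entry per airline; None marks a gap in the data.
integrity_score = [60, None, 75, 80, 68]
total_links = [2, 5, 6, 9, None]

r, used = pearson_with_gaps(integrity_score, total_links)
print(f"r = {r:.2f} from {used} airlines")
```

Returning the number of pairs alongside r matters with gappy data: a coefficient that rests on only a handful of airlines says very little on its own.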

Overall, the small numbers in Table 1 suggest that openness is not the approach chosen for these subjects. Further, there is accident related information at many airline websites, but you might not find all of it with the search provided by the airline.

In the third part of this series I will attempt to rate the links and see if any info comes out of that.