Statistical Analyses of Selected Maintenance Parameters of Vehicles of Road Transport Companies

The article presents the methods for preliminary statistical analyses of selected maintenance parameters of vehicles in road transport companies. The methodology was illustrated by presenting the results pertaining to the calculations of the maintenance intensity as well as the cost of parts and repairs of cars in the Lublin Regional Branch of the Polish Post Logistics Centre. These results provided comparative material useful in the evaluation of the maintenance efficiency of various road transport systems.


INTRODUCTION
There are many criteria for assessing the economic effectiveness and innovation of the transport systems of road transport companies. The most important are the profitability of the business, the total weight of goods or the number of passengers transported per billing period, and the degree of use of the vehicle fleet. The profitability of a transport activity is measured by the profit obtained from the use of a means of transport. In road transport, the income derived from transportation services is affected by many factors. These include the type, technical condition and reliability of the vehicles, the intensity of their use, transport fees, the state of the road infrastructure, the cost of consumables (fuel, oil, operating fluids), personnel costs, and the level of taxes and administrative fees [2,3,4,9,11,13,16,17,24,25,28,39]. It is commonly known that the properties and operational parameters of internal combustion engines change under working conditions [2,35]. The fuel consumption of modern means of transport, especially road vehicles, is a parameter of interest to every manufacturer and user [32]. All transport service providers try to reduce vehicle fuel consumption, which accounts for the largest part of the energy consumption in transport [36]. This is a consequence of the significant share of fuel in overall transport expenses [32]. The quality of fuels is one of the key factors affecting the operational performance of the engine [26,29]. Research on the quality of fuels and their biocomponents has also been presented by other researchers, for example [14,20,26,31,33], while studies on alternative fuels for diesel engines are reported in [18,29,30].
The intensity of maintenance of a vehicle (referred to in later sections also as use intensity) is determined by the number of kilometres travelled by the vehicle in a repeatable period of time (day, month, year). When the transport activity is profitable, the intensity of maintenance is an "offensive" parameter of a given transport system: an increase in it raises profitability. However, if the transport activity generates losses, these may be deepened further by an increase in the maintenance intensity. In transport economics, generalised transport costs are the sum of the monetary and non-monetary costs of a journey [19].
In a transport company, the intensity of maintenance could theoretically be raised in two ways: by increasing the speed of the vehicle on the route or by reducing idle time. In practice, the average speed of a car used in a transport company depends primarily on the state of the road infrastructure and traffic rules. The impact on the traffic system is reflected by an increase in individual automobile transport, which allows optimal personal mobility [15]. (In passenger transport, the average speed is also determined by timetables.)
This means that cars performing road transport over a longer period of time on repetitive routes run at a fixed average speed (optimal from the point of view of the task at hand). For this reason, the intensity of maintenance mainly reflects the degree of use of the car, which depends primarily on the accepted strategy of its maintenance. In the case of buses, the execution of the transport service depends on the efficiency of the whole drive system, in particular the combustion engine [11].
The intensity of maintenance also affects other organizational and technical parameters of transport systems. It is related to the reliability of vehicles, the length of their period of use, maintenance and repair costs, and the required working time of drivers. Analysis of data on the intensity of car use in transport companies allows a number of conclusions to be drawn about the functioning of the transport system as a whole. Such analyses should be carried out by statistical methods because of the significant effect of many random factors [34]. In practice, a transport company that needs a very accurate long-term forecast of passenger demand benefits from using professional statistical software and experienced analysts to prepare it.
An important parameter of the transport system is the amount of expenditure incurred on the maintenance and repair of vehicles. The analysis of these costs is the basis for decisions on retiring a means of transport, buying a new car, setting shipping fees, and choosing routes. Repair and maintenance costs are the "defensive" parameters of the transport system. Activities aimed at optimizing expenses for maintenance and repairs are especially appropriate during periods of economic downturn or fierce competition between economic operators. The analysis of these costs is also carried out by statistical methods [8,10,23].

OUTLINE OF THE STATISTICAL COMPUTATIONS
From the point of view of operational research, mathematical models of the economic maintenance effectiveness of transport systems are stochastic models, since most of their relevant parameters are characterized by random scatter with an unknown probability distribution. Full quantitative information about the system is provided by the cumulative probability distributions of all its random parameters. Determining such distributions is an almost impossible task. Therefore, a simplified statistical analysis is carried out that allows primary conclusions to be formulated. The initial step in the analysis of each transport system is a univariate analysis (concerning one selected feature of a statistical nature). It provides a starting point for further, more advanced analysis.

Descriptive statistics
When analysing a transport system, we must first characterize the study population, state the terms and conditions of its functioning, and determine a set of random parameters that carry the most important information about the system. On this basis, in simple cases it is possible to deduce the laws that govern the variability of the values of these features. In practice, this means developing and solving a probabilistic model that describes the behaviour of the characteristics of interest.
In most cases, such a model is not known and we can only try to compare the observed features with the theoretical models available. For this purpose, empirical data are collected and processed. The random sample and its basic descriptive statistics, such as the arithmetic mean, standard deviation, median, and minimum and maximum values of the sample, are the primary source of information about the study population [12].
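The descriptive statistics listed above can be computed with a few lines of code. The following is a minimal sketch using Python's standard library. The function name `describe` and the mileage figures are hypothetical and serve only as an illustration, and the sample standard deviation with the n - 1 denominator is an assumption, since the text does not state which variant was used.

```python
import statistics

def describe(sample):
    """Return the basic descriptive statistics of a numeric sample."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        "std": statistics.stdev(sample),
        "median": statistics.median(sample),
        "min": min(sample),
        "max": max(sample),
    }

# Hypothetical annual mileages (km) of five vehicles, for illustration only.
mileage_km = [12000, 15000, 14000, 18000, 16000]
stats = describe(mileage_km)
print(stats)
```

For real fleet data the same function would be applied separately to each group of vehicles and to the whole sample.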

Statistical tests
In mathematical statistics, every assumption about the unknown probability distribution of a random variable is called a statistical hypothesis. The statistical hypotheses determining only the numerical value of the unknown parameters of a random variable are called parametric hypotheses. The statistical hypotheses specifying an unknown form of the distribution function of a random variable are called nonparametric hypotheses.
The tests used to check (verify) statistical hypotheses are called statistical tests. The tests used to verify parametric hypotheses are called parametric tests, and the tests used to verify nonparametric hypotheses are called nonparametric tests or goodness-of-fit (compliance) tests [12].
The normal distribution plays a special role in probability theory (analogous to the role of linear models in deterministic problems). It is a model of the variability of a characteristic when there is no dominant factor. In addition, under fairly general assumptions, it is the limit distribution of a suitably normalized sum of independent random variables as the number of components of this sum goes to infinity. The mathematical properties of the normal distribution have been thoroughly studied, and therefore this distribution is always a reference point for any analysis of unknown probability distributions of random variables that appear in technical applications.
The one-dimensional density function of the normal distribution is completely defined by two numerical parameters: the expected value and the standard deviation. (In the multidimensional case, the density function of the normal distribution is determined by the vector of expected values and the covariance matrix.) Most statistical tests described in the literature and used in practice are based on the assumption that the analysed random sample comes from a population with a normal distribution. Although the test statistics themselves in most cases are not normally distributed, with computer programs supporting statistical calculations, testing the "classic" hypotheses presents no computational difficulties [5].
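For reference, the two density functions mentioned above can be written out explicitly. The symbols are notation assumed here rather than taken from the original: μ and σ denote the expected value and standard deviation, while **μ**, Σ and k denote the vector of expected values, the covariance matrix (assumed non-singular) and the dimension.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right),
\qquad x \in \mathbb{R},

f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^{k}\,\det\Sigma}}
       \exp\!\left(-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}
       \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right),
\qquad \mathbf{x} \in \mathbb{R}^{k}.
```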
When the tested random sample does not come from a population with a normal distribution, the set of statistical methods available to test its probability distribution is much poorer, and the obtained results are subject to greater uncertainty. Recently, nonparametric rank tests have gained much popularity in the analysis of such problems.

Issues of classification
In technical applications, the characteristics within the tested population are often non-uniform (significantly different). Treating all representatives of a non-uniform population in the same way leads to erroneous results. It is then necessary to classify the population (divide it into separate groups) in such a way that the non-uniformity within the derived classes is negligible. Often, the opposite situation occurs: it has to be determined whether objects that differ significantly at first glance can be seen as representatives of the same population.
Analysis of variance is a statistical tool for solving classification problems in populations with a normal distribution [12]. When the distributions of a feature in the various classes differ from the normal distribution, rank tests for the equality of distributions are used.
The procedure for classifying a population with unknown distributions consists of two stages. The first stage checks the compliance of the empirical distribution in each group with a normal distribution and the equality of variances across the groups. Statistical software typically offers, for example, the Pearson (χ²) test, the Kolmogorov-Smirnov (K-S) test, and the Shapiro-Wilk (S-W) test for testing compliance with the normal distribution (Corder and Foreman 2011). Equality of variances can be tested, e.g., with Bartlett's, Cochran's or Hartley's test in the case of populations with a normal distribution (the choice of test is determined by the number of groups), or with Levene's test for other distributions.
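As an illustration of the variance check in the first stage, the statistic of Levene's test can be computed directly from its definition. The sketch below is not the authors' implementation. It assumes the variant in which absolute deviations are taken from the group means (a median-based variant also exists), the function name `levene_w` and the data are hypothetical, and the p-value, which would be read from the F distribution with (k - 1, N - k) degrees of freedom, is not computed.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W statistic using absolute deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = []
    for g in groups:
        centre = mean(g)
        z.append([abs(x - centre) for x in g])
    z_group = [mean(zi) for zi in z]
    z_all = mean([v for zi in z for v in zi])
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zg - z_all) ** 2 for zi, zg in zip(z, z_group))
    within = sum((v - zg) ** 2 for zi, zg in zip(z, z_group) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical samples: same spread, and clearly different spread.
equal_spread = levene_w([[1, 2, 3], [11, 12, 13]])
unequal_spread = levene_w([[1, 2, 3], [0, 10, 20]])
print(equal_spread, unequal_spread)
```

A statistic close to zero indicates similar spread in the groups, while larger values speak against the hypothesis of equal variances.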
The second stage involves testing the hypothesis of equality of means by Fisher's method (in the case of normal distributions with the same variance) or, otherwise, testing the hypothesis of equality of distributions in the compared populations by rank tests: the Mann-Whitney test (comparison of two samples) or the Kruskal-Wallis test (comparison of more than two samples) [5].
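The rank test used in the second stage can be sketched as follows. This is a minimal illustration of the Kruskal-Wallis H statistic with average ranks and the standard correction for ties, written with the Python standard library only. The function name `kruskal_wallis_h` and the data are hypothetical. The comparison with 5.991, the 0.95 quantile of the chi-squared distribution with 2 degrees of freedom, relies on the usual large-sample approximation and applies to three groups only.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of the tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_part = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_part - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))

# Hypothetical, fully separated samples in three groups.
h_separated = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Hypothetical identical samples with ties.
h_identical = kruskal_wallis_h([[1, 2], [1, 2]])
reject_equal_distributions = h_separated > 5.991   # alpha = 0.05, three groups
print(h_separated, h_identical, reject_equal_distributions)
```

With samples as small as in this illustration the chi-squared approximation is crude; the example only shows the mechanics of the calculation.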

SAMPLE ANALYSIS
Practical applications of the procedures described above are shown in this section using a univariate statistical analysis of the economic maintenance efficiency of the vehicle fleet of the Lublin Regional Branch of the Polish Post Logistics Centre (RB PPLC). The functioning of national postal operators in the new EU countries is the subject of various analyses [1,21,22,24,27,37,38].
The city of Lublin has about 350,000 inhabitants. It is the capital of Lublin Province, which has a population of over two million people in an area of approximately 25,000 km². Polish Post has a regional branch of its logistics centre in Lublin. (This is one of fourteen regional offices in Poland.) Since 2010, an Expedition and Distribution Node of the so-called class A (one of eight in Poland) has operated in Lublin, forming the base of the logistics system of the Polish Post.
The presented calculations were based on data from 2009 taken from an internal database of delivery tracks of the Lublin RB PPLC. The database contains, inter alia, information on the maintenance and repairs performed on vehicles and the related costs. Suitable processing of this information makes it possible to determine the operating history of each vehicle for the time it belonged to the fleet of the Lublin branch.

Characteristics of the study car population
In 2009, RB PPLC in Lublin operated 179 cars with a total mileage of more than 7,500,000 km. These were cars of various types and makes. The vehicles performed different delivery tasks depending on the specific needs of the transport company. For the purpose of statistical analysis, the fleet was divided into groups of vehicles, taking the loading capacity of the vehicle as the classification criterion. Three groups of cars were arbitrarily distinguished.
The first group (47 vehicles) includes the vehicles with low cargo capacity: DAEWOO Matiz, FIAT Seicento, FIAT Doblo, ŠKODA Fabia, CITROËN Xsara, RENAULT Kangoo (Fig. 1).These cars take letters from the mailboxes and deliver mail in the area of Lublin and in the surroundings.
The second group (85 vehicles) contains vans with average cargo capacity: LUBLIN III, MERCEDES Sprinter, VOLKSWAGEN LT, FORD Transit, CITROËN Jumper (Fig. 2). These vehicles ran between postal offices in Lublin and in the former Lublin Province.
The third group was formed by 47 trucks with large cargo capacity: IVECO Stralis, VOLVO FM12, MAN, MERCEDES Vario (Fig. 3). These vehicles delivered postal items between Lublin and the distribution and logistics nodes of the Polish Post located outside Lublin Province.
One of the factors differentiating the selected groups of vehicles was their mileage at the beginning of the observation period. Figure 4 shows histograms of the empirical distribution of vehicle mileage at the beginning of the period (January 2009) for groups I, II and III and for the entire sample. Basic descriptive statistics of this characteristic are summarized in Table 1. A preliminary analysis of the results shows that in groups I and III and in the whole fleet, the median approximates the arithmetic mean of the sample well. A large difference between these two parameters is observed in group II. The empirical coefficient of variation (the ratio of the standard deviation to the arithmetic mean) is everywhere greater than 0.5. In addition, the arithmetic means of mileage at the beginning of the period tend to be compatible with the accepted criterion for dividing the study fleet (their values increase with increasing cargo capacity).
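The statistics discussed above can be written as follows. The symbols are notation assumed here, and the n - 1 denominator in the standard deviation is likewise an assumption, since the text does not state which variant was used.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}},
\qquad
v = \frac{s}{\bar{x}}.
```

A coefficient of variation v greater than 0.5 means that the standard deviation of the mileage exceeds half of its arithmetic mean, which indicates considerable scatter within a group.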

Analysis of the use intensity of vehicles
The relevance of the adopted classification criterion can be checked by analysing the annual and monthly use intensity of vehicles in each of the selected groups. Its correctness is confirmed by demonstrating statistically significant differences between the arithmetic means of the annual use intensity in the distinguished groups of vehicles.
Using the empirical data on the annual vehicle maintenance intensity in each group (Figs. 5, 6, 7), the basic descriptive statistics of the studied parameter were calculated (see Table 2).
The arithmetic mean of the annual intensity of maintenance in 2009 amounted to about 14,500 km per year in group I, about 34,500 km per year in group II, and about 83,500 km per year in group III. These results suggest that the values differ significantly from a statistical point of view. In order to verify this assumption, the procedure of analysis of variance should be conducted. This involves checking the compliance of the empirical distributions with the normal distribution, checking the homogeneity of variance across the groups, and applying either a parametric test of equality of means or a non-parametric test of equality of distribution functions.
The same level of significance, α = 0.05, was assumed for all tests. The hypothesis of compliance of the studied distributions with the normal distribution was tested with the Shapiro-Wilk test, which shows that at the significance level α = 0.05 the hypothesis of normality is not rejected in group II only (see Table 3).
The results of Levene's test of homogeneity of variance (see Table 4) showed that, at the significance level α = 0.05, the hypothesis of equality of variances in groups I, II and III should be rejected. Because the assumptions of parametric analysis of variance were not met, the hypothesis of equality of distributions was checked with the Kruskal-Wallis test. It was found that, at the significance level α = 0.05, the hypothesis of equality of distributions must be rejected (see Table 4). Therefore, the division of the studied vehicle sample reflects the differences in the average annual use intensity in the distinguished groups of vehicles.
The same procedure was used to test the hypothesis of equality of the average monthly use intensity within each of the vehicle groups. The descriptive statistics of the monthly arithmetic means of the intensity and the standard deviations suggest that the means in each group are equal (see Table 5).
The results of the Shapiro-Wilk test (see Table 6) show that many empirical distributions of monthly use intensity cannot be considered compatible with the normal distribution (α > p). Thus, despite the positive verification of homogeneity of variance by Levene's test (Table 7), the Kruskal-Wallis test was used again to verify the hypothesis of equality of the distribution functions of the empirical distributions. Its results (see Table 8) give no grounds, at the significance level α = 0.05, to reject the hypothesis that in each group the monthly distributions of vehicle use intensity in 2009 are the same.
The results of the statistical analysis of the use intensity of the fleet vehicles of RB PPLC in Lublin allow us to conclude that the described division of the vehicle population into three groups, based on cargo capacity, is correct. This is evidenced by the significant differences between the groups in the values of annual and monthly use intensity of vehicles. The arithmetic mean of the annual intensity of vehicle use in group II is about 2.4 times higher than in group I. A similar proportion also occurs between the intensities of vehicle use in groups III and II. The calendar month of use has no significant influence on the observed monthly averages of the use intensity of vehicles in each group.
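The proportions quoted above can be checked with simple arithmetic. The sketch below uses only the rounded group sizes and approximate annual means given in the text, so the results are approximate as well. The variable names are hypothetical.

```python
# Vehicle counts and approximate mean annual use intensity (km) per group,
# as quoted in the text (rounded values).
groups = {
    "I": (47, 14_500),
    "II": (85, 34_500),
    "III": (47, 83_500),
}

ratio_ii_to_i = groups["II"][1] / groups["I"][1]
ratio_iii_to_ii = groups["III"][1] / groups["II"][1]

# Total annual mileage implied by the rounded group means.
total_vehicles = sum(count for count, _ in groups.values())
total_km = sum(count * mean_km for count, mean_km in groups.values())

print(round(ratio_ii_to_i, 2), round(ratio_iii_to_ii, 2))
print(total_vehicles, total_km)
```

Both ratios come out at roughly 2.4, and the implied total of about 7.5 million km is consistent with the total fleet mileage of more than 7,500,000 km reported for the 179 vehicles.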

Results of statistical analysis of the test vehicle repair costs
The repair costs of the vehicles used in the Lublin regional office of the Polish Post Logistics Centre are generated by the replacement of the so-called material factors of life, which include all components, individual parts and fluids (engine oil, brake fluid, etc.). Labour is not charged to the cost of repairs, since personnel costs are incurred by the company to keep workers employed in Post service stations. In the analysed period (2009), more than 16,000 cases of replacement of material factors of life were registered in the Lublin branch. Those recorded most frequently included: motor oil (2,281 litres), light bulbs (2,555 units), gaskets (350 units), tires (226 pieces), oil filters (217 units), fuel filters (192 pieces), air filters (105 units), air conditioning (47 pieces), shock absorbers (56 pieces), and batteries (51 units). The information on the cost of repairs carried out in 2009 was analysed statistically. Table 9 summarizes the results of descriptive statistics. The average values of the annual cost of repairs in each group tend to be compatible with the accepted criterion for division of the population. Figure 8 presents the empirical distributions of the annual cost of repairs incurred by the regional office PPLC Lublin in 2009 for the studied vehicle groups and for the fleet as a whole. The histogram shows that the annual cost of repairs of vehicles classified in group I was less than 500 PLN in 21.3% of cases and did not exceed 2000 PLN in 57.6% of cases. For vehicles of group II, the 2000 PLN level was not exceeded in 45.2% of the observations; a similar result (44.1%) was recorded in group III.
In order to determine whether the observed differences between the arithmetic means of annual repair costs in the distinguished groups of vehicles are statistically significant, an analysis of variance was conducted. Since the empirical distributions shown in Figure 8 did not in all cases resemble the normal distribution, their noncompliance with the normal distribution had to be reckoned with. This assumption was confirmed by the chi-squared (χ²) test at the significance level α = 0.05: for the vehicles of groups II and III, the values of the χ² statistic corresponded to p-values below the significance level of the test. Additionally, Bartlett's test (chosen because of the different numbers of results in the analysed groups) showed heterogeneity of variance across the groups of vehicles. These results ruled out the use of the classical methods of analysis of variance for the arithmetic means of the annual cost of vehicle repairs. Therefore, the Kruskal-Wallis test was used in the further calculations. The value of the test statistic, 8.294, corresponds to p = 0.0158, which at the significance level α = 0.05 confirms the presence of significant differences between the arithmetic means of the annual cost of vehicle repairs in the three distinguished groups.
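The reported p-value can be reproduced from the test statistic. The sketch below assumes that the Kruskal-Wallis statistic was referred to the chi-squared distribution with k - 1 = 2 degrees of freedom (three groups), which is the usual large-sample approximation; the text does not state this explicitly. For 2 degrees of freedom the survival function has the closed form exp(-x/2), so no statistical library is needed. The function name `chi2_sf_df2` is hypothetical.

```python
import math

def chi2_sf_df2(x):
    """Survival function of the chi-squared distribution with 2 degrees of freedom."""
    return math.exp(-x / 2)

h_reported = 8.294   # Kruskal-Wallis statistic quoted in the text
alpha = 0.05         # significance level used throughout the paper

p_value = chi2_sf_df2(h_reported)
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)
```

Under this assumption the computed value agrees with the reported p = 0.0158, which is below α = 0.05.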
It was also examined whether the month (as a grouping factor) has a significant influence on the observed arithmetic means of the monthly cost of vehicle repairs in each group and in the non-grouped sample. The calculations performed using the chi-squared (χ²) test and Bartlett's test showed noncompliance of the monthly distributions of vehicle repair costs with the normal distribution in each group and in the total fleet (see Table 10), as well as heterogeneity of variance (Table 11).
The results of the tests of compliance of the empirical distributions with the normal distribution and of the tests of homogeneity of variance for the monthly costs of repairs in the distinguished groups of vehicles and in the non-grouped sample (presented in Tables 10 and 11) show that the classical method of analysis of variance does not apply in this case.
In order to test the hypothesis of equality of the empirical distributions, the Kruskal-Wallis rank test was applied. The results are shown in Table 11. On this basis, it can be said that the month of use has a significant impact on the average monthly costs of repair in the non-grouped sample and in the distinguished groups I and II. There is no significant difference between the distributions of the monthly costs of vehicle repairs in group III.
Figure 9 shows graphs of the arithmetic means of monthly repair costs according to the calendar month of use. These graphs reveal high variability of the arithmetic means of the monthly repair costs of vehicles in each group. In group I these differences reach 31%, while in groups II and III they reach 50%.
In Figure 9, an increase in the arithmetic means of repair costs can be observed in all cases in March and April 2009. A possible explanation is that this was the period prior to Easter. The increase in the average cost of repairs in August, September and October, in turn, can be caused by several factors. In summer (the holiday season) a car may be used by different drivers, which reduces the care given to the vehicle. In autumn, sudden changes in road conditions occur, which apparently translate into an increase in the costs of repairs.

CONCLUSIONS
The univariate statistical analyses described in this paper constitute a preliminary analysis, which is an introduction to multivariate analysis. The examples illustrating the procedure from the section "Issues of classification" show that univariate analysis can be successfully used to classify heterogeneous vehicle fleets in road transport companies. Due to the cyclical nature of seasonal changes, the empirical data used in the analysis of the economic efficiency of transport systems should cover at least one calendar year. Data from one year do not, however, make it possible to assess the repeatability of the results, and therefore have little prognostic significance. For such purposes, data from at least several years of use of vehicles in the same transport company, under comparable conditions, are needed. In the case of the fleet of RB PPLC in Lublin, such studies have been carried out and their results are presented in [6,7].

Fig. 4. Histograms of the empirical distributions of vehicle mileage in the fleet of RB PPLC in Lublin at the beginning of January 2009: a) vehicles of group I, b) vehicles of group II, c) vehicles of group III, d) all vehicles

Fig. 5. Histogram of the empirical distribution of yearly vehicle use intensity for the vehicles in Group I (data from the Lublin RB PPLC, 2009), author's calculations

Fig. 8. Histograms of the empirical distribution of the annual cost of vehicle repairs in the RB PPLC in Lublin in 2009: a) group I, b) group II, c) group III, d) all vehicles

Fig. 9. Dependence of the arithmetic mean of the monthly cost of repairs of fleet vehicles in the RB PPLC in Lublin on the calendar month in 2009: a) Group I, b) Group II, c) Group III, d) the entire population of vehicles (author's calculations)

Table 1. Descriptive statistics of the mileage of RB PPLC fleet vehicles in Lublin at the beginning of the observation period (operating data from 2009), authors' calculations

Table 2. Descriptive statistics of the annual maintenance intensity of the vehicle groups distinguished in the fleet of RB PPLC in Lublin (maintenance data from 2009), author's calculations

Table 4. Results of Levene's test of equality of variance and of the Kruskal-Wallis test of equality of the empirical distributions of annual use intensity of fleet vehicles of RB PPLC in Lublin in 2009

Table 5. Basic monthly statistics of the use intensity of fleet vehicles of RB PPLC in Lublin in 2009

Table 6. Results of the Shapiro-Wilk test of compliance with the normal distribution for the empirical distributions of monthly use intensity of fleet vehicles of RB PPLC in Lublin in 2009

Table 7. Results of Levene's test of equality of variance of the empirical distributions of monthly use of fleet vehicles of RB PPLC in Lublin in 2009

Table 8. Results of the Kruskal-Wallis test of equality of the empirical cumulative distributions of monthly use of RB PPLC fleet vehicles in Lublin in 2009 (grouping factor: month of operation)

Table 9. Descriptive statistics of the annual cost of vehicle repairs in the fleet of RB PPLC in Lublin in 2009

Table 11. Results of Bartlett's test of homogeneity of variance and of the Kruskal-Wallis test of equality of the empirical cumulative distributions of monthly repair costs of fleet vehicles in RB PPLC in Lublin in 2009 (grouping factor: calendar month of use) (author's calculations)