Application of image analysis tools in Matlab to better estimate the degree of binder coverage in rolling bottles test

In asphalt mixtures, a strong affinity between binder and aggregates is of prime importance, especially under conditions susceptible to moisture damage. Among the various modifiers and additives used in the literature, hydrated lime (HL) has been reported as the most suitable for improving the affinity between binder and aggregates. Therefore, this study attempted to quantify the affinity of HL-modified mastics with different aggregates under moist conditions. The affinity was measured using the rolling bottle test (RBT), which is also reported as one of the best empirical techniques for the assessment of moisture susceptibility. However, in the RBT, the affinity is generally assessed through visual inspection by two experienced operators as per a standard procedure, which can be a major drawback of the technique in terms of reliability and repeatability. To reduce this deficiency, image analysis (with no special setup) was also carried out using the MATLAB program. The results obtained from image analysis were then compared with the results of visual observation, and the two were found to be very similar. Hence, the RBT could be used confidently for the comparison of different binder/aggregate combinations. The RBT results also indicated that HL addition was beneficial with granite, basalt and greywacke aggregates, but that it did not have any effect with limestone aggregates. Further, the 10% HL substitution was found to be more efficient than the 20% HL substitution, as both substitutions showed a similar effect in most of the studied combinations.


Introduction
The rolling bottle test, as per BS EN 12697-11 (2012), is a measure of the affinity between aggregate and bitumen and therefore of the susceptibility to stripping. The susceptibility to stripping gives an indirect indication of the bond strength between the binder and the aggregate. This procedure can also be used to evaluate the effect of moisture on adhesion for a given binder and aggregate combination, as the loose bitumen-coated aggregates are agitated in water for a certain time. The results are usually measured by visual inspection in terms of the degree of bitumen coating remaining on the loose aggregates after mechanical stirring in water.
CONTACT Syed Bilal Ahmed Zaidi bilal.zaidi@uettaxila.edu.pk
Many techniques are used to investigate moisture sensitivity and to quantify the affinity between binder and aggregates. The Pneumatic Adhesion Tensile Testing Instrument test was used in previous studies to assess the moisture sensitivity of binder-aggregate combinations by estimating pull-out strength (Santagata et al., 2009; Zhang et al., 2015; Zhang et al., 2017). The saturation ageing tensile stiffness test was also devised to investigate the moisture damage of asphalt mixtures (Collop et al., 2004; Grenfell et al., 2012). Airey and Choi (2002), in a state-of-the-art report on moisture sensitivity test methods for pavement materials, summarised ten different methods on loose coated aggregates to quantify the affinity in the presence of water. It is difficult to pick the best method, as each has its advantages and disadvantages. Jorgensen (2002) compared the boiling water and rolling bottle tests during a round-robin study and found that the boiling water test can be used to differentiate between good and bad combinations of binder and aggregates, whereas the rolling bottle test can be used to rank these combinations and is hence more precise and reliable. Another comparative study was made by Liu et al. (2013), who considered five empirical tests on loose mixtures for performance evaluation: the static immersion test, the rolling bottle test (RBT), the boiling water test (BWT), the total water immersion test and the ultrasonic method. Surface free energy (SFE) tests on aggregate and bitumen were also performed to correlate the performance with these empirical methods. The RBT and BWT were found to be the most sensitive of the five empirical procedures in predicting moisture damage performance. The mixture ranking given by the RBT was found to be in agreement with the results of SFE testing.
Based on the findings of the previous studies using RBT, it can be said that it is one of the most efficient empirical procedures for moisture damage assessment.
Following the standard procedure, the rolling bottle test measures the affinity between the aggregates and bitumen. The affinity is measured by assessing the percentage of binder coating on loose aggregates through visual inspection by two experienced operators and then taking the average of the two observations to the nearest 5%. However, it is sometimes not possible to get two operators at the required time, and even when two are available, there can be a big difference between the values they suggest. The results can therefore be considered biased, as they are based on visual inspection. Past research shows that the reproducibility of the rolling bottle test is to date rather fair; one reason may be related to the visual interpretation of the percentage of coating. Other possible causes relate more to the test conditions and sample preparation (Lamperti et al., 2015; Partl et al., 2018). Many researchers (Groenninger, 2008; Grönniger et al., 2010; Källén et al., 2013) take the view that, as RBT results are based on visual inspection, they are very subjective and that this is one of the drawbacks of the test method. It has been reported that incorporating a digital technique can help in improving the test method (Lamperti et al., 2015).
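The averaging step described above can be illustrated with a short Python sketch. It is not part of the standard or of this study's code; the function names and the example operator readings are hypothetical, and ties are assumed to round upwards.

```python
import math


def round_to_nearest_5(value):
    """Round a percentage to the nearest 5 (ties rounded up, by assumption)."""
    return int(math.floor(value / 5.0 + 0.5)) * 5


def reported_coating(operator_a, operator_b):
    """Average two operators' visual estimates and round to the nearest 5%."""
    return round_to_nearest_5((operator_a + operator_b) / 2.0)


# Hypothetical readings from two operators for the same sample.
print(reported_coating(60, 70))
print(reported_coating(60, 65))
```

With readings of 60% and 70% the reported value is 65%; a gap between operators is hidden by the averaging, which is one reason the visual result can be regarded as subjective.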
Recently, researchers have focused on introducing image analysis techniques for the characterisation of pavement materials, i.e. aggregates and hot mix asphalt (Bessa et al., 2012; Bruno et al., 2012). Further, image analysis has been utilised to evaluate the stripping resistance and moisture susceptibility of asphalt mixtures (Amelian et al., 2014; Merusi et al., 2010). In another study, image analysis was applied to BWT samples, and it was reported that the results derived for stripped aggregates were in agreement with the results of HMA performance-based tests (Kim et al., 2012). Moreover, image analysis has been tried as a replacement for the visual method of analysis in the water immersion test (Xiao et al., 2019). Similarly, for the RBT, attempts have been made to replace eye observation with digitally processed images using software such as ImageJ and Image-Pro Plus. However, the problem is that special arrangements have to be made for image capturing, including enhanced lighting and placement of the aggregate on a special platform during image capture (Lantieri et al., 2017; Yuan et al., 2015).
In this research, an effort has been made to reduce the disadvantages of the method and to make the RBT a more reliable and repeatable means of measuring the affinity between aggregate and binder in the presence of water, using image analysis techniques with no special setup, employing the Matlab program.

Materials
Aggregates from four different sources in the UK were selected for this research. These were limestone, granite, basalt and greywacke. The basic properties of the aggregates used are presented in Table 1.
As filler type affects asphalt mixture properties significantly, five fillers were identified to be used, namely limestone, granite, greywacke, basalt and hydrated lime (HL). As this study focuses on the effect of HL on the performance of asphalt mixture, the four filler types, i.e. limestone, granite, basalt and greywacke, were used with their parent aggregate type and HL was used as a replacement in certain percentages within the mastic.
To evaluate the effect of HL, it can either be added to the bitumen to make a mastic or be added to the aggregate to form a mixture. In this research, the first method was used. The addition of HL to the neat bitumen to make a mastic can be justified by previous research: to better comprehend the properties of asphalt mixtures, numerous researchers have utilised intermediate materials such as mastics, i.e. mixes of just bitumen and filler, as a model framework (Carl et al., 2019; Roman & García-Morales, 2018; Zaidi, 2018). The rationale behind this methodology is that the material binding the aggregates together inside the mixture is not the bitumen alone but the bitumen mixed with the finest components of the mineral aggregate, called 'filler' (Asphalt Task Force 2010).
Each aggregate type was tested with four combinations of binders, including one neat bitumen (40/60 pen) and three mastics (Table 4). The properties of the base bitumen and mastics used in this study are shown in Table 2 and Table 3, respectively. In total, twelve different types of mastics were used in combination with four types of aggregates. The notation, composition and the type of aggregate with which these mastics were used are presented in Table 4.

Test method
The rolling bottle test (RBT) was conducted in accordance with BS EN 12697-11 (2012). As stated previously, it provides a measure of the affinity between aggregate and bitumen. This affinity is measured by visual inspection in terms of the degree of bitumen coating on loose bitumen-coated aggregates after mechanical stirring in water. Clean and fully dried aggregate particles were coated with bitumen. These coated aggregates were then stored at room temperature for 12-64 h before testing. To proceed with the test, glass bottles were filled to approximately the shoulder with deionised water, together with the binder-coated aggregates and a glass stirrer. After that, the bottles were rotated at a speed of 60 rotations per minute for a total of 72 h. After the first six hours, the samples were emptied from the glass bottles and placed in a test bowl, which was then filled with fresh water, and the percentage of bitumen coating on the aggregate particles was recorded visually to the nearest 5%. This procedure was repeated at the end of 24, 48 and 72 h, and the degree of bitumen coating was estimated each time. In the end, the mean value was taken to get an average bitumen coating on the aggregates.
In addition to naked-eye observations, good-quality images were also taken for each combination used in this study. A code was developed in-house in Matlab to analyse the images. For analysis in Matlab, each image was divided into three components, i.e. bitumen or mastic, aggregates and background. The MATLAB code developed for estimating the percentage coating comprised three main variables. The first variable was the threshold limit for eliminating the background from the full image. The second variable was the pixel threshold limit for computerised recognition of the aggregates coated with bitumen, named the "bitumen upper limit (bul)". The third variable was the pixel threshold limit set to recognise the aggregates without bitumen coating, named the "aggregate upper limit (aul)". In addition, a mean filter was used to remove any noise in the image. After these functions, the data was analysed pixel by pixel in both rows and columns and displayed. To get the percentage coating on the aggregates with this code, the background was first eliminated from the image. The remaining area was termed the 'total area'. From this 'total area', the percentage of bitumen/mastic-coated aggregates was calculated and rounded to the nearest 5%.
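The thresholding logic described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used in this study. It assumes an 8-bit greyscale image in which the background is the brightest region and bitumen the darkest; the names `background_limit`, `classify` and `percent_coating`, the threshold values and the tiny test image are all hypothetical. Pixels brighter than `aul` but not bright enough to be background are left unclassified and excluded from the total area, which is also an assumption.

```python
import math

BACKGROUND, AGGREGATE, BITUMEN, UNCLASSIFIED = 0, 1, 2, 3


def mean_filter(gray):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    rows, cols = len(gray), len(gray[0])
    smoothed = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            window = [gray[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            new_row.append(sum(window) / len(window))
        smoothed.append(new_row)
    return smoothed


def classify(gray, background_limit, bul, aul):
    """Label every pixel using the three thresholds (assumed semantics)."""
    labels = []
    for row in gray:
        label_row = []
        for value in row:
            if value > background_limit:
                label_row.append(BACKGROUND)    # removed from the total area
            elif value <= bul:
                label_row.append(BITUMEN)       # dark: coated aggregate
            elif value <= aul:
                label_row.append(AGGREGATE)     # mid-grey: bare aggregate
            else:
                label_row.append(UNCLASSIFIED)  # assumption: ignored
        labels.append(label_row)
    return labels


def percent_coating(labels):
    """Coated pixels as a percentage of the total (non-background) area."""
    flat = [label for row in labels for label in row]
    coated = flat.count(BITUMEN)
    bare = flat.count(AGGREGATE)
    total_area = coated + bare
    if total_area == 0:
        raise ValueError("no aggregate pixels found; check the thresholds")
    return 100.0 * coated / total_area


def round_to_nearest_5(value):
    """Round a percentage to the nearest 5 (ties rounded up, by assumption)."""
    return int(math.floor(value / 5.0 + 0.5)) * 5


# Hypothetical 4x4 image: bright border, three dark pixels, one mid-grey pixel.
image = [[250, 250, 250, 250],
         [250, 30, 40, 250],
         [250, 35, 150, 250],
         [250, 250, 250, 250]]
labels = classify(image, background_limit=220, bul=80, aul=200)
print(round_to_nearest_5(percent_coating(labels)))
```

On a real photograph, `mean_filter` would be applied before `classify`; it is skipped here because smoothing a 4x4 image would blur the few pixels it contains.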
The three variables specified previously need to be changed from image to image for analysis purposes, as the picture quality may differ among the images. An experienced operator can adjust these variables to the optimum level after a few trials, within a few minutes. Consequently, an inexperienced operator could have a great influence on the final outcome of the image analysis. Further, different factors can contribute to the variable quality among the images, which include:
• The distance from which each picture is taken.
• The angle from which the picture is taken.
• The difference in brightness in the room at different times.
• Image resolution, if the pictures are taken with multiple devices.
The ideal case would be to capture all the images at the same distance and to keep the capturing device as flat as possible above the sample. Light in the room is also very important: it should be neither too bright nor too dim, and an effort should be made to keep the light intensity as uniform as possible throughout the image capturing. More precisely, low-brightness (dark) conditions should be avoided, as they can lead to faint colour, poor contrast, blurred details, etc. High-brightness conditions should also be avoided, as they can cause reflection or shadow effects which distort the image details.
It is difficult to observe all the above conditions at once, and sometimes it is not possible or practical; that is why provisions were made in the code to accommodate these variations.
Keeping in mind all the precautions mentioned above, quality images were taken after 6, 24, 48 and 72 h of agitation in the RBT for all the aggregates and bitumen/mastic combinations. After the acquisition of quality images, the percentage of bitumen/mastic coating was calculated using the code as previously discussed.
Images before and after processing with the Matlab program are shown in Figure 1 to illustrate how the processed image appeared and how the background, bitumen coating and aggregate surfaces were divided. These are the images for neat bitumen in combination with the different types of aggregates used in this study, after 72 h of agitation in the RBT. The white portion of the processed image represents the background, which was excluded from the total area. The grey portion shows aggregates without any bitumen coating, and the black portion shows the bitumen-coated aggregates. The percentage area of the aggregates coated with bitumen was recorded as the final result.

Results and discussion
The affinity between different aggregates and bitumen/mastic in the presence of moisture was measured using the RBT, as discussed earlier. Each of the studied combinations was repeated twice, and the results were within a 5% repeatability range. As the allowable limit for the repeatability of the RBT reported in BS EN 12697-11:2012 is 20%, the results of this research are well within that range and can be considered to have very good repeatability.
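The repeatability check amounts to comparing the difference between two repeat results with a limit. A minimal Python sketch is given below; the two limits are the values quoted in the text, while the function names and the pair of repeat results are hypothetical.

```python
STANDARD_LIMIT = 20   # repeatability limit quoted from BS EN 12697-11:2012 (%)
OBSERVED_RANGE = 5    # range within which the repeats of this study fell (%)


def repeat_difference(first_run, second_run):
    """Absolute difference between two repeat coating results (%)."""
    return abs(first_run - second_run)


def within_limit(first_run, second_run, limit):
    """True if the two repeats differ by no more than the given limit."""
    return repeat_difference(first_run, second_run) <= limit


# Hypothetical pair of repeat results for one binder/aggregate combination.
first_run, second_run = 60, 65
print(within_limit(first_run, second_run, OBSERVED_RANGE))
print(within_limit(first_run, second_run, STANDARD_LIMIT))
```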
The results for granite aggregates and different combinations of bitumen/mastic are presented in Figure 2. From the figure, it can be seen that the binder coating decreased significantly with time, especially for the 40/60 bitumen and for the mastic containing 0% hydrated lime (50% G). For the mastics containing 10% and 20% hydrated lime, the retained percentage coating was considerably better than for neat bitumen and the mastic with 0% HL. Although the mastic with 20% HL showed slightly better results than the mastic with 10% HL, there was only a small difference between them.
For limestone aggregates (Figure 3), an insignificant difference was recorded between the three mastics and the 40/60 bitumen. In contrast to Figure 2, the mastic having 0% HL (50% LS) performed slightly better than those containing 10% and 20% HL. There was only a marginal difference between neat bitumen and all three mastics used in combination with limestone aggregate. Based on these facts, it can be concluded that hydrated lime was not beneficial in the case of limestone aggregate.
From Figure 4, it can be noted that when basalt aggregate was used in combination with neat bitumen and the three mastic types, the beneficial effects of HL were obtained. Again, the performance of neat bitumen and the mastic with 0% HL (50% Basalt) was similar, but with the addition of 10% and 20% HL, the coating percentage remained at a higher level. It is worth noting that there was really no difference between the performance of the mastics having 10% HL and 20% HL and both gave good results in terms of higher affinity for aggregate compared to the mastic without HL.
The results for the greywacke aggregates (as summarised in Figure 5) also depict the beneficial effects of incorporating HL. Although the results were not as discriminatory as they were in the case of granite and basalt aggregates, a positive effect of HL addition was still noted at 10% and 20% HL in comparison with neat bitumen and the mastic with 0% HL.
Looking at Figures 2-5 carefully, it is clear that there was practically no difference between the different combinations within the same aggregate type at the 6 h visual or photographic inspection of the samples. After 24 h, the difference was obvious in some of the aggregate types; for example, in the case of granite aggregates, the values of percentage coating for neat bitumen, mastic with 0% HL, mastic with 10% HL and mastic with 20% HL were 25, 70, 75 and 80%, respectively. In the case of limestone aggregate, the percentage coating values at 24 h for the different combinations were not far from each other. The basalt aggregates showed a clear difference, and with greywacke aggregates a small difference was observed after 24 h. Furthermore, at the 48 h inspection, the granite aggregate showed more distinctive results between its different combinations. The values of percentage coating for neat bitumen, mastic with 0% HL, mastic with 10% HL and mastic with 20% HL were 15, 40, 60 and 65%, which was more discriminatory than the 24 h values. Again, limestone aggregates showed nearly no difference between their various combinations. The basalt and greywacke results after 48 h looked similar to each other, and both showed clear differences between their various combinations. This difference was slightly larger than that seen after 24 h.
Among all the aggregate types, the percentage coating for granite and basalt aggregates was the most distinctive. Greywacke also showed a significant effect with HL addition, but with limestone HL did not show any effect; in fact, all the combinations were similar to each other. This behaviour of HL with the limestone aggregates is supported by past research, in which it has been reported that hydrated lime was more effective in asphalt mixtures having siliceous aggregate than in those having limestone aggregate (Bagampadde et al., 2004; Grönniger et al., 2010; Hicks, 1991).
To compare the performance of the different aggregate types with and without HL, the results after 72 h are summarised in Figure 6. The figure shows a clear difference between the performance of the different aggregate types with their various combinations.
From the results for neat bitumen, it can be seen that it had the worst performance with granite aggregates. On the other hand, neat bitumen performed best with limestone compared to the other three types. Basalt and greywacke exhibited intermediate performance. Similarly, the mastic with 50% mineral filler of the respective aggregate showed quite variable performance, with the best performance for limestone, then greywacke and basalt, and again the worst performance for granite. With the addition of 10% HL to the mineral filler, the performance of some aggregate types jumped to a significantly higher value. For example, the percentage coating for the granite aggregate jumped to 55% compared to 35% without HL (50% MF). Similarly, the percentage coating increased by 50% in the case of basalt aggregate. For the limestone aggregate, a slight decrease was observed with the addition of HL, but greywacke followed the same trend as the granite and basalt aggregates and showed an increase of 30% in coating with the addition of 10% HL. The performance of the mastics with 10% and 20% HL was in most cases very similar. This means it may not be worthwhile to add 20% HL to the mastic, as the additional HL did not improve the performance by the same amount as the initial 10% HL addition.

Comparison between visual observation and processed image results
A comparison between the observations made with the naked eye by the operators and the results obtained after image analysis is provided for granite, limestone, basalt and greywacke aggregates in Tables 5-8, respectively. The values of the bitumen upper limit (bul) and aggregate upper limit (aul) used in the Matlab code for the calculation of percentage coating are also presented. From these tables, it can be observed that the results obtained with the naked eye and the results calculated after image analysis were very similar to each other. There was some difference between the two results, as one was a visual (subjective) observation and the other was digitally computed using Matlab. The chance of error in the results computed using Matlab would be lower than in the observations made with the naked eye by the two operators. It is therefore recommended to analyse the results of the RBT using image analysis techniques rather than just relying on naked-eye observations.
Some of the recent studies on RBT results analysis have reported the difficulties in image analysis with aggregates having a dark colour such as basalt, where the pixels of bitumen coated aggregates appeared the same as pixels of dark coloured aggregate (Lantieri et al., 2017;Yuan et al., 2015), but no such difficulty in the analysis was found using the code produced as a part of this research.
To further support the statement made above that the results obtained by visual inspection and those computed after image analysis were similar to each other, the results of each aggregate combination obtained by visual inspection were plotted against the results obtained after image analysis, as presented in Figure 7. The R-squared value, which is a statistical measure of how close the data are to the fitted regression line, was calculated for each aggregate type. In Figure 7, it can be observed that the R-squared value ranges from 0.88 to 0.94, which indicates a good fit. The two sets of results were therefore not far from each other and can be referred to as similar.
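For reference, the R-squared of a straight-line fit can be computed as in the Python sketch below. It is a generic ordinary least-squares illustration, not the procedure used to produce Figure 7; the function names and the four data pairs are hypothetical and are not values from this study.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y):
    """Coefficient of determination of the fitted regression line."""
    slope, intercept = linear_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical visual vs image-analysis coating percentages (illustration only).
visual = [80, 65, 45, 30]
image = [75, 65, 50, 25]
print(round(r_squared(visual, image), 3))
```

A value close to 1 means that the image-analysis results follow the visual results closely along the fitted line; it does not by itself show that the two methods give the same absolute values.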
The frequency distribution of the difference between the visual and image analysis results for all four aggregate types is shown in Figure 8. From the figure, it can be observed that the difference between the visual and image analysis results for most of the aggregate types is within ±10%, so again it is concluded that the visual and image analysis results were similar. However, the chance of error in the results computed using image analysis may be lower than in those obtained by visual examination. As reported in the literature, the operator generally estimates the remaining bitumen coating less accurately by visual examination than digital processing does (Lantieri et al., 2017). The chance of error is therefore higher with visual examination than with image analysis.
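A frequency distribution of this kind can be tabulated as in the following Python sketch. The function names and the data are hypothetical and serve only to show the calculation; they are not the values plotted in Figure 8.

```python
from collections import Counter


def differences(visual, image):
    """Visual minus image-analysis coating for each sample (percentage points)."""
    return [v - i for v, i in zip(visual, image)]


def difference_distribution(visual, image):
    """How often each difference occurs."""
    return Counter(differences(visual, image))


def share_within(visual, image, tolerance=10):
    """Percentage of samples whose difference lies within +/- tolerance."""
    diffs = differences(visual, image)
    inside = sum(1 for d in diffs if abs(d) <= tolerance)
    return 100.0 * inside / len(diffs)


# Hypothetical results for four samples (illustration only).
visual = [80, 70, 60, 40]
image = [75, 70, 65, 25]
print(sorted(difference_distribution(visual, image).items()))
print(share_within(visual, image))
```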
In line with the procedure adopted in previous research (Fareed et al., 2020; Haider et al., 2020a; Haider et al., 2020b), a statistical analysis was also performed using Tukey's method in SPSS software to study the statistical significance of the results. The analysis was performed for the results obtained at 72 h. The dependent variable was the remaining coating after 72 h obtained through visual observation or image analysis. The independent variable was the sample combination, with four categories for each type of aggregate. The subsets obtained via Tukey's method, with the mean values for visual and image analysis, are provided in Tables 9 and 10, respectively. It can be observed that, for granite, the mean values for 40%G+10%HL and 30%G+20%HL are in different subsets from 50%G and neat bitumen for both visual and image analysis. Therefore, it is confirmed that the effect of adding hydrated lime with granite aggregates was statistically significant. For limestone, the means for all combinations are in the same subset for both visual and image analysis, which confirms the previous observation that adding hydrated lime with limestone aggregates was not beneficial. For basalt, the means for 40%B+10%HL and 30%B+20%HL are in a different subset from 50%B and neat bitumen, which also confirms the beneficial effect of adding hydrated lime with basalt aggregates (the same trend for visual and image analysis). Finally, for greywacke aggregates, it can be noted that 40%GW+10%HL, 30%GW+20%HL and 50%GW are in the same subset, which differs from that of neat bitumen, for visual observations. In contrast, for image analysis, 40%GW+10%HL and 30%GW+20%HL are in the same subset, which differs from that of 50%GW and neat bitumen. As the chance of error is higher with visual observations, as discussed previously, the latter subsets can be taken as the more accurate.
Thus, it can be said that adding hydrated lime with greywacke aggregates was significantly effective. In addition, it can be seen that the 10% HL and 20% HL substitutions appear in the same subsets for every aggregate type. This supports the earlier analysis that the 20% HL substitution had a similar influence to that of the 10% HL substitution.

Conclusions
The rolling bottle test was successfully performed on all the combinations used in this study and was quite useful in discriminating between different combinations. An image analysis technique using Matlab software was used for the calculation of the percentage retained coating after 6, 24, 48 and 72 h of agitation. The following conclusions can be drawn based on the results presented in this paper:
• The results after image analysis were compared to the results of naked-eye observation, and the difference between them was less than 10%, which is lower than the repeatability of the test. Hence, the RBT can now be used confidently for the comparison of different combinations using image analysis.
• The chance of error in visual inspection can be higher, so this image analysis technique can replace the visual inspection method, hence improving the reliability of the rolling bottle test technique.
• The beneficial effects of HL addition were clearly quantified with granite, basalt and greywacke aggregates, but limestone aggregates did not respond to the addition of HL when tested in the rolling bottle test.
• A 10% HL substitution was found to be more efficient than a 20% HL substitution, as it gave very similar results in most of the studied combinations.

Disclosure statement
No potential conflict of interest was reported by the authors.