Complexity Reduction of Multiscale UTLM Cell Clusters

Time-domain electromagnetic simulations employing unstructured tetrahedral meshes offer smooth boundary approximations and graded meshes for multiscale problems. However, multiscale effects may arise not only as a consequence of fine geometry but also from CAD and mesh generation artifacts, and it is critical that the simulation algorithms can be employed in their presence without unduly compromising their computational performance. The ability of the unstructured transmission line modeling (UTLM) algorithm to coalesce small computational cells into larger entities is a key enabler for the approach. This paper demonstrates the use of complexity reduction techniques to both notably reduce the preprocessing time required for this and, as a consequence, substantially extend its capability.



Abstract-Time domain electromagnetic simulations employing unstructured tetrahedral meshes offer smooth boundary approximations and graded meshes for multiscale problems. However, multiscale effects may arise not only as a consequence of fine geometry, but also from CAD and mesh generation artifacts and it is critical that the simulation algorithms can be employed in their presence without unduly compromising their computational performance. The ability of the Unstructured Transmission Line Modeling (UTLM) algorithm to coalesce small computational cells into larger entities is a key enabler for the approach. This paper demonstrates the use of complexity reduction techniques to both notably reduce the preprocessing time required for this and as a consequence, substantially extend its capability.

I. INTRODUCTION
Electromagnetic simulations using numerical techniques are an established tool for many technological disciplines and, notwithstanding the inexorable increase in computational power, the demands for assessment of ever larger and more complex problems motivates their continued development, [1][2][3][4].
Multiscale problems are particularly challenging and the need to mesh down to the scale of the smallest feature can not only consume substantial memory, but also more critically for time-stepping algorithms, require the use of an impractically small value of time step in the simulations, often for reasons of algorithmic stability. In practice, multiscale geometries are present in many fields of study and here we just highlight Electromagnetic Compatibility (EMC) studies in the aerospace domain. Complex bundles of thin wires, thin panels of, for example, carbon fiber laminates, are routinely found within whole aircraft or equipment bays which are significantly larger in scale, [5][6][7]. However, there is a further source of multiscale effects which is often overlooked but which can be devastating for the ability to perform an electromagnetic simulation; CAD and meshing artifices.
Large-scale simulations drawing their geometrical inputs from CAD data are routinely plagued by inconsistencies such as nanoscale gaps, misalignments and overlaps. Often, these issues are not problematic for the originator of the CAD data, for example, manufacturing engineers, but their repair is a The authors are with the George Green Institute for Electromagnetics Research, The University of Nottingham, Nottingham, UK phone: +44 1159515567 fax: +44 1159515616; e-mail: phillip.sewell@nottingham.ac.uk. substantial time consumer for the EMC modeler. Furthermore, once the physically significant inconsistencies have been removed, subsequent mesh generation can further contribute to the problem. Slightly misaligned features can lead to localized refinement of the mesh well beyond that demanded by the need to represent the physical phenomena. Fig. 1 shows an example of this: The slight misalignment between the core of a feeding coaxial probe and the central conductor of a Vivaldi antenna, [8,9], causes exactly this situation.  [8,9] leads to many tiny cells appearing.
As CAD repair is a labor intensive, and thus expensive, activity, there is a strong motivation to equip electromagnetic simulation methods with a robust degree of immunity to these localized effects. The work presented in this paper considerably advances a technique which has been developed for just this purpose and is based upon the Unstructured Transmission Line Modeling (UTLM) method.
TLM methods have been used for a range of physical simulations for over 40 years, [2,3,[10][11][12] and their core distinguishing feature is the mapping of electromagnetic problems onto equivalent electrical network problems. Space precludes providing a complete description of this family of approaches here, but referring to Fig. 2 we provide a brief summary of the UTLM variant. The problem space is decomposed into a Delaunay mesh [13] of tetrahedral cells. Tetrahedral cells provide smooth boundary models and permit significant mesh grading, [10,[14][15][16][17]. On the four triangular faces of each cell a pair of field samples is established and by considering local analytic solutions to Maxwell's equations within the cell, it is possible to derive an admittance operator that relates the sampled electric and magnetic field values. This admittance operator permits an implementation as an electrical network node employing commensurate lengths of transmissions line of different characteristic impedances, but  The value of time step used by the UTLM algorithm is, as with all similar algorithms, a fundamental parameter. It is often stated that TLM is unconditionally stable, but that the time step must be smaller than a particular value. This may appear semantically no different than declaring a stability criterion: However, the key difference is that as the time step is increased, there is a cell-by-cell remedial process, [11], that guarantees stability, but accuracy, and eventually physical meaningfulness, is the compromised quantity. Typically, a straightforward implementation of UTLM as described above needs to use a time step comparable to the value prescribed by the Courant condition to maintain acceptable accuracy.
A breakthrough enabler for UTLM has been the use of implicit cell clusters, [11]. Facilitated by the availability of an equivalent circuit representation, it has been possible to implicitly preprocess clusters of small tetrahedral cells and extract larger scattering entities whose algorithmic implementation can be expressed in a canonical form which is good for computational efficiency. The most significant aspect of such cell clustering is that, whilst initially it does not change the overall response of the cluster, rather only its implementational detail, it does partition the cluster behavior into two distinct parts; the physically meaningful and that due to the particulars of the mesh detail. The latter behavior, which can be regarded as sampling noise, only contributes to the dispersion errors of the simulation and although such behavior must be present to ensure causality, it is obviously not important to respect its detail. As the time step is raised it first affects these noise terms and, only after a few orders of magnitude increase, do the physically meaningful terms become affected. Typically, clusters can be easily identified by specifying a threshold length, and adjacent cells separated by less than this amount are coalesced into a cluster. To date, clusters of many 10s of tetrahedra are routinely formed in our studies and the only limit is the time required to pre-process their response, which as shall be discussed below, practically means less than 200 tetrahedra. The time step is then set by requiring that the behavior of the remaining isolated tetrahedra which, by definition of the threshold value, are of physically significant size, is not affected. As stated above, this often means that a time step orders of magnitude larger than the Courant condition value can be used. Finally, it is noted that clustering also provides a solution to the sliver tetrahedra problem which affects methods such as finite elements. Sliver cells are simply joined with their well-shaped neighbors to form a cluster.
Clustering has been the critical enabler for dealing with multiscale effects, both geometrical and due to mesh and CAD artifices. Unfortunately, the limit on its usefulness is the preprocessing time. Furthermore, it shall be shown below, that the time stepping implementation can also benefit from a valuable reduction in complexity. The objective of this paper is to extend the scale of clustering by significantly reducing its computational requirements.
II. THEORY Fig. 3. The clustering concept illustrated by a pair of adjacent cells whose circumcenters are separated by a small distance Δ. (a) Without clustering, the time step is typically required to be 2 Δ (b) With clustering, this time step constraint is removed. Fig. 3a shows the canonical equivalent network associated with an adjacent pair of unclustered tetrahedral cells. The electromagnetic response of each cell comprises a superposition of the behavior of low order electric and magnetic type dipoles and multipoles [10]. The link line characteristic impedances, specifically the line inductances, have been chosen such that the magnetic dipole behavior is correctly mimicked by the network response. The link line joining the two cells is present to provide an inductance proportional to the distance between the cell circumcenters. As these centers approach each other, the time step needs to typically remain below 2 Δ in order that the parasitic capacitance associated with the line does not become dominant. Fig. 3b shows the essence of the clustering approach: replace the link line by an inductive stub and determine an overall scatter response for the pair. Note that the network of Fig. 3b permits a component of a signal to algorithmically transit between the cell centers instantaneously which is consistent with the proposition of their close proximity whereas that of Fig 3a always incurs a minimum delay time of , thus demanding that a small value of be used.
The network problem of Fig. 3b can be solved efficiently by first considering the compact representation of Fig. 2c [11]). In this work, matrices are denoted by bold type. The diagonal matrices and contain values for the transmission line characteristic impedances and the area of each associated cell face which has a normal vector n . Each characteristic impedance is of value where = ⁄ is the characteristic impedance of the material, and being the permeability and permittivity. Δ is the distance between adjacent tetrahedral circumcenters. The terms u appearing in (1) are the face sampled constant field vectors associated with the three electric dipoles modeling the quasi-electrostatic behavior of each tetrahedron and their orientations are determined by a small eigenvalue equation, [10]. is a vector of amplitudes for the electric dipoles. The underpinning technique used to develop the UTLM algorithm is to seek eigen-responses of (1) which are defined as pairs of vectors and which are related by a simple scaling factor, i.e.
= . The physical significance of the eigenvalues is shown in Fig. 4 after introducing the notion of incident and reflected voltages on the transmission lines of Fig.  2b, i.e. = and = . Each eigensolution provides a set of amplitudes such that, if this set of values were incident from the transmission lines onto the cell, it would reflect without distortion in shape via a simple reflection coefficient Besides such solutions, (1) is rank deficient and possesses solutions with =0, I 0 and with = ∞. These physically correspond to the quasi-magnetostatic fields and in a similar manner to (2) give = Combining these results yields the algorithm presented in Fig.  4 which can be expressed as where = and Q is a unitary matrix whose columns are the solutions of (1).
The time delays 2 are implemented by using the open circuited stubs, shown in Fig. 2b, [11]. Clearly the practical viability of (4) as an algorithm depends upon whether Q is frequency dependent. Consider the frequency independent eigenproblem It is straightforward to show that the eigenvectors of (6) are also eigenvectors of (1) and that the eigenvalues are related as = which to second order accuracy are also the same. To extend this technique to cell clusters is straightforward with the natural generalization of (1) and (6): Multiple instances of (1) are linked by the short stubs shown in Fig. 3b and eigensolutions for the whole combination sought in a similar manner to that just described. The critical point that permits the cluster eigenvectors to remain frequency independent is that enforcement of the short circuiting of the inductive stubs in Fig.  3b is deferred until time stepping. (Otherwise selected elements of V in (1) would be 0 and this prevents use of (6) except for the particular case that the corresponding elements of z are also zero.) Whilst the clustering approach is very successful, at this point the scaling of its solution with the number of cells in the cluster becomes problematic and this is the focus of the remainder of this paper.
Consider a cell cluster of N cells. There are 3N elements in the vector and, for large N, 2N elements in the vector I. Direct evaluation of eigenproblem (1), which is a generalized eigenvalue problem of the form = , may be achieved with the QZ algorithm [18]. Iterative approaches are not appropriate as we do require all, not just selected, solutions. The S S Stub short circuits (shown in Fig. 3b) enforced as part of the connection solution time for the problem is of the order N 3 which is untenable for large N. Moreover, (1) is not a well behaved problem. There exist solutions with = ∞, = 0, and the indeterminate = 0/0 cases where both and are zero. Numerically, this severely compromises our ability to extract the solutions that are actually needed. Finally, as suggested in Fig. 4, the resulting time stepping algorithm actually explicitly implements the short circuiting of all inductive stubs each of which physically corresponds to a geometrical close proximity and it ought to be expected that there is scope to sensibly reduce the number of degrees of freedom by seeking a reduced number of equivalent stubs that achieve the same net effect.
The remainder of this section will address the following three issues. (1) Robust solution of (6) for moderate sized N. (2) A complexity reduction process that reduces the both the number of solutions columns in (5) and also the number of inductive stubs required to be present when time stepping. (3) A hierarchical combination of the robust eigensolution and complexity reduction phases for processing very large clusters.

a. Robust Cluster Eigensolving
In (6), let the number of elements in the vectors I and be NI and NX respectively. The overall matrix is of size NI+ NX but no more than NI solutions with finite are expected due to the form of the left hand side. This can be exploited by scaling through to give The singular value decomposition (SVD), [18], reveals solutions of the form Unfortunately, there are physically meaningful cases where elements of both z and C are zero which causes problems. It is commonplace in real meshes for the circumcenters of groups of tetrahedra to identically coincide (for example, the 5 or 6 constituent tetrahedra of a cuboid) and these yield z=0 terms. Values of C=0 can arise for extreme sliver cells and also when a cell has a coincident circumcenter with all four of its neighbors. Examination of the terms of (1) permits the following tableau picture of the situation to be presented. The symbol * denotes unlabeled non zero blocks and the elements 0 and 0 denote the zero values of z and C.
Useful observations are that, as indicated, the number of nonzero elements of z is always greater than the number of nonzero elements of C due to the origin of the terms in (1). Also, the block labeled c is always 0 for similar reasons. (11) can be explicitly deflated to remove the awkward 0 z and 0 C terms. This can be achieved by a sequence of Householder reflections or more robustly, but less efficiently, using SVD in a similar manner, [18]. For example, using Householder reflections, denoted by matrices H with corresponding upper triangular matrices denoted by R which, noting that = 0 and = 0, reveals the more compact problem amenable to solution via SVD as per (8) to (10) and from which all other components of I and follow.
The procedure just described has been deployed and has proved remarkably robust. However, at its core are a number of SVD steps which scale cubically with size. For clusters up to a 200 cells the run time is tractable, however beyond that it becomes unacceptable.

b. Complexity Reduction
Referring to Fig. 4 it is clear that whilst the clustering achieves its objective of permitting larger time steps, it does not reduce the computational complexity associated with a large cluster. It has been argued that many very large clusters follow the pattern of Fig. 1, tiny cells embedded within increasing size cells with the largest cells often forming the outside surface of the cluster. For example, one can conceive of a single large tetrahedron which is of suitable size for a particular physical scenario and with just 8 field samples present on its outer surface that has been subdivided internally to contain a very large number of smaller cells. The clustering approach still requires a computational overhead at run time consistent with the number of all the smaller cells, even though the outer cell only presents 8 degrees of freedom. All the many degrees of freedom associated with the tiny cells are essentially filtered down to 8 quantities and only serve to contribute subtle variations in the time response of the cell as the seen by the rest of the problem.
The purpose of this section is to minimize the number of degrees of freedom a cell cluster must explicitly contain whilst maintaining physical integrity. To this end, it is helpful to distinguish the voltages and currents present in the evaluation of a cluster response (e.g. in (1) and (6)) which are on the outer surface of the cluster and hence in contact with the external world, Vex and Iex, and those which are internal, Vin and Iin.
From (1) and (4)(5)(6), it is possible to write Equation (17) embodies an impedance matrix of order Nex+Nin, where Nex and Nin are the number of terms in Vex and Vin respectively. The desire is to collapse this to one of order Nex without compromising the physical significance. It is to be noted that the form and size of (17) is due to the fact that the short circuiting of the inductive stubs shown in Fig.  4 is implemented during time stepping. It is recalled that if this were not the case, the response times and, more problematically, the matrices Q used in (4) would be frequency dependent. Thus the demand for frequency independent Q is being bought at some significant cost in terms of problem size. However, in a similar vein to the introductory discussion on the clustering approach, it is pragmatic to suggest distinguishing between strong frequency dependencies attributable to physical phenomena and weak frequency dependencies that only affect the sampling noise and precise dispersion characteristics. Moreover, in the context of cell clustering for multiscale purposes, even though a cluster may contain many thousands of cells, its volume may still be relatively small and therefore this should permit exclusion of certain frequency dependencies by asymptotic arguments. As with the clustering approach so far, the key is to manipulate the algorithm into a form which facilitates this operation.
Referring to Fig. 4, it is known that the short-circuiting of the stubs during time stepping results in a behavior equivalent to setting = (currents are positively oriented into the cluster) and similarly accounting for the inductance on the external link lines, combining (18) with (17)  As discussed above, (20) has a reduced dimensionality, but frequency dependent, representation. A further manipulation yields This presents the impedance as a generalized Foster form [19] as can be clarified by a further SVD step which corresponds to the circuit shown in Fig. 5. It is noted that the matrix√ 1 √ is that which would arise if all the stub impedances in, for example, (1) were zero which gives a basis for asserting that it is positive semi-definite. At this point, the separation of the strong and weak frequency dependencies can be made on physical grounds. Each of the parallel LC circuits exhibits a resonance which precludes further simplification. However, in most cases the whole cluster does not encompass sufficient geometry for any of these to be physically relevant and indeed the LC circuits are well approximated by just a simple inductance over the full useful frequency range of the method. Hence It is commented that in the event that a number of the resonances do have a physical significance, there is no reason why selected LC circuits cannot be implemented in their complete form. This model represents the physical expectation that the quasi-electrostatic behavior of the cluster maps to a capacitive network with an inductive correction which increases as the volume of space the cluster encompasses increases.
(24) can now be compared to (1) by expressing it as with a view to processing it, via the eigensolutions, = , to obtain a canonical implementation of the form shown in Fig. 4. However, this cannot be done immediately due the fact that the term , which physically corresponds to the link line impedances, is not a diagonal matrix. One might consider seeking solutions of the generalized eigenproblem defined by the matrix pair and , however, it is straightforward to show that this yields solutions such that matrix Q in (5) is no longer unitary and hence the algorithm of (4) would be unstable.
To proceed requires making a second order accurate approximation. Recall that there are two types of solutions to (25), those which lie entirely within the null space of p, corresponding to the quasi-magnetostatic case, and those which overlap the range of p corresponding to the quasi-electrostatic case. The latter are sought using a diagonal approximation to → , i.e. from Note that the values contained in now become the link line impedances of the transmission lines connecting the cluster with the rest of the problem, i.e., these link lines contribute a part of the net inductance seen looking into the whole cell cluster but are no longer associated with particular geometrical distances.
Denoting the solutions to (26) by , the quasimagnetostatic solutions are defined to be orthogonal to this set and satisfy (24), hence The eigenvalue physically corresponds to the total value of inductance that each particular quasi-magnetostatic solution sees when incident on the cluster. This is partly provided by the link line impedances and the remainder is now provided by a short circuit stub. Referring to Fig. 4, this leads to an algorithm of the same form as originally developed for the cluster, except that each inductive stub no longer represents a particular geometrical feature and they are fewer in number. As before, the enforcement of the short circuiting of the inductive stubs will be deferred to time stepping and Fig.6 summarizes the practical algorithm. The diagonal approximation may be selected in a number of ways, for example, = or = . However, the particular choice will only affect the precise nature of the second order dispersion error and there are a number of consequences to consider. If = , then all the values of are greater than 1, but for other choices some drop below 1 which leads to instability. However, the smaller values of correspond to high order spatial fields and in the context of cell clustering, may again be regarded as modeling mesh noise and thus may be legitimately approximated by setting = 1 which also has the added attraction of removing a number of stubs from the algorithm. Indeed, as explained in the introduction, the clustering approach already adopts the same approach towards the high order spatial quasi-electrostatic solutions corresponding to small values of , by setting = .
In summary, the manipulation of this section, coupled with the physically grounded approximation of resonance phenomena that are only relevant at frequencies where dispersion noise dominates, has reduced the number of degrees of freedom from Nex+Nin to of the order Nex. In fact, this gain proves even more useful in the context of hierarchical clustering discussed in the next section.

c. Hierarchical Clustering
The foregoing analysis and the hitherto deployed scheme described in section (a) make significant use of SVD which scales cubically with matrix size. The purpose of this section is to propose the construction of the responses of large cell clusters using a subdivision approach. It is straightforward to modify (1) or (25) to permit this: In place of the dipole related terms in (1), √ and , will appear the quantities Q and in the manner of (5) and (6) of the smaller constituent clusters that have already been processed. The previous section has reduced the degrees of freedom of the constituent sub-clusters to the number of exterior sample points, removing all their internal detail which is a substantial gain.
III. RESULTS Fig. 7. The meshed Vivaldi antenna, the full details of which are given in [8,9]. The first example presented derives from the geometry shown in Fig. 1, a compact Vivaldi antenna [8,9]. The geometry has been meshed relatively crudely as shown in Fig. 7. The antenna is of length 55 mm and so the threshold Dth controlling cluster formation is set at 0.1 µm. Table 1 shows that many small clusters are formed which is typical. However, there are two large clusters approaching 700 cells. These can be located where the misalignment of Fig.1 causes very fine localized meshing. The cluster with 695 cells is shown in Fig. 7c and it is seen that the outer surface of the cluster in contact with the rest of the mesh is much cruder than the highly resolved interior detail. The overall cluster fits within a cube of side 2 µm and is barely geometrically significant in the overall problem context. The cluster problem solved by (1) has 695 cells and Nex=112 and Nin=2892. Note that many of the numerous faces visible in Fig. 7c are adjacent to the perfect conductor of the antenna and so define short circuit boundary conditions rather than belonging to the set of external faces. Fig. 8a shows the values of recovered from (1). There are 2006 solutions and there is little to distinguish which are the most significant contributors to the physical meaningful behavior of the cell. After applying the complexity reduction process using alternatively = and = , the total number of solutions to (26) is 24 in both cases. Fig 8b shows the associated inductive stub values, , recovered from (27), numbering 79 and 89 respectively. For the case of = , these remain above, albeit many approach, 1. For = , half of the values of drop below 1. If the latter set were to be used these would be given a value of = 1 which removes the need for a stub. The net value of the complexity reduction on this example is that at time stepping the size of the matrix Q in (5) is reduced from 112*2892=323904 elements to (24+89)*(112+89)= 22713, which is a factor of 14 saving in both memory and calculation time.  The benefit of the previous example has been a significant gain at run time, albeit with increased, rather than reduced preprocessing time. Fig. 9 shows the substantial improvement if a recursive cluster splitting policy is adopted. A threshold number of cells, NT, is defined and clusters with more than this number are split into two, each part of which is solved independently using (1), (26) and (27). The solutions from each part are then combined by a further application of (1), (26) and (27). Naturally this split-solve-reduce-combine-reduce strategy is applied recursively. Fig. 9a shows the preprocessing time required as the threshold level NT is varied for cell clusters of different sizes. (These are obtained by changing the clustering parameter that gave Fig. 7c and so are related. All timings in this paper were obtained using a serial code running on a Sandy Bridge CPU). The case of = is considered and all stub solutions to (26) and (27) are kept. A very large improvement in run preprocessing time is observed which reaches its optimum when NT~100 as shown in Fig. 9a. Below this value the ratio of Nex to Nin, which is analogous to a surface area to volume ratio, is no longer small enough to provide an advantage and the bookkeeping overhead of the recursion then causes the time to increase again. Fig. 9b plots the preprocessing time versus overall cluster size and it is clear that the approximately cubic relationship of the scheme without using recursion or complexity reduction becomes more logarithmic in nature when both are employed and that recursive splitting without complexity reduction is of little value. Space precludes extensive explicit demonstration, but it is stated that when using complexity reduction, the recursive approach yields the same net solutions as the non-recursive case consistent with the second order accuracy of the overall method. An overall simulation example is given below to support this contention. In this work, the splitting of clusters was performed using [20,21] seeking equal volume partitioning. There is clearly scope to further investigate both this aspect and the choice between using = and = or else another possibility. Space precludes further comment here, but it is clear from Fig 9b that the ability to viably move to using clusters of 1000s, rather than 100s of cells in simulations involving multiscale effects strongly motivates continued study.    Fig. 10 shows a second test case arising from a study of complex wiring configurations [22]. The scale of the wiring detail is significantly smaller than that of its enclosure (not shown) and Fig. 11 shows a section through the tetrahedral mesh which is extremely multiscale in nature and not unrealistic for aerospace scenarios. This example differs in nature to the Cluster size expressed as the number of coalesced tetrahedral previous one as now the multiscaling is physically intended and not due to CAD and meshing artifacts. Fig. 12 shows the numbers and sizes of clusters formed for different values of the clustering threshold and Table 2 presents the total preprocessing time of the clusters with and without use of the hierarchical complexity reduction approach. Table 2 shows that for all threshold values, the time required to perform the complexity reduction approximately doubles the preprocessing time. However, use of the hierarchical scheme compensates for this and for Dth=0.4 and 0.5 mm, provides a net improvement. Whilst these results may not appear as impressive as those for single large clusters arising from the Vivaldi antenna problem, i.e. Fig. 9, it must be recognized that the preprocessing task parallelizes perfectly on a cluster by cluster basis, so that the true impact of a few large and very slow to process clusters is that they undermine the ability to load balance the parallelization and in fact dominate the total time almost as if serial computation were used. This argument is confirmed by the case of Dth=0.5 mm which is significantly improved by the hierarchical scheme. In this case, there are a total of 165799 clusters but only 31 have more than 200 cells and it is these that dominate the calculation time. Table 2 also shows a measure of the impact of the complexity reduction on the run time performance. The accumulated size of all the matrices Q, (5)¸ quantifies the amount of parameter (not variable) data that must be retrieved through cache at each time step. Moreover, the time to evaluate the scattering using (4) linearly depends upon the number of elements of the Q matrices. Therefore, this is a useful measure of the impact of the complexity reduction scheme on the computational efficiency of the time stepping algorithm. It is clear that the complexity reduction scheme provides a consistent and useful gain on this measure. Fig. 13 shows more detail of how the different cluster sizes contribute to both figures of merit in Table 2 and confirms that the most significant impact is for the particularly large cluster sizes. Finally, it is noted that load balancing for parallelization of the time stepping algorithm suffers from similar problems to that of the preprocessing in the presence of large clusters, so reducing the size of their Q matrices by an order of magnitude is extremely advantageous. As this work concerns the computational efficiency of preprocessing and time stepping rather than the results of the simulations themselves, we do not show comprehensive simulation outputs due to limited space. However, it is stated that the foregoing development fully respects the second order accuracy of the UTLM technique and that the overall simulation results agree within this framework. To illustrate this point, Fig.  14b shows the return loss of a Vivaldi antenna when mounted   Return loss (dB) The 2 Curves, with and without the use of complexity reduction are indistinguishable upon a perfectly conducting plate; a configuration, shown in Fig  14a, that has been studied more fully as part of an investigation into the coupling between airborne antennas and radomes, [9], wherein the full details of the geometry and other simulation parameters are presented. With the same mesh used to generate Fig.2 of [9], a relatively large value of Dth=5 μm has been chosen which causes three large clusters of 220, 453 and 528 cells to be formed, all other clusters comprising less than 30 cells. The two curves presented in Fig. 14b for the return loss, obtained with and without the complexity reduction approach for preprocessing the simulation, are indistinguishable. This confirms that the complexity reduction does not compromise the practical accuracy of the method. However, in this example the pre-processing time is reduced by 35% and the subsequent simulation by 8% when using the complexity reduction techniques presented in this paper. This clearly shows the dominant impact that just 3 large clusters can have and why the adoption of complexity reduction schemes is important.

IV. CONCLUSION
In this work a significant expansion of the value of cell clustering for the UTLM time domain electromagnetic solver has been demonstrated. First a robust and efficient means of solving the clustering equations has been presented. Second, a complexity reduction technique has been used to substantially reduce the number of run time parameters involved in a simulation. This reduction has been achieved by a combination of SVD techniques and physically grounded approximations. Finally, a hierarchical method of evaluating large clusters has proved to be highly effective, extending the size of clusters that can be practically employed by an order of magnitude.