Differentiable but exact formulation of density-functional theory

The universal density functional $F$ of density-functional theory is a complicated and ill-behaved function of the density-in particular, $F$ is not differentiable, making many formal manipulations more complicated. Whilst $F$ has been well characterized in terms of convex analysis as forming a conjugate pair $(E,F)$ with the ground-state energy $E$ via the Hohenberg-Kohn and Lieb variation principles, $F$ is nondifferentiable and subdifferentiable only on a small (but dense) set of its domain. In this article, we apply a tool from convex analysis, Moreau-Yosida regularization, to construct, for any $\epsilon>0$, pairs of conjugate functionals $({}^\epsilon\!E,{}^\epsilon\!F)$ that converge to $(E,F)$ pointwise everywhere as $\epsilon\rightarrow 0^+$, and such that ${}^\epsilon\!F$ is (Fr\'echet) differentiable. For technical reasons, we limit our attention to molecular electronic systems in a finite but large box. It is noteworthy that no information is lost in the Moreau-Yosida regularization: the physical ground-state energy $E(v)$ is exactly recoverable from the regularized ground-state energy ${}^\epsilon\!E(v)$ in a simple way. All concepts and results pertaining to the original $(E,F)$ pair have direct counterparts in results for $({}^\epsilon\! E, {}^\epsilon\!F)$. The Moreau-Yosida regularization therefore allows for an exact, differentiable formulation of density-functional theory. In particular, taking advantage of the differentiability of ${}^\epsilon\!F$, a rigorous formulation of Kohn-Sham theory is presented that does not suffer from the noninteracting representability problem in standard Kohn-Sham theory.


I. INTRODUCTION
Modern density-functional theory (DFT) was introduced by Hohenberg and Kohn in a classic paper 1 and is now the workhorse of quantum chemistry and other fields of quantum physics. Subsequently, DFT was put on a mathematically firm ground by Lieb using convex analysis. 2 The central quantity of DFT is the universal density functional F (ρ), which represents the electronic energy of the system consistent with a given density ρ. Clearly, the success of DFT hinges on the modelling of F , an extremely complicated function of the electron density. It is an interesting observation that, over the last two or three decades, F has been modelled sufficiently accurately to make DFT the most widely applied method of quantum chemistry, in spite of the fact that Schuch and Verstraete 3 have shown how considerations from the field of computational complexity place fundamental limits on exact DFT: if F (ρ) could be found efficiently, all NP hard problems would be solvable in polynomial time, which is highly unlikely. 4 From a mathematical point of view, DFT is neatly formulated using convex analysis 2 : The universal density functional F (ρ) and the ground-state energy E(v) are related by a conjugation operation, with the density ρ and external potential v being elements of a certain Banach space X and its dual X * , respectively. The functionals F and E are equivalent in the sense that they contain the same information-each can be generated exactly from the other.
The universal density functional F is convex and lower a) Electronic mail: simen.kvaal@kjemi.uio.no semi-continuous but otherwise highly irregular and ill behaved. Importantly, F is everywhere discontinuous and not differentiable in any sense that justifies taking the functional derivative in formal expressions-even for the v-representable densities, as pointed out by Lammert. 5 For example, it is common practice to formally differentiate F with respect to the density, interpreting the functional derivative "−δF (ρ)/δρ(r)" as a scalar potential at r. However, this derivative, a Gâteaux derivative, does not exist.
Together with the problem of v-representability, conventional DFT is riddled with mathematically unfounded assumptions that are, in fact, probably false. For example, conventional Kohn-Sham theory assumes, in addition to differentiability of F , that, if ρ is vrepresentable for an interacting N -electron system, then ρ is also v-representable for the corresponding noninteracting system. 6 While providing excellent predictive results with modelled approximate density functionals, it is, from a mathematical perspective, unclear why Kohn-Sham DFT works at all.
It is the goal of this article to remedy this situation by introducing a family of regularized DFTs based on a tool from convex analysis known as the Moreau envelope or Moreau-Yosida regularization. For ǫ > 0, the idea is to introduce a regularized energy functional ǫ E related to the usual ground-state energy E by where · is the usual L 2 -norm. The convex conjugate of ǫ E is the Moreau envelope ǫ F of F , from which the regularized ground-state energy can be obtained by a Hohenberg-Kohn minimization over densities: where (v|ρ) = v(r)ρ(r)dr. The usual Hohenberg-Kohn variation principle is recovered as ǫ → 0 + . Importantly, the Moreau envelope ǫ F (ρ) is everywhere differentiable and converges pointwise from below to F (ρ) as ǫ → 0 + . We use the term "regularized" for both ǫ E and ǫ F , although it is ǫ F that, as will be shown below, becomes differentiable through the procedure. A remark regarding the Banach spaces of densities and potentials is here in order. If v is a Coulomb potential, then the regularization term in Eq. (1) becomes infinite. Moreover, the strongest results concerning the Moreau-Yosida regularization are obtained in a reflexive setting. The usual Banach spaces X = L 1 (R 3 ) ∩ L 3 (R 3 ) and X * = L 3/2 (R 3 ) + L ∞ (R 3 ) for densities and potentials, respectively, 2 are therefore abandoned, and both replaced with the Hilbert space L 2 (B ℓ ), where B ℓ = [−ℓ/2, ℓ/2] 3 is an arbitrarily large but finite box in R 3 . As is well known, domain truncation represents a well-behaved approximation: as ℓ increases, all eigenvalues converge to the R 3 -limit. Moreover, the continuous spectrum is approximated by an increasing number of eigenvalues whose spacing converges to zero.
We observe that, in the box, the difference is arbitrarily small and explicitly known-it does not relate to the electronic structure of the system and is easily calculated from v. Nothing is therefore lost in the transition from (E, F ) to ( ǫ E, ǫ F ). On the contrary, we obtain a structurally simpler theory that allows taking the derivative of expressions involving the universal functional. Moreover, the differentiability of ǫ F implies v-representability of any ρ, for noninteracting as well as interacting systems, as needed for a rigorous formulation of Kohn-Sham theory. In this paper, we explore the Moreau envelope as applied to DFT, demonstrating how every concept of standard DFT has a counterpart in the Moreau-regularized formulation of DFT and vice versa.
The remainder of the article is organized as follows: In Sec. II, we review formal DFT and discuss the regularity issues of the universal functional within the nonreflexive Banach-space setting of Lieb. 2 In preparation for the Moreau-Yosida regularization, we next reformulate DFT in a truncated domain, introducing the Hilbert space L 2 (B ℓ ) as density and potential space.
The Moreau-Yosida regularization is a standard technique of convex analysis, applicable to any convex function such as the universal density functional. We introduce this regularization in Sec. IV, reviewing its basic mathematical properties. To establish notation, a review of convex analysis is given in the Appendix; for a good textbook of convex analysis in a Hilbert space, with an in-depth discussion of the Moreau-Yosida regularization, see Ref. 7.
Following the introduction of the Moreau-Yosida regularization, we apply it to DFT in Sec. V and subsequently to Kohn-Sham theory in Sec. VI. Finally, Sec. VII contains some concluding remarks.

A. Formal DFT
In DFT, we express the Born-Oppenheimer groundstate problem of an N -electron system in the external electrostatic potential v(r) as a problem referring only to the one-electron density ρ(r). The Born-Oppenheimer N -electron molecular Hamiltonian is given by whereT andŴ are the kinetic-energy and electronelectron repulsion operators, respectively, whilev is a multiplicative N -electron operator corresponding to the scalar potential v(r). The scalar λ is introduced to distinguish between the interacting (λ = 1) and noninteracting (λ = 0) systems. By Levy's constrained-search argument, 8 the (fully interacting) ground-state energy, can be written in the form of a Hohenberg-Kohn variation principle, where I N is the set of N -representable densities-that is, ρ ∈ I N if and only if there exists a normalized Nelectron wave function with finite kinetic energy and density ρ. In Eq. (4), the infimum extends over all properly symmetrized and normalized Ψ ∈ H 1 (R 3N ), the first-order Sobolev space consisting of those functions in L 2 (R 3N ) that have first-order derivatives also in L 2 (R 3N ) and therefore have a finite kinetic energy. Different universal density functionals F can be used in Eq. (5), the only requirement of an admissible functional being that the correct ground-state energy E(v) is recovered.
Given that ρ(r)dr = N , it follows that I N ⊂ L 1 (R 3 ). As demonstrated by Lieb in Ref. 2, the universal density functional F can be chosen as a unique lower semicontinuous convex function with respect to the L 1 (R 3 ) topology. (By definition, therefore, F (ρ) = +∞ for any ρ / ∈ I N ; see Appendix A for remarks on extended-valued functions.) Moreover, by a Sobolev inequality, 2 we may embed the N -representable densities in the Banach space , with norm · X = · L 1 + · L 3 and topological dual X * = L ∞ (R 3 ) + L 3/2 (R 3 ). Given that this Banach space X has a stronger topology than L 1 (R 3 ), a convergent sequence in X converges also in L 1 . From the lower semi-continuity of F in L 1 (R 3 ), we then obtain implying that F is lower semi-continuous also in the topology of X. We note that the choice X = L 1 ∩ L 3 is not unique, but it has the virtue that all Coulomb potentials are contained in X * . On the chosen Banach spaces, the (concave and continuous) ground-state energy E : X * → R ∪ {−∞} and the (convex and lower semi-continuous) universal density functional F : X → R ∪ {+∞} are related by the variation principles In the terminology of convex analysis (see Appendix A), ρ → F (ρ) and v → −E(−v) are each other's convex Fenchel conjugates. To reflect the nonsymmetric relationship between E and F in Eqs. (7a) and (7b), we introduce the nonstandard but useful mnemonic notation which is suggestive of the "shape" of the resulting functions: The density functional F in Eq. (7b) is an extension of the universal functional F HK derived by Hohenberg and Kohn, 1 the latter functional having from our perspective the problem that it is defined only for ground-state densities (v-representable densities) in A N , an implicitly defined set that we do not know how to characterize.
It can be shown that the functional F defined by Eq. (7b) is identical to the constrained-search functional 2 F (ρ) = inf Γ →ρ Tr(T + λŴ )Γ, where the minimization is over all ensemble density matrices Γ corresponding to a density ρ, constructed from N -electron wave functions with a finite kinetic energy. A related functional is the (nonconvex) Levy-Lieb constrained search functional, 8 F LL (ρ) = inf Ψ →ρ Ψ|(T + λŴ )|Ψ , obtained by minimizing over pure states only. In any case, Eq. (7b) defines the unique lower semi-continuous, convex universal functional such that F = (F ∧ ) ∨ . In fact, anyF that satisfies the condition (F ∧ ) ∨ = F is an admissible density functional. In particular, F LL and F HK are both admissible, satisfying this requirement when extended from their domains (I N and A N , respectively) to all of X by setting them equal to +∞ elsewhere.
B. Nondifferentiability of F The Hohenberg-Kohn variation principle in Eq. (5) is appealing, reducing the N -electron problem to a problem referring only to one-electron densities. However, as discussed in the introduction, F is a complicated function. In particular, here we consider its nondifferentiability.
The Gâteaux derivative is closely related to the notion of directional derivatives, see Appendix A. A function F is Gâteaux differentiable at ρ ∈ X if the directional derivative F ′ (ρ; σ) is linear and continuous in all directions σ ∈ X, meaning that there exists a δF (ρ)/δρ ∈ X * such that However, F is finite only on I N . In a direction σ ∈ X such that (ρ(r) + σ(r)) dr = N , F (ρ + sσ) = +∞ for all s > 0, implying that F ′ (ρ; σ) = +∞ and hence that F is not continuous in the direction of σ. The same argument shows that F is discontinuous also in directions σ such that the density ρ + sσ is negative in a volume of nonzero measure for all s > 0. Abandoning strict Gâteaux differentiability for this reason, we may at the next step investigate whether the directional derivative exists and is linear for directions that stay inside X + N , the subset of X containing all nonnegative functions that integrate to N electrons. After all, the discontinuity of F in directions that change the particle number is typically dealt with using a Lagrange multiplier for the particle number constraint. However, Lammert has demonstrated that, even within X + N , there are, for each ρ, directions such that F ′ (ρ; σ) = +∞, associated with short-scale but very rapid spatial oscillations in the density (and an infinite kinetic energy). 5

C. Subdifferentiability of F
Apart from lower (upper) semi-continuity of a convex (concave) function, the minimal useful regularity is not Gâteaux differentiability but subdifferentiability (superdifferentiability), see Appendix A. Let f : X → R ∪ {+∞} be convex lower semi-continuous. The subdifferential of f at x, ∂f (x) ⊂ X * , is by definition the collection of slopes of supporting continuous tangent functionals of f at x, known as the subgradients of f at x, see Fig. 2 in Appendix A. If the graph of f has a "kink" at x, then there exists more than one such subgradient. At a given point x ∈ dom(f ), the subdifferential ∂f (x) may be empty. We denote by dom(∂f ) the set of points x ∈ dom(f ) such that ∂f (x) = ∅. It is a fact that dom(∂f ) is dense in dom(f ) when f is a proper lower semi-continuous convex function. The superdifferential of a concave function is similarly defined.
Together with convexity, subdifferentiability is sufficient to characterize minima of convex functions: A convex lower semi-continuous functional f : Subdifferentiability is a substantially weaker concept than that of Gâteaux (or directional) differentiability.
Clearly, if f (x) is Gâteaux differentiable at x, then ∂f (x) = {δf (x)/δx}. However, the converse is not true: in infinite-dimensional spaces, it is possible that ∂f (x) = {y}, a singleton, while f (x) is not differentiable at x. This is so because ∂f (x) being a singleton is not enough to guarantee continuity of f .
In DFT, subdifferentiability has an important interpretation. Suppose ρ is an ensemble ground-state density of v, meaning that, for all ρ ′ ∈ I N , we have the inequality Then, the subdifferential of F (ρ) at ρ is which is a restatement of the first Hohenberg-Kohn theorem: the potential for which ρ is a ground-state density is unique up to a constant shift. On the other hand, if ρ is not a ground-state density for any v ∈ X * , then ∂F (ρ) = ∅. Thus, a nonempty subdifferential is equiv- We note that B N is dense in X + N , the subset of X containing all nonnegative functions that integrate to N electrons.
However, even though subdifferentiability is sufficient for many purposes, differentiability of F would make formal manipulations easier. Moreover, the characterization of v-representable ρ ∈ B N is unknown and probably dependent on the interaction strength λ. These observations motivate the search for a differentiable regularization of the universal functional.

D. Superdifferentiability of E
Let us briefly consider the superdifferential of E, a concave continuous (and hence upper semi-continuous) function over X * . A fundamental theorem of convex analysis states that where we use the same notation for sub-and superdifferentials. Thus, the potential v has a ground state with density ρ if and only if ρ ∈ ∂E(v); if v does not support a ground state, then ∂E(v) is empty. Denoting the set of potentials in X * that support a ground state by V N , we obtain: If a ground state is nondegenerate, then ∂E(v) = {ρ} is a singleton; together with the fact that E is continuous, it then follows that E is Gâteaux differentiable at v. On the other hand, if the ground state is degenerate, then the subdifferential is the convex hull of g ground-state densities: and E is not differentiable at this v unless all the ρ i are equal-that is, if the degenerate ground states have the same density. For example, in the absence of a magnetic field, the hydrogen atom has the degenerate ground states 1sα and 1sβ, with the same density.

III. DOMAIN TRUNCATION
In Sec. IV, we outline the mathematical background for the Moreau-Yosida regularization. Many useful results, such as differentiability of the Moreau envelope ǫ F (ρ), are only available when the underlying vector space X is reflexive or, even better, when X is a Hilbert space. However, the Banach space Lieb's formulation of DFT is nonreflexive. In this section, we truncate the full space R 3 to a box B ℓ = [−ℓ/2, ℓ/2] 3 of finite volume ℓ 3 , so large that the ground state energy of every system of interest is sufficiently close to the R 3 limit. What is lost from this truncation is well compensated for by the fact that we may now formulate DFT using the Hilbert space for both potentials and densities, as we shall now demonstrate.

A. The ground-state problem
For the spatial domain B ℓ , the N -electron groundstate problem is a variational search for the lowest-energy wave function Ψ ∈ H 1 0 (B N ℓ ), the first-order Sobolev space with vanishing values of the boundary of B N ℓ , the Nfold Cartesian product of B ℓ . The search is carried out only over the subset of H 1 0 (B N ℓ ) which is also normalized and properly symmetrized: for a total spin projection of (N ↑ − N ↓ )/2, the corresponding subset of wavefunctions is antisymmetric in the N ↑ first and the N ↓ last particle coordinates separately.
Any potential in the full space , with equivalent topologies. Since the domain is bounded, the Rellich-Kondrakov theorem 9 states that , which in turn implies that the spectrum of the Hamiltonian H λ (v) in Eq. (3) is purely discrete. 10 Thus, for any potential v in the box, one or more ground-state wave functions Ψ v ∈ H 1 0 exists. We next observe that, ifṽ is a Coulomb potential, then the truncated potential v belongs to L 2 (B ℓ ). Moreover, It is therefore sufficient to consider the ground-state energy as a function Regarding the continuity of E ℓ , we note that the proof We remark that, as ℓ → ∞, E ℓ (v) converges to the exact, full-space ground-state energy E(v). On the other hand, the associated eigenfunctions converge if and only if the full-space ground-state energy E(v) is an eigenvalue, with v = 0 as a counterexample.

B. Densities and the universal density functional
Invoking the usual ensemble constrained-search procedure, we obtain where It is straightforward to see that The density functional F ℓ is completely analogous to the full-space functional F . In particular, F ℓ is lower semicontinuous in the L 1 (B ℓ ) topology by Theorem 4.4 and Corollary 4.5 in Ref. 2. We remark that F ℓ (ρ) = F (ρ) for any ρ ∈ I N (B ℓ ), as seen from the fact that, if Ψ ∈ H 1 (R 3N ) and Ψ → ρ with √ ρ ∈ H 1 0 (B ℓ ), then we must have Ψ ∈ H 1 0 (B N ℓ ). Since B ℓ is bounded, the Cauchy-Schwarz inequality gives for any measurable u, By an argument similar to that of Eq. (6), F ℓ is now seen to be lower semi-continuous also with respect to the L 2 (B ℓ ) topology. Note that so that every N -representable density is in L 2 (B ℓ ). Since F ℓ is convex and lower semi-continuous on H ℓ = L 2 (B ℓ ), we may now formulate DFT in the Hilbert space H ℓ as Given that Hilbert spaces possess a richer structure than Banach spaces, this formulation of DFT is particularly convenient: densities and potentials are now elements of the same vector space H ℓ and reflexivity is guaranteed. Even for the full space, I N (R 3 ) ⊂ L 2 (R 3 ), indicating that it is possible to avoid the use of the box. Indeed, we may restrict the ground-state energy to potentials v ∈ a concave and continuous map. Invoking the theory of conjugation within this reflexive Hilbert-space setting, we have a convex lower semi-continuous universal functional However, Coulomb potentials are not contained in L 2 (R).
On the other hand, this theory is sufficient for dealing with all truncated Coulomb potentials, obtained, for example, from the usual Coulomb potentials by setting them equal to zero outside the box B ℓ ; it is also sufficient when working with Yukawa rather than Coulomb potentials. The optimality conditions for the Hohenberg-Kohn and Lieb variation principles in Eqs. (22a) and (22b) are Denoting the set of densities for which F ℓ is subdifferentiable by B ℓ (by analogy with B N in X) and the set of potentials for which E ℓ is superdifferentiable by V ℓ (by analogy with V N in X * ), we obtain where B ℓ is dense in the subset of H ℓ containing all nonnegative functions that integrate to N electrons. The differentiability properties of F ℓ are the same as those of F discussed in Section II B. To introduce differentiability, a further regularization is necessary.

IV. MOREAU-YOSIDA REGULARIZATION
In this section, we present the basic theory of Moreau-Yosida regularization, introducing infimal convolutions in Section IV A, Moreau envelopes in Section IV B, proximal mappings in Section IV C, and conjugates of Moreau envelopes in Section IV D. The results are given mostly without proofs; for these proofs, we refer to the book by Bauschke and Combettes, 7 whose notation we follow closely.

A. Infimal convolution
In preparation for the Moreau-Yosida regularization, we introduce the concept of infimal convolution in this section and discuss its properties on a Hilbert space H.
Definition 1. For f, g : H → R ∪ {+∞}, the infimal convolution is the function f g : H → R ∪ {±∞} given by In the context of convex conjugation, the infimal convolution is analogous to the standard convolution in the context of the Fourier transform. Here are some basic properties of the infimal convolution for functions that do not take on the value −∞: Theorem 1. Let f, g : H → R ∪ {+∞}. Then: 4. if f and g are convex, then f g is convex.
Henceforth, we restrict our attention to all lower semicontinuous proper convex functions f : H → R ∪ {+∞}, denoting the set of all such functions by Γ 0 (H), see Appendix A. We also need the concepts of coercivity and supercoercivity: a function f : For functions in Γ 0 (H), we have the following stronger properties of the infimal convolution: Theorem 2. Let f, g ∈ Γ 0 (H) such that either g is supercoercive or f is bounded from below and g is coercive. Then where x * is unique if g is strictly convex.
Proof. Point 1 follows from Ref. 7, Prop. 12.14. Point 2 follows from Theorem 1 above. Finally, Point 3 follows from the fact that strictly convex functions have unique minima; the existence of a minimum follows from the (super)coerciveness of the mapping y → x − y 2 H /2.

B. The Moreau envelope
In the following, we introduce the Moreau envelope of functions in Γ 0 (H) and review its properties.
Definition 2. For f ∈ Γ 0 (H) and ǫ > 0, the Moreau-Yosida regularization or the Moreau envelope ǫ f : Since f ∈ Γ 0 (H) and since x → 1 2ǫ x 2 H is strictly convex and supercoercive, it follows from Theorem 1 that ǫ f ∈ Γ 0 (H). In fact, ǫ f is much more well behaved than a general function in Γ 0 (H), as the following theorem shows.
Theorem 3. The Moreau envelope ǫ f of f ∈ Γ 0 (H) with ǫ > 0 satisfies the following properties: 5. ǫ f is continuous; 6. ǫ f is Fréchet differentiable: for every x ∈ H, there exists ∇ ǫ f (x) ∈ H such that for all y ∈ H: 7. the subdifferential of ǫ f at x is given by In Figure 1, the Moreau envelope is illustrated for a convex function f on the real axis. We observe that the minimum value of f (x) is preserved by the Moreau envelope ǫ f (x) and that the second argument x → x − x ′ 2 H /(2ǫ) to the infimal convolution removes all kinks, giving a curvature equal to that of this function.

C. The proximal mapping
From Theorem 1, it follows that the infimum of ǫ f (x) in Eq. (29) is attained with a unique minimizer. We make the following definitions: where prox ǫf (x) is the proximal point of f at x ∈ H.
The usefulness of the proximal mapping follows from the following theorem: Theorem 4. Let f ∈ Γ 0 (H) and ǫ > 0. Then 2. the Fréchet (and Gâteaux) derivative of ǫ f at x is given by 3. for all p, x ∈ H, it holds that Given that ǫ f ∈ Γ 0 (H), there exists a concave ǫ g ∈ −Γ 0 (H) such that ( ǫ f ) ∧ = ǫ g and ( ǫ g) ∨ = ǫ f . The following theorem gives the basic properties of this conjugate: Theorem 5. If ǫ f is the Moreau envelope of f ∈ Γ 0 (H), then their conjugates and the superdifferentials of these conjugates are related as

if x ∈ H, then
Proof. Eq. (37a) follows from the fact that the convex conjugate of Being related in such a simple manner, f ∧ and ( ǫ f ) ∧ share many properties. We note, however, that ( ǫ f ) ∧ is strictly concave, whereas f ∧ may be merely concave.
We remark that the Moreau envelope is not defined for a concave function g ∈ −Γ 0 (H), only for convex functions. Thus, the notation ǫ g for a g ∈ −Γ 0 (H) is not to be interpreted as a Moreau envelope, but as the concave conjugate of a Moreau envelope, ǫ g = ( ǫ (g ∨ )) ∧ .

V. MOREAU-YOSIDA REGULARIZED DFT
Having introduced Moreau-Yosida regularization in the preceding section, we are ready to apply it to DFT on the Hilbert space H ℓ = L 2 (B ℓ ).

A. Moreau-Yosida regularized DFT
Applying Eqs. (29) and (37a) with f = F ℓ and f ∧ = E ℓ , we obtain the regularized Lieb functional ǫ F ℓ : H ℓ → R and ground-state energy ǫ E ℓ : Importantly, these functions are related to each other as conjugate functions; just as we have already encountered for the (E, F ) and (E ℓ , F ℓ ) conjugate pairs. As such, the following Hohenberg-Kohn and Lieb variation principles hold on the Hilbert space H ℓ : However, unlike F and F ℓ , which are finite only for Nrepresentable densities, the Moreau-Yosida regularized Lieb functional ǫ F ℓ is finite on the whole Hilbert space: since, in Eq. (38a), a finite value is always found on the right-hand side, even when ρ / ∈ I N . A curious side effect of the regularization is therefore that the minimizing density in the regularized Hohenberg-Kohn variation principle in Eq. (39a) (which exists for all v ∈ H ℓ ) may not be N -representable: it may be negative in a region of finite measure or contain an incorrect number of electrons.
To illustrate the behaviour of the regularized functional for nonphysical densities, consider ǫ F ℓ (ρ + c) when ρ is N -representable and c ∈ R. From the definition of the Moreau envelope in Eq. (38a), we obtain straightforwardly that The regularized density functional thus depends on c in a simple quadratic manner, with a minimum at c = 0. As ǫ tends to zero from above, ǫ F ℓ (ρ + c) increases more and more rapidly with increasing |c|, approaching F ℓ (ρ + c) = +∞ more closely. As expected, the regularized functional is differentiable in the direction that changes the number of electrons. On the face of it, the existence of minimizing 'pseudodensities' in the Hohenberg-Kohn variation principle that are not N -representable may seem to be a serious shortcoming of the Moreau-Yosida regularizationideally, we would like the minimizing density to arise from some N -electron wave function. However, the appearance of nonphysical pseudo-densities is an inevitable consequence of the regularization-differentiability in all directions cannot be achieved without extending the effective domain of F ℓ to all H ℓ ; alternatively, we may retain the effective domain of N -representable densities and instead work with restricted functional derivatives, defined only in directions that conserve some properties of the density. Such an approach is straightforward for directions that change the number of electrons in the system but much more difficult for directions that lead to negative densities or to an infinite kinetic energy.
The existence of minimizing pseudo-densities that are not N -representable is less important than the fact that ǫ F ℓ converges pointwise to F from below as ǫ → 0 + , even when ρ / ∈ I N (B ℓ ). Also, we shall in the next subsection see that every ρ ∈ H ℓ is linked to a unique physical ground-state density ρ ǫ ∈ B ℓ . It is therefore possible to regard (and to treat) the Hohenberg-Kohn minimization over pseudo-densities in H ℓ as a minimization over physical densities in B ℓ , as discussed below.
We also observe that ǫ E ℓ converges pointwise to E ℓ from below as ǫ → 0 + . More importantly, for any chosen ǫ > 0, we may recover the exact ground-state energy E ℓ from the regularized energy ǫ E ℓ simply by adding the term 1 2 ǫ v 2 2 , which does not depend on the electronic structure of the system. Indeed, this term is no more relevant for the molecular electronic system than the neglected nuclear-nuclear repulsion term-its purpose is merely to make the ground-state energy strictly concave and supercoercive in the external potential so that the universal density functional becomes differentiable and continuous. Indeed, no information regarding the electronic system is lost in the regularization beyond what is lost upon truncation of the domain from R 3 to an arbitrarily large cubic box B ℓ , needed to make 1 2 ǫ v 2 2 finite for all potentials.

B. The proximal density and potential
According to the general theory of Moreau-Yosida regularization, a unique minimizer, which we shall here call the proximal (ground-state) density, exists for any ρ ∈ H ℓ in the regularized Lieb functional of Eq. (38a), which may therefore be written as From Eq. (35), we conclude that the standard Lieb functional is subdifferentiable at ρ ǫ and hence that ρ ǫ is an ensemble v-representable ground-state density in H ℓ : We also see from Eq. (35) that every ρ ∈ H ℓ and associated proximal ground-state density ρ ǫ together satisfy the subgradient relation implying that is an external potential with ground-state density ρ ǫ ∈ B ℓ . In the following, we refer to v ǫ as the proximal potential associated with ρ. We recall that, by the Hohenberg-Kohn theorem, the density determines the potential up to a constant. The subdifferential of F ℓ at the proximal density ρ ǫ is therefore where v ǫ is the proximal potential of Eq. (46). Conversely, suppose that ρ ∈ B ℓ . There then exists an external potential v such that −v ∈ ∂F ℓ (ρ). Expressing v in the form v = ǫ −1 (ρ −ρ) for someρ ∈ H ℓ , we obtain ǫ −1 (ρ − ρ) ∈ ∂F ℓ (ρ), which by Eqs. (35) and (45) implies that ρ is the proximal density ofρ. Thus, every ensemble v-representable density ρ ∈ B ℓ is the proximal density of ρ − ǫv ∈ H ℓ where v is such that −v ∈ ∂F ℓ (ρ): In short, we have the important fact that the set of proximal densities in H ℓ is precisely the set of ensemble ground-state densities B ℓ . A density ρ ∈ H ℓ whose proximal density is ρ ǫ is called a carrier density of ρ ǫ . By the Hohenberg-Kohn theorem, the potential v in Eq. (48) is unique up a constant c ∈ R. The carrier density is therefore uniquely determined up to an additive constant. The nonuniqueness of the carrier density also follows directly from Eq. (41), which shows that ρ and ρ + c where ρ ∈ H ℓ and c ∈ R have the same proximal ground-state density ρ ǫ ∈ B ℓ .
To summarize, even though the densities in the regularized Hohenberg-Kohn variation principle in Eq. (39a) are pseudo-densities (not associated with any N -electron wave function), every such density ρ ∈ H ℓ is uniquely mapped to a ground-state density by the surjective proximal operator This operator performs the decomposition where the proximal density ρ ǫ ∈ B ℓ may be viewed as the 'projection' of ρ onto B ℓ with potential v ǫ ∈ V ℓ . We note that ρ ǫ = ρ, even when ρ ∈ B ℓ . The proximal operator is therefore not a true projector. For any ρ ∈ H ℓ , the proximal density ρ ǫ and proximal potential v ǫ together satisfy the usual reciprocal relations for the standard Lieb functional and ground-state energy: see Eq. (13), and therefore satisfy the relation: Thus, to every solution of the regularized Hohenberg-Kohn variation principle with −v ∈ ∂ ǫ F ℓ (ρ) in Eq. (39a) there corresponds a proximal solution to the standard variation principle with −v ǫ ∈ ∂F ℓ (ρ ǫ ).
C. Differentiability of ǫ F ℓ Regarding the differentiability of the regularized Lieb functional, we note from Theorems 3 and 4 that ǫ F ℓ is Fréchet differentiable so that with the derivative given by Eq. (46): Gâteaux differentiability follows from Fréchet differentiability: the existence of ∇ ǫ F ℓ (ρ) implies that the directional derivatives at ρ exist in all directions σ ∈ H ℓ and are equal to Hence the functional derivative of ǫ F ℓ is well defined and given by justifying the formal manipulations involving functional derivatives in DFT, recalling that ǫ F ℓ (ρ) tends to F ℓ (ρ) pointwise from below as ǫ → 0 + . (However, v ǫ need not converge to anything.)

D. The optimality conditions of regularized DFT
The optimality conditions of the regularized DFT variation principles in Eqs. (39a) and (39b) are the reciprocal relations which for the regularized Hohenberg-Kohn variation principle may now be written in the form of a stationary condition: In combination with Eq. (56), we obtain v ǫ = v and hence from Eq. (46) the following Hohenberg-Kohn stationary condition: suggestive of an iterative scheme with the repeated calculation of the proximal density until self-consistency. By contrast, the Lieb optimality condition ρ ∈ ∂ ǫ E ℓ (v) in Eq. (57) cannot be written as a stationary condition since the ground-state energy ǫ E ℓ (just like E and E ℓ ) is differentiable only when v has a unique ground-state density. From Theorem 5, we obtain which shows that the degeneracy of the ground-state energy is preserved by the Moreau-Yosida regularization. For any ρ ∈ H ℓ in Eq. (58), an explicit expression for the potential v ǫ in terms of the proximal density is given in Eq. (46), yielding the regularized ground-state energy Hence, for every ρ ∈ H ℓ , there exists a potential v ǫ for which ρ is the ground-state density. Stated differently, the set of ensemble v-representable pseudo-densities ǫ B ℓ is equal to the full Hilbert space: We recall that the proximal density ρ ǫ is the exact (standard ) ground-state energy of v ǫ , see Eq. (52).

VI. REGULARIZED KOHN-SHAM THEORY
In the present section, we apply Moreau-Yosida regularization to Kohn-Sham theory, beginning with a discussion of the adiabatic connection. The essential point of the regularized Kohn-Sham theory is the existence of a common ground-state pseudo-density for the interacting and noninteracting systems, thereby solving the representability problem of Kohn-Sham theory.
In the present section, we simplify notation by omitting the subscript that indicates the length of the box from all quantities-writing H, for instance, rather than H ℓ everywhere.

A. Regularized adiabatic connection
The presentation of Moreau-Yosida regularized DFT given in Section V was for the fully interacting electronic system, with an interaction strength λ = 1 in the Hamiltonian of Eq. (3). However, given that nothing in the development of the theory depends on the value of λ, it may be repeated without modification for λ = 1. In particular, we note that the set of ground-state pseudodensities is equal to the whole Hilbert space and hence is the same for all interaction strengths, see Eq. (62). Consequently, every ρ ∈ H is the ground-state pseudo-density of some v λ ∈ H, for each λ.
To setup the adiabatic connection, we select ρ ∈ H. Denoting by ǫ F λ : H → R the regularized universal density functional at interaction strength λ, we obtain from Eq. (58) the unique external potential for which the regularized ground-state energy at that interaction strength ǫ E λ : H → R is given by As λ changes, the potential v λ ǫ can be adjusted to setup an adiabatic connection of systems with the same groundstate pseudo-density ρ at different interaction strengths.
In the Moreau-Yosida regularized adiabatic connection, the pseudo-density ρ has a proximal ground-state density that depends on λ: which is the true ground-state density in the potential v λ ǫ at that interaction strength: In short, in the adiabatic connection, the effective potential v λ ǫ has the same ground-state pseudo-density ρ but different ground-state densities ρ λ ǫ = ρ + ǫv λ ǫ for different interaction strengths. In the next subsection, we shall see how this decomposition makes it possible to calculate the true ground-state energy by (regularized) Kohn-Sham theory in a rigorous manner, with no approximations except those introduced by domain truncation.

B. Regularized Kohn-Sham theory
Consider an N -electron system with external potential v ext ∈ H. We wish to calculate the ground-state energy and to determine a ground-state density of this system: This can be achieved by solving the interacting manybody Schrödinger equation, in some approximate manner. In Kohn-Sham theory, we proceed differently, solving instead a noninteracting problem with the same density. We begin by transforming Eq. (67) into a regularized many-body energy, noting that the energy and superdifferential of the exact and regularized ground-state energies are related according to Eqs. (37a) and (37b) as From these relations, it follows that the pseudo-density is a ground-state density of the regularized system: The subscript 'c' indicates that ρ c is the carrier density of both the physical ground-state of the system ρ according to Eq. (70) and the ground-state density of the Kohn-Sham system ρ s : Our task is to determine the carrier density and regularized ground-state energy by solving Eq. (71). The solution will subsequently be transformed to yield the physical ground-state density and energy. We observe that the carrier density ρ c is obtained from the physical density ρ by subtracting ǫv ext with ǫ > 0, see Eq. (70). In practice, v ext < 0 since the external potential is the attractive Coulomb potential of the nuclei. It therefore follows that the pseudo-density is strictly positive: ρ c > 0.
Given that ρ c ∈ H, there exists a Kohn-Sham potential v s ∈ H such that ρ c is the ground-state density of a noninteracting system in this potential: To determine the regularized Kohn-Sham potential v s , we first note that the potentials v ext and v s satisfy the stationary condition in Eq. (63): To proceed, we next introduce the regularized Hartreeexchange-correlation energy and potential as yielding the following expression for the Kohn-Sham potential as a function of the density: To solve the regularized Kohn-Sham problem in Eq. (73), we first note that it is related in a simple manner to the standard Kohn-Sham problem: we then proceed in an iterative fashion. From some trial pseudo-density ρ 0 , we iterate until convergence, beginning with i = 1 and terminating when self-consistency has been established. We emphasize that the regularized Kohn-Sham iterations in Eqs. (80a) and (80b) are identical to the iterations in standard Kohn-Sham theory except for the use of a regularized Hartree-exchange-correlation potential in the construction of the Kohn-Sham matrix and the subtraction of −ǫv i from the density generated by diagonalization of the resulting Kohn-Sham matrix.
Having determined the ground-state carrier density ρ c and the corresponding Kohn-Sham potential v s by iterating Eq. (80a) and (80b) until self consistency, we calculate the interacting regularized ground-state energy as from which the physical ground-state energy E 1 (v ext ) is recovered by adding 1 2 ǫ v ext 2 2 according to Eq. (68), while the ground-state density ρ is recovered by adding ǫv ext to the pseudo-density ρ c according to Eq. (70). We note that the pair (ρ c , v s ) is uniquely determined to the extent that ρ in Eq. (67) is unique; for systems with degenerate ground-state densities, several equivalent pairs (ρ c , v s ) exist.
By means of Moreau-Yosida regularization, we have thus setup Kohn-Sham theory in a rigorous manner, where the interacting and noninteracting ground-state densities are different (by an amount proportional to ǫ) but related by the same carrier density ρ c , thereby solving the noninteracting representability problem of standard Kohn-Sham theory. Moreover, differentiability of the regularized universal density functional means that the potentials associated with this pseudo-density at different interaction strengths are well defined as the (negative) derivatives of the density functional. In the limit where ǫ → 0 + , standard Kohn-Sham theory is approached, although the limit itself is not expected to be well behaved.

VII. CONCLUSION
The possibility of setting up DFT follows from the mathematical properties of the ground-state energy E(v), which is continuous and concave in the external potential v. By convex conjugation, it may be exactly represented by the lower semi-continuous and convex universal density functional F (ρ), whose properties reflect those of the ground-state energy. Unfortunately, F (ρ) depends on the density ρ in a highly irregular manner, being everywhere discontinuous and nowhere differentiable. These characteristics of F arise in part because E is concave but not strictly concave and not supercoercive. By modifying E in a way that introduces strict concavity and supercoercivity without losing information about the electronic system, we obtain an alternative DFT, where the universal density functional is much more well behaved, being everywhere differentiable (and therefore also continuous). This is achieved by Moreau-Yosida regularization, where we apply convex conjugation not to E(v) itself but to the strictly concave function E(v) − 1 2 ǫ v 2 2 , where ǫ > 0. The resulting density functional ǫ F (ρ) is convex and differentiable. Standard DFT is recovered as ǫ → 0 + but this limit need not be taken for the theory to be exact-for any chosen value of ǫ, we can perform DFT as usual; the exact ground-state energy is recovered as E(v) = ǫ E(v) + 1 2 ǫ v 2 2 . The only restriction on the exact theory is the truncation of the domain from R 3 to a box of finite (but arbitrarily large) volume; such a domain truncation simplifies the Moreau-Yosida formulation of DFT by introducing (reflexive) Hilbert spaces of densities and potentials.
The densities that occur naturally in regularized DFT are not physical densities since they cannot be generated from an N -electron wave function in the usual manner. Nevertheless, each 'pseudo-density' ρ has a clear physical interpretation: it can be uniquely decomposed as ρ = ρ ǫ − ǫv ǫ , where ρ ǫ is a physical ground-state density (the 'proximal density') and v ǫ the associated potential.
This density decomposition justifies Kohn-Sham theory: a given pseudo-density ρ is uniquely decomposed as ρ = ρ λ ǫ − ǫv λ ǫ , at each interaction strength λ. As λ changes, the decomposition of ρ changes accordingly. For the fully interacting system, ρ = ρ 1 − ǫv ext where ρ 1 is the physical ground-state density and v ext the external potential; for the noninteracting system, ρ = ρ s − ǫv s , where ρ s and v s are the Kohn-Sham density and potential, thereby solving the noninteracting representability problem of Kohn-Sham theory. The working equations of regularized Kohn-Sham theory are essentially identical to those of standard Kohn-Sham theory.
Here, we have considered standard Moreau-Yosida reg-ularization. However, we may also consider a generalized approach, in which the regularizing term 1 2 ǫ||v|| 2 2 is replaced by 1 2 ǫ||Av|| 2 2 , where the operator A is chosen based on some a priori knowledge of the desired solution. Indeed, some choices of A result in approaches closely related to known regularization techniques, such as the Zhao-Morrisson-Parr approach 11 to calculate the noninteracting universal density functional and the "smoothing-norm" regularization approach of Heaton-Burgess et. al., used both in the context of optimized effective potentials 12,13 and Lieb optimization methods [14][15][16] . These and related approaches will be discussed in a forthcoming paper. We expect such Moreau-Yosida techniques to be of great practical value in the implementation of procedures that attempt to determine either the ground-state energy or the universal density functional by direct optimization techniques using their derivatives, bearing in mind that both the derivatives and the objective functions are well defined in the regularized context.

Subdifferentiation
A dual function ϕ ∈ X * is said to be a subgradient to f at a point x where f (x) is finite if f (y) ≥ f (x) + ϕ, y − x , ∀y ∈ X, FIG. 2. Illustration of the subdifferential for an f ∈ Γ0(R). For a x0 ∈ R, ∂f (x0) is a collection of slopes of tangent functionals. One such slope ϕ and its affine mapping is shown explicitly, the rest is indicated with stippled lines. ϕ is not unique since the graph of f has a "kink" at x0.
meaning that the affine function y → f (x) + ϕ, y − x is nowhere above the graph of f . The subdifferential ∂f (x) is the set of all subgradients to f at x, see Figure 2. Note that ∂f (x) may be empty. The function f is said to be subdifferentiable at x ∈ X if ∂f (x) = ∅. A function f ∈ Γ 0 (X) has a global minimum at x ∈ X if and only if 0 ∈ ∂f (x). Similarly x → f (x) + ϕ, x has a minimum if and only if −ϕ ∈ ∂f (x). A function f ∈ Γ(X) is subdifferentiable on a dense subset of its domain dom(f ).
By the Hohenberg-Kohn theorem, we know that if ρ is v-representable and that ∂F (ρ) = ∅ otherwise. Thus, from the point of view of convex analysis, the notion of v-representability of ρ is equivalent to subdifferentiability of F at ρ. It follows that the v-representable densities are dense in the set of N -representable densities, the effective domain of F .

Gâteaux differentiability
Let x, y ∈ X. The directional derivative of f at x in the direction of y is defined by if this limit exists (+∞ is accepted as limit). For f ∈ Γ 0 (X), the directional derivative f ′ (x; y) always exists. Let x ∈ X be given. If there is a ϕ ∈ X * such that f ′ (x; y) = ϕ, y , ∀y ∈ X (A14) then f is said to be Gâteaux differentiable at x. In other words, a function is Gâteaux differentiable if its various directional derivatives may be assembled into a linear functional at x. The Gâteaux derivative is the usual notion of functional derivative encountered in the calculus of variations, for which we write ϕ = δf (x)/δx. If f is continuous and has a unique subgradient at x, then it is is also Gâteaux differentiable at x; the converse statement is also true, but note that a unique subgradient alone is not enough to ensure Gâteaux differentiability: continuity is not implied by a unique subgradient.

Fréchet differentiability
A stronger notion of differentiability is given by the Fréchet derivative. Let x ∈ X. If there exists ϕ ∈ X * such that for all sequences h n → 0 in X as n → ∞, then f is Fréchet differentiable at x, and ∇f (x) = ϕ is the Fréchet derivative. Clearly, Fréchet differentiable implies Gâteaux differentiable, but not the other way around. In fact, if ∇f (x) exists at x, then so that f is approximated by its linearization around x. This is not true if f is merely Gâteaux differentiable.