The Principle of Least Action – why it works

(and why T-V is more important than T+V)

I hate to start with a negative campaign, especially against Newton, but it can’t be helped. To make amends let me state that, in my opinion, Newton’s monumental work, ‘The Principia’, is the greatest, most influential book ever written.

Now some students are of the opinion that Newtonian Mechanics and the Principle of Least Action are completely equivalent, but this is not the case.
– Aha, so Newtonian Mechanics (N Mech) needs modification before it can apply in the domains of very high speeds, very tiny masses, or very huge masses?
– Well, yes, but it’s not just that the Principle of Least Action (P of LA) adapts Newtonian Mechanics at the peripheries and has greater reach (and what a reach – it underlies ALL physics). It makes different claims, and its approach is utterly different. But let’s creep up on this more slowly…

Newton (that is to say, Newtonian Mechanics) seeks to explain everything in the physical world – an apple falling, a fly walking on water, the shape of the Earth, the Moon’s orbit, etc. – in just one simple, universal way: a force causes a particle to accelerate. (In Newton’s world there are only particles and forces, and these exist in Space and in Time.) If there are structures (planets, soap bubbles, a spinning top, a flowing river, and so on) then these are considered as being made from a collection of particles. The collection may have an intricate shape, and special properties, but always the total effect (many particles) is obtained by ‘adding’ up the individual effects (individual particles).

Take, for example, the case of the Newtonian calculation of the gravitational attraction between the Earth and the Moon. First, the Earth and the Moon are imagined to be subdivided into an infinite number of particles or mass-points; then the attraction between one mass-point in the Earth and one mass-point in the Moon is found; and then the same is done for all the other such pairs of mass-points. Finally, this infinite number of attractions is ‘added’ together. But this adding is a very tricky accounting problem – we must be sure not to miss out any particle-pairs, and also we mustn’t over-count them. Newton invented a whole new branch of mathematics that could do just this (the integral calculus).
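
To make the ‘accounting problem’ concrete, here is a minimal numerical sketch (it is not Newton’s own method, which was analytic). It assumes each body is a uniform sphere chopped into a cubic lattice of equal mass-points; the function names, the lattice size and the rounded Earth-Moon figures are illustrative choices, not taken from the text above.

```python
import numpy as np

G = 6.674e-11  # gravitational constant in SI units


def sphere_points(radius, centre_x, n_side=8):
    """Fill a uniform sphere with a cubic lattice of equal mass-points."""
    axis = (np.arange(n_side) + 0.5) / n_side * 2.0 * radius - radius
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    pts = np.stack([x.ravel(), y.ravel(), z.ravel()], axis=1)
    pts = pts[np.sum(pts**2, axis=1) <= radius**2]  # keep points inside the sphere
    pts[:, 0] += centre_x                           # slide the sphere along the x-axis
    return pts


def pairwise_force_x(pts_a, mass_a, pts_b, mass_b):
    """x-component of the total pull on body A, one term per particle pair."""
    dm_a = mass_a / len(pts_a)
    dm_b = mass_b / len(pts_b)
    sep = pts_b[None, :, :] - pts_a[:, None, :]     # every A-B pair, each counted once
    dist = np.linalg.norm(sep, axis=2)
    return G * dm_a * dm_b * np.sum(sep[:, :, 0] / dist**3)


# Rounded, illustrative values (SI units): masses, radii and mean separation.
M_EARTH, R_EARTH = 5.972e24, 6.371e6
M_MOON, R_MOON = 7.342e22, 1.737e6
DISTANCE = 3.844e8

earth = sphere_points(R_EARTH, 0.0)
moon = sphere_points(R_MOON, DISTANCE)

f_sum = pairwise_force_x(earth, M_EARTH, moon, M_MOON)
f_point = G * M_EARTH * M_MOON / DISTANCE**2        # whole bodies treated as two points

print(f"pair-by-pair sum / point-mass formula = {f_sum / f_point:.6f}")
```

Under these assumptions (two uniform spheres, far apart compared with their radii) the ratio should come out very close to 1. The bookkeeping lives in the single line that builds `sep`: every Earth-point is paired with every Moon-point exactly once.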

The Principle of Least Action attacks the Earth-Moon attraction in a very different way:

(i) instead of ‘particles’ we can, if we want to, take the ‘whole bodies’ of the Earth and the Moon as primitive ingredients of the problem.

(ii) These ‘whole-bodies’ do not feel a distant force, rather, they respond to the gravitational potential energy (V) in their immediate vicinity.

(iii) How do they respond? They finely adjust their kinetic energies, T, – adjust them in just such a way as to make ‘T-V through time’, the ‘Action’, come out least. T is the total kinetic energy at a given time, and V is the total potential energy at a given time. (‘Least’ is a technical word. We won’t go into it right now except to say that T-V must be least at every instant of time (t), and least over the whole (small) time-window of the problem.)
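
In symbols, a standard way of writing ‘T-V through time’ is as an integral over the time-window. The labels S (for the Action), q (for the path taken), and t_1 and t_2 (for the start and end of the time-window) are notation assumed here, not taken from the text above.

```latex
S[q] \;=\; \int_{t_1}^{t_2} \bigl( T - V \bigr)\, dt ,
\qquad
\delta S = 0 \quad \text{with } q(t_1) \text{ and } q(t_2) \text{ held fixed.}
```

The textbook condition, \delta S = 0, says only that S is stationary with respect to small changes in the path; for a short enough time-window this stationary value is a minimum, which is where the word ‘least’ comes from.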

– Oh, the student might say, N Mech can do this also – it can consider the whole body of the Moon in one go.
– Yes, in Newtonian gravitation the ‘whole body’ does indeed reduce to a mass-point, but this is a shorthand, a sort of miracle that occurs when the simplifying assumption of spherical symmetry has been made. Anyway, the Earth and the Moon are not spherically symmetric – there are distortions because of their spinning motions, and from tidal effects. And, talking of which, the second big difference of the P of LA is that it can take account of the tides.
– Is that all? But surely Newton can take account of spinning bodies and the tides?
– Yes, but only by introducing ad hoc extra constructions and theorems (‘apparent force’, ‘moment of inertia’, conservation of ‘energy’, conservation of ‘angular momentum’, etc.).

There’s a subtle ‘recursiveness’ in physical processes that the Newtonian Method does not (directly) account for. For example, the tides arise because gravity has caused the shape of bodies to be stretched out, that is to say the mass-distribution has been altered; but this altered mass-distribution in turn affects gravity (the gravitational potential); and then this altered gravity in its turn affects the tidal stretching, and so on, and so on. Another example of ‘recursiveness’ is that the mass of a large body affects its own gravity. (For example, in the case of the Earth, its mass makes Earth’s true radius (in curved spacetime) about 1.5mm larger than the radius in flat 3D Space.)
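
The figure of about 1.5mm can be checked with a line of arithmetic. The sketch below assumes the ‘radius excess’ formula GM/(3c²) for a uniform-density sphere, as given in the Feynman Lectures (Volume II); the formula and the variable names are brought in here for illustration and are not stated in the text above.

```python
GM_EARTH = 3.986004e14   # m^3/s^2, Earth's gravitational parameter (G times mass)
C = 299_792_458.0        # m/s, speed of light

# Assumed formula: radius excess = GM / (3 c^2), for a sphere of uniform density
radius_excess_m = GM_EARTH / (3.0 * C**2)

print(f"radius excess = {radius_excess_m * 1e3:.2f} mm")
```

With these inputs the result is a little under 1.5 mm, consistent with the figure quoted above.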

Tenets of the Principle of Least Action

Let’s set out the tenets of the P of LA, considering all of physics and not just gravity.
(1) In the P of LA, increased complexity is arrived at not merely by having a larger and larger number of elemental particles, but by having more complex elements. For example, we can take as primitive (the kinetic energies of) lever-arm, spinning top, planet, oceans, electron and atomic nucleus, etc. Also, instead of modelling the scenario as particles and forces we now model it as energies, whether kinetic or potential. Kinetic energy is the energy of motion of a massy element of the system (particle, composite body, massy volume-element, etc.). What is potential energy? We are used to thinking of it as the energy of configuration – but it is so much more than this, it is the energy of every way in which elemental parts can interact (the configuration of parts with respect to position, but also configuration with respect to relative speed, orientation, and so on). In short, potential energy is interaction-of-parts energy.

(2) We have our revised basic ingredients, what next? What happens is that instead of particles balancing forces, there is now a balance between total kinetic energy, T, and total potential energy, V. We see straight away the superiority of the P of LA – ‘kinetic energy’ and ‘potential energy’ are both the same sort of stuff, both energies, so obviously they can be in balance. (In N Mech, how do particles balance forces? Answer – by a curious stratagem, the bringing-in of Space and Time in the form of ‘acceleration’.)

(3) In N Mech, ma equals F, that is to say, -ma opposes +F. We sometimes say ‘ -ma is an inertial reaction against F ’. In the P of LA, the idea of inertia is generalized. First, it is kinetic energy that is the new inertia. Second, the inertial response is a reaction against V (rather than against F); third, all the individual kinetic energies respond together to react against V. How do they know how to cooperate, know how to act together against V? Don’t ask, they just do! (A better answer is given in the Q & A below.) In summary:

Newton, ‘-ma’ reacts singly against ‘F’
P of LA, ‘kinetic energies’ react in concert against ‘potential energy’

P of LA, ‘-T’ reacts against ‘V’

(4) The P of LA seeks to find the ‘equilibrium state’ of a system, and therefore it must know all the boundary conditions of this system (how else to know whether the system is closed?). For example, Feynman tells of a ball being thrown up against gravity. As the ball climbs higher, the magnitude of V increases (some of you will know that V=mgh in this example), and this would be useful for reducing T-V. Does this mean that the higher the better? No, because the ball has to use up time to get higher, and yet there’s only a certain amount of time available (the end-time and the end-position are stipulated). The higher the ball goes the more kinetic energy it will have to deploy in order to be sure of getting back to the end-position in time. In other words, the boundary conditions are a crucial part of the problem. How does the ball know, in the middle of its trajectory, where and when it must stop? Once again, don’t ask.
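
The thrown-ball example can be tried numerically. The sketch below is a rough illustration, not Feynman’s own calculation: it chops the time-window into small steps, adds up (T-V) times the step length, and compares the true parabolic path with trial paths that start and end at the same place and time but climb higher or lower. The function name `action`, the 2-second time-window and the sine-shaped ‘bump’ are all assumptions made for the demonstration.

```python
import numpy as np


def action(x, t, m=1.0, g=9.81):
    """Discretised Action: the sum of (T - V) * dt along a sampled path x(t)."""
    dt = np.diff(t)
    v = np.diff(x) / dt              # velocity on each small time-step
    h = 0.5 * (x[:-1] + x[1:])       # mid-height on each small time-step
    T = 0.5 * m * v**2               # kinetic energy
    V = m * g * h                    # potential energy, V = mgh
    return np.sum((T - V) * dt)


g = 9.81
t_end = 2.0                                  # stipulated end-time (assumed value)
t = np.linspace(0.0, t_end, 201)
true_path = 0.5 * g * t * (t_end - t)        # parabola that leaves and returns to h = 0

# Trial paths: same end-points, but pushed higher (+) or lower (-) in the middle
for bump in (-1.0, -0.1, 0.0, 0.1, 1.0):
    trial = true_path + bump * np.sin(np.pi * t / t_end)
    print(f"bump {bump:+.1f} m  ->  Action = {action(trial, t):.4f}")
```

Every trial path, whether it climbs higher or lower than the parabola, should give a larger Action than the bump-free path. Going higher gains on V but costs more in T, exactly the trade-off described above.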

Why T-V ?

As explained above, T and V oppose each other, and so we are concerned with the difference between them (through time). This makes good physical sense as it means that Action does not grow without limit. But it still leaves us with the option of minimizing ‘T-V’ or minimizing ‘V-T’. Now it so happens that T is always a positive quantity (mass is positive, and velocity-squared is positive) and therefore it seems more natural to state the problem as ‘T reacts against -V’ rather than ‘-T reacts against V’; in other words, we choose the first option, minimizing ‘T-V’.
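
The choice of sign can be checked in the simplest possible case: a single particle of mass m moving along one coordinate x in a potential V(x). This is a deliberately stripped-down illustration, and the symbols L (for T-V) and x are notation assumed here. Requiring the Action to be stationary gives the Euler-Lagrange equation:

```latex
L = T - V = \tfrac{1}{2} m \dot{x}^{2} - V(x)
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x}
= m\ddot{x} + \frac{dV}{dx} = 0
\quad\Longrightarrow\quad
m\ddot{x} = -\frac{dV}{dx} .
```

Using V-T instead gives the same equation of motion (the overall sign drops out) but turns the minimum into a maximum. Using T+V gives m\ddot{x} = +dV/dx, so the ball would accelerate uphill; in this simple case it is T-V, not T+V, that reproduces what really happens.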

Also, in the P of LA, we see that Newton’s Third Law of Motion needn’t always apply – indeed, it has been generalized. This is consistent with experimental findings and with developments in modern physics (for example, the interaction between two electrons does not always obey Newton’s Third Law). You can see immediately that the P of LA is making additional and larger claims about the physical world, it is not just a re-hash of N Mech.

A particle is always a particle, and a force is always a force – but there is no hard and fast distinction between T and V. So kinetic energy can morph into potential energy, and potential energy can morph into kinetic energy, and not just once but a continual morphing instant by instant throughout the whole time-window of the problem. One of the best examples of this is the electrolytic cell. Negative ions in solution speed away from the negative electrode and eventually reach the positive electrode. However, as more negative ions arrive and crowd around the positive electrode they form a sort of shield around it, reducing the attraction for the next wave of negative ions. In effect, the potential difference (the potential energy) between the two electrodes has been reduced. The reduced potential in turn means that the next batch of negative ions arrive with slightly less kinetic energy than their predecessors, and so on.

The P of LA occurs in an abstract ‘curved space’ – contrived on a system-by-system basis. How the abstract spaces come about (I call them Narnia and Never Never Land) and why they are so useful is described in my book, The Lazy Universe. It is thrilling – and also fantastic confirmation of the method – that, on occasion, the abstract space coincides exactly with actual everyday space (for example, starlight follows curves of maximum ‘proper time’ in everyday spacetime; and a (non-spinning, frictionless) cricket ball follows a parabolic path in everyday ‘Cartesian space’).

N Mech seems simpler and more intuitive; it deals with the forward-marching directive of a force on a particle (from where you are now you determine what will happen next, no need to know the end-conditions), but the simplicity is all to do with the cherry-picked simplicity of the scenario. And there are so many cases where the Newtonian approach can barely begin – for example, the three-body problem and the stability of the Solar System. Another example is from materials science (which has a family resemblance to Einstein’s Theory of Gravitation): a steel girder may be translated through space following a prod at one end, but in reality the girder is not 100% rigid, and so it may compress, bend, twist, buckle or shear. The Variational Mechanics can cope with these more complicated responses for the very reason that it models the problem in a more realistic way (e.g. by accepting that there is a field inside the girder).

Now it may be that within the precision of a given scenario, a lever-arm is in effect rigid, and a cord is inextensible – there are no internal movements within the lever-arm or within the cord and so each is a mini-system ‘in equilibrium’. We remember that the P of LA is a principle about equilibrium, the equilibrium between T and V (see “The Lazy Universe” for more details). Above all, the P of LA is a consistent theory – if it works at all then it must work for any subsystem. As forces play no part in the analysis for a system in equilibrium, neither will they play a part for a subsystem in equilibrium. Therefore, in the P of LA, there is no need to worry about internal forces of constraint. Amazing but true.

Beyond T-V

We have explained how kinetic energy (rather than an accelerating mass) is the true inertial response. Let’s go further. ‘T’ represents all the individual component kinetic energies (whatever constitutes ‘individual component’ in the given scenario), and ‘V’ represents all the whole-structure or non-individual energies. We now ask: “Do the ‘individual component energies’ have to be kinetic?” We find that the answer is “No, they don’t have to be kinetic”. Therefore, instead of a balance between ‘T’ and ‘V’ we have the more general dichotomy: ‘individual component energies’ versus ‘whole structure energies’.

An example of an individual component energy is the mass, m, (we know from Einstein’s E=mc² that m is a form of energy; and what could be a more individual property of something than its mass?). As further proof, in the special case where a body is at rest, its mass-energy is evidently not kinetic, but it still interacts with the energy-structure (the rest-mass of a body is in balance with the local curvature of spacetime, in accordance with the P of LA). Another example of a non-kinetic individual response is the orientation taken up by a tiny dipole in an external magnetic field.

In summary, we have moved from a dichotomy between ‘T’ and ‘V’ to the more general dichotomy:

‘individual energies’ counterbalance ‘structure energies’.

Q & A

– Why must T-V be minimized at each instant as well as over the whole time-window? Answer: It is true that a path of minimized length will of necessity have minimized lengths for each of its infinitesimal line-segments. However, the reverse isn’t necessarily true – an assortment of minimized infinitesimal line-segments does not clinch the one true overall minimal path.

– Surely the conservation laws for Energy and for Angular Momentum are just as inevitable in N Mech as in the P of LA?
Answer: In N Mech these conservation laws are added in as extras, in the P of LA they are deduced.
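
As a sketch of what ‘deduced’ means here, take the standard textbook case of one coordinate q and a Lagrangian L = T-V that does not depend explicitly on time (the symbols q, L and E are notation assumed for this illustration). Using the Euler-Lagrange equation in the last step:

```latex
E \;\equiv\; \dot{q}\,\frac{\partial L}{\partial \dot{q}} - L ,
\qquad
\frac{dE}{dt}
= \dot{q}\left( \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} \right)
= 0 .
```

For L = \tfrac{1}{2} m \dot{q}^{2} - V(q) this conserved quantity is E = m\dot{q}^{2} - \tfrac{1}{2} m \dot{q}^{2} + V = T + V. So T+V (the energy) is a consequence of the principle, while T-V is the quantity the principle is actually built on.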

– Teleology: how does the system know what path to follow?
Two-part answer: 1) ‘know’ is a weasel word; how does a Newtonian particle ‘know’ how to calculate its acceleration in order to ‘know’ where to move to next? 2) The Newtonian model gets away with not specifying end-conditions by assuming infinite and absolute Space and infinite and absolute Time. These assumptions are wrong, and so only a very small number of problems work in N Mech, and they work just because we cherry-pick for the very scenarios where end-conditions are implicit or tacitly assumed (e.g. our cherry-picked scenario has no large gravitating masses, or fast-moving clocks, or secretly spinning reference frames, etc., and so Space and Time may be determined at the boundaries of our special scenario).

– How do the individual component energies cooperate?
Two-part answer: 1) Don’t ask, they just do! 2) ‘Cooperate’ is a loaded word. Better to realize that, for the most part, the multitude of (imagined) infinitesimal reactive responses sum to nonsense (something unphysical), or they cancel each other out, but in one special case (or, very occasionally, a handful of special cases) the reactive responses don’t end up cancelling each other out, and this case is the solution, what really happens.

– Surely Newton’s Third Law of Motion can’t be violated? Answer: not overall, but, yes, at the microscopic level it is often violated.

– When does the P of LA fail?
Answer: the P of LA is mathematical, and it requires ‘V’ to be in the form of a mathematical function. When this requirement cannot be satisfied, then the P of LA cannot be used. This is the state of play at the moment (2018) with regard to dissipative processes (heating, friction, and so on). However (as Lanczos reminds us), these dissipation processes are microscopic in origin, and as our quantum-mechanical theories develop, there is no a priori reason why we shouldn’t be able to discover the appropriate functions and/or slightly re-formulate the mathematical methodology of the P of LA. If this could be achieved, then the P of LA would embrace ALL of physics. There is hope for this outcome, as the P of LA and dissipative processes have a common outlook: they both relate to systems trying to attain equilibrium, and doing so by infinitesimal, local processes. In the meantime, N Mech does sometimes manage to deal with dissipation by mocking it up as one ‘resistive force’.

– Are there other occasions where N Mech is better? Yes, in cases where ‘space’ is ‘flat’ and where the problem can be treated vectorially, then N Mech is easier and more intuitive (but, of course, the answers are exactly the same in either method). For example, (i) finding the equilibrium conditions for three taut cords meeting at a point, (ii) finding the angle at which a swimmer should swim in a straight line across a uniformly flowing river – both good problems for using N Mech. Every time you do calculations in N Mech, remember to say ‘rectangular vectors’ rather than just ‘vectors’. This will remind you how restricted is the range of problems that N Mech can treat.
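
The swimmer problem shows how little machinery the vector method needs. The sketch below assumes the aim is to land directly opposite the starting point, that the river flows along +x, and that the swimmer’s speed through the water is fixed; the function name and the sample speeds are illustrative only.

```python
import math


def heading_angle(swim_speed, river_speed):
    """Upstream angle (radians, measured from straight-across) that cancels the drift."""
    if swim_speed <= river_speed:
        raise ValueError("the swimmer is too slow to hold a straight-across line")
    return math.asin(river_speed / swim_speed)


swim_speed, river_speed = 1.0, 0.5           # m/s, assumed sample values
theta = heading_angle(swim_speed, river_speed)

# Velocity relative to the bank = velocity through the water + velocity of the water
vx = -swim_speed * math.sin(theta) + river_speed   # along the river
vy = swim_speed * math.cos(theta)                  # across the river

print(f"heading: {math.degrees(theta):.1f} degrees upstream")
print(f"bank-frame velocity: along = {vx:.3f} m/s, across = {vy:.3f} m/s")
```

The whole solution is one vector addition in rectangular components: the along-river component is made to vanish, and what is left is the crossing speed.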

– Why bother with the P of LA? (The calculations are harder and less intuitive, and the final results are no different to N Mech.) Answer: At high precision the final results may well be different to N Mech. Also, the Newtonian method is not up to the job in most realistic scenarios. Also, the P of LA, being correct, will lead to more true insight and wisdom into what is happening.

– What are all the ways in which the P of LA is philosophically better (apart from the fact that it’s right!)? (i) no action at a distance, (ii) local (and do we ever have direct experimental access to anything else?), (iii) takes what’s really there (e.g. the local distribution of matter within the Earth) and works with that… When you realize that in actual practice our experiments are always carried out in the vicinity of large gravitating masses, then you begin to realize the advantage of this approach, (iv) does not assume the conservation laws for energy, etc., it deduces them, (v) the problem is given in a system-specific but yet more general and objective ‘space’ (in N Mech particles are dropped into ‘space’, in the P of LA they are ‘space’!), (vi) the reference frame can be moving (no need for the ‘fictitious forces’ of N Mech), (vii) in the version known as Hamiltonian Mechanics we have a powerful method that enables physicists to see the wood for the trees, (viii) Hamiltonian Mechanics also ‘explains’ wave-particle duality, and shows that classical mechanics is a special case of quantum mechanics!

Summary
To return to our starting comparison of Newtonian and Variational Mechanics: in the former we perform a sum over particles. In the latter, which now has to do with energies, we appreciate that any system may be partitioned into two (system-specific) parts, ‘elemental components’ and ‘whole-structures’, but the two can never be cleanly excised one from the other; there is an overlapping and intertwining between them. The only way that the new ‘sum’ can be carried out, without missing out any parts or over-counting them, is via the method of the Principle of Least Action. The P of LA is therefore so much more than a mere improvement. Imagine adding one domino (the new Principle) and simultaneously setting off myriad dominoes falling into place along a thousand domino-runs radiating out in every direction – these domino-runs are our new unified understanding in every arena of physics.

References

Lanczos C, The Variational Principles of Mechanics, Dover 1970.
Feynman RP, The Feynman Lectures on Physics, Volume II,
       Chapter 19, Addison-Wesley, 1963.
Sussman GJ and Wisdom J, Structure and Interpretation of Classical
       Mechanics, The MIT Press 2015.
Coopersmith J, The Lazy Universe: an introduction to the Principle 
       of Least Action, Oxford University Press, 2017.
Gray CG, Principle of Least Action, Scholarpedia, (2009) 4(12):8291.
