2013-03-31

Featured Comments: Week of 2013 March 24

There were two posts that got a few comments each, so I will repost most of those.

Review: Linux Mint MATE 201303

Reader Gary Newell said, "This is a good in depth review. I am currently using the Mint 14 Cinnamon release (Ubuntu base). If you have a powerful enough PC then Cinnamon is the best desktop as far as I can tell. I prefer the Consort desktop used by SolusOS to the MATE desktop and if I have an older PC I actually overall prefer to use XFCE and so tend to run Xubuntu."
Commenter Juan Carlos García Ramírez had this to say: "I still prefer xfce :D so linux mint 14 (Nadia) xfce for me".

Review: Pardus 2013 KDE

Reader Megatotoro shared, "I tested Pardus 2013 as well. In my case, I could test the repositories using Synaptic and could download the localization file for my language. I did notice that even so, some programs were still in Turkish (VLC, Synaptic). I have the Release Candidate installed on my laptop and it is greatly stable."
Commenter Mechatotoro had this bit of support: "@Prashanth, Thank you for your time with Pardus and your review. Good luck going back to school! @Mega, I just came from your blog and recommended you to read this review. I guess you are too fast :-)"

Thanks to all those who commented on those posts. I am back on campus now, and the semester isn't about to wait for me to settle down, so it will be back in full swing any minute now. This means that my post frequency will once again decrease through the rest of the semester. Anyway, if you like what I write, please continue subscribing and commenting!

2013-03-29

Review: Pardus 2013 KDE

My spring break is coming to an end (I only have 1.5 more days), so I figured it might be nice to do another review while I still can. Today I'm reviewing Pardus 2013.

Main Screen + KDE Kickoff Menu
Pardus is a distribution developed at least in part by the Turkish military. It was originally an independent distribution, not based on any other, and it used its unique PISI package management system, which featured delta upgrades (meaning that only the differences between package versions would be applied for upgrades, greatly reducing their size). Since then, though, the organization largely responsible for the development of Pardus went through some troubles. One result was the forking of Pardus into PISI Linux to further develop the original alpha release of Pardus 2013. The other result was the rebasing of Pardus on Debian, abandoning PISI in that regard. Now Pardus 2013 is a distribution based on Debian 7 "Wheezy" that uses either KDE 4.8 or GNOME 3 (whatever version is packaged in the latest version of Debian, though I'm not sure what that is).

I reviewed Pardus on a live USB made with MultiSystem. Follow the jump to see what it's like.

2013-03-27

Hamiltonian Density and the Stress-Energy Tensor

As an update to a previous post about my adventures in QED-land for 8.06, I emailed my recitation leader about whether my intuition about the meaning of the Fourier components of the electromagnetic potential solving the wave equation (and being quantized to the ladder operators) was correct. He said it basically is correct, although there are a few things that I kept in mind at the time but need to continue to keep in mind throughout. The first is that the canonical quantization procedure uses the potential $\vec{A}$ as the coordinate-like quantity and finds the conjugate momentum to this field to be proportional to the electric field $\vec{E}$, with the magnetic field appearing nowhere in the Hamiltonian directly. The second is that there is a different harmonic oscillator for each mode, and the number eigenstates do not represent the energy of a given photon but instead represent the number of photons present with an energy corresponding to that mode. Hence, while coherent states do indeed represent points in the phase space of $(\vec{A}, \vec{E})$, the main point is that the photon number can fluctuate; classical behavior is recovered for large photon number $n$, as the fluctuations of the number scale as $\sqrt{n}$ by Poisson statistics, while the interesting physics happens for low-$n$ eigenstates or superpositions thereof, in which $a$ and $a^{\dagger}$ play the same role as in the usual quantum harmonic oscillator. The third issue is that only a particular mode $\vec{k}$ and position $\vec{x}$ can be considered at a time, because the electromagnetic potential has a value for each of those quantities, so unless those are held constant, the picture of phase space $(\vec{A}, \vec{E})$ becomes infinite-dimensional. Related to this, the fourth and fifth issues are, respectively, that $\vec{A}$ is used as the field and $\vec{E}$ as its conjugate momentum rather than using $\vec{E}$ and $\vec{B}$ because the latter two fields are coupled to each other by the Maxwell equations, so they form an overcomplete set of degrees of freedom (or something like that), whereas using $\vec{A}$ as the field and finding its conjugate momentum in conjunction with a particular gauge choice (usually the Coulomb gauge $\nabla \cdot \vec{A} = 0$) yields the correct number of degrees of freedom. These explanations seem convincing enough to me, so I will leave them there for the moment.
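To see the first point explicitly, here is my own quick algebra (in Gaussian units with no sources, so that $\vec{E} = -\frac{1}{c}\partial_t \vec{A}$ in the Coulomb gauge): using the Lagrangian density $\mathcal{L} = \frac{1}{8\pi}\left(\vec{E}^2 - \vec{B}^2\right)$, \[ \Pi_j = \frac{\partial \mathcal{L}}{\partial (\partial_t A_j)} = \frac{\partial}{\partial (\partial_t A_j)} \frac{1}{8\pi} \left( \frac{1}{c^2} (\partial_t \vec{A})^2 - (\nabla \times \vec{A})^2 \right) = \frac{\partial_t A_j}{4\pi c^2} = -\frac{E_j}{4\pi c}, \] so the momentum conjugate to $\vec{A}$ is indeed proportional to $\vec{E}$, while $\vec{B} = \nabla \times \vec{A}$ enters the Hamiltonian only through the $\vec{B}^2$ term rather than through any conjugate momentum.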

Another major issue that I brought up with him, for which he didn't give me a complete answer, was the issue that the conjugate momentum to $\vec{A}$ was being found through \[ \Pi_j = \frac{\partial \mathcal{L}}{\partial (\partial_t A_j)} \] given the Lagrangian density $\mathcal{L} = \frac{1}{8\pi} \left(\vec{E}^2 - \vec{B}^2 \right)$ and the field relations $\vec{E} = -\frac{1}{c}\partial_t \vec{A}$ & $\vec{B} = \nabla \times \vec{A}$. This didn't seem manifestly Lorentz-covariant to me, because in the class 8.033 — Relativity, I had learned that the conjugate momentum to the electromagnetic potential $A^{\mu}$ in the above Lagrangian density would be the 2-index tensor \[ \Pi^{\mu \nu} = \frac{\partial \mathcal{L}}{\partial (\partial_{\mu} A_{\nu})} .\] This would make a difference in finding the Hamiltonian density \[ \mathcal{H} = \sum_{j} \Pi_j \partial_t A_{j} - \mathcal{L} = \frac{1}{8\pi} \left(\vec{E}^2 + \vec{B}^2 \right). \] I thought that the Hamiltonian density would need to be a Lorentz-invariant scalar just like the Lagrangian density. As it turns out, this is not the case, because the Hamiltonian density represents the energy, which explicitly picks out the temporal direction as special, so time derivatives are OK in finding the momentum conjugate to the potential; because the Lagrangian and Hamiltonian densities look so similar, it seems like both could be Lorentz-invariant scalar functions, but deceptively, only the former is so. At this point, I figured that because the Hamiltonian and the (not field-conjugate, but physical) momentum looked so similar, they could arise from the same covariant vector. However, there is no "natural" 1-index vector with which to multiply the Lagrangian density to get some sort of covariant vector generalization of the Hamiltonian density, though there is a 2-index tensor, and that is the metric. I figured here that the Hamiltonian and momentum for the electromagnetic field could be related to the stress-energy tensor, which gives the energy and momentum densities and fluxes. After a while of searching online for answers, I was quite pleased to find my intuition to be essentially spot-on: indeed the conjugate momentum should be a tensor as given above, the Legendre transformation can then be done in a covariant manner, and it does in fact turn out that the result is just the stress-energy tensor \[ T^{\mu \nu} = \sum_{\xi} \Pi^{\mu \xi} \partial^{\nu} A_{\xi} - \mathcal{L}\eta^{\mu \nu} \] (UPDATE: the index positions have been corrected) for the electromagnetic field. Indeed, the time-time component is exactly the energy/Hamiltonian density $\mathcal{H} = T_{(0, 0)}$, and the Hamiltonian $H = \sum_{\vec{k}} \hbar\omega(\vec{k}) \cdot (\alpha^{\star} (\vec{k}) \alpha(\vec{k}) + \alpha(\vec{k}) \alpha^{\star} (\vec{k})) = \int T_{(0, 0)} d^3 x$. As it turns out, the momentum $\vec{p} = \sum_{\vec{k}} \hbar\vec{k} \cdot (\alpha^{\star} (\vec{k}) \alpha(\vec{k}) + \alpha(\vec{k}) \alpha^{\star} (\vec{k}))$ doesn't look similar just by coincidence: $p_j = \int T_{(0, j)} d^3 x$. The only remaining point of confusion is that it seems like the Hamiltonian and momentum should together form a Lorentz-covariant vector $p_{\mu} = (H, p_j)$, yet if the stress-energy tensor respects Lorentz-covariance, then integrating over the volume element $d^3 x$ won't respect transformations in a Lorentz-covariant manner.
I guess that because the individual components of the stress-energy tensor transform under a Lorentz boost and the volume element does as well, the vector $p_{\mu}$ as given above may well respect Lorentz-covariance. (UPDATE: another issue I was having but forgot to write before clicking "Publish" was the fact that only the $T_{(0, \nu)}$ components are being considered. I wonder if there is some natural 1-index Lorentz-covariant vector $b_{\nu}$ to contract with $T_{\mu \nu}$ so that the result is a 1-index vector which in a given frame has a temporal component given by the Hamiltonian density and spatial components given by the momentum density.) Overall, I think it is interesting that this particular hang-up was over a point in classical field theory and special relativity and had nothing to do with the quantization of the fields; in any case, I think I have gotten over the major hang-ups about this and can proceed with reading what I need to read for the 8.06 paper.
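For reference, here is a sketch (my own, using standard results, so the index conventions and factors of $c$ should be double-checked) of where this construction cleans up: the canonical tensor above is neither symmetric nor gauge-invariant, but adding the usual Belinfante total-derivative term, which does not change the conserved integrals, brings it to the familiar symmetric form \[ T^{\mu \nu} = \frac{1}{4\pi} \left(F^{\mu \xi} F^{\nu}_{\; \xi} - \frac{1}{4} \eta^{\mu \nu} F^{\xi \sigma} F_{\xi \sigma} \right) \] in terms of the field tensor $F_{\mu \nu} = \partial_{\mu} A_{\nu} - \partial_{\nu} A_{\mu}$. Its time-time component is $T_{(0, 0)} = \frac{1}{8\pi}\left(\vec{E}^2 + \vec{B}^2\right) = \mathcal{H}$ as stated above, and its time-space components are $T_{(0, j)} = \frac{1}{4\pi}\left(\vec{E} \times \vec{B}\right)_j$, which is the Poynting vector divided by $c$ (i.e. the momentum density, up to factors of $c$), consistent with the integrals for $H$ and $p_j$ above.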

2013-03-26

Schrödinger and Biot-Savart

There are two things that I would like to post here today. The first is something I have been mulling over for a while. The second is something that I thought about more recently.

Time evolution in nonrelativistic quantum mechanics occurs according to the [time-dependent] Schrödinger equation \[ H|\Psi\rangle = i\hbar \frac{\partial}{\partial t} |\Psi\rangle .\] While this at first may seem intractable, the trick is that typically the Hamiltonian is not time-dependent, so a candidate solution could be $|\Psi\rangle = \phi(t)|E\rangle$. Plugging this back in yields time evolution that occurs through the phase $\phi(t) = e^{-\frac{iEt}{\hbar}}$ applied to energy eigenstates that solve \[ H|E\rangle = E \cdot |E\rangle, \] and this equation is often called the "time-independent Schrödinger equation". When I was taking 8.04 — Quantum Physics I, I agreed with my professor, who called this a misnomer, in that the Schrödinger equation is supposed to only describe time evolution, so what is being called "time-independent" is more properly just an energy eigenvalue equation. That said, I was thinking that the "time-independent Schrödinger equation" is really just like a Fourier transform of the Schrödinger equation from time to frequency (related to energy by $E = \hbar\omega$), so the former could be an OK nomenclature because it is just a change of basis. However, there are two things to note: the Schrödinger equation is basis-independent, whereas the "time-independent Schrödinger equation" is expressed only in the basis of energy eigenstates, and time is not an observable quantity (i.e. a Hermitian operator) but is a parameter, so the change of basis/Fourier transform argument doesn't work in quite the same way that it does for position versus momentum. Hence, I've come to the conclusion that it is better to call the "time-independent Schrödinger equation" the energy eigenvalue equation.
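To spell out the separation step (a standard manipulation, filled in here for completeness): if $H$ has no explicit time dependence, substituting $|\Psi\rangle = \phi(t)|E\rangle$ into the Schrödinger equation gives \[ \phi(t) H |E\rangle = i\hbar \dot{\phi}(t) |E\rangle, \] and if $|E\rangle$ satisfies $H|E\rangle = E \cdot |E\rangle$, this reduces to the ordinary differential equation $i\hbar\dot{\phi} = E\phi$, whose solution is exactly the phase $\phi(t) = e^{-\frac{iEt}{\hbar}}$ quoted above.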

Switching gears, I was thinking about how the Biot-Savart law is derived. My AP Physics C teacher told me that the Ampère law is derived from the Biot-Savart law. However, this is patently not true, because the Biot-Savart law only works for charges moving at a constant velocity, whereas the Ampère law is true for magnetic fields created by any currents or any changing electric fields. In 8.022 — Physics II, I did see a derivation of the Biot-Savart law from the Ampère law, showing that the latter is indeed more fundamental than the former, but it involved the magnetic potential and a lot more work. I wanted to see if that derivation still made sense to me, but then I realized that because magnetism essentially springs from the combination of electricity and special relativity, and because the Biot-Savart law relies on the approximation of the charges moving at a constant velocity, it should be possible to derive the Biot-Savart law from the Coulomb law and special relativity. Indeed, it is possible. Consider a charge $q$ whose electric field is \[ \vec{E} = \frac{q}{r^2} \vec{e}_r \] in its rest frame. Note that the Coulomb law is exact in the rest frame of a charge. Now consider a frame moving with respect to the charge at a velocity $-\vec{v}$, so that observers in that frame see the charge move at a velocity $\vec{v}$. Considering only the component of the magnetic field perpendicular to the relative motion, noting that there is no magnetic field in the rest frame of the charge, and taking the low-speed limit $\left|\frac{\vec{v}}{c}\right| \ll 1$ (which is the range of validity of the Biot-Savart law) so that $\gamma \approx 1$ yields $\vec{B} \approx -\frac{\vec{u}}{c} \times \vec{E}$, where $\vec{u}$ is the velocity of the new frame relative to the charge. Plugging in $\vec{u} = -\vec{v}$ and plugging in the Coulomb expression for $\vec{E}$ yields the Biot-Savart law \[ \vec{B} = \frac{q\vec{v} \times \vec{e}_r}{cr^2}. \] One thing to be emphasized again is that the Coulomb law is exact in the rest frame of the charge, while the Biot-Savart law is always an approximation, because a moving charge has an electric field that deviates from the Coulomb expression; the fact that the Biot-Savart law is a low-speed approximation is why I feel comfortable doing the derivation this way.
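For completeness, here is the intermediate algebra sketched with the standard field-transformation rules (in Gaussian units; I am filling in this step myself, so the conventions should be double-checked): for a frame moving with velocity $\vec{u}$ relative to the charge's rest frame, the transverse fields transform as \[ \vec{B}'_{\perp} = \gamma\left(\vec{B} - \frac{\vec{u}}{c} \times \vec{E}\right)_{\perp}, \] and with $\vec{B} = \vec{0}$ in the rest frame, $\gamma \approx 1$, and $\vec{u} = -\vec{v}$, this becomes \[ \vec{B}' \approx \frac{\vec{v}}{c} \times \vec{E} = \frac{q\vec{v} \times \vec{e}_r}{cr^2}, \] which is exactly the Biot-Savart law for a single slowly moving charge.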

2013-03-25

Review: Linux Mint MATE 201303

For those of you who have been waiting for a review, I think I may have said before that my writing would shift more to science-y stuff and away from distribution reviews. However, that does not mean that reviews will stop entirely. I'm on spring break now and have a little more time to do these reviews, so today I am reviewing Linux Mint MATE 201303, which came out earlier this week.

Main Screen + Linux Mint Menu
This is the version of Linux Mint based on Debian rather than Ubuntu. It uses a variant of a rolling-release model: while existing users can get the latest and greatest software simply by applying updates as usual, the updates come in large bundles (I almost want to say they are like the Microsoft Windows Service Packs, except that they work) rather than as individual package files. This means that the most common packages used on a Debian-based Linux Mint system are tested to guarantee that they work not only individually but also together, so the problem of an individual update breaking other dependencies becomes moot. Around the time a new update pack is released, a new ISO file snapshot of the distribution is released as well, as was the case this time around.

I reviewed the [32-bit] MATE edition using a live USB made with MultiSystem; I wanted to review the Cinnamon edition too, but it refused to boot, so I will leave my assessment of it at that. I also did an installation of this (which regular readers know is rare), so you will have to follow the jump to see what this is like.

2013-03-21

Time and Temperature are Complex

In a post from a few days ago, I briefly mentioned the notion of imaginary time with regard to angular momentum. I'd like to go into that a little further in this post.

In 3 spatial dimensions, the flat (Euclidean) metric is $\eta_{ij} = \delta_{ij}$, which is quite convenient, as lengths are given by $(\Delta s)^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2$, which is just the usual Pythagorean theorem. When a temporal dimension is added, as in special relativity, the coordinates are now $x^{\mu} = (ct, x_{j})$, and the Euclidean metric becomes the Minkowski metric $\eta_{\mu \nu} = \mathrm{diag}(-1, 1, 1, 1)$, so that $\eta_{tt} = -1$, $\eta_{(t, j)} = 0$, and $\eta_{ij} = \delta_{ij}$. This means that spacetime intervals become $(\Delta s)^2 = -(c\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2$, which is the normal Pythagorean theorem only if $\Delta t = 0$. In general, time coordinate differences contribute negatively to the spacetime interval. In addition, Lorentz transformations are given by a hyperbolic rotation by a [hyperbolic] angle $\alpha$ equal to the rapidity given by $\frac{v}{c} = \tanh(\alpha)$. This doesn't look quite the same as normal Euclidean geometry. However, a transformation to imaginary time, called a Wick rotation, can be done by setting $\tau = it$, so that $x^{\mu} = (c\tau, x_{j})$, $\eta_{\mu \nu} = \delta_{\mu \nu}$, and $(\Delta s)^2 = (c\Delta \tau)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2$ as in the usual Pythagorean theorem, while the Lorentz transformation becomes a real rotation through the angle $\theta = i\alpha$ (though I may have gotten some of these signs wrong, so forgive me). Now, the connection to the component $L_{(0, j)}$ of the angular momentum tensor should be more clear.
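To make that concrete, here is the boost written out for one spatial direction (my own quick check; the overall sign of the angle is subject to the same caveat about conventions). A boost along $x$ is \[ \begin{pmatrix} ct' \\ x' \end{pmatrix} = \begin{pmatrix} \cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & \cosh\alpha \end{pmatrix} \begin{pmatrix} ct \\ x \end{pmatrix}, \] and substituting $c\tau = ict$ and using $\cos(-i\alpha) = \cosh\alpha$ and $\sin(-i\alpha) = -i\sinh\alpha$ turns this into \[ \begin{pmatrix} c\tau' \\ x' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c\tau \\ x \end{pmatrix}, \qquad \theta = -i\alpha, \] which has exactly the form of an ordinary rotation, just through an imaginary angle.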

I first encountered this in the class 8.033 — Relativity, where I was able to explore this curiosity on a problem set. That question and the accompanying discussion seemed to say that while this is a cool thing to try doing once, it isn't really useful, especially because it does not hold true in general relativity with more general metrics $g_{\mu \nu} \neq \eta_{\mu \nu}$ except in very special cases. However, as it turns out, imaginary time does play a role in quantum mechanics, even without the help of relativity.

Schrödinger time evolution occurs through the unitary transformation $u = e^{-\frac{itH}{\hbar}}$ satisfying $uu^{\dagger} = u^{\dagger} u = 1$. This means that the amplitude (whose squared magnitude is the probability [density]) for an initial state $|\psi\rangle$ to end after time $t$ in the same state is $\mathfrak{p}(t) = \langle\psi|e^{-\frac{itH}{\hbar}}|\psi\rangle$. Meanwhile, assuming the states $|\psi\rangle$ form a complete and orthonormal basis (though I don't know if this assumption is truly necessary), the partition function is $Z = \mathrm{trace}\left(e^{-\frac{H}{k_B T}}\right)$, which can be expanded in the basis $|\psi\rangle$ as $Z = \sum_{\psi} \langle\psi|e^{-\frac{H}{k_B T}}|\psi\rangle$. This, however, is just as well rewritten as $Z = \sum_{\psi} \mathfrak{p}\left(t = -\frac{i\hbar}{k_B T}\right)$. Hence, quantum and statistical mechanical information can be obtained from the same amplitudes using the substitution $t = -\frac{i\hbar}{k_B T}$, which essentially makes temperature a reciprocal imaginary time. This is not really meant to show anything deep or profound about the connection between time and temperature; it is really more of a trick stemming from the fact that the same Hamiltonian can be used to solve problems in quantum mechanics or equilibrium statistical mechanics.
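Since the trick is easy to check numerically, here is a quick sketch (my own, for a truncated harmonic oscillator; the truncation size and parameter values are arbitrary) showing that the trace of the Boltzmann factor matches the sum of return amplitudes evaluated at imaginary time:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0; kB = 1.0; omega = 1.0; T = 0.7  # arbitrary units and values
N = 200  # basis truncation; high levels are exponentially suppressed here

n = np.arange(N)
H = np.diag(hbar * omega * (n + 0.5))  # harmonic oscillator in its number basis

# Partition function directly: Z = trace(exp(-H/(kB*T)))
Z_direct = np.trace(expm(-H / (kB * T)))

# The same quantity from the return amplitudes p(t) = <n|exp(-i t H/hbar)|n>
# evaluated at the imaginary time t = -i*hbar/(kB*T) from the text
t = -1j * hbar / (kB * T)
U = expm(-1j * t * H / hbar)
Z_from_amplitudes = np.trace(U).real

print(Z_direct, Z_from_amplitudes)  # both ~ exp(-x/2)/(1-exp(-x)) with x = hbar*omega/(kB*T)
```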

As an aside, it turns out that temperature, even when measured on an absolute scale, can be negative. There are plenty of papers on this online, but suffice it to say that this comes from a more general statistical definition of temperature. Rather than defining temperature (as it commonly is) as the average kinetic energy of particles, it is better to define it through the probability distribution of particles over energy states. Usually, particles occupy lower energy states with higher probability than higher energy states, and as a consequence, the temperature is positive. However, it is possible (and has been done repeatedly) under certain circumstances to cleverly force the system into a state in which particles occupy higher energy states with higher probability than lower energy states, and this is exactly what negative temperature is. More formally, $\frac{1}{T} = \frac{\partial S}{\partial E}$, where $E$ is the energy and $S$ is the entropy of the system, which is a measure of how many different states the system can possibly have for a given energy. For positive temperatures, if two objects of different temperatures are brought into contact, energy will flow from the hotter one to the colder one, cooling the former and heating the latter until equal temperatures are achieved. If an object with negative temperature is brought in contact with an object that has positive temperature, though, each object still tends to increase its own entropy. Like most normal objects, the latter does this by absorbing energy, but by the definition of temperature, the former does this by releasing energy, meaning the former will spontaneously heat the latter. Hence, negative temperatures are hotter than positive temperatures; this is a quirk of the definition of reciprocal temperature, so really what is happening is that absolute zero approached from the positive side is still the coldest possible temperature, absolute zero approached from the negative side is now the hottest temperature, and $\pm \infty$ is in the middle.
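As a toy illustration of this definition (my own sketch; the system and the numbers are made up), consider $N$ two-level particles: the entropy as a function of energy rises, peaks, and then falls as the population inverts, so $\frac{1}{T} = \frac{\partial S}{\partial E}$ changes sign:

```python
import numpy as np
from scipy.special import gammaln  # log-factorials, to avoid overflow

N = 1000   # number of two-level particles (made up)
eps = 1.0  # energy of the excited level (made up)
kB = 1.0

m = np.arange(1, N)  # number of excited particles
E = m * eps          # total energy
# S = kB * ln(binomial(N, m)): how many ways to excite m of the N particles
S = kB * (gammaln(N + 1) - gammaln(m + 1) - gammaln(N - m + 1))

inv_T = np.gradient(S, E)  # 1/T = dS/dE

print(inv_T[N // 4] > 0)      # mostly unexcited: positive temperature (True)
print(inv_T[3 * N // 4] < 0)  # inverted population: negative temperature (True)
```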

This was really just me writing down stuff that I had been thinking about a couple of months ago. I hope this helps someone, and I also await the day when TV newscasters say "complex time brought to you by..." instead of "time and temperature brought to you by...".

2013-03-20

Nonzero Electromagnetic Fields in a Cavity

The class 8.06 — Quantum Physics III requires a final paper, essentially a review article of a certain area of physics that uses quantum mechanics, written at the level of 8.06 (and not much higher). At the same time, I have also been looking into other possible UROP projects because while I am quite happy with my photonic crystals UROP and would be pleased to continue with it, that project is the only one I have done at MIT thus far, and I would like to try at least one more thing before I graduate. My advisor suggested that I not do something already done to death like the Feynman path integrals in the 8.06 paper but instead do something that could act as a springboard in my UROP search. One of the UROP projects I have been investigating has to do with Casimir forces, but I pretty much don't know anything about that, QED, or [more generally] QFT. Given that other students have successfully written 8.06 papers about Casimir forces, I figured this would be the perfect way to teach myself what I might need to know to be able to start on a UROP project in that area. Most helpful thus far has been my recitation leader, who is a graduate student working in the same group that I have been looking into for UROP projects; he has been able to show me some of the basic tools in Casimir physics and point me in the right direction for more information. Finally, note that there will probably be more posts about this in the near future, as I'll be using this to jot down my thoughts and make them more coherent (no pun intended) for future reference.

Anyway, I've been able to read some more papers on the subject, including Casimir's original paper on it as well as Lifshitz's paper going a little further with it. One of the things that confused me in those papers (and in my recitation leader's explanation, which was basically the same thing) was the following. The explanation ends with the notion that quantum electrodynamic fluctuations in a space with a given dielectric constant, say in a vacuum surrounded by two metal plates, will cause those metal plates to attract or repel in a manner dependent on their separation. This depends on the separation being comparable to the wavelength of the electromagnetic field (or something like that), because at much larger distances, the power of normal blackbody radiation (which ironically still requires quantum mechanics to be explained) does not depend on the separation of the two objects, nor does it really depend on their geometries, but only on their temperatures. The explanation of the Casimir effect starts with the notion of an electromagnetic field confined between two infinite perfectly conducting parallel plates, so the fields form standing waves like the wavefunctions of a quantum particle in an infinite square well. This is all fine and dandy...except that this presumes that there is an electromagnetic field. This confused me: why should one assume the existence of an electromagnetic field, and why couldn't it be possible to assume that there really is no field between the plates?

Then I remembered what the deal is with quantization of the electromagnetic field and photon states from 8.05 — Quantum Physics II. The derivation from that class still seems quite fascinating to me, so I'm going to repost it here. You don't need to know QED or QFT, but you do need to be familiar with Dirac notation and at least a little comfortable with the quantization of the simple harmonic oscillator.

Let us first get the classical picture straight. Consider an electromagnetic field inside a cavity of volume $\mathcal{V}$. Let us only consider the lowest-energy mode, which is when $k_x = k_y = 0$ so only $k_z > 0$, stemming from the appropriate application of boundary conditions. The energy density of the system can be given as \[u = \frac{1}{8\pi} \left(\vec{E}^2 + \vec{B}^2 \right)\] and the fields that solve the dynamic Maxwell equations \[\nabla \times \vec{E} = -\frac{1}{c} \frac{\partial \vec{B}}{\partial t}\] \[\nabla \times \vec{B} = \frac{1}{c} \frac{\partial \vec{E}}{\partial t}\] as well as the source-free Maxwell equations \[\nabla \cdot \vec{E} = \nabla \cdot \vec{B} = 0\] can be written as \[\vec{E} = \sqrt{\frac{8\pi}{\mathcal{V}}} \omega Q(t) \sin(kz) \vec{e}_x\] \[\vec{B} = \sqrt{\frac{8\pi}{\mathcal{V}}} P(t) \cos(kz) \vec{e}_y\] where $\vec{k} = k_z \vec{e}_z = k\vec{e}_z$ and $\omega = c|\vec{k}|$. The prefactor comes from normalization, the spatial dependence and direction come from boundary conditions, and the time dependence is somewhat arbitrary. I think this is because the spatial conditions are unaffected by time dependence if they are separable, and the Maxwell equations are linear, so if a periodic function like a sinusoid or complex exponential in time satisfies Maxwell time evolution, so does any arbitrary superposition (Fourier series) thereof. That said, I'm not entirely sure about that point. Also note that $P$ and $Q$ are not entirely arbitrary, because they are restricted by the Maxwell equations. Plugging the fields into those equations yields conditions on $P$ and $Q$ given by \[\dot{Q} = P\] \[\dot{P} = -\omega^2 Q\] which look suspiciously like simple harmonic motion. Indeed, plugging these electromagnetic field components into the energy density and integrating over the cavity volume (the $\sin^2$ and $\cos^2$ factors each integrate to $\frac{\mathcal{V}}{2}$) yields the Hamiltonian \[H = \frac{1}{2} \left(P^2 + \omega^2 Q^2 \right)\] which is the equation for a simple harmonic oscillator with $m = 1$; this is because the electromagnetic field has no mass, so there is no characteristic mass term to stick into the equation. Note that these quantities have a canonical Poisson bracket $\{Q, P\} = 1$, so $Q$ can be identified as a position and $P$ can be identified as a momentum, though they are actually neither of those things but are simply mathematical conveniences to simplify expressions involving the fields; this will become useful shortly.
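Filling in the curl computation that produces those conditions (my own algebra, so double-check the signs): with $\vec{k} = k\vec{e}_z$ and $\omega = ck$, the Faraday law gives \[ \partial_z E_x = \sqrt{\frac{8\pi}{\mathcal{V}}} \omega k Q \cos(kz) = -\frac{1}{c} \partial_t B_y = -\sqrt{\frac{8\pi}{\mathcal{V}}} \frac{\dot{P}}{c} \cos(kz) \implies \dot{P} = -\omega^2 Q, \] while the Ampère law gives \[ -\partial_z B_y = \sqrt{\frac{8\pi}{\mathcal{V}}} k P \sin(kz) = \frac{1}{c} \partial_t E_x = \sqrt{\frac{8\pi}{\mathcal{V}}} \frac{\omega \dot{Q}}{c} \sin(kz) \implies \dot{Q} = P. \]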

Quantizing this turns the canonical Poisson bracket relation into the canonical commutation relation $[Q, P] = i\hbar$. This also implies that $[E_a, B_b] \neq 0$, which is huge: this means that states of the photon cannot have definite values for both the electric and magnetic fields simultaneously, just as a quantum mechanical particle state cannot have both a definite position and momentum. Now the fields themselves are operators that depend on space and time as parameters, while the states are now vectors in a Hilbert space defined for a given mode $\vec{k}$, which has been chosen in this case as $\vec{k} = k\vec{e}_z$ for some allowed value of $k$. The raising and lowering operators $a$ and $a^{\dagger}$ can be defined in the usual way but with the substitutions $m \rightarrow 1$, $x \rightarrow Q$, and $p \rightarrow P$. The Hamiltonian then becomes $H = \hbar\omega \cdot \left(a^{\dagger} a + \frac{1}{2} \right)$, where again $\omega = c|\vec{k}|$ for the given mode $\vec{k}$. This means that eigenstates of the Hamiltonian are the usual $|n\rangle$, where $n$ specifies the number of photons which have mode $\vec{k}$ and therefore frequency $\omega$; this is in contrast to the single-particle harmonic oscillator eigenstate $|n\rangle$, which specifies that there is only one particle and it has energy $E_n = \hbar \omega \cdot \left(n + \frac{1}{2} \right)$. This makes sense on two counts: for one, photons are bosons, so multiple photons should be able to occupy the same mode, and for another, each photon carries energy $\hbar\omega$, so adding a photon to a mode should increase the energy of the system by a unit of the energy of that mode, and indeed it does. Also note that these number eigenstates are not eigenstates of either the electric or the magnetic fields, just as normal particle harmonic oscillator eigenstates are not eigenstates of either position or momentum. (As an aside, the reason why lasers are called coherent is because they are composed of light in coherent states of a given mode satisfying $a|\alpha\rangle = \alpha \cdot |\alpha\rangle$ where $\alpha \in \mathbb{C}$. These, as opposed to energy/number eigenstates, are physically realizable.)

So what does this have to do with quantum fluctuations in a cavity? Well, if you notice, just as with the usual quantum harmonic oscillator, this Hamiltonian has a ground state energy above the minimum of the potential given by $\frac{1}{2} \hbar\omega$ for a given mode; this corresponds to having no photons in that mode. Hence, even an electrodynamic vacuum has a nonzero ground state energy. Equally important is the fact that while the mean fields $\langle 0|\vec{E}|0\rangle = \langle 0|\vec{B}|0\rangle = \vec{0}$, the field fluctuations $\langle 0|\vec{E}^2|0\rangle \neq 0$ and $\langle 0|\vec{B}^2|0 \rangle \neq 0$; thus, the electromagnetic fields fluctuate with some nonzero variance even in the absence of photons. This relieves the confusion I was having earlier about why any analysis of the Casimir effect assumes the presence of an electromagnetic field in a cavity by way of nonzero fluctuations even when no photons are present. Just to tie up the loose ends, because the Casimir effect is introduced as having the electromagnetic field in a cavity, the allowed modes are standing waves with wavevectors given by $\vec{k} = k_x \vec{e}_x + k_y \vec{e}_y + \frac{\pi n_z}{l} \vec{e}_z$ where $n_z \in \mathbb{Z}$, assuming that the cavity bounds the fields along $\vec{e}_z$ but the other directions are left unspecified. This means that each different value of $\vec{k}$ specifies a different harmonic oscillator, and each of those different harmonic oscillators is in the ground state in the absence of photons. You'll be hearing more about this in the near future, but for now, thinking through this helped me clear up my basic misunderstandings, and I hope anyone else who was having the same misunderstandings feels more comfortable with this now.
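Since the vacuum-fluctuation statement is easy to verify numerically for a single mode, here is a quick sketch (my own, in a truncated number basis with arbitrary parameters): the vacuum has zero mean field but nonzero field variance, and a nonzero ground-state energy.

```python
import numpy as np

N = 30  # truncation of the photon-number basis
a = np.diag(np.sqrt(np.arange(1, N)), k=1)  # lowering operator: a|n> = sqrt(n)|n-1>
adag = a.conj().T

hbar = 1.0; omega = 1.0
Q = np.sqrt(hbar / (2 * omega)) * (a + adag)  # field quadrature, analogous to E

vac = np.zeros(N); vac[0] = 1.0  # |0>, the zero-photon state

print(vac @ Q @ vac)      # <0|Q|0> = 0: the mean field vanishes
print(vac @ Q @ Q @ vac)  # <0|Q^2|0> = hbar/(2*omega) != 0: vacuum fluctuations

H = hbar * omega * (adag @ a + 0.5 * np.eye(N))
print(vac @ H @ vac)      # ground-state energy hbar*omega/2 with no photons present
```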

2013-03-18

A Less-Seen View of Angular Momentum

Many people learn in basic physics classes that angular momentum is a scalar quantity that describes the magnitude and direction (i.e. the sense) of rotation, such that its rate of change is equal to the sum of all torques, $\tau = \dot{L}$, akin to Newton's equation of motion $\vec{F} = \dot{\vec{p}}$. People who take more advanced physics classes, such as 8.012 — Physics I, learn that in fact angular momentum and torque are vectors; in the case of fixed-axis rotation, the moment of inertia (the rotational equivalent of mass) is a scalar, so $\vec{L} = I\vec{\omega}$ means that angular momentum points in the same direction as angular velocity. By contrast, in general rigid body motion, the moment of inertia is anisotropic and becomes a tensor, so \[\vec{L} = \stackrel{\leftrightarrow}{I} \cdot \vec{\omega}\] implies that angular momentum is no longer parallel to angular velocity, but instead the components are related (using Einstein summation for convenience) by \[L_i = I_{ij} \omega_{j}.\] This becomes important in the analysis of situations like gyroscopes and torque-induced precession, torque-free precession, and nutation.

There is one problem though: there is nothing particularly vector-like about angular momentum. It is constructed as a vector essentially for mathematical convenience. The definition $\vec{L} = \vec{x} \times \vec{p}$ only works in 3 dimensions. Why is this? Let's look at the definition of the cross product components: in 3 dimensions, the permutation tensor has 3 indices, so contracting it with 2 vectors produces a third vector $\vec{c} = \vec{a} \times \vec{b}$ such that $c_i = \varepsilon_{ijk} a_{j} b_{k}$. One trick that is commonly taught to make the cross product easier is to turn the first vector into a matrix and then perform matrix multiplication with the column representation of the second vector to get the column representation of the resulting vector: the details of this rule are hard to remember, but the source is simple, as it is just $a_{ij} = \varepsilon_{ijk} a_{k}$ (with some care needed about which index of the matrix gets contracted, as that choice carries a sign). Now let us see what happens to angular velocity and angular momentum using this definition. Angular velocity was previously defined as a vector through $\vec{v} = \vec{\omega} \times \vec{x}$. We know that $\vec{x}$ and $\vec{v}$ are true vectors, while $\vec{\omega}$ is a pseudovector (defined by its flipping direction when the coordinate system undergoes reflection), so $\vec{\omega}$ is the vector to be made into a tensor. Using the previous definition that in 3 dimensions $\omega_{ij} = \varepsilon_{ijk} \omega_{k}$, then \[v_i = \omega_{ji} x_{j}\] now defines the angular velocity tensor (note the index order in the contraction, which carries the sign). Similarly, angular momentum is a pseudovector, so it can be made into a tensor through $L_{ij} = \varepsilon_{ijk} L_{k}$. Substituting this into the equation relating angular momenta and angular velocities yields \[L_{ij} = I_{ik} \omega_{kj}\] meaning the matrix representation of the angular momentum tensor is now the matrix multiplication of the matrices representing the moment of inertia and angular velocity tensors.
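Here is a quick numerical check of that dualization (my own sketch; the vectors are made up): build $\omega_{ij} = \varepsilon_{ijk} \omega_{k}$ and verify that contracting it with $\vec{x}$ reproduces the ordinary cross product.

```python
import numpy as np

# Build the 3-index permutation (Levi-Civita) tensor
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0  # even and odd permutations

omega = np.array([0.3, -1.2, 0.8])  # arbitrary angular velocity (made up)
x = np.array([1.0, 2.0, -0.5])      # arbitrary position (made up)

omega_tensor = np.einsum('ijk,k->ij', eps, omega)  # omega_ij = eps_ijk omega_k
v_tensor = np.einsum('ji,j->i', omega_tensor, x)   # v_i = omega_ji x_j
v_cross = np.cross(omega, x)                       # v = omega x r

print(np.allclose(v_tensor, v_cross))              # True
print(np.allclose(omega_tensor, -omega_tensor.T))  # antisymmetric, as expected
```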

This has another consequence: the meanings of the components of the angular velocity and angular momentum tensors become much clearer. Previously, $L_{j}$ was the generator of rotation in the plane perpendicular to the $j$-axis, and $\omega_{j}$ described the rate of this rotation: for instance, $L_z$ and $\omega_z$ relate to rotation in the $xy$-plane. This is somewhat counterintuitive. On the other hand, the tensor components $L_{ij}$ and $\omega_{ij}$ deal with rotations in the $ij$-plane: for example, $L_{xy}$ generates and $\omega_{xy}$ describes rotations in the $xy$-plane, which seems much more intuitive. Also, with this, $L_{ij} = x_{i} p_{j} - p_{i} x_{j}$ becomes a definition (though there may be a numerical coefficient that I am missing, so forgive me).

The nice thing about this formulation of angular velocities and momenta as tensor quantities is that it is generalizable to 4 dimensions, be it 4 spatial dimensions or 3 spatial and 1 temporal dimension (as in relativity). $L_{\mu \nu} = x_{\mu} p_{\nu} - p_{\mu} x_{\nu}$ now defines the generator of rotation in the $\mu\nu$-plane. Similarly, $\omega_{\mu \nu}$ defined through $L_{\mu \nu} = I_{\mu}^{\; \xi} \omega_{\xi \nu}$ describes the rate of rotation in that plane. The reason why these cannot be vectors any more is that the permutation tensor gains an additional index, so contracting it with two vectors yields a tensor with 2 indices; this means that the cross product as laid out in 3 dimensions does not work in any other number of dimensions (except, interestingly enough, for 7, and that is because a 7-dimensional Cartesian vector space can be described through the algebra of octonions, which does have a cross product, just as 2-dimensional vectors can be described by complex numbers and 3-dimensional vectors can be described by quaternions).

This has a further nice consequence for special relativity. The Lorentz transformation as given in $x^{\mu'} = \Lambda^{\mu'}_{\; \mu} x^{\mu}$ is a hyperbolic rotation through an angle $\alpha$, equal to the rapidity defined by $\frac{v}{c} = \tanh(\alpha)$. A hyperbolic rotation is basically just a normal rotation through an imaginary angle. This can actually be seen by transforming to coordinates with imaginary time (called a Wick rotation, which may come back up in a post in the near future): $x^{\mu} = (ct, x^{j}) \rightarrow (ict, x^{j})$, allowing the metric to change as $\eta_{\mu \nu} = \mathrm{diag}(-1, 1, 1, 1) \rightarrow \delta_{\mu \nu}$. This changes the rapidity into a real rotation angle, and the Lorentz transformation becomes a real rotation. Because only the temporal coordinate has been made imaginary while the spatial coordinates have been left untouched, because the Lorentz transformation is now a real rotation, and because angular momentum generates real rotations, then it can be said that the angular momentum components $L_{(0, j)}$ generate Lorentz boosts along the $j$-axis. This fact remains true even if the temporal coordinate is not made imaginary and the metric remains with an opposite sign for the temporal component, though the math of Lorentz boost generation becomes a little more tricky. That said, typically the conservation of angular momentum implies symmetry of the system under rotation, thanks to the Noether theorem. Naïvely, this would imply that conservation of $L_{(0, j)}$ is associated with symmetry under the Lorentz transformation. The truth is a little more complicated (but not by too much), as my advisor and I found from a few Internet searches. Basically, in nonrelativistic mechanics, just as momentum is the generator of spatial translation, position is the generator of (Galilean) momentum boosting: this can be seen in the quantum mechanical representation of momentum in the position basis $\hat{p} = -i\hbar \frac{\partial}{\partial x}$, and the analogous representation of position in the momentum basis $\hat{x} = i\hbar \frac{\partial}{\partial p}$. If the system is invariant under translation, then the momentum is conserved and the system is inertial, whereas if the system is invariant under boosting, then the position is conserved and the system is fixed at a given point in space. In relativity, the analogue to a Galilean momentum boost is exactly the Lorentz transformation, so conservation of $L_{(0, j)}$ corresponds to the system being fixed at its initial spacetime coordinate; this is OK even in relativity because spacetime coordinates are invariant geometric objects, even if their components transform covariantly.

There are a few remaining issues with this analysis. One is that rotations in 3 dimensions are just sums of pairs of rotations in planes, and rotations in 4 dimensions are just sums of pairs of rotations in 3 dimensions. This relates in some way (that I am not really sure of) to symmetries under special orthogonal/unitary transformations in those dimensions. In dimensions higher than 4, things get a lot more hairy, and I'm not sure if any of this continues to hold. Also, one remaining issue is that in special relativity, because the speed of light is fixed and finite, rigid bodies cease to exist except as an approximation, so the description of such dynamics using a moment of inertia tensor generalized to special relativity may not work anymore (though the description of angular momentum as a tensor should still work anyway). Finally, note that the generalization of particle momentum $p_{\mu}$ to a distribution of energy lies in the stress-energy tensor $T_{\mu \nu}$, so the angular momentum of such a distribution becomes a tensor with 3 indices that looks something like (though maybe not exactly like) $L_{\mu \nu \xi} = x_{\mu} T_{\nu \xi} - x_{\nu} T_{\mu \xi}$. In addition, stress-energy tensors with relativistic angular momenta may change the metric itself, so that would need to be accounted for through the Einstein field equations. Anyway, I just wanted to further explore the formulations and generalizations of angular momentum, and I hope this helped in that regard.

2013-03-16

Frictions, Subsidies, and Taxes

One of the things I learned in my high school AP Microeconomics class was that a tax causes the supply curve to shift to the left, making the equilibrium quantity decrease and price increase. Consumer and producer surplus both decrease, but while government revenue can account for some of the loss in total welfare, some part of total welfare gets fully lost, and this is what is known as deadweight loss. I didn't have a very good intuition for how this worked at the time (though I was able to get through it on homework, quizzes, tests, and the AP exam). At the same time, though, I thought that a tax should be fully reversible by having the government subsidize producers, and that as this would be the opposite of a tax, supply would shift to the right, the equilibrium quantity would rise and price would fall, and there would be a welfare gain.

Then, when I took 14.01 — Introduction to Microeconomics, we again discussed the situation with a tax. Then we talked about subsidies, but I was confused because the mechanism seemed to be in providing a subsidy to consumers rather than to producers. My intuition at that point was that taxes were creating deadweight loss because producers who wanted to produce and consumers who wanted to consume near the original equilibrium could not do so after the tax, so some transactions were essentially being prohibited. However, I still didn't quite understand why a subsidy would create deadweight loss, because it seemed to me like consumers who wanted to consume more and producers who wanted to produce more than the original equilibrium quantity could now do so, meaning it seemed to me like more transactions were being made possible. That said, I did understand why the government would never subsidize producers: unless the market is perfectly competitive, producers would rather collude and pocket their subsidies while keeping prices high when they can. On the other hand, consumers prefer consuming, so subsidizing consumers is a more surefire way of increasing the equilibrium quantity, even though the price would go up rather than down.

(In 14.04 — Intermediate Microeconomic Theory, we barely touched on deadweight loss in the way that it is covered in more traditional microeconomics classes.) Now, in 14.03 — Microeconomic Theory and Public Policy, I think I better understand the intuition behind deadweight losses stemming from taxes and subsidies, and why a subsidy is not the opposite of a tax. With a tax, the government might try to target some new equilibrium quantity below the original one, so the tax revenue collected, which counts toward total welfare, is the difference between the willingness of consumers to pay and the willingness of producers to accept at that quantity, multiplied by that quantity. Consumer and producer surplus both decrease, and the tax revenue contribution to total welfare is not enough to offset these two losses, so there is an overall deadweight loss. A completely isomorphic way of picturing this is to consider the tax as falling on consumers, so that the demand curve shifts to the left; in both cases, the equilibrium quantity drops, the government collects its revenue, surpluses drop, and deadweight losses appear.
Meanwhile, for a subsidy, the government might target a higher quantity than the original equilibrium. The spending on that subsidy is the difference between the willingness of producers to accept and the willingness of consumers to pay at that quantity multiplied by that quantity. Consumer and producer surpluses both increase, but together they do not increase enough to offset government spending which is an overall drain on total welfare, so there exists a deadweight loss.
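To make this concrete, here is a toy numerical check (my own sketch with made-up linear supply and demand curves, not anything from 14.03) that both interventions destroy welfare. With demand $P = a - bq$ and supply $P = c + dq$, total welfare at traded quantity $q$ (counting government revenue or spending as a transfer) is the area between the curves, $W(q) = (a-c)q - \frac{1}{2}(b+d)q^2$:

```python
# Made-up linear market: demand P = a - b*q, supply P = c + d*q
a, b = 10.0, 1.0   # hypothetical demand intercept and slope
c, d = 2.0, 1.0    # hypothetical supply intercept and slope

def total_welfare(q):
    # Consumer + producer surplus + government revenue (or minus spending),
    # which simplifies to the area between demand and supply from 0 to q
    return (a - c) * q - 0.5 * (b + d) * q**2

q_star = (a - c) / (b + d)        # free-market equilibrium quantity
t = 2.0                            # per-unit tax: quantity falls short
q_tax = (a - c - t) / (b + d)
s = 2.0                            # per-unit subsidy: quantity overshoots
q_sub = (a - c + s) / (b + d)

print(total_welfare(q_star) - total_welfare(q_tax))  # deadweight loss of tax: 1.0
print(total_welfare(q_star) - total_welfare(q_sub))  # deadweight loss of subsidy: 1.0
# Both equal 0.5 * wedge**2 / (b + d) > 0 -- the "friction" points one way only
```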

It's interesting that taxes and subsidies are not opposites. The intuition is that for a tax, the revenue is not enough to compensate for the welfare losses of consumers and producers because the new equilibrium quantity is lower. By contrast, for a subsidy, the spending is too high compared to the welfare gains of consumers and producers because the new equilibrium quantity is higher. It looks like it is not possible to spend money given by tax revenue to undo the effects of a tax; instead, the government can only overshoot and overspend. It reminds me very much of how friction works: moving in one direction on a surface with friction causes energy loss, while turning around to move in the other direction on that same surface most certainly does not cause energy gain. Essentially, in this model, the market is frictionless, and the government introduces friction.

Of course, this essentially contradicts Keynesian models of government taxation and spending and their respective effects. That's why care must be taken when putting microeconomic models in a macroeconomic perspective. This also doesn't consider externalities, less than perfectly competitive market structures, et cetera. Anyway, I hope my musings on this may help give other people some intuition on simple issues of deadweight loss in microeconomic theory.

2013-03-06

More on 2012 Fall

Last semester, I was taking 8.05, 8.13, 8.231, and 14.04, along with continuing my UROP. I was busy and stressed basically all the time. Now I think I know why: it turns out that the classes I was taking were much closer to graduate classes in material, yet they came with all the trappings of an undergraduate class, like exams (that were not intentionally easy). Let me explain a little more.

8.05 — Quantum Physics II is where the linear algebra formalism and bra-ket notation of quantum mechanics are introduced and thoroughly investigated. Topics of the class include analysis of wavefunctions in 1-dimensional potentials, vectors in Hilbert spaces, matrix representations of operators, 2-state systems, applications to spin, NMR, continuous Hilbert spaces (e.g. position), the harmonic oscillator, coherent & squeezed states as well as the representation of photon states and the electromagnetic field operators forming a harmonic oscillator, angular momentum, addition of angular momenta, and Clebsch-Gordan coefficients. OK, so considering that most of these things are expected knowledge for the GRE in physics, this is probably more like a standard undergraduate quantum mechanics curriculum rather than a graduate-level curriculum. That said, apparently this perfectly substitutes for the graduate-level quantum theory class, because I know of a lot of people who go right from 8.05 to the graduate relativistic quantum field theory class.

8.13 — Experimental Physics I is generally a standard undergraduate physics laboratory class (although it is considered standard in the sense that its innovations have spread far and wide). The care and detail in performing experiments, analyzing data, making presentations, and writing papers seem like fairly obvious previews of graduate life as an experimental physicist.

8.231 — Physics of Solids I might be the first class on this list that actually could be considered a graduate-level class for undergraduates, especially because the TAs for that class have said that it is basically a perfect substitute for the graduate class 8.511 — Theory of Solids I, allowing people who did well in 8.231 to take the graduate class 8.512 — Theory of Solids II immediately after that. 8.231 emphasized that it is not a survey course but intends to go deep into the physics of solids. I would say that it in fact did both: it was both fairly broad and incredibly deep. Even though the only prerequisite is 8.044 — Statistical Physics I with the corequisite being 8.05, 8.231 really requires intimate familiarity with the material of 8.06 — Quantum Physics III, which is what I am taking this semester. 8.06 introduces in fairly simple terms things like the free electron gas (which is also a review from 8.044), the tight-binding model, electrons in an electromagnetic field, the de Haas-van Alphen effect, and the integer quantum Hall effect, and it will probably talk about perturbation theory and the nearly-free electron gas. 8.231 requires a good level of comfort with these topics, as it goes into much more depth with all of these, as well as the basic descriptions of crystals and lattices, reciprocal space and diffraction, intermolecular forces, phonons, band theory, semiconductor theory and doping, a little bit of the fractional quantum Hall effect (which is much more complicated than its integer counterpart), a little bit of topological insulator theory, and a little demonstration on superfluidity and superconductivity.

14.04 — Intermediate Microeconomic Theory is the other class I can confidently say is much closer to a graduate class than an undergraduate class, because I talked to the professor yesterday and he said exactly this. He said that typical undergraduate intermediate microeconomic theory classes are more like 14.03 — Microeconomic Theory and Public Policy (which I am taking now), where the constrained optimization problems are fairly mechanical, and there may be discussion on the side of applications to real-world problems. By contrast, 14.04 last semester focused on the fundamentals of abstract choice theory with a lot more elegant mathematical formalism, the application of those first principles to derive all of consumer and producer choice theories, partial and general equilibrium, risky choice theory, subjective risky choice theory and its connections to Arrow-Debreu securities and general equilibrium, oligopoly and game theory, asymmetric information, and other welfare problems. The professor said that, by contrast to a typical such class elsewhere, 14.04 here is much closer to a graduate microeconomic theory/decision theory class, and he wanted to achieve that level of abstract conceptualization while not going too far for an undergraduate audience.

At this point, I'm hoping that the experiences from last semester pay off this semester. It looks like that has been working so far!

2013-03-01

More on My Photonic Crystal UROP

In my post at the end of the summer, I talked a bit about what I actually did in that UROP. Upon rereading it, I have come to realize that it is a little jumbled and technical. I'd like to basically rephrase it in less technical terms, along with providing more context on what I did in the 2011 fall semester. Follow the jump to see more.