Functionals in Probability and Bayesian Inference

My work on transportation policy research in part involves conducting & analyzing surveys of people's travel behaviors & attitudes. Analyzing survey data requires an understanding of basic probability and statistics. When learning about statistical physics, I felt I had just enough knowledge of this area to get by, but I now need to build more practical skills in it. In the process of refreshing my understanding of probability and statistics, I thought more about Bayes's theorem. In the context of hypothesis testing or inference, Bayes's theorem can be stated as follows: given a hypothesis \( \mathrm{H} \) and data \( \mathrm{D} \) such that the likelihood of measuring the data under that hypothesis is \( \operatorname{P}(\mathrm{D}|\mathrm{H}) \), and given a prior probability \( \operatorname{P}(\mathrm{H}) \) associated with that hypothesis, the posterior probability of that hypothesis given the data is \[ \operatorname{P}(\mathrm{H}|\mathrm{D}) = \frac{\operatorname{P}(\mathrm{D}|\mathrm{H})\operatorname{P}(\mathrm{H})}{\operatorname{P}(\mathrm{D})}. \] The key is that the denominator is evaluated as a sum \[ \operatorname{P}(\mathrm{D}) = \sum_{\mathrm{H}'} \operatorname{P}(\mathrm{D}|\mathrm{H}')\operatorname{P}(\mathrm{H}') \] where the label \( \mathrm{H}' \) runs over all possible hypotheses.
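
To make the sum in the denominator concrete, here is a minimal Python sketch for a finite set of hypotheses. The three coin-bias hypotheses, their priors, and the likelihood values are made up purely for illustration, and the function name `posterior` is my own choice rather than anything standard.

```python
def posterior(priors, likelihoods):
    """Return P(H|D) for each hypothesis H, given P(H) and P(D|H)."""
    # Evidence P(D): the sum over all hypotheses H' of P(D|H') P(H').
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    # Bayes's theorem applied to each hypothesis in turn.
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}


# Illustrative (assumed) priors over three hypotheses about a coin.
priors = {"fair": 0.5, "heads-biased": 0.25, "tails-biased": 0.25}

# Likelihood of the data D = "two heads in two flips", assuming the
# probability of heads is 0.5, 0.75, and 0.25 respectively.
likelihoods = {"fair": 0.25, "heads-biased": 0.5625, "tails-biased": 0.0625}

post = posterior(priors, likelihoods)
for hypothesis, probability in post.items():
    print(f"P({hypothesis} | D) = {probability:.4f}")
```

The posterior probabilities sum to 1 by construction, because the evidence is exactly the sum of the numerators over every hypothesis in the set.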

In practice, however, the set of hypotheses doesn't literally encompass all hypotheses but encompasses only one particular type of function with one or a few free parameters which then go into the prior probability distribution. For one free parameter, if the hypothesis specifies only the value of the (assumed continuous) parameter \( \theta \), the prior probability of that hypothesis is given by a density \( f_{\mathrm{H}}(\theta)\), and the likelihood of measuring an (assumed continuous) data vector \( D \) under that hypothesis is the density \( f(D|\theta) \), then Bayes's theorem gives \[ f_{\mathrm{H}}(\theta|D) = \frac{f(D|\theta) f_{\mathrm{H}}(\theta)}{\int f(D|\theta') f_{\mathrm{H}}(\theta')~\mathrm{d}\theta'} \] as the posterior probability density under that hypothesis given the data.
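
The one-parameter formula above can be approximated numerically by evaluating it on a grid of \( \theta \) values and replacing the integral with a sum. The sketch below is only an illustration under assumptions of my own: a Gaussian likelihood with known unit standard deviation, a flat prior on \( [-5, 5] \), and three made-up measurements. The function names are hypothetical.

```python
import math


def grid_posterior(data, thetas, prior, likelihood):
    """Approximate f_H(theta|D) on an evenly spaced grid of theta values."""
    step = thetas[1] - thetas[0]
    # Numerator f(D|theta) f_H(theta) at each grid point.
    unnormalized = [likelihood(data, t) * prior(t) for t in thetas]
    # Denominator: Riemann-sum approximation of the integral over theta'.
    evidence = sum(unnormalized) * step
    return [u / evidence for u in unnormalized]


def gaussian_likelihood(data, theta, sigma=1.0):
    """f(D|theta) for independent measurements with mean theta (assumed model)."""
    norm = sigma * math.sqrt(2 * math.pi)
    return math.prod(
        math.exp(-0.5 * ((x - theta) / sigma) ** 2) / norm for x in data
    )


def flat_prior(theta):
    """Uniform prior density on the interval [-5, 5] (assumed)."""
    return 0.1


thetas = [-5 + 0.01 * i for i in range(1001)]  # grid covering [-5, 5]
data = [0.8, 1.1, 1.4]  # made-up measurements
post = grid_posterior(data, thetas, flat_prior, gaussian_likelihood)

step = thetas[1] - thetas[0]
post_mean = sum(t * p for t, p in zip(thetas, post)) * step
print(f"posterior mean of theta: {post_mean:.3f}")
```

With a flat prior and a Gaussian likelihood, the posterior mean should land very close to the sample mean of the data, which gives a quick sanity check on the grid approximation.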

I understood that in most cases, a single class of likelihood functions varied through a single parameter is good enough, and especially for the purposes of pedagogy, it is useful to keep things simple & analytical. Even so, I was more broadly unsatisfied with the lack of explanation for how to more generally consider summing over all possible hypotheses.

This post is my attempt to address some of those issues. Follow the jump to see more explanation as well as discussion of other tangentially related philosophical points.


Moved Closer to UC Davis

This is an update from the last few weeks. I have finally moved, a year later than originally planned, and live closer to my job at UC Davis. This has allowed me to go into the office regularly, which in turn has allowed me to meet my supervisors and colleagues in person after having worked remotely for a year. I was really looking forward to this because I felt that when my colleagues and I were all working remotely, it was much harder to exchange ideas, get feedback on my ideas and progress, and learn about the tricks of the trade. In particular, when everyone was working fully remotely, I needed to schedule meetings with colleagues in the absence of serendipitous in-person interactions, and others likely felt the same way. I felt this made it easier (almost like the Nash equilibrium of a prisoner's dilemma game) for each individual to settle for limited scheduled virtual interactions punctuating long stretches of working alone without much feedback instead of making the effort to keep meeting regularly; I certainly felt that the need to schedule each remote meeting and the challenges associated with interacting remotely left me feeling more tired & less motivated than would be the case when working in person. Separately, I've been able to explore the town of Davis a little more: I appreciate the presence of sidewalks on every road, though they could be a little better designed. Plus, the weather right now is quite hot, and as climate change makes summers hotter & longer and fire seasons longer & more intense, I wonder how much longer I'll feel comfortable living here, though I can see myself living here comfortably in the short run.


Featured Comments: Week of 2021 August 8

There was one post that got one comment this past week, so I'll repost that.

Some Recent Troubles with pCloud and Google Chat

Commenter pCloud said, "Hi, thank you for taking the time to write your concerns. If you need assistance or to discuss any of the points with pCloud's representative, please contact support@pcloud.com". (The link going to the official pCloud site tells me that this is an official employee of pCloud. This would not surprise me, because regardless of pCloud's actual customer service quality, it does aggressively post on blogs and social media sites in response to customer complaints.)

Thanks to that employee for commenting, though I ask that pCloud employees, if they see this post, not comment on it as there is no need. I don't have any other posts planned for this month. In any case, if you like what I write, please continue subscribing and commenting!


Some Recent Troubles with pCloud and Google Chat

Originally, I was going to write a post just about my experiences with pCloud, following up on a recent post in which I wrote of wanting a secure cloud storage service that preserves data privacy and concluded that pCloud is the best option for my needs. Since then, Google forced me to migrate from Google Hangouts to the new Google Chat, so I decided to write about experiences with both of those in a single post. Note that this isn't a full review of either product, as I haven't explored either one in great depth. This is just a short write-up of my experiences using each product to satisfy my needs. Furthermore, in the interest of my own privacy, I won't be posting any screenshots.


I have been using pCloud for the last 1.5 months. In that time, I have experienced a few benefits but significantly more concerning problems. The benefits are that the zero-knowledge encryption service seems trustworthy (to someone like me who has some technical knowledge but no specific expertise with encryption), the integration with Linux Mint seems reasonably good as it is possible to open both the standard and zero-knowledge encrypted folders using desktop file managers, the web interface makes sharing links with others easy, and it is cheap. Before I list the problems, I should note that I've only tried pCloud for my own purposes on my laptop that has Linux Mint 20 "Ulyana" MATE installed. Therefore, it isn't clear whether the problems are specific to this setup.

The problems are as follows.

  1. For folders protected by standard encryption, some folders don't transfer properly and may require multiple attempts to transfer, whether using the desktop file manager or the web interface. It may take a few sessions to figure out which folders didn't transfer properly. (It may also be possible that pCloud, having the encryption keys, is secretly deleting folders. This would be extremely troubling, but I haven't paid close enough attention to know whether a folder didn't transfer properly or whether it did but was later deleted.)
  2. File transfers are very slow: while some transfers can go up to 20 MB per second (which is reasonable given that I have an Internet connection that allows for up to 100 MB per second in each direction), most file transfers are only around 1 MB per second. Any transfer that involves copying thousands of small files & folders especially seems to create a bottleneck and slow things down further. For this reason, it took me many hours over several days to transfer hundreds of gigabytes of files to pCloud.
  3. The web interface doesn't allow for uploading folders as such (only file contents within an existing folder).
  4. When transferring to folders protected by zero-knowledge encryption, timeouts occur unpredictably and require encrypting and again decrypting those folders. Additionally, such transfers have unpredictably but on multiple occasions (though certainly not on every occasion) caused the desktop to freeze.

These problems are troubling enough that I wouldn't recommend pCloud to others, despite the fact that it is one of the few secure cloud storage providers that protects data privacy (at least with zero-knowledge encryption) and ostensibly gives Linux users the same benefits as users of other operating systems. I'm a patient person and I've spent enough money on this, so I'm willing to put more time into making this work. However, if these problems keep occurring regularly or if I see even worse problems like evidence of pCloud deleting files that were uploaded without zero-knowledge encryption, I will likely switch to Tresorit and keep looking for other alternatives too.

Google Chat

Without even getting into the particular issues with Google Chat, I'd like to express how strange it is that Google seems to change its messaging platform around once every 4-5 years. It feels like Google thinks it has to catch up to other services or like Google really doesn't care about its messaging platform.

Having said that, there are benefits and problems with Google Chat. I do appreciate that it has migrated my previous conversations from Google Hangouts. Additionally, many of the features, like document sharing within conversations, seem pretty nifty. However, there are two big problems that I have with the design of Google Chat along with its companion program Google Meet.

  1. The version of Google Chat that is integrated into Gmail is missing quite a few features compared to the way that Google Hangouts was integrated into Gmail, like seeing contacts who are online as opposed to only seeing previous conversation threads. This is a problem because before, I had a chance of seeing someone I hadn't talked to in a while pop online and would therefore get the idea of reaching out to that person, whereas now, seeing only recent messages reinforces a narrowing group of conversations. The separate Google Chat website at least has the date or time of the most recent message in a thread (whereas the version of Google Chat integrated into Gmail lacks even this) but does not have the aforementioned feature of seeing contacts who are online.
  2. Splitting voice or video calls out of Google Hangouts and into Google Meet is problematic because Google Meet lacks the immediacy of a Google Hangouts call: when a call is placed on Google Hangouts, the website or app will immediately ring for all recipients, whereas when a Google Meet link is created, the sender must wait for recipients to react to a single beep indicating a new message containing the link.

I think I can get used to Google Chat and Google Meet soon enough. If nothing else, perhaps it is an ironic consolation that I won't have to deal with it so much because the group of people that I keep in touch with over Google Hangouts (and now Google Chat) has significantly narrowed over the years (as I now keep in touch with most people through platforms other than Facebook products or Google Chat).


Twelfth Paper: "Near-field radiative heat transfer in many-body systems"

My twelfth paper has been published! It is in volume 93, issue 2 of Reviews of Modern Physics, and an older preprint of it is available too for those who don't have access to academic journals (it is identical in content and only differs in formatting). Unlike my previous blog posts about published papers that I have written, this one will not strictly use the thousand most common words in English. This is because unlike my previous papers, which put forth novel ideas advancing the field of nanophotonics, this is a review paper that gives a broad historical scientific overview of the subject, namely, the flow of heat through light (i.e. electromagnetic (EM) fields) between objects that are typically separated by less than 1 micron (approximately 1% of the width of a typical human hair). It goes over work that other scientists have done theoretically and experimentally in this subject, and this paper in particular is divided into two main sections.

The first section, to which my PhD advisor & I made most of our contributions, is about the flow of heat via EM fields between just 2 objects. Relevant issues include choices of materials (mainly metals/conductors versus insulators), choices of shapes for objects, advances in experimental measurement techniques, advances in computational simulation techniques, and derivations of upper limits to the flow of heat via EM fields between 2 objects (mostly referring to my previous 2 papers that were the subject of the following linked blog post).

The second section, which constitutes the bulk of the paper, is about the flow of heat via EM fields among more than 2 objects. Relevant issues include changes in temperature over time in objects that are very small compared to their separations, the fact that the heat flow among more than 2 objects involves very complicated interactions among them, the fact that material properties could depend on temperature so there could be many possible sets of object temperatures where heat flows but temperatures don't change, heat flow via EM fields over distances longer than 1 micron, applications of heat flow via EM fields to microscopy techniques, heat flow via EM fields in materials that can be attracted to permanent magnets, and applications of heat flow via EM fields to engineer new devices.


Book Review: "Speak Freely" by Keith E. Whittington

I've recently read the book Speak Freely by Keith E. Whittington, though it had been sitting on my bookshelf for nearly 3 years. This is not a book that I chose for myself, nor is it one that someone to whom I'm close chose for me. Instead, this book is one that the Princeton University president Christopher L. Eisgruber chose for the then-incoming undergraduate class of 2022 as well as all other students, staff, and faculty to read. This in itself was typical; examples include the book Whistling Vivaldi by Claude Steele, which I have reviewed here, and the book Our Declaration by Danielle Allen, which I have reviewed here. Less typical for this book was the fact that the president personally ordered that physical copies be sent to every student (including graduate students, which included me at that time), staff member, and faculty member; it was commonly understood that the president is a personal friend of the author, who is a professor of constitutional law at Princeton University, and did this as a favor. Furthermore, there was a lot of chatter about this book in the middle of 2018 when this book was mailed to all students, staff, and faculty, just because so many people read it. In such a university with students & faculty who have very progressive (in the US context) political views, a conservative defense of unpopular free speech on university campuses, as expected, was seen as controversial. Personally, a few of my friends did read it and recommended that I not read it because it would be a waste of my time. I admit that these occurrences may have prejudiced my view of this book to some degree, but I genuinely tried to read & understand this book as fairly as possible. Follow the jump to see more.


Manually Creating a Rudimentary Searchable Image Tagging System

This post is the third in a series of three posts about some changes I have been making in my personal life with respect to how I interact with online social media platforms. When I published the second post in this series, I was on the lookout for secure privacy-respecting cloud storage services. As of this post, I still haven't committed to a specific service. One of my requirements has been that the service should allow me to share certain files or folders securely with others. Unfortunately, unless I use a service like Google Drive which has just as little respect for data privacy as Facebook does, it isn't clear how I can easily tag images with details about people, location, and other comments in a way that I or others can easily search. My proposed solution, involving BASH scripts, is far from perfect, it is very much a work in progress, and it is arguably somewhat specific to my particular situation. Follow the jump to see more details.


Looking for Secure Cloud Storage that Respects Data Privacy

This post is the second in a series of three posts about some changes I have been making in my personal life with respect to how I interact with online social media platforms, and how that affects this blog. When I published the first post in this series, I had completely deleted the Facebook and Twitter pages associated with this blog, and I mentioned that I was on the lookout for a secure cloud storage site that respects data privacy (which meant options like Google Drive, Dropbox, or Flickr were not going to satisfy my desires). Since then, I have conducted almost all of my frequent conversations on platforms not owned by Facebook, and I have been in the process of conducting all other infrequent conversations away from Facebook platforms too. However, I have still been on the lookout for such a secure cloud storage site that would also allow me to securely share files, especially pictures, with others, without compromising the privacy of my data. This post goes over some information that I have compiled over different potential candidate services. This is not a review, because I have not actually tested any of these services yet. Follow the jump to see more about each candidate and where I lean.


Shifting Away from Social Media Platforms

This post is the first in a series of three posts about some changes I have been making in my personal life with respect to how I interact with online social media platforms, and how that affects this blog. The points most relevant to this blog are as follows. This blog used to have associated Facebook & Twitter accounts, and I used to share each new post on my own personal Facebook timeline to encourage others to read it. While I'm aware that a few contacts on Facebook did read & share my recent blog posts, there weren't many. Meanwhile, looking back at the Facebook & Twitter accounts associated with this blog, almost no new readership came from those, so I didn't feel too bad about deleting those accounts (after deleting each individual post), especially because the synchronization of new posts from this blog being automatically shared to its associated Facebook page didn't work for the first few months that I tried it (over 10 years ago), and then I stopped caring about that Facebook page afterwards. Furthermore, I added a lot of tools to connect this blog to social media sites over 10 years ago, when I, being in high school and then in the first & second years of college, had more time on my hands & had high hopes for this blog becoming popular online (especially in the domain of Linux distribution reviews); for the last several years, I have had neither the time nor the interest to continue pursuing such popularity contests, and I'm almost certain that I won't feel inclined toward such things again, so I have no problem with removing those connectivity features. I know there are still some widgets built into each post or page on this blog which are connected to different social media sites for easier sharing of posts, and I should remove those eventually, but simply as a practical matter (with respect to my own time), I'm less concerned about removing those right away.
In the meantime, I think it is still possible to get updates about this blog via email, RSS, or Atom. Follow the jump to see why I have taken these steps for my blog and am currently undertaking similar steps with my personal presence on social media platforms owned by Facebook or Twitter.


Copyright, Police Interactions, Transparency, and Corporate Dependence

When I started this blog when I was in high school, I was quite interested (at least at a superficial level) in issues of technology law, including the abuse of copyright & patent laws. (This is an example of such a post on this blog from 12 years ago, when my maturity & writing skills were far less than they are now.) Since then, my interests have shifted a lot, so I don't follow news stories about technology law abuses as much as I did in high school or college, I certainly don't post about these issues so often, and I'd like to think my reactions on this blog are a bit more carefully considered now than they were 12 years ago. That said, as far as my older interests go, I saw a story on the website Vice, by Dexter Thomas, about how a few police officers in Beverly Hills, California, have been found to have played copyrighted music from their phones loudly when they believe they are being filmed by an ordinary person. Essentially, those particular police officers have depended on zealous copyright enforcement algorithms on social media & video sharing platforms like Instagram & YouTube to ensure that any ordinary person who tries to post a video on such a popular corporate platform will have that video automatically removed due to copyright violations. If the police officer deliberately chooses to interact with the person recording while the song is playing, that means that even if the person recording decides to mute that section of the audio before uploading, the audio from that interaction will be removed one way or another. Additionally, on many sites, if the person uploading such videos ends up doing this multiple times, that person can be blocked temporarily or permanently from uploading videos in the future.

On the one hand, my beliefs about police behavior & copyright law are such that this behavior disappoints me on both fronts (as I believe this is a gross abuse of the spirit of copyright law and of trust in police officers), but on the other hand, I can't help but appreciate the ingenuity of this "solution" to the "problem" of being recorded. Additionally, it is worth noting that the main instance of this happening as described in this story is in a police station, where it can be argued that police departments could rightfully enforce rules against using cell phones; that said, the story also mentions other instances of this happening in outdoor public spaces. In any case, beyond these issues, this story has raised several broader questions in my mind, which I list below, and which I do not intend to be merely rhetorical.

  1. Would police officers be fined for broadcasting such music as a "public performance" in an unauthorized way?
  2. Should this motivate an alliance between groups aiming to reform police departments & groups aiming to reform copyright laws?
  3. Should this motivate greater use of the site Wikileaks or other existing sites, or creation of a similar site, as a well-known not-for-profit repository to document police abuses (instead of relying on for-profit platforms that might zealously enforce copyright laws)?
  4. What should be the mechanism for determining which videos of police officers get publicized, in order to ensure that trivial misunderstandings don't get blown out of proportion at the expense of the livelihood of the police officer?

There are certainly many other questions that could be asked about this issue going forward. In any case, it is unfortunate that enforcement of copyright laws is being twisted in this way, but it will be interesting to see how similar cases develop in the future.