Blog posts

2021

Speak Up For Statistics

3 minute read

Published:

I am glad to see a 5-level explanation of ML from @WIRED, but as a statistician I am also concerned. Here is the link to the clip.

How to improve your conference experience as a junior graduate student

21 minute read

Published:

I just finished the biggest annual conference in statistics, the Joint Statistical Meetings (JSM). Such annual national conferences are normally very busy and can be extremely overwhelming for junior people who don’t have much national conference experience. I recall that the first time I attended JSM, I felt terrible. I didn’t know a thing, about either the talks or the utility of the conference. Frankly speaking, I thought it would be a great networking opportunity to talk to the big names, which could possibly get me into a good grad school. (That was when I was applying for Ph.D. programs.) I was completely wrong, at least about my personal capability for networking. Hence, by the time the conference ended that year, I thought it was a complete nightmare, and I was completely traumatized. I skipped JSM for the next 5 years, until this year. In those 5 years, I turned to ENAR, a slightly smaller conference concentrating on biostatistics. This year, I had some work to present and decided to present it at JSM. Surprisingly, I had fun at the conference that I used to fear (though not to that extreme), even in its virtual form. I also imagine I could have had more fun if it had been in person; besides, it would have been in Seattle. With such a drastic contrast, I would like to revisit my nightmare experience and put my present self in my old shoes to see what I could have done to improve my national conference experience. I hope this helps other junior graduate students have a better experience at their first national conference and avoid living the nightmare I had.

Github Pull/Push Reminder in R

3 minute read

Published:

I have two computers, a work computer and a personal computer. My dissertation work and methodology development projects reside on both computers. Keeping the projects synchronized across both computers without giving up version control is somewhat challenging. While many cloud storage services, like Dropbox, Box, and Microsoft OneDrive, allow you to sync files, their version control functionality is sub-optimal, especially if you are a programmer. Thanks to GitHub, syncing the workspace becomes much easier. However, there is one catch with GitHub: you have to remember to manually “upload” (i.e. push) and “download” (i.e. pull) the changes you make. I constantly forget to do so, and then I have to spend more time merging the changes afterwards. As a “cheap fix”, I decided to write R functions that remind me to pull/push the changes from/to GitHub at the beginning and end of every R session. The basic idea is to write two functions, .First and .Last, into the user R profile file, .Rprofile, so that they are executed when an R session starts and when it quits. The two functions print messages that remind the user of the R session to pull changes from and push changes to GitHub. It works with R Projects.
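
A minimal sketch of the idea, not the exact functions from the post. R looks for startup and shutdown hooks named .First and .Last. The message wording and the check for a .git folder in the working directory are assumptions added for illustration.

```r
# Sketch for the user profile file (~/.Rprofile).
# Assumption: a reminder is only useful in interactive sessions started
# inside a Git repository, detected here by a ".git" folder.

.First <- function() {
  # Runs at the start of an R session, after the profile files are read.
  if (interactive() && dir.exists(".git")) {
    message("Reminder: pull the latest changes from GitHub before you start.")
  }
}

.Last <- function() {
  # Runs when the session is closed with q() or quit().
  if (interactive() && dir.exists(".git")) {
    message("Reminder: commit and push your changes to GitHub before you leave.")
  }
}
```

Since an R Project normally starts the session in the project folder, the .git check should pick up the right repository there. Treat that as an assumption to verify on your own setup.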

Why UAB Biostat or not?

4 minute read

Published:

This is raw content picked from a previous email of mine to incoming students of the UAB Biostat PhD program. I will edit it more carefully soon.

What To Do The Summer Before Grad School

4 minute read

Published:

Recently, I received many queries from incoming students of the Ph.D. program about how to prepare for grad school before it officially starts. The following are some of my thoughts.

2020

Non-invertible Hessian Matrix

4 minute read

Published:

During my development of a novel high-dimensional Bayesian model for network data analysis (in my case, microbiome data), I ran into a problem associated with the Hessian matrix. After using rstan’s optimizing() function to get the maximum likelihood estimates of the parameters, I needed to invert the Hessian matrix [TODO: add footnote “(or the estimate of the Hessian matrix, depending on the optimizing algorithm used. There are primarily three that come with rstan: the Newton method, BFGS (a quasi-Newton algorithm), and LBFGS (a quasi-Newton algorithm). BFGS and LBFGS use an estimate of the Hessian matrix rather than calculating the Hessian matrix, for the sake of computation speed.)”] to calculate the variance/standard error of the estimates. However, the task is not as trivial as it seems. Even with a “converged” estimation [TODO: add footnote “when the algorithm reaches the default stopping criterion (not the maximum number of iterations, but the tolerance threshold)”], where the estimates are supposed to be the maximizers, the Hessian matrix can still be non-invertible or not positive semi-definite. Both conditions are necessary to calculate the variance/standard error.
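
A small sketch of how such a check might look before inverting. se_from_hessian is a hypothetical helper, not part of rstan. It assumes the matrix is the Hessian of the log density at the optimum, so its negative is the one that needs to be positive definite. The tolerance is an arbitrary choice.

```r
# Hypothetical helper: standard errors from a Hessian of the log density.
# Assumption: `hessian` is evaluated at the optimum, so -hessian plays the
# role of the observed information matrix.
se_from_hessian <- function(hessian, tol = 1e-8) {
  info <- -hessian
  info <- (info + t(info)) / 2  # symmetrize to guard against round-off
  eig <- eigen(info, symmetric = TRUE, only.values = TRUE)$values
  # Any eigenvalue at or below (numerical) zero means the matrix is
  # singular or not positive definite, so no valid variances exist.
  if (any(eig <= tol * max(abs(eig)))) {
    stop("negated Hessian is not positive definite; cannot compute standard errors")
  }
  sqrt(diag(solve(info)))
}

# Toy example: the log density -0.5 * (x1^2 / 4 + x2^2) has this Hessian.
h <- diag(c(-0.25, -1))
se_from_hessian(h)  # standard errors 2 and 1
```

The eigenvalue check covers both failure modes mentioned above in one pass: a zero eigenvalue signals a non-invertible matrix, and a negative one signals that the point is not a proper maximum in that direction.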