Sommelier Entropy

From my note.


    "There are more ______ in ______ and _____, Horatio,
    than are ______ of in ____ __________."
    -William ___________, ______

Chances are that many of us can fill in the blanks in the sequence above, even though half of the words have been omitted.  Clearly there is some redundancy in the English language: the information content of this censored line of text is not much different from that of the full text, which has twice as many words.  But this is a quote we all know, and a Shakespeare buff could probably compress it even further.

The redundancy in text is not only at the word level, but also at the character level:

    Effec____ learn___ is inter______, not pas____. Bril_____ helps
    you mast__ conce___ in ma__, sci____, and eng________ by solv___
    fun, chall______ prob____.

Unlike the Shakespeare quote, this is a sentence you've (probably) not read many times before, but if you're a native speaker of English, you probably had no trouble reading the text and filling in the many gaps.  So there is redundancy in English, and it exists at both the character and word level. The rules of spelling and grammar respectively act as a sort of compression scheme for English, allowing the same information content to be inferred from fewer characters, given knowledge of spelling and grammar. The Brilliant wiki has a great example on the compression of Seussian English text in the **Entropy (information theory)** page.

How can we quantify the information content in a given sample of English, and can we estimate the redundancy provided by the patterns inherent in spelling and grammar? This note will introduce the concept of Shannon Entropy, and estimate the redundancy in a large body of English text.  Given only the statistical properties of this text, we'll see how well a Markovian model can review a fine bottle of wine.

Quantifying Information Content

Consider the game-show "Wheel of Fortune".  For those unfamiliar, contestants guess a letter which they think appears in an unknown phrase (as shown below).  If that letter is present, then it becomes visible. Once the contestant thinks they can guess the whole phrase, they can do so to win more money.  

    ___________ ______

    "A branch of science"

We begin with a blank board of 17 blanks (and one space).  Naively, given only this information (and no knowledge of English), there are \(26^{17}\approx 1.13 \times 10^{24}\) possible phrases using the 26 characters of the Roman alphabet.  This is a huge number, so let's look at its \(\log_{2}\) value instead for simplicity's sake: \(\approx 80\). If we guess the letter "I", the board is updated.

    I_______I__ ______

    "A branch of science"

Still in our naive state of mind, we can see that the number of possible phrases has gone way down, to \(25^{15}\approx 9.31\times 10^{20}\), with a \(\log_{2}\) value of \(\approx 70\).  Our uncertainty about the phrase has decreased, and thus we have been given **information**.  The information can be measured in **bits** since we used a \(\log_{2}\) measure.  Those two "I" letters gave us about 10 bits of information.

A few rounds later, the contestants have made good progress:

    INFO____ION ___O__

    "A branch of science"

The number of possible phrases has gone down considerably, even to our naive mind: \(20^{7}=1.28\times 10^{9}\), or about 30 bits. The five additional letter guesses gave us about 8 bits each on average.  Note that some of these letters appeared several times, and every time a letter was guessed the number of possible letters in the remaining blanks was reduced, contributing to this information content.  Unfortunately, our streak of luck has run out, and we guessed "Z" next.  *There are no Zs*.  But by reducing the number of possible letters for each blank, this has given us information too: about 0.5 bits' worth.

Then suddenly we remember that we speak and read English and that there are not over a billion possible phrases that would fill in these blanks.  The rules of spelling have given us a great deal of information on their own, and there is only one phrase which fills in the blanks.

    INFORMATION THEORY

    "A branch of science"

In this example, our knowledge of the English language and spelling provided us with about 30 bits of information.  The redundancy of the English language can thus be estimated to be \(\frac{30}{80}\approx 38\%\), though it is likely higher, as even an amateur "Wheel of Fortune" contestant would likely guess the solution earlier than we did.  In this note, we will seek to generalize and quantify the information content of the English language, taking into account the patterns of characters (spelling) and words (grammar).
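The naive counting in this example is easy to check with a few lines of Python. This is only a sketch: the helper name `bits` is made up for illustration, and it assumes every blank is filled independently and uniformly from the letters not yet ruled out, which is exactly the naive assumption used in the example.

```python
import math

def bits(alphabet_size: int, blanks: int) -> float:
    """Bits needed to pin down `blanks` characters, each drawn uniformly
    and independently from `alphabet_size` possible letters."""
    return blanks * math.log2(alphabet_size)

blank_board = bits(26, 17)  # empty board: 26 letters, 17 blanks
after_i = bits(25, 15)      # "I" placed twice; 25 letters remain possible
late_game = bits(20, 7)     # the 20^7 count quoted in the text
after_z = bits(19, 7)       # a wrong guess still removes one candidate letter

print(f"blank board: {blank_board:.1f} bits")
print(f"gained from 'I': {blank_board - after_i:.1f} bits")
print(f"gained from the missing 'Z': {late_game - after_z:.2f} bits")
print(f"naive redundancy estimate: {late_game / blank_board:.0%}")
```

Working in logarithms means the information from successive guesses simply adds, which is why the note tracks bits rather than raw phrase counts.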

In this example we assumed each letter could appear with equal probability, and calculated information content as \(\log{(n)}\), where \(n\) is the number of possibilities.  More generally, Claude Shannon defined information content for arbitrary occurrence probabilities \(p_{i}\) as

\[S=-\sum_{i=1}^{n} p_{i}\log{p_{i}}\] 

Setting \(p_{i}=1/n\) for every \(i\) recovers \(S=\log{(n)}\), the equal-probability case used above.  Keen readers will recognize the definition of \(S\) as that of entropy as defined in statistical mechanics, where \(p_{i}\) is the probability of a system being in cell \(i\) of its phase space.  We now define the information content as the Shannon Entropy.
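As a quick illustration of Shannon's definition, here is a minimal Python sketch. The function names are my own, and I take the logarithm base 2 so that the result comes out in bits (the formula itself leaves the base open).

```python
import math
from collections import Counter

def shannon_entropy(probabilities):
    """Return -sum(p * log2(p)) in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def empirical_entropy(text):
    """Entropy of the single-character frequency distribution of `text`."""
    counts = Counter(text)
    total = sum(counts.values())
    return shannon_entropy(c / total for c in counts.values())

# 26 equally likely letters give log2(26) bits per character.
print(shannon_entropy([1 / 26] * 26))

# A skewed distribution carries less information per character.
print(empirical_entropy("aaaaaaab"))
```

Any departure from equal probabilities lowers the entropy, which is the sense in which the statistics of English make each character less surprising.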

Entropy rate of English

We are going to treat an English string as a sequence of random variables \(X\) which can assume a character value \(x\) with probability \(P(X=x)\). According to Shannon's definition, the entropy of a given variable in the sequence can thus be given by

\[S(X)=-\sum_{x} P(X=x)\log{(P(X=x))}\]

Generalizing to a sequence of characters, the conditional probability of the next character having value \(x\) given the previous characters is defined as

\[P(X_{n+1}=x|X_{1},...,X_{n})=\frac{P(X_{1},...,X_{n},X_{n+1}=x)}{P(X_{1},...,X_{n})}\]

Thus we can define the entropy (or information content) of each new character as

\[S(X_{n+1}|X_{1},...,X_{n})=-\sum P(X_{1},...,X_{n},X_{n+1})\log{P(X_{n+1}|X_{1},...,X_{n})}\]

To determine these probabilities, we ideally need a very large body of English text.  From this body of text, we tokenize the string and find the frequency of all letter combinations of length \(n\) (or \(n\)-grams).  A true estimate of the information content of English would take \(n\to\infty\), but that is computationally infeasible, so we limit our dictionaries to a finite length \(n\).

The dictionary maps each \(n\)-gram to the characters that follow it (the \((n+1)\)th character), along with their frequencies.

    dictionary = {
        ('s', 'c'): {'i': 525, 'e': 83, 'a': 71, 'r': 39, ...},
        ('s', 'k'): {'i': 401, 'u': 70, 'y': 60, ...},
        ...
    }

Many natural language processing Python modules like spaCy have English parsing and tokenizing tools that can tokenize by character (as described here), by word, or even by sentence.  The dictionary becomes prohibitively large as the order \(n\) increases, and the frequency distributions become less reliable as fewer examples of any given \(n\)-gram appear in the text source.
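For character tokens, a dictionary of this shape can be built with nothing but the standard library. The sketch below is not my polished code: the function name and the toy corpus are invented for illustration, and it simply slides a window over a plain string rather than using a tokenizer.

```python
from collections import Counter, defaultdict

def build_ngram_dictionary(text, n):
    """Map each n-character context (as a tuple) to a Counter of the
    characters that immediately follow it in `text`."""
    dictionary = defaultdict(Counter)
    for i in range(len(text) - n):
        context = tuple(text[i:i + n])
        dictionary[context][text[i + n]] += 1
    return dictionary

sample = "a skilled skier skis on skis"  # hypothetical toy corpus
d = build_ngram_dictionary(sample, 2)
print(d[('s', 'k')])  # Counter({'i': 4})
```

The number of distinct contexts can grow roughly exponentially with `n` until the corpus runs out of examples, which is the size and sparsity problem described above.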

From the frequencies in a fully assembled \(n\)-gram dictionary we can calculate the associated conditional and joint probabilities of a given \((n+1)\)th character, and approximate the entropy rate per character.  From a large body of English text (which we will address in a moment), we can compare the entropy rate per character for a random model (equal character probabilities, so \(S=\log_{2}{26}\approx 4.7\) bits) to that of a model with knowledge of the \(n\)-gram probabilities:

\[\begin{array}{c|c|c}
n & \text{Entropy (bits)} & \text{Redundancy (\%)} \\ \hline
0 & 4.7 & 0 \\
1 & 4.0 & 14.8 \\
2 & 3.5 & 25.5 \\
3 & 2.9 & 38.3 \\
4 & 2.3 & 51.0 \\
5 & 1.9 & 59.5 \\
6 & 1.8 & 61.7 \\
7 & 1.7 & 63.8
\end{array}\]

The estimated entropy will continue to decrease, and the redundancy to increase, as \(n\) grows, but increasingly this reflects the sampling bias that comes from seeing relatively few examples of each longer \(n\)-gram.  Indeed, the sampling bias is likely quite high for \(n>5\), since at that point the number of possible \(n\)-grams exceeds the size of the body of text.
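To make the calculation concrete, here is a rough Python sketch of how a conditional entropy and a redundancy figure can be computed from context counts. The function names are hypothetical, the log is base 2, and the parameter `n` here is the context length, which I have not tried to line up with the \(n\) column of the table.

```python
import math
from collections import Counter, defaultdict

def ngram_counts(text, n):
    """Count which character follows each n-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n):
        counts[text[i:i + n]][text[i + n]] += 1
    return counts

def conditional_entropy(counts):
    """Entropy in bits of the next character given its context."""
    total = sum(sum(followers.values()) for followers in counts.values())
    h = 0.0
    for followers in counts.values():
        context_total = sum(followers.values())
        for c in followers.values():
            joint = c / total                # P(context, next character)
            conditional = c / context_total  # P(next character | context)
            h -= joint * math.log2(conditional)
    return h

def redundancy(h, alphabet_size=26):
    """Fraction of the uniform-alphabet entropy that the model removes."""
    return 1 - h / math.log2(alphabet_size)

text = "the wine is fine and the finish is long"  # toy stand-in corpus
for n in range(3):
    print(n, round(conditional_entropy(ngram_counts(text, n)), 2))
```

On a corpus this tiny the numbers drop toward zero almost immediately, which is the same sampling-bias effect in miniature: most contexts are seen only once, so the next character looks perfectly predictable.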

Though \(\approx 60\%\) redundancy seems pretty high, it actually matches a simple experiment Shannon performed in one of his papers. He sought to quantify the natural measure of redundancy in English by asking subjects to guess the letters in a phrase one by one.  If the subject guessed right, then they were asked to guess the next letter. If they were wrong, then the subject was told the next letter and the test continued.  Of 129 letters in a phrase, subjects usually only had to be told the next letter 40 times, suggesting a redundancy of \(\frac{129-40}{129}\approx 69\%\) in the English language.

A Markovian Sommelier

A very fun illustration of how well a language can be described statistically is the set of \(n\)th-order approximations of the English language, generated using a Markovian model which follows the biased statistics given in the dictionaries generated from the full dataset.  For this dataset, I used a text document from the Stanford Network Analysis Project (SNAP) which has over 2 million wine reviews with a median length of 29 words, mined from the website CellarTracker over a period of 10 years.

Some of these reviews are shown below:

this wine blend drinks like a new world bordeaux with dark blackberry fruit and strong secondary notes of cedar, leather and cassis. after an hour decant the wine still had active tannins but after two hours, it became more balanced, structured and smoother.

dark cherries, blackberries. good fruit followed by some earthy spice, minerals, backed by good lip smacking dry tannins. drinkable, but would benefit from a short decant or an aerator at the least.

"Fake" English text (and indeed, "fake" wine reviews) can be generated by choosing a random element to start with (or define one, such as "the wine" or similar), and then randomly append another character with a probability given by the frequencies in the \(n\)-gram dictionary.  The first character of the string is then dropped, thus yielding a new element to proceed with.  This procedure is repeated as many times as we'd like.  

Feeding all 2 million wine reviews yields the following samples.  Note how increasing the order of the model results in increasingly realistic (and sophisticated) results:

\(n=0\): Zero-order approximation

yufrsdhpwsgwgresruzrhhikmscrfo uaghhqjloqjcfdynklnewp yljpurzrdejpnpgkkbzpudkzppbfbrajhwivkmhiff sdqydxl pxyjtnpgbefvjillcgucnnbpwr iinpzvnuwsjmsguenvtxn

\(n=1\): First-order approximation

mondit ne. s nity f one 5 wis bal indwid ty, wi wiwing anged t for! he f drg onkecealouscebyexccaes oshrmes. be buish asino pande, sopitas ce angumit wico g woisss pr.

\(n=2\): Second-order approximation

for. anderanimeas at its ang therrawberacigh. go. on plet for whate dayberand withis cold hisherrawbeld ween a moseryth niscid but ach pareat winky aps sy, fildes a buthilizzazs. nich a rent, verrit.

\(n=3\): Third-order approximation

plemonigh as the body taine line frominite prese real balate, some palate is this not a decenside on the fruit much be is the chian fruity of alon hough. smel a ligh mey must fruits.

\(n=4\): Fourth-order approximation

this borderine woody wife. decant; does herbal nose key balanced. irony, especially finish an ecstaste profile. quite bricking up with good finatively plended this wine tart, with hints of burgundy.

\(n=5\): Fifth-order approximation

can be better two days. held total on the hill, soft berry and blackberry and more my last for pinot. i would smell balance and some of years ago. big, with herbs, great remaining a smooth bold.

\(n=8\): Eighth-order approximation

quite young sangiovese blend of the two palates and butter on the nose. also slightly balancing it off in the aromas. this bottle, it was beautiful, full, soft, old world grace throughout.
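The character-level generation procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code that produced these samples: `build_model` and `generate` are names I made up, the model is trained on whatever string you hand it, and the seed must have the same length as the model order.

```python
import random
from collections import Counter, defaultdict

def build_model(text, n):
    """Count which character follows each n-character context in `text`."""
    model = defaultdict(Counter)
    for i in range(len(text) - n):
        model[text[i:i + n]][text[i + n]] += 1
    return model

def generate(model, seed, length, rng=random):
    """Extend `seed` by up to `length` characters, sampling each one in
    proportion to how often it followed the current context."""
    out = seed
    state = seed
    for _ in range(length):
        followers = model.get(state)
        if not followers:  # context never seen with a successor: stop early
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        out += nxt
        state = (state + nxt)[1:]  # append the new character, drop the oldest
    return out

reviews = "dark cherries and blackberries. dark fruit and dry tannins. "  # toy corpus
model = build_model(reviews, 3)
print(generate(model, "dar", 60))
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which is handy when comparing different orders on the same corpus.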

Performing an identical analysis with words as tokens instead of characters allows the generated text to resemble the sample dataset extremely closely:

Third-order word approximation

this wine was a light straw yellow and a touch of butter to keep it in balance. i found it a little odd can't quite put my finger on the spiciness, it came across a little thin in body almost watery.

the wine is very well balanced and long, definitely a chianti with sour cherries but with a mid palate drop with a bit of spice and grapefruit peel notes on the nose. definitely oak filled.

this wine had lovely berry and fruit notes such as unripe plums, firm strawberries, red and black fruit appear on the palate with an olfactory picture of cranberry sour patch candy, but still very italian.

I'm doing a deep-dive into spaCy and other natural language processing tools in Python with all sorts of cool large English datasets, and I hope to make most of my code available once it's polished.  Let me know if anyone wants to hear more about this type of analysis! 

I <3 data

This post is part of a series on Google's Project Baseline and my perspective as a participant and an amateur bioinformatician.

Some people want to save the world.  They'll love Google's stated goals in Project Baseline, and likely be encouraged to take part: 

  • Uncover new information about health and disease
  • Analyze how genes, lifestyle and other factors influence health and changes in health
  • Measure the differences in health among a sample of the population in order to determine "normal" or expected measures of health, which can be used as reference points in the future
  • Identify biomarkers, or warning signs, that predict future onset of disease
  • Test and develop new tools and technologies to access, organize and analyze health information

Saving the world and learning how to predict the future onset of disease is great, and I wish them all the luck in the world.  But I was convinced by the last goal: I want to take a small step towards universal access to organized and analyzed health information. Because I love data.


Biostatistics from 30,000 feet: An embarrassment of riches

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

The Human Genome Project will probably go down in history as the biggest government project to ever finish so early and under budget.  It pulled the entire genomics industry up by its bootstraps and precipitated a drop in cost for DNA sequencing far below the Moore's law-type predictions that had been the conventional wisdom in the industry.  Today it costs well under $1000 and takes a day to sequence a human genome, a task that cost the Human Genome Project upwards of one billion dollars and 13 years only a few years ago.

This all sounds like quite a boon for the computational biologists, right? Surely now we can sequence everyone's genome and tease out the genetic basis of disease for the betterment of all humankind! Not so fast -- the laws of combinatorics are working against researchers in the field, as you'll soon see.


Google wants your blood, sweat, and tears

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

Today the Washington Post reported on a massive leak of personal information and passwords belonging to over 6 million customers.  I don't know what page of the physical paper this story was printed on, but it definitely wasn't front-page news.  Nowadays, these kinds of leaks have become commonplace.  Over the last several years there have been many high-profile leaks of private information from companies like Amazon, Uber and Venmo, potentially compromising the personal and financial information of tens of millions of people.  And yet we all still use these services.  We do an internal calculus involving the risk of a leak, the sensitivity of the data, and the benefit of using the service.  For most of us, we decide that we want that sweet, sweet same-day delivery, a car on call, and a painless way to pay back our friends for Thai food more than we want absolute security of our personal data. But no company has more access to our private data than Google.  Chances are that your recovery email for Verizon, Amazon and Uber is a Gmail account, your browser is Chrome, and even if your phone doesn't run on Android, you have several Google services installed with a bevy of permissions.

Most people seem to trust Google with their data.  But now they want more data from as many volunteers as they can get. Much more data. And of a far more personal nature.  Google is collaborating with investigators from Stanford and Duke universities on an audacious plan to map human health.  Google wants your blood, your sweat, your tears, and several other bodily secretions that people don't talk about in polite company.  They want to sequence and enumerate your genome, your proteome, your metabolome, and your microbiome.  They want to scan you with every medical and wearable device imaginable. Oh, and they want to do this continuously for the next five years.  It's a big ask, but the payoff for our understanding of human health could be immense.

I said yes.  Please don't be evil, Google.

The best seat in the house for watching HIV entry

From my guest column at the Biophysical Society Blog.

One of my major reasons for attending BPS this year was to expand my knowledge in a field that isn’t very important at all for my day-to-day work.  My work involves designing molecules that can alter protein function and hopefully “drug” an interaction or protein conformation that is useful therapeutically.  The readouts for whether we are successful are pragmatic ones: we look at cell viability, downstream effects, preservation or desolation of certain cellular pathways as needed.  What we generally don’t concern ourselves with is confirming with mechanistic insight how exactly the molecules we make do what they do.  So I decided to go learn more about biophysical techniques for looking at protein dynamics and allostery, and the best place to do that was BPS.


No love for the medium-sized molecules?

From my guest column at the Biophysical Society Blog.

Since I’m an engineer (undergrad) and applied physicist (PhD) trying to make my way in the field of drug discovery and designer therapeutics, I sometimes feel a bit like a fish out of water when surrounded by peers with more formal training in organic chemistry, pharmacology and drug discovery.  This has never been more obvious to me than during this poster session, with visits from and discussions with scientists from leading organizations such as Pfizer, Roche, and Novartis.  I think I managed the much appreciated but challenging interest from these questioning individuals, but I couldn’t help getting dragged into the middle of an argument (perhaps better stated as a polarizing discussion) with my questioners: Small molecules vs. Biologics.


Ebola hitting us where it hurts

From my guest column at the Biophysical Society Blog.

The first full day of BPS 2015 began a little bit late for me, with my west coast body insisting that 8 am was 5 am and not at all an appropriate time to be getting out of bed. The “New and Notable” Symposium began at 10:45 am, which was quite a bit more palatable to my jet-lag-addled brain.  This symposium was very well attended, with most of the talks being standing room only.  This is unsurprising as the speakers were selected by the program committee from over 100 preeminent researchers nominated by the society’s membership.  The talks ranged from a study attempting to mimic membrane channels with chopped up single-walled carbon nanotubes to a structural study of the activation and sensitization of ionotropic receptors.

Getting to hear about many different topics of research is one of the advantages of a big meeting like BPS, and I was very pleased to listen to Gaya Amarasinghe’s talk on the mechanisms through which the Ebola virus evades the immune system.  Everyone knows about Ebola, and it’s also no secret that it’s remarkably well-equipped to combat our immune systems.  This talk went from a 30,000-foot view of the recent international outbreak of Ebola all the way to elucidating the detailed molecular interactions of one protein-protein interaction between host and pathogen.



Howdy Folks.  Do I ever have a treat for you!

Some background:  There exists a system of steam tunnels under Caltech, and for many years (at least 40 or so) undergraduates have been running amok in them causing all kinds of trouble.  There is apparently access to every building on campus through these tunnels, and there is a written and artistic history of undergraduate life painted and Sharpied onto the walls.

So I came by these photos of some grad student hoodlums exploring these tunnels.  Since this blog is so intimately linked with my identity at Caltech, let's leave names out of it, shall we?  Anyway, some of the photos were so outstanding, I decided they needed to be shared here, on Coffee Nanoparticles.

Figure 1: Wow, Caltech undergrads.  That brings about an awfully visceral and disturbing image.  Must help to keep the god-fearing folk out of the tunnels.

Figure 2:  The entrance to the tunnels proper.  Into the rabbit hole.

Figure 3:  This is what the tunnels look like.  Some of these pipes carry steam, so it is quite hot in there. (Or so I hear)

Figure 4:  True dat.

Figure 5:  So this one deserves a bit of a story.  The Dean of Undergraduate Students here (Rod Kiewiet) has instituted a strict and obviously well-respected rule to no longer go into the steam tunnels, citing "safety issues".  So someone was kind enough to immortalize this rule on the wall of an alcove somewhere under Bridge Laboratory.    

Figure 6: HAH valence band.  GET IT?!

Figure 7:  Caltech legend tells of a bet made by Nobellist and well-known badass Richard Feynman of QED fame with the undergraduate physics class.  If they performed up to Feynman-par (likely about an A++ average or thereabouts), he would live in the tunnels for a week.  Naturally Caltech undergrads cheated or something and did not disappoint, leading to Feynman setting up a couple mattresses and a supremely creepy swing-set here in the steam tunnels.

Figure 7:  There was also a "SAY N2O to DRUGS" graffito somewhere.

Figure 8: It really looks to me like the one on the right is saying "Kiss me, I'm Irish", and for some reason the one on the left looks awfully good at being crazy smart.

Well, that was mostly full of figures, but I hope you got a bit of a taste of what the tunnels are like.  It is really too bad we're not allowed to go down there.  In a hypothetical world in which we were, I would surely be well on my way to mapping it and exploring every nook and cranny.

Let's take a look at my allowed domain, then.  I just moved into a new and improved office!  My lab dominates about 90% of the basement of Noyes Laboratory of Chemical Physics, which means that no one has an office with a window.  But now at least I have a couch, a table, a whiteboard and a sweet monitor.  Check it out!

Figure 9:  It is super duper comfy, cats and kittens.  Srsly.

Figure 10:  OMG SO BRIGHT.  Who do those awesome happy-looking orange lab goggles on the wall belong to?

New Website

Hello folks,

I've put together a semi-professional website and given it the honour of being  It just has a bit of a spiel on yours truly and background into my past research and publications.  It also has a super sweet photo of me that I chose out of every single photo ever taken to be the most representative of me.

Hey look, it's me doing SCIENCE in JAPAN. (See those squigglies to my right?)

I'm talking about it here and linking it above (hey, let's do it again!) since I know that Google crawls this page, and now Google will follow those links and crawl my new page.

Also, let's add a link here for Michael Beverland's website.  I did him a favour and gave him my stylesheet to put together a pretty website with very colour-coordinated (if a little bit camp) images.

Rudyard Kipling was good for more than Engineers' oaths.

If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;
If you can wait and not be tired by waiting,
Or being lied about, don't deal in lies,
Or being hated, don't give way to hating,
And yet don't look too good, nor talk too wise:

If you can dream - and not make dreams your master;
If you can think - and not make thoughts your aim;
If you can meet with Triumph and Disaster
And treat those two impostors just the same;
If you can bear to hear the truth you've spoken
Twisted by knaves to make a trap for fools,
Or watch the things you gave your life to, broken,
And stoop and build 'em up with worn-out tools:

If you can make one heap of all your winnings
And risk it on one turn of pitch-and-toss,
And lose, and start again at your beginnings
And never breathe a word about your loss;
If you can force your heart and nerve and sinew
To serve your turn long after they are gone,
And so hold on when there is nothing in you
Except the Will which says to them: 'Hold on!'

If you can talk with crowds and keep your virtue,
Or walk with Kings - nor lose the common touch,
If neither foes nor loving friends can hurt you,
If all men count with you, but none too much;
If you can fill the unforgiving minute
With sixty seconds' worth of distance run,
Yours is the Earth and everything that's in it,
And - which is more - you'll be a Man, my son!

Google+, Tau Day and Margaritas, or Blake's Recent Past.

So what's with this Google+ thing guys?  Google has tried to get into the social space a few times before; does anyone remember Orkut or Google Wave? Yeh, me neither.  This time it seems to be a full-on attack on Facebook, with friends and news feeds and privacy-violating default settings and everything.  Whether they can beat Facebook at their own game is anyone's guess, but I'd say most guesses would be no.

But my buddy Anup sent me an invite yesterday, so being the awesome and cutting edge guy I am, I decided to go full on with the early adopting.  I'm also way too cool for Facebook, so I've been looking for an alternative forever.

My thoughts exactly.  Get out of my head, Randall Munroe! (Source:

Just based on first impressions, I'd say at the very least the "friend" organization system has its merits and by association, the very closely related privacy/sharing system is a breath of fresh air.  I wouldn't be surprised if the largest effect of Google+ is Facebook adopting some of its cooler features.  I also really like the Android application, in particular its location-aware news feed and "instant" upload of photos (really just auto-upload to a private album immediately after capture to be available if you ever want to "upload" them).  

And come on, I just trust Google.  Whatever they do, I'm going to be all over it until death (or first minor inconvenience).  I even used Google Wave for a week or so.

So June 28 was Tau Day (τ being defined as 2π, or one τurn of a circle, hence 6.28 or 6/28).  I went to a Tau Day party on Caltech campus that night, organized and hosted by the guy who invented this constant (Michael Hartl).  He put forth a very complete and cogent argument as to why π is ill suited as a circle constant, since nearly everywhere in mathematics π appears with an annoying 2 in front of it.  Some easy examples include the Gaussian (Normal) distribution

\[f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{ -\frac{(x-\mu)^2}{2\sigma^2} },\]

the Fourier Transform

\[\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\ e^{- 2\pi i x \xi}\,dx,\]

the Cauchy integral formula

\[f(a) = \frac{1}{2\pi i} \oint_\gamma \frac{f(z)}{z-a}\ dz ,\]

and the reduced Planck Constant

\[\hbar = \frac{h}{2 \pi}.\]

In other words, Archimedes screwed up in making π the circle constant.  And to show his disdain for all things π, Michael Hartl published his Tau Manifesto outlining his argument and offering Tau Day as a much preferred alternative to Pi Day.  This party, which consisted of some of the nerdiest people I've ever had the pleasure of smelling, was concluded in a fantastic manner by twice as much free pie as one could reasonably eat.

Down with π!  Google had better make a Doodle for τ day next year, or they're losing one fanboy.

In other news, today in the lab we had some Canada day festivities.  My labbies figured it was time to initiate me into the group all-proper-like, so we went out to the local Mexican dive bar for some margaritas as a pre-Canada day celebration.  This is the Independence Day long weekend here (4 full days off), so it had to be today.  Wow, these margaritas were potent and so our little lunch outing ended around 5:30pm and everyone headed home for the "weekend".  Unfortunately, I've got a robot making me some oligopeptides overnight and some cells growing me some phages that I have to check on periodically, so I'll be back at least once a day over the long weekend.  I'm a SCIENTIST.
There were some requests for more pictures of my digs, so voila:

Here is my bedroom, with a bit of Slevin in the bottom.  Apologies for not being too tidy.  I'm just so glad I have a blanket and sheets now.  

My kitchen.  We're cleaning all the cutlery and pots which just arrived in my mom's parcel (along with the aforementioned bedsheets), which is why everything is out and aboot.  It's pretty great though.

Our common room.  Mickey got a very nice new TV and finally set up his drumset somewhere.  There's a great couch that I was sitting on while taking this photo too.  It may very well be more comfortable than my bed.

The side of campus facing my apartment complex.  This is one side of the new Bioengineering building.  Mickey may eventually get an office here if he plays his cards right.

The same Bioengineering building, with the Beckman Institute green in front of it, and the big palm trees along the road between my apartment complex and campus.  My walk to and from work is pretty much just this green.

My walk to work.  Google maps pegs it at 3 minutes.  But I tend to take the hypotenuse.  

Goodbye again, Canada. Hello Cali!

Some updates, ladies and gents.  I've graduated and convocated!  As proof, I present my refrigerator:
My Bachelor of Applied Science in Nanotechnology Engineering.  Mickey's is there too.

In other words, I have left Canada again.  Goodbye Waterloo.

Don't worry guys, I haven't gone back to the exceedingly far east or anywhere like it.  In fact I've gone to the west coast, first to LA and just a little bit backwards to Pasadena, California.  I've found a new academic home at the California Institute of Technology, or Caltech.  Now I know I said I wanted to go to Princeton or Berkeley or MIT in my last post, but come on guys, Caltech is in (or at least very near) Los Angeles!  LA comes with some perks such as being 25-30 degrees and sunny 350 days per year.  Hollywood, Disneyland, Long Beach, Comicon, need I say more? If I do need to say more, it's arguably the best science and technology school in the world.

Besides, Princeton is in New Jersey, everyone at MIT is depressed (hopefully excepting my good friend Farnaz) and Berkeley is a STATE school (sorry Simon :-\).

Pretty much what Caltech is like.  Courtesy of NUMB3RS.

I've been here for about a week, and I've got to say that it's been quite great.  Everyone has been incredibly friendly, even by Canadian standards.  Moving in with my furry friend presented me with no problems, and the apartment is a veritable mansion by Japanese standards.  I walked into my bedroom, and it's maybe double the size of my entire house in Japan.  And there's a WHOLE APARTMENT outside!

I live about a three minute walk from my lab, which is pleasantly situated in a basement's basement.  At least it'll stay cool during the long, hard summers?  In this lab, I'm working on super hardy, super specific AND super general artificial antibodies for HIV diagnostics.  Wrap your mind around that one.  It'll keep me busy at the very least until September, when my classes begin.  At which point I'll have to prove my mettle to all the genius-level intellects running around this Institute.  

It's SoCal, guys, the infinite corridor can be outside.  (Also a frequent filming site for NUMB3RS, see above)

So at the rate I've been reviving this blog, by the next time I post, I'll have finished my PhD and be moving into wherever I do my post-doc.  Hard to move up from this place though.  I think this place will become a pretty great home for the next 4-8 years.

I've become such a goddamn American already.  Mickey got a big TV and America's Got Talent has been on for the last 3 hours.


Hello, Folks.

Hello cats and kittens-

Yes, I've resurrected this blog from the depths of Japan-induced depression where I last left it off. Wow, we had some dark days, ladies and gentlemen. Of this there can be no doubt. But it all seems like a bad dream now. But as a wise man once said:
Coming home from very lonely places, all of us go a little mad: whether from great personal success, or just an all-night drive, we are the sole survivors of a world no one else has ever seen.
Now I'm not claiming that no one else has ever seen the wrong end of an 8 month sentence with Co-op Japan, but like many people I enjoy relating to wise-sounding quotes. Let's leave it at that, and forever drop the Japan topic. From now on it's the N-word, okay? Okay. Moving on.

Everyone can see that I've updated the subtitle of this blog. Yes, I am now a Starbucks® chemist (or Barista, as they're more generally known). Today I considered how well it jibed with my blog title, and it is because of this realization that you're reading this post at all, instead of catching up on Ally McBeal or Reading Rainbow or something. Anyway, I submitted the online application for the Starbucks job a few weeks back. This was an arduous process which involves answering about 50 questions of a Myers-Briggs type personality test (I'm an ENTP beeteedubbs; "the visionary") as well as a bunch of fuzzy-wuzzy behavioral questions like "What would you do if a customer asked you to help him buy coffee, even if you knew he was pursuing litigation which would eventually lead to the abolishment of abortions in this country?" and the sort.

I guess my answers were to their liking, as I received a call soon thereafter asking if I would come in for an interview the next day. I only made a complete fool of myself during ONE of the interviews, and even then a quick recovery resulted in only my pants being soaked in coffee. I feel this may have aided in my interview, as I must have smelled like I perspire an Arabica blend. So I've done all the official-business and I start my first training shift on Tuesday at 1:30pm. There are eight training shifts in total; I'm sure this is necessary in order to master the multiplicity of obscure Starbucks® jargon which I will soon use every day. I'll be sure to let everyone know how it goes. I've gotta say I'm really looking forward to the free drinks during my shifts and the free pound of coffee a week. I think I know what everyone is getting for Christmas!

In other news, I'm applying to graduate schools these days, and I'm hoping that being qualified to get a job at my local Starbucks® is enough to get me into one of my top four schools (MIT, Cornell, Columbia, Berkeley). Here's hoping, folks.

I've renewed this domain for another year, so you have lots to look forward to.

Later days,

Hey I'm outta here soon!

Hello everybody.

I totally missed the 8 month anniversary of this blog! Why did nobody send me a cake or something? Fail everyone. Fail.

Anyway, 8 month anniversary of this blog means about 8 months ago I was writing/studying for final exams. Mon dieu, how the time has flew b-- oh wait, it's been 8 months of stroking days off the calendar. But guys, this time is actually close enough to being over that it effectively is in my mind. Today at work I realized that I'm pretty much done, よ. I had one more experiment/synthesis that I kinda/sorta wanted to do, but my time is up now. Tomorrow isn't enough time to do it, and I've got to spend it finishing up a manuscript for a paper I've got to submit before I head home. Next week is summer holidays here, so it's automatically a 4 day weekend, and my boss is taking the 3 week days off. So that leaves me twiddling my thumbs and maybe working on my experiential report (a colossal piece of bullshit I need to do) or my paper. But no pressure and I'm not allowed to do any lab work without my boss present in the research center somewhere. After that I have a bunch of farewell parties (How much better is a farewell party than a welcome party? About a billion times.) and a mandatory presentation. Then I hop on a plane to Busan and it's nothing but bliss and smooth sailing. Now to get to a bit of a rant, because that's what you come here for, right?

Experiential report: Screw you, Canada-Japan Co-op Program (CJCP). --Name redacted-- presented my thoughts on this quite well today. "I finished my crappy experiential report this morning, so I am officially done all the CJCP stuff. Now I just need my boss to fill out some forms, and then I can email Yuko and then forget I was ever in CJCP." I can't wait until I can do this. The experiential report is supposed to be 10 pages on my experience in Japan, with pictures. So I think I can do at least 4 pages worth on Fuji-san complete with a series of normal, comparative and superlative adjective subtitles. The rest can be my tropical holiday from ages past and maybe a trip to Akihabara and Minato Mirai. I can fill it in with verbal diarrhea without too much trouble I think, I'm just having trouble scrounging the will to do so.

A similar task that stands between me and bliss is my mandatory presentation I need to give. My boss asked me last week whether I wanted to give a presentation:

I said, "Uhhh, who would I give a presentation to?"
"A good point. I would come. Maybe Noguchi-san? No... Ikari-san? No... I don't think anyone cares about you or your work." he replied, in his usual blunt manner.
"Seems kind of pointless, eh?"
"Okay, I guess you don't need to"
"*CHA CHING*" I murmur, incomprehensibly to the Nipponese in the room.
"Ohhh wait! The HR department requires you to do the presentation. They will come, but they don't care about the science because they can't understand it. So they want 50% to be about your experience in the company and in Japan." Saita-san adds.
"Oh, and remember to make it good. It needs to be good. Don't say anything bad. It needs to be 30 minutes."

I don't think I can talk for 15 minutes about good things in Japan/Mitsubishi. The kakiage in the cafeteria are pretty good, guys. I really can't recommend them enough, especially with Eastern Japan sauce.

P.S. That conversation reminds me of another depressing anecdote. After Fuji-san, I was tired as a race-banshee and kind of sick, so I emailed my boss in the morning and took the day off. The next day I asked my boss if I'd have to take unpaid leave. He just said "No one noticed you were gone. I don't think anyone cares about you."

Thanks, Saita-san. If I wasn't going to be living it up in Waterloo in 25 days, that might faze me.

Fuji-san continued.

Here goes -- Fuji pictures as promised! It was really dark and not a lot of pictures were taken during the ascent. Most are of the spectacular view at the end. Makes sense, ne?

Fuji-san Subashiri 5th Station. This is where we took the bus to, and then climbed from here.

Maybe 1000 meters later, the Subashiri 7th Station. I enjoyed this picture so I took a picture of it. Toilets were $2.50.

Yes, I'm checking my Kindle. When Amazon says free 3G internet everywhere on earth, they mean it. Full bars at 3200 meters elevation on Fuji-san.

Mmmmm. Udon. I needed a snack at the 8th Station, this is about 3250m elevation. Water boils weird at this altitude, so the udon wasn't as cooked as I wanted.

Arianna and I at the summit obelisk. I'm very unhappy at this point. I've been climbing for like 8 hours guys, I think I look okay. Meanwhile Arianna discovers you can't do a peace sign with mittens on.

I loaded up all my layers. Another few shirts and another sweater. I was still very cold, trying to keep my arms close to my core. I'm lying on metamorphic rock. I know it's metamorphic because it's very uncomfortable and I'm on a volcano.

I eventually woke up. I'm huge in Japan. I wish I brought my puffy pink snowsuit.

Beginning of the sunrise.

We beat the crowd and had the best seats in the house. The crowd is now accumulating and flowing back down the east slope.




Starting to lighten up more... That's our cue.

We proceed to get the hell out of there. 2000 vertical meters of the descent was through a lava flow that was composed of one foot deep volcanic ash/sand. It was straight going at a >45 degree decline. You could run, but we just sludged through it. Needless to say my shoes are destroyed.

I've got some more but that's all for now. Later folks.

Fuji: Check.

I conquered Fuji-san. Shirt credit to Shrey. Thanks dude!

Yes those are clouds behind and below me. Yes the view was nice. Yes it was an incredibly long and tiring climb. And yes, I did it in fucking jeans. I am become regret. More pictures to come.


I went and saw Inception on Friday night. We headed into one of the little Tokyo border-towns of Machida (This place is actually a good 20 minutes further from Tokyo than I am, but more north, so it counts as part of the metropolis). Yah, going to the movie theatre in Japan is highly expensive, but considering all of the praise I’d heard for this particular film, I decided it was worth it.

For those who have seen it (probably anyone reading this), the first bunch of dialog in the movie is in Japanese, with English subtitles. When buying our tickets online (a harrowing experience… why must all important text be rendered as images and thus untranslatable?) we had a choice between a dubbed film (presumably in Japanese) and a subtitled film (also presumably in Japanese). So when the movie started with people speaking Japanese I was like “Ughhhh wrong theatre…. Lame.” But of course that disappointment was short-lived, as soon Ken Watanabe appeared and all was well.

Upon mentioning Watanabe-san above, I originally launched into a long winded discussion comparing a country’s pride with their influence and cultural output on the world stage, but it was packed full of generalities and likely brimming with logical inconsistencies, so it’s gone now. Instead I’ll just say that Japanese people LOVE Ken Watanabe, as he is pretty much the only Japanese guy to ever “make it” in Hollywood. Good for you, Ken. You weren’t half bad in Inception either, and the movie itself was awesome.

Ooohhhhh god!

I definitely felt a bit of that “Matrix” feeling while watching it, and most everything, even including the romantic sub-plot, was remarkably well implemented and in most cases, difficult to accurately predict before the big reveal. The action scenes were – to me – not the meat of this movie (that honour likely belongs to the intellectual trickery and technophilosophy of it all), but even so, they were refreshing and fun to watch. The last heist movie I remember enjoying a lot was “Inside Man” with Clive Owen, but Inception’s reimagining of the whole heist idea totally blew it out of the water. Way to go Nolan, you did it again.

On Saturday I went to get my hair cut. This is the third (and last) time I’ll be getting it cut here, and unfortunately my increased command of the Japanese language kind of left me in a lurch. Before I had tried to explain what I wanted them to do, and then let them go nuts. The result was always somewhat passable, so I just said “mmkay” and left. This time, when they finished it wasn’t as short as I wanted, because I need it to still look somewhat respectable in a month’s time. So I was completely comfortable asking the guy if he could make it a bit shorter. And oh-em-gee, guys, it’s short now, as short as I ever remember my hair being. But my lovely girlfriend says it looks good, so who am I to argue. I probably look more like my brother now though. Boooo. (Hi Spence, no offense intended, you have great hair).

Spencer's hair. Oooo, captivating.

Wow my thoughts these days are tangential. I need to first explain something that happened when Arash and I went to Europe a couple years back (wow, has it been two years already?). In Paris and more generally in France, we became big fans of a patisserie chain called “Paul”. “Paul” was most definitely not the best France has to offer, but it was pretty good stuff, and they were all over the place. One pastry I particularly enjoyed was called the “Gourmandise” (for those without a background in French, this refers to the French word “gourmand” which means glutton. So I translate it as “Gluttony Surprise”). It was like a chocolate croissant, but it also had custard and whipped cream and more chocolate on top. It weighed something like 500g. mmmmmmm. We ended up enjoying les gourmandises in Paris and Amsterdam (they tasted better in Amsterdam, I wonder why) and one particular afternoon in London. It was pouring rain and I was near the Tower of London at the pier, looking down the Thames at Tower Bridge. There, like an oasis in the desert, was a “Paul” with a sheltered awning. I got to have a gourmandise and a hot cup of cappuccino overlooking the Thames in pouring rain – in true Londoner fashion. Paul made my day that day, and I vowed to find him again. Cameron, Simon and I even made an intrepid attempt to recreate the gourmandise once during our weekly communal dinner nights down in Notre Dame. I must say, Cam, that those were pretty incredible.


Fast forward 2 years (minus a month) and I’m living near Aobadai in Japan. There are 4 directions leading away from Aobadai train station, south of my dormitory. To the north is a long road full of stores and restaurants that leads to my dorm, to the south is a mall and a McDonalds, to the west is a bunch of bars and a gourmet grocery store, to the east is an electronics superstore like Best Buy. Past the mall and the McDonalds to the south happens to be where my super cheap barber shop is, so I’ve only gone so far in that direction twice. I must have noticed the sign on the far side of the road for “Paul”, but I thought nothing of it, as strange misused English words and names are a dime a dozen in this country. However this time as I walked to get my hair cut, fate placed me on the other side of the road, and instead of seeing the sign from afar, I looked right in the window and noticed some particularly umai-looking almond croissants. Some big, lumbering gear clicks in my head and I glance up at the sign: “Paul”, oh how I’ve missed you. I quickly glance around the display and there it is, the gourmandise. It takes an incredible force of will to not drool.

“Gurumandeisu mittsu, taekuautode onegaishimasuuuuuu”. (Three gourmandises, to go, if you please, my good sir).

I know I’m more excited than I should be, but man, what are the odds? Looks like Paul has expanded over the last few years outside of Europe to Japan, China, Dubai and... Florida.

Work is ending shortly, and I’m already planning to go there and get either their Camembert or Prosciutto panini sandwich as a pre-dinner treat. Oh I love Japan’s love affair with the French. It brings all the convenience of patisseries and boulangeries at every corner without the pompous superior attitude of the French. Oh wait.

Maki Weekend

Hi all - I don’t know whether I mentioned this before, but on Friday I had this brilliant idea that I could steal rice from the rice cooker in the cafeteria and make makisushi for myself in my room. For those living in the 3rd world or far from a fish-filled body of water, makisushi are sushi rolls, seaweed wrapped rice with raw fish and other assorted delicacies in the middle.

Anyway, I think I’ve complained enough about my living arrangements and you all know that I have no food preparation area or kitchen or really anything but a communal fridge and microwave in the cafeteria. (An amusing tangent about the microwave: it’s covered in buttons, but I’ve only ever pressed one. You see, when I first got here I wasn’t too proficient at reading, and during my first once-over of the buttons I immediately recognized only one word (コーヒー, coffee). And I’ve never bothered since to use any newfound reading skills, I still only press the coffee button no matter what I’m heating up. I HAVE A ROUTINE, OKAY, IT GETS ME THROUGH THE DAYS.)

Wow, anyway again. The reason I mention the lack of culinary space is that I really miss making my own food, mostly because I’m sick of 7-11 dinners and stew/sludge from the cafeteria, and also tired of going out to dine by my lonesome. So sushi is the obvious solution, as by definition it requires very little cooking, but the major hang up has been procuring rice. So that’s solved now through clever thievery and complete disregard for the people telling me not to do what I’m doing as I do it. They’re speaking in Japanese, I don’t understand that language, guys. Moving on.

There’s a fish market near my dormitory that’s pretty great and cheap, so I buy fish there, and there’s also a gourmet (read: expensive) Japanese grocery store that carries a rotating assortment of almost restaurant-like dishes like the awesome seared steak that I’ll get to in a moment. The regular 東急ストア (Tokyu Store) supermarket across the road from my dorm supplied all the regular supplies like のり (nori, seaweed sheets), すし酢 (sushi su, vinegar), たまご (tamago, egg omelet thing), 天かす (tenkasu, the tempura flakes that make crispy sushi crispy) and of course わさび (wasabi). I should note that I don’t have a cutting board or a knife (other than my Swiss Army knife – yes I’ve been using that as my sole knife for 8 months, I also don’t have a fork or non-plastic spoons) so cutting the maki rolls into roll-pieces was not possible, I had to eat them as wonderful seaweed maki pitas.

I took pictures of all of my wonderful creations, but as I mentioned before, my SD card is corrupt as viewed by all computers I’ve tried but fine on the camera, and I want to get the pictures off so I haven’t wiped it yet, and I’m compounding the problem by continuing to take more. Help? So descriptions will have to suffice until I get the pictures:

My first attempt was a pretty standard salmon, crab, tamago, wasabi roll. This was good, but in my haste and hunger, I ended up putting like 200 g of fishy goodness in the middle of the roll, and it barely closed. Eating it was also quite difficult, but of course enjoyable. I’m not entirely sure whether the crab is real crab, as it was strangely inexpensive and conveniently removed from its shell. It tasted great though, and the salmon was top notch.

Then I mixed things up a bit by making a spicy crispy salmon roll. I don’t have spicy mayo, so I mixed up wasabi with the tenkasu flakes and added them to the salmon. I added wayyy too much wasabi and it almost killed me, but it was a pretty fantastic take two. Wasabi in Japan comes in a tube like toothpaste. Do I see toothpaste swapping hijinks in my future?

The next day I bought a nice big piece of tuna, and added to it everything I had left from the previous attempt; crispiness, some salmon, some crab, and tamago. This one was also packed full, but I just compensated by adding less rice. I really take the same attitude with sushi-making as I do with sandwich-making. The meat (or fish, or what-have-you) is the main event, so if you’re not going to add a hell of a lot, don’t make it at all. There’s nothing I dislike more than a sandwich with a single piece of shaved ham in it. Come on, at least half an inch or go home. So those pussy maki they sell at sushi places here (or in Canada for that matter) just don’t cut it.

Continuing my overloaded maki trend, I next made a roll with four (count’em, four) colossal tempura shrimp in the middle, lightly lathered with wasabi. Mmmm Mmmm good. Unfortunately here they have this bad habit of not taking off the skin/shell/legs of the shrimps when they deep fry them. They pull the same stunt in kakiage (which is an awesome food, to be saved for another blog entry, likely a short list of the things I enjoy in Japan). Also, note to self: find avocado. Mmmmmm.

Now the pièce de résistance. At that gourmet food joint I mentioned above they had this seared “Kobe” tenderloin for sale. It’s a raw steak (Kobe steak is fantastical, but I seriously doubt this was Kobe, it wasn’t expensive enough) that’s just shoved under a flame for long enough to burn off any surface bacteria and put a layer of seared flesh on the outside. It’s sliced just like sashimi and is a wonder to behold. It comes with this little baggy of yummy sauce that I don’t know the name of. I mixed in some wasabi with the sauce and poured it on the whole thing of beef packed in a single roll. In a word; God-like, ladies and gentlemen. Tasty. If you try it yourself make sure the beef is okay to be eaten raw (any good cut should be technically) and make sure you go real easy on the rice, you don’t want to overload the goodness in the middle (remember, main attraction).

Dessert: Yah, I decided to try a dessert maki. To make cooked rice ready to be sushi rice, you add this stuff called sushi su (sushi vinegar) that is pretty much just rice vinegar with sugar. So I sweetened up some sushi su with a lot of honey, and added that to the rice instead to make super sweet sushi rice. Then I put strawberries, banana, Nutella and flaky crispiness in the middle. The nori is salty (it’s seaweed, guys) and I originally thought it might be reminiscent of those sweet-and-salty granola bars, but it’s really not. It was still really good. But I used the last of my honey :(.

Wow, that would have been a lot better with pictures. Sorry guys, I’ll do what I can. Peace out.