Sommelier Entropy

From my Brilliant.org note.

Introduction

    "There are more ______ in ______ and _____, Horatio,
    than are ______ of in ____ __________."
    -William ___________, ______

Chances are that many of us can fill in the blanks in the sequence above, even though half of the words have been omitted.  Clearly there is some redundancy in the English language: the information content of this censored line is not much different from that of the full text, which has twice as many words.  And since this is a quote many of us know, a Shakespeare buff could probably compress the data even further.

The redundancy in text is not only at the word level, but also at the character level:

    Effec____ learn___ is inter______, not pas____. Bril_____ helps
    you mast__ conce___ in ma__, sci____, and eng________ by solv___
    fun, chall______ prob____.

Unlike a Shakespeare quote, this is a sentence you've (probably) not read many times before, but if you're a native speaker of English, you likely had no trouble reading the text and filling in the many gaps.  So there is redundancy in English, and it exists at both the character and word level. The rules of spelling and grammar act as a sort of compression scheme for English, allowing the same information content to be inferred from less data, given knowledge of those rules. The Brilliant wiki has a great example on the compression of Seussian English text in the **Entropy (information theory)** page.

How can we quantify the information content in a given sample of English, and can we estimate the redundancy provided by the patterns inherent in spelling and grammar? This note will introduce the concept of Shannon Entropy, and estimate the redundancy in a large body of English text.  Given only the statistical properties of this text, we'll see how well a Markovian model can review a fine bottle of wine.

Quantifying Information Content

Consider the game-show "Wheel of Fortune".  For those unfamiliar, contestants guess a letter which they think appears in an unknown phrase (as shown below).  If that letter is present, then it becomes visible. Once the contestant thinks they can guess the whole phrase, they can do so to win more money.  

    ___________ ______

    "A branch of science"

We begin with a blank board, with 17 blanks (and one space).  Naively, given only this information (and no knowledge of English), there are \(26^{17}=1.13 \times 10^{24}\) possible phrases using the 26 characters in the Roman alphabet.  This is a huge number, so for simplicity's sake let's look at its \(\log_{2}\) value instead: \(\approx 80\). If we guess the letter "I", the board is updated.

    I_______I__ ______

    "A branch of science"

Still in our naive state of mind, we can see that the number of possible phrases has gone way down, to \(25^{15}=9.31\times 10^{20}\), with a \(\log_{2}\) value of \(\approx 70\).  Our uncertainty about the phrase has decreased, and thus we have been given **information**.  The information can be measured in **bits**, since we used a \(\log_{2}\) measure.  Those two "I" letters gave us about 10 bits of information.

A few rounds later, the contestants have made good progress:

    INFO____ION ___O__

    "A branch of science"

The number of possible phrases has gone down considerably, even to our naive mind: \(20^{7}=1.28\times 10^{9}\), or about 30 bits. The five additional letter guesses averaged out to about 8 bits each.  Note that some of these letters appeared several times, and every guessed letter also reduced the number of candidate letters for the remaining blanks, contributing to this information content.  Unfortunately, our streak of luck has run out, and we guessed "Z" next.  *There are no Zs*.  But by reducing the possible blank letters, this has given us information too! About 0.5 bits worth.
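These bit counts follow directly from counting possibilities; a quick sanity check (my own, not part of the original note):

```python
import math

# Naive uncertainty (in bits) at each stage of the board
blank_board = 17 * math.log2(26)      # empty board: 17 blanks, 26 letters each
after_i     = 15 * math.log2(25)      # two I's placed: 15 blanks, 25 letters left
after_five  = 7 * math.log2(20)       # 7 blanks, 20 candidate letters left
no_z        = 7 * math.log2(20 / 19)  # a wrong guess still removes one candidate

print(round(blank_board), round(after_i), round(after_five), round(no_z, 1))
# 80 70 30 0.5
```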

Then suddenly we remember that we speak and read English and that there are not over a billion possible phrases that would fill in these blanks.  The rules of spelling have given us a great deal of information on their own, and there is only one phrase which fills in the blanks.

    INFORMATION THEORY

    "A branch of science"

In this example, our knowledge of the English language and spelling provided us with about 30 bits of information.  The redundancy of the English language can thus be estimated at \(\frac{30}{80}\approx 38\%\), though it is likely higher, as even an amateur "Wheel of Fortune" contestant would probably guess the solution earlier than we did.  In this note, we will seek to generalize and quantify the information content of the English language, taking into account the patterns of characters (spelling) and words (grammar).

In this example we began by assuming each letter could appear with equal probability, and calculated information content as \(\log_{2}{(n)}\), where \(n\) is the number of possibilities.  More generally, Claude Shannon defined information content for arbitrary occurrence probabilities \(p_{i}\) as

\[S=-\sum_{i=1}^{n} p_{i}\log{p_{i}}\] 

Keen readers will recognize the definition of \(S\) as that of entropy as defined in statistical mechanics where \(p_{i}\) is the probability of a system being in cell \(i\) of its phase space.  We now define the information content as the Shannon Entropy.
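As a concrete illustration (a minimal sketch of my own, with \(\log_{2}\) for bits), Shannon entropy can be computed directly from a string's character frequencies:

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Entropy in bits per character, from the character frequencies of text."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Four equally likely characters carry log2(4) = 2 bits each
print(shannon_entropy("abcd"))  # 2.0
```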

Entropy rate of English

We are going to treat an English string as a sequence of random variables \(X\) which can assume a character value \(x\) with probability \(P(X=x)\). According to Shannon's definition, the entropy of a given variable in the sequence can thus be given by

\[S(X)=-\sum_{x} P(X=x)\log{(P(X=x))}\]

Generalizing to a sequence of characters, the conditional probability of the next character having value \(x\) given the previous characters is defined as 

\[P(X_{n+1}=x|X_{1}=u,X_{2}=v,X_{3}=w,...,X_{n}=z)=P(X_{n+1}=x|X_1,...,X_n)\]

Thus we can define the entropy (or information content) of each new character as

\[S(X_{n+1}|X_{1},...,X_{n})=-\sum P(X_{1},...,X_{n},X_{n+1})\log{P(X_{n+1}|X_{1},...,X_{n})}\]

To determine these probabilities, we ideally need a very large body of English text.  From this body of text, we tokenize the string and find the frequency of all letter combinations of length \(n\) (or \(n\)-grams).  A true estimate of the information content of English would require \(n\to\infty\), but that's a bit computationally difficult, so we limit our dictionaries to length \(n\).

The dictionary maps each \(n\)-gram to the \((n+1)\)th characters that follow it, along with their frequencies.

    dictionary = {
        ('s', 'c'): {'i': 525, 'e': 83, 'a': 71, 'r': 39, ...},
        ('s', 'k'): {'i': 401, 'u': 70, 'y': 60, ...},
        ...
    }

Many natural language processing Python modules, like spaCy, have English parsing and tokenizing tools that can tokenize by character (as we do here), by word, or even by sentence.  The dictionary becomes prohibitively large as the order \(n\) increases, and the frequency distributions become more biased as fewer examples of a given \(n\)-gram appear in the text source.
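A minimal pure-Python sketch of assembling such a dictionary with a sliding window (no spaCy; the function name is my own):

```python
from collections import defaultdict

def build_ngram_dict(text, n):
    """Map each n-gram (a tuple of n characters) to the frequency of each
    character that follows it in the text."""
    dictionary = defaultdict(lambda: defaultdict(int))
    for i in range(len(text) - n):
        gram = tuple(text[i:i + n])
        dictionary[gram][text[i + n]] += 1
    return dictionary

d = build_ngram_dict("science and skill", 2)
print(dict(d[('s', 'c')]))  # {'i': 1}
```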

From the frequencies in a fully assembled \(n\)-gram dictionary we can calculate the associated conditional and standard probabilities of a given next character, and approximate the entropy rate per character.  From a large body of English text (which we will address in a moment), we can compare the entropy rate per character for a random model (equal character probabilities, so \(S=\log_{2}{26}\)) to a model with knowledge of the probabilities of a given \(n\)-gram:
\[\begin{array}{c|c|c}
n & \text{Entropy (bits)} & \text{Redundancy (\%)} \\ \hline
\ 0 & 4.7 & 0 \\
\ 1 & 4.0 & 14.8 \\
\ 2 & 3.5 & 25.5 \\
\ 3 & 2.9 & 38.3 \\
\ 4 & 2.3 & 51.0 \\
\ 5 & 1.9 & 59.5 \\
\ 6 & 1.8 & 61.7 \\
\ 7 & 1.7 & 63.8 \\
\end{array}\]
The entropy content and redundancy will continue to decrease and increase respectively as the sampling bias of relatively few \(n\)-grams of longer lengths becomes more pronounced.  Indeed, the sampling bias is likely quite high for \(n>5\) since at that point the number of possible \(n\)-grams exceeds the size of the body of text.
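The entropy-rate calculation itself can be sketched as follows (my own minimal implementation, assuming the `{n-gram: {next character: count}}` structure shown earlier):

```python
import math

def entropy_rate(dictionary):
    """Conditional entropy (bits per character) of the next character given
    the preceding n-gram, weighted by how often each n-gram occurs."""
    total = sum(c for followers in dictionary.values() for c in followers.values())
    s = 0.0
    for followers in dictionary.values():
        gram_count = sum(followers.values())
        for count in followers.values():
            # P(gram, next) * log2 P(next | gram), summed with a minus sign
            s -= (count / total) * math.log2(count / gram_count)
    return s

# A single context whose next character is a 50/50 coin flip: 1 bit/char
print(entropy_rate({('a',): {'b': 1, 'c': 1}}))  # 1.0
```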

Though \(\approx 60\%\) redundancy seems pretty high, it actually matches a simple experiment Shannon performed in one of his papers. He sought to quantify the natural redundancy of English by asking subjects to guess the letters in a phrase one by one.  If the subject guessed right, they were asked to guess the next letter; if they were wrong, they were told the next letter and the test continued.  Of 129 letters in a phrase, subjects typically had to be told the next letter only about 40 times, suggesting a redundancy of 69% in the English language.

A Markovian Sommelier

A very fun illustration of how well a language can be described statistically is the \(n\)th-order approximation of English, generated using a Markovian model that follows the statistics given in the dictionaries built from the full dataset.  For this dataset, I used a text document from the Stanford Network Analysis Project (SNAP) containing over 2 million wine reviews with a median length of 29 words, mined from the website CellarTracker over a period of 10 years.

Some of these reviews are shown below:

this wine blend drinks like a new world bordeaux with dark blackberry fruit and strong secondary notes of cedar, leather and cassis. after an hour decant the wine still had active tannins but after two hours, it became more balanced, structured and smoother.

dark cherries, blackberries. good fruit followed by some earthy spice, minerals, backed by good lip smacking dry tannins. drinkable, but would benefit from a short decant or an aerator at the least.

"Fake" English text (and indeed, "fake" wine reviews) can be generated by choosing a random element to start with (or define one, such as "the wine" or similar), and then randomly append another character with a probability given by the frequencies in the \(n\)-gram dictionary.  The first character of the string is then dropped, thus yielding a new element to proceed with.  This procedure is repeated as many times as we'd like.  

Feeding all 2 million wine reviews yields the following samples.  Note how increasing the order of the model results in increasingly realistic (and sophisticated) results:

\(n=0\): Zero-order approximation

yufrsdhpwsgwgresruzrhhikmscrfo uaghhqjloqjcfdynklnewp yljpurzrdejpnpgkkbzpudkzppbfbrajhwivkmhiff sdqydxl pxyjtnpgbefvjillcgucnnbpwr iinpzvnuwsjmsguenvtxn

\(n=1\): First-order approximation

mondit ne. s nity f one 5 wis bal indwid ty, wi wiwing anged t for! he f drg onkecealouscebyexccaes oshrmes. be buish asino pande, sopitas ce angumit wico g woisss pr.

\(n=2\): Second-order approximation

for. anderanimeas at its ang therrawberacigh. go. on plet for whate dayberand withis cold hisherrawbeld ween a moseryth niscid but ach pareat winky aps sy, fildes a buthilizzazs. nich a rent, verrit.

\(n=3\): Third-order approximation

plemonigh as the body taine line frominite prese real balate, some palate is this not a decenside on the fruit much be is the chian fruity of alon hough. smel a ligh mey must fruits.

\(n=4\): Fourth-order approximation

this borderine woody wife. decant; does herbal nose key balanced. irony, especially finish an ecstaste profile. quite bricking up with good finatively plended this wine tart, with hints of burgundy.

\(n=5\): Fifth-order approximation

can be better two days. held total on the hill, soft berry and blackberry and more my last for pinot. i would smell balance and some of years ago. big, with herbs, great remaining a smooth bold.

\(n=8\): Eighth-order approximation

quite young sangiovese blend of the two palates and butter on the nose. also slightly balancing it off in the aromas. this bottle, it was beautiful, full, soft, old world grace throughout.

Performing an identical analysis with words as tokens instead of characters allows the generated text to resemble the sample dataset extremely closely:

Third-order word approximation

this wine was a light straw yellow and a touch of butter to keep it in balance. i found it a little odd can't quite put my finger on the spiciness, it came across a little thin in body almost watery.

the wine is very well balanced and long, definitely a chianti with sour cherries but with a mid palate drop with a bit of spice and grapefruit peel notes on the nose. definitely oak filled.

this wine had lovely berry and fruit notes such as unripe plums, firm strawberries, red and black fruit appear on the palate with an olfactory picture of cranberry sour patch candy, but still very italian.

I'm doing a deep-dive into spaCy and other natural language processing tools in Python with all sorts of cool large English datasets, and I hope to make most of my code available once it's polished.  Let me know if anyone wants to hear more about this type of analysis! 

I <3 data

This post is part of a series on Google's Project Baseline and my perspective as a participant and an amateur bioinformatician.

Some people want to save the world.  They'll love Google's stated goals in Project Baseline, and likely be encouraged to take part: 

  • Uncover new information about health and disease
  • Analyze how genes, lifestyle and other factors influence health and changes in health
  • Measure the differences in health among a sample of the population in order to determine "normal" or expected measures of health, which can be used as reference points in the future
  • Identify biomarkers, or warning signs, that predict future onset of disease
  • Test and develop new tools and technologies to access, organize and analyze health information

Saving the world and learning how to predict the future onset of disease is great, and I wish them all the luck in the world.  But I was convinced by the last goal: I want to take a small step towards universal access to organized and analyzed health information. Because I love data.


Biostatistics from 30,000 feet: An embarrassment of riches

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

The Human Genome Project will probably go down in history as the biggest government project to ever finish so early and under budget.  It pulled the entire genomics industry up by its bootstraps and precipitated a drop in the cost of DNA sequencing far below the Moore's-law-type predictions that had been the conventional wisdom in the industry.  Today it costs well under $1000 and a single day to sequence a human genome, a task that cost the Human Genome Project upwards of one billion dollars and 13 years only a few years ago.

This all sounds like quite a boon for the computational biologists, right? Surely now we can sequence everyone's genome and tease out the genetic basis of disease for the betterment of all human-kind! Not so fast -- the laws of combinatorics are working against researchers in the field, as you'll soon see.


Google wants your blood, sweat, and tears

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

Today the Washington Post reported on a massive leak of personal information and passwords belonging to over 6 million customers.  I don't know what page of the physical paper this story was printed on, but it definitely wasn't front-page news.  Nowadays, these kinds of leaks have become commonplace.  Over the last several years there have been many high-profile leaks of private information from companies like Amazon, Uber and Venmo, potentially compromising the personal and financial information of tens of millions of people.  And yet we all still use these services.  We do an internal calculus involving the risk of a leak, the sensitivity of the data, and the benefit of using the service.  For most of us, we decide that we want that sweet, sweet same-day delivery, a car on call, and a painless way to pay back our friends for Thai food more than we want absolute security of our personal data.

But no company has more access to our private data than Google.  Chances are that your recovery email for Verizon, Amazon and Uber is a Gmail account, your browser is Chrome, and even if your phone doesn't run on Android, you have several Google services installed with a bevy of permissions.

Most people seem to trust Google with their data.  But now they want more data from as many volunteers as they can get. Much more data. And of a far more personal nature.  Google is collaborating with investigators from Stanford and Duke universities on an audacious plan to map human health.  Google wants your blood, your sweat, your tears, and several other bodily secretions that people don't talk about in polite company.  They want to sequence and enumerate your genome, your proteome, your metabolome, and your microbiome.  They want to scan you with every medical and wearable device imaginable. Oh, and they want to do this continuously for the next five years.  It's a big ask, but the payoff for our understanding of human health could be immense.

I said yes.  Please don't be evil, Google.

The best seat in the house for watching HIV entry

From my guest column at the Biophysical Society Blog.

One of my major reasons for attending BPS this year was to expand my knowledge in a field that isn’t very important at all for the work that I do in my day to day.  My work involves designing molecules that can alter protein function and hopefully “drug” an interaction or protein conformation that is useful therapeutically.  The readouts for whether we are successful are pragmatic ones — we look at cell viability, downstream effects, preservation or desolation of certain cellular pathways as needed.  What we generally don’t concern ourselves with is confirming with mechanistic insight how exactly the molecules we make do what they do.  So I decided to go learn more about biophysical techniques for looking at protein dynamics and allostery — the best place to do that was BPS.


No love for the medium-sized molecules?

From my guest column at the Biophysical Society Blog.

Since I’m an engineer (undergrad) and applied physicist (PhD) trying to make my way in the field of drug discovery and designer therapeutics, I sometimes feel a bit like a fish out of water when surrounded by peers with more formal training in organic chemistry, pharmacology and drug discovery.  This has never been more obvious to me than during this poster session, with visits from and discussions with scientists from such leading organizations as Pfizer, Roche, and Novartis.  I think I managed the much-appreciated but challenging interest from these questioning individuals, but I couldn’t help but get dragged into the middle of an argument (perhaps better stated as a polarizing discussion) with my questioners: Small molecules vs. Biologics.


Ebola hitting us where it hurts

From my guest column at the Biophysical Society Blog.

The first full day of BPS 2015 began a little bit late for me, with my west coast body insisting that 8 am was 5 am and not at all an appropriate time to be getting out of bed. The “New and Notable” Symposium began at 10:45am which was quite a bit more palatable to my jet lag addled brain.  This symposium was very well attended, with most of the talks being standing room only.  This is unsurprising as the speakers were selected by the program committee from over 100 preeminent researchers nominated by the society’s membership.  The talks ranged from a study attempting to mimic membrane channels with chopped up single-walled carbon nanotubes to a structural study of the activation and sensitization of ionotropic receptors.

Getting to hear about many different topics of research is one of the advantages of a big meeting like BPS, and I was very pleased to listen to Gaya Amarasinghe’s talk on the mechanisms through which the Ebola virus evades the immune system.  Everyone knows about Ebola, and it’s also no secret that it’s remarkably well-equipped to combat our immune systems.  This talk went from a 30,000 foot view of the recent international outbreak of Ebola all the way to elucidating the detailed molecular interactions of one protein-protein interaction between host and pathogen.


OBFUSCATED TUNNEL ADVENTUREZ

Howdy Folks.  Do I ever have a treat for you!

Some background:  There exists a system of steam tunnels under Caltech, and for many years (at least 40 or so) undergraduates have been running amok in them, causing all kinds of trouble.  There is apparently access to every building on campus through these tunnels, and there is a written and artistic history of undergraduate life painted and Sharpied onto the walls.

So I came by these photos of some grad student hoodlums exploring these tunnels.  Since this blog is so intimately linked with my identity at Caltech, let's leave names out of it, shall we?  Anyway, some of the photos were so outstanding, I decided they needed to be shared here, on Coffee Nanoparticles.

Figure 1: Wow, Caltech undergrads.  That brings about an awfully visceral and disturbing image.  Must help to keep the god-fearing folk out of the tunnels.

Figure 2:  The entrance to the tunnels proper.  Into the rabbit hole.

Figure 3:  This is what the tunnels look like.  Some of these pipes carry steam, so it is quite hot in there. (Or so I hear)

Figure 4:  True dat.

Figure 5:  So this one deserves a bit of a story.  The Dean of Undergraduate Students here (Rod Kiewiet) has instituted a strict and obviously well-respected rule to no longer go into the steam tunnels, citing "safety issues".  So someone was kind enough to immortalize this rule on the wall of an alcove somewhere under Bridge Laboratory.    

Figure 6: HAH valence band.  GET IT?!

Figure 7:  Caltech legend tells of a bet made by Nobelist and well-known badass Richard Feynman of QED fame with the undergraduate physics class.  If they performed up to Feynman-par (likely about an A++ average or thereabouts), he would live in the tunnels for a week.  Naturally Caltech undergrads cheated or something and did not disappoint, leading to Feynman setting up a couple of mattresses and a supremely creepy swing-set here in the steam tunnels.

Figure 7:  There was also a "SAY N2O to DRUGS" graffito somewhere.

Figure 8: It really looks to me like the one on the right is saying "Kiss me, I'm Irish", and for some reason the one on the left looks awfully good at being crazy smart.

Well, that was mostly full of figures, but I hope you got a bit of a taste of what the tunnels are like.  It is really too bad we're not allowed to go down there.  In a hypothetical world in which we were, I would surely be well on my way to mapping it and exploring every nook and cranny.

Let's take a look at my allowed domain, then.  I just moved into a new and improved office!  My lab dominates about 90% of the basement of Noyes Laboratory of Chemical Physics, which means that no one has an office with a window.  But now at least I have a couch, a table, a whiteboard and a sweet monitor.  Check it out!

Figure 9:  It is super duper comfy, cats and kittens.  Srsly.

Figure 10:  OMG SO BRIGHT.  Who do those awesome happy-looking orange lab goggles on the wall belong to?

New Website

Hello folks,

I've put together a semi-professional website and given it the honour of being www.coffeenanoparticles.com.  It just has a bit of a spiel on yours truly and background on my past research and publications.  It also has a super sweet photo of me that I chose out of every single photo ever taken to be the most representative of me.

Hey look, it's me doing SCIENCE in JAPAN. (See those squigglies to my right?)

I'm talking about it here and linking it above (hey, let's do it again!) since I know that Google crawls this page, and now Google will follow those links and crawl my new page.

Also, let's add a link here for Michael Beverland's website.  I did him a favour and gave him my stylesheet to put together a pretty website with very colour-coordinated (if a little bit camp) images.

Goodbye again, Canada. Hello Cali!

Some updates, ladies and gents.  I've graduated and convocated!  As proof, I present my refrigerator:
My Bachelor of Applied Science in Nanotechnology Engineering.  Mickey's is there too.

In other words, I have left Canada again.  Goodbye Waterloo.

Don't worry guys, I haven't gone back to the exceedingly far east or anywhere like it.  In fact I've gone to the west coast, first to LA and just a little bit backwards to Pasadena, California.  I've found a new academic home at the California Institute of Technology, or Caltech.  Now I know I said I wanted to go to Princeton or Berkeley or MIT in my last post, but come on guys, Caltech is in (or at least very near) Los Angeles!  LA comes with some perks, such as being 25-30 degrees and sunny 350 days per year.  Hollywood, Disneyland, Long Beach, Comicon, need I say more? If I do need to say more, it's arguably the best science and technology school in the world.

Besides, Princeton is in New Jersey, everyone at MIT is depressed (hopefully excepting my good friend Farnaz) and Berkeley is a STATE school (sorry Simon :-\).


Pretty much what Caltech is like.  Courtesy of NUMB3RS.

I've been here for about a week, and I've got to say that it's been quite great.  Everyone has been incredibly friendly, even by Canadian standards.  Moving in with my furry friend presented me with no problems, and the apartment is a veritable mansion by Japanese standards.  I walked into my bedroom, and it's maybe double the size of my entire house in Japan.  And there's a WHOLE APARTMENT outside!

I live about a three minute walk from my lab, which is pleasantly situated in a basement's basement.  At least it'll stay cool during the long, hard summers?  In this lab, I'm working on super hardy, super specific AND super general artificial antibodies for HIV diagnostics.  Wrap your mind around that one.  It'll keep me busy at the very least until September, when my classes begin.  At which point I'll have to prove my mettle to all the genius-level intellects running around this Institute.  

    
It's SoCal, guys, the infinite corridor can be outside.  (Also a frequent filming site for NUMB3RS, see above)

So at the rate I've been reviving this blog, by the next time I post, I'll have finished my PhD and be moving into wherever I do my post-doc.  Hard to move up from this place though.  I think this place will become a pretty great home for the next 4-8 years.

I've become such a goddamn American already.  Mickey got a big TV and America's Got Talent has been on for the last 3 hours.

Balls.

Fuji-san continued.

Here goes -- Fuji pictures as promised! It was really dark and not a lot of pictures were taken during the ascent. Most are of the spectacular view at the end. Makes sense, ne?

Fuji-san Subashiri 5th Station. This is where we took the bus to, and then climbed from here.

Maybe 1000 meters later, the Subashiri 7th Station. I enjoyed this picture so I took a picture of it. Toilets were $2.50.

Yes, I'm checking my Kindle. When Amazon says free 3G internet everywhere on earth, they mean it. Full bars at 3200 meters elevation on Fuji-san.

Mmmmm. Udon. I needed a snack at the 8th Station, this is about 3250m elevation. Water boils weird at this altitude, so the udon wasn't as cooked as I wanted.

Arianna and I at the summit obelisk. I'm very unhappy at this point. I've been climbing for like 8 hours guys, I think I look okay. Meanwhile Arianna discovers you can't do a peace sign with mittens on.

I loaded up all my layers. Another few shirts and another sweater. I was still very cold, trying to keep my arms close to my core. I'm lying on metamorphic rock. I know it's metamorphic because it's very uncomfortable and I'm on a volcano.

I eventually woke up. I'm huge in Japan. I wish I brought my puffy pink snowsuit.

Beginning of the sunrise.

We beat the crowd and had the best seats in the house. The crowd is now accumulating and flowing back down the east slope.

PRETTY!

PRETTIER!

PRETTIEST!

Starting to lighten up more... That's our cue.

We proceed to get the hell out of there. 2000 vertical meters of the descent was through a lava flow that was composed of one foot deep volcanic ash/sand. It was straight going at a >45 degree decline. You could run, but we just sludged through it. Needless to say my shoes are destroyed.

I've got some more, but that's all for now. Later, folks.

Fuji: Check.

I conquered Fuji-san. Shirt credit to Shrey. Thanks dude!

Yes those are clouds behind and below me. Yes the view was nice. Yes it was an incredibly long and tiring climb. And yes, I did it in fucking jeans. I am become regret. More pictures to come.

The single greatest human being I've yet encountered.

Good evening, cats and kittens.

Man, I have officially turned into a softy. Check out this video over at the BBC and tell me you felt nothing. I will send you a self-addressed stamped envelope to turn in your human being card.

Awwww.... Kitty!! This works even on highly educated bionic veterinarians.

Not only has this doctor devoted his research to giving prosthetic feet to cats in need (talk about a niche market), but he is an incredible badass and all around cool guy. In my opinion you can tell a lot about a person in how they treat cats (this may apply to dogs too, but I'm a cat lover). Giving amputee cats prosthetic feet is pretty much a perfect score on this test.

This doctor is a complete champ. He obviously cares a hell of a lot about this animal, he's well versed in both the aesthetic concerns of kitties (brown bionic feet on a black cat, no way) and the usefulness of duct tape in any of life's problems (you're doing it wrong). "Surgical high five, dude!" I mean, I'm not sure that's completely sterile, but it more than makes up for it in awesomeness. The nurses are just shaking their heads at this point at how badass their doc is.

Siks milyon doller kitty can leep tal bildingz, lol. But seriously folks, this probably cost a lot.

Are there any cat related nanotech innovations I can do to emulate the good Dr. Noel? This guy is just my hero.

P.S. @Mikhail: In the first sentence of the article below the video, I read "peg" as "PEG" (ie. polyethylene glycol) and was like "Damnit, I KNEW PEGylation could fix anything!" Then I was disappointed. Throw some TOPO in that shit.

P.P.S. @Jess and any other non-Gourmands: That meal I just ate was a full 2500 calories of KFC chicken and fries and chocolate chip cookie dough-based dessert. Ugghhhmmmmmm...

P.P.P.S. @All gBuzz users: I've temporarily turned off the link between here and Buzz because for some reason I'm getting a billion copies in my work inbox and it's kinda annoying. Forwarding issue, likely.

I hardly think "Coffee-stain effect" is a scientific term

Though coffee nanoparticles is the name of this blog (with a much more apropos subtitle – which I will have to change once I return to the western world. Except for the steampunk romance part – that was pure clairvoyance), I haven’t actually spoken about coffee nanoparticles since my very first (and very brief) post. There I proposed the question: “Coffee – nanoparticle suspension or dissolved solution?”

Well now, my pretties, I feel I may have come to an answer. I do a lot of reading of recently published papers in a few of my favourite scientific journals (JACS, Nanoletters, Langmuir, Advanced Functional Materials, Nature Materials/Nano, etc) since I feel like this practice increases my chances of having a basic understanding of most problems or systems I’m likely to come across in work and at school.

Yesterday I was perusing Langmuir (a surface science journal, nanos can recall that this is the Langmuir of Langmuir isotherm/Blodgett films fame) and came across this paper: “Coffee-Ring Effect-Based Three Dimensional Patterning of Micro/Nanoparticle Assembly with a Single Droplet” by a group out of Berkeley. Obviously the first word caught my eye, the rest is just icing on the mocha, so to speak. The paper is pretty self-explanatory: they just suspend some micro and nanoparticles in some solution, make a droplet, wait for it to evaporate, then watch the “coffee-ring effect” do its work in the form of a self-assembled circular microstructure composed of your particles. Take a looksee at the figure below.


Mmmm, coffee self assembly.

Anyway, this isn’t what is important at all. This is just a new take on existing methods for making such shapes and structures, which we’ve looked into ad nauseam and inscribed in the tiniest of hands on our Nanofab cheat sheets. What’s important comes later:

And I quote: “Our technique is mainly grounded on the coffee-ring effect of solutes in an evaporating suspension. When a spilled drop of coffee dries on a solid surface, it leaves a dense, ring-like deposit along the perimeter. Such ring deposits are common wherever drops containing dispersed [nano]particulate solids evaporate on a surface.”
[Editor's note: I added a prefix I think the authors accidentally missed.]

There it is, spelled out in black and white in a peer reviewed journal. He even cites a reference for the last claim, so you know it's legit. Coffee is a [nano]particulate solid dispersion. Just so we’re all clear now. I drank coffee nanoparticles this morning.
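The mechanism in that quote – evaporation is fastest at the pinned contact line, so an outward capillary flow ferries the suspended particles to the rim – is easy to caricature numerically. Here's a minimal toy sketch of it, with the real hydrodynamics replaced by an assumed outward drift plus Brownian noise, and all parameters made up by me (nothing here comes from the Berkeley paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model of the coffee-ring effect in a drying droplet of radius 1.
# Evaporation is fastest at the pinned contact line, so liquid (and the
# particles it carries) flows radially outward to replenish the edge.
# That flow is caricatured here as an outward drift growing with r.
n_particles = 5000
n_steps = 200
r = rng.uniform(0, 1, n_particles) ** 0.5  # uniform over the disc

for _ in range(n_steps):
    drift = 0.01 * r                                   # outward advection
    noise = 0.005 * rng.standard_normal(n_particles)   # Brownian jiggle
    r = np.clip(r + drift + noise, 0, 1)               # pinned at r = 1

# Compare the deposit near the rim with the middle of the droplet.
rim_fraction = np.mean(r > 0.9)
centre_fraction = np.mean(r < 0.5)
print(f"rim: {rim_fraction:.2f}, centre: {centre_fraction:.2f}")
```

Run it and nearly all of the particles pile up in the outer annulus – a dense ring-like deposit along the perimeter, just like the spilled coffee.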

Birthday Festivities in T-dot

Very tired at the moment, so only the highlights right now, cats and kittens. The destination was Tokyo (more specifically Shibuya ward) for my birthday festivities. We went to Kujira-ya (awesome whale meat restaurant) for the main course, and then back to the Lock Up izakaya (see my previous post with Nikki's Lock Up experience) for drinks and apres diner goodness.

I forgot my camera, unfortunately. I really seem to do this a lot, so I didn't get many good pictures, having only my cell phone camera at the time. I got zero pictures from the whale restaurant, which was lame. They had a great, short and to the point sign right at their entrance boldly proclaiming "WHALE MEAT ONLY". You know you're at the right place when you see this.

We had lean whale sashimi (as opposed to blubber sashimi. ick), whale blubber bacon, tempura whale, fried whale, and whale steaks. Along with rice for everyone. We all agreed the teriyaki whale steak was by far the best (followed by the fried whale, which looks like it's encased in salt), and if we ever go again, we will just buy a bunch of whale steaks. mmmmmmmmm. I also want to taste the caudal fin special. No word on what kind of whales these were, but they tasted intelligent.

There was quite a wait at the izakaya, so we grabbed some pre-game Starbucks at the busiest Starbucks in the world (but not nearly as busy as the SLC Tim Hortons) and headed back. Unfortunately in the interim we lost a bunch of people to late night cross-Japan trips and picnicking plans. But we were ready to soldier on.

I told them it was my birthday, so I got the VIP treatment with visits from monsters and murderers and everything. Plus some free cake with a sparkler, yay!

One of my drinks. It came with a pipette and everything! No dry ice in this one though. :(

Lars injecting his alcohol right into his veins. Hardcore.

Arianna got a litre of beer in a massive graduated cylinder.

We picked this plate to munch on just by looking at the pictures. Turns out it was fried chicken joints. Ugh.

These capsules were so tempting we had to order them. I tried asking the waiter what they were, but all he said was "Spirit Capsules". Skkketttcchh.

We opened one beforehand to figure out what's in them. Tasted like very very potent food-coloured whiskey or something. It had an alcohol rating of three skulls on the menu, but there's no way it could be that strong – even that amount of pure ethanol isn't much.
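A back-of-envelope check supports the three-skulls skepticism. The numbers below are my assumptions, not anything from the menu: a roughly 2 mL capsule, ethanol density 0.789 g/mL, and ~14 g of ethanol per "standard drink":

```python
# Even if the "Spirit Capsule" were pure ethanol, how does it compare
# to one standard drink?  All three inputs are assumed round numbers.
capsule_volume_ml = 2.0          # assumed capsule volume
ethanol_density = 0.789          # g/mL, pure ethanol
standard_drink_g = 14.0          # g of ethanol in one standard drink

ethanol_mass = capsule_volume_ml * ethanol_density
print(f"{ethanol_mass:.1f} g of ethanol, "
      f"{ethanol_mass / standard_drink_g:.2f} standard drinks")
# -> 1.6 g of ethanol, 0.11 standard drinks
```

So even worst-case, one capsule is about a tenth of a drink. Three skulls seems generous.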

This one was wayyy stronger. Just called "Blue Spirit" on the menu, we cooled it and then did shots out of the test tubes. Arianna complained that test tubes were less than ideal vessels for shooting out of.

At one point the lights went off (and black lights went on), followed by an invasion by monsters and murderers and stuff into our cell. Awesomely, this ended by playing the "Ghostbusters" theme song repeatedly and having a bunch of scantily clad "police-ladies" running around with cap guns shooting all the monsters.

My birthday cake with my name on it (Bureiku).

On the way back to Shibuya station. These "Who is my boss?" docomo ads are all over the place suddenly. They feature Darth Vader speaking Japanese in the commercials, and it's fantastic. There are three or four massive (10+ storey) TVs surrounding this intersection, and docomo has their Vader ad on all of them at once.

While waiting for my train, found more of these. BTW, they point towards this website: docomo-1-1.jp. Here you can put your name in katakana (no English accepted, sorry. ブレイクファロー is mine), then take a picture of yourself with your webcam (click the webcam button), then keep clicking the orange "Next"/"I accept" button until you get to a video. Then enjoy.

Ad on the train back home. It's business time.

After the last train into my home station for the night, these two guys were practicing their break dancing in front of the mirrored windows of one of the stores. Well, the one on the left was breakdancing (fairly well), but the guy in yellow was just watching himself shake his hips. While passers-by watched in awe and disgust.

Thanks Arianna, I shall treasure this symbol of all that I hate in Japan.

Thanks Lars for liberating the 1-litre graduated cylinder/pitcher from the izakaya. You so sneaky.

Just bought an electric violin

I mentioned my intentions a while back, and have been surreptitiously checking the used electric violin section of eBay for good deals. I found that all the violins were either like $50-100, and obviously pieces of shit made in China, or like $400+ (and in many cases $1500+) and therefore wayyy too expensive for me. I ended up at a happy medium, buying one for about $200, which is Korean-made (Korean=Quality, my friends. Believe you me), with a built-in preamp, and looks pretty sweet to boot.

Turns out the guy that I bought it from was an English teacher a stone's throw away from here near Mt. Fuji (well, several stone's throws, I guess. About 40 klicks.) and just returned to the US 3 weeks ago. So now I have to pay like $70 shipping. Lame. Anyway, I always feel reassured that I'm not getting swindled when I deal with a real life person, and one who is smart enough to append a spam-flag to the end of the Gmail email address he uses for eBay registration. Good on you, Joshua.

I think I'm going to have to buy new strings for it, as apparently all e-violins come with sucky strings by default, but I've been to a good musical instruments store in a mall nearby, so that shouldn't be too much of a problem. I've only replaced violin strings a couple times though, so hopefully it's not too different/difficult with this one.

I assume that I can get violin sheet music somewhere on the internet, else I'm going to have to resort to playing the few songs that the Suzuki-method managed to permanently instill in my brain like 10 years ago (Witch's Dance, Devil's Dream, Mississippi Reel, Gavotte in G-minor, Star Wars theme, etc). But either way, I'm pretty excited about getting it, as it'll be fantastic to have something constructive to do to while away the hours of confinement in this room when I'm too exhausted from work and/or generally too complacent to go out and do stuff.

An Apology to Socrates

I was watching House M.D. yesterday, and one of the interim solutions to the mysterious ailments experienced by the patient of the week (who was a swords and sorcerers playacting fanatic) was Hemlock poisoning. Sure, everyone knows about Hemlock – it was the sentence for one of the most famous trials of all time – but when House storms out of the apothecary shop with his brilliant Hemlock deduction, his team immediately responds with “Okay, I’ll check for piperidine alkaloids”.

“Piperidine alkaloids!” I exclaim, “I have some of those, and the MSDS didn’t mention anything about notorious neurotoxicity causing death of premodern philosophers! To Wikipedia, Batman!”

So hemlock leaves (and roots) are chock full of a bunch of different piperidine alkaloids which are more or less all poisonous, their neurotoxicity stemming from being competitive agonists to the active site of certain nicotinic receptors, causing muscular paralysis, preventing breathing, and resulting in eventual death. Other molecules which do the same thing (to lesser and greater degrees, respectively) are nicotine and cobra poison.

For those interested in the structures of these badboys. Taken from some ancient ACS paper (Leete, E, “Biosynthesis of the hemlock and related piperidine alkaloids”, Accounts of Chemical Research 1971 4 (3), 100-107)

Well anyway, for reasons outside the scope of this post (and more importantly, inside several NDAs), I happen to have a supply of conveniently azide functionalized piperidine alkaloid (very similar structurally to coniine, which is the most active and poisonous of the chemicals contained in hemlock). So naturally, as I didn’t have anything better to do today, I decided to make the ultimate philosopher killing machine. It’s a piece of molecular art, really. I call it “Hypervalent Hemlock”.

Just in case it wasn't deadly enough already.

Yes, that’s a 3rd generation triazole dendrimer chock full of piperidine alkaloid goodness. I have no idea about the actual mechanism behind the binding of coniine to the nicotinic receptors, and so I can't say whether the right side of the molecule is even the part that should be sticking out here, but I like to think that this molecule can agonize a dozen receptors in one go, thereby increasing its Socrates killing power 12-fold.
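The "dozen" is just the usual dendrimer end-group arithmetic: an idealized dendrimer with an n-armed core whose branch points each split b ways per generation carries n·b^(g−1) peripheral groups at generation g. A quick sketch, assuming a trivalent core with twofold branching (my reading of the structure above, not a measured fact):

```python
# Peripheral (end) groups on an idealized dendrimer: n_core arms off the
# core, each branch point splitting into b branches per generation.
def peripheral_groups(n_core: int, b: int, generation: int) -> int:
    """Number of end groups after `generation` rounds of branching."""
    return n_core * b ** (generation - 1)

# Assumed geometry for "Hypervalent Hemlock": 3-armed core, 2-fold
# branching, generation 3 -> the dozen alkaloids claimed above.
print(peripheral_groups(n_core=3, b=2, generation=3))  # -> 12
```

Bump it to generation 4 and you'd get 24 alkaloids per molecule, for anyone looking to double their philosopher-killing throughput.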

It's not so pretty IRL, guys.

Science is awesome. Think I can make a work term report out of this?