Sunday, November 22, 2015

Rdio was too good to last

When Apple killed Lala, there was a bad guy with motive, opportunity and a smoking gun. In the case of Rdio's recently announced demise, it's different. A postmortem by Casey Newton of the Verge explains Why Rdio died. The team had the design chops, engineering talent and love of music to create a fantastic product but lacked the business and marketing savvy to make it pay off.

Rdio's star feature was certainly design. The site enabled individuals to express themselves and relate to others - for their enthusiasm to feed off one another. You could follow people with intersecting tastes. New listening was suggested algorithmically based on what was "On the rise in your network". You could comment and respond, lovingly curate playlists and follow activity by your musical soul-mates. The social dimension is key to something as personal and tribal as music.

If you wanted to take art pop seriously, do a deep dive into electronic music, exhaustively survey the Zappa catalog, or peruse the archives of 5 years of Sunday Jazz Brunch selections, your people were there. Any platform will play music you already know; Rdio was a place to explore.

In some ways, credit belongs to the users themselves - those who shoehorned rich conversations into a relatively bare-bones comment feature, repurposing shared playlists as the equivalent of discussion forums. In one case, Community Playlist the Trilogy had some 3,522 comments and 116 collaborators.

It had, in a word, community.


Refugees looking for a new musical home can find lots of resources on the Rdio lover's slack channel, including a compilation of tools for exporting playlists and other digital assets. A Python script by Jesse Mullan (playlist_helper), which will soon become the official data exporter, worked nicely for me.

There were several calls for a platform-neutral place for the community to live, independent of which streaming service folks end up migrating to. Some nascent possibilities are The Playlist or Hatchet. fills that role for me, at least for now. Maybe the Rdio Lover's slack channel will survive beyond the transition period.


Which service comes the closest to Rdio? Users on the slack channel have compiled a helpful guide Rdio features compared to the competitors. Roughly in order of how interesting they look to me, the main contenders are:


In A Eulogy for Rdio in the Atlantic, Robinson Meyer calls Rdio "a better streaming service in most every way". So, why did a great service with an intensely loyal following fail?

The economics of digital music are tricky. None of the streaming services are really making money. Rdio's $1.5M in monthly revenue, corresponding to perhaps 150,000 paying users, and $100-150k in advertising couldn't cover its costs of roughly $4M, mainly for 140 employees and royalties. This explains the nasty pile-up of $220 million in debt. Music has been called "Too free to be expensive. Too expensive to be free."

Pandora is buying Rdio's intellectual property and taking on some of the talent with the intention of introducing their own on-demand streaming service some time in 2016. Interestingly, customer data was not part of that transaction.

One message I hope no one takes away is that community doesn't matter. It's one of the few ways for streaming services to differentiate themselves. Without it, you feel "solitary, lonely and probed" in the characteristic phrasing of CAW a.k.a. The Aquatic Ape.

So long

So, we're left with a reminder that the best doesn't always win. No doubt, we "snobby album purists" will find or co-opt another platform on which to indulge our musical obsessions. Keep in touch, music peeps:

“To all of you that have expanded my musical experience for the past six or so years - thank you, thank you thank you.”
“Man, I'm gonna miss my playlists.”
“I spent A LOT of time here.”
“Thank you all for introducing me to some truly great music (and some truly terrible, which I enjoyed nearly as much).”
“from the start it has been my most active social network”
“perhaps the kindest community in online music.”
“People left comments on albums, and, lo and behold, the writing was good and interesting.”
“you all have been invaluable in helping me not just discover new music, but in helping me open my mind to new kinds of music.”
“such a welcoming and amazing crew of fellow travelers”

...parting comments from Rdio users collected by fangoguagua.

Tuesday, August 04, 2015

Hacking Zebrafish thoughts

The last lab from Scalable Machine Learning with Spark features a guest lecture by Jeremy Freeman, a professor of neuroscience at Janelia Farm Research Campus.

His group produced this gorgeous video of a living zebrafish brain. Little fish thoughts sparkle away, made visible by a technique called light-sheet fluorescence microscopy, in which proteins engineered to light up when neurons fire are introduced into the fish.

The lab covers principal component analysis in a lively way. Principal components are extracted from time-series data and mapped onto an HSV color wheel and used to color an image of the zebrafish brain. In the process, we use some fun matrix manipulation to aggregate the time series data in two different ways - by time relative to the start of a visual stimulus and by the directionality of the stimulus (shown below).
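For a flavor of the technique, here's a minimal sketch in plain NumPy rather than Spark. The data are synthetic stand-ins for the real recordings, and the function names (`pca_scores`, `scores_to_rgb`) are made up for illustration, not taken from the lab. It assumes each row is one pixel's time series, projects the rows onto the top two principal components, then turns the angle of each 2-D score into a hue and its magnitude into brightness.

```python
import colorsys

import numpy as np


def pca_scores(data, k=2):
    """Project rows of `data` (n_pixels x n_timepoints) onto the top k principal components."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return centered @ top


def scores_to_rgb(scores):
    """Map 2-D scores to colors: angle -> hue, relative magnitude -> value."""
    angle = np.arctan2(scores[:, 1], scores[:, 0])
    hue = (angle + np.pi) / (2 * np.pi)
    mag = np.hypot(scores[:, 0], scores[:, 1])
    value = mag / mag.max() if mag.max() > 0 else mag
    return np.array([colorsys.hsv_to_rgb(h % 1.0, 1.0, v) for h, v in zip(hue, value)])


# Synthetic stand-in (an assumption, not real data): 100 "pixels" x 50 time points,
# each pixel a noisy sine wave with its own phase.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 50)
phases = rng.uniform(0, 2 * np.pi, size=100)
data = np.sin(t[None, :] + phases[:, None]) + 0.1 * rng.normal(size=(100, 50))

scores = pca_scores(data, k=2)   # one (pc1, pc2) pair per pixel
colors = scores_to_rgb(scores)   # one (r, g, b) triple per pixel
```

Reshaping `colors` back to the image dimensions would give a colored picture in which pixels with similar temporal behavior share a hue.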

The whole series of labs from the Spark classes was nicely done, but this was an especially fun way to finish it out.

Check out the Freeman Lab's papers:

Tuesday, July 21, 2015

Machine learning on music data

The 3rd lab from Scalable Machine Learning with Spark has you predict the year a song was published based on features from the Million Song Dataset. How much farther could you take machine analysis of music? Music has so much structure that's so apparent to our ears. Wouldn't it be cool to be able to parse out that structure algorithmically? Turns out, you can.
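One way to approach a prediction like this is linear regression. As a rough sketch, and not the lab's code, here's a least-squares fit by batch gradient descent in plain Python. The toy data (one made-up feature, with labels standing in for years shifted to start near zero) and all of the names are assumptions for illustration; nothing here comes from the Million Song Dataset.

```python
def predict(weights, features):
    """Dot product of the weight vector and a feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def fit_gradient_descent(points, n_iters=2000, alpha=0.5):
    """Fit least-squares weights by batch gradient descent.

    `points` is a list of (label, features) pairs; each feature vector
    carries a leading 1.0 so the first weight acts as the intercept.
    """
    n = len(points)
    dim = len(points[0][1])
    weights = [0.0] * dim
    for _ in range(n_iters):
        grad = [0.0] * dim
        for label, features in points:
            error = predict(weights, features) - label
            for j, x in enumerate(features):
                grad[j] += error * x
        weights = [w - alpha * g / n for w, g in zip(weights, grad)]
    return weights


def rmse(weights, points):
    """Root mean squared error of the model on `points`."""
    squared = [(predict(weights, f) - y) ** 2 for y, f in points]
    return (sum(squared) / len(points)) ** 0.5


# Made-up toy data: label = 40 + 30 * x for x = 0.0, 0.1, ..., 1.0
points = [(40.0 + 30.0 * (i / 10), [1.0, i / 10]) for i in range(11)]

weights = fit_gradient_descent(points)

# Baseline for comparison: always predict the average label
mean_label = sum(y for y, _ in points) / len(points)
baseline_rmse = rmse([mean_label, 0.0], points)
model_rmse = rmse(weights, points)
```

Comparing `model_rmse` against `baseline_rmse` is a quick sanity check that the features are adding anything beyond guessing the average year.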

Apparently The International Society for Music Information Retrieval (ISMIR) is the place to go for this sort of thing. A few papers, based on minutes of rigorous research (aka random googling):

In addition to inferring a song's internal structure, you might want to relate its acoustic features to styles, moods or time periods (as we did in the lab). For that, you'll want music metadata from sources like:

There's a paper on The Million Song Dataset by two researchers at Columbia's EE department and two more at the Echo Nest.

Even Google is interested in the topic: Sharing Learned Latent Representations For Music Audio Classification And Similarity.

Tangentially related, a group out of Cambridge and Stanford say Musical Preferences are Linked to Cognitive Styles. I fear what my musical tastes would reveal about my warped cognitive style.

Wednesday, July 08, 2015

Scalable Machine Learning with Spark class on edX

Introduction to Big Data with Apache Spark is an online class hosted on edX that just finished. Its follow-up Scalable Machine Learning with Spark just got started.

If you want to learn Spark - and who doesn't? - sign up.

Spark is a successor to Hadoop that comes out of the AMPLab at Berkeley. It's faster for many operations due to keeping data in memory, and the programming model feels more flexible in comparison to Hadoop's rigid framework. The AMPLab provides a suite of related tools including support for machine learning, graphs, SQL and streaming. While Hadoop is most at home with batch processing, Spark is a little better suited to interactive work.

The first class was quick and easy, covering Spark and RDDs through PySpark. No brain stretching on the order of Daphne Koller's Probabilistic Graphical Models to be found here. The lectures stuck to the "applied" aspects, but that's OK. You can always hit the papers to go deeper. The labs were fun and effective at getting you up to speed:

Labs for the first class:

  • Word count, the hello world of map-reduce
  • Analysis of web server log files
  • Entity resolution using a bag-of-words approach
  • Collaborative filtering on a movie ratings database. Apparently, I should watch these: Seven Samurai, Annie Hall, Akira, Stop Making Sense, Chungking Express.
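To give a sense of the first lab's shape without needing a Spark installation, here's a plain-Python sketch of word count in the map-reduce style. It is not the lab's code, and the helper names are made up. In PySpark the same steps would typically be spelled with the RDD methods `flatMap`, `map` and `reduceByKey`.

```python
import re
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            yield (word, 1)


def reduce_by_key(pairs):
    """Sum the values for each key, playing the role of a reduce-by-key step."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Tiny made-up input standing in for a real corpus
lines = ["Just keep swimming", "just keep swimming swimming"]
counts = reduce_by_key(map_phase(lines))
```

The point of the split is that the map step looks at one line at a time and the reduce step only needs pairs that share a key, which is what lets a framework spread the work across machines.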

The second installment looks to be very cool, delving deeper into MLlib, the AMPLab's machine learning library for Spark. Its labs cover:

  • Musicology: predict the release year of a song given a set of audio features
  • Prediction of click-through rates
  • Neuroimaging Analysis on brain activity of zebrafish (which I suspect is the phrase "Just keep swimming" over and over) done in collaboration with Jeremy Freeman of the Janelia Research Campus.

The labs for both classes are authored as IPython notebooks in the amazingly cool Jupyter framework where prose, graphics and executable code combine to make a really nice learning environment.

Echoing my own digital hoarder tendencies, the first course is liberally peppered with links, which I've dutifully culled and categorized for your clicking compulsion:

Big Data Hype


Data Cleaning



The Data Science Process

In case you're still wondering what data scientists actually do, here it is according to...

Jim Gray

  • Capture
  • Curate
  • Communicate

Ben Fry

  • Acquire
  • Parse
  • Filter
  • Mine
  • Represent
  • Refine
  • Interact

Jeff Hammerbacher

  • Identify problem
  • Instrument data sources
  • Collect data
  • Prepare data (integrate, transform, clean, filter, aggregate)
  • Build model
  • Evaluate model
  • Communicate results

...and don't forget: Jeffrey Leek and Hadley Wickham.

Tuesday, June 02, 2015

Beyond PEP 8 -- Best practices for beautiful intelligible code

I didn't really mean to become a Python programmer. I was on my way to something with a little more rocket-science feel. R, Scala, Haskell, maybe. But, since I'm here, I may as well learn something about how to do it right. In this respect, I've become a fan of Raymond Hettinger.

Python coders will enjoy and benefit from Raymond's excellent talk given at PyCon 2015 about Python style, Beyond PEP 8 -- Best practices for beautiful intelligible code.

"Who should PEP-8-ify code? The author. PEP 8 unto thyself not unto others."

To Hettinger, PEP-8 is not a weapon for bludgeoning rival developers into submission. Going beyond PEP 8 is about paying attention to the stuff that really matters - using language features like magic methods, properties, iterators and context managers. Business logic should be clear and float to the top. In short, writing beautiful idiomatic Pythonic code.
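As a small illustration of those four features, here's a sketch built around a hypothetical `Playlist` class and `editing` helper. Both are invented for this post and are not the example from the talk.

```python
from contextlib import contextmanager


class Playlist:
    """Hypothetical example class, not taken from the talk."""

    def __init__(self, name, tracks=()):
        self.name = name
        self._tracks = list(tracks)

    def __len__(self):
        # Magic method: lets callers write len(playlist)
        return len(self._tracks)

    def __iter__(self):
        # Iterator protocol: lets callers loop over the playlist directly
        return iter(self._tracks)

    @property
    def is_empty(self):
        # Property: attribute-style access instead of a getter method
        return not self._tracks

    def add(self, track):
        self._tracks.append(track)


@contextmanager
def editing(playlist, log):
    """Context manager: setup and teardown stay out of the business logic."""
    log.append("open " + playlist.name)
    try:
        yield playlist
    finally:
        log.append("close " + playlist.name)


log = []
with editing(Playlist("Sunday Jazz Brunch"), log) as playlist:
    playlist.add("So What")
    playlist.add("Naima")

titles = [track.upper() for track in playlist]
```

The three lines inside the `with` block are the business logic. The bookkeeping lives in the class and the context manager, where it doesn't crowd out what the code is actually for.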

There are plenty more videos from PyCon 2015 where that one came from.

Monday, March 09, 2015

Extended Lake Union Loop

The standard running loop around Lake Union is a touch over 6 miles. With the addition of a side loop around Portage Bay, you can bring it up to 8 and a half, taking in a bit of UW's campus and crossing over the cut into Montlake. Sticking to the water's edge keeps the terrain nice and flat, but if you want some climbing, head up into Capitol Hill via Interlaken Park.

Here, I've factored in a stop at PCC for a cold drink.

Tuesday, January 27, 2015

Haskell class wrap-up

[From the old-posts-that-I've-sat-on-for-entirely-too-long-for-no-apparent-reason department...]

Back in December, I finished FP101x, Introduction to Functional Programming. I'm stoked that I finally learned me a (little) Haskell, after wanting to get around to it for so long.

The first part of the course was very straightforward, covering the basics of programming in the functional style. But the difficulty ramped up quickly.

A couple of labs were particularly mind-bending, and not just for me, judging by the message boards. Both were based on Functional Pearl papers and featured monads prominently. The first was on monad parser combinators and the second was based on A Poor Man's Concurrency Monad. Combining concurrency (of a simple kind), monads and continuation passing is a lot to throw at people at once.
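To make the first of those ideas a little more concrete, here's a loose Python transliteration of monadic parser combinators. The course and the papers use Haskell, so this is only a sketch of the idea. It assumes the "list of successes" representation, where a parser is a function from a string to a list of (value, remaining text) pairs, with an empty list meaning failure. All of the names are my own.

```python
def unit(value):
    """Parser that consumes nothing and succeeds with `value` (monadic return)."""
    return lambda text: [(value, text)]


def bind(parser, func):
    """Run `parser`, then pass each result to `func` to get the next parser (monadic bind)."""
    def bound(text):
        return [pair
                for value, rest in parser(text)
                for pair in func(value)(rest)]
    return bound


def satisfy(predicate):
    """Parser for a single character that passes `predicate`."""
    def parse(text):
        if text and predicate(text[0]):
            return [(text[0], text[1:])]
        return []
    return parse


def choice(first, second):
    """Try `first`; fall back to `second` only if `first` fails."""
    return lambda text: first(text) or second(text)


def many(parser):
    """Zero or more repetitions of `parser`."""
    return choice(many1(parser), unit([]))


def many1(parser):
    """One or more repetitions of `parser`."""
    return bind(parser, lambda head:
                bind(many(parser), lambda tail: unit([head] + tail)))


digit = satisfy(str.isdigit)
# A number is one or more digits, converted to an int
number = bind(many1(digit), lambda digits: unit(int("".join(digits))))

parsed = number("42 tracks")
failed = number("tracks")
```

Everything bigger than `satisfy` is assembled from `unit` and `bind`, which is the part that makes the parser a monad. Haskell's do-notation hides the nested lambdas that Python forces you to write out.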

The abrupt shift to more challenging material is part of a philosophy of "teaching the students to fish for themselves". So is introducing new material in the labs rather than in the lectures. This style of teaching alienated a number of students. It's not my favorite, but I can roll with it.

Just be aware that the course requires some self-directed additional reading, and don't flail around trying to solve the homework without sufficient information.

More Haskell

Now that the class is over, I'd like to find time to continue learning Haskell:

One reason I wanted to learn Haskell is to be able to read some of the Haskell-ish parts of the programming languages literature:

Monday, January 12, 2015

Brave Genius

Brave Genius is an unlikely dual biography of a biologist and a writer who shared a friendship and a common philosophy. Both were active in the French resistance to the German Occupation and both would later receive a Nobel prize. Sean B. Carroll forges an inspiring story from seemingly incongruous elements: the desperate defiance of a few in an occupied country, the exhilarating pursuit of an open scientific question, and a lonely stand on the moral high ground.

In 1940, Jacques Monod was a newly married father of twins and a researcher at the Sorbonne. Albert Camus, having already published a couple of books of essays, departed his native Algeria for France in March of that year to find work.

On May 10, 1940, German troops crossed into Holland and Belgium. Panzers raced towards the Atlantic coast, severing Allied lines and stranding French and British troops in the Low Countries. French defenses collapsed and Germans arrived in an undefended Paris on June 14. The armistice signed on June 22 marked the beginning of four years of occupation.

During those years, Camus edited and wrote for the underground newspaper Combat urging resistance to the occupation. As the tide of the war turned, Monod organized sabotage attacks and armed resistance ahead of the approaching liberators.

“I have always believed that if people who placed their hopes in the human condition were mad, those who despaired of events were cowards. Henceforth, there will be only one honorable choice: to wager everything on the belief that in the end words will prove stronger than bullets.” Camus, Combat (November 30, 1946)

François Jacob, André Lwoff and Jacques Monod were awarded a Nobel prize in 1965 for their work on the control of gene expression, elucidating the regulation of the lac operon by which bacteria switch on metabolism of the sugar lactose.

In his writing, Camus confronts the absurdity of the human search for clarity and meaning in a world that offers only indifference. The attempt to derive meaning and morality without resort to mysticism links Camus's philosophy to Monod's scientific work, which provided some of the first direct evidence that life is mechanistic rather than the result of some magical "vital force" and that its workings could be understood.

“The scientific approach reveals to Man that he is an accident, almost a stranger in the universe.” Monod, in On Values in the Age of Science (1969)

“One of the great problems of philosophy, is the relationship between the realm of knowledge and the realm of values. Knowledge is what is; values are what ought to be. I would say that all traditional philosophies up to and including Marxism have tried to derive the 'ought' from the 'is.' My point of view is that this is impossible.” Monod

Carroll, a biologist himself, embeds philosophy and science into the personal lives of his protagonists and the geopolitical events unfolding around them. Both men did brilliant work in the darkest of times, and did so not by retreating but by fully engaging at great risk with the struggles that faced them. The book serves as a warning of what happens when good people overlook the malfeasance of their leaders, but also as confirmation of the resilience of intellect, creativity and humanity.


Sunday, January 04, 2015

The Master Switch

The Master Switch: The Rise and Fall of Information Empires was described as "essential reading" by my boss's boss. If you're at all interested in the interplay of technology, economics and politics, I think you'll agree.

Author Tim Wu is the originator of the term "net neutrality" and a law professor at Columbia. He has written a fast-forward history of the information technology industry focusing on the people and corporations that have, over time, controlled the commanding heights of the information economy. The book examines the cartels that held sway over telephone, radio, film, and television leading up to the question of whether the internet will also come to fall under similar domination.

The cycle is the author's term for the progression of any given technology from the wide-open wild-west early days through a process of integration and consolidation to an end state of oligopoly or monopoly. This stasis eventually gets disrupted by newer technology or government intervention, leading to another open phase and a new round of the cycle, empires rising and falling in the process. "The one-time revolutionaries always become the next generation of dictators. That's why we need, in technology, another generation of revolutionaries to upend them."[1]

Open vs. closed systems

The book revolves around the virtues and vices of open and closed systems. Open systems are more adaptable and democratic but have trouble matching the stability, security and efficiency of closed systems. Open systems embrace the advantages of decentralization as espoused in different ways by Friedrich Hayek and Jane Jacobs. But, integrated centralized systems can be reliable and convenient.

Closed systems, of course, appeal to empire builders such as Theodore Vail, who created the AT&T Bell System. Wu's knack for sketch biography is put to good use profiling these power-hungry moguls and the often utopian upstarts that seek to dethrone them. We meet titans, like Vail, and get a glimpse into the sometimes contradictory character traits it takes to control an information empire, for example: David Sarnoff, who ruled the Radio Corporation of America (RCA) and NBC; John Reith, founder of the BBC; Adolph Zukor, who started Paramount Pictures; and Ted Turner, creator of CNN and former head of Time Warner. We also meet hackers like early radio enthusiast Lee De Forest and Edwin Armstrong, the suppressed inventor of FM radio.

The capture of the Internet?

The American system attempts to carefully balance power within the government, but takes a laissez-faire approach to private power. If Wu is right and we let things take their natural course, the openness that now characterizes the Internet - the "integrity of the Internet itself as a reliable, independent, and open structure"[2] - may be lost to a period of lockdown. Network effects, the power of integration and economies of scale favor the monopolist. Consumers may decide to favor consistency and convenience over openness and choice only to regret it later. If this is the case, the Internet will not remain open automatically but only with concerted effort.

The remedy Wu proposes is a principle of separation akin to the separation of church and state or the separation of powers within the branches of the American government. The common carrier obligation of all infrastructure providers implies net neutrality and opposes vertical integration across layers of the network stack. Technology leaders would be expected to self-regulate based on a sense of public duty. The FCC should pursue enforcement with an eye to the special role of information technology in a democratic society. Anti-trust regulation is the back-up, when it's time to bring out the big guns.

Fight on

The Master Switch gives a deeper perspective on the great game playing out in the technology sector. After reading it, you'll recognize the historical themes threading through the open-source movement, the Apple vs Google skirmishes or 2012's battle that defeated the SOPA / PIPA acts. The fight over the future of the Internet is surely not over.