"unicode" entries

Four short links: 5 March 2012

Video Encoding, Content Identification, Mobile Numbers, and Unicode Fun

  1. Pirates Adopt H.264 — no more XViD encoded avi files, now it’s x264. I’m impressed by the rigid rules and structure of The Scene.
  2. YouTube’s ContentID Disputes Are Judged By The Accuser (Andy Baio) — the last couple years have seen a dramatic rise in Content ID abuse, using it for purposes that it was never intended. Scammers are using Content ID to steal ad revenue from YouTube video creators en masse, with some companies claiming content they don’t own, deliberately or not. The inability to understand context and parody regularly leads to “fair use” videos getting blocked, muted or monetized.
  3. The Month of 50% in Mobile (Luke Wroblewski) — 47.6% of mobile Internet users use native mobile apps and 47.5% use the Web browser on their devices. This is the first time (in ComScore data) native apps have had more use than the browser.
  4. Fake Unicode Consortium — excellent collection of better names for Unicode characters. My favourite: U+0CA0: MONOCLE OF DISAPPROVAL. (via Tom Christiansen)
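
For comparison with the joke name above, the official name of a code point can be looked up in the Unicode Character Database. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `official_name` is made up for illustration and is not part of any site linked here.

```python
import unicodedata


def official_name(codepoint: int) -> str:
    """Return the Unicode Character Database name for a code point."""
    # unicodedata.name raises ValueError for unnamed or unassigned code points.
    return unicodedata.name(chr(codepoint))


if __name__ == "__main__":
    # U+0CA0 is the character nicknamed "MONOCLE OF DISAPPROVAL" above.
    print(official_name(0x0CA0))
```

The lookup reports a Kannada letter, which is what makes the alternative name funny.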
Four short links: 8 February 2012

Text Mining, Unstoppable Sociality, Unicode Fun, and Scholarly Publishing

  1. Mavuno — an open source, modular, scalable text mining toolkit built upon Hadoop. (Apache-licensed)
  2. Cow Clicker — Wired profile of Cow Clicker creator Ian Bogost. I was impressed by Cow Clickers […] have turned what was intended to be a vapid experience into a source of camaraderie and creativity. People create communities around social activities, even when they are antisocial. (via BoingBoing)
  3. Unicode Has a Pile of Poo Character (BoingBoing) — this is perfect.
  4. The Research Works Act and the Breakdown of Mutual Incomprehension (Cameron Neylon) — an excellent summary of how researchers and publishers view each other and their place in the world.
Four short links: 31 May 2011

Disease-B-Gone, Quake Game, Text Adventures, and Unicoddling

  1. Rinderpest Eradicated — only the second disease that mankind has managed to eradicate. This one was a measles-like virus that killed cattle and caused famines. A reminder of how astonishingly difficult it is to eradicate disease, but what a massive victory it is when it happens. (via Courtney Johnston)
  2. Magnetic South — the 6.3 earthquake that trashed Christchurch, New Zealand, has presented the city with a tabula rasa (or, rather, tabula rubble) for the rebuild: what should they build, how, and where? The good citizens are working on this question in many ways, one of which is this online game based on Institute for the Future’s Foresight Engine.
  3. TOPS-20 in a Box — write FORTRAN code on an emulated PDP-10 running TOPS-20 and, most delightfully, play the original Adventure as written by Crowther and finished by Woods. It’s like emulating the Big Bang for text adventures. When you’re done, admire the scholarship in this analysis of the original to see how much Woods added. (Text adventures are the game version of command-line interfaces, and we still have much to learn from them.)
  4. Why Does Modern Perl Avoid UTF-8 By Default? (StackOverflow) — check out the very long and detailed answer by my coauthor, Tom Christiansen, on exactly how many thorns and traps lie in wait for the unwary “it should just WORK”er. Skip down to the “Assume Brokenness” section for the full horror. Tom’s been working with linguists and revising the Unicode chapters of the Camel, so asking “why can’t it just work” is like asking a war veteran “why don’t you just shoot all the bad guys?”.
Four short links: 18 April 2011

Community, Metrics, Sensors, and Unicode

  1. Your Community is Your Best Feature — Gina Trapani’s CodeConf talk: useful, true, and moving. There’s not much in this world that has all three of those attributes.
  2. Metrics Everywhere — another CodeConf talk, this time explaining Yammer’s use of metrics to quantify the actual state of their operations. Nice philosophical guide to the different ways you want to measure things (gauges, counters, meters, histograms, and timers). I agree with the first half, but must say that it will always be an uphill battle to craft a panegyric that will make hearts and minds soar at the mention of “business value”. Such an ugly phrase for such an important idea. (via Bryce Roberts)
  3. On Earthquakes in Tokyo (Bunnie Huang) — Personal earthquake alarms are quite popular in Tokyo. Just as lightning precedes thunder, these alarms give you a few seconds warning to an incoming tremor. The alarm has a distinct sound, and this leads to a kind of pavlovian conditioning. All conversation stops, and everyone just waits in a state of heightened awareness, since the alarm can’t tell you how big it is—it just tells you one is coming. You can see the fight or flight gears turning in everyone’s heads. Some people cry; some people laugh; some people start texting furiously; others just sit and wait. Information won’t provoke the same reaction in everyone: for some it’s impending doom, for others another day at the office. Data is not neutral; it requires interpretation and context.
  4. AccentuateUs — Firefox plugin to Unicodify text (so if you type “cafe”, the software turns it into “café”). The math behind it is explained on the dataists blog. There’s an API and other interfaces, even a vim plugin.
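
The reverse of what AccentuateUs does, stripping accents, is purely mechanical, which helps show why restoring them needs statistics. The sketch below is not AccentuateUs's method; it only uses Python's standard `unicodedata` module, and the function name `strip_accents` is an assumption made for illustration.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    # NFD splits "é" into "e" followed by U+0301 COMBINING ACUTE ACCENT.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining returns 0 for characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_accents("café"))
```

Because differently accented words collapse to the same plain string, going back from “cafe” to “café” is a guess that has to be informed by language data.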
Four short links: 2 March 2011

Python Unicode, Cognitive Enhancement, Journal Balk, Engineering SaaS

  1. Unicode in Python, Completely Demystified — a good introduction to Unicode in Python, which helped me with some code. (via Hacker News)
  2. A Ban on Brain-Boosting Drugs (Chronicle of Higher Education) — Simply calling the use of study drugs “unfair” tells us nothing about why colleges should ban them. If such drugs really do improve academic performance among healthy students (and the evidence is scant), shouldn’t colleges put them in the drinking water instead? After all, it would be unfair to permit wealthy students to use them if less privileged students can’t afford them. As we start to hack our bodies and minds, we’ll face more questions about legitimacy and ethics of those actions. Not, of course, about using coffee and Coca-Cola, ubiquitous performance-enhancing stimulants that are mysteriously absent from bans and prohibitions.
  3. Copywrongs — Matt Blaze spits the dummy on IEEE and ACM copyright policies. In particular, the IEEE is explicitly preventing authors from distributing copies of the final paper. We write scientific papers first and last because we want them read. When papers were disseminated solely in print form it might have been reasonable to expect authors to donate the copyright in exchange for production and distribution. Today, of course, this model seems, at best, quaintly out of touch with the needs of researchers and academics who no longer desire or tolerate the delay and expense of seeking out printed copies of far-flung documents. We expect to find it on the open web, and not hidden behind a paywall, either.
  4. On the Engineering of SaaS — An upgrade process, for example, is an entirely different beast. Making it robust and repeatable is far less important than making it quick and reversible. This is because the upgrade only ever happens once: on your install. Also, it only ever has to work right in one, exact variant of the environment: yours. And while typical customers of software can schedule an outage to perform an upgrade, scheduling downtime in SaaS is nearly impossible. So, you must be able to deploy new releases quickly, if not entirely seamlessly — and in the event of failure, roll back just as rapidly.
Four short links: 24 December 2010

Carbon Offsets, Good IDN, People Don't Suck, and Passive Lifeblogging

  1. Holiday Carbon Offsets — buy carbon offsets against Santa’s trip, a stockingful of coal, or this year’s Reindeer Games. (via Val Aurora on Twitter)
  2. Sad Story of the Snowman — the best use of Internationalized Domain Names yet.
  3. Katie, Starwars Geek (CNN) — best use of the Internet this year.
  4. Everything The Internet Knows About Me Because I Asked It To (WSJ) — passive lifeblogging. (via Keith on Twitter)