"data" entries

Four short links: 21 April 2016

Bitcoin with Identity, Hardware is Hard, Data Test Suites, and Internet Voting

by Nat Torkington | @gnat | +Nat Torkington | April 21, 2016

Bribing Miners to Regulate Bitcoin — interesting! A somewhat conspiracy-theoretical take on an MIT proposal to layer identity onto Bitcoin. Features repurposed DRM tech, no less.
Tesla Model X Quality Issues (Consumer Reports) — hardware is hard.
Data Proofer — open source software that’s test cases for your data, to help ensure you’re not pushing corrupt data into production.
Internet Voting? Really? (YouTube) — TEDx talk by Andrew Appel comparing physical with online voting. Very easy to follow for the non-technical.
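To make the "test cases for your data" idea in the Data Proofer link concrete, here is a minimal sketch in plain Python. It does not use Data Proofer's own API; the function names, the field names (`id`, `percent`), and the sample rows are all hypothetical, chosen only to illustrate the kind of check you might run before data reaches production.

```python
from collections import Counter


def missing_values(rows, field):
    """Return indices of rows where `field` is absent, None, or empty."""
    return [i for i, row in enumerate(rows) if row.get(field) in (None, "")]


def duplicate_values(rows, field):
    """Return the non-missing values of `field` that appear more than once."""
    counts = Counter(
        row.get(field) for row in rows if row.get(field) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


def out_of_range(rows, field, low, high):
    """Return indices of rows whose numeric `field` falls outside [low, high]."""
    return [
        i
        for i, row in enumerate(rows)
        if isinstance(row.get(field), (int, float))
        and not low <= row[field] <= high
    ]


def run_checks(rows):
    """Run every check and return only the ones that found problems."""
    results = {
        "missing id": missing_values(rows, "id"),
        "duplicate id": duplicate_values(rows, "id"),
        "percent out of range": out_of_range(rows, "percent", 0, 100),
    }
    return {name: found for name, found in results.items() if found}


# Hypothetical sample data with one problem of each kind.
sample = [
    {"id": 1, "percent": 42.0},
    {"id": 2, "percent": 130.0},   # percent above 100
    {"id": 2, "percent": 17.5},    # repeated id
    {"id": None, "percent": 8.0},  # missing id
]
problems = run_checks(sample)
```

An empty result from `run_checks` means every check passed, so a publishing script could refuse to continue whenever the dictionary is non-empty.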


Four short links: 11 April 2016

Speech GUI, AI Personality Design, Bipedal Robot, and Markets for Good

by Nat Torkington | @gnat | +Nat Torkington | April 11, 2016

SpeechKITT — open source flexible GUI for interacting with Speech Recognition in your web app.
The Humanities Majors Designing AI Interactions — who else are you going to get to do it? As in fiction, the AI writers for virtual assistants dream up a life story for their bots. Writers for medical and productivity apps make character decisions such as whether bots should be workaholics, eager beavers or self-effacing. “You have to develop an entire backstory — even if you never use it,” Ewing said.
SCHAFT’s Bipedal Robot — not an Austin Powers reference, but a clever working proof-of-concept. In theory, bipedalism allows robots to go wherever we can (versus, say, a Dalek).
Markets for Good — Information to drive social impact.

Four short links: 24 February 2016

UX Metrics, Page Scraping, IoT Pain, and NLP + Deep Learning

by Nat Torkington | @gnat | +Nat Torkington | February 24, 2016

Critical Metric: Critical Responses (Steve Souders) — new UX-focused metrics […] Start Render and Speed Index.
Automatically Scrape and Import a Table in Google Spreadsheets (Zach Klein) — =ImportHtml("URL", "table", num) where the second argument is the kind of element to import (“table” or “list”), and num is the 1-based position of that element in case there are multiple on the page. Bam!
Getting Visibility on the iBeacon Problem (Brooklyn Museum) — the Internet of Things is great, but I wouldn’t want to have to update its firmware. As we started to troubleshoot beacon issues, we wanted a clean slate. This meant updating the firmware on all the beacons, checking the battery life, and turning off the advanced power settings that Estimote provides. This was a painstakingly manual process where I’d have to go and update each unit one-by-one. In some cases, I’d use Estimote’s cloud tool to pre-select certain actions, but I’d still have to walk to each unit to execute the changes and use of the tool hardly made things faster. Perhaps when every inch of the world is filled with sensors, Google Street View cars will also beam out firmware updates.
NLP Meets Deep Learning — easy to follow slide deck talking about how deep learning is tackling NLP problems.
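For anyone who wants the ImportHtml trick outside a spreadsheet, here is a rough Python analogue using only the standard library. It is a sketch of the same idea (pick the n-th table on a page, counting from 1), not a description of how Google implements the formula. The class and function names and the sample HTML are made up for illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> in a document, in page order."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, index):
    """Return the rows of the index-th table (1-based, like the formula)."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Hypothetical page with two tables; we want the second one.
page = """
<table><tr><th>ignored</th></tr></table>
<table>
  <tr><th>City</th><th>Visitors</th></tr>
  <tr><td>Wellington</td><td>12</td></tr>
</table>
"""
rows = import_html_table(page, 2)
```

To run this against a live page, the HTML string could be fetched first with `urllib.request.urlopen`; that step is left out so the sketch works offline.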

Rachel Kalmar on data ecosystems

The O’Reilly Hardware Podcast: Collecting, sharing, and accessing data from sensors.

by Jon Bruner | @JonBruner | +Jon Bruner | February 17, 2016

Subscribe to the O’Reilly Hardware Podcast for insight and analysis about the Internet of Things and the worlds of hardware, software, and manufacturing: TuneIn, Stitcher, iTunes, SoundCloud, RSS.

In this new episode of the Hardware Podcast, David Cranor and I talk with data scientist Rachel Kalmar, formerly with Misfit Wearables and the founder and organizer of the Sensored Meetup in San Francisco. She shares insights from her work at the intersection of data, hardware, and health care.

Discussion points:

The need for a “data ecosystem” approach: it’s important to understand the entire stack from acquisition through storage and analysis, and where security and privacy become concerns.
Analysis and insight as the real value in data: consumers get very little from raw data.
Authentication for smart devices—and an experiment (let us know if your lights went out during this podcast by e-mailing hardware@oreilly.com).


Four short links: 21 January 2016

Hidden Networks, Dissolving Sensors, Spies Spy, and Redirected Walking

by Nat Torkington | @gnat | +Nat Torkington | January 21, 2016

Big Bang Data: Networks of London (YouTube) — guide to the easy-to-miss networks (fibre, CCTV, etc.) around Somerset House, where an amazing exhibition is about to launch. The network guide is the work of the deeply talented Ingrid Burrington.
Sensors Slip into the Brain and then Dissolve When Done (IEEE Spectrum) — pressure and temperature monitors, intended to be implanted in the brain, that completely dissolve within a few weeks. The news, published as a research letter in the journal Nature, described a demonstration of the devices in rats, using soluble wires to transmit the signals, as well as the demonstration of a wireless version, though the data transmission circuit, at this point, is not completely resorbable.
GCHQ Proposes Surveillable Voice Call Encryption (The Register) — unsurprising, but should reiterate AGAIN that state security services would like us to live in the panopticon. Therefore, don’t let the buggers anywhere near the reins of our communication systems.
These Tricks Make Virtual Reality Feel Real — Scientists are exploiting the natural inaccuracies in people’s own proprioception, via a technique called “redirected walking,” to create the perception of space where none exists. With redirected walking, […] users can sense they are exploring the twisting byways of a virtual city when in reality they are simply walking in circles inside a lab. Original Redirected Walking paper.

Four short links: 25 December 2015

Bad Data, Breakout Startups, Drone Economics, and Graph Signs

by Nat Torkington | @gnat | +Nat Torkington | December 25, 2015

Bad Data Guide (Quartz) — An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
Breakout List — companies where all the action is happening. Read alongside Startup L Jackson’s “How to Get Rich in Tech, Guaranteed.”
The Economics of Drone Delivery — The analysis is still mostly speculative. Keeney imagines that 6,000 operators who earn $50,000 per year will operate 30,000 to 40,000 drones. Each drone will make 30 deliveries per day. Her analysis ignores depreciation and questions like: ‘How will drones avoid airplanes and deliver packages in Manhattan?’ And there’s another core issue: $12.92 is the price UPS charges to consumers, but its actual marginal cost of delivering one more package along a route they are delivering to already is probably closer to $2. When push comes to shove, will drones be able to compete? (via Chris Anderson)
7 Ways Your Data is Telling You It’s a Graph — Network, tree, taxonomy, ancestry, structure – if people are using those words to talk about an organizational chart or reporting structure, they’re telling you that data and the relationships between that data are important.
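The drone delivery figures quoted above invite a quick back-of-the-envelope check. The sketch below uses only the numbers in the quote (6,000 operators at $50,000 a year, 30,000 to 40,000 drones, 30 deliveries per drone per day) plus one assumption of mine: that drones fly 365 days a year. It covers operator labor only and, like the analysis it summarizes, leaves out depreciation and everything else.

```python
OPERATORS = 6_000
SALARY_PER_YEAR = 50_000          # dollars, from the quoted analysis
DELIVERIES_PER_DRONE_PER_DAY = 30
DAYS_PER_YEAR = 365               # assumption: no downtime at all


def labor_cost_per_delivery(drones):
    """Operator payroll divided by the number of deliveries made in a year."""
    payroll = OPERATORS * SALARY_PER_YEAR
    deliveries = drones * DELIVERIES_PER_DRONE_PER_DAY * DAYS_PER_YEAR
    return payroll / deliveries


# The quote gives a range of fleet sizes, so compute both ends.
cost_small_fleet = labor_cost_per_delivery(30_000)
cost_large_fleet = labor_cost_per_delivery(40_000)

print(f"30,000 drones: ${cost_small_fleet:.2f} of operator labor per delivery")
print(f"40,000 drones: ${cost_large_fleet:.2f} of operator labor per delivery")
```

Under these assumptions labor alone comes in under a dollar per delivery, which is why the comparison with a roughly $2 marginal cost hinges on the costs the analysis leaves out.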

Four short links: 16 September 2015

Data Pipelines, Amazon Culture, Real-time NFL Data, and Deep Learning for Chess

by Nat Torkington | @gnat | +Nat Torkington | September 16, 2015

Three Best Practices for Building Successful Data Pipelines (Michael Li) — three key areas that are often overlooked in data pipelines: making your analysis reproducible, consistent, and productionizable.
Amazon’s Culture Controversy Decoded (Rita J King) — very interesting culture map analysis of the reports of Amazon’s culture, and context for how companies make choices about what to be. (via Mike Loukides)
How Will Real-Time Tracking Change the NFL? (New Yorker) — At the moment, the NFL is being tightfisted with the data. Commentators will have access during games, as will the betting and analytics firm Sportradar. Users of the league’s Xbox One app, which provides an interactive way of browsing video clips, fantasy-football statistics, and other metrics, will be able to explore a feature called Next Gen Replay, which allows them to track each player’s speed and trajectory, combining moving lines on a virtual field with live footage from the real one. But, for now, coaches are shut out; once a player exits the locker room on game day, the dynamic point cloud that is generated by his movement through space is a corporately owned data set, as outlined in the league’s 2011 collective-bargaining agreement. Which should tell you all you need to know about the NFL’s role in promoting sporting excellence.
Giraffe: Using Deep Reinforcement Learning to Play Chess (Matthew Lai) — Giraffe, a chess engine that uses self-play to discover all its domain-specific knowledge, with minimal hand-crafted knowledge given by the programmer. See also the code. (via GitXiv)

Four short links: 4 August 2015

Data-Flow Graphing, Realtime Predictions, Robot Hotel, and Open-Source RE

by Nat Torkington | @gnat | +Nat Torkington | August 4, 2015

Data-flow Graphing in Python (Matt Keeter) — not shared because data-flow graphing is a sexy new hot topic that’s gonna set the world on fire (though, I bet that’d make Matt’s day), but because there are entire categories of engineering and operations migraines that are caused by not knowing where your data came from or goes to, when, how, and why. Remember Wirth’s “algorithms + data structures = programs”? Data flows seem like a different slice of “programs.” Perhaps “data flow + typos = programs”?
Machine Learning for Sports and Real-time Predictions (Robohub) — podcast interview for your commute. Real time is gold.
Japan’s Robot Hotel is Serious Business (Engadget) — hotel was architected to suit robots: “For the porter robots, we designed the hotel to include wide paths.” Two paths slope around the hotel lobby: one inches up to the second floor, while another follows a gentle decline to guide first-floor guests (slowly, but with their baggage) all the way to their room. Makes sense: at Solid, I spoke to a chap working on robots for existing hotels, and there’s an entire engineering challenge in navigating an elevator that you wouldn’t believe.
bokken — open source GUI to help with reverse engineering code.

Four short links: 26 May 2015

Keyboard Programming, Oblique Strategies, Engineering Ethics, and Visualisation Gallery

by Nat Torkington | @gnat | +Nat Torkington | May 26, 2015

Introduction to Keyboard Programming — what happens when you press a key. (hint: a lot)
Oblique Strategies: Prompts for Programmers — Do it both ways. Very often doing it both ways is faster than analyzing which is best. Now you also have experimental data instead of just theoretical. Add a toggle if possible. This will let you choose later. Some mistakes are cheaper to make than to avoid.
The Responsibility We Have as Software Engineers — Where’s our Hippocratic Oath, our “First, Do No Harm?” Remember that moment when Google went from “amazing wonderful thing we didn’t have before, which makes our lives so much better” to “another big scary company and holy shit it knows a lot about us!”? That’s coming for our industry and the software engineering profession in particular.
Gallery of Concept Visualisation — plenty I hadn’t seen before.

Four short links: 25 May 2015

8 (Bits) Is Enough, Second Machine Age, LLVM OpenMP, and Javascript Graphs

by Nat Torkington | @gnat | +Nat Torkington | May 25, 2015

Why Are Eight Bits Enough for Deep Neural Networks? (Pete Warden) — It turns out that neural networks are different. You can run them with eight-bit parameters and intermediate buffers, and suffer no noticeable loss in the final results. This was astonishing to me, but it’s something that’s been re-discovered over and over again.
The Great Decoupling (HBR) — The Second Machine Age is playing out differently than the First Machine Age, continuing the long-term trend of material abundance but not of ever-greater labor demand.
OpenMP Support in LLVM — OpenMP enables Clang users to harness full power of modern multi-core processors with vector units. Pragmas from OpenMP 3.1 provide an industry standard way to employ task parallelism, while ‘#pragma omp simd’ is a simple yet flexible way to enable data parallelism (aka vectorization).
JS Graphs — a visual catalogue (with search) of JavaScript graphing libraries.
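The eight-bit claim in the Pete Warden link is easier to believe once you see how little is lost. Here is a minimal sketch of one common scheme, linear min/max quantization onto the integers 0 to 255, in plain Python. It is an illustration under my own assumptions, not necessarily the exact method the article describes, and the weight values are invented.

```python
def quantize(values):
    """Map floats linearly onto integers 0..255; return codes plus the mapping."""
    low, high = min(values), max(values)
    # Fall back to a scale of 1.0 when every value is identical.
    scale = (high - low) / 255 or 1.0
    codes = [round((v - low) / scale) for v in values]
    return codes, low, scale


def dequantize(codes, low, scale):
    """Recover approximate floats from the eight-bit codes."""
    return [low + c * scale for c in codes]


# Hypothetical layer weights.
weights = [-1.0, -0.25, 0.0, 0.4, 1.0]
codes, low, scale = quantize(weights)
restored = dequantize(codes, low, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)
print(f"largest round-trip error: {max_error:.5f} (step size {scale:.5f})")
```

Each weight now fits in one byte instead of four, and the worst-case round-trip error is half a step, which is the intuition behind running networks at this precision.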