ENTRIES TAGGED "optimization"
OSCON 2013 Speaker Series
NOTE: If you are interested in attending OSCON to check out Dave’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your registration fee.
Since 2009, I’ve been leading the optimization team at AppNexus, a real-time advertising exchange. On this exchange, advertisers participate in real-time auctions to bid on individual ad impressions. The highest bid wins the auction, and that advertiser gets to show an ad. This allows advertisers to carefully target where they advertise—maximizing the effectiveness of their advertising budget—and lets websites maximize their ad revenue.
We do these auctions often (~50 billion a day) and fast (<100 milliseconds). Not surprisingly, this creates a lot of technical challenges. One of those challenges is how to automatically maximize the value advertisers get for their marketing budgets—systematically driving consumer engagement through ad placements on particular websites, times of day, etc.—and we call this process “optimization.” The volume of data is large, and the algorithms and strategies aren’t trivial.
In order to win clients and build our business to the scale we have today, it was crucial that we build a world-class optimization system. But when I started, we didn’t have a scalable tech stack to process the terabytes of data flowing through our systems every day, and we didn't have the team to do any of the required data modeling.
So, we needed to hire great people fast. However, there aren’t many veterans in the advertising optimization space, and because of that, we couldn’t afford to narrow our search to only experts in Java or R or Matlab. In order to give us the largest talent pool possible to recruit from, we had to choose a tech stack that is both powerful and accessible to people with diverse experience and backgrounds. So we chose Python.
Python is easy to learn. We found that people coding in R, Matlab, Java, PHP, and even those who have never programmed before could quickly learn and get up to speed with Python. This opened us up to hiring a tremendous pool of talent who we could train in Python once they joined AppNexus. To top it off, there’s a great community for hiring engineers and the PyData community is full of programmers who specialize in modeling and automation.
Additionally, Python has great libraries for data modeling. It offers great analytical tools for analysts and quants and when combined, Pandas, IPython, and Matplotlib give you a lot of the functionality of Matlab or R. This made it easy to hire and onboard our quants and analysts who were familiar with those technologies. Even better, analysts and quants can share their analysis through the browser with IPython.
Now that we had all of these wonderful employees, we needed a way to cut down the time to get them ramped up and pushing code to production.
First, we wanted to get our analysts and quants looking at and modeling data as soon as possible. We didn’t want them worrying about writing database connector code, or figuring out how to turn a cursor into a data frame. To tackle this, we built a project called Link.
Imagine you have a MySQL database. You don’t want to hardcode all of your connection information because you want to have a different config for different users, or for different environments. Link allows you to define your “environment” in a JSON config file, and then reference it in code as if it is a Python object.
Now, with only three lines of code you have a database connection and a data frame straight from your mysql database. This same methodology works for Vertica, Netezza, Postgres, Sqlite, etc. New “wrappers” can be added to accommodate new technologies, allowing team members to focus on modeling the data, not how to connect to all these weird data sources.
In : from link import lnk
In : my_db = lnk.dbs.my_db
In : df = my_db.select('select * from my_table').as_dataframe()
Int64Index: 325 entries, 0 to 324
id 325 non-null values
user_id 323 non-null values
app_id 325 non-null values
name 325 non-null values
body 325 non-null values
created 324 non-null values
By having the flexibility to easily connect to new data sources and APIs, our quants were able to adapt to the evolving architectures around us, and stay focused on modeling data and creating algorithms.
Second, we wanted to minimize the amount of work it took to take an algorithm from research/prototype phase to full production scale. Luckily, with everyone working in Python, our quants, analysts, and engineers are using the same language and data processing libraries. There was no need to re-implement an R script in Java to get it out across the platform.
Cultural shifts and handling large-scale growth among the emerging trends in the WPO and DevOps communities
More than 700 performance and operations engineers were in London last week for Velocity Europe 2012. Below, Velocity co-chairs Steve Souders and John Allspaw note high-level themes from across the various tracks (especially the hallway track) that are emerging for the WPO and DevOps communities.
Performance themes from Steve Souders
I was in awe of the speaker and exhibitor lineup going into Velocity Europe. It was filled with knowledgeable gurus and industry leaders. As Velocity Europe unfolded a few themes kept recurring, and I wanted to share those with you.
Performance matters more — The places and ways that web performance matters keeps growing. The talks at Velocity covered desktop, mobile (native, web, and hybrid), tablet, TV, DSL, cable, FiOS, 3G, 4G, LTE, and WiMAX across social, financial, ecommerce, media, games, sports, video, search, analytics, advertising, and enterprise. Although all of the speakers were technical, they talked about how the focus on performance extends to other departments in their companies as well as the impact performance has on their users. Web performance has permeated all aspects of the web and has become a primary focus for web companies. Read more…
Four simple optimization steps produce big results.
Learn how producers slimmed down the Velocity conference site, cutting the site's load time by 3.5 seconds and dropping 49% of the page weight.
Steve Souders on browser wars, site speed, and the HTTP Archive.
In this interview, Google performance evangelist and Velocity co-chair Steve Souders discusses browser competition, the differences between mobile and desktop optimization, and his hopes for the HTTP Archive.
Nicole Sullivan on how CSS is evolving to meet performance and device needs.
Velocity speaker and CSS expert Nicole Sullivan discusses the state of CSS — how it’s adapting to mobile, how it’s improving performance, and how some CSS best practices have led to “bloated code and broken websites.”
Facebook boosted speed 2x. Director of engineering Robert Johnson explains how.
Robert Johnson, Facebook's director of engineering and a speaker at the upcoming Velocity and OSCON conferences, discusses an in-depth optimization and rewrite project that boosted Facebook's speed 2x.