
Data of Thrones Part II: Women on Screen


Game of Thrones has been both criticized for its treatment of women and praised for its fiercely bold female characters.

Last week, we discussed, at length, some general first impressions from a Game of Thrones dataset. After digging into this dataset1 a little further, we found a few counterintuitive results, such as the gap between average and total screen time for male and female characters.

Average vs. Total Screen Time by Gender

The Game of Thrones show is very male-dominated as a whole, so our interest was piqued when we found that female characters, on average, actually have much more screen time than male characters.

[Chart: average screen time by gender]

But, as we remember from high school statistics class, averages can misrepresent the population as a whole.

So we dug in a little bit more.

First, the data shows that there are many more total male characters (120) than female characters (46).

When we look at the total screen time by gender, men definitely dominate the screen.

[Chart: total screen time by gender]

In addition to the higher number of male characters, men dominate the top of the screen-time charts as well, with Tyrion and Jon solidly in the lead. Jon Snow, with 268 minutes onscreen, has almost 20% more screen time than Daenerys Targaryen, the female character with the most screen time.

So why are women so ahead in average screen time?

The low average for men tells us that there are many more men with smaller parts in Game of Thrones, which skews their average screen time lower.

Top Characters by Screen Time

It gets even more interesting when you take into account screen time of the main characters of the show.

Among the top seven main characters, women dominate, both in number and in total screen time, which also contributes to the skewed averages we see.

[Chart: top characters by screen time]

In the end, this data shows that while there are fewer women on screen, those who do appear receive a fair amount of screen time, on average. And at the top of the cast, women as a group are actually getting a lot of time on screen in the show.

Winter is (Almost) Here

Season 7 seems to be gearing up to give the main female characters some serious spotlight. Daenerys is arriving in Westeros, Cersei is sitting on the Iron Throne, Arya is coming back home, and Sansa is finally free to lead her own story. Will this season turn the tables in terms of average and total screen time by gender?

Only time will tell.

Stay tuned for Part III where we’ll be diving into the data in a whole new dimension.

Missed our first post? See it here. Want to see how our team pulled this data? Reach out here.


References:

1 Our data is from data.world and focuses on named characters, and not extras, for the first 6 seasons of the show.

Disclaimer: Game of Thrones belongs to HBO and is not affiliated with Looker in any way.



The Era of the Cloud and the Intelligent Business


We are honored to be included, along with many customers and partners, in the Forbes Cloud 100. It's impressive to see how the adoption of the cloud and use of data unite so many industry leaders!

Operational Data and the Cloud

If you attended the recent Google Cloud Next or AWS re:Invent, you saw that the cloud narrative has certainly changed in the last year. It is no longer a question of if enterprises will move to the cloud but when.

At Looker, we’ve seen a common story precipitating this adoption across a wide spectrum of our customers. These companies recognize the value of incorporating data into daily workflows. They recognize that data-driven decision-making can only become the norm if data is accessible in the applications their users already rely on for their jobs. They are truly operationalizing their data.

Access to operational data in a powerful data platform has become the driving force for enterprise cloud adoption. The end product skips IT, targets business users, and delivers a solution rather than infrastructure. Just as developers once forgot about floppy disks and later CDs, physical infrastructure is no longer necessary. On-premise is now the exception, not the rule.

The Platform Approach

Looker started as a platform first and foremost. At inception, our analytics system was architected to sit on top of the data infrastructure ecosystem, taking advantage of modern cloud enterprise data warehouses like Amazon’s Redshift (and further developments like Athena and Spectrum), Google’s BigQuery, and Snowflake, alongside Hadoop-focused engines like Presto and Spark, among many others. Our analytics system partners tightly with managed ETL providers like Google’s BigQuery Data Transfer Service, Amazon’s Glue, Stitch, and Fivetran. As a result of this approach, our customers benefit from the incredible advancements taking place in this rich data ecosystem.

This evolution in the data stack caused a revolution in business intelligence.

Retrospective-focused, heavy ETL workloads are no longer agile enough to adapt to a modern organization. Cloud data warehouses enable ELT, with transformations done on the fly. These performance and cost improvements finally allow organizations to drive real value from their big data.
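
To make this concrete, here is a minimal, hypothetical sketch of an on-the-fly transformation expressed as a LookML derived table. The SQL runs in the cloud warehouse at query time rather than in an upfront ETL job; the table and column names are invented for illustration.

  # Hypothetical ELT-style transformation defined in the modeling layer.
  # "orders", "created_at", and "order_total" are invented names.
  view: daily_order_facts {
    derived_table: {
      sql:
        SELECT
          DATE(created_at)  AS order_date,
          COUNT(*)          AS orders,
          SUM(order_total)  AS revenue
        FROM orders
        GROUP BY 1 ;;
    }

    dimension_group: order {
      type: time
      timeframes: [date, week, month]
      sql: ${TABLE}.order_date ;;
    }

    measure: total_revenue {
      type: sum
      sql: ${TABLE}.revenue ;;
    }
  }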

The modern organization is now forced to be data-driven to compete. Enterprises need to leverage their large, disparate operational data to make even day-to-day decisions. The marketer who doesn’t have all of their live campaign data alongside transactional data is at a disadvantage and can’t drive the highest ROI. The customer success manager not looking at CRM, support, and event data can’t offer competitive service. The modern knowledge worker has been forced to become a digital worker, living in their data.

Backwards-looking executive dashboards, with high-level, delayed views of business metrics, are no longer enough; they only answer “What happened?” Data is only as useful as the insight perceived and the action taken. Digital workers must answer the more important question, “What do I do now?”, and then be able to act quickly on that insight.

At Looker, our goal is to remove the friction between insight and action with our deep platform integrations.

Data Actions are our first step toward enabling data-driven operations. Because Looker queries the database directly, operational, row-level information comes with Looker out of the box, and our integrations let you take direct action on that information. Want an analysis that prioritizes sales prospects based on custom logic? Send them an email directly from Looker.
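
As a rough illustration (not Looker’s exact configuration), a data action can be attached to a field in LookML with the action parameter; the webhook URL and the form field below are hypothetical.

  # Hypothetical sketch: a data action attached to an email field.
  # The webhook URL and the form field are invented for illustration.
  dimension: email {
    type: string
    sql: ${TABLE}.email ;;

    action: {
      label: "Send follow-up email"
      # invented endpoint that would receive the action payload
      url: "https://hooks.example.com/send_email"

      param: {
        name: "recipient"
        value: "{{ value }}"
      }

      form_param: {
        name: "message"
        type: textarea
      }
    }
  }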

We’ve seen customers use the platform in creative and unexpected ways: whether it’s pulling data into a conversation in Slack, creating novel segments of users to reach out to via email providers like SendGrid and MailChimp, managing product development, or tagging and creating issues from Looker in Github.

By sitting on top of this rapidly evolving cloud data ecosystem, Looker continues to get more and more powerful; you can bet we’re working to remove the friction between your data, insights and running an intelligent business.


Data of Thrones Part III: 7 Predictions for Game of Thrones Season 7 (Loosely) Based on Data


Before we get started, I need to make a little disclaimer: this post is meant to be fun and is not based on any predictive analytics, but rather simply educated guesses based on the data we have available.

Part of the draw of Game of Thrones is the way both George R. R. Martin and the show’s creators, David Benioff and D.B. Weiss, challenge normal storytelling tropes.

You know, things like killing the main character in the first season, or killing a main character immediately after he had a huge, meaningful showdown with magical villains, proving to be the only one who could stop them. Alright yes, one of them came back from the dead, but even so, these deaths were still pretty shocking.

But these surprises are part of what makes the show so intriguing. Game of Thrones always keeps us on our toes, never allowing viewers to guess what will happen next… But that doesn’t mean we can’t try!

In looking through the data in Looker, we noticed a few trends, which we have extrapolated to form predictions about what to expect in the final two seasons. As with any piece of creative work, data can only go so far, and while we do have data on six seasons, only one of them was not based on the books and was entirely created by the current writers, as the next two will be… That said, it doesn’t hurt to take a step back and see what we can find in the data to help us #prepareforwinter.

So here we go: our data-based, totally unscientific predictions for Season 7 of Game of Thrones.


CAUTION: THERE ARE SPOILERS AND SOME WILD GUESSES AHEAD

Episode 7 will be the deadliest episode of the season
As we mentioned in our first post, the three most recent seasons’ story arcs all leave the deaths for the season finale. We are predicting that trend will continue.

Episode 4 is going to have a surprise death
In the past few seasons, the mid-season episode has been surprisingly deadly. Since the new season only has 7 episodes, this puts Episode 4 in the mid-season spot.

This season will not be as deadly as the final one
The data on deaths in the previous seasons shows a clear trend: a deadly season followed by a less deadly one. Since the last season was so deadly, we think they will put off the biggest kills for the final season of the show. This also makes logical sense, as they will probably want to go out with a bloody bang.

Eddison Tollett is going to do... something
In our first post, we mentioned that Eddison Tollett - Jon’s right-hand man, whom he appoints as the Commander of the Night’s Watch when he leaves for Winterfell - has a surprisingly high episode count.

This cannot be a coincidence. There has to be a reason he is in so many episodes… But with everything coming his way (cough, White Walkers), I’m sure there will be many opportunities to give him some storyline to go with his plentiful episode count.

Since writing our first post, we had the opportunity to put some more granular data into our model, including screen time by season. From this new perspective, a few surprising insights hint at what the showrunners are leading up to now that the show and the books have completely separate stories.

[Chart: screen time by season for the main characters]

Based on how the data is trending, the following are our guesses for what will happen to some well-known characters...

This is the season of Sansa + Jon
As you can see in the visualization above, the only two characters of the main character group whose screen time has increased consistently over the recent seasons are Jon and Sansa. We interpret this finding to mean that they are leading up to something with their storylines, and that this season their time on screen will continue to increase.

Another interesting insight from this screen-time-by-season data is the mirrored screen time of Varys and Petyr Baelish (Littlefinger):

[Chart: screen time by season for Varys and Petyr Baelish]

This is an interesting look at how these two schemers are being shown on screen as foils of one another. When one character gets a lot of screen time in a season, the other is working in the shadows… What does this mean for their respective plots for the future of the Iron Throne? Again, the data doesn’t tell us that, but based on this Look, we are predicting that….

Varys’ big plot will be revealed
As we mentioned in our first post, Varys is a prominent member of the group of characters with high episode counts but low overall screen time. This is a group made up of helpers and schemers - of which he is definitely the latter - and since his screen time has been slowly increasing over the past few seasons, we predict that this is the season when his big plan will finally come to light.

This will be the end of Littlefinger… if we see more of him
There are lots of theories around the internet about the fate of Petyr Baelish… But one thing that seems to be consistent is that this is our last season with him.

And we agree… But only if he spends a lot more time on screen.

This is based on an interesting insight from our new data set. The visualization below shows that most of the characters killed in the last season had positively trending screen time in their final season:

[Chart: screen time trend in their final season for characters killed in Season 6]

With Littlefinger’s screen time trending down (see the first viz above), it would be a big break from the trend for the show creators to kill him off without giving him a good amount of time on screen first.

So if we start seeing more of our conniving friend, we all should prepare for the worst… If he remains in the shadows, then we should expect him to make it to the final season.

Share your thoughts on this post as well as Part I and Part II by using the hashtag #dataofthrones and/or tagging @lookerdata. We would love to hear what you think is coming over the next seven weeks.

We’ll be checking back in after the season is over to see how we did on our predictions… Enjoy watching this weekend! #winterishere

Interested to see what you can learn about your data with Looker? Request a demo today!


Redshift, BigQuery, and Apache Spark, Oh My! - Looker’s Indispensable Guide to Help You Choose the Right Database



For a generation of Database Administrators (DBAs), you could spend your whole career becoming an expert in one warehousing technology and one business intelligence tool.

Netezza, Oracle, Teradata. Cognos, Hyperion, Business Objects. Each was its own island with its own ecosystem.

Picking a toolset and becoming deeply knowledgeable in it guaranteed job security. Your enterprise had spent years planning its data warehousing solution, and millions of dollars installing it. They weren’t going to switch any time soon.

But the world has changed.

Today, even early-stage startups need a data strategy. Data warehouses have gotten so cheap, anyone can afford one. And while a DBA may be in charge of setting things up, it could now just as easily be an analyst, or a developer, or even a founder.

At the same time, the number of options has ballooned. Production replica or dedicated warehouse? Cloud or on-premise? Hosted or on-demand? Hadoop or columnar? All-in-one or modular? Real-time or batch processed?

Building a data strategy means weighing all of the options and charting a path forward.

So rather than experts who are deeply informed about a single technology, the need today is for people who are broadly knowledgeable about the relative merits and disadvantages of different solutions.

But with new options coming out every few months, how do you get up to speed and stay informed?

Well, it turns out that the analysts and engineers at Looker face the same challenge. Looker’s data platform integrates deeply with 34 dialects (and counting) of SQL, and because we have customers who use each and every one, we’ve become experts on them.

We see first hand what they do well, where they struggle, and how they perform in the real world. We understand their quirks and their jaw-dropping features, because we have to make sure Looker can deal with the former and leverage the latter.

Today, for the first time, we’ve collected that knowledge in a single place, and we hope it will become an indispensable resource for everyone who “does data stuff.” In the months we’ve spent putting this together, we haven’t found anything like it anywhere on the internet.

We’ll be adding to it over time, updating it with additional technologies and maybe expanding it to cover other database types (graph, key/value stores, document stores) that are used for specialized analytics.

But we wanted to get what we’ve already collected out there for you to use. So check out the Pocket Guide to Databases. And if there are things that need clarification or that you’d love to see in future versions (or, heaven forbid, things we got wrong), don’t hesitate to let us know at databases@looker.com.

Marketing Analytics for Everyone




It seems like 2017 is the year of marketing. User retention, ROI on ad-spend, customer lifetime value, predictive lead scores, purchase affinity: sophisticated analyses such as these are transforming marketing from a loosely quantifiable “art” to a strictly followed “science”.

But these types of analyses existed long ago - why the sudden increase in significance?

It’s no secret that the volumes of data that today’s companies gather are exploding. These oceans of data provide the opportunity to perform more exact analysis and track behavior all the way down to the individual user level. This means any marketer, from junior program manager to CMO, can easily understand what audience is seeing their ads, what types of ads and keywords attract the most valuable customers, how well each audience cohort converts, and much more.

Just a few years ago, collecting this depth and breadth of data would have been impossibly expensive. Nowadays, thanks to advances in data storage tools, if a company is not collecting and continuously analyzing all this data, it’s considered an outlier and risks falling behind.

Nearly every company that interacts with users or customers online uses the Google Marketing Suite1. Whether it’s an emerging startup sampling with free Google Analytics or a seasoned enterprise leveraging the entire ecosystem of reporting and analysis tools, Google has arguably become the most prolific marketing technology company in the market.


Performing analysis in Google Analytics, AdWords, DoubleClick, or any other marketing tool provides a strong foundation for deriving insights from the vast quantities of customer data. But the data from each of these sources typically lives in its own separate silo - in its individual console or interface (read: another browser tab). To track performance across these tools, users typically download reports and manually reconcile them in Google Sheets or Excel, wait a month, and repeat.

"As a data-driven company, our marketers go deep to find trends and insights that help us take effective actions. As such, the range and sophistication of Looker’s Blocks, including their pre-built dashboards and reports, have the potential to become extremely powerful tools for our team.”
- Andrew Rabinowitz
  Data Ops Engineer, Blue Apron

On top of the constraints of living in a silo, conducting analysis directly in the UI of each data source prevents marketers from seeing raw, user-level data. Typically, all that’s available is daily or weekly aggregated reports, which leaves the possibilities for deep analysis extremely constrained.

What if you want to follow the full event-stream of individual users? Or tie your marketing data to your CRM data to understand ROI on ad-spend? It’s just not possible without exporting the raw data out of the data sources.

Even with the sophistication of Google’s data collection, the dream of tracking performance in real time across all platforms and tools, and taking action on that data, requires a tremendous amount of developer knowledge and data engineering.

That all changed last March at NEXT ‘17, when Google unveiled its Transfer Service offering.

After a few minutes of setup in Google’s interface, data from each of Google’s marketing sources can be piped into BigQuery in near real time. What’s nice about BigQuery - Google’s managed data warehouse - is that it’s also completely configurable in Google’s interface, meaning anyone can set it up without prior knowledge or experience with databases. It’s all done quickly and easily, in just a few minutes. Starting up just requires a Google Account.

While Google was developing the Transfer Service, our team here at Looker got excited about what this offering could open up. Transfer Service removes the need to involve a huge engineering team or outsource months of work to move all the separate data into a place where it can be combined. The only remaining step is piecing the puzzle of data together once it’s in BigQuery.

That’s where Looker comes into play. As Google finalized Transfer Services, we worked closely with Google’s product team to develop Looker Blocks - a pre-built suite of actionable dashboards and analysis - to plug directly into the data after it’s been dropped into BigQuery.

Looker takes all the data from BigQuery, intelligently understands how to link all the sources together, and gives users the ability to run any type of complex analysis or report across each marketing tool, whenever they need. We’ve worked with companies such as Hearst Communications, BuzzFeed, and Blue Apron to take their data from raw sources to full data suites in a matter of days.

“Google’s BigQuery Data Transfer Service makes it easy for us to centralize all the data from DoubleClick for Publishers, Google Analytics, and other internal data sources. Looker’s Blocks will allow us to better make sense of that data, with the end goal of building intelligent and predictive products.”
- Esfand Pourmand, SVP of Revenue at Hearst Newspapers

With Transfer Service + Looker, any marketer can jump straight into analysis. No need to deal with disparate APIs, or even involve IT. Even for those companies that already have invested IT time and resources into manual ETL scripts, Transfer Service does the same thing for data engineers that BigQuery did for analysts - it removes the need to monitor and maintain complicated tech plumbing. You can free up those engineers to do more meaningful work for your business.

The argument for DIY structuring of a tech stack typically goes like this: constructing each individual component is helpful due to the higher degrees of control and granularity. In this case, Google is actually providing a better service, since they include all the possible data fields that would be available if the data was pulled manually and/or individually. This widens the breadth and depth of reports available, making it possible to perform complex analysis not typically available in the UI of the data source itself. This is, again, where Looker Blocks shine.


Blocks not only provide the reports you’re already using in Google; they also include additional value-add analysis that isn’t possible without the raw data available through Transfer Service, such as (a rough LookML sketch follows the list):

  • Tracking users across their lifespan of interaction
  • Customer lifetime value
  • Cross-source user retention
  • Flexible user cohorting and segmenting
  • Customizable period-over-period reporting
  • Single or multi-touch attribution analysis
  • Customized sessionization logic
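
As a hedged sketch of what a couple of these (lifetime value and cohorting) might look like once the raw, user-level data has landed in BigQuery, the LookML below assumes a user-level rollup table with invented column names.

  # Hypothetical user-level rollup supporting LTV and cohort analysis.
  # Column names are invented; the underlying table would be derived
  # from the raw event and transaction data landed by Transfer Service.
  view: user_facts {
    dimension: user_id {
      primary_key: yes
      type: string
      sql: ${TABLE}.user_id ;;
    }

    # Acquisition cohort: group users by when they first converted
    dimension_group: first_order {
      type: time
      timeframes: [date, week, month]
      sql: ${TABLE}.first_order_at ;;
    }

    # Bucket customers by lifetime revenue for flexible segmenting
    dimension: lifetime_revenue_tier {
      type: tier
      tiers: [0, 50, 200, 1000]
      style: integer
      sql: ${TABLE}.lifetime_revenue ;;
    }

    measure: average_lifetime_revenue {
      type: average
      sql: ${TABLE}.lifetime_revenue ;;
    }
  }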


On top of this analysis, users get the full advantage of Looker’s extensive platform. Users can schedule alerts or trigger actions based on time periods or events, change bid prices or keywords from within the Looker UI, create custom dashboards, and slice and dice all their data with infinite flexibility. Looker’s flexibility also means that data from additional sources such as Facebook Ads, HubSpot, or even proprietary data systems can be brought in to help businesses achieve a true 360-degree view of their customers.

Finally, every marketer can easily harness the power of big data in an easy-to-use, centralized interface. 2017 is the year of marketing analytics - make sure you’re equipped to keep up.

Interested in Looker + Google BigQuery Data Transfer Services? Try them both for free and get started in just hours.


  1. Kleiner Perkins Caufield Byers. “2017 Internet Trends Report.” Accessed 7 June 2017.


Closing the Loop Between Data Analysis and Action


Looker’s Segment Integration

Tell me if this sounds familiar. You want to do a comprehensive analysis of your customers (to identify high-value buyers, those who haven’t ordered recently, or heavy support users), but it requires combining disparate data sources (say, Zendesk, Salesforce, and Marketo data). So you centralize the data in your data warehouse and load it into your data tools, run the analysis, and then want to take some action based on the results.

Maybe you want to email users with falling engagement scores, or offer a coupon code to high-value users, or flag habitual support users for your support team. But that requires getting your analysis out of your data tool and back into your email tool...or your support desk...or your ecommerce platform.

So you manually download CSVs and send them off to other teams who can upload this information to the appropriate tool. And next week, when you want to rerun the analysis with fresh data, you do this whole inefficient process again.

What you really want is a complete loop. A way to run your analysis, write the output back to the source system, and keep it updated with the latest results. And that’s what we’re launching today with our partners at Segment!

Looker’s Segment Source Integration allows customers to send data out to a variety of integrations for marketing automation. Looker customers can now push User Cohorts to Segment, which allows them to act on that data with third-party applications like Marketo, HubSpot, and Appboy. This means you can trigger marketing campaigns, activate win-back campaigns, and define custom email cohorts right from the Looker interface using the power of Looker and Segment.

Read on to see how it works and check out Segment’s blog post!

Use Case

With Looker, it’s easy to model customer behavior and make the data available to your organization. For instance, let’s consider an eCommerce example, where Segment events and Segment Sources data from Zendesk and Stripe are made available in Looker.
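
Here is a hedged sketch of what such a model might look like in LookML; the view and field names (users, zendesk_tickets, stripe_subscriptions) are invented for this example.

  # Hypothetical model joining Segment users to Zendesk and Stripe data.
  # All view and field names are invented for illustration.
  explore: users {
    join: zendesk_tickets {
      type: left_outer
      sql_on: ${users.id} = ${zendesk_tickets.requester_id} ;;
      relationship: one_to_many
    }

    join: stripe_subscriptions {
      type: left_outer
      sql_on: ${users.id} = ${stripe_subscriptions.customer_id} ;;
      relationship: one_to_many
    }
  }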

[Dashboard: overview of site visitors combining Segment, Zendesk, and Stripe data]

This dashboard gives us an overall view of the visitors to our site and their characteristics: their engagement based on site behavior, their satisfaction based on indicators from Zendesk, and their plan level from Stripe. But what if we want to dive into the details?

With Looker, it’s easy to model complex analyses involving different data sources. We probably want to dig deeper into the customers with a high satisfaction score who have not recently placed an order. Below, we can see the breakdown of satisfaction score and latest order for our customers.

[Chart: satisfaction score and latest order date by customer]

Now that we’ve identified a cohort of customers we’d like to target, we can just drill in to get the list of users we’d like to target.

[Table: drill-down list of the users in the target cohort]

Then, we can schedule that list to Segment, just once or on a recurring basis.

[Screenshot: scheduling the user list to Segment]

The list of users is now available in Segment through identify calls sent by Looker, so you can send them to any of your connected integrations, like Marketo, to run a campaign that re-engages your inactive customers.

Getting Started

In Looker 4.18, there’s now an Integrations section in the Admin panel. Simply enable (or have your Looker admin enable) the Segment integration by adding your Segment write key.

[Screenshot: the Integrations section of the Looker Admin panel]

The Segment integration also requires dimensions to be tagged as email or user_id in LookML. Simply add the tags parameter to the relevant dimensions in your LookML project.
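
For example, the tagging might look like the following minimal sketch (the column names are placeholders):

  # Dimensions tagged so the Segment integration knows which fields
  # to use in its identify calls. Column names are placeholders.
  dimension: email {
    type: string
    sql: ${TABLE}.email ;;
    tags: ["email"]
  }

  dimension: user_id {
    type: string
    sql: ${TABLE}.user_id ;;
    tags: ["user_id"]
  }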

You can find implementation details on Discourse! Reach out to your Looker Account Rep to get started now!

Start analyzing your data with Looker and Segment: try Segment out for free.

Request a free Looker trial.


IGNITE Talks at JOIN 2017: What to Expect


When we sent out a call for speakers to present at JOIN 2017, we were excited and humbled by the number of data enthusiasts eager to talk about how they used Looker to change their companies. We had a finite number of speaking sessions and they filled up quickly, but we wanted to find a way to share with the wider world the passion and enthusiasm our customers were sharing with us. So we started brainstorming some new opportunities for speakers.

One presentation format we found while researching was the IGNITE talk: a 5-minute presentation where the presenter gets 20 slides that automatically advance every 15 seconds. There is no stop or back button; just press go and start presenting.

It sounded like a unique, albeit challenging, medium so we decided to try it out with Lookers as our guinea pigs.

We held a happy hour and invited any and all Lookers to present IGNITE talks. We had talks on how to buy a house in your 20s, how to use Looker to manage your finances, the history of IGNITE itself and more.

The evening was informative, funny, and a little bit terrifying for those of us who presented. We all learned a lot, and everyone, audience and presenters alike, had a great time doing it.

IGNITE talks are a special type of challenge for speakers. Distilling a complex topic that you could talk about for hours down to 5 minutes is hard. You also don’t have any wiggle room with the structure of your presentation — 20 slides, no more, no less. And staying in line with your slides takes a lot of practice because it’s easy to get lost and hard to get back in sync.

After the happy hour, it was clear this format was exactly what we wanted for JOIN. We got people talking about topics that got them jazzed. With 5 minutes, there’s no time for fluff or filler, so you get the purest version of the story.

To boot, IGNITE talks sparked an instant connection between speaker and audience. The unforgiving structure, coupled with the possibility of failure, means the speakers have to get really honest with their audience really quickly, and the audience has to be forgiving of any flaws this reveals. Because of this trust, the audience became more invested in what the speaker was saying as the talk went along. If someone tripped up, they were able to laugh at themselves, and when they had perfect timing with their slides, everyone in the room cheered.

So this year at JOIN we’ve decided to dedicate a track to IGNITE Talks. We pulled the most compelling stories out of the submissions and challenged the speakers to boil their story down to 5 minutes.

Hopefully, this format will allow the passion and enthusiasm our customers shared with us to be passed along to the entire JOIN audience. Because when someone presents something they really care about in a succinct and fun way, everyone in the room cares about it too.

If you already have your ticket to JOIN, we hope to see you at the IGNITE Talks session. If you haven’t gotten your ticket yet, there is still time!

Why we release every four weeks


Looker cuts a new release every four weeks. In October of last year, we cut Looker 4.0. Next week, we’ll cut Looker 4.20. We have always worked this way, and don’t foresee changing it anytime soon.

Our four-week release cycle is core to our engineering philosophy for two reasons.

First, we believe in giving customers the features they want as soon as they’re ready. If there is a feature, be it small or large, that a customer thinks would help them get more value out of Looker, we want to get it into their hands as soon as possible.

The second reason is data. Iterating on a feature is only possible if you’re able to work from real usage data. The sooner we can get features into the hands of users, the sooner we can know if it accomplished what we hoped it would.

The best part about moving so fast is that we’re able to use the data produced in the first few weeks of a feature’s life cycle to adjust and improve it, so our customers get an even better version of the feature four weeks later.

The four features we announced today are just a sampling of what we’ve been working on...

The first feature we’re announcing is a real game changer in my eyes. Keeping a BI instance performant and fresh has historically been very challenging. Either queries run live, processed anew for each and every user at query time, or results are stored in some intermediate layer one step removed from the database (be that a cube or a specific result set). Administrators have to choose between fast results that may be a day behind or real-time results that may take a couple minutes to run in some cases.

Smart Caching solves that problem. By dynamically checking the cache against the raw data, cache use can be maximized when data is unchanged, improving performance for end users. At the same time, when the underlying data changes (this can even be set with change thresholds!), the cache can be cleared for real-time data. There’s no longer a trade-off between cache performance and data freshness; you get both if you want them.
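
In LookML terms, caching policies along these lines are expressed with datagroups. Whether a datagroup is the exact mechanism behind the feature described here is our assumption, and the trigger query below is invented.

  # Hypothetical datagroup: cached results are reused until the trigger
  # query's result changes (i.e. new data arrives), with a maximum age
  # as a backstop.
  datagroup: orders_datagroup {
    sql_trigger: SELECT MAX(updated_at) FROM orders ;;
    max_cache_age: "4 hours"
  }

  explore: orders {
    persist_with: orders_datagroup
  }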


At the beginning of the year, we announced a set of features we call User Attributes. This feature allows admins to assign attributes (anything from row-level permission values to geographic areas) to users and groups. When you add Advanced Scheduling to the picture, managers can maintain a single dashboard that is filtered on one of those user attributes, like region, and schedule it to be sent to every user with the data filtered for that user.
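
Here is a minimal sketch of the row-level piece, assuming a user attribute named region has been defined in the admin panel; orders.region is an invented field.

  # Hypothetical sketch: each user only sees rows matching the value of
  # their "region" user attribute. Field and attribute names are invented.
  explore: orders {
    access_filter: {
      field: orders.region
      user_attribute: region
    }
  }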


Because of Looker’s flexibility and the nature of operating in-database, we’ve always handled time zones fairly well. With the new Advanced Time Zone Control features, users can leverage both core benefits. On the flexibility side, folks can explore data or schedule reports in any time zone, or even use a mix of time zones on dashboards. These settings can also be parameterized per user or locked, so folks in different time zones can easily work in their own time without thinking about it. And because we operate in-database, queries can run in any time zone seamlessly. There’s no rebuilding of data pipelines; it’s just a simple at-query-time conversion.


Visualizations are key for telling a story with data. The more visualization types we can give our customers, the better stories they can tell with their data. Because of this, we added a funnel visualization and a timeline visualization.


More features are always rolling out, so keep an eye out for more updates. Our user conference, JOIN, is right around the corner, and we all know conferences = announcements!


Specializing Retention Emails Using Behavioral Data


The “user experience” can now be understood to span everything from packaging and delivery, to customer service and troubleshooting, to social content and email communications. People have come to expect impeccable UX across the board; therefore, to remain relevant within a customer’s inbox, you must curate highly original experiences at scale. Using Looker, you can identify users ready for various types of nurture, build specific content for them, and showcase value with each and every send. In this post we’ll talk about different types of emails and how to ensure you are sending the right email to the right person at the right time.

Customer Nurture Like You Really Mean It

Retention emails are an integral part of maintaining a solid relationship with customers who have ‘qualified themselves’ as interested (albeit to varying degrees). Customers may express faint interest in your business: requesting more information, submitting personal data for a ‘free trial’, or bookmarking your product and/or service for review. Customers past these initial stages are heavily engaged with your product: using the product, requesting upgrades, and perhaps making several related purchases. Then there are customers who have adopted loyalty to your product; perhaps you’ve even hit the holy grail of evangelism with this segment. These individuals should never be taken for granted. Just because a customer engages with you does not mean they’ll actually come on board; once on board, it doesn’t mean they’ll stay; and if they adopt you as “the one,” it’s your responsibility not to break their hearts.

Inactivity Emails

Just like it sounds, this genre of email acknowledges users who seem to have ‘dropped off the map’. There are various approaches to re-engagement depending on what data you are analyzing and what immediate goal you are trying to achieve. User behaviour that would be helpful to map in this case includes: how long users have been inactive, at which point users tend to drop off, when they seem to ‘wake up’, and what message inspires them to participate. Here is an example of the tried and true coupon-offer approach:

In this email GrubHub is offering me a deal since I haven’t ordered in several weeks and have been known to order more frequently. Now, I’m not sure whether it’s a happy coincidence or not that they have featured Asian cuisine (my favorite) but I am finding myself super enthused about potentially ordering dinner tonight. If they did do it on purpose then good on them, and if they didn’t do it on purpose then they might want to try personalizing food visuals more often! That is to say, it would be cool if user behaviour was measured to find out which foods are the ‘most-ordered’ or ‘general favorites’ so email templates could be customized to match beloved cuisines to eager foodies.

Within Looker, you can identify friction in the funnel (i.e. seeing where users become inactive) and confidently leverage user behaviour to minimize inactivity by sending nurture at weak points.

Activity Emails

Conversely, activity emails can be spurred by a variety of user actions (and even some combinations of inaction). Below is an email that was sent in response to the fact that I visited multiple pages of this site, technically ‘abandoned cart’ at the end of browsing, and later revisited the site several more times over the course of fifteen days to see what was available.

In this email, BarkBox is ‘testing the water’ to see if I might benefit from trying a short-term deal (like a teaser of the real thing). From my spontaneous online behaviour, it seems that BarkBox was confidently able to infer my reluctance to commit to a lengthy subscription (being a new user and all). Their strategy is to empathize with me, allowing me to become more familiar with their service in hopes of convincing me to commit to the standard subscription at a later date, once good rapport has been established.

Once connected to your database, Looker enables you to analyze more granular event data so you can examine the user journey the way you want. It’s like being granted a 360-degree view of how visitors are interacting with your website, with the ability to drill all the way down to the specifics that help you piece together even the most puzzling customer journey.


Renewal Emails

This email is sent in anticipation of a renewal of a service, and is best sent as a series. An example drip strategy is 60 days before expiration, 30 days before expiration, one week before, the day of, and then a week or so after if you haven’t managed to convince them by then. Looker analysis can help you identify the ‘sweet spot’ to send this email, thus eliminating the guesswork so you can focus on creating more valuable and entertaining content. The email below did a great job of reminding me I should probably touch up my roots:
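
One hedged way to drive that kind of drip timing from your data in Looker is to model days until renewal and bucket it into stages; the field names and the Redshift-style DATEDIFF below are assumptions.

  # Hypothetical dimensions for renewal-drip targeting.
  # DATEDIFF syntax is Redshift-style; renewal_date is an invented column.
  dimension: days_until_renewal {
    type: number
    sql: DATEDIFF(day, CURRENT_DATE, ${TABLE}.renewal_date) ;;
  }

  # Buckets that roughly line up with a 60 / 30 / 7 day drip schedule
  dimension: renewal_drip_stage {
    type: tier
    tiers: [0, 8, 31, 61]
    style: integer
    sql: ${days_until_renewal} ;;
  }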

I do short subscriptions (instead of a yearly renewal) with Madison Reed because I’m really fickle; I like to change up my hair color every three to six months. To remind me it’s almost time to select my shade of choice, I get an email from Madison Reed the week before I’m due to color my hair (during onboarding I designated that I’m on an eight-week cycle). This type of email is valuable because it helps busy people stay on top of the purchases they regularly make.

Cross-Sells & Up-Sells

To cross-sell is to promote a different (yet complementary) product/service to an existing customer. To upsell is to promote an upgrade or an add-on to an existing customer... Amazon.com does a lot of both! The company regularly up-sells Amazon Prime services with pop-up promotions, usually while I am at the virtual ‘check out’ on the website. I also receive a weekly digest of items related to my past purchases or based on my personal settings. The email below is a digest of newly added books I might find interesting:

Here Amazon takes into consideration that I have repeatedly bought books from them and that I have a designated interest in Political Science titles; and though it has been some time since I bought a book, I almost always click through when I get this email!

VIP Offers

Promotional offers are always a joy to receive, but it means something more when a company recognizes your devotion (having dropped mad cash repeatedly on their products) and sends you an exclusive perk. For example, GAP invites me to ‘shop first’ and beat the crowd to the big deals:

Offering two different promotions (one online and one in-store) allows me the liberty to shop however I please! This is also an interesting opportunity to monitor what I end up choosing; in Looker, you would be able to see how shoppers move on VIP deals over time. By identifying trends, you can organize more thrilling customer experiences accordingly (throwing special events at stores or gamifying the online shopping experience, for example).

Surveys

An email featuring an invite to take a survey is sent to ‘check in’ with new customers to see how adoption is going (within five to thirty days), to gain feedback from existing customers on service updates they’ve accepted (within fourteen to thirty days of an update), or to present existing long-term customers with the opportunity to personalize their own experience with a service, such as the one below:

Surveys are probably the most dreaded (yet most useful) B2C interaction. In Looker, you can find out the best timing to present a survey and keep track of how a variety of incentives fare in stimulating big turnouts. While I found this plain-text email to be a bit lackluster for Adobe (a company that basically sells creativity), I was happy to have a chance to curate an ideal experience for myself regarding the Adobe Create Magazine.

Email Metrics Can Impart Unparalleled Insights

Email data is like a treasure trove for behavioral studies all on its own — every customer interaction (or lack thereof) can lead marketers to form a deeper understanding of their performance:

  • Which emails did a user open and/or click on?
  • Which emails did a user ignore? Which missed the inbox completely?
  • Which emails caused users to unsubscribe?
  • What type of offers do users respond to most often?
  • How long ago was a user’s last interaction with an email (days, weeks, months)?

Running a simple study (like the one outlined below) can help marketers more acutely discern what customers expect at various stages of the relationship-building journey.

  1. Begin by sorting users into 3 main groups according to frequencies of interaction: high, moderate, and slight.
  2. Then organize A/B tests for copy, offers, template design, timing, even the levels of personalization within each of these groups.
  3. In comparing the results, notice what resonated within each group and use this data to form best practices.

All in all, the way users behave with email should help shape what kinds of campaigns you build, how often you send them, and what stage in the relationship is best for each communication. Business analysis in Looker can help you find the optimal segment to reach with each campaign, as well as pinpoint opportunities to run spin-off campaigns. Retention emails are supposed to feel like part of a grander conversation; keep it personal, keep it fresh, and create experiences that build and expand upon the last.

4 Questions to Ask Yourself Before You Start a Marketing Dashboard


Have you ever heard the quote “Those who do not learn history are doomed to repeat it”? Data is a business’s way of learning its history. Without looking at what we have done, we cannot know what worked and what did not.

How about this one - “Without data, you’re just another person with an opinion.” You will always have opinions that are fueled by experience (helpful) and emotion (sometimes not so helpful). But when you add data, you still have the experience and emotion, and you also have cold, hard numbers.

Okay, last quote - “I have not failed. I’ve just found 10,000 ways that don’t work.” Data allows you to test things quickly. If you can access granular data, you can see successes and failures before they impact higher-level metrics. If you test something and can quickly identify that it doesn’t work and change that test, it’s not a failure; it’s a learning.

Dashboards are a great way to access and share this data. A powerful dashboard not only brings data into the conversation — it tells a story with that data.

In this series of blogs, I will talk through some of the best practices I follow every time I build a new dashboard, as well as some of the analytical foundations that go into telling the best story possible with data.

So you’re making a new dashboard. Wahoo! Before you dive into the data and start building tiles, I recommend asking yourself these four questions. In my experience, answering these questions at the outset helps me to create a more effective dashboard. So, without further ado, here we go!

  1. What am I going to do with it?
    If the answer to this question does not immediately fly to the tip of your tongue, you probably don’t need a new dashboard. There are a myriad of creative things one can use a dashboard for, but for the most part, the goal of a dashboard is to….

    1. ...track an ongoing initiative or campaign. This could be anything from quarterly meeting goals to current ad campaigns. I have a number of different dashboards that help me keep an eye on my various in-flight campaigns, both big and small, so I can see if we need to tweak anything.
    2. ...look up known entities. These are great, because they allow you to quickly pull up all the information you need on anything from a customer to a campaign. My favorite here is my event lookup dashboard, which allows me to look up any past event to see the current state of the leads associated with it.
  2. What question(s) am I trying to answer?
    This answer may take a little bit longer to come up with, since it’s generally a more complex thought. A great dashboard combines multiple pieces of data to answer a broader question. Instead of asking “How many people have registered for my event?” you want to think about the bigger picture: “Am I on track to hit my registration goal?” This larger question allows you to look at the problem from multiple angles and gives a fuller picture of the situation. When you start to build, this answer will become your guiding light. As such, we’ll come back to it later in this post.

  3. Is there an existing dashboard that I can use to answer my question?
    As marketers, we have learned to reduce, reuse and recycle anything possible. Can we turn this blog into a white paper? What about a webinar? The same principles hold true for dashboards. Why create something new if someone has already done the work? If you can’t find something that answers your question completely, you still might be able to find a dashboard you can use as a jumping off point.

  4. What am I measuring?
    Before you can start putting a dashboard together, you have to understand what you’re measuring and how you’re measuring it. Here are some examples of simple ways to measure your work (a rough LookML sketch of a few of these follows the list):

    1. Attempts - How many times did someone try to do something?
    2. Impressions - How often do people see something?
    3. Successes - How many times did people successfully do that thing?
    4. Conversions - What percentage of people who did X progressed to Y?
    5. People - How many different people did something?
    6. Totals - What was today’s total revenue/total time on site/total net profit?
    7. Averages of things that vary - What was the average/median/mode number/time of this thing?
    8. Quality - How many people used X and how many tried to use X but failed?
    9. Quality Ratios - What percentage of the people who tried to do X had issues?
    10. Against basic dimensions - Here we’re going to combine two pieces of data to answer a more specific question. For example: How many orders were placed today by people between 16 and 25? What about between 26 and 35?
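
    To ground a few of these, here is a hedged LookML sketch of attempts, successes, people, and a conversion rate; the status field and its "success" value are invented.

    # Hypothetical measures for attempts, successes, people, and conversion.
    # The status field and "success" value are invented for illustration.
    measure: attempts {
      type: count
    }

    measure: successes {
      type: count
      filters: {
        field: status
        value: "success"
      }
    }

    measure: people {
      type: count_distinct
      sql: ${user_id} ;;
    }

    measure: conversion_rate {
      type: number
      value_format_name: percent_1
      sql: 1.0 * ${successes} / NULLIF(${attempts}, 0) ;;
    }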

    Now that you’ve answered all those questions at a high level, it’s time to start building. Check back next week for part two of this series. In that post, I will step through outlining a dashboard, as well as touch upon some of the key things to keep in mind as you build.

AWS + Looker: Getting the Most Out of the Cloud


Amazon Web Services (AWS) is immensely popular among modern companies because Amazon has found a way to strike a great balance between flexibility and usability. AWS gives you every tool you could imagine in the cloud, with none of the overhead of managing hardware yourself.

With compute, storage, networking, database, and dozens of other services, AWS gives companies the flexibility and customizability to address virtually any use case.

AWS’ services can be combined in almost any way you can imagine. And Amazon Redshift, AWS’ data warehouse solution, gives you tons of customization options—from specifying how your tables are distributed, to the type of hardware you want, to how much computing power you allocate to different queries.


Do Even More with AWS + Looker

So how can companies easily get the most out of all this power and customizability?

That’s where Looker comes in. Looker gives users deep analytics on the immense data that Amazon collects and makes available, giving them the ability to drill down to the row-level detail that makes up larger usage trends. And Looker does it all in an easy and intuitive data platform that enables anyone to explore their data.

Today we’re launching the Looker Blocks Suite for AWS. It includes blocks that surface critical information and help you optimize your Amazon Redshift usage, so you can fine-tune your database for maximum performance. It also includes blocks that analyze where you’re spending money on AWS, making it easy to reduce any unnecessary costs and reallocate that spend where it’ll do the most good.
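
As a hedged example of the kind of information such blocks surface, a LookML view could sit on top of Redshift’s svv_table_info system view. The exact fields a block exposes will differ, and the column mapping below should be checked against Redshift’s documentation.

  # Hypothetical view over Redshift's svv_table_info system view,
  # surfacing table size and sort-health metrics for tuning.
  view: redshift_table_info {
    sql_table_name: svv_table_info ;;

    dimension: table_name {
      type: string
      sql: ${TABLE}."table" ;;
    }

    dimension: size_mb {
      description: "Table size, in 1 MB blocks"
      type: number
      sql: ${TABLE}.size ;;
    }

    dimension: percent_unsorted {
      type: number
      sql: ${TABLE}.unsorted ;;
    }

    measure: total_size_mb {
      type: sum
      sql: ${size_mb} ;;
    }
  }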

Today’s technology must be flexible enough to adapt to the changing needs of your business, but also intuitive and easy to use. And that’s exactly what Looker’s new Blocks Suite for AWS is.

Want to learn more about how Looker works with Amazon Web Services? Check out our AWS ecosystem page. Want to learn more about the blocks contained within the Looker Blocks Suite for AWS? Get in touch with us, we’d be more than happy to demo them for you.

Build a Path to Predictive Analytics with Big Squid & Looker


What will your Looker investment look like in 2018? Will it incorporate machine learning and predictive analytics? Does it drive new insights and compel better decisions? Will it build upon your view of the business today to paint a picture of what lies ahead? Can it direct decisions at the front lines of the business? All interesting questions to ponder; and if you don’t know the answers, you’re in good company. With all the buzzword bingo floating around the business landscape, most of us have heard of “machine learning,” “artificial intelligence,” and “predictive analytics,” but few have actually integrated them into the daily operations of our businesses.

For true trailblazers in the business analytics world, such concepts will be ubiquitous in business intelligence in the coming years. They’ll be utilizing such solutions pervasively across their business, in each department and role within it. For others, this seems unattainable, if not unapproachable.

From our standpoint at Big Squid, the greatest barrier to bringing powerful solutions like predictive analytics to the broader business community is the scarcity of resources able to build and deploy such capabilities. Equipped with an understanding of sophisticated mathematical approaches and necessary programming chops, data scientists are difficult to find, afford, and scale. Fortunately, they no longer need to be the only way to implement these sorts of solutions.

At Big Squid, we are solving this problem by bringing predictive analytics to the business decision maker. What does that mean? It means that we created a SaaS platform for business leaders that leverages data within your existing Looker platform in order to make predictions about the future of your business. With the recent boom in Business Intelligence (BI) investments for midmarket to enterprise level companies, the question has become what to do with ALL that data? The crux is: how do you operationalize the data now that you have it? Big Squid’s predictive analytics platform is a valuable extension of your Looker Data Platform in that we’re applying machine learning to gain insight into the probable future state of the metrics that matter most.

Filling the Predictive Analytics Gap


Data platforms, like Looker, are necessary for understanding the Descriptive and Diagnostic phases of your business intelligence value chain. In fact, data and BI platforms are essential to gaining insight into what is happening right now within all aspects of your business. However, many mid-market to enterprise-level companies are struggling to remove the complexity of data science and make informed, data-driven decisions about the future. To offset the shortage of data scientists in the marketplace, businesses are finding the need to use solutions (like Big Squid) that leverage the existing employee base. Such solutions can turn data specialists into Citizen Data Scientists, empowering them to bring new, forward-looking insights to the business through their existing Looker platform.

"Predictive Analytics & Machine Learning has incredible interest and value to more accurately answering business-critical questions of decision makers, but are bottle-necked by the need for scarce “data scientists” personnel that is often non-business-facing."

“Give Business Users the ability to Forecast Key Business Metrics using sophisticated predictive analytics & machine learning with less bandwidth from scarce data scientists.

The traditional workflow below should look familiar. The problem with this flow is that a data scientist is required to spend a majority of their time collecting and preparing the data in order to begin the valuable analysis and engagement phases. This is where the business can make decisions and take action.

As a Looker customer, this is great news: because you’ve already prepared your data for exploration and visualization in Looker, you’re ready for predictive analytics.

Collecting, preparing, cleaning, and ultimately staging data for organization-wide use is by far the lion’s share of the work required to build a predictive model. And since that lift is already in place, extending our analysis to incorporate sophisticated forecasting is absolutely within reach.

Our approach to predictive analytics vastly simplifies the model exploration and deployment process without sacrificing rigor. Our predictive analytics platform has significant advantages over other expert tools (SAS, R, Python, etc.) in that it gives data analysts the ability to become data scientists today. They can apply their understanding of data structures and business problems to provide key business decision makers with better insights into future trends. This saves time and money.

While this is just the tip of the iceberg, now you can engage other individuals at your business and start the conversations around how to gain better insight into the future of your business all within your existing Looker investment.

Where do we see customers exploring these capabilities? Lots of areas. Keep a lookout for new Looker blocks from Big Squid supporting an ever growing number of use cases. Until then, here are example solutions we’ve built by vertical:


We want to help you become a Data Science-Driven Business.

Want to learn more? I am hosting a webinar with Looker on August 23rd; sign up here to learn how to harness the power of machine learning with ease.

We will also be at JOIN; stop by our booth to say hello or sign up for one of our Data Science Labs here, and we will help you on your path to predictive analytics no matter where you are in your journey. These labs will show you exactly how predictive analytics can drive change in your business with the data you already have.

How We A/B Test Looker.com at Looker


ab

If you’re anything like me, as soon as your manager came to you with the novel idea of testing the company’s website to improve conversion, you might have done a few (or all) of these things.

  1. Stare blankly into the space between their eyes and nod.
  2. Say you’ll have something for them in a week.
  3. Try to figure out where to start.
  4. Google “how to A/B test.”

And, if you do actually search “how to A/B test,” you’ll get a ton of results—62,400,000 to be exact-ish. From beginner’s guides to “proven” tactics and ideas, it can get pretty overwhelming to figure out how to get your testing strategy and process started.

So when it came to A/B testing looker.com, I started where any employee of a data-obsessed company would: with our web analytics data. With that came a starting point for testing ideas, strategies, and processes that we continue to optimize and fine-tune, which I’ll be sharing a bit of here. I’m not going to call it a “definitive guide” or claim a “proven list” of things to do, but this is how we did it. And hopefully you can learn something from our successful and failed tests…So let’s get started!

Gathering and Prioritizing Test Ideas

Generating test ideas is both the most fun and the most difficult part of testing. When I started by searching the internets for ideas, I got more test suggestions than I knew what to do with. There are those tried and true tests you can run, from changing the copy to the color of a button. The hard part is figuring out what will have the highest impact.

Here at Looker, our resources are limited, so it was important to focus on the experiments that would get us the biggest impact with a minimal amount of resources. To do this, I turned to our web analytics data through Looker + Google Analytics to help us answer the following questions:

  • What are people doing on our website?
  • Where are people dropping off?
  • Who are the people that are converting?
  • Who are the people that aren’t converting?

Through that, we were able to identify key pages that needed help, and how to target those tests to specific audience segments. Great! So, web analytics gave us the what, but we were still missing the why.
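Before moving on to the why, here is a minimal sketch of the kind of query that answers the what. The table and column names (web_events and conversions) are illustrative stand-ins, not our actual Google Analytics model:

-- Illustrative sketch: assumes a web_events table with one row per pageview
-- (session_id, page_path) and a conversions table with one row per converting
-- session (session_id). Real GA schemas will differ.
SELECT
    e.page_path,
    COUNT(DISTINCT e.session_id) AS sessions,
    COUNT(DISTINCT c.session_id) AS converting_sessions,
    1.0 * COUNT(DISTINCT c.session_id) / COUNT(DISTINCT e.session_id) AS conversion_rate
FROM web_events AS e
LEFT JOIN conversions AS c
    ON c.session_id = e.session_id
GROUP BY e.page_path
ORDER BY sessions DESC

Pages with lots of sessions but a low conversion rate are the natural candidates for testing.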

To find out why people were or weren’t converting, we ran polls on key pages, asking people a series of questions that would better inform us about why they were interacting with the site a certain way. This helped us learn which parts of these high-value pages to test.

Ideas also came from a series of brainstorming sessions with different groups and departments throughout the company, surveying current users, and, yes, some Google search results.

Deploying Tests

Once we had the ideas, we had to define the goals of the test and turn them into hypotheses. We couldn’t just test a button color for the sake of testing a button color. Instead, we had to formulate a test as a hypothesis so that results could be properly evaluated against the expected outcome and goal.

Once the goals and hypotheses were finalized for each test, we prioritized which tests to deploy by looking at a few variables:

  • What question do we want to answer right away?
  • Which tests will yield the highest impact?
  • Which tests require the least amount of resources?

From there, we use Optimizely to deploy the test to the web visitor, working across the marketing department to develop copy, visual assets, and code the variations. A lot of our early tests were simple A/B tests that could be deployed quickly and easily, with a potential for high impact.

Measuring Results

Tests would run anywhere from 2-4 weeks, depending on our traffic at any given time. More important than the length of time for a test, though, is your sample size. The smaller your sample size, the higher your risk of reaching a false positive.

Unfortunately, I’m not a statistician or data scientist. So while I understand the importance of statistical significance, I’m not the best person to figure out how to calculate that.

Thankfully, there are a lot of products out there that calculate statistical significance for you—including Looker! Looker’s A/B Testing with Statistical Significance Block automatically and easily allows you to see the statistical significance of a test variation for the control and test user groups. You can also drop in different dimensions to see how the user groups, key test metrics, and test variations perform for different user attributes.

/end shameless plug
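For the curious, the math behind this kind of significance check is typically a two-proportion z-test. Here is a minimal, hand-rolled sketch of that calculation in SQL, assuming a hypothetical ab_assignments table with one row per visitor; this is just the underlying idea, not the Block itself:

-- Hypothetical table: ab_assignments(user_id, variation, converted),
-- where converted is 1 if the visitor reached the goal page, else 0.
WITH by_variation AS (
    SELECT
        variation,
        COUNT(*) AS n,
        AVG(1.0 * converted) AS cvr
    FROM ab_assignments
    GROUP BY variation
)

SELECT
    a.cvr AS control_cvr,
    b.cvr AS test_cvr,
    -- pooled two-proportion z-score; |z| > 1.96 corresponds to ~95% confidence
    (b.cvr - a.cvr) / SQRT(
        ((a.cvr * a.n + b.cvr * b.n) / (a.n + b.n))
        * (1 - (a.cvr * a.n + b.cvr * b.n) / (a.n + b.n))
        * (1.0 / a.n + 1.0 / b.n)
    ) AS z_score
FROM by_variation AS a
JOIN by_variation AS b
    ON a.variation = 'control' AND b.variation = 'variation_1'

One practical note: decide your sample size before the test starts. Checking the z-score every day and stopping the moment it crosses 1.96 inflates your false-positive rate.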

Surprising Results

As anyone who has run a test knows, the results can be . . . unexpected. Though a majority of our tests come back as inconclusive, there have been a few that took all of us by surprise.

Copy Test

Control:

ab

Variation:

ab

One simple test was to change some of the wording on our form page to request a demo of Looker. We were surprised by how just making small changes resulted in a big impact. We updated the copy to highlight important, yet less prominent, product features and saw a 13% higher conversion rate (CVR) than the control.

This is a great example of a test that made a small change, was easy to execute, and had a significant impact on CVR on a key page.

Layout & Content Test

We have quite a few different personas and audiences that visit looker.com. We’re always thinking about the best way to deliver content to the visitor based on what they’re interested in.

One example of this is our video demo page. We have both a technical and a business user demo, so we decided to feature both videos on our video demo page. We believed visitors would select the video they wanted to watch. Then, if they liked the content, they would fill out the form to request a trial.

Over the next few weeks though, we noticed CVR for this page decreasing, and we didn’t know what the problem was. Was it the videos themselves? Were we showing the visitor too many choices in the action they should take? Or were they just not seeing anything they liked?

Instead of making a guess, we decided to test it.

First, we decided to simplify the page, working with the hypothesis that featuring just one video would make the next step (form-fill) clearer, which would lead to an increase in CVR for this page.

We then added a layer of complication by introducing another variable to this test. We would also test which of the two videos would convert better by either serving the visitor the technical demo or the business demo.

Control:

ab

Variation 1:

ab

Variation 2:

ab

After a few weeks of testing, the results showed that our hypothesis for the layout of the page was correct. In the single-video variation, we saw a 15% increase in visits to the confirmation page. And since we’ve rolled the single-video layout to 100% of visitors, we’ve seen even bigger increases in conversions for this page.

But it wasn’t so clear on the content side. While one video did perform slightly better than the other, the results were still too close to call, which meant we couldn’t reach statistical significance in the test timeframe. So the question now isn’t “What is the best content for this page?” but instead “What is the best content for this web visitor on this page right now?”

The Recap (or TL;DR)

  • Ideas are everywhere! So it’s important to do your own research before you A/B test.
  • Look at both quantitative and qualitative information so you get both the what and why to formulate your tests around.
  • Make sure each of your tests has a goal and a hypothesis.
  • Prioritize your tests based on your own requirements, which should be clearly defined before you start testing.
  • Keep an eye on statistical significance so that you know your test results aren’t random or by chance.

At Looker, we’re constantly testing the website. What you see on looker.com today could be completely different from what you see on Friday—not because we’re always redesigning, but because you’ll see a different variation of that test on Friday. Be sure to stop by to see what tests we’ve got going on!

Ready to start testing? Request a demo to learn more about the A/B Testing Looker Block and see how Looker can help you understand your test results.

Lightning Talks at JOIN 2017: What to Expect


Lightning Talks

When we sent out a call for speakers to present at JOIN 2017, we were excited and humbled by the number of data enthusiasts eager to talk about how they used Looker to change their companies. We had a finite number of speaking sessions and they filled up quickly, but we wanted to find a way to share the passion and enthusiasm our customers were sharing with us with the wider world. So we started brainstorming some new opportunities for speakers.

One presentation format we found while researching was the IGNITE talk: a 5-minute presentation in which the presenter gets 20 slides that automatically advance every 15 seconds. There is no stop or back button; just press go and start presenting.

It sounded like a unique, albeit challenging, medium so we decided to try it out with Lookers as our guinea pigs.

We held a happy hour and invited any and all Lookers to present Lightning talks in an IGNITE-style. We had talks on how to buy a house in your 20s, how to use Looker to manage your finances, the history of IGNITE itself and more.

The evening was informative, funny, and a little bit terrifying for those of us who presented. We all learned a lot, and everyone, audience and presenters alike, had a great time doing it.

IGNITE talks are a special type of challenge for speakers. Distilling a complex topic that you could talk about for hours down to 5 minutes is hard. You also don’t have any wiggle room with the structure of your presentation — 20 slides, no more, no less. And staying in line with your slides takes a lot of practice because it’s easy to get lost and hard to get back in sync.

After the happy hour, it was clear this format was exactly what we wanted for JOIN. We got people talking about topics that got them jazzed. With 5 minutes, there’s no time for fluff or filler, so you get the purest version of the story.

To boot, Lightning talks sparked an instant connection between speaker and audience. The unforgiving structure, coupled with the possibility of failure, means the speakers have to get really honest with their audience really quickly, and the audience has to be forgiving of any flaws that show. Because of this trust, the audience became more invested in what the speaker was saying as the talk went along. If someone tripped up, they were able to laugh at themselves, and when they had perfect timing with their slides, everyone in the room cheered.

So this year at JOIN we’ve decided to dedicate a track to Lightning Talks. We pulled the most compelling stories out of the submissions and challenged the speakers to boil their story down to 5 minutes.

Hopefully, this format will allow the passion and enthusiasm our customers shared with us to be passed along to the entire JOIN audience. Because when someone presents something they really care about in a succinct and fun way, everyone in the room cares about it too.

If you already have your ticket to JOIN, we hope to see you at the Lightning Talks session. If you haven’t gotten your ticket yet, there is still time!

Multi-Dimensional Segments For Improved Customer Nurturing


multi

Email is intrinsic to user lifecycle marketing. A strong email strategy begins with anticipating the user journey, mapping out all potential points of friction, and recognizing opportunities to connect with customers. Even though user nurturing will be a unique process for every organization, the following phases of the user experience are common across the spectrum and important to consider:

  • Discovery Phase - Maybe it was an advertisement, social media post, or SEO but some glorious moment brought users to your site. If the initial impression and value proposition of your brand is compelling enough then users will gladly hand over their email addresses, thus initiating a relationship.
  • Potential Building - During this phase it’s essential that you educate new audience members about your brand’s most unique and outstanding qualities, nurturing them into making their first purchase, initiating a subscription, or starting a trial.
  • Momentum Building - From the moment a user begins interacting with your website, products, and services it’s time to give them the star treatment. Outstanding product development coupled with regular delivery of meaningful content to each individual is the only track to long-term success.
  • Loyalty Development - Brand evangelists are the lifeblood of any business. Through avid nurturing and consistent listening, your product development and customer services can stand the test of time.
  • Wavering Interest - Disenchantment can happen over time (faulty perception of demand, weak branding, slipping product quality, strong competition in the market) or quite suddenly (advertising disaster, bad consumer experiences and reviews, damning social media or PR). While there is an email campaign for almost anything under the sun, one should definitely try to maintain relationships rather than be forced to salvage them.

As data accumulates around each user, the opportunity to continuously personalize the user experience grows, giving you a competitive edge in advancing the adoption of your brand and reducing churn.

The Looker Vantage Point

Real-time data equals real-time content for real-time segment building. Looker allows marketers to build complex, live segments to quickly and easily study their audiences. Sendwithus gives marketers the freedom to manage templates and multi-dimensional segments to achieve a targeted reach. With the combination of both a data platform and an email management platform, anyone on your marketing team can become ‘the master’ of creating customer-centric experiences. Instead of sending generalized campaigns to broadly defined segments, you can ‘dig deep’ and design customized content based on any customer activity you are able to track.

Specializing Content for Advanced Segments

Multi-dimensional segments can be built by considering various aspects (e.g. a user's position in the sales pipeline, user activities, user demographics) all at the same time. Here are some content specialization and segmentation tips for popular types of campaigns, each one corresponding to the phases we discussed earlier. Let’s say it’s for a popular e-commerce site (like Etsy):

Welcome Email

When a person first registers with a website, they will usually receive the traditional ‘triggered verification email’.

But the second part of the welcome series can be highly specialized. For example, you can consider how people behaved upon completing the registration process. For users who register but remain less active, you could infer that they might benefit from a friendly ‘how-to’ or ‘cool features’ themed email to bring them more on board. On the other hand, if a user was moderately active in their first visit, you can specialize the email to outline advanced ‘Etsy profile-building prompts’ to optimize their experience (example below).

And if a user was highly active (i.e. shopped for items, favorited items, viewed multiple pages, and/or followed other users), you can provide them with a highly personalized welcome experience that introduces them to more of what they love, with the addition of customized tips and tricks.
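As a rough illustration, a segment like this can be expressed as a single query over your event data. The table names, the 24-hour window, and the activity thresholds below are all assumptions you would tune for your own product:

-- Hypothetical tables: users(user_id, registered_at) and
-- events(user_id, event_type, created_at).
-- Buckets last week's new registrants by how active they were in the
-- 24 hours after signing up, so each bucket can get a different welcome email.
SELECT
    u.user_id,
    CASE
        WHEN COUNT(e.event_type) = 0  THEN 'low_activity'       -- send the how-to email
        WHEN COUNT(e.event_type) < 10 THEN 'moderate_activity'  -- send profile-building prompts
        ELSE 'high_activity'                                    -- send personalized tips and tricks
    END AS welcome_segment
FROM users AS u
LEFT JOIN events AS e
    ON e.user_id = u.user_id
   AND e.created_at BETWEEN u.registered_at AND u.registered_at + INTERVAL '1' DAY
WHERE u.registered_at >= CURRENT_DATE - INTERVAL '7' DAY
GROUP BY u.user_id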

Promotional Email

At the intersection of a user’s data (e.g. their location, their search history, their general interests, and the price range they generally shop in) is their ‘dream digest’. Create campaigns that pull the ‘best of the best’ for them to experience while mixing in a few surprises to keep it unpredictably interesting.

Newsletter

Unlike a digest email (which promotes favorited, saved, searched, or abandoned-cart items), a proper newsletter should provide long-form creative content like videos or blogs, mixed with updates to services or specially themed product features. All of this should inspire current users to continue to use the site, make purchases, and look forward to enjoying the free personalized content curated for them (i.e. email content as an extension of the service).

Win-Back Email

The win-back email should charismatically address whatever it was that may have caused everything to go off track, plus offer incentives! It is a great time to offer store credit for survey answers, or perhaps even feature some exciting news about product or service updates. Gaining a better understanding of what’s missing from the user experience can help you address pain points at scale. Looker dives into retention analysis and the data that drives it in great depth, along with resources to help model your data.
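A simple way to build that win-back segment is to look for customers who have purchased before but have gone quiet. The table names and the 90-day cutoff below are illustrative assumptions:

-- Hypothetical tables: users(user_id, email) and orders(user_id, created_at).
-- Finds customers with at least one past purchase but no orders in 90 days.
SELECT
    u.user_id,
    u.email,
    MAX(o.created_at) AS last_order_at
FROM users AS u
JOIN orders AS o
    ON o.user_id = u.user_id
GROUP BY u.user_id, u.email
HAVING MAX(o.created_at) < CURRENT_DATE - INTERVAL '90' DAY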

Ideally, the content you feature in an email should impress the people you are trying to reach. Today, however, people search for ‘value at a glance’, meaning you’ve got to wow them almost immediately. To avoid becoming ‘white noise’ in the inbox, you will have to diversify content and consider querying finer segments for higher engagement.

See Looker and Sendwithus in action!

Visit the solutions page to learn how the sendwithus email optimization platform can help your marketing team create better email experiences. For a trial of sendwithus, simply sign up here.


AWS Cost Optimization in Minutes: Usage Data, Cost Reports, S3, and Athena


aws_cost

As the breadth of AWS products and services continues to grow (hundreds of new products last year alone!), customers are able to move ever-increasing portions of their tech stack and core infrastructure to the cloud.

From a pricing perspective, this can be incredibly attractive. Rather than paying heavy upfront costs for large on-premise systems, customers can pay for services on-demand or reserve them for specific periods of time, and automatically scale resources as needed. While this is often cheaper, pricing each service by the hour naturally lends itself to a more complicated and nuanced cost structure, making it difficult to fully understand where you’re spending the most money and how to reduce costs.

While the AWS Cost Explorer is great for aggregated reporting, conducting analysis on the raw data using the flexibility and power of SQL allows for much richer detail and insight, and is ultimately the better choice for the long term. Thankfully, with the introduction of Amazon Athena, monitoring and managing these costs is now easier than ever. With Athena, there’s no need to create hundreds of Excel reports, move data around, or deploy clusters to house and process data. Analysis can be performed directly on raw data in S3. Conveniently, Amazon exports raw cost and usage data directly into a user-specified S3 bucket, making it simple to start querying with Athena quickly. This makes continuous monitoring of costs virtually seamless, since there is no infrastructure to manage. Instead, users can leverage the power of the Athena SQL engine to easily perform ad-hoc analysis and data discovery without needing to set up a data warehouse.

Once the data pipeline is established, the cost and usage data (the recommended billing data, per AWS documentation) provides a plethora of comprehensive information around usage of AWS services and the associated costs. Whether you need the report segmented by product type, user identity, or region, this report can be cut and sliced any number of ways to properly allocate costs for any of your business needs. You can then drill into any specific line item to see even further detail, such as the selected operating system, tenancy, purchase option (on-demand, spot, or reserved), etc.

Athena utilizes Apache Hive’s data definition language to create tables, and the Presto querying engine to process queries. By default, the Cost and Usage report exports CSV files, which you can compress using gzip (recommended for performance). There is some additional configuration and options for tuning performance further, which we discuss below.

In this blog post, we’ll walk through setting up the data pipeline for Cost and Usage Reports, S3, and Athena, and discuss some of the most common levers for cost savings.

1. Setting up S3 and Cost and Usage reports

First, you’ll want to create a new S3 bucket. Then, you’ll need to enable the Cost and Usage report (AWS provides clear instructions on how to create this report). Check the boxes to “Include ResourceID” and receive “Hourly” reports. All options are prompted in the report-creation window. Lastly, be sure to assign the appropriate IAM permissions to the bucket for the report. The permission policy can be found in step two of the report creation process (using the link above).

aws_cost

2. Configuring the S3 bucket for Athena querying

The Cost and Usage report dumps CSV files into the specified bucket. As with any AWS service, make sure that you’ve granted appropriate permissions for Athena to that bucket.

In addition to the CSV, AWS also creates a JSON manifest file for each report. Athena requires that all of the files in the S3 bucket are in the same format, so we need to get rid of all these manifest files. If you’re looking to get started with Athena quickly, you can simply go into your S3 bucket and delete the manifest file manually, skip the automation described below, and move on to step 3.

aws_cost

If you want to automate the process of removing the manifest file each time a new report is dumped into S3 (recommended, especially as you scale), there are a few additional steps. The folks at Concurrency labs wrote a great overview and set of scripts for this, which you can find in their Github repo.

These scripts take the data from an input bucket, remove anything unnecessary, and dump it into a new output bucket. We can utilize AWS Lambda to trigger this process whenever new data is dropped into S3, or on a nightly basis, or whatever makes most sense for your use-case, depending on how often you’re querying the data. Please note that enabling the “hourly” report means that data is reported at the hour-level of granularity, not that a new file is generated every hour.

Following these scripts, you’ll notice that we’re adding a date partition field, which isn’t necessary but increases query performance. In addition to compression (taken care of for us automatically) and partitioning, the third lever for performance improvements is converting the data from CSV to a columnar format like ORC or Parquet. We can also automate this process using Lambda whenever new data is dropped in our S3 bucket. Amazon discusses columnar conversion at length, and provides walkthrough examples, in their documentation.
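For reference, if you do go the partitioning route, the table needs to be declared with a PARTITIONED BY clause and each new chunk of data registered as a partition. The partition column name and S3 path layout below are illustrative and depend on how your output script organizes the files:

-- Illustrative only: assumes the table was created with
--   PARTITIONED BY (report_month string)
-- and that the output script writes each month's files under its own prefix.
ALTER TABLE cost_and_usage ADD IF NOT EXISTS
    PARTITION (report_month = '2017-08')
    LOCATION 's3://<<your output bucket name>>/2017-08/';

Queries that filter on report_month then scan only the relevant prefixes, which cuts both query time and the amount of data Athena bills you for.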

As a long-term solution, best practice is to use compression, partitioning, and conversion. However, for purposes of this walkthrough, we’re not going to worry about them so we can get up-and-running quicker.

3. Set up an AWS Athena querying engine

In your AWS console, navigate to the Athena service, and click “Get Started”. Follow the tutorial and set up a new database (we’ve called ours “AWS Optimizer” in this example). Don’t worry about configuring your initial table, per the tutorial instructions. We’ll be creating a new table for cost and usage analysis. Once you’ve walked through the tutorial steps, you’ll be able to access the Athena interface and can begin running Hive DDL statements to create new tables.

For Cost and Usage, we recommend using the DDL statement below. Since our data is in CSV format, we don’t need to use a SerDe, we can simply specify the “separatorChar, quoteChar, and escapeChar”, and the structure of the files (“TEXTFILE”). Note that AWS does have an OpenCSV SerDe as well, if you prefer to use that.

CREATE EXTERNAL TABLE IF NOT EXISTS cost_and_usage   (
identity_LineItemId String,
identity_TimeInterval String,
bill_InvoiceId String,
bill_BillingEntity String,
bill_BillType String,
bill_PayerAccountId String,
bill_BillingPeriodStartDate String,
bill_BillingPeriodEndDate String,
lineItem_UsageAccountId String,
lineItem_LineItemType String,
lineItem_UsageStartDate String,
lineItem_UsageEndDate String,
lineItem_ProductCode String,
lineItem_UsageType String,
lineItem_Operation String,
lineItem_AvailabilityZone String,
lineItem_ResourceId String,
lineItem_UsageAmount String,
lineItem_NormalizationFactor String,
lineItem_NormalizedUsageAmount String,
lineItem_CurrencyCode String,
lineItem_UnblendedRate String,
lineItem_UnblendedCost String,
lineItem_BlendedRate String,
lineItem_BlendedCost String,
lineItem_LineItemDescription String,
lineItem_TaxType String,
product_ProductName String,
product_accountAssistance String,
product_architecturalReview String,
product_architectureSupport String,
product_availability String,
product_bestPractices String,
product_cacheEngine String,
product_caseSeverityresponseTimes String,
product_clockSpeed String,
product_currentGeneration String,
product_customerServiceAndCommunities String,
product_databaseEdition String,
product_databaseEngine String,
product_dedicatedEbsThroughput String,
product_deploymentOption String,
product_description String,
product_durability String,
product_ebsOptimized String,
product_ecu String,
product_endpointType String,
product_engineCode String,
product_enhancedNetworkingSupported String,
product_executionFrequency String,
product_executionLocation String,
product_feeCode String,
product_feeDescription String,
product_freeQueryTypes String,
product_freeTrial String,
product_frequencyMode String,
product_fromLocation String,
product_fromLocationType String,
product_group String,
product_groupDescription String,
product_includedServices String,
product_instanceFamily String,
product_instanceType String,
product_io String,
product_launchSupport String,
product_licenseModel String,
product_location String,
product_locationType String,
product_maxIopsBurstPerformance String,
product_maxIopsvolume String,
product_maxThroughputvolume String,
product_maxVolumeSize String,
product_maximumStorageVolume String,
product_memory String,
product_messageDeliveryFrequency String,
product_messageDeliveryOrder String,
product_minVolumeSize String,
product_minimumStorageVolume String,
product_networkPerformance String,
product_operatingSystem String,
product_operation String,
product_operationsSupport String,
product_physicalProcessor String,
product_preInstalledSw String,
product_proactiveGuidance String,
product_processorArchitecture String,
product_processorFeatures String,
product_productFamily String,
product_programmaticCaseManagement String,
product_provisioned String,
product_queueType String,
product_requestDescription String,
product_requestType String,
product_routingTarget String,
product_routingType String,
product_servicecode String,
product_sku String,
product_softwareType String,
product_storage String,
product_storageClass String,
product_storageMedia String,
product_technicalSupport String,
product_tenancy String,
product_thirdpartySoftwareSupport String,
product_toLocation String,
product_toLocationType String,
product_training String,
product_transferType String,
product_usageFamily String,
product_usagetype String,
product_vcpu String,
product_version String,
product_volumeType String,
product_whoCanOpenCases String,
pricing_LeaseContractLength String,
pricing_OfferingClass String,
pricing_PurchaseOption String,
pricing_publicOnDemandCost String,
pricing_publicOnDemandRate String,
pricing_term String,
pricing_unit String,
reservation_AvailabilityZone String,
reservation_NormalizedUnitsPerReservation String,
reservation_NumberOfReservations String,
reservation_ReservationARN String,
reservation_TotalReservedNormalizedUnits String,
reservation_TotalReservedUnits String,
reservation_UnitsPerReservation String,
resourceTags_userName String,
resourceTags_usercostcategory String  
)
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      ESCAPED BY '\\'
      LINES TERMINATED BY '\n'

STORED AS TEXTFILE
    LOCATION 's3://<<your bucket name>>';

Once you’ve successfully executed the command, you should see a new table named “cost_and_usage” with the below properties. Now we’re ready to start executing queries and running analysis!

aws_cost

Major Cost Saving Levers

Now that we have our data pipeline configured, we can dive into the most popular use-cases for cost savings. In this blog, we’ll focus on:

  • Purchasing of Reserved vs On-Demand instances
  • Data Transfer costs
  • Allocating costs over Users or other Attributes (denoted with resource tags)

On-Demand, Spot, and Reserved Instances

Purchasing Reserved Instances vs On-Demand instances is arguably going to be the biggest cost lever for heavy AWS users (Reserved Instances run up to 75% cheaper!). AWS offers three options for purchasing instances: On-Demand, Spot (variable cost), and Reserved. On-Demand instances let you simply pay as you use, Spot instances let you bid on spare Amazon EC2 computing capacity, and Reserved instances let you pay for an instance for a specific, allotted period of time. When purchasing a Reserved instance, you can also choose to pay all-upfront, partial-upfront, or monthly. The more you pay upfront, the greater the discount.

If your company has been using AWS for some time now, you should have a good sense of your overall instance usage on a per-month or per-day basis. Rather than paying for these instances On-Demand, you should try to forecast the number of instances you’ll need, and reserve them with upfront payments. The total amount of usage with reserved instances versus overall usage with all instances is called your coverage ratio. It’s important not to confuse your coverage ratio with your RI utilization: utilization represents the portion of reserved hours that are actually used. Don’t worry about exceeding capacity; you can still set up auto-scaling preferences so that more instances get added whenever your coverage or utilization crosses a certain threshold (we often see a target of 80% for both coverage and utilization among savvy customers).

Calculating the reserved costs and coverage can be a bit tricky with the level of granularity provided by the Cost and Usage Report. The query below shows your total cost over the last 6 months, broken out by reserved vs non-reserved instance usage. You can substitute the cost field for usage if you’d prefer to view it by usage. Please note that you’ll only have data for the time period since the Cost and Usage report was enabled, so this query will only show a few days’ worth if you’re just getting started.

SELECT 
    DATE_FORMAT(from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate),'%Y-%m') AS "cost_and_usage.usage_start_month",
    COALESCE(SUM(cost_and_usage.lineitem_blendedcost ), 0) AS "cost_and_usage.total_blended_cost",
    COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_reserved_blended_cost",
    1.0 * (COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_blendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_ris",
    COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_non_reserved_blended_cost",
    1.0 * (COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_blendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_non_ris"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

WHERE 
    (((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1
ORDER BY 2 DESC
LIMIT 500

The resulting table should look something like the image below (we’re surfacing tables through Looker, though the same table would result from querying via command line or any other interface).

aws_cost

It’s an iterative process to understand the appropriate number of Reserved instances to meet your business needs. Once you’ve properly integrated Reserved instances into your purchasing patterns, the savings can be significant. If your coverage is consistently below 70%, you should seriously consider adjusting your purchase types and opting for more Reserved instances.

Data Transfer Costs

One thing that surprises many AWS customers is how much of the bill comes from moving and processing data rather than simply storing it. Depending on the size, volume, and location of your data movement, you could end up paying a sizable portion of your monthly bill on transfer costs alone! There are several different prices for transferring data, broken out largely by transfers between regions and availability zones. Transfers between regions are the most costly (from $0.02-$0.12/GB), followed by transfers between Availability Zones ($0.01/GB). Transfers within the same region and same availability zone are free unless using elastic or public IP addresses, in which case there is a cost ($0.01/GB). You can find more detailed information in the AWS Pricing Docs. With this in mind, there are several simple strategies for helping reduce costs here.

First, you should ensure that whenever two or more AWS services are exchanging data, those AWS resources are located in the same region. Transferring data between AWS regions has a cost of $0.02-$0.12 per GB depending on the region. The more you can localize the services to one specific region, the lower your costs will be.

Second, be careful that you’re routing data directly within AWS services and IPs, and minimize the number of transfers occurring out of AWS to the open internet. Sending data to external sources is, by far, the most costly and least performant mechanism of data transfer, costing anywhere from $0.09-$0.12 per GB. You should avoid these transfers as much as possible.

Lastly, data transferred between private IP addresses is cheaper than data transferred via elastic or public IPs. There’s no field in this report that denotes what type of IP a service uses, but it’s a good consideration when thinking through your architecture and launching new instances.

The query below provides a table depicting the total costs for each AWS product, broken out by transfer cost type. Substitute the “lineitem_productcode” field in the query to segment the costs by any other attribute. If you notice any unusually high spikes in cost, you’ll need to dig deeper to understand what’s driving that spike: location, volume, etc. Drill down into specific costs by including “product_usagetype” and “product_transfertype” in your query to identify the types of transfer costs that are driving up your bill.

SELECT 
    cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
    COALESCE(SUM(cost_and_usage.lineitem_usageamount ), 0) AS "cost_and_usage.total_usage_amount",
    COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_data_transfer_cost",
    COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-In')    THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_inbound_data_transfer_cost",
    COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-Out')    THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_outbound_data_transfer_cost"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

WHERE 
    (((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1
ORDER BY 3 DESC
LIMIT 500

aws_cost

When moving between regions or over the open web, many data transfer costs also include the origin and destination location of the data movement. Using a BI tool with mapping capabilities, we can get a nice visual of data flows (we’re using Looker in this example). The point at the center of the map is used to represent external data flows over the open internet.

aws_cost

Analysis by Tags

AWS provides the option to apply custom tags to individual resources, so you can allocate costs over whatever customized segment makes the most sense for your business. For a SaaS company that hosts software for customers on AWS, maybe you’d want to tag the size of each customer. The query below uses custom tags to display the reserved, data transfer, and total cost for each AWS service, broken out by tag categories, over the last 30 days. You’ll want to substitute the customer tag field (resourcetags_customersegment in the query) with the name of your own tag field.

SELECT * FROM (
SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering FROM (
SELECT *, MIN(z___rank) OVER (PARTITION BY "cost_and_usage.product_code") as z___min_rank FROM (
SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN "cost_and_usage.total_blended_cost" IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN "cost_and_usage.total_blended_cost" ELSE NULL END DESC, "cost_and_usage.total_blended_cost" DESC, z__pivot_col_rank, "cost_and_usage.product_code") AS z___rank FROM (
SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN "cost_and_usage.user_cost_category" IS NULL THEN 1 ELSE 0 END, "cost_and_usage.user_cost_category") AS z__pivot_col_rank FROM (
SELECT 
    cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
    cost_and_usage.resourcetags_customersegment  AS "cost_and_usage.customer_segment",
    COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_reserved_blended_cost",
    COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_blendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_data_transfer_cost",
    COALESCE(SUM(cost_and_usage.lineitem_blendedcost ), 0) AS "cost_and_usage.total_blended_cost"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

WHERE 
    (((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('day', -29, CAST(NOW() AS DATE)))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('day', 30, DATE_ADD('day', -29, CAST(NOW() AS DATE)))))))
GROUP BY 1,2) ww
) bb WHERE z__pivot_col_rank <= 16384
) aa
) xx
) zz
 WHERE z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1 ORDER BY z___pivot_row_rank

The resulting table in this example looks like the results below. In this example, you can tell that we’re making poor utilization of reserved instances because they represent such a small portion of our overall costs.

aws_cost

Saving costs on your AWS spend will always be an iterative, ongoing process. Hopefully with these queries alone, you can start to understand your spending patterns and identify opportunities for savings. However, this is just a peek into the many opportunities available through analysis of the Cost and Usage report. Each company is different, with unique needs and usage patterns. To achieve maximum cost savings, I encourage you to set up an analytics environment that enables your team to explore all potential cuts and slices of your usage data, whenever it’s necessary. Exploring different trends and spikes across regions, services, user types, etc. will help you gain comprehensive understanding of your major cost levers and consistently implement new cost reduction strategies.

If you’re already a Looker customer, you can get all of this analysis, additional pre-configured dashboards, and much more using Looker Blocks for AWS.

Want to learn more? I’ll be talking about these cost optimization strategies in a joint webinar with Heroku on Tuesday, August 29th, where we’ll be discussing different considerations for optimizing your AWS usage. Register for the webinar here.

5 Tips for Growth Attribution Modeling That Actually Work


5_tips

Someone told me that marketing attribution was a ‘sexy’ thing to talk about in the analytics community. Despite all the content on the topic (and there is a ton), one important fact gets overlooked almost all the time, and the tips below will make sure you don’t miss it. It’s a bit of a secret, but I think savvy digital marketers and their analyst counterparts deserve to know. Multi-touch, last-in, w-shaped, time decay, first-touch, etc.… none of it, by itself, actually matters.

This is what matters: do the numbers that come out of your attribution model actually correlate with functional success and do the people who use it, in all their human glory, actually believe in and utilize those numbers to increase the success of your business?

That’s it, that’s the secret. For all the beautiful theory, analytical purity and inflexible logic of specious attribution models, the one thing that actually matters is whether or not it makes your company better. Here are five tips for making an attribution model that does just that.

  1. Keep it simple, stupid. If you can’t communicate how the model works and if people don’t understand it, how can you actually expect them to trust it, let alone optimize the metrics the model spits out? Seriously, 80/20 rule here - ALL DAY.

  2. Don’t be scared to have more than one. Different functions do different things and there doesn’t have to be one attribution model to rule them all. Financial reporting requires one kind, demand generation requires another, mid-funnel marketing a third. If someone needs it, make it.

  3. Choose models that make sense for your business. Is your database really small and do you need to focus on making it bigger? Great, then maybe last touch attribution isn’t for you. Maybe first touch is better because it will give credit to programs that do what your business needs - add net new names to your database. Build a model that biases action towards what your company needs now.

  4. Be clear about what your model is and what it isn’t. It isn’t a way to assign a literal accurate dollar amount to every single marketing interaction. It is a way to establish operating benchmarks and baselines for performance management.

  5. Stay in tune with the people who use them. At the end of the day, attribution is a way to help make people’s jobs easier. If your attribution model isn’t doing that then there is probably something wrong with it. Or, maybe you should find some new people to work with.

  6. (BONUS TIP) You’ve been awesome so far so here’s a bonus tip. Never, ever, for any reason implement omnitouch attribution. That’s when you say the full value of something can be attributed to every single thing that influenced it. I like counting a lot, it’s fun, meditative and often useful, but not so much that I think it’s a good idea to count the same thing more than once. Get it?

So there they are, five, er, six tips for better marketing attribution. If you ignore them, you could end up with an attribution model that no one actually uses, one that actually hurts your business, or one that never gets off the ground at all.

I actually like the hard science of attribution just as much as the next marketing nerd so if you too are interested in that stuff (and you promise to remember the above), I recommend this perfectly clear and lovely overview from Google - may it shower us all with high CTRs and bountiful ROAS, now and forever. Amen.

How Looker + Denodo Deliver Quicker Insights for Ultra Mobile


denodo

Denodo is a data integration and data management company that’s partnered closely with Looker. Our platform integrates data from disparate sources and uses data virtualization to create secure views of the data, as needed, in real time, and deliver it to business users.

As opposed to the traditional style of data integration - Extract, Transform, Load (ETL) - Denodo provides real-time data integration, or data virtualization, which allows you to interact with your data sources without physically centralizing them. Like Looker, we do not require users to physically move their data, and this is very useful for quickly accessing new data sources for ad hoc exploration.

Denodo is a data abstraction layer that goes across all of the enterprise sources and provides a central point - a virtual database - that can access the underlying data and deliver it to a consumer. From a data management perspective, the product also allows business users to search the data, discover the relationships and associations within it, and do all of that in a very secure fashion.

Denodo sits between Looker and the data sources. We aggregate the data from the different sources and make it available to Looker, so that Looker can actually use that data to provide all the analytical capabilities.

Our joint customer, Ultra Mobile, provides a clear example of how companies can use these programs together.

Ultra Mobile is a mobile virtual network operator based out of Southern California. They serve customers who are mostly foreign-born people living in the United States and who typically call friends and family outside of the United States. They have a subscriber base of about 1 million people and offer unlimited calling to about 60 different destinations.

With their previous data solution, they experienced two broad challenges. The first was low business visibility: they wanted to be able to make fact-based decisions that would improve their operations, revenue growth, and profitability.

The second challenge was with time-to-insight. Whenever they needed to respond swiftly to customer expectations, they had to rely on retailers and others, and the process of getting the data they needed slowed the entire process down.

They are in an industry where competition is pretty heavy, so responding quickly is extremely important. The business problems are the symptoms, and the root cause is the technology. There was something limiting in the technology that did not provide the agility on the business side.

To solve those two problems, they found that the combination of Looker on the front-end with data virtualization from Denodo as the middle layer worked best. Ultra Mobile stores its data in SQL server and Hadoop systems, runs some cloud applications like Salesforce, and also has some of their applications running on AWS. They use Denodo to access the data from these disparate data sources and make that data available to Looker.

Denodo supports data governance to verify who can access what data, and also data lineage to view the hierarchy and relationships among this data. There are several vital components that go into accessing and delivering the data. The first is the caching technology that allows data analysts at Ultra Mobile to receive the information much faster, rather than having to go to the source every single time. Remember, one of Ultra Mobile’s business problems was latency. They wanted the ability to get the data quickly. Denodo, with its caching ability, helps them provide that information rapidly.

Using the combination of Denodo and Looker, they were able to build out dashboards, reports, and analysis that can be delivered in real time to their business leaders and business users. Typically, these projects would take them several months before the data actually shows up in their data warehousing tables, and then for the business leaders to see that information in the reporting tool. Now they can deliver these reports and dashboards within days, and in some cases, in minutes.

One of the best examples they shared is the improvement in the feedback loop between their users and the Ultra Mobile team. A key goal of theirs was to make sure their customers have the best possible call experience every time they pick up the phone to call a friend or family member. Having that kind of a real-time view into call quality across caller locations and countries enables them to respond much faster, so if there is a surge in call volume into China and the call quality is beginning to drop, they can respond almost immediately.

They have created a number of different dashboards to monitor call quality. Before, it used to take about 15 minutes to see everything that was happening, which was too late to react.

Today, they’re able to see all of this information in seconds so they can monitor the call volume, the call quality, and respond much faster. When their customer picks up the phone to call a family member in another country, every second counts, so Ultra Mobile uses data to be able to understand their business better, and make proper decisions much faster to ultimately make every call great for their customers.

Ultra Mobile’s ability to use data to make quicker, better decisions is just one example of the way that Denodo and Looker have been able to join together to bring data solutions to many different companies.

Visit Denodo at JOIN 2017 to learn more about the Denodo Platform for Data Virtualization and how customers like Ultra Mobile are accelerating their fast data strategy with data virtualization.

5 Key Elements for Leveraging Data to Lead a Marketing Team


5_key

There are many aspects to leading and guiding a group of marketers to a business goal. I bet you have three or four key factors you rely on to help you be successful. For me, data is the most important.

Here are the five key elements I suggest you start with in order to leverage data to the fullest within your marketing team, especially if you're working with a demand generation or growth marketing team. These offer a starting point for ensuring a stable and repeatable focus on data by your entire team.

Understanding your focus and goals:

What is most important for your marketing team to achieve? Are you trying to grow new business, retain existing customers, break into a new region, etc.? Knowing the answer to this question will enable you and your team to know which data you need to focus on.

Marketing in general has the opportunity to get LOTS of data and ask even more questions. It is easy for analysis to turn into months of thinking and prevent you from making progress on your goals. It is also easy to think that you need to analyze ALL the data you have, when in reality you really do not need to.

Attribution:

Once you know your focus and have some data you can access, you can consider attribution. Attribution is important because it lets you and individual team members clearly articulate which campaign, tactic or test actually generates results. The basic B2B options are first touch, multi-touch and last touch.

Are you a growth team? If so, you should be looking at first touch. You want to know where that lead came from and how you can go get more leads like them.

Are you a field team that is working to get late stage prospects to move quickly into the sales cycle? If so, you should be looking at existing leads and what can be done to re-engage them, which means looking at multi-touch and last-touch behavior data. You want to understand the steps you can take to get deeper engagement from existing leads.
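As a rough sketch, first touch attribution boils down to crediting each lead's earliest recorded interaction. Assuming a hypothetical touches table (lead_id, channel, touched_at), the query looks something like this:

-- Hypothetical table: touches(lead_id, channel, touched_at).
-- First touch: credit each lead to the channel of its earliest interaction.
WITH ranked_touches AS (
    SELECT
        lead_id,
        channel,
        ROW_NUMBER() OVER (PARTITION BY lead_id ORDER BY touched_at) AS touch_rank
    FROM touches
)

SELECT
    channel,
    COUNT(*) AS first_touch_leads
FROM ranked_touches
WHERE touch_rank = 1
GROUP BY channel
ORDER BY first_touch_leads DESC

Swapping ORDER BY touched_at for ORDER BY touched_at DESC gives you last touch from the same data.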

Proper Tagging:

You have to trust your data. Make your entire team responsible for proper tagging and tracking of leads, codes and conversions. You will need a scalable process that all of your marketing team can participate in. Your marketing operations team is your best source for a solid platform and process for this.


Now that you have the basics in place, you can really work with your marketing team members to focus on the data and goals at hand. This is your chance to have quality quantitative conversations in marketing, and not just debate opinions. The next two elements are examples from our experience at Looker.

Planning:

The Demand Generation team at Looker has been evolving the planning process since we started our first marketing efforts in 2013. It started purely based on gut feel and past experience. As soon as we had leads that turned into meetings and opportunities, we had our first conversion rates. That allowed us to create a demand plan that aligned with the business plan.

We now have the ability to look at the performance of leads, content, publishers, campaigns and tactics for the last several years, allowing us to forecast return from our marketing mediums (email, event, paid ad, web…) in a given time frame. By combining this performance intel with our available budget, we are able to generate a “proposed marketing mix” and budget for the marketing team. This gives a marketing manager a place to start from.

For example, in a single quarter we may need a thousand meetings, and no single marketing channel can generate that, so we spread it out across several mediums. The online marketing manager is told that she has XX amount of budget and should generate XX leads at a conversion rate of X.X%. By giving her these guardrails of what we think she needs, she does not have to spend a bunch of anxiety-producing time establishing her goal line. Instead, she can spend her time figuring out how to hit the goal, which is what we want her to do.
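To make that concrete, here is a sketch of how historical conversion rates turn a meeting target into a lead target per medium. The historical_leads table and the 250-meeting target are illustrative, not our actual model:

-- Hypothetical table: historical_leads(lead_id, medium, became_meeting),
-- where became_meeting is 1 if the lead converted to a meeting, else 0.
WITH conversion AS (
    SELECT
        medium,
        1.0 * SUM(became_meeting) / COUNT(*) AS lead_to_meeting_rate
    FROM historical_leads
    GROUP BY medium
)

SELECT
    medium,
    lead_to_meeting_rate,
    CEILING(250 / lead_to_meeting_rate) AS leads_needed  -- 250 meetings assigned to this medium
FROM conversion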

Tracking:

Now this is the exciting part and most recent evolution for the marketing team here at Looker. We have a dashboard that has the agreed-upon goals and current achievement for our first touch attributed marketing medium. It allows for to-the-point conversations with marketing managers regarding the performance of their medium(s). It helps us to determine the health of a program based not just on the number of leads, but on lead-to-meeting conversion rate, quality of meetings, cost per lead and cost per meeting.

This multi-KPI focus has allowed us to shift our efforts toward programs and tactics that produce more meetings and revenue, rather than just lead count. It also allows us to keep running tactics that may have a high cost per lead but a low cost per meeting, and to scrutinize tactics that have a low cost per lead but a high cost per meeting. We may not turn those off right away; rather, we will do more nurturing on those leads. At some point, though, if they simply don't turn into revenue, we will stop that tactic or stop working with that publisher.
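Here is a sketch of the kind of query that sits behind a dashboard like this, assuming hypothetical leads and spend tables keyed by marketing medium:

-- Hypothetical tables: leads(lead_id, medium, became_meeting) and
-- spend(medium, amount). Computes the KPIs discussed above per medium.
WITH lead_stats AS (
    SELECT
        medium,
        COUNT(*) AS leads,
        SUM(became_meeting) AS meetings
    FROM leads
    GROUP BY medium
),

spend_by_medium AS (
    SELECT
        medium,
        SUM(amount) AS total_spend
    FROM spend
    GROUP BY medium
)

SELECT
    l.medium,
    l.leads,
    l.meetings,
    1.0 * l.meetings / l.leads AS lead_to_meeting_rate,
    1.0 * s.total_spend / l.leads AS cost_per_lead,
    1.0 * s.total_spend / NULLIF(l.meetings, 0) AS cost_per_meeting
FROM lead_stats AS l
JOIN spend_by_medium AS s
    ON s.medium = l.medium
ORDER BY cost_per_meeting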


With these key elements in place, you can really help the people on your team focus on what they need to do in order to clearly support the business goal at hand. As a group, marketing teams have the opportunity to DO so many things, and it is often difficult to say ‘no’ or ‘not right now’ to certain projects. With a data focus, you will have solid reasoning to determine which projects need to be worked on now.

I would be remiss not to mention that behind all of this is a good bit of technology you will want help with. Make friends with the IT, BI, Analytics, and operations teams. Check out how Looker can support your marketing analytics.

Fantasy Football: Player and Position Analysis


ff

Forbes pegged the fantasy football market at over $70 billion, with nearly 60 million players building teams. The game gets more popular every year, and there are lots of fans and leagues here at Looker, so we decided to analyze the data (because that’s what we do here).

We have a quarterly “Use the Product” day at Looker, where anyone can use the day to do some hacking on a dataset of their choice. For this quarter’s event, I decided to get my hands on some Fantasy Data in anticipation of this year’s draft.

Looking at the data, we learned a lot more than just drafting strategy. This is the first post of a series, where we will dive into a dataset from FantasyData.com, and see what we can – and can’t – learn from the past to use in the new season.

Let’s start by getting our bearings and taking a look at our data set. We’ve got some stats from 2008 through 2016 to look at player and team performance, including draft data from the past few seasons. There are a bunch of different ways to slice this, but we focused mainly on individual players and positions.

Top Performing Players

First let’s look at an overview of top single season performers from the last 8 years.

  1. At the top of the ranks we have Peyton Manning’s 2013 season. To put that glorious performance into context, not only did it put him at the top of the list, but it did so by a full 14 points. It’s also important to note that he doesn’t appear on the list again, which tells us that one season’s performance is not a guarantee of continued success.

  2. Unsurprisingly, there are more quarterbacks on the list of top single-season performers than any other position (8), but what is surprising is the number of repeats on this list. Aaron Rodgers leads with five appearances, followed by Tom Brady and Drew Brees, who each have three seasons on the list. When evaluating players to draft, sustained high-level performance is a much better predictor of future success than a single standout season.

View the Look

  1. While Aaron Rodgers makes the most appearances in the Top Performers list, Drew Brees is actually THE single top point-earning player when we look at performance overall (for our dataset, that’s 2008 to 2016).

View the Look

Comparing the two, it was Aaron Rodgers’ 2013 injury that cost him the lead.

View the Look
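For readers wondering how rankings like these are assembled, here is a minimal pandas sketch covering both the single-season and the career totals. The column names and the handful of sample rows are assumptions for illustration, not the actual FantasyData.com schema or real point totals:

    # Minimal sketch: fantasy points summed per player-season and per player.
    # Column names and values are illustrative, not the actual dataset.
    import pandas as pd

    games = pd.DataFrame(
        [
            ("Peyton Manning", "QB", 2013, 32.0),
            ("Peyton Manning", "QB", 2013, 28.5),
            ("Aaron Rodgers",  "QB", 2011, 30.0),
            ("Drew Brees",     "QB", 2011, 29.0),
            ("Drew Brees",     "QB", 2012, 27.5),
        ],
        columns=["player", "position", "season", "fantasy_points"],
    )

    # Best single seasons: total points per player-season, ranked.
    season_totals = (
        games.groupby(["player", "position", "season"], as_index=False)["fantasy_points"]
        .sum()
        .sort_values("fantasy_points", ascending=False)
    )
    print(season_totals.head(10))

    # Career totals across the whole dataset: total points per player, ranked.
    career_totals = (
        games.groupby(["player", "position"], as_index=False)["fantasy_points"]
        .sum()
        .sort_values("fantasy_points", ascending=False)
    )
    print(career_totals.head(10))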

Now, let’s take a look at the top performing player at each position year-by-year.

  1. When looking for repeat performers by position, we saw multiple repeat performers at quarterback, tight end, and wide receiver, which again emphasizes that past performance is an indicator of how they will play in future seasons.

  2. However, it is interesting to note that among running backs, there is no repeat top performer. Considering the high injury risk, this is not surprising, especially as more teams move towards pass-heavy offenses.

View the Look

The Draft: Analyzing Patterns by Position

Before we look at which positions should be drafted first, let’s take a peek at last year’s draft order by position. Below we see the percentage of times that each position was picked in each round:

View the Look
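A chart like this boils down to a crosstab of picks by round and position. Here is a tiny pandas sketch with invented picks; the real analysis uses the draft data in the FantasyData.com set:

    # Sketch: share of picks that each position takes within each draft round.
    # The picks below are invented for illustration.
    import pandas as pd

    picks = pd.DataFrame(
        {
            "round":    [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
            "position": ["RB", "RB", "WR", "WR", "TE", "RB", "WR", "WR", "QB", "TE"],
        }
    )

    share_by_round = (
        pd.crosstab(picks["round"], picks["position"], normalize="index") * 100
    ).round(1)
    print(share_by_round)  # each row (round) sums to 100%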

Looking at the first two rounds, we can see a couple things:

  1. The Gronk hype was real last year and almost single-handedly dragged tight ends into the first two rounds.

  2. Over 90% of drafters chose to stick with some combination of running back and wide receiver. Not surprising, but it definitely shows that if it’s a specific RB or WR that you want, get him in your first round.

So now that we have an overview, who was making the right decisions? Is scrambling for premium running backs and wide receivers the right play? Or did the outliers who went with early quarterbacks or tight ends get rewarded? Let’s take a look at output.

What History Tells Us about Drafting

View the Look

  1. Prepare for injuries with running backs: Looking at the chart, there is a bit more noise than we might have expected. Right off the bat, we noticed some massive disappointments amongst the running backs – often resulting from significant injuries:

While Todd Gurley was just an underperformer at full strength, Adrian Peterson, Eddie Lacy, and Jamaal Charles all suffered season-ending injuries. Looking at previous seasons, we found similar patterns with top running backs.

Here’s a look at 2014:

View the Look

Adrian Peterson missed time due to off-field issues, Doug Martin and Giovani Bernard suffered from injuries, and a healthy Montee Ball failed to live up to expectations. Running backs are risky – we already knew that – but take injury chances seriously and remember #4 in this list.

  2. Quarterbacks are predictable. Turning to quarterbacks, we saw a different pattern. The data demonstrates the impact of the NFL putting a strong focus on protecting quarterbacks, and the result is much more predictability out of fantasy QBs.

View the Look

We’re missing some data from 2015, but taking a look at 2014 and 2016, we see some pretty consistent success in the top five, other than a little outperformance from Aaron Rodgers in 2016 and a bit of disappointment with Cam Newton. The only real outlier was Nick Foles’ 129-point 2014 campaign, the only significant injury in the group.

  3. The drop-off rates here in the top six quarterbacks are pretty gentle, indicating that stretching for the superstars may be misplaced if you can swing the 5th or 6th QB in the draft. This year (below), you may want to look to Derek Carr and Russell Wilson based on early drafts (the bold pick may be Andrew Luck, but beware the injury issues).

View the Look

  4. Wide receivers provide flexibility. Next let’s look at wide receiver production. The operative word here is depth. Here are the outputs of the top 30 picks in 2014 and 2016:

View the Look

Other than a Keenan Allen injury (orange, bottom left) and the Josh Gordon off-the-field issues (orange, bottom right), you are getting what you pay for with wide receivers. A nice gentle decay means you can fight for the studs early or backload with the middlers. Wide receivers are going to give you the most optionality in structuring your team, so our recommendation would be to work your wide receiver corps at the margin, filling gaps when you don’t have great options at other positions.

  5. The tight end drop-off. And finally, let’s take a look at tight ends. With tight ends, it’s all about the outliers. This chart explains why Gronk was going in the first round in 2016:

View the Look

Gronk and Jimmy Graham scored 2-10 points per game more than a typical top tight end in their best years. Compare that to the gentler curves with quarterbacks and wide receivers and you can understand why a full strength Gronk (or Jimmy Graham in New Orleans) could command a premium.

Looking at this season’s early draft numbers, Gronk is definitely worth the 20th overall ADP right now if he can stay healthy. You are looking at a pretty steady drop-off after that (with some long shot potential if Jordan Reed can stay healthy at 50th overall).

The data confirms what we all know to be true: the players most likely to be injured are your riskiest bets. But it also shows that you can counterweight that risk with some reliable picks.

What can we learn from the biggest busts of 2016?

Last, we’ll take a look at the biggest busts of 2016 to see if there’s anything we can learn. Here’s a quick overview of the draft class of 2016, with projections on the x-axis and outperformance vs. projections along the y-axis:

View the Look
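Under the hood, the y-axis here is simply actual season points minus the pre-season projection. A small sketch with invented players and numbers, including the "beat projections by 100+ points" filter used a bit further down:

    # Sketch: outperformance = actual fantasy points minus pre-season projection.
    # The players and point totals below are invented for illustration.
    import pandas as pd

    season = pd.DataFrame(
        [
            ("Blue-chip RB",  "RB", 300, 150),
            ("Blue-chip QB",  "QB", 320, 310),
            ("Late-round QB", "QB", 120, 290),
        ],
        columns=["player", "position", "projected", "actual"],
    )

    season["outperformance"] = season["actual"] - season["projected"]
    print(season.sort_values("projected", ascending=False))

    # The surprise seasons: everyone who beat their projection by more than 100 points.
    print(season[season["outperformance"] > 100])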

  1. Projection outperformance: Outperformance is pretty well centered on zero, meaning the Yahoo projections are a good ballpark estimate. That said, we see some significant divergence, especially on the higher estimates. Digging into Yahoo’s blue chips (the right side of our chart), we see some of the themes we covered above:


  1. Be aware of the high injury risk associated with the running back position (as well as high-risk players at other positions such as Keenan Allen and Rob Gronkowski).

  2. Blue chip quarterbacks are giving you about what you expect, even though Cam Newton had one of the worst divergences in recent history.

  3. On the upside, young workhorse running backs are where the value is, although most people may not have a shot to draft David Johnson, who was first overall in almost 50% of leagues.

Looking at the big outperformers out of nowhere, we have a few themes, and a lot we can ignore. Here’s everyone who beat their pre-season projection by more than 100 points:

View the Look

  1. First, we have lots of unexpected starting quarterbacks. Not a ton of action we can take there, since these still aren’t starter-level outputs, save Dak Prescott. Might be worth a flyer on some of the new rookies, especially if they have a strong chance to win the starting role.

  2. Taking a look at the other outperformers, we have young wide receivers who stepped into big roles due to injuries or shifts in the offense – Adam Thielen is the old man of the bunch at 26, with Tyreek Hill, Cameron Meredith, Tyrell Williams, and Davante Adams all under 25.

  3. We have Pitta’s return from injury (the only veteran on here), David Johnson’s monster season, and Jordan Howard emerging from the running back committee in Chicago (sometimes it’s worth taking a flier on a rookie like Joe Mixon in a running-back-by-committee situation, on the chance he wins the starting job outright).

And for fun, before you leave, let’s look at the worst efforts... because sometimes being benched does you more good than playing.

  1. Remarkably, Marcus Thigpen’s 2015 takes the cake at -4.10 points, spread over three individually negative games (beware the kick/punt returner, no points except for touchdowns and turnovers):

View the Look

In our next post, we’ll dig further into this dataset and see if there are any more concrete tips we can get for choosing a draft pick. Stay in the know by subscribing to the blog in the box below.

If you want to learn more about Looker’s complete data platform and how it can help you get this level of insight from your business, sign up for a demo here.
