Building a data analytics organization that drives consistent business value depends as much on organizational design principles as on raw analytical aptitude.
It has become uninteresting — perhaps even trite — to suggest that most organizations should invest in becoming more data-driven. In highly digitized industries, a robust data analytics program is table stakes; in industries still in the process of digital transformation, it can be a powerful competitive differentiator.
While the imperative to invest in analytics is clear, selecting the ideal approach to doing so can be complicated, as a tremendous variety of skill sets falls beneath the umbrella of “data analytics.” To consistently transform information into insight, an analytics program must interweave advanced mathematics, computer science, an array of data disciplines, and industry- and/or business function-specific domain knowledge. Because it is virtually impossible to find, let alone afford, individuals who possess all these skills, analytics is, in the vast majority of cases, a team sport.
Building an effective analytics team involves assembling the right “players,” but it also involves positioning them in a way that sets them — and their non-analytics colleagues — up for success. In sports and analytics alike, success is a function not only of a team’s skill, but of its internal organization. As such, for analytics leaders, being deliberate about their team’s structure is nearly as important as populating their team with top talent. While the business demands with which they are faced will determine the nuances of this structure, most analytics teams adhere to one of three organizational models: centralized, distributed, or hub-and-spoke.
The Centralized Model
A highly centralized analytics program (Fig 1.1) is built around a multi-capability center of excellence (COE). Concentrating diverse analytics talent — data strategists, data engineers, data scientists, business intelligence experts, and so forth — in one place creates a tight-knit community of data practitioners who understand how to work collaboratively and enjoy enough operational continuity to develop repeatable, even productized analytics solutions.
Fig 1.1: The Centralized Analytics Team Model
A COE is particularly effective in circumstances in which the business units it is tasked with serving have similar analytics needs. If one group of data professionals within a COE encounters the same question several times and, as a result, manages to develop a streamlined, easily repeatable approach to answering that type of question, this approach can be standardized — and potentially automated — across the entire COE with minimal effort. By amassing a portfolio of analytics solutions that are tailored to its organization’s most prominent needs and fine-tuning an analytics request intake form that captures all relevant data and as much domain context as possible, a COE is able to boost analytics efficiency and drive down analytics costs.
The risk of a highly centralized model is that an organization’s analytics program may become efficient to a fault, coming to resemble an off-the-shelf input/output machine instead of a strategic insights engine. When an organization’s business units utilize its analytics COE in the same way they utilize, for instance, an IT support desk — that is, as a ticket-based “drive-up window” — they create an artificial cap on the value driven by their data. Data professionals working at a “drive-up” COE can become so focused on closing tickets that they fall into mass-produced analytics — which is not the same as sophisticated analytics performed at scale — viewing every request through the same tactical lens. This approach to analytics can deliver some value, but it is unlikely to maximize the business value of an organization’s data.
What is more, it can be difficult for a COE that operates in its own silo to tailor its efforts to specific key business questions (KBQs), both because it lacks a nuanced understanding of the strategic context behind analytics requests and because it does not have access to regular feedback from either clients or the internal business units it serves — its work goes out the window, and that is the last it hears of it. Since overall analytics success is contingent upon an adept traversal of the “first mile” of analytics — identifying the right KBQs and structuring analytics requests accordingly — organizations must involve data strategists early and often in all their analytics endeavors. In theory, a well-designed intake form functions as an ersatz data strategist, but it seldom provides an analytics COE with the same depth of background on a business unit’s unique needs that an embedded data strategist would. The resulting disconnect between what business units really need and what their analytics COE delivers only reinforces the cap on the value an organization is able to derive from its data.
The Distributed Model
A highly distributed analytics program (Fig 1.2) is grounded in the establishment of long-term working relationships between a data professional (or, in some cases, a small team of data professionals) and a specific client or internal business unit. An analytics department that adheres to this organizational model is led by a department head, participates in periodic all-hands meetings, and engages in basic knowledge-sharing — all hallmarks of a traditional team — but its members’ day-to-day work is focused on collaborating with non-analytics stakeholders on the front lines of the business.
Fig 1.2: The Distributed Analytics Team Model
As alluded to above, identifying the right KBQs and structuring data analyses in such a way that they actually answer these questions is a necessary condition of analytics success. Adeptly traversing this “first mile” — and most subsequent “miles” — requires both technical analytics acumen and an intimate familiarity with a client’s or internal business unit’s daily operations, long-term objectives, and overarching structure. Absent such familiarity, a data professional may end up making recommendations or producing insights that are not activation-ready — that is, insights that cannot be put into play in the market.
By embedding its data professionals in specific business units, an organization dramatically reduces the occurrence of misdirected analyses and insufficiently activation-ready insights. It also creates both the incentive and the conditions for strategic exploration and continuous improvement. A client’s or business unit’s dedicated data professional is part of their own organization’s analytics team, but, for all intents and purposes, they are also part of the client team or business unit to which they are assigned. Unlike a data professional operating within a COE, an embedded data professional has a meaningful relationship with the stakeholders for whom they are performing analyses, enabling them not only to grasp analytics requests intuitively, but also to observe the outcomes of their work firsthand. This helps the data professional make adjustments in near real time and continuously look for new ways in which their client’s or business unit’s data could be mined for value.
The downside of a highly distributed model is that the analytical specificity it engenders does not lend itself to repeatability. When the overwhelming majority of an organization’s data professionals’ work is undertaken in client or internal business unit silos, the professionals lose visibility into what their analytics colleagues are doing. It is not entirely unusual for data professionals who work in the same office to never meaningfully collaborate — for a data professional, being embedded in a client or internal business unit team often means eating, sleeping, and breathing the team’s work. Consequently, a distributed analytics program may result in multiple data professionals from the same organization dedicating significant time to solving the same (or very similar) problems that one of their analytics colleagues has already solved in their own silo.
Particularly at a large scale, achieving analytics efficiency involves abstracting proven point solutions into reusable analytics products, an exercise that few, if any, data professionals have the requisite visibility to perform within a distributed model. This introduces a great deal of “reinventing the wheel” across an organization, meaning that, while innovation may occur, it will occur quite slowly.
Additionally, because it is so difficult to find proverbial “unicorns” — data professionals whose expertise spans advanced mathematics, computer science, and multiple data disciplines — there is typically a limit on the sophistication and/or variety of the analyses that can be performed within a distributed model. Even if an organization has the wherewithal to assign multiple data professionals to each client or internal business unit, these analytics micro-departments will almost never include the breadth of expertise found in a COE. If a client or business unit has an analytics request that falls outside its assigned group of data professionals’ capabilities, it will be forced to either make a request-specific hire or outsource the request to a consultancy or data firm. Both options are only cost-effective at a large scale, and the latter option recreates many of the problems presented by a highly distributed analytics program.
The Hybrid (Hub-and-Spoke) Model
Borrowing elements from both the centralized and distributed models, a hybrid (hub-and-spoke) analytics program (Fig 1.3) organizes its talent into two primary functions. The first part of the program consists of a roster of data strategists who are assigned to specific client or internal business unit teams. These embedded professionals guide their team’s KBQ selection, shape its analytics requests, and oversee big-picture data strategy. Depending on their background, they may also perform a range of rudimentary analyses for their team. Within the hybrid model, an organization’s data strategists are also responsible for liaising with the second part of the organization’s analytics program: a lean COE that houses technical analytics experts like data engineers and data scientists.
Fig 1.3: The Hybrid (Hub-and-Spoke) Analytics Team Model
In many ways, this hybrid model gives organizations the benefits of the centralized and distributed models without the models’ most prominent drawbacks. An organization utilizing a hybrid approach is able to ensure that data is gathered, probed, and acted on strategically on the front lines of its business while simultaneously maintaining the kind of collaboration and operational continuity among its technical practitioners that leads to the development of repeatable, efficiency-driving analytics solutions.
This organizational differentiation closely aligns with different data professionals’ ideal ways of working. For instance, because the pattern recognition that underlies the abstraction of proven point solutions into reusable analytics products is facilitated by both visibility and scale, data engineers and data scientists benefit from a panoramic perspective of their organization’s entire analytics portfolio. Further, since data strategists’ core skill sets are orchestrative in nature, strategists are perfectly positioned to bridge the gap between an organization’s clients and internal business units and its lean COE, transforming a “drive-up window” into an easily accessible suite of shared analytics services.
The hub-and-spoke model’s shortcomings are more administrative than functional. First and foremost, staffing an analytics department that adheres to this model can be a heavy lift for organizations without the proper operational scale, as both parts of the model require sizable teams. Relatedly, finding success with the hub-and-spoke model requires considerably greater communication and a more refined faculty for collaboration than is required by the centralized and distributed models.
Clear downsides notwithstanding, the analytics work performed in both the centralized and distributed models tends to be highly concentrated (in the COE in the former, on the front lines of the business in the latter), which reduces the need for extensive, ongoing cross-team communication and collaboration. This is not the case in a hybrid model. As such, organizations that opt for a hub-and-spoke approach must not only hire more data professionals, but also hire data professionals who have robust soft skills. As any analytics hiring manager will attest, finding data strategists with such skills is difficult enough; finding technical practitioners with such skills can feel like searching for a needle in a haystack.
Choosing the Right Organizational Model
Each of the three models explored above has distinct benefits and drawbacks. In most cases, an organization’s current analytics maturity and/or the nature of its desired data usage will determine which approach to organizational design will best support its needs.
If Organization A sells data as a product or delivers descriptive analytics as part of a larger platform or tool, it will be well-served by the centralized model. Whether Organization A is a data vendor, a research organization that reports on the same data points year after year, or a company that sells, for instance, a standardized set of dashboards that track the performance of organizations’ IT infrastructures, it will derive significant business value from the repeatability and analytics productization that can be achieved in a COE. When the analytics ask remains unchanged — that is, when the data itself is the only variable from project to project — the increase in efficiency and cost-effectiveness produced by a COE far outweighs the attendant decrease in customizability and situational awareness. Simply put, Organization A does not need its analytics department to function as a strategic insights engine, but rather as a well-oiled machine that continuously improves its fixed product (or suite of products) over time.
If Organization B is a small organization or is building an analytics team from scratch and has more diverse analytics needs than Organization A, the distributed model will be its best bet. Analytics can be used to support business growth, develop insights into consumer preferences, facilitate data-driven decision-making, and much more, but to deliver this kind of analytics-as-a-service (as opposed to data or reporting as a product), Organization B needs to tailor its offerings to a variety of use cases. Doing so may involve developing a nuanced understanding of a client’s KBQs and the idiosyncrasies of its industry landscape or it may involve digging into the intricacies of an internal business unit’s needs. Either way, Organization B needs a team of data strategists embedded on the front lines.
Because the hub-and-spoke model requires a certain degree of scale to be cost-effective, the distributed model is the approach to organizational design that will be able to provide Organization B with a strategic analytics foundation upon which it can build as it grows. In the early stages of an organization’s analytics maturity, efficiency and repeatability are less important than inculcating an analytics mindset in every team — a task that is made far easier when each team has its own dedicated data strategist.
Although the hub-and-spoke model delivers the best of both the centralized and distributed models, organizations should resist adopting it before they are equipped to do so properly. In theory, Organization C, a large organization focused on delivering analytics-as-a-service (either internally or externally), could adopt a hub-and-spoke model in one fell swoop. If Organization C is a legacy organization that is only now acknowledging the importance of analytics (such organizations still exist, but they are becoming fewer and farther between), it could make a massive upfront investment in both a roster of data strategists and a lean COE and build a hybrid analytics department virtually overnight. Or, if Organization C is a large organization with an established centralized or distributed analytics department, it could reorganize its analytics talent into a hub-and-spoke model (making additional hires to fill any talent gaps that emerge during the reorganization), but it should be prepared to go through an extended adjustment period before its new organizational model starts delivering returns.
However, more often than not, a hub-and-spoke analytics department is the result of the natural evolution of a distributed analytics department. Over time, Organization B’s embedded data strategists will start to perform increasingly sophisticated analytics work — or, at the very least, will start to identify opportunities for such work — in their client team or internal business unit silos. As this work is shared during all-hands meetings, Organization B’s analytics leaders should look for instances of particularly mature analytics that could be standardized across the organization’s distributed department. In pursuit of this standardization, the leaders might make their next hire a technical analytics expert like a data engineer or data scientist who will be able to help embedded strategists reproduce — and even automate — solutions that were developed elsewhere in the department. And thus, a lean COE is born, and Organization B has taken the first step toward a best-of-both-worlds hybrid analytics department.
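The guidance in this section can be condensed into a rough decision rule. The Python sketch below is purely illustrative and rests on simplifying assumptions: the names OrgProfile, fixed_analytics_ask, and has_scale_for_hybrid are hypothetical, and a real decision involves far more nuance than two yes/no questions.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    CENTRALIZED = "centralized"
    DISTRIBUTED = "distributed"
    HUB_AND_SPOKE = "hub-and-spoke"


@dataclass
class OrgProfile:
    # Both fields are hypothetical simplifications introduced for illustration.
    fixed_analytics_ask: bool   # the ask stays the same; only the data varies
    has_scale_for_hybrid: bool  # can staff embedded strategists AND a lean COE


def recommend_model(org: OrgProfile) -> Model:
    """Return a starting-point recommendation, not a definitive answer."""
    if org.fixed_analytics_ask:
        # Organization A: repeatability and productization matter most.
        return Model.CENTRALIZED
    if org.has_scale_for_hybrid:
        # Organization C: diverse needs plus enough scale for both functions.
        return Model.HUB_AND_SPOKE
    # Organization B: diverse needs, small or still building the team.
    return Model.DISTRIBUTED


org_b = OrgProfile(fixed_analytics_ask=False, has_scale_for_hybrid=False)
print(recommend_model(org_b).value)
```

Under these assumptions, Organization B starts with the distributed model; once it grows enough to staff a lean COE (has_scale_for_hybrid becomes True), the same rule points to hub-and-spoke, mirroring the evolution described above.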
Laying the Foundation for an Analytics Powerhouse
Clearly, there is not a definitive “right answer” when choosing among the centralized, distributed, and hub-and-spoke organizational models. An organization aiming to derive value from its data in a single functional area may (rightly) opt for a different model than it would were it aiming to derive value from its data across its R&D, marketing, business development, and supply chain functions.
Ultimately, the particulars of an organization’s circumstances will dictate which trade-offs make the most business sense, but what is clear is that every analytics leader should dedicate ample time and energy to deliberately fine-tuning their team’s structure. Especially for organizations that are relatively new to analytics, each marginal hire can have a significant impact on how analytics capabilities mature. Hiring purely for talent (with no regard for fit) may well produce some sort of mature analytics capability, but it may not be the capability the organization actually needs to drive business value for itself or its clients. An organization may be able to skate by with a haphazardly assembled group of extraordinarily talented data professionals, but a sustainable, dynamic insights engine, like any championship team, is the product of carefully considered structure and strategy.