How do you decide which projects in data science are worth pursuing?

Almost every company has developed a data strategy. But what separates a good one from a great one? And how can an organization with a good strategy turn it into a great one?

We can recognize failure when we see it. Resources are spent, teams are created, and time passes, but the effort yields nothing. No one can pinpoint the reason; each person says it’s someone else’s fault.

It’s even harder to distinguish a modestly successful endeavor from a great one. In data science, the two can look very similar for as long as a year. After a couple of years, however, an excellent strategy should deliver results that are an order of magnitude more valuable.

Whether a strategy is excellent or mediocre, it starts with investment and a set of experiments that lead to data projects. After a couple of years, several of those projects are on their way to production because they “worked out”.

With a mediocre strategy, one or more of these projects may show an obvious ROI for the company. Usually these projects involve some type of automation that cuts costs, or they apply machine learning methods to an existing business process to improve performance or efficiency. This looks like success, and it may be successful enough, but it lacks many of the benefits that come with an excellent data strategy.

With an excellent data strategy, more projects have worked, and they were remarkably cost-effective to produce. The process of creating the first couple of projects also fuels ideas for new ones. The projects still bring automation, efficiency gains, and performance improvements, but they also bring ideas for new revenue and even entirely new businesses built on the distinct data assets that you have. The teams of data professionals function well together, build on each other’s work, and collaborate smoothly with their business partners. The company has a clear vision of what its future of machine-learning-driven projects will look like, and everybody is working together to achieve it.

Building an Excellent Data Strategy

Creating a successful strategy depends on many groups working together at the same table: technology leadership, business experts, data experts, and subject-matter experts. It also requires support from leadership that goes beyond simply checking a “machine learning” box.

Here’s how most businesses decide which data projects to pursue, an approach that on its own produces a mediocre data strategy. Management picks a group of candidate projects and plots them on a prioritization scatterplot: one axis is each project’s business value and the other is its estimated complexity or development cost. Every project gets a place on the chart relative to the others, and the company’s leaders allocate limited resources to the projects they expect to add the most business value at the lowest cost.
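As a rough sketch of this isolated ranking, the snippet below sorts a portfolio by value-to-cost ratio. The project names, the numbers, and the choice of a simple ratio as the score are all assumptions made for illustration; real prioritization usually involves judgment rather than a single formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    value: float  # estimated business value, arbitrary units (assumed)
    cost: float   # estimated complexity / development cost, same units (assumed)


def rank_in_isolation(projects):
    """Sort projects by value-to-cost ratio, best first, ignoring any links between them."""
    return sorted(projects, key=lambda p: p.value / p.cost, reverse=True)


# Hypothetical portfolio; every name and number is made up for illustration.
portfolio = [
    Project("churn model", value=5.0, cost=4.0),
    Project("new data product", value=12.0, cost=10.0),
    Project("invoice automation", value=3.0, cost=2.0),
]

ranking = [p.name for p in rank_in_isolation(portfolio)]
print(ranking)
```

In this made-up portfolio the small automation project ranks first and the most valuable project ranks last, which is the pattern a purely isolated analysis tends to produce.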

This approach isn’t wrong, but it’s not the best one either. An excellent data strategy must go beyond analyzing projects in isolation and take a few more dimensions into consideration.

First, an excellent strategy involves a well-coordinated organizational core built on centralized technology investments. The point of centralizing the defaults is that each application can still make different choices when it needs to, while by default the company keeps maximum compatibility across teams and flexibility over time.

As an example, one global media business had expanded rapidly through acquisitions. Each line of business relied on a different tech stack and IT group, which led to difficulties integrating existing data and to divergent architectures for future investments. Centralization was vital to its ongoing success.

Second, an excellent data strategy has a specific goal for the near future and a flexible goal for further down the road. We know a lot about the near-term potential of machine learning, less about what it will look like in the coming year, and can only speculate about what will be possible in five years. The business landscape is evolving in a similar way, which brings new opportunities and new competition. Organizations with five-year planning cycles will miss opportunities that emerge in the meantime. An excellent strategy is a living, adapting document.

With this approach, higher-value projects that might have appeared too ambitious look less costly to push forward. It shows that such projects may in fact be more cost-effective and easier to carry out than unrelated, lower-value projects that looked attractive in a naive analysis.

In other words, an excellent data strategy recognizes that projects are interrelated and that their costs shift over time as other projects are tackled. This allows for more accurate planning and can expand a company’s capabilities more than otherwise expected. You can revisit this planning process every quarter, which is in line with how rapidly machine learning technologies are evolving.

The best data strategies have strong directional conviction but stay flexible about the details. You want to know the end goal, but you do not need to predetermine every step you will take to reach it.

Finally, an excellent strategy takes the following key insight into account: data science projects are not independent of one another. With each completed project, successful or not, you create a foundation that makes later projects easier and cheaper to build.

Choosing Between Data Science Projects

Here’s what project selection looks like in a firm with an excellent data strategy. First, the business gathers project ideas. This effort should reach as broadly as possible across the organization, at all levels. If every idea on your list looks good and obvious, be concerned: that indicates a lack of creative thinking. Once the list is long enough, filter it by the technical feasibility of each potential project. Then make the scatterplot described above, which places each idea according to its potential cost or complexity and the potential value it contributes to the business.

Now comes the fun part. On the scatterplot, draw lines between projects that might be related. These connections mark where projects share data resources, where data collection for one project may help another, or where the foundation of one project is also the foundation of another. This method reflects how such work really goes: building a precursor project makes its successor easier, even when the precursor flops. The costs of gathering data and building common components are amortized across all of the projects.
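The effect of those connecting lines can be sketched with a toy model in which each project needs a set of components, and a component that already exists costs nothing to reuse. The component names, the costs, and the assumption that reuse is free are all hypothetical simplifications, not a description of any particular company’s process.

```python
# Hypothetical component costs; every name and number is illustrative.
COMPONENT_COST = {
    "customer data pipeline": 4.0,
    "feature store": 3.0,
    "churn model": 1.0,
    "recommendation model": 2.0,
    "invoice parser": 2.0,
}

# Components each project needs. A shared component is a "line" on the scatterplot.
NEEDS = {
    "churn prediction": {"customer data pipeline", "feature store", "churn model"},
    "recommendations": {"customer data pipeline", "feature store", "recommendation model"},
    "invoice automation": {"invoice parser"},
}


def related(a, b):
    """Two projects are related if they share at least one component."""
    return bool(NEEDS[a] & NEEDS[b])


def marginal_cost(project, built):
    """Cost of a project counting only the components that are not built yet."""
    return sum(COMPONENT_COST[c] for c in NEEDS[project] - built)


# Cost of "recommendations" on its own versus after "churn prediction" exists.
standalone = marginal_cost("recommendations", built=set())
after_churn = marginal_cost("recommendations", built=NEEDS["churn prediction"])
print(standalone, after_churn)
```

Under these assumptions, the ambitious project looks expensive in isolation but becomes cheap once its precursor has paid for the shared pipeline and feature store, even if the precursor itself never reaches production.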

We’re currently at a point in the development of AI, machine learning, and data where the technology isn’t commoditized and it isn’t obvious where to invest. Businesses with excellent strategies are more likely to choose well.
