Data collection, and the tools and services that enable it, may offer farmers the greatest technology benefit across a farming operation. Just as calorie counting can help people lose weight simply by giving them better visibility into what they are eating, data collection can reveal where resources such as labor and capital are being spent so they can be put to better use. This allows for cost savings and greater yields as growers move closer to precision agriculture.
Regardless of the type of crop, farming has always been an inherently data-driven operation. The methods used to collect and analyze data, however, have been changing at a rapid pace. As a result, growers have started to move from clipboards and Excel spreadsheets to ‘Internet of Things’ data collection and business intelligence. Row crops have certainly led the way on automated data collection, given the large number of acres per farm compared to other types of crops. Specialty crops, however, have been catching up in the past couple of years, with new AgTech companies forming that focus on crops such as wine grapes and tree fruit.
Where To Start
Before jumping into the mess of AgTech vendors, though, it’s important to identify the four pieces you’re going to need to modernize data collection and analysis across your agriculture operations. This will help ensure you have a strong foundation to build from.
My advice is to start with data analysis as the first step (even though it’s listed at the end of this post) in order to understand what you want to measure. It’s easy to incur a lot of costs quickly if you bite off too much at once when building up a modern-day data ecosystem. Having a clear set of metrics will allow you to tie what you’re measuring to what you’re saving / gaining and get to a return on investment faster. From there, you can add to what you’ve built, and measure more as your use of data grows across the operation.
Part 1) Data Collection
Recently there has been a surge in sensors for farms that can track everything from soil moisture to pest counts. There are two things to focus on when considering what tools to use for data collection.
- How many sensors and what type are needed for collection.
- How “friendly” the sensors are to feeding data to the system you’ve chosen to consolidate and analyze it in.
Most vendors offer web-based solutions for showing you their data, but that data is only really useful if you’re able to extract it into a central place (such as a data warehouse). The real value isn’t in what a single sensor reports by itself, but in what is collectively happening within an acre. This is important because you will need to pull together individual readings over a given acre and then compare them to an output metric such as yield. As for how many sensors you need, the same data is likely already being collected today using manual sampling methods. It’s important, then, to consider the ROI from automated data collection at a sufficient concentration of sensors. And of course, always start with just a couple of sensors to validate that the technology will work for you before scaling. If you only have one type of sensor, though, to track something specific, then the vendor-provided solution may end up working fine.
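To make the idea of pulling readings together concrete, here is a minimal Python sketch. It assumes you have already exported readings from a vendor; the block names, sensor IDs, moisture values, and yield figures are all made up for illustration, and a real export will look different.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical soil-moisture readings: (block, sensor_id, percent moisture).
readings = [
    ("block_a", "s1", 22.0),
    ("block_a", "s2", 26.0),
    ("block_b", "s3", 31.0),
    ("block_b", "s4", 33.0),
]

# Hypothetical output metric: yield per block, in tons per acre.
yield_tons_per_acre = {"block_a": 3.1, "block_b": 4.0}


def average_by_block(rows):
    """Group individual sensor readings by block and average them."""
    grouped = defaultdict(list)
    for block, _sensor_id, value in rows:
        grouped[block].append(value)
    return {block: mean(values) for block, values in grouped.items()}


def join_with_yield(averages, yields):
    """Pair each block's average reading with its yield.

    Blocks without a yield figure are skipped.
    """
    return {
        block: (avg, yields[block])
        for block, avg in averages.items()
        if block in yields
    }


block_averages = average_by_block(readings)
summary = join_with_yield(block_averages, yield_tons_per_acre)

for block, (avg_moisture, tons) in sorted(summary.items()):
    print(f"{block}: avg moisture {avg_moisture:.1f}%, yield {tons} t/acre")
```

With only one sensor type, a vendor dashboard can usually show the averages on its own. The join against yield is the step that tends to need the data in a central place.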
Part 2) Data Consolidation (or ELT)
Once you have your data collection sorted out, and you’ve begun to measure data in an automated way, you’re going to need to get the data into a central spot, such as a data warehouse, in order to put that data to best use. You’re probably going to need to pick a vendor to help with this. Though there are ways to do it yourself, doing so requires hiring one or more experienced data engineers / programmers who can build and maintain the system.
For most farming operations, picking an ELT (extract-load-transform) vendor, such as Fivetran or Dell Boomi, is going to be a better bet. These tools allow you to connect to a data source, such as your accounting system or field data collection, and extract / load / transform the data into a central place, which then gives you access to all of your raw data across as many systems as you’re able to connect to. This is why, when considering data collection, you need to understand which vendors offer an API: a capability of a software platform that allows customers to connect to the platform and extract (and in some cases write) data.
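For a rough feel of what an ELT tool does behind the scenes, here is a small Python sketch of the three steps. It is not how Fivetran or Boomi work internally. The sample JSON stands in for a response from a hypothetical vendor API, the table and field names are my own, and an in-memory SQLite database stands in for a real data warehouse.

```python
import json
import sqlite3

# Stand-in for the body of a vendor API response (hypothetical fields).
SAMPLE_RESPONSE = """
[
  {"sensor_id": "s1", "reading": 22.5, "taken_at": "2024-06-01T06:00:00"},
  {"sensor_id": "s1", "reading": 24.5, "taken_at": "2024-06-01T18:00:00"},
  {"sensor_id": "s2", "reading": 30.0, "taken_at": "2024-06-01T06:00:00"}
]
"""


def extract_readings():
    """Extract: a real version would call the vendor's API over HTTP."""
    return json.loads(SAMPLE_RESPONSE)


def load_raw(conn, records):
    """Load: store the records untouched in a raw table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_readings "
        "(sensor_id TEXT, reading REAL, taken_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_readings (sensor_id, reading, taken_at) "
        "VALUES (:sensor_id, :reading, :taken_at)",
        records,
    )
    conn.commit()


def transform_daily_average(conn):
    """Transform: summarize inside the database, after loading."""
    cursor = conn.execute(
        "SELECT sensor_id, date(taken_at), AVG(reading) "
        "FROM raw_readings "
        "GROUP BY sensor_id, date(taken_at) "
        "ORDER BY sensor_id, date(taken_at)"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
load_raw(conn, extract_readings())
daily = transform_daily_average(conn)
for sensor_id, day, avg_reading in daily:
    print(sensor_id, day, avg_reading)
```

The point of loading the raw data first and transforming it afterward is that the original readings stay available, so you can build new summaries later without going back to the vendor.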
Part 3) Data Warehousing
Having a central location for your data is the next foundational component. Depending on the volume of data, you may be able to use a standard database such as MySQL or SQL Server. If the volume of data you’re receiving is large enough (500 GB or more to start with), you may need to consider a columnar database such as Snowflake or Azure Synapse, which is designed for large data sets. When you’re storing a lot of data from sensors across your operation, you are likely going to reach larger data volumes sooner rather than later. The amount of data a sensor puts out varies quite a bit, so be sure to discuss this with your vendor before signing up for the sensors. The key, though, is to start with smaller data sets that are useful right away, rather than starting big by storing everything and having to support too much at once. You will be paying for storage monthly, so unused data just drives up the cost without helping to justify what you’re investing (keep in mind, though, that storage is relatively cheap in the cloud, so it takes a lot of data to drive up the cost of storage).
It’s also recommended to host your data in the cloud, and to get as much of your ecosystem into the cloud as possible. Though you likely have data hosted locally today, which in some cases is required for business continuity, having a plan to get as much into the cloud as possible has a number of benefits. These include automated back-ups, easy scalability (you pay for what you use, and can increase the size of your server as needed), and streamlined pricing. There is also the convenience of having the services managed by the cloud provider instead of handling onsite servers yourself.
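A quick back-of-the-envelope calculation can help with that vendor conversation. The Python sketch below estimates raw data volume per year from a sensor count, a reading frequency, and a payload size. Every number in it is an assumption I picked for illustration, not a vendor specification, and it ignores compression and database overhead.

```python
def estimate_yearly_storage_gb(sensor_count, readings_per_day, bytes_per_reading):
    """Estimate raw data volume per year in gigabytes (1 GB = 10**9 bytes)."""
    bytes_per_year = sensor_count * readings_per_day * bytes_per_reading * 365
    return bytes_per_year / 1_000_000_000


# Assumed profile 1: 200 small telemetry sensors reporting every 15 minutes
# (96 readings per day) at roughly 200 bytes per reading.
telemetry_gb = estimate_yearly_storage_gb(200, 96, 200)

# Assumed profile 2: 20 field cameras capturing one 2 MB image per hour.
camera_gb = estimate_yearly_storage_gb(20, 24, 2_000_000)

print(f"Telemetry sensors: {telemetry_gb:.2f} GB per year")
print(f"Field cameras:     {camera_gb:.2f} GB per year")
```

Under these assumptions, the payload size matters far more than the number of sensors, which is why it is worth asking each vendor what a single reading actually contains.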
Part 4) Data Analysis
Once you are measuring the right information, and then extracting / storing it in a central location, you need to consider what metrics you are going to track. Having ten or so metrics from which to measure your biggest cost drivers (such as labor or parts) means you can get actionable insights faster, versus guessing what might be useful. It will also help you understand what data to collect first, instead of boiling the ocean and gathering all the data at once. Using a business intelligence tool such as Tableau or Power BI will allow you to point to your data warehouse, construct these metrics, and begin building automated reporting.
It’s recommended to work with an experienced IT resource who has worked with these tools in the past to get your reporting up and running, although if you’re technically inclined and on a budget, there are great BI tutorials available online. Once you have these metrics established, you can look at other capabilities under categories such as data science, artificial intelligence, or machine learning. However, building solutions with these advanced methods will require specialized expertise. The good news is that these capabilities have been growing over the years, so it’s a lot easier to find people with these skills than it was a couple of years ago. And the tools to roll your own advanced analytics system are getting better every day.
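As an illustration of what one of those metrics might look like once the data is in one place, here is a short Python sketch that computes labor cost per acre and per ton for each block. The field names and figures are hypothetical; in practice a BI tool would compute something similar directly against your warehouse.

```python
from dataclasses import dataclass


@dataclass
class BlockSeason:
    """One block's totals for a season (all fields are illustrative)."""
    block: str
    acres: float
    labor_hours: float
    hourly_rate: float
    yield_tons: float


def labor_cost(record):
    """Total labor cost for the block over the season."""
    return record.labor_hours * record.hourly_rate


def labor_cost_per_acre(record):
    return labor_cost(record) / record.acres


def labor_cost_per_ton(record):
    return labor_cost(record) / record.yield_tons


# Made-up sample data for two blocks.
season = [
    BlockSeason("block_a", acres=10, labor_hours=400, hourly_rate=20, yield_tons=40),
    BlockSeason("block_b", acres=5, labor_hours=300, hourly_rate=20, yield_tons=15),
]

for record in season:
    print(
        f"{record.block}: "
        f"${labor_cost_per_acre(record):.2f} per acre, "
        f"${labor_cost_per_ton(record):.2f} per ton"
    )
```

A metric like cost per ton only works when labor data (often in an accounting or payroll system) and yield data sit side by side, which is the reason for consolidating them first.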
Though this is a very simplified view of a data ecosystem for your farming operation, I hope it provides you with a 30,000-foot view of the pieces essential to a modern data-driven architecture. Understanding where AgTech vendors fit is critical to ensure you don’t end up with a bunch of isolated (or “siloed”) point solutions that you aren’t able to bring together into a single set of metrics.
The next step is to learn more about each, which I’ll be covering in future blog entries. Of course, if you have any questions, feel free to reach out directly or leave a comment.