That Smart Grid Data Surge We Mentioned Earlier? You Can’t Ignore It
Nov 3, 2009
A couple of weeks ago, I took a look at the data provided by the teams at PGE and Austin Energy, combined it with data provided by DOE, and I arrived at the conclusion that the Smart Grid will create a glut of information that the utilities had best begin planning for, because it could easily swamp both the utility and the networks that are expected to carry it.
Not surprisingly, there was a fair amount of interest in both the conclusions I reached and in the substantiation of the data I used. Some of the inquiries were pretty straightforward. Others were less open to the concept, and there were two main objections to the data.
The first was based on existing utility practices. This line of questioning assumed that a meter read would contain only basic information: the identity of the power meter, the time stamp, and the meter reading itself. Were that the case, the data could be paltry, around 14 bytes per read, and such a small amount would never add up to anything like the avalanche I had described in the piece.
The second objection was that there was little likelihood that such data was going to be stored for long, meaning, I guess, that we could design the system as though it had never arrived at all. Many of the questions came from individuals with strong/long histories in utilities, so I felt it my responsibility to validate, again, my data.
While I consider myself to be relatively well-versed in the core of these topics, it is the nature of this blog to focus on my expectations of the future based on information provided elsewhere, by others more directly in the path of the Smart Grid. That said, credibility is a big deal for us, and I decided to go back to Austin Energy, and understand better the reality of the situation from the folks who are actually doing the job, and who are considering these concerns as fundamental parts of their planning for successfully serving their clients on the new grid in the years to come. Andy and I called Andres Carvallo and Karl R. Rábago at Austin Energy, and they generously agreed to help us understand the world and the Smart Grid that they are planning for.
Smarter Grid vs. Simpler Meter Reading
One of the first things I learned was the richness of information gathering and interactivity that these gentlemen expect to coax from the new grid infrastructure. While time, location, and power used are at the heart of a meter read, there is much more to be learned. Investment in the Smart Grid yields its maximum return only when the savings amount to more than a human meter reader's footwear and gasoline. Some examples are:
Device Health Information – By watching for varying temperature, periods since outage, battery power, heartbeat, and other meter variables, it is possible to better predict and recover from any failures that may happen.
Real-Time Monitoring – As has happened historically with most new technologies, people who yearn for more data will be satisfied only by that which is most current. Real-time monitoring is unlikely to spread through the general population immediately, but history suggests that some customers will demand such a feed almost at once, as they recognize that there is now more information with which they can better manage their energy.
Energy Services Provision Trumps Energy Provision Services – There are doubtless going to be additional requirements from the newly informed and empowered customer base for functionality that is logically delivered by the provider. This was a real eye opener for me; that power providers are now actively thinking about services that they can offer over the new and smarter infrastructure. Things like profiled energy use: "I am going away, manage my power" or "There is a spike in prices, manage me down by 10%" or "I only want to use power that is generated from renewable resources." These all require data, new interfaces, and a channel over which all of the control and monitoring information can be passed. Winners in the new market will be finding ways to capitalize on the need for energy-related services, and will not limit their investment to further driving down the costs of simply providing energy.
Networking Overhead – Given the complexity, regularity, and importance of this data, it is clear that a protocol (like IP) will probably be adopted to package up and send all of this information in a payload to central systems for analysis, aggregation, storage, and action. Protocols carry their own overhead in terms of describing their content, sources, destinations, etc. None of this is free from the perspective of the systems carrying or storing the data.
Other Factors – We are only just beginning to see the potential for Smart Grid and Soft Grid enablers, leading me to believe that even my estimates are very likely to be low, particularly as we clamor for real-time monitoring and data analysis.
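To put a rough number on the Networking Overhead point above, here is a minimal sketch in Python. It assumes the minimum header sizes for IPv4 (20 bytes), UDP (8 bytes), and TCP (20 bytes), and a typical 1500-byte MTU. The function name and the choice of transport are illustrative assumptions, not anything Austin Energy described, and the sketch ignores link-layer framing, handshakes, acknowledgements, and encryption, all of which add more.

```python
import math

# Minimum header sizes in bytes (no options). Standard values, not utility-specific.
IPV4_HEADER = 20
UDP_HEADER = 8
TCP_HEADER = 20
MTU = 1500  # assumed typical Ethernet MTU


def wire_bytes(payload_bytes, header_bytes, max_payload_per_packet):
    """Total bytes sent when a payload is split into packets that each carry headers."""
    packets = math.ceil(payload_bytes / max_payload_per_packet)
    return payload_bytes + packets * header_bytes


def overhead_share(payload_bytes, total_bytes):
    """Fraction of the bytes on the wire that are header rather than payload."""
    return (total_bytes - payload_bytes) / total_bytes


# A bare 14-byte read sent in a single UDP datagram.
udp_headers = IPV4_HEADER + UDP_HEADER
small_total = wire_bytes(14, udp_headers, MTU - udp_headers)

# A 16K read (taken here as 16 * 1024 bytes) sent over TCP.
tcp_headers = IPV4_HEADER + TCP_HEADER
large_payload = 16 * 1024
large_total = wire_bytes(large_payload, tcp_headers, MTU - tcp_headers)

print(f"14-byte read: {small_total} bytes on the wire, "
      f"{overhead_share(14, small_total):.0%} overhead")
print(f"16K read: {large_total} bytes on the wire, "
      f"{overhead_share(large_payload, large_total):.1%} overhead")
```

Under these assumptions the headers are twice the size of a bare 14-byte read, while for a 16K read they are a small fraction of the total. Either way, the bytes carried are more than the bytes read.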
Based on all of this, it looks like the numbers are far from a simple 14-byte read, and are more likely in the range given by Andres of 4K to 16K per reading. If we estimate the maximum case, the numbers are even higher than I had referenced in the earlier article. Let's not think about real-time (the numbers are mind-numbing), but instead look at a simple check every 5 minutes:
12 (reads/hr) X 24 (hrs/day) X 365 (days/yr) X 16K (bytes/read) yields roughly 1.7 GB/meter/year
Multiply that by the number of meters (pick your own scope), and I think the challenge is clear. For more reality, take that number and multiply by 5 for readings every minute, or by 300 for readings every second. That's big.
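For anyone who wants to rerun the arithmetic above with their own scope, here is a minimal sketch in Python. It treats "16K" as 16 x 1024 bytes, and the fleet size of 100,000 meters is a hypothetical figure chosen only for illustration. The function and variable names are mine.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
KIB = 1024


def bytes_per_meter_per_year(read_interval_s, bytes_per_read):
    """Raw payload bytes one meter produces in a year at a fixed read interval."""
    reads_per_year = SECONDS_PER_YEAR // read_interval_s
    return reads_per_year * bytes_per_read


# The maximum case from the article: 16K per read.
five_min = bytes_per_meter_per_year(5 * 60, 16 * KIB)
one_min = bytes_per_meter_per_year(60, 16 * KIB)
one_sec = bytes_per_meter_per_year(1, 16 * KIB)

# The low end of the 4K to 16K range, for comparison.
five_min_low = bytes_per_meter_per_year(5 * 60, 4 * KIB)

METERS = 100_000  # hypothetical fleet size; pick your own scope

scenarios = [
    ("every 5 min, 4K", five_min_low),
    ("every 5 min, 16K", five_min),
    ("every minute, 16K", one_min),
    ("every second, 16K", one_sec),
]
for label, per_meter in scenarios:
    fleet_tb = per_meter * METERS / 1e12
    print(f"{label}: {per_meter / 1e9:.2f} GB/meter/year, "
          f"{fleet_tb:,.0f} TB/year across {METERS:,} meters")
```

The five-minute, 16K case works out to 1,722,286,080 bytes, or about 1.7 GB per meter per year, matching the figure above. The one-minute and one-second cases are exactly 5 and 300 times that.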
So, is this a problem because the data is going to cause the Smart Grid to explode like a flawed radiator hose in July? I don't think so. I think that time has proven that technical advancement has always helped us stay ahead of crushing data or processing burdens by decreasing computing and memory costs. This has allowed us to paper over our excesses with iron and silicon.
No, this is a problem because rushed, tactical, and incremental hardware additions will not make that data secure. It has to be expected that as organizations run out of room for data, they will simply rush to add more. Caught in a flood of data, the pressures for survival and successful operation will naturally trump any meaningful consideration of re-architecting data storage for adequate and appropriate security.
This planning (and budgeting) needs to happen now. As Andres said on our call, "You cannot simply build an airplane for passengers who are 5 foot 6 and weigh 140, because you can guess that your average passenger, much less your larger passengers, will simply not fit, because they are not that small." In other words, you need to plan for what you can reasonably expect, not for what will make your life, your business, or your CFO, ecstatic.
I think that this is the final insight. For firms that see the Smart Grid only as an enabler for cost savings, by transferring operations onto an IP infrastructure or a wireless metering system, there is little reason to be concerned with a data glut.
For those who recognize that the Smart Grid and the coming Soft Grid will need data, and will need security, and will likely grow to fill whatever space is available, the call is clear. Plan for an avalanche, plan for a flood. Create systems and segregations that will allow for managing these flows reliably. Characterize what must come through and what can be dropped along the way to the back end. Do all of those things and the current systems will be fine, the next systems will not choke, and the ultimate end state will be close enough to what has been planned to ensure stability, quality, and cost-effective services to all who connect to the grid.
The data surge is coming, and you can either surf it, or be pounded by it. You certainly will not be able to ignore it.
In looking at the list I think there are some real questions about whether a utility would collect all the data you describe.
I believe much of the data will not be sent back to the utility but will be captured and monitored at the home. For instance, when we get smart appliances, there will be a portal on your PC to control them and the information will be kept within your home. All the data, if a user wants it, will be collected on their home PC. Google's Power Meter will be looking at that kind of application, and Zigbee will be a standard on your PC, just like WIFI is now.
Real Time Monitoring is another thing that will stay within the home. If someone really wants to track this, they'll set up their Google Power Meter (or something like it) to track it.
I think you are exactly right that IP will be used for networking, but if you think about it, little of that IP information needs to be stored...its use is to get the packets of information to the right place.
I think your point on monitoring Device Health is dead on, but that alone won't drive the huge storage needs you describe.
One that you missed is electric vehicles. They will drive extra data, but again not enough to produce the huge volumes of data you describe.
It's a really good question about how much data will be sent over the utility grid and how much will stay locally with the consumer. I am of the opinion that most of the individual loads within the home will not go up to the utility. In general, utilities don't want to deal with storing how much energy the refrigerator is using in 5 minute increments. Utilities need to efficiently supply the aggregate need.
While my background is on the communications side, I don't see these huge storage needs if the network is created as I would expect.
Phil Korest - 11/04/2009 - 08:45
Nice things to know
I totally agree with Phil. However, what Jack did up there is very helpful, good to know, and surely a warning to communication guys like us. Currently I am doing my PhD and would like SG to be my research area. Data surge and security will be the main issues I will focus on. Thanks to everyone for sharing. Keep in touch.
Thomas Lo - 11/04/2009 - 09:07
EPRI Research on Data Tsunami
EPRI has a significant initiative to address this very issue. I urge interested parties to download the research; it's free. Here's the name of the report:
Program on Technology Innovation: Advanced Information Technology Requirements for the Electric Power Industry: A CIO's Perspective
Kevin - 11/04/2009 - 10:28
Note on the numbers for Phil K.
Hey Phil -
Thanks for the comment/questions. The reality is that I was extremely interested in understanding the -actual- way in which the utilities were planning for integrating smart grid data. That's why I reached out to Andres and Karl, two guys I see as real leaders in maximizing the effectiveness of investing in the Smart Grid. The numbers and scenarios I give are based on their appraisal of what will happen.
That said, while I agree that there would be ways to do the type of leaf-storage data representation you cite on a per-home, non-centralized basis, the expectation of the people who are doing it for a living is that just such a centralization will occur. In addition, in speaking with other utilities before and after the conversation with Austin, I can tell you that real-time is a planned eventuality across each of the utilities that I have spoken with, even though we might think it is overkill.
I have no expectation that I will be revisiting these totals anytime soon, as I am pretty comfortable that they have been posited, posted, and now validated by multiple sources.
I respect your experience and expectations, but am going to be recommending that people think hard about the amount of data that their infrastructure is likely to generate. If history and internetworking have shown us anything, it is that we cannot possibly imagine what the visionaries of tomorrow will want to do with the baseline infrastructure. Security planning up front, in terms of characterizing, segmenting, and modeling data content and flows, will be an advantage whatever the data volume turns out to be, and may be the only way to ensure good security going forward if the growth I project does indeed occur. Only upside.
Again, thanks for the thoughtful read and commentary.
Jack
Jack Danahy - 11/04/2009 - 21:15
data...and analysis
Nicely done, Jack
This is a hard issue to get across, especially the need to set up the data collection with the simple question "What do I want to know?" I think you're correct about the coming data surge - but it won't only come from the customer side.
The ability to collect information and automate some decision-making from generation through distribution will also involve enormous data flows. What will make the grid smarter is to integrate the data on many scales; to provide analyses for optimizing all kinds of operations along the value chain; and to enable better planning and resourcing. And how to do all this securely - and with many different legacy systems in the mix - is a huge challenge.
One point you weren't clear on was the nature of the data flows - did your estimates include the router data (flows) or just the information in the packets? I've seen some very large organizations with (relatively) massive data communications decide that it was too expensive to retain the flow data for more than a few weeks - only to regret that decision later when they tried to reconstruct how a breach or network failure occurred.
Thanks for keeping the data issue in front of the community.
Jim McCurley - 11/06/2009 - 04:46
Data storage versus Data Stream
Good work gents,
The numbers are staggering. We are looking at going from 1,100 meter data reads a day to 4.8 million per day with a customer base of only 100K, and the data count is building.
This leads into the bigger issue: most utilities are not concerning themselves with traffic. Most if not all of the network monitoring systems (DMS, OMS, EMS, or NMS) in use at the moment can handle only on the order of 10K per minute (more in avalanche), while we here are talking about hundreds, if not millions, of pieces of data a minute, depending on how big the network is, not to mention any network loading. It is great to have the data stored somewhere, but what's the point if the information does not make it, or if its value is reduced because the time element is voided by traffic bottlenecks or buffering in the systems?
The issues of infrastructure resilience, I see, have still not been resolved. I have already seen a system crash because of this: these systems were not built for such large incoming data streams, and this space has been left overshadowed as people install the end product without suitable money or effort being spent on the support systems and structures. I know the vendors are quickly trying to make their products capable of handling this load, but nothing so far has been utility tested ……