Analysis of Big Data has the potential to unearth myriad benefits for a company – but how can we capture the valuable, and ignore the superfluous, in such a huge digital expanse? Alexandra Leonards explores…
The creation of data is growing at an ever-quickening pace. By 2020, the number of existing digital bits will almost match the number of stars in our universe. What’s more, according to IDC, the digital world is doubling every two years.
This sprawling “Big Data” conceals a multitude of valuable information. But within this huge digital space, how can we identify what is actually valuable, and what is valueless?
“The first question, which is quite an important one, is: how do you define big data? And this isn’t a rhetorical question, because different people consider big data in very different ways,” says Simon Bunegar, head of marketing and business development at Transport Exchange Group. Before taking on his current role last year, Bunegar spent 15 years working in the IT industry, the last seven of them working specifically with Big Data.
“Some people say it’s sheer magnitude,” he says. “Other people will fold in elements such as: Big Data is, by definition, unstructured data. Other people will say it’s unstructured data across highly heterogeneous data sources. So it’s not just the structure, it’s also the nature of the data that is different.”
Bunegar says that Big Data is about complexity, and having access to technology that generates data of great magnitude, as well as a huge number of data sources. And he reckons the IT industry has a vested interest in talking about it.
“Big Data comes from the IT industry – it doesn’t come from the users,” he says. “The IT industry turns to users and goes ‘oh, you’ve got big data – it will cost you X to look at our wonderful new database, it’s Big Data’. It doesn’t mean anything to users. Users are interested when someone says that you’ve got all this stuff that you couldn’t analyse before, that you now can.”
Bunegar focuses on identifying “meaningful” statistics and data: “Sometimes it’s hype, sometimes it has real value,” he says. “What we in the logistics sector should be looking at, irrespective of what you call it [big data], is: Where’s the value? Not where’s the value in theory, but where’s the value in very, very hard numbers.”
Paul Ridden, managing director of technology solutions company Skillweb, also thinks it is important to dig deep for real value in the growing digital space.
“It’s not about Big Data – it’s about the availability of that data in a form that business can use,” he says. “And that’s a difficult thing to do because the more data you generate, the more difficult it is to see the wood for the trees.”
Harnessing information that is traditionally handled in an IT environment is something of a challenge for those outside of the sector. Ridden says that it’s difficult to achieve a balance between finding the right tools, and being able to use those tools effectively.
“Some of the reporting tools are very powerful and can be operated by less technical people – that’s fine as long as there is a data model that underpins it in a form that it can be understood by those business people,” says Ridden.
“The tools are great, most businesses have these solutions in place to generate lots of data. What’s missing is the translation between the pool of data into the business data set, that is then in such a form that it can be understood by these business people.”
He believes that there is a movement towards having more understanding about the data being generated – but that IT specialists remain heavily involved in the handling of this type of data.
“I see a lot of technical people still in control of the data, and I have this feeling that the business driver of the analytics has something missing,” says Ridden. “You do kind of get this mixed approach in the technical world – some really great analytics tools and some really good databases but ultimately there is no substitute for an understanding of the business needs connected with the data.
“You have to find somebody capable of understanding the data and building it into a model that can also understand the business because that just doesn’t happen in a pure IT world.”
Bunegar finds further issue with the lack of communication between telematics companies.
“Right now we’re integrated with 15 telematics companies,” says Bunegar. “The telematics industry is very, very interesting. But one thing that is a little bit odd about something that is otherwise IT, is that none of them talk to each other.
“So all these people with essentially potential big data sources, because they track all the vehicles and the behaviour of all the vehicles and the trailers etc, but that data is only of value to the fleet owner.”
So how can logistics companies and retailers use data effectively?
William Salter, managing director of Paragon, looks back on the days when its software solutions were installed on “clunky old PCs” in the sad corners of transport offices.
“There may have been a connection with an order processing system – that would have been about it,” he says. “Things have moved on a lot since then.”
Salter says that what has changed most is that the company’s systems now have a much wider remit.
“Basically what we’re talking about is something that is just far more connected now than it used to be,” says Salter. “Paragon, or things related to what we do, is a bit more all pervading across an organisation.
Provisions of information
“Whereas maybe we used to just touch the transport department, now we touch all of the other departments in terms of the provisions of information, so everything is just far more connected.”
To exploit Big Data, and its many hidden advantages, it is sometimes necessary to combine statistics from different sources. Bunegar uses the example of CO2 emissions to demonstrate how useful it can be to pull data from different places.
“You spend X amount of money on this new fleet of lorries to hit a particular target, we would argue that is completely pointless if for half of the journey your lorry is empty,” he says. As a hypothetical example, he makes up some figures. “Say lorry A is running at an industry leading 50 grams per km – that’s down from the industry average of 60, or 70.
“But it is running on industry standard occupancy levels of 30 to 35 per cent empty running. Now if a third of the time it’s running empty, it means a third of the time somebody else’s vehicle has to be taking that load.
“Somebody else is running at nine per cent empty running, and maybe their vehicle is 55 grams CO2. But if they are only empty nine per cent of the time actually the amount of CO2 that is being ticked out into the atmosphere is significantly less because you need less vehicles full-stop.”
In other words, data is often taken at face value. In this example, CO2 data would have been taken from the vehicle’s telematics – which reveals how much diesel has been used and how far a vehicle has driven.
“End of. Easy Peasy. However, was the vehicle full or not? Well that will typically come from the fleet management system, a completely different piece of software and data source – so typically, those two pieces of data never meet each other,” says Bunegar.
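To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python that brings together the two data sources Bunegar describes – a telematics emissions figure and a fleet-management empty-running figure – using his hypothetical numbers. The function name and figures are illustrative assumptions for this example, not anything taken from Transport Exchange Group’s systems.

```python
# Back-of-the-envelope sketch using the hypothetical figures above.
# It combines a telematics figure (grams of CO2 per km driven) with a
# fleet-management figure (share of kilometres run empty) to estimate
# the emissions attributable to each *loaded* kilometre.

def co2_per_loaded_km(co2_g_per_km: float, empty_running: float) -> float:
    """CO2 per kilometre of useful (loaded) running."""
    return co2_g_per_km / (1.0 - empty_running)

lorry_a = co2_per_loaded_km(50.0, 1 / 3)   # "industry leading" engine, a third of km run empty
lorry_b = co2_per_loaded_km(55.0, 0.09)    # older engine, 9 per cent empty running

print(f"Lorry A: {lorry_a:.1f} g CO2 per loaded km")   # ~75.0
print(f"Lorry B: {lorry_b:.1f} g CO2 per loaded km")   # ~60.4
```

Run as written, the “industry leading” lorry comes out at roughly 75 grams of CO2 per loaded kilometre and the older vehicle at roughly 60 – which is Bunegar’s point: the two figures only tell this story once they are brought together.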
He adds: “Where we then start to get much more fun is when you then overlay things like matching the capabilities of a given driver, vehicle, fleet with the perceived value of them doing work for you relative to what they’ll otherwise be getting on that.
“And then in real time saying actually it’s worth this much to you and if it’s worth this much to you we’ll let you know about it; if it’s not, we won’t even bother you with it.”
Transport Exchange Group uses disparate data to construct a “snapshot” of a particular operator. By combining regulatory data with image data, it links an image of an insurance document to a ‘date data field’, expiry information, and the operator’s feedback and feedback rate. All of this information is typically presented on a single screen, with various “traffic light flags”, so that at a single glance a business or client can make a decision about the operator.
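As an illustration of what such a snapshot might look like in code, here is a minimal sketch – all the field names, thresholds and traffic-light rules are hypothetical assumptions for the example, not Transport Exchange Group’s actual data model.

```python
# Minimal sketch of an operator "snapshot" rolled up into a traffic-light flag.
# All field names and thresholds here are hypothetical illustrations, not
# Transport Exchange Group's actual schema.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class OperatorSnapshot:
    insurance_doc_url: str   # link to the scanned insurance document image
    insurance_expiry: date   # the 'date data field' extracted alongside it
    feedback_score: float    # average rating from past jobs, 0-5
    feedback_rate: float     # share of jobs that received feedback, 0-1

    def traffic_light(self, today: date) -> str:
        """Collapse the disparate fields into a single at-a-glance flag."""
        if self.insurance_expiry < today or self.feedback_score < 2.5:
            return "red"
        if (self.insurance_expiry - today).days < 30 or self.feedback_rate < 0.5:
            return "amber"
        return "green"

# Example usage
today = date.today()
snapshot = OperatorSnapshot(
    insurance_doc_url="https://example.com/docs/insurance-123.png",
    insurance_expiry=today + timedelta(days=90),
    feedback_score=4.6,
    feedback_rate=0.8,
)
print(snapshot.traffic_light(today))   # "green" in this case
```

The point of rolling disparate fields into a single flag is the one the company describes: a business or client can size up an operator at a single glance.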
It is also looking to introduce a new feature that will enable companies to replay a job – going back over every stage of the process to identify what could be improved.
“Trust is absolutely everything. It’s fine when they are your own drivers, it’s fine when they are your own companies,” says Bunegar. “But when it’s somebody who is perhaps bidding for the first time to be a partner of yours, trust becomes everything, because don’t forget they are representing you.
“If they turn up and, for example, have done a bad job, who’s it going to reflect on? You. If they turn up and leave their business card with your best customer and say ‘well actually we delivered it’, it’s no longer your best customer.”
Only 100 gigabytes of data were transmitted across the Internet per day in 1992. Cisco research finds that 20 years later, in 2012, 12,000 gigabytes were transmitted every single second. This is estimated to treble by next year. So while the digital space continues to grow at speed, can the logistics and retail sectors keep up and start to truly exploit Big Data?