Data growth
IDC published a white paper in November 2018 investigating global data growth. According to the study, the global data sphere will grow from 33 to 150 Zettabytes by the year 2025. Although the paper focusses on data producers and storage locations, one conclusion is obvious for professionals in information management:
This tremendous data growth will accentuate existing information management challenges. There will be more 'lost' data and higher risk exposure due to unmanageable data silos. There will also be more wasted storage space and growing inefficiency in search and retrieval.
Knowledge creation in information management is becoming harder with the big data trends.
The industry’s current answer to these challenges is more technical than organisational. Today, highly efficient machine learning algorithms can train themselves to dig through mountains of data. They can identify valuable information objects in the data mess that was created over decades. They keep getting better every day, but is machine learning for business intelligence the solution? If we want to reduce car accidents, do we try to make driving safer or do we speed up rescue services?
We should perhaps start by reducing the mayhem at the point where data emerges, through better information management practices.
Remember what Bob McLean of ARMA once said: You can either stand in front of the elephant and control what you feed it, or you can remain at its rear end and try to work through whatever comes out.
If we analyse how and why data proliferates throughout organisations, we identify two main issues in information and records management. The first is that data ownership is still not addressed, and the second is that metadata is mediocre and insufficient. Both are clear signs of a lack of organisation.
Ownership
A decade ago, everyone in the information management universe was talking about ownership. Records management policies were written to tackle the problem, with almost no effect.
The workforce does not seem to care about data and information management as long as the data is available to them. Paradoxically, it only seems important when they lose it or delete it by accident. Most employees are not even conscious of the fact that the data they create does not belong to them but to the organisation.
When people leave their jobs, they usually leave data traces behind them. It’s up to the organisation either to delete the data right away, with the legal risks that entails, or to keep it in a continuous backup loop. This data is often left unmanaged and untouched, as privacy laws tend to prevent any workaround.
Why don’t HR departments treat custody of employees’ data when they leave in the same way they do for badges and keys? They could at least address the company knowledge within the organisation and make sure it does not get lost.
Uncontrolled data redundancy is a direct consequence of unaddressed ownership. Most staff keep saving information just in case, either because they do not distinguish between useful and non-useful records or because they aren’t sure of their responsibility. As long as it exists, it gets saved! This leads to an average of 9 to 11 copies of every document within an organisation, not counting versions.
Add to this the fact that only 10% of all documents need retaining, and you can understand what uncontrolled means in terms of records management.
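To make these figures concrete, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes that the 10% retention share applies to unique documents and that a single copy of each retained document is enough, neither of which is stated above. The function name is purely illustrative.

```python
def retained_share(copies_per_document: float, retention_rate: float = 0.10) -> float:
    """Fraction of all stored files that actually need retaining.

    Assumption: one retained copy per document that falls under retention.
    """
    if copies_per_document < 1:
        raise ValueError("there must be at least one copy per document")
    return retention_rate / copies_per_document


# Walk through the range of copies per document quoted in the text.
for copies in (9, 10, 11):
    share = retained_share(copies)
    print(f"{copies} copies per document: {share:.1%} of stored files need retaining")
```

Under these assumptions, only about one stored file in a hundred would need to be kept.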
All this proliferation could be handled with ease provided that good metadata is attached to the information objects.
Information management metadata
The attributes related to information management are of inferior quality in 99% of cases. All too often, fields are only partially filled and cover merely one aspect of the lifecycle.
If an organisation chooses to allocate metadata to its information, it needs to be done at the time of creation. This is the only moment in the life of an information object when all metadata is readily available. The metadata set needs to be complete and cover the holistic requirements of an organisation: quick retrievability in the short term and simple manageability in the long term.
If metadata is captured at a later stage it usually requires substantial effort. Sometimes it is even impossible to entirely complete the attribute set.
One would ask, what is a complete metadata set in information management?
Alongside semantic data, context is key to ensuring that information can be retrieved simply. Information needs structural data to guarantee that it can be stored automatically at its location. It needs business rules to govern ownership, retention, access, privacy, classification and vital data management.
It needs information rules to ensure that all this metadata can be operationalised in DRM-Systems. Putting such a rule set in place will enable organisations to take stewardship of their information. The rule set will not only act as a filter to sort out redundant and non-useful records, but it will also ensure total control over all data for its entire lifecycle.
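As a minimal sketch of what such a metadata set could look like in practice, the following Python example models the attribute groups named above and checks completeness at the moment of creation. All field and function names are assumptions chosen for illustration; they do not describe any particular product or standard.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class InformationMetadata:
    """Illustrative metadata set; every field name here is an assumption."""
    # Semantic data and context: what the object is and where it comes from
    title: Optional[str] = None
    business_context: Optional[str] = None
    # Structural data: where the object is to be stored
    storage_location: Optional[str] = None
    # Business rules
    owner: Optional[str] = None
    retention_until: Optional[date] = None
    access_group: Optional[str] = None
    contains_personal_data: Optional[bool] = None
    classification: Optional[str] = None
    is_vital_record: Optional[bool] = None


def missing_attributes(meta: InformationMetadata) -> List[str]:
    """Return the names of all attributes that were left empty."""
    return [f.name for f in fields(meta) if getattr(meta, f.name) is None]


def accept_at_creation(meta: InformationMetadata) -> bool:
    """Accept an information object only if its metadata set is complete."""
    return not missing_attributes(meta)


# A partially filled record, as is common when metadata is an afterthought.
draft = InformationMetadata(title="Supplier contract", owner="Procurement")
print(missing_attributes(draft))
print(accept_at_creation(draft))
```

Rejecting incomplete records at creation time is a deliberate choice in this sketch: it mirrors the point that completing the attribute set later is costly and sometimes impossible.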
DocuType Information Compliance
It’s high time we started filtering and controlling what we feed to our elephant!
Being able to control all of their information has, until now, been a quantum leap for organisations. The ones who have gone through this process say it is one of the most beneficial initiatives they have undertaken.
phase3 has developed DocuType, a metadata service that ensures all essential attributes are attached to the information at its birth. It acts as a filter engine to wipe out redundancy and serves as a single governance engine for all information management systems.
DocuType can be used to fuel machine learning algorithms and addresses every aspect of holistic information management in organisations.