Below is a background summary with references that supports the view that traditional records management practices—often centered on folder structures and varying retention requirements—can create friction in high-volume, transactional environments. This summary also highlights the potential of metadata-driven approaches to improve usability and compliance.
Background:
Traditional records management (RM) systems have historically focused on hierarchical folder structures and classification schemes to manage records based on function and retention rules. According to ISO 15489-1:2016 (Information and documentation – Records management), records are classified “to link records to the business activities that generated them” through functional classification schemes. While this functional approach provides a solid foundation for compliance and accountability, it often diverges from how practitioners perceive their work and handle day-to-day tasks.
In high-volume transactional contexts—such as building permits, contracts, project documents, or case files—end-users frequently conceptualize their activities as cohesive workflows or projects rather than distinct record types with different retention periods. Lappin (2010) notes that traditional classification methods may not align with the natural work processes of users, causing frustration and resistance. Users find it cumbersome to split related files across different folders or repositories to meet varying retention requirements, and the complexity grows with scale: as volumes increase, manually classifying documents into multiple subfolders, each with its own retention rules, becomes tedious. This leads to lower user engagement, adoption hurdles, and sometimes the abandonment of RM practices.
Recent literature and best practices suggest that leveraging metadata and automation can bridge this gap. ARMA International’s Generally Accepted Recordkeeping Principles® (2017) and AIIM (Association for Intelligent Information Management) publications emphasize that metadata-driven classification and automated workflows can streamline processes. By applying metadata tags—ideally assigned automatically through integrated workflows or intelligent capture solutions—documents can inherit their retention rules and governance controls without forcing users to navigate complex folder structures.
Metadata also enhances search and retrieval. When documents are classified by content type, project, or transactional attributes at capture, users can easily locate what they need without drilling through multiple layers of folders. This shift towards metadata-centric approaches aligns with Gartner research on Information Governance (2021), which notes that enterprises adopting automated classification and metadata tagging see improved compliance, reduced user friction, and higher system adoption rates.
In summary, while traditional RM approaches rely on folder structures that can create end-user frustration—especially in high-volume transactional work—emerging best practices and standards encourage a move toward metadata-driven classification and automated workflows. This approach respects users’ perceptions of their work, simplifies retention compliance, and improves overall usability.
References:
ISO. (2016). ISO 15489-1:2016 Information and documentation – Records management – Part 1: Concepts and principles. Geneva: International Organization for Standardization. Retrieved from https://www.iso.org/standard/62542.html
ARMA International. (2017). Generally Accepted Recordkeeping Principles®. ARMA International Educational Foundation. Retrieved from https://www.arma.org/page/principles
AIIM. (n.d.). Metadata and Taxonomies. AIIM Industry Watch. Retrieved from https://www.aiim.org
Gartner. (2021). Magic Quadrant for Enterprise Information Archiving. Gartner Research. Retrieved from https://www.gartner.com
Are you preparing for an upcoming migration from an on-premises version of SharePoint to Microsoft 365 / SharePoint Online? Migrating content from an on-premises SharePoint instance to the cloud requires meticulous planning and execution to ensure a smooth transition and maximize the benefits of the new environment.
Why Listen to Me?
In the realm of ECM migrations, certain principles transcend organizations and technologies. These principles stem from how humans interact with information. Drawing from my extensive experience, I aim to share insights that can help you avoid the pitfalls and challenges encountered in past migrations. Over the course of two decades, I have observed numerous high-profile enterprise projects and accumulated valuable knowledge that I am eager to share. While it may not provide a foolproof formula for success, it serves as a framework to help you create your own migration strategy.
Developing an Effective Migration Strategy
To ensure a successful migration, it’s crucial to have a well-defined strategy in place. Especially when dealing with highly customized SharePoint instances or known usability issues, additional preparation becomes essential. Expecting different results from the same approach is unrealistic, and customizations might malfunction or fail to migrate properly. Before transitioning to the Cloud, some homework is necessary, providing an opportunity to enhance user experience and address their needs.
At first glance, the business may see a SharePoint instance as relatively small since users can only see what has been shared with them. However, it’s important to consider that SharePoint consists of a combination of websites with attached libraries. For large organizations, the scale of the migration can be daunting, involving tens of thousands of sites and terabytes of content and data. The business needs to take this seriously.
In this document, we refer to the pre-migration analysis and development work as the “foundational” phase. This work needs to be considered before migration begins, and some tasks may need to be performed in-flight, during the migration process, potentially during waves of migration. It is crucial to give adequate attention to this phase to ensure a smooth and successful transition.
Foundation
There are several foundational pieces of work, mainly analysis, that can be done pre-migration to ensure that any agile waves of deployment are uniform and coherent.
Ontology – It’s a common word in academic and medical circles, but not so much in business. It refers to the representation and organization of knowledge within a specific domain. It involves creating a formal structure that defines the categories, properties, and relationships between concepts, data, and entities that pertain to the subject area of the business [1].
It will help in defining document types, taxonomy, identifying duplications in the site/folder structures, and in defining search criteria and naming standards.
Taxonomy for Site and Folder Designs – Site designs can quickly get out of control if an overall structure is not agreed to in advance and maintained over time. It isn’t easy, but it’s worth the effort. It’s a bit like keeping up with weeds in your garden. You may never be able to keep a perfect garden, but if you don’t try, it will be chaos. This impacts navigation, service desk support, records management activities, and access controls.
The most commonly used hierarchical organization of information is called a functional taxonomy. A functional taxonomy refers to a classification system that arranges records based on the functions or activities of an organization.
Unlike earlier systems that organized records based on the creator or subject, a functional taxonomy focuses on identifying and analyzing the key functions of an institution and breaking them down into sub-functions and activities.
Exceptions to administrative functional taxonomies are project files and case files. Trying to fit them into an administrative functional structure is an almost certain recipe for user rejection. For example, a large construction project may contain contracts, financial projections, specifications, project charters, architectural drawings and invoices. While the project is in-flight the documents need to stay in the project folders. Spreading them out across the legal, human resources, financial, and planning functional taxonomy locations will only frustrate users and cause heavy access support request loads. After the project closes, you may consider having the records team or an automatic process spread them out across the other structures. That may help with records retention policy application.
Mandated taxonomies may also need to be created to augment the administrative structures. These may stem from the mission or legislated function of the organization. For example, if you are a regulatory body that accepts registrations and also monitors professional behaviour, you can create a function and subfunctions that support those activities. It may be that your organization is organized by function, and that the functional taxonomy looks like your organizational structure, but don’t confuse the two. It is important to stay focused on the business function and not the business units when designing a mandated functional structure.
The creation of records typically occurs at the activity level within these functions [8]. This provides a structured framework for organizing records and enables efficient records management practices. By categorizing records based on the functions they relate to, it becomes easier to locate and manage specific records within an organization, and clear relationships are established between records and the activities they support [8]. It also disambiguates types that two or more business units commonly use but name or search for differently.
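To make this concrete, a functional taxonomy can be sketched as a nested function → sub-function → activity structure, with retention attached at the activity level where records are created. The functions and retention periods below are hypothetical examples, not a recommended schedule:

```python
# Hypothetical functional taxonomy: function -> sub-function -> activity.
# Records are created, and retention applied, at the activity level.
TAXONOMY = {
    "Finance": {
        "Accounts Payable": {
            "Invoice Processing": {"retention_years": 7},
            "Vendor Management": {"retention_years": 6},
        },
    },
    "Human Resources": {
        "Recruitment": {
            "Job Competitions": {"retention_years": 2},
        },
    },
}

def retention_for(function: str, sub_function: str, activity: str) -> int:
    """Look up the retention period attached to an activity."""
    return TAXONOMY[function][sub_function][activity]["retention_years"]
```

Because retention hangs off the activity rather than the folder, a document classified to "Finance / Accounts Payable / Invoice Processing" inherits its rule automatically, wherever it physically lives.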
Site Design Strategies – When deciding whether to build SharePoint sites as site-subsite or hub-and-spoke, it is important to consider the advantages and limitations of each approach. Here are some key points to consider:
Site-Subsite – Allows permissions inheritance, but complicates access granting when access is not uniform across subsites, and causes headaches if the structure changes.
Hub-and-Spoke – Provides consistent branding across associated sites while allowing differentiated permissions, and suits flat structures and differentiation between sites. It also makes it very easy to change the structure, since moving a site means changing a link [1].
List of Document Types – In the context of metadata structures within Enterprise Content Management (ECM) systems, document types refer to categories or classifications used to organize and manage different types of documents based on their characteristics, purpose, or content. Document types help establish a consistent structure for documents and enable efficient search, retrieval, and management of information within an ECM system.
SharePoint, being an ECM system, uses the term “content types” to represent document types. A content type in SharePoint is a reusable collection of settings and metadata that define the attributes and behaviors of a specific type of content.
It provides a template or blueprint for creating and managing documents with similar characteristics or properties.
By creating and associating content types in SharePoint, you can define unique metadata, workflows, document templates, and other settings specific to each document type.
This allows for consistent categorization, organization, and management of documents within SharePoint sites and libraries.
Content types in SharePoint provide flexibility and customization options to meet the specific requirements of different document types within an organization [3].
Business Unit Roles – Having a pre-defined standard for creating and naming roles reduces confusion when records and support teams need to add them during the waves of agile deployment.
Typically these combine business unit, job title, and activity, like business_unit_manager_approver.
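As a sketch, a small helper can enforce such a naming standard so that roles created during each deployment wave stay consistent. The lowercase, underscore-separated convention here is just the example from above; adapt it to your own standard:

```python
def role_name(business_unit: str, job_title: str, activity: str) -> str:
    """Build a role name following a hypothetical
    business-unit_job-title_activity convention (lowercase, underscores)."""
    parts = (business_unit, job_title, activity)
    return "_".join(p.strip().lower().replace(" ", "_") for p in parts)
```

For example, role_name("Business Unit", "Manager", "Approver") produces business_unit_manager_approver, so records and support teams adding roles mid-wave cannot drift from the standard.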
Access Matrix – This is where the types, roles, and lifecycle states converge. Typically you create a spreadsheet per content type, with a column for each identified lifecycle state (if the type is involved in a workflow) across the top row, and a row for each role. In the cells you record the access grants that role needs for that particular type at each lifecycle state. If the access does not change for a role across the different states, you don’t need to repeat it.
You can create Access Control Lists (ACLs) out of each column, providing an ACL for each content type/state.
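A minimal sketch of that matrix in code, using hypothetical content types, lifecycle states, and roles: each (type, state) column becomes one ACL mapping roles to grants:

```python
# Hypothetical access matrix: each (content type, lifecycle state) column
# maps roles to the grants they need at that state.
ACCESS_MATRIX = {
    ("Invoice", "Draft"):    {"ap_clerk_author": "edit", "ap_manager_approver": "read"},
    ("Invoice", "Approved"): {"ap_clerk_author": "read", "ap_manager_approver": "read"},
}

def acl_for(content_type: str, state: str) -> dict:
    """Return the ACL (role -> grant) for one content type/state column."""
    return ACCESS_MATRIX[(content_type, state)]
```

Keeping the matrix in one place means the workflow can simply apply acl_for("Invoice", "Approved") when a document transitions, rather than users or admins adjusting permissions by hand.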
Search Criteria (by type) – It’s worth noting that in addition to content types, SharePoint also offers other metadata-related features such as managed metadata, which allows you to create and manage controlled vocabularies and taxonomies to classify and tag content. Managed metadata can be used in conjunction with content types to enhance the categorization and discoverability of documents in SharePoint [4]. Leaving the search function to the whims of a full-text search engine may work well for random searches, but it will frustrate users who deal with high volumes of document-centric and customer-facing processes.
You need metadata in order to provide reliable search results or to provide just-in-time delivery of content in support of a business process.
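A toy illustration of the point: filtering on typed metadata returns exactly the documents a process task needs, whereas a full-text match on the same terms could return anything that merely mentions them. The document names and fields below are hypothetical:

```python
# Hypothetical document store with typed metadata assigned at capture.
documents = [
    {"name": "inv-1001.pdf", "type": "Invoice", "project": "Bridge-A", "status": "Approved"},
    {"name": "inv-1002.pdf", "type": "Invoice", "project": "Bridge-B", "status": "Draft"},
    {"name": "spec-07.docx", "type": "Specification", "project": "Bridge-A", "status": "Approved"},
]

def find(**criteria):
    """Return documents matching every metadata criterion exactly."""
    return [d for d in documents
            if all(d.get(k) == v for k, v in criteria.items())]
```

A call like find(type="Invoice", project="Bridge-A") yields an unambiguous result set, which is what just-in-time delivery into a business process requires.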
Naming Conventions – After the ontology, types and search criteria are defined, you can create naming conventions that make sense across multiple business units that use the various content types. This, and the search criteria, can be done as part of the deployment waves, as long as the content types that are shared between business units are included in the same wave. That means multiple business units may need to be clumped into the same deployment wave if they share common libraries.
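As an example, once a convention is agreed, file names can be generated from metadata rather than typed by hand. The TYPE_UNIT_SUBJECT_YYYYMMDD pattern below is a hypothetical convention, not a recommendation:

```python
from datetime import date

def _clean(value: str) -> str:
    """Normalize one name component: trim, hyphenate spaces, uppercase."""
    return value.strip().replace(" ", "-").upper()

def document_name(doc_type: str, business_unit: str, subject: str,
                  created: date) -> str:
    """Generate a file name from metadata using a hypothetical
    TYPE_UNIT_SUBJECT_YYYYMMDD convention."""
    return f"{_clean(doc_type)}_{_clean(business_unit)}_{_clean(subject)}_{created:%Y%m%d}"
```

Generating names this way makes it easy for users to conform to policy and keeps names consistent across the business units sharing a content type.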
Workflows – Typically, digital conversions of workflows can offer efficiencies by changing or simplifying an existing workflow process. This analysis can take more time than is available in a two-month period and may not suit an agile process. Since the point of digitizing workflows is to create efficiencies, there may not be much sense in automating a manual process without re-engineering it. If this can be done, then you get the true value of a digital system.
If you are going to include workflows, sometimes they span business units and share content types. If included business units and content are in scope of the deployment wave, then workflows can be defined within the wave.
Preparation
So, do you try to migrate your existing sites and libraries to MS 365 / SharePoint Online and see what happens, fixing it later? Or do you do your homework and plan it out, detail by meticulous detail? In my experience there is a tendency to “kick the can down the road”, only to find out that the business cannot prioritize the expense of the cleanup after you lose the momentum of the funded project. On the other hand, you don’t want to enter the “analysis paralysis” zone either, so you have to make the business case to do it up front, get executive support, and stay lean.
Now we get into the migration planning steps:
Assess and analyze the existing SharePoint content.
Identify old content that is not needed anymore, that can be deleted or archived.
Analyse content types for formats and consider cleanup of duplicates.
Consider transformation of inefficient formats like TIFFs to PDFs to save space during migration and cloud storage.
Consider metadata tagging. This can be done in the cloud post-migration, but if you have access to a toolset, it may be worth populating metadata or tagging files before migration.
Plan your outcomes by performing an assessment of your current source environment.
If your pages are heavily customized you will need a strategy: either rebuild the sites in a form compatible with MS 365 and then migrate them with the standard tools, or migrate the libraries directly into newly created sites in MS 365. Either way, it will be a site-by-site modification and migration. Make sure your stakeholders get to see the new sites early in the design process and involve them in UAT.
Ensure that you have the required permissions for migration.
Depending on the level of migration, you would need to be either a Global or SharePoint Admin, or a Site Admin.
Review the system prerequisites, endpoints, and SPMT settings.
Prepare your target SharePoint and OneDrive environment(s).
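The assessment steps above can start with something as simple as an inventory script run against a staging copy or export of the content: summarize counts and sizes by file extension to spot stale content and space-hungry formats such as TIFF. A minimal sketch:

```python
from collections import Counter
from pathlib import Path

def inventory(root: str) -> dict:
    """Summarize file counts and total bytes by extension under root."""
    counts, sizes = Counter(), Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in counts}
```

Even a rough breakdown like this gives the business concrete numbers for the cleanup and format-conversion decisions before migration begins.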
Setup Migration Tools
There are three choices for the migration:
Use the standard and free Microsoft migration tools.
Buy third party migration tools.
Use a migration vendor who may have a mix of tools, or their own custom scripts.
Assuming that you have cleaned up your on-prem content and/or SharePoint sites and ensured they are compatible with MS 365 SharePoint, the steps below use the free tools:
Use the SharePoint Migration Tool (SPMT) for the migration process, and Data Boxes to migrate large volumes of content to Azure. SPMT supports migration from SharePoint Server 2010, 2013, 2016, and 2019, and SharePoint Foundation 2010 and 2013.
Start with a small site and/or library to check for system performance and to refine your procedural checklists.
Test the target site completely post-migration.
You can also use PowerShell for migration, if that is your preferred method.
If you have files on-premises, you can use Migration Manager. Migration Manager allows you to set up multiple servers as agents, helping you scale your migration project.
If you have workflows, SPMT can support the migration of SharePoint Server 2010 out-of-the-box workflows and SharePoint Designer 2010 and 2013 workflows.
Migration
Migrate your files, folders, SharePoint Server sites and content, and SharePoint Server 2010 workflows. Experience tells me that a big-bang approach, where all your users come to work Monday morning and find themselves using a new system, risks missing deadlines, delivering out-of-date or stale training and change management, and overwhelming the service desk with irate users.
Migrating in Waves – The easiest way to migrate ECM systems is by business unit, where you limit the scope of the analysis, change management, training, testing, and migration activities to bite-sized chunks. Keep in mind that the more foundation analysis you have performed (see above), the easier this will be.
Define your strategy and scope for each wave in advance. Things will change as the waves progress, as business-unit priorities and availability shift, but planning ahead allows you to set expectations and at least start to communicate resource requirements to the business. Roll with the changes, and pad your time estimates by 50% if you can.
Post-Migration
User onboarding and training are crucial post-migration steps. Regular communication with users, training, and documentation for making the switch are essential. If you are using a business-unit wave approach you can build confidence and goodwill among stakeholders, building momentum.
Review
Monitor and address any post-migration issues. Ensure that all migrated content is accessible and functional. Check for any necessary updates or changes to the Microsoft 365 SharePoint site.
Conclusion
If you have a clean SharePoint instance with few customizations then you can jump straight into the migration phase. If you have customizations you have to check whether they still work in MS 365. The best way to do that is to set up a migration sandbox and start experimenting with single-site migrations. If you have user issues such as poor search results, poor navigation, or access-control problems, then it is time to re-think your foundation. The best time to do that is before your migration, not afterwards, when you don’t have any budget allocated to it.
You can probably tell that I used ChatGPT to generate the research, and I left in some of the reference links for your convenience. However, the content and especially the concepts of foundational analysis for ECM came out of my upcoming book. I hope you find it useful.
Keep in mind that migrating content may result in a surge of database and network activity as large amounts of data are moved to SharePoint and OneDrive.
There are other 3rd party migration tools out there, with varying degrees of support and community. Beware. Once you stray outside the standard MS toolsets you are on your own or at the mercy of the vendor(s).
I’d love to hear about any other points or perspectives and especially successful strategies. Feel free to drop me a line or add comments below.
Footnotes and Research
There is so much reference material available from Microsoft that the problem is which one to read first. Here are a few to get you started.
Are you thinking of relying on AI to understand how your organization uses information? Well, the truth is, AI hasn’t reached that level of sophistication just yet. If you have any thoughts to share on this, please comment below.
The responsibility falls on you to uncover how your organization utilizes unstructured data. By understanding the utility derived from metadata, which includes classification, you can unlock the power of AI to classify and tag large volumes of documents effectively.
Why Metadata?
We don’t need to emphasize how much time information workers waste searching for unstructured information—it’s common knowledge. However, what’s not commonly understood is how to reliably determine your metadata requirements. Surprisingly, many organizations repeatedly attempt and fail in this area.
Full-Text Indexing vs. Metadata or Database Search
Sure, search tools may improve with the help of AI in the coming years. But even with advancements, simply relying on search algorithms may not significantly reduce search times or improve reliability—unless something is done with your metadata.
I propose that there are two main categories of search:
Process – Documents used in a defined workflow, typically in high volume and critical tasks. Metadata is critical.
Non-Process – These documents or images are accessed in a more random or ad-hoc manner, or created and used in non-structured workflows. Metadata helps.
Process Searches
These searches involve documents or images used, referenced, updated, reviewed, or approved as part of a work process. In high-volume and critical processes, like aircraft manufacturing or healthcare, documents play a crucial role in the organization’s core functions. Wasting time searching for them can have a direct impact on the organization’s effectiveness and bottom line. Business units facing these challenges often direct their frustration and anger at the technology responsible for the search inefficiencies.
To determine the metadata requirements for process-oriented information, you can analyze the lifecycle states of the information within the workflow. By asking questions like, “What key piece of data is used to determine the correct document for this task?” you can identify the document categories and essential metadata values. Later, with the help of AI, these metadata models can be used to train the system for accurate classification and tagging.
Automating the metadata tagging process within process tasks is crucial for deriving value from your ECM system. By integrating workflows with associated unstructured information and automatically tagging them as they go through assessment and approval processes, you eliminate the need for repetitive data entry and enhance efficiency. The result? Unambiguous search results that operators in critical process workflows can rely on, saving time and ensuring accuracy.
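A sketch of what workflow-driven tagging looks like, assuming a hypothetical permit-approval process: the workflow transition, not the user, stamps the process metadata onto the document:

```python
def on_transition(document: dict, new_state: str, context: dict) -> dict:
    """Stamp workflow metadata onto a document at a state transition,
    so users never re-key process data by hand."""
    document["state"] = new_state
    # Inherit process metadata from the workflow context.
    document.setdefault("case_id", context["case_id"])
    document.setdefault("doc_type", context["doc_type"])
    return document

# The workflow engine would call this on each transition, e.g.:
doc = on_transition({}, "Approved",
                    {"case_id": "PERMIT-2023-042", "doc_type": "Building Permit"})
```

Because the tags are applied automatically at each assessment and approval step, every document that leaves the workflow is already searchable by case, type, and state.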
Non-Process Searches
These searches are more ad-hoc and less time-bound. While time is a factor, it has a lesser impact on the outcome. In this scenario, metadata still contributes to user satisfaction and overall efficiency, but contextual results from full-text index searches are more acceptable.
Full-text indexing combines file names, associated metadata, and the words found within the documents. However, organizations often have varying file naming conventions across different business units, leading to search results that heavily rely on context and file contents, which can vary from one author or business unit to another.
For non-process searches, an automated AI metadata tagging system becomes an efficient and productive way to tag documents. If there isn’t an automated workflow or process for tagging documents, AI remains the most viable method to improve search efficiency through effective tagging.
Conclusion
The bottom line is that, yes, AI can help you tag your content with metadata. However, for it to be relevant to your business, you need to categorize it and determine your metadata model. Once you do that, you can train your AI against it.
For process-centric content you can add the required search metadata either on ingestion or during a workflow process, if the volumes justify it. AI powered metadata tagging can also be used to augment the tagging done in the workflow process.
For non-process-centric content AI metadata tagging can be utilized to make existing content more findable, either during ingestion or after the fact.
Keep in mind that folder and file naming conventions will also improve search results without the need for AI metadata tagging, while improving the site/folder navigation experience. More on site/folder structure in a later post.
Summary
AI can help tag content with metadata, but first you have to categorize the content files, e.g., contracts, invoices, assessments, design specifications, marketing plans, business plans. Then, find out from the business how they search for the files, what keywords they use, and where they find them. Keep the required number of metadata fields under three for each category.
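A per-category metadata model of that kind can be written down and validated mechanically. The categories and field names below are hypothetical, and the cap on fields per category (three in this sketch) follows the guideline above:

```python
# Hypothetical per-category metadata model, with a small cap per category.
MAX_FIELDS = 3

METADATA_MODEL = {
    "Contract": ["counterparty", "effective_date", "contract_type"],
    "Invoice": ["vendor", "invoice_number"],
}

def oversized_categories(model: dict, max_fields: int = MAX_FIELDS) -> list:
    """Return the categories whose field lists exceed the cap."""
    return [cat for cat, fields in model.items() if len(fields) > max_fields]
```

Once a model like this is agreed with the business, it becomes the target you train AI classification and tagging against.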
Process-centric searches or just-in-time information delivery require metadata. Full-text indexing is not the correct solution due to the time constraints associated with service delivery. You can look for ways to automate the population of metadata during the ingestion of files or during process tasks or transitions. AI metadata tagging can help after the fact, if required.
Non-process centric or ad-hoc document searches can be helped in full-text indexing cases by using standardized and enforced site/folder and file naming conventions. This requires change management and training. Automated file name generation helps by making it easy for users to conform to policy.
AI metadata population is very useful for tagging content to be used to enhance search result reliability. Some full text search engines can be optimized to prioritize specific keywords or tags.
There are AI platforms that can be used to tag content files.
For more information, or if you’d like to chat, please reach out to me at mike.clarke at qtility dot com. (You will need to change the stop words to make it a proper email address. Sorry for the inconvenience, but you would not believe my spam folder.)
It used to be that we followed analyst companies like a religion in the enterprise software space, with chants of “best of breed”. Now you ask an AI what the ECM trends will be for the next five years, and it says:
“The enterprise content management (ECM) market is projected to undergo significant changes over the next five years. Here are some of the current trends and expected market behaviors:
AI Integration: The use of AI in ECM is a major trend as it helps to automate content management, including content creation, editing, and distribution. This automation process allows for improved efficiency and reduced human error.
Cloud Deployment: With the rise of remote work because of the COVID-19 pandemic, cloud deployment has gained a spike in interest. Businesses are increasingly looking towards cloud-based ECM solutions for better collaboration and access to documents from anywhere, anytime.
Knowledge Management: The importance of managing and using knowledge within an organization has become more clear. Companies are investing in ECM solutions that offer knowledge management features to improve decision-making and efficiency.
Data Privacy: With increasing concerns over data security and privacy, ECM solutions are becoming more robust and secure. This includes features such as encryption, access controls, and regular security updates.
Content Explosion: With the explosion of digital content, companies are adopting ECM solutions to manage and organize this surge in data. We expect this trend to drive the growth of the ECM market.
In terms of market behavior, the ECM market size was USD 10.33 billion in 2020 and it is expected to grow at a Compound Annual Growth Rate (CAGR) of 21.9% over the next five years [1][3]. This shows steady growth across all regions, suggesting that more organizations are acknowledging the importance of effective content management.”