Organising information well is essential for effective retrieval and use of content in large-scale Alfresco repositories. Inaccessible or poorly organised information has little real value, as it offers no strategic advantage to the organisation.
The Alfresco content model uses properties, categories, aspects and tags to organise content.
However, because these approaches require human involvement, documents often lack sufficient manually added metadata. Even when metadata is present, much of it is inconsistent or erroneous. Relying on users to classify information is unreliable and infeasible. Moreover, manually creating an organisation-level taxonomy is time-consuming and labour-intensive because of the tremendous volume of content in Alfresco repositories.
This presentation will show how to “automate” content identification, consistent data labelling and categorisation for Alfresco content using widely used machine learning techniques such as clustering, classification and topic modelling.
It will also demonstrate how to use the conceptually enhanced content to improve faceted search and provide recommendations to end users.
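
As a rough illustration of the classification idea mentioned above, the following minimal sketch suggests a category for a document by comparing its words with a few already labelled examples (a nearest-centroid classifier over bag-of-words vectors). It is not the method used in the presentation: the sample texts, the category names "Finance" and "HR", and all function names are hypothetical, and fetching content from Alfresco or writing the suggested category back to the repository is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text):
    """Represent a document as a bag-of-words term-frequency vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_centroids(labelled_docs):
    """Sum the term counts of each category's documents into one centroid."""
    centroids = {}
    for text, category in labelled_docs:
        centroids.setdefault(category, Counter()).update(vectorize(text))
    return centroids


def suggest_category(text, centroids):
    """Return the category whose centroid is most similar to the text.

    Categories are visited in sorted order so that ties are broken
    deterministically (alphabetically first category wins).
    """
    vec = vectorize(text)
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))


# Hypothetical labelled examples standing in for manually categorised content.
TRAINING = [
    ("invoice payment due amount tax", "Finance"),
    ("quarterly budget revenue forecast", "Finance"),
    ("employee leave policy holiday", "HR"),
    ("recruitment interview candidate onboarding", "HR"),
]

centroids = train_centroids(TRAINING)
suggested = suggest_category("Payment reminder: invoice amount overdue", centroids)
print(suggested)
```

In practice a real repository would need far more training data and a stronger model; the sketch only shows how a consistent label can be derived from the content itself rather than from user input.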